RamonRL/llm-rag-api

RAG AI Assistant — Chat with Your Documents

A full-stack AI application that allows users to upload documents (PDF or text) and ask questions about them using a Retrieval-Augmented Generation (RAG) pipeline.


Overview

This project implements a production-style RAG system that combines:

  • Document ingestion (PDF and text)
  • Text chunking and embedding generation
  • Vector similarity search
  • LLM-based question answering

Users can interact through a simple web interface to query their own documents, similar to a private ChatGPT.


Tech Stack

Backend

  • Python
  • FastAPI
  • FAISS (vector database)
  • OpenAI API (embeddings + LLM)
  • Docker

Frontend

  • Streamlit
  • Requests

Features

  • Upload PDF or text documents
  • Intelligent text chunking for improved retrieval
  • Semantic search using vector embeddings (FAISS)
  • LLM-powered question answering (RAG pipeline)
  • Interactive UI with Streamlit
  • Persistent vector storage (FAISS index + documents)
  • Rate limiting for API protection
  • Basic metrics endpoint
  • Logging and error handling
  • Fully dockerized (API + UI)

Running Locally

1. Set environment variables

Create a .env file:

OPENAI_API_KEY=your_api_key_here
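Inside the API, the key can then be read from the environment. A minimal sketch of environment-based configuration (the repo's actual loading code may differ, e.g. it may use python-dotenv):

```python
import os

def get_openai_key() -> str:
    """Read the OpenAI API key from the environment (populated from .env
    by Docker Compose or a dotenv loader)."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```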

2. Run with Docker

docker-compose up --build

3. Access the app

Once the containers are running, open the UI and API in your browser at the ports defined in docker-compose.yml (Streamlit typically serves on http://localhost:8501, and FastAPI on http://localhost:8000, with interactive docs at /docs).

API Endpoints

  • Health Check
GET /v1/health

Response:

{
  "status": "healthy"
}

  • Upload Document
POST /v1/upload

Upload a .txt or .pdf file. The document is processed, chunked, embedded, and stored in the vector database.
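For example, uploading a file with the Requests library (the base URL and the multipart field name `file` are assumptions here; check the FastAPI-generated docs for the exact schema):

```python
import requests

API_URL = "http://localhost:8000"  # assumed default; see docker-compose.yml

def upload_document(path: str, base_url: str = API_URL) -> dict:
    """POST a .txt or .pdf file to /v1/upload and return the JSON response."""
    with open(path, "rb") as f:
        resp = requests.post(f"{base_url}/v1/upload", files={"file": f}, timeout=120)
    resp.raise_for_status()
    return resp.json()
```

Usage, once the stack is running: `upload_document("report.pdf")`.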

  • Ask Question
POST /v1/ask

Request:

{
  "question": "What is this document about?"
}

Response:

{
  "answer": "..."
}
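Querying the endpoint from Python looks like this (again assuming the default local port):

```python
import requests

API_URL = "http://localhost:8000"  # assumed default; see docker-compose.yml

def ask(question: str, base_url: str = API_URL) -> str:
    """POST a question to /v1/ask and return the generated answer string."""
    resp = requests.post(f"{base_url}/v1/ask", json={"question": question}, timeout=60)
    resp.raise_for_status()
    return resp.json()["answer"]
```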

  • Metrics
GET /v1/metrics

Returns basic usage statistics.

How It Works

  1. Documents are uploaded and converted into text
  2. Text is split into overlapping chunks
  3. Each chunk is transformed into an embedding
  4. Embeddings are stored in a FAISS vector index
  5. User queries are embedded and matched against stored chunks
  6. Relevant context is retrieved and passed to an LLM
  7. The LLM generates a grounded answer based on the context
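Step 2 (overlapping chunks) can be sketched in a few lines; the chunk size and overlap values below are illustrative assumptions, not the repo's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so a sentence cut at
    one chunk boundary still appears whole in the neighboring chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks
```

Overlap trades a little storage for retrieval quality: without it, an answer spanning a chunk boundary might never be retrieved intact.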

Production-Oriented Features

  • API versioning (/v1/...)
  • Rate limiting to prevent abuse
  • Logging of requests and errors
  • Persistent storage of vector index
  • Separation of services (API + UI)
  • Environment-based configuration
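The rate limiting above can be as simple as a sliding-window counter per client. A minimal in-memory sketch (the repo may instead use a FastAPI middleware or a library such as slowapi):

```python
import time
from collections import defaultdict

class RateLimiter:
    """Sliding-window limiter: allow at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int = 10, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(list)  # client id -> request timestamps

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        # Keep only timestamps that still fall inside the current window.
        hits = [t for t in self._hits[client_id] if now - t < self.window]
        self._hits[client_id] = hits
        if len(hits) >= self.limit:
            return False  # over the limit: caller should return HTTP 429
        hits.append(now)
        return True
```

An in-memory limiter resets on restart and is per-process; a shared store (e.g. Redis) would be needed once the API scales beyond one container.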

Future Improvements

  • Authentication (API keys / JWT)
  • Streaming responses (real-time answers)
  • Advanced UI (chat interface with history)
  • Hybrid search (keyword + semantic)
  • Model switching (local vs API-based)
  • CI/CD pipeline
  • Monitoring (Prometheus + Grafana)

Use Case

This project demonstrates how to build real-world AI applications such as:

  • Document Q&A systems
  • Internal knowledge assistants
  • Customer support copilots
  • AI-powered search engines

Author

Built as part of a Machine Learning / AI Engineering portfolio project.

Motivation

This project showcases the ability to:

  • Design and implement RAG systems
  • Work with LLMs in production-like environments
  • Build full-stack AI applications
  • Deploy scalable, modular systems using modern tools
