RAG Chatbot with Advanced Retrieval

title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false

A question-answering system that lets you upload documents and ask questions about them. The system retrieves the most relevant passages from your documents and passes them to an LLM as context for the answer.

How It Works

When You Upload a Document

1. Upload File (PDF/DOCX/TXT)
        ↓
2. Extract Text
        ↓
3. Split into Chunks (512 tokens each)
        ↓
4. Convert to Embeddings (384D vectors)
        ↓
5. Store in Vector Database (Qdrant)
        ↓
6. Save Metadata in MongoDB

What happens: Your document is split into small chunks. Each chunk is converted into a numerical vector that captures its meaning and stored in Qdrant for fast similarity search. Document metadata is saved in MongoDB.
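
The chunking step can be sketched as follows. This is a minimal illustration, not the project's splitter: it approximates tokens with whitespace-separated words, and the `split_into_chunks` name and `overlap` parameter are assumptions made for the example.

```python
def split_into_chunks(text: str, chunk_size: int = 512, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Words stand in for tokens here; a real pipeline would count tokens
    with the embedding model's tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

A non-zero `overlap` repeats the tail of one chunk at the head of the next, which can help when an answer straddles a chunk boundary.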

When You Ask a Question

1. Type Your Question
        ↓
2. Check Cache (answered before?)
        ↓
3. Search Documents (if RAG is ON)
   - BM25: Find keyword matches
   - Vector: Find similar meanings
        ↓
4. Rerank Results (pick top 5 most relevant)
        ↓
5. Build Context from Chunks
        ↓
6. Generate Answer with LLM
        ↓
7. Stream Response to You

What happens: The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.
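
Step 5 (building context from chunks) might look roughly like this. The prompt wording and the `build_prompt` function are hypothetical; the project's actual prompt templates are not shown here.

```python
def build_prompt(question: str, chunks: list[str], max_chunks: int = 5) -> str:
    """Combine the top-ranked chunks and the question into one LLM prompt."""
    selected = chunks[:max_chunks]
    # Number each chunk so the answer can point back to its source.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks is one simple way to support a "view sources" feature, since the model can cite `[1]`, `[2]`, and so on.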

Key Components

Document Processing

DocumentProcessor - Main coordinator for document uploads

Validates file type and size
Calls the right loader for PDF, DOCX, or TXT files
Manages the entire processing pipeline
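
A sketch of the validate-then-dispatch pattern described above. The size limit, function names, and the `LOADERS` table are assumptions for illustration; only a TXT loader is included because PDF and DOCX parsing need third-party libraries.

```python
from pathlib import Path

# Assumed value: the real size limit is not stated in this README.
MAX_FILE_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the normalized extension, or raise ValueError if rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {extension or '(none)'}")
    if size_bytes <= 0:
        raise ValueError("File is empty")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"File exceeds {MAX_FILE_BYTES} bytes")
    return extension


def load_txt(data: bytes) -> str:
    """Decode a plain-text upload, replacing undecodable bytes."""
    return data.decode("utf-8", errors="replace")


# Dispatch table: extension -> loader. Only the TXT loader is sketched.
LOADERS = {".txt": load_txt}


def extract_text(filename: str, data: bytes) -> str:
    """Validate the upload, then call the loader that matches its type."""
    extension = validate_upload(filename, len(data))
    loader = LOADERS.get(extension)
    if loader is None:
        raise NotImplementedError(f"No loader sketched for {extension}")
    return loader(data)
```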

Embedder - Converts text to vectors

Uses FastEmbed with BAAI/bge-small-en-v1.5 model
Generates 384-dimensional vectors for semantic search
Each chunk becomes a searchable vector

Qdrant Vector Store - Stores embeddings

Fast similarity search across millions of vectors
Returns most relevant chunks for any query
Handles all vector operations
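
To make "similarity search" concrete, here is a brute-force version in plain Python. It is only an illustration of the idea: Qdrant uses an approximate nearest-neighbor index rather than scanning every vector, and the choice of cosine similarity is an assumption about how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The toy vectors in a call such as `top_k([1.0, 0.1], {"a": [1.0, 0.0], "b": [0.0, 1.0]}, k=1)` are 2-dimensional for readability; the embeddings in this project have 384 dimensions.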

Question Answering

HybridRetriever - Finds relevant information

BM25: Traditional keyword search (good for exact matches)
Vector Search: Semantic search (understands meaning)
Combines both for better results
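
This README does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the project's method. The function name and the constant `k = 60` are illustrative choices.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one ranking.

    Each list contributes 1 / (k + rank) to a chunk's score, so chunks
    ranked highly by both BM25 and vector search rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)
```

RRF works on ranks only, so it avoids having to put BM25 scores and vector similarities on a common scale.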

Reranker - Improves search quality

Uses FlashRank model to score relevance
Keeps the 5 most relevant chunks out of 20 candidates
Ensures only the most relevant context is used
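
The rerank step reduces to "score each candidate against the query, keep the best few". In the sketch below, `word_overlap` is a deliberately crude stand-in for the FlashRank model, and both function names are hypothetical.

```python
from typing import Callable


def word_overlap(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float] = word_overlap,
    top_n: int = 5,
) -> list[str]:
    """Sort candidates by relevance to the query and keep the top_n."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_n]
```

Passing the scorer as `score_fn` keeps the selection logic independent of the model, so a real reranker can be swapped in without touching `rerank`.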

Generator - Creates answers

Uses Groq LLM (llama-3.1-70b)
Streams responses in real-time
Bases answers on retrieved context when RAG is ON
Uses general knowledge when RAG is OFF

Semantic Cache - Speeds up responses

Remembers previous questions and answers
Returns the cached response when the same question is asked again
Separate caches for RAG ON vs RAG OFF
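
A minimal sketch of keeping separate cache entries for RAG ON and RAG OFF. It matches questions exactly after normalizing case and whitespace; a semantic cache would typically compare embeddings instead. The key format, class name, and in-memory dict (standing in for Redis) are all assumptions.

```python
import hashlib
from typing import Optional


def cache_key(question: str, rag_enabled: bool) -> str:
    """Build a key that differs between RAG ON and RAG OFF."""
    normalized = " ".join(question.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    mode = "rag_on" if rag_enabled else "rag_off"
    return f"cache:{mode}:{digest}"


class ResponseCache:
    """In-memory stand-in for a Redis-backed response cache."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, question: str, rag_enabled: bool) -> Optional[str]:
        return self._store.get(cache_key(question, rag_enabled))

    def set(self, question: str, rag_enabled: bool, answer: str) -> None:
        self._store[cache_key(question, rag_enabled)] = answer
```

Including the mode in the key means an answer generated from your documents is never returned for a general-knowledge question, and vice versa.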

Memory & Storage

Conversation Memory - Remembers chat history

Stores last 10 messages in Redis
Enables follow-up questions
Each session has independent history
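
The "last 10 messages per session" behavior can be sketched with a bounded deque. This in-memory version only illustrates the idea; the class and method names are hypothetical, and the project stores history in Redis.

```python
from collections import defaultdict, deque

MAX_MESSAGES = 10  # matches the limit stated above


class ConversationMemory:
    """Keep the most recent messages for each session independently."""

    def __init__(self, max_messages: int = MAX_MESSAGES) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._sessions = defaultdict(lambda: deque(maxlen=max_messages))

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        # Use .get so that reading an unknown session does not create it.
        return list(self._sessions.get(session_id, ()))
```

With Redis, a similar effect is often achieved by pushing to a list and trimming it with `LTRIM` after each write.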

MongoDB - Document metadata

Tracks uploaded documents
Stores file info, upload time, chunk count
Links to vectors in Qdrant

Redis - Fast caching

Stores conversation history
Caches LLM responses
In-memory for instant access

Technology Stack

LangChain 0.3.13 - RAG framework
Groq API - Fast LLM (llama-3.1-70b)
FastEmbed - Embedding generation
FlashRank - Result reranking
Qdrant - Vector database
MongoDB - Document storage
Redis - Caching layer
FastAPI - Web framework

Quick Start

Installation

# Clone and install
git clone https://github.com/Abeshith/RAG.git
cd RAG
pip install -r requirements.txt

Configuration

Create a .env file:

GROQ_API_KEY=your_groq_key
MONGODB_URI=your_mongodb_uri
REDIS_URL=your_redis_url
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
JWT_SECRET_KEY=your_secret_key
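
A small startup check can catch a missing variable before the first request fails. The sketch below assumes all six variables are required, which this README does not confirm; `load_settings` is a hypothetical helper, not part of the project.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "GROQ_API_KEY",
    "MONGODB_URI",
    "REDIS_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "JWT_SECRET_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read the required variables, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Note that this reads the process environment; it does not parse the .env file itself, so the variables must already be exported or loaded by your runner.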

Run

uvicorn app.main:app --host 0.0.0.0 --port 7860

Open: http://localhost:7860

Usage

Upload Documents: Click upload, select PDF/DOCX/TXT file
Ask Questions: Type question in chat box
Toggle RAG:
- ON = answers from your documents
- OFF = general knowledge answers
View Sources: See which document chunks were used

API Endpoints

GET  /health/                    - Check system status
POST /chat/stream                - Send question, get streaming answer
POST /documents/upload           - Upload new document
GET  /documents/                 - List all documents
GET  /documents/stats            - Get document statistics
DELETE /documents/{id}           - Delete specific document
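
A sketch of calling two of these endpoints with the Python standard library. The request body fields (`question`, `use_rag`) are guesses, since the request schema is not documented here, and the endpoints may also require an auth token. Check the API's own schema before relying on this.

```python
import json
import urllib.request
from typing import Iterator

BASE_URL = "http://localhost:7860"


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the health check endpoint."""
    return urllib.request.Request(f"{base_url}/health/", method="GET")


def chat_request(question: str, use_rag: bool = True, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /chat/stream.

    The JSON field names are hypothetical placeholders.
    """
    body = json.dumps({"question": question, "use_rag": use_rag}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/stream",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_lines(request: urllib.request.Request) -> Iterator[str]:
    """Send the request and yield the response one line at a time."""
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8", errors="replace").rstrip("\n")
```

With the server running, `for line in stream_lines(chat_request("What is in my document?")): print(line)` would print the streamed response as it arrives.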

Docker Deployment

docker build -t rag-chatbot .
docker run -p 7860:7860 --env-file .env rag-chatbot
