A production-grade Retrieval-Augmented Generation (RAG) chatbot built with Python, FastAPI, ChromaDB, and OpenAI. This chatbot allows you to ask questions about your documents and get accurate answers based on the content.
- 🚀 Fast & Modern API: Built with FastAPI for high performance
- 📚 Multi-format Support: Handles PDF, TXT, and Markdown files
- 🧠 Smart Chunking: Intelligent text chunking with overlap for better context
- 🔍 Vector Search: Uses ChromaDB for efficient similarity search
- 💬 GPT-4o-mini: Powered by OpenAI's fast, low-cost gpt-4o-mini model for accurate answers
- 🎨 Beautiful UI: Includes a modern chat interface
- 📝 Production Ready: Comprehensive logging, error handling, and validation
- Python 3.8+
- FastAPI - Web framework
- ChromaDB - Vector database for embeddings
- OpenAI API - Embeddings (text-embedding-3-small) and Chat (GPT-4o-mini)
- PyPDF - PDF processing
- Uvicorn - ASGI server
RAG Chatbot/
├── app.py # FastAPI application with endpoints
├── ingest.py # Document ingestion pipeline
├── rag.py # RAG retrieval and answer generation
├── test_rag.py # Test script
├── requirements.txt # Python dependencies
├── README.md # This file
├── .env # Environment variables (create this)
├── docs/ # Place your documents here
└── chroma_db/ # ChromaDB storage (auto-created)
cd "RAG Chatbot"python -m venv venv
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

pip install -r requirements.txt

Create a `.env` file in the project root:
touch .env

Add your OpenAI API key to the `.env` file:
OPENAI_API_KEY=sk-your-api-key-here
Get your API key: Visit OpenAI Platform
Place your documents in the docs/ folder:
# Create docs folder if it doesn't exist
mkdir -p docs
# Add your documents (PDF, TXT, or MD files)
cp /path/to/your/document.pdf docs/

Run the ingestion pipeline to process documents and store them in ChromaDB:
python ingest.py

This will:
- Read all documents from the `docs/` folder
- Split them into chunks (500-1000 characters)
- Create embeddings using OpenAI
- Store everything in ChromaDB
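In outline, the overlap-based splitting works like this (a minimal sketch using the `CHUNK_SIZE` and `CHUNK_OVERLAP` defaults from the configuration section; the actual splitter in `ingest.py` may differ, e.g. by respecting sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 750, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` characters
    with the previous chunk so context is not lost at chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping the overlap
    return chunks

# A 2000-character document yields 4 overlapping chunks with these defaults
print(len(chunk_text("x" * 2000)))  # → 4
```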
Expected output:
2024-01-01 12:00:00 - __main__ - INFO - Starting document ingestion pipeline...
2024-01-01 12:00:00 - __main__ - INFO - Processing file: document.pdf
2024-01-01 12:00:00 - __main__ - INFO - Created 15 chunks from document.pdf
2024-01-01 12:00:00 - __main__ - INFO - Total chunks created: 15
2024-01-01 12:00:00 - __main__ - INFO - Creating embeddings...
2024-01-01 12:00:01 - __main__ - INFO - Successfully ingested 15 chunks into ChromaDB
2024-01-01 12:00:01 - __main__ - INFO - Ingestion pipeline completed successfully!
Start the FastAPI server:
uvicorn app:app --reload

The server will start at http://localhost:8000
Open your browser and go to:
http://localhost:8000
You'll see a beautiful chat interface where you can ask questions!
Send POST requests to the /ask endpoint:
curl -X POST "http://localhost:8000/ask" \
-H "Content-Type: application/json" \
-d '{"question": "What is this document about?"}'Response format:
{
"answer": "Based on the provided documents...",
"context": [
{
"text": "Relevant chunk of text...",
"source": "document.pdf",
"chunk_index": "0"
}
]
}

Run the test script to see the RAG system in action:
python test_rag.py

Once the server is running, visit:
- Interactive API docs: http://localhost:8000/docs
- Alternative docs: http://localhost:8000/redoc
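If you'd rather call the API from Python than curl, a small stdlib-only client might look like this (the `ask` helper is illustrative, not part of the project):

```python
import json
import urllib.request

def ask(question: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a question to the /ask endpoint and return the parsed JSON reply."""
    payload = json.dumps({"question": question}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the server running:
# result = ask("What is this document about?")
# print(result["answer"])
```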
Returns the chat UI (HTML page)
Health check endpoint
Response:
{
"status": "healthy",
"service": "RAG Chatbot API",
"version": "1.0.0"
}

Ask a question to the chatbot.
Request body:
{
"question": "Your question here"
}

Response:
{
"answer": "The answer based on your documents",
"context": [
{
"text": "Relevant text chunk",
"source": "filename.pdf",
"chunk_index": "0"
}
]
}

You can modify these settings in the respective files:
DOCS_FOLDER = "./docs" # Where to read documents from
CHROMA_DB_PATH = "./chroma_db" # Where to store the database
CHUNK_SIZE = 750 # Characters per chunk (500-1000)
CHUNK_OVERLAP = 100 # Overlap between chunks
EMBEDDING_MODEL = "text-embedding-3-small" # OpenAI embedding model
TOP_K = 5 # Number of chunks to retrieve
CHAT_MODEL = "gpt-4o-mini" # OpenAI chat model

Documents → Read & Parse → Chunk Text → Create Embeddings → Store in ChromaDB
- Reads PDF, TXT, and MD files
- Splits documents into overlapping chunks for better context
- Uses OpenAI's `text-embedding-3-small` to create vector embeddings
- Stores chunks with metadata in ChromaDB
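The embed-and-store step can be sketched roughly as follows. This is a simplified outline, not the exact code in `ingest.py`: the collection name `"documents"` and the chunk dictionary fields are assumptions, and it requires the `openai` and `chromadb` packages plus `OPENAI_API_KEY` in the environment.

```python
def embed_and_store(chunks: list[dict], db_path: str = "./chroma_db") -> None:
    """Embed text chunks with OpenAI and persist them in a ChromaDB collection.

    Each chunk dict is assumed to carry "text", "source", and "chunk_index" keys.
    """
    from openai import OpenAI  # deferred imports: pip install openai chromadb
    import chromadb

    # One batched embeddings call for all chunk texts
    client = OpenAI()
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=[c["text"] for c in chunks],
    )
    embeddings = [item.embedding for item in response.data]

    # Persist chunks, embeddings, and metadata to the on-disk ChromaDB store
    db = chromadb.PersistentClient(path=db_path)
    collection = db.get_or_create_collection("documents")
    collection.add(
        ids=[f'{c["source"]}-{c["chunk_index"]}' for c in chunks],
        documents=[c["text"] for c in chunks],
        embeddings=embeddings,
        metadatas=[{"source": c["source"], "chunk_index": c["chunk_index"]}
                   for c in chunks],
    )
```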
Question → Create Embedding → Search ChromaDB → Retrieve Top-K → Generate Answer
- Converts question to embedding
- Finds most similar chunks using vector search
- Passes relevant chunks to GPT-4o-mini
- Generates answer using only the provided context
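Under the hood, the vector search amounts to ranking chunk embeddings by similarity to the query embedding. A toy illustration using cosine similarity (ChromaDB's actual index is far more efficient, but the ranking idea is the same):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_emb: list[float],
                   chunk_embs: list[list[float]],
                   k: int = 5) -> list[int]:
    """Return the indices of the k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_embs)),
                    key=lambda i: cosine_similarity(query_emb, chunk_embs[i]),
                    reverse=True)
    return ranked[:k]

# The chunk pointing the same way as the query ranks first
print(retrieve_top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [1.0, 0.0]], k=2))  # → [2, 1]
```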
The chatbot is instructed to:
- Use ONLY the provided context to answer questions
- Say "I don't know" if information isn't in the documents
- Be concise and accurate
- Not make up information
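A system prompt enforcing these rules might look like the following sketch (illustrative; not necessarily the exact prompt used in `rag.py`):

```python
SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer using ONLY the context provided below. "
    "If the answer is not in the context, say \"I don't know\". "
    "Be concise and accurate, and never invent information."
)

def build_messages(question: str, chunks: list[str]) -> list[dict]:
    """Assemble the chat messages: retrieved chunks go into the system
    message as context, the user's question goes in as-is."""
    context = "\n\n".join(chunks)
    return [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\nContext:\n{context}"},
        {"role": "user", "content": question},
    ]
```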
- Make sure you created a `.env` file in the project root
- Add your API key: `OPENAI_API_KEY=sk-...`
- Restart the server after adding the key
- Run the ingestion pipeline first: `python ingest.py`
- Make sure documents are in the `docs/` folder
- Check that your documents are in the `docs/` folder
- Supported formats: `.pdf`, `.txt`, `.md`
- Check file permissions
# Use a different port
uvicorn app:app --reload --port 8001

To update the knowledge base with new documents:
- Add new files to the `docs/` folder
- Run the ingestion pipeline again: `python ingest.py`
This will delete the old ChromaDB collection and create a new one with all documents.
python test_rag.py

The application uses Python logging. Check the console output for detailed logs.
To support additional file types, modify the `read_document()` function in `ingest.py`:
def read_document(file_path: str) -> str:
extension = Path(file_path).suffix.lower()
if extension == '.pdf':
return read_pdf(file_path)
elif extension in ['.txt', '.md']:
return read_text_file(file_path)
elif extension == '.docx': # Add your custom handler
return read_docx(file_path)
# ...
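For example, `.docx` support could be added with a handler like this one (assumes the third-party python-docx package; `read_docx` is a hypothetical helper, not part of the project):

```python
def read_docx(file_path: str) -> str:
    """Extract plain text from a .docx file, one paragraph per line."""
    from docx import Document  # deferred import: pip install python-docx
    doc = Document(file_path)
    return "\n".join(paragraph.text for paragraph in doc.paragraphs)
```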
- Chunk Size: Adjust `CHUNK_SIZE` based on your documents
  - Smaller chunks (500): More precise, but may miss context
  - Larger chunks (1000): More context, but less precise
- Top-K: Adjust `TOP_K` for retrieval
  - More chunks: Better context, but more tokens/cost
  - Fewer chunks: Faster, cheaper, but may miss information
- Embeddings: The `text-embedding-3-small` model is cost-effective
  - For better quality: Use `text-embedding-3-large`
- Chat Model: `gpt-4o-mini` is fast and affordable
  - For better reasoning: Use `gpt-4o`
Approximate costs (as of November 2024):
- Ingestion (one-time per document):
  - Embeddings: $0.020/1M tokens ($0.00002/1K tokens)
  - Example: 100 pages (~75K tokens) ≈ $0.0015
- Each Question:
  - Query embedding: negligible (≈ $0.00001)
  - GPT-4o-mini: $0.150/1M input tokens, $0.600/1M output tokens
  - Example: ≈ $0.001 per question
This project is open source and available for personal and commercial use.
For issues or questions:
- Check the troubleshooting section
- Review the logs for error messages
- Ensure all dependencies are installed
- Verify your OpenAI API key is valid
- Add authentication for production use
- Implement caching for faster responses
- Add support for more document types
- Deploy to cloud (AWS, GCP, Azure)
- Add conversation history
- Implement streaming responses
Built with ❤️ using Python, FastAPI, ChromaDB, and OpenAI