RAG Chatbot

A production-grade Retrieval-Augmented Generation (RAG) chatbot built with Python, FastAPI, ChromaDB, and OpenAI. Ask questions about your own documents and get answers grounded in their content.

Features

  • 🚀 Fast & Modern API: Built with FastAPI for high performance
  • 📚 Multi-format Support: Handles PDF, TXT, and Markdown files
  • 🧠 Smart Chunking: Intelligent text chunking with overlap for better context
  • 🔍 Vector Search: Uses ChromaDB for efficient similarity search
  • 💬 GPT-4o-mini: Powered by OpenAI's GPT-4o-mini model for fast, accurate answers
  • 🎨 Beautiful UI: Includes a modern chat interface
  • 📝 Production Ready: Comprehensive logging, error handling, and validation

Tech Stack

  • Python 3.8+
  • FastAPI - Web framework
  • ChromaDB - Vector database for embeddings
  • OpenAI API - Embeddings (text-embedding-3-small) and Chat (GPT-4o-mini)
  • PyPDF - PDF processing
  • Uvicorn - ASGI server

Project Structure

RAG Chatbot/
├── app.py              # FastAPI application with endpoints
├── ingest.py           # Document ingestion pipeline
├── rag.py              # RAG retrieval and answer generation
├── test_rag.py         # Test script
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── .env                # Environment variables (create this)
├── docs/               # Place your documents here
└── chroma_db/          # ChromaDB storage (auto-created)

Installation

1. Clone or Download the Project

cd "RAG Chatbot"

2. Create a Virtual Environment (Recommended)

python -m venv venv

# On macOS/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the project root:

touch .env

Add your OpenAI API key to the .env file:

OPENAI_API_KEY=sk-your-api-key-here

Get your API key: visit the OpenAI Platform (https://platform.openai.com)

Usage

Step 1: Add Documents

Place your documents in the docs/ folder:

# Create docs folder if it doesn't exist
mkdir -p docs

# Add your documents (PDF, TXT, or MD files)
cp /path/to/your/document.pdf docs/

Step 2: Ingest Documents

Run the ingestion pipeline to process documents and store them in ChromaDB:

python ingest.py

This will:

  • Read all documents from the docs/ folder
  • Split them into chunks (500-1000 characters)
  • Create embeddings using OpenAI
  • Store everything in ChromaDB
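The chunking step above can be sketched as follows. This is a simplified illustration of overlapping fixed-size chunking; the actual implementation in ingest.py may differ in details:

```python
def chunk_text(text: str, chunk_size: int = 750, overlap: int = 100) -> list[str]:
    """Split text into chunks of roughly chunk_size characters.

    Consecutive chunks share `overlap` characters so that sentences cut at
    a chunk boundary still appear intact in at least one chunk.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so the next chunk overlaps this one
    return chunks
```

With the default settings, a 2,000-character document yields three chunks, and the last 100 characters of each chunk repeat as the first 100 characters of the next.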

Expected output:

2024-01-01 12:00:00 - __main__ - INFO - Starting document ingestion pipeline...
2024-01-01 12:00:00 - __main__ - INFO - Processing file: document.pdf
2024-01-01 12:00:00 - __main__ - INFO - Created 15 chunks from document.pdf
2024-01-01 12:00:00 - __main__ - INFO - Total chunks created: 15
2024-01-01 12:00:00 - __main__ - INFO - Creating embeddings...
2024-01-01 12:00:01 - __main__ - INFO - Successfully ingested 15 chunks into ChromaDB
2024-01-01 12:00:01 - __main__ - INFO - Ingestion pipeline completed successfully!

Step 3: Run the API

Start the FastAPI server:

uvicorn app:app --reload

The server will start at http://localhost:8000

Step 4: Use the Chatbot

Option A: Web Interface (Recommended)

Open your browser and go to:

http://localhost:8000

You'll see a beautiful chat interface where you can ask questions!

Option B: API Endpoint

Send POST requests to the /ask endpoint:

curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is this document about?"}'

Response format:

{
  "answer": "Based on the provided documents...",
  "context": [
    {
      "text": "Relevant chunk of text...",
      "source": "document.pdf",
      "chunk_index": "0"
    }
  ]
}
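The same endpoint can be called from Python. A minimal client sketch using only the standard library (the helper name `ask` is illustrative):

```python
import json
from urllib import request

def ask(question: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a question to the /ask endpoint and return the parsed JSON response."""
    payload = json.dumps({"question": question}).encode("utf-8")
    req = request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    result = ask("What is this document about?")
    print(result["answer"])
    for chunk in result["context"]:
        print(f'- {chunk["source"]} (chunk {chunk["chunk_index"]})')
```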

Option C: Test Script

Run the test script to see the RAG system in action:

python test_rag.py

API Documentation

Once the server is running, FastAPI's interactive API docs (Swagger UI) are available at http://localhost:8000/docs

Endpoints

GET /

Returns the chat UI (HTML page)

GET /health

Health check endpoint

Response:

{
  "status": "healthy",
  "service": "RAG Chatbot API",
  "version": "1.0.0"
}

POST /ask

Ask the chatbot a question

Request body:

{
  "question": "Your question here"
}

Response:

{
  "answer": "The answer based on your documents",
  "context": [
    {
      "text": "Relevant text chunk",
      "source": "filename.pdf",
      "chunk_index": "0"
    }
  ]
}

Configuration

You can modify these settings in the respective files:

Ingestion Configuration (ingest.py)

DOCS_FOLDER = "./docs"           # Where to read documents from
CHROMA_DB_PATH = "./chroma_db"   # Where to store the database
CHUNK_SIZE = 750                 # Characters per chunk (500-1000)
CHUNK_OVERLAP = 100              # Overlap between chunks
EMBEDDING_MODEL = "text-embedding-3-small"  # OpenAI embedding model

RAG Configuration (rag.py)

TOP_K = 5                        # Number of chunks to retrieve
CHAT_MODEL = "gpt-4o-mini"       # OpenAI chat model

How It Works

1. Document Ingestion

Documents → Read & Parse → Chunk Text → Create Embeddings → Store in ChromaDB
  • Reads PDF, TXT, and MD files
  • Splits documents into overlapping chunks for better context
  • Uses OpenAI's text-embedding-3-small to create vector embeddings
  • Stores chunks with metadata in ChromaDB

2. Question Answering

Question → Create Embedding → Search ChromaDB → Retrieve Top-K → Generate Answer
  • Converts question to embedding
  • Finds most similar chunks using vector search
  • Passes relevant chunks to GPT-4o-mini
  • Generates answer using only the provided context

3. System Prompt

The chatbot is instructed to:

  • Use ONLY the provided context to answer questions
  • Say "I don't know" if information isn't in the documents
  • Be concise and accurate
  • Not make up information
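These instructions are typically encoded as a system prompt along the following lines (illustrative wording; the actual prompt lives in rag.py):

```python
SYSTEM_PROMPT = """You are a helpful assistant that answers questions about the user's documents.

Rules:
- Use ONLY the provided context to answer.
- If the answer is not in the context, say "I don't know".
- Be concise and accurate.
- Never invent information that is not in the context."""
```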

Troubleshooting

"OPENAI_API_KEY not found"

  • Make sure you created a .env file in the project root
  • Add your API key: OPENAI_API_KEY=sk-...
  • Restart the server after adding the key

"ChromaDB collection not found"

  • Run the ingestion pipeline first: python ingest.py
  • Make sure documents are in the docs/ folder

"No documents processed"

  • Check that your documents are in the docs/ folder
  • Supported formats: .pdf, .txt, .md
  • Check file permissions

Port 8000 already in use

# Use a different port
uvicorn app:app --reload --port 8001

Re-ingesting Documents

To update the knowledge base with new documents:

  1. Add new files to the docs/ folder
  2. Run the ingestion pipeline again:
    python ingest.py

This will delete the old ChromaDB collection and create a new one with all documents.

Development

Running Tests

python test_rag.py

Checking Logs

The application uses Python logging. Check the console output for detailed logs.

Adding New Document Types

To support additional file types, modify the read_document() function in ingest.py:

from pathlib import Path

def read_document(file_path: str) -> str:
    extension = Path(file_path).suffix.lower()

    if extension == '.pdf':
        return read_pdf(file_path)
    elif extension in ['.txt', '.md']:
        return read_text_file(file_path)
    elif extension == '.docx':  # Add your custom handler
        return read_docx(file_path)
    else:
        raise ValueError(f"Unsupported file type: {extension}")

Performance Tips

  1. Chunk Size: Adjust CHUNK_SIZE based on your documents
    • Smaller chunks (500): More precise, but may miss context
    • Larger chunks (1000): More context, but less precise
  2. Top-K: Adjust TOP_K for retrieval
    • More chunks: Better context, but more tokens/cost
    • Fewer chunks: Faster, cheaper, but may miss information
  3. Embeddings: The text-embedding-3-small model is cost-effective
    • For better quality: Use text-embedding-3-large
  4. Chat Model: gpt-4o-mini is fast and affordable
    • For better reasoning: Use gpt-4o

Cost Estimation

Approximate costs (as of November 2024):

  • Ingestion (one-time per document):
    • Embeddings: $0.00002/1K tokens
    • Example: 100 pages ≈ $0.10
  • Each Question:
    • Query embedding: $0.00002/1K tokens ≈ $0.00001
    • GPT-4o-mini: $0.150/1M input tokens, $0.600/1M output tokens
    • Example: ≈ $0.001 per question
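The per-question figure can be reproduced with a quick back-of-the-envelope calculation. The token counts below are rough assumptions (a short question, five 750-character chunks at roughly 4 characters per token, a 200-token answer), so the result lands in the same sub-cent range rather than matching exactly:

```python
# Prices from the list above, converted to dollars per token
EMBED_PRICE = 0.00002 / 1000      # text-embedding-3-small
INPUT_PRICE = 0.150 / 1_000_000   # gpt-4o-mini input
OUTPUT_PRICE = 0.600 / 1_000_000  # gpt-4o-mini output

query_tokens = 50               # assumed question length
context_tokens = 5 * 750 // 4   # ~5 chunks of 750 chars, ~4 chars/token
answer_tokens = 200             # assumed answer length

cost = (
    query_tokens * EMBED_PRICE
    + (query_tokens + context_tokens) * INPUT_PRICE
    + answer_tokens * OUTPUT_PRICE
)
print(f"≈ ${cost:.4f} per question")
```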

License

This project is open source and available for personal and commercial use.

Support

For issues or questions:

  1. Check the troubleshooting section
  2. Review the logs for error messages
  3. Ensure all dependencies are installed
  4. Verify your OpenAI API key is valid

Next Steps

  • Add authentication for production use
  • Implement caching for faster responses
  • Add support for more document types
  • Deploy to cloud (AWS, GCP, Azure)
  • Add conversation history
  • Implement streaming responses

Built with ❤️ using Python, FastAPI, ChromaDB, and OpenAI
