RAG Chatbot

This is a Retrieval-Augmented Generation (RAG) chatbot built with Python, using Ollama for local LLM inference, ChromaDB for vector storage, and Gradio for the web interface. It processes documents (e.g., PDFs) to answer questions based on their content.

Features

Local RAG Pipeline: Runs entirely offline using Ollama for LLM inference and embeddings, ensuring privacy and no API costs.
Multi-Format Document Support: Handles PDFs (via PyMuPDF), DOCX, Markdown, JSON (structured entities/relations), and plain text files.
Efficient Processing: Uses recursive text splitting with configurable chunk sizes, batch embedding generation with progress tracking, and persistent ChromaDB storage to avoid reprocessing.
Strict Context-Based Answers: LLM responses are constrained to document content only, preventing hallucinations or external knowledge leakage.
User-Friendly Interface: Web-based Gradio UI for easy interaction, with real-time question-answering.
Customizable: Easy to swap models (e.g., Qwen2.5-Coder for LLM, All-MiniLM for embeddings) and adjust parameters like chunk size or batch processing.

Prerequisites

Before setting up the project, ensure you have the following:

Python 3.8 or higher: Download from python.org. Verify with python --version.
Ollama: A tool for running large language models locally.
- Download and install from ollama.com.
- Start Ollama: Run ollama serve in your terminal (or it may start automatically).
- Validate Ollama is running: Open a new terminal and run curl http://localhost:11434. You should see a response like Ollama is running.
- Pull required models:
  - For the main LLM: ollama pull qwen2.5-coder:1.5b
  - For embeddings: ollama pull all-minilm (Note: Ensure Ollama supports this embedding model; if not available, you may need to adjust the code to use a compatible embedding model like nomic-embed-text).
Documents: Place your input file (e.g., Accessibility_Web_WCAG2_2_easy_EN_2025_revilla_carreras.pdf) in the resources/ folder. This folder is gitignored, so add your own files.

Installation

Clone the Repository:

git clone https://github.com/yourusername/rag-chatbot.git
cd rag-chatbot

Create a Virtual Environment (Recommended to isolate dependencies):
- On macOS/Linux:
```
python -m venv venv
source venv/bin/activate
```
- On Windows (if using Git Bash or similar):
```
python -m venv venv
source venv/Scripts/activate
```
- Verify activation: Your terminal prompt should show (venv).
Install Python Dependencies: Run the following command to install required packages:
```
pip install ollama gradio numpy langchain langchain-community langchain-chroma chromadb pymupdf python-docx unstructured markdown
```
- This covers: Ollama client, Gradio UI, NumPy, LangChain (text splitters, loaders for PDF/DOCX/MD/JSON/TXT), ChromaDB, and PyMuPDF for PDF handling.
- If you encounter issues with unstructured, it may require additional system dependencies (e.g., libmagic on macOS via brew install libmagic).
Optional: Create a requirements.txt: For reproducibility, generate one after installation:
```
pip freeze > requirements.txt
```
Then install via pip install -r requirements.txt in future setups.

Running the Application

Ensure Ollama is running (see Prerequisites).
Run the chatbot script:
```
python chatbot.py
```
The Gradio interface will launch in your browser (usually at http://127.0.0.1:7860). Ask questions about the document in the textbox.

Usage

The script processes the PDF in resources/ on first run, creating a ChromaDB vector store in ./chroma_db_* (gitignored).
Subsequent runs reuse the database unless deleted.
Questions are answered strictly based on the document context using the RAG pipeline.
To process a different file, update FILE_PATH in chatbot.py and rerun.

Troubleshooting

Ollama not responding: Ensure ollama serve is running and models are pulled.
Embedding errors: Verify the embedding model is compatible with Ollama; fallback to nomic-embed-text if needed.
PDF loading issues: Install pymupdf via pip install pymupdf.
Port conflicts: Gradio may use a different port; check the terminal output.

Notes

The vector database is stored locally and persists across runs.
For production, consider Dockerizing or adding error handling.
This setup runs entirely locally for privacy.

If you encounter issues, check the console output or provide more details.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.continue/mcpServers		.continue/mcpServers
.gradio/flagged		.gradio/flagged
.vscode		.vscode
assets		assets
chroma_db_2024_07_25-Whitepaper-UX-UI-Design-Essentials		chroma_db_2024_07_25-Whitepaper-UX-UI-Design-Essentials
chroma_db_Accessibility_Web_WCAG2_2_easy_EN_2025_revilla_carreras		chroma_db_Accessibility_Web_WCAG2_2_easy_EN_2025_revilla_carreras
chroma_db_The_basics_of_user_experience_design		chroma_db_The_basics_of_user_experience_design
chroma_db_zivarri-mcp-memory		chroma_db_zivarri-mcp-memory
resources		resources
.DS_Store		.DS_Store
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
chatbot.py		chatbot.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RAG Chatbot

Features

Prerequisites

Installation

Running the Application

Usage

Troubleshooting

Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

RAG Chatbot

Features

Prerequisites

Installation

Running the Application

Usage

Troubleshooting

Notes

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages