A local Retrieval-Augmented Generation (RAG) system that allows users to upload PDF documents and ask questions using a locally running Large Language Model (LLM) via Ollama.
This project demonstrates how to build a production-style GenAI document assistant without using paid APIs.
- Upload and index PDF documents
- Generate embeddings using Ollama
- Vector storage using FAISS
- Semantic search over documents
- Question answering with citations
- Filename-based document filtering
- Local LLM inference
- REST API with FastAPI
- GitHub Actions CI
- Ruff linting
- Pytest testing
Client → FastAPI → FAISS → Ollama (Embeddings + LLM)
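The retrieval step of this pipeline can be sketched in a few lines. Here NumPy brute-force inner-product search over normalized vectors stands in for FAISS's `IndexFlatIP`, and the toy vectors stand in for embeddings returned by Ollama; this is an illustrative sketch, not the service code.

```python
# Minimal sketch of semantic retrieval: normalize vectors, score by inner
# product (== cosine similarity after normalization), return the top-k hits.
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, k=5):
    """Return indices of the k chunks most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    scores = m @ q                       # one similarity score per chunk
    return np.argsort(scores)[::-1][:k]  # highest scores first

# Toy 2-D "embeddings": chunk 1 points the same way as the query.
chunks = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.6, 0.8])
print(top_k_chunks(query, chunks, k=2))  # chunk 1 ranks first
```

In the real service the query is embedded by Ollama, the top-k chunk texts are stuffed into the LLM prompt, and the chunk metadata (filename, page number) becomes the citations in the answer.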
- Python 3.10+
- FastAPI
- Ollama
- FAISS
- NumPy
- PyPDF
- Requests
- Pytest
- Ruff
Download and install Ollama from https://ollama.com
Start Ollama:
ollama serve
Pull models:
ollama pull qwen2.5:1.5b
ollama pull nomic-embed-text
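Before starting the API, it can help to confirm Ollama is up and both models are pulled. The sketch below queries Ollama's `/api/tags` endpoint (which lists installed models) on its default port 11434; the helper name `missing_models` is ours, not part of this repo.

```python
# Sanity check: is Ollama running, and are both required models installed?
import requests

REQUIRED = ("qwen2.5:1.5b", "nomic-embed-text")

def missing_models(installed, required=REQUIRED):
    """Pure helper: which required models are absent from the installed list."""
    return [r for r in required if not any(n.startswith(r) for n in installed)]

def check_ollama(base_url="http://localhost:11434"):
    """Ask the Ollama server for its installed models and diff against REQUIRED."""
    resp = requests.get(f"{base_url}/api/tags", timeout=5)
    resp.raise_for_status()
    names = [m["name"] for m in resp.json().get("models", [])]
    return missing_models(names)

if __name__ == "__main__":
    missing = check_ollama()
    print("all models present" if not missing else f"missing: {missing}")
```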
git clone https://github.com/nivith1029/chatbot.git
cd chatbot
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
uvicorn rag_service:app --reload --port 8001
Server URL: http://127.0.0.1:8001
curl -X POST "http://127.0.0.1:8001/rag/ingest" \
  -F "file=@document.pdf"
curl -X POST "http://127.0.0.1:8001/rag/query" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the deadline?",
    "filename": "document.pdf",
    "top_k": 5
  }'
{
  "answer": "- Deadline: April 15, 2025 (Source 1)",
  "sources": [
    { "filename": "document.pdf", "page_num": 1 }
  ],
  "latency_ms": 28000,
  "model": "qwen2.5:1.5b"
}
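The same ingest and query calls can be issued from Python instead of curl. This sketch assumes the server from the steps above is running on port 8001; the endpoint paths and JSON fields mirror the curl examples, nothing more.

```python
# Small Python client for the two endpoints shown above.
import requests

BASE = "http://127.0.0.1:8001"

def build_query_payload(question, filename=None, top_k=5):
    """Assemble the JSON body expected by /rag/query."""
    payload = {"question": question, "top_k": top_k}
    if filename:
        payload["filename"] = filename  # optional filename-based filter
    return payload

def ingest(path):
    """Upload a PDF for indexing."""
    with open(path, "rb") as f:
        return requests.post(f"{BASE}/rag/ingest", files={"file": f}).json()

def query(question, filename=None, top_k=5):
    """Ask a question over the indexed documents."""
    body = build_query_payload(question, filename, top_k)
    return requests.post(f"{BASE}/rag/query", json=body).json()

if __name__ == "__main__":
    ingest("document.pdf")
    result = query("What is the deadline?", filename="document.pdf")
    print(result["answer"])
```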
Run tests:
pytest
Run lint:
ruff check .
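A representative test in the style pytest discovers: it exercises a hypothetical pure `chunk_text` helper that splits documents into overlapping chunks before embedding (the actual helper names in rag_service.py may differ).

```python
# Example pytest-style test against a hypothetical chunking helper.
def chunk_text(text, size=100, overlap=20):
    """Split text into overlapping character chunks of at most `size` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def test_chunks_cover_whole_text():
    text = "x" * 250
    chunks = chunk_text(text, size=100, overlap=20)
    # Dropping each chunk's 20-char overlap with its successor reconstructs
    # the original text exactly.
    assert "".join(c[:80] for c in chunks[:-1]) + chunks[-1] == text
```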
- HR document assistants
- Legal document search
- Compliance monitoring
- Internal knowledge base
- Contract analysis
- Resume screening
- Research assistants
- Web UI
- Docker support
- Cloud deployment
- Authentication
- Caching
- Multi-user support
Nivith Avula
GitHub: https://github.com/nivith1029
Focus Areas:
- Generative AI
- RAG Systems
- LLM Engineering
- Backend APIs
- Cloud & DevOps
MIT License