A FastAPI backend for document processing and question answering using Retrieval-Augmented Generation (RAG). Built for the HackRx 6.0 Hackathon.
RAG Query System enables accurate question answering over documents through retrieval, query decomposition, and reranking. It supports hybrid search and equips the LLM with web search and language translation tools.
- Hybrid Search: Combines semantic and keyword search for improved accuracy.
- Query Decomposition: Splits complex questions into simpler sub-queries for better retrieval.
- Reranking: Selects the most relevant document chunks for precise answers.
- Web Browser & Translation: Integrates web search and language translation tools for the LLM.
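The answering flow can be pictured roughly as in the sketch below. This is an illustrative outline only; the function names and bodies are placeholders, not this repository's actual modules.

```python
# Illustrative outline of the RAG flow above; the helpers are stubs,
# not this repository's implementation.
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float


def decompose(question: str) -> list[str]:
    # A lightweight LLM would split a complex question into simpler sub-queries.
    return [question]  # stub: no real decomposition


def hybrid_search(query: str, top_k: int = 10) -> list[Chunk]:
    # Dense (semantic) and sparse (keyword) results would be merged here.
    return [Chunk(text=f"chunk matching '{query}'", score=0.5)][:top_k]


def rerank(query: str, chunks: list[Chunk], top_n: int = 3) -> list[Chunk]:
    # A reranking model would reorder the candidates by relevance to the query.
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:top_n]


def answer(question: str) -> str:
    context: list[Chunk] = []
    for sub_query in decompose(question):
        context += rerank(sub_query, hybrid_search(sub_query))
    # The main LLM would generate the final answer from `context`; stubbed here.
    return f"answer grounded in {len(context)} retrieved chunk(s)"


if __name__ == "__main__":
    print(answer("What is the waiting period for pre-existing conditions?"))
```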
- FastAPI (backend API)
- unstructured.io & PyMuPDF4LLM (document parsing and chunking)
- PostgreSQL + SQLAlchemy (database via Supabase)
- Pinecone (vector database with hosted embedding and reranking models, provisioned through the GCP Marketplace)
- Google Gemini (LLM)
- Google Cloud Platform (hosting; using GCP Free Trial)
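For orientation, a FastAPI entry point exposing the health route might look like the minimal sketch below; the actual `app/main.py`, its routers, and module layout will differ.

```python
# Minimal sketch of a FastAPI entry point; not the project's actual app/main.py.
from fastapi import FastAPI

app = FastAPI(title="RAG Query System")


@app.get("/api/v1/health")
def health() -> dict[str, str]:
    # Simple liveness probe for deployments and uptime checks.
    return {"status": "ok"}
```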
The following system dependencies are installed via the Dockerfile:
`curl`, `ca-certificates`, `libgl1`, `libmagic-dev`, `libglib2.0-0`, `poppler-utils`, `tesseract-ocr`, `tesseract-ocr-eng`, `tesseract-ocr-mal`, `libreoffice`, `wget`, and `pandoc` (installed from the official release).
- Python 3.12
- uv (dependency management)
- Access to the required API keys (see `.env.example`)
- Install `uv`: follow the uv installation guide.
- Clone the repository: `git clone https://github.com/heshinth/rag-query-system`, then `cd rag-query-system`.
- Install dependencies: `uv sync`
- Configure environment variables: copy `.env.example` to `.env` and fill in your credentials.
- Run the application: `fastapi run ./app/main.py`
- `GET /api/v1/health` — Health check
- `POST /api/v1/hackrx/run` — Process documents and answer questions
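A quick smoke test against a local instance could look like the following; the request payload shape and bearer-token header are assumptions based on the HackRx submission format, so adjust them to match your deployment.

```python
# Hypothetical client-side check of both endpoints; the payload fields and
# auth header are assumptions, not taken from this repository.
import httpx

BASE_URL = "http://localhost:8000"  # default FastAPI port; adjust as needed

# Health check
print(httpx.get(f"{BASE_URL}/api/v1/health").json())

# Document Q&A (assumed request format)
payload = {
    "documents": "https://example.com/sample-policy.pdf",  # assumed field name
    "questions": ["What is the grace period for premium payment?"],
}
response = httpx.post(
    f"{BASE_URL}/api/v1/hackrx/run",
    json=payload,
    headers={"Authorization": "Bearer <your-api-token>"},  # assumed auth scheme
    timeout=120,
)
print(response.json())
```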
| Model | Provider | Purpose |
|---|---|---|
| Gemini 2.5 Flash | Google AI Studio | Main LLM |
| Gemini 2.0 Flash Lite | Google AI Studio | Query decomposition |
| Cohere Rerank 3.5 | Pinecone | Reranking |
| llama-text-embed-v2 | Pinecone | Dense embedding & semantic search |
| pinecone-sparse-english-v0 | Pinecone | Sparse embedding & lexical search |
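As a rough sketch, the Pinecone-hosted models above can be called through the Pinecone Python SDK's inference client (a recent SDK, v5+, is assumed; the index name, parameters, and the exact calls this app makes may differ).

```python
# Rough sketch of calling the Pinecone-hosted models listed above; parameters
# and usage here are illustrative assumptions, not this project's code.
import os

from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
query = "What does the policy say about maternity cover?"

# Dense embedding for semantic search.
dense = pc.inference.embed(
    model="llama-text-embed-v2",
    inputs=[query],
    parameters={"input_type": "query"},
)

# Sparse embedding for lexical (keyword) search.
sparse = pc.inference.embed(
    model="pinecone-sparse-english-v0",
    inputs=[query],
    parameters={"input_type": "query"},
)

# Rerank candidate chunks with Cohere Rerank 3.5.
candidates = ["chunk one ...", "chunk two ..."]
reranked = pc.inference.rerank(
    model="cohere-rerank-3.5",
    query=query,
    documents=candidates,
    top_n=2,
)
print(reranked)
```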
This project is licensed under the MIT License - see the LICENSE file for details.