This project provides a robust Document Chatbot utilizing a FastAPI backend and a React (Vite) frontend. It leverages LangGraph for orchestrated agent workflows and RAGAS for automated evaluation of retrieval and generation quality.
- 📤 Multi-Document Upload: Upload multiple PDF documents (e.g., Apple, Tesla, Uber financial reports) for indexing.
- 💬 Intelligent Chat: Ask complex questions and receive context-aware answers citing specific documents.
- 🕵️ Agentic Workflow: Uses a graph of agents (Classifier, Retriever, Summarizer, Judge) to handle different query types.
- ⚖️ Automated Evaluation: Integrated RAGAS suite to audit response quality (faithfulness, relevancy, correctness) against ground truth.
- 📝 Judge Feedback: Real-time feedback from a "Judge" agent on the quality of the generated answer.
The system uses a sophisticated agent workflow:
- Router: Classifies the query (General vs. RAG vs. Summarization).
- Retriever: Fetches relevant chunks from ChromaDB.
- Generator: Synthesizes an answer from the retrieved context.
- Judge: Verifies the answer against the retrieved context before returning it to the user.
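As a rough illustration of the routing step, the snippet below sketches how a query might be classified into one of the three branches. The real classifier lives in `backend/app/services/workflow/agents.py` and is agent-based; this keyword heuristic and the function name `route_query` are illustrative assumptions only.

```python
# Simplified sketch of the Router step: pick one of the three workflow
# branches for a query. The actual project uses an LLM-based classifier;
# this keyword heuristic only illustrates the control flow.
def route_query(query: str) -> str:
    q = query.lower()
    if any(kw in q for kw in ("summarize", "summary", "overview")):
        return "summarization"
    if any(kw in q for kw in ("document", "report", "according to")):
        return "rag"
    return "general"
```

In the full workflow, the branch name returned here would select the next node in the graph (Retriever + Generator for "rag", the Summarizer for "summarization", a direct LLM reply for "general").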
```
Document-QA/
├── backend/                  # FastAPI Backend
│   ├── app/                  # Main application package
│   │   ├── api/              # API layer
│   │   │   └── routers/      # API route handlers
│   │   │       ├── chat.py        # Chat endpoints
│   │   │       ├── documents.py   # Document upload
│   │   │       └── evaluation.py  # RAGAS evaluation
│   │   ├── core/             # Core configuration
│   │   │   └── config.py     # Environment variables
│   │   ├── models/           # Data models
│   │   │   └── schemas.py    # Pydantic schemas
│   │   ├── services/         # Business logic
│   │   │   ├── workflow/     # LangGraph agents
│   │   │   │   ├── agents.py # Agent implementations
│   │   │   │   ├── graph.py  # Workflow graph
│   │   │   │   └── chat.py   # Chat service
│   │   │   ├── rag/          # RAG functionality
│   │   │   │   ├── indexing.py   # Document indexing
│   │   │   │   └── documents.py  # Document service
│   │   │   └── evaluation/   # Evaluation
│   │   │       └── ragas.py  # RAGAS metrics
│   │   ├── main.py           # FastAPI app entry
│   │   └── utils.py          # Shared utilities
│   ├── data/                 # Data storage
│   │   ├── chroma_db/        # Vector database
│   │   ├── doc_store/        # Document store
│   │   └── outputs/          # Generated files
│   ├── prompts/              # YAML prompt templates
│   │   ├── rag_core.yml      # Core RAG prompts
│   │   └── rag_eval.yml      # Evaluation prompts
│   ├── tests/                # Test files
│   └── pyproject.toml        # Python dependencies
├── frontend/                 # React + Vite Frontend
│   ├── components/           # React components
│   ├── services/             # API client
│   ├── App.tsx               # Main app component
│   └── package.json          # Node dependencies
├── docker-compose.yml        # Docker orchestration
└── .env                      # Environment variables
```
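A minimal `docker-compose.yml` consistent with this layout might look like the sketch below. The service names, build contexts, and volume paths are assumptions; only the ports (8000 for the backend, 5173 for the frontend) come from this README.

```yaml
# Sketch only: service names, build contexts, and volumes are assumptions,
# not the project's actual compose file.
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    env_file:
      - .env
    volumes:
      - ./backend/data:/app/data   # persist ChromaDB and the document store
  frontend:
    build: ./frontend
    ports:
      - "5173:5173"
    depends_on:
      - backend
```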
Ensure the following are installed:
- Docker & Docker Compose
- Python 3.11+ (for local dev)
- Node.js 18+ (for local dev)
- Clone the repository

  ```bash
  git clone <repo-url>
  cd document-qa
  ```

- Environment Setup

  - Create a `.env` file in the root directory.
  - Add your API keys:

    ```env
    OPENAI_API_KEY=your_key_here
    OPENAI_MODEL_NAME=gpt-4o
    EMBEDDING_MODEL=text-embedding-3-small
    HF_TOKEN=your_huggingface_token_if_needed
    ```

- Run the Application

  ```bash
  docker-compose up --build
  ```

- Access the App

  - Frontend: http://localhost:5173
  - Backend API: http://localhost:8000/docs
Backend:

```bash
cd backend
uv sync
uv run uvicorn app.main:app --reload --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

The project includes a built-in evaluation tab to run RAGAS metrics.
- Navigate to the Evaluations tab in the UI.
- Click "Start Document Audit".
- View detailed metrics:
- Faithfulness: Is the answer derived from context?
- Answer Correctness: Does it match ground truth?
- Context Recall: Did we retrieve all necessary info?
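Conceptually, faithfulness measures what fraction of the answer's claims are supported by the retrieved context. RAGAS computes this with LLM-based claim extraction and verification; the word-overlap toy below (function name `toy_faithfulness` is made up) only illustrates the idea.

```python
# Toy faithfulness score: fraction of answer sentences whose content words
# all appear in the retrieved context. RAGAS itself uses LLM-based claim
# verification; this overlap heuristic is a rough stand-in.
def toy_faithfulness(answer: str, context: str) -> float:
    ctx_words = set(context.lower().split())
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    supported = sum(
        1 for s in sentences
        if set(s.lower().split()) <= ctx_words
    )
    return supported / len(sentences)
```

Answer correctness and context recall follow the same pattern but compare against the ground-truth answer rather than the retrieved context.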
