A production-ready document Q&A system using cooperating AI agents, Gemini AI, LangGraph orchestration, and Typesense hybrid search with persistent memory and intelligent context engineering.
- Multi-Agent System: 4 specialized AI agents working together using LangGraph
- Document Upload: Support for PDF, DOCX, and text files
- Hybrid Search: Combines semantic (vector) and keyword search using Typesense (see the sketch after this list)
- Intelligent Context Engineering: Token-aware context management and optimization
- Persistent Memory: Conversation history stored in Typesense for scalability
- Session Management: Multi-user support with session-based conversations
- Agent Transparency: Detailed processing steps and agent analysis in responses
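Hybrid search sends a single Typesense request that blends a keyword query with a nearest-neighbour vector query. A minimal sketch, assuming a `documents` collection with `content` and `embedding` fields and a query embedding already produced with Gemini (the collection and field names are illustrative, not necessarily the ones used in this repo):

```python
import typesense

# Connect to the local Typesense node started via docker compose
client = typesense.Client({
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",  # replace with your TYPESENSE_API_KEY
    "connection_timeout_seconds": 5,
})

def hybrid_search(question: str, query_embedding: list, k: int = 10) -> list:
    # Combine keyword relevance (q / query_by) with vector similarity (vector_query)
    vector_str = ",".join(str(x) for x in query_embedding)
    response = client.multi_search.perform({
        "searches": [{
            "collection": "documents",
            "q": question,
            "query_by": "content",
            "vector_query": f"embedding:([{vector_str}], k:{k})",
        }]
    }, {})
    return [hit["document"] for hit in response["results"][0]["hits"]]
```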
- Query Analyzer Agent: Analyzes user intent, extracts key concepts, determines query complexity
- Search Strategy Agent: Chooses optimal search approach based on query analysis
- Document Retrieval Agent: Executes searches using determined strategy
- Answer Synthesis Agent: Combines information and generates comprehensive responses
Query → Analyzer → Strategy → Retrieval → Synthesis → Response
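A minimal sketch of how this pipeline can be expressed as a LangGraph `StateGraph`; the state fields and node functions below are illustrative stand-ins for the real agents (each would call Gemini internally):

```python
from typing import List, TypedDict
from langgraph.graph import StateGraph, END

class QAState(TypedDict, total=False):
    question: str
    intent: str
    key_concepts: List[str]
    search_strategy: str
    documents: List[str]
    answer: str

def analyze_query(state: QAState) -> dict:
    # Query Analyzer Agent: would prompt Gemini to classify intent and extract concepts
    return {"intent": "definition", "key_concepts": [state["question"]]}

def choose_strategy(state: QAState) -> dict:
    # Search Strategy Agent: pick a search mode based on the analysis
    return {"search_strategy": "semantic_focused" if state["intent"] == "definition" else "hybrid"}

def retrieve_documents(state: QAState) -> dict:
    # Document Retrieval Agent: would run the Typesense hybrid search here
    return {"documents": ["...retrieved chunks..."]}

def synthesize_answer(state: QAState) -> dict:
    # Answer Synthesis Agent: would prompt Gemini with the retrieved context
    return {"answer": f"Answer built from {len(state['documents'])} chunks"}

graph = StateGraph(QAState)
graph.add_node("query_analyzer", analyze_query)
graph.add_node("search_strategy", choose_strategy)
graph.add_node("document_retrieval", retrieve_documents)
graph.add_node("answer_synthesis", synthesize_answer)
graph.set_entry_point("query_analyzer")
graph.add_edge("query_analyzer", "search_strategy")
graph.add_edge("search_strategy", "document_retrieval")
graph.add_edge("document_retrieval", "answer_synthesis")
graph.add_edge("answer_synthesis", END)

qa_app = graph.compile()
print(qa_app.invoke({"question": "What is machine learning?"})["answer"])
```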
- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Start Typesense

  ```bash
  docker compose up typesense -d
  ```

- Environment Configuration

  ```bash
  cp .env.example .env
  # Edit .env with your Gemini API key
  ```

- Run Application

  ```bash
  python main.py
  ```

To run everything with Docker instead:

```bash
# Build the image
docker build -t smart-document-qa .

# Then run it
docker run -p 8000:8000 --env-file .env smart-document-qa
```

```bash
# Start both Typesense and the app together
docker compose up --build
```

Recommended: Use `docker compose up --build` for the easiest setup!
- `POST /upload` - Upload documents
- `POST /ask` - Ask questions about documents (returns enhanced response with agent analysis)
- `GET /sessions/{session_id}/history` - Get conversation history
- `DELETE /sessions/{session_id}` - Clear session
The `/ask` endpoint now returns detailed agent analysis:

```json
{
  "answer": "Comprehensive answer based on retrieved documents",
  "session_id": "user123",
  "processing_steps": ["query_analyzed", "strategy_determined", "documents_retrieved", "answer_synthesized"],
  "agent_analysis": {
    "intent": "definition",
    "key_concepts": ["artificial intelligence", "machine learning"],
    "search_strategy": "semantic_focused"
  }
}
```

```python
import requests
# Upload document
files = {'file': open('document.pdf', 'rb')}
response = requests.post('http://localhost:8000/upload', files=files)
# Ask question with enhanced response
question_data = {
    "question": "What is the main topic of the document?",
    "session_id": "user123"
}
response = requests.post('http://localhost:8000/ask', json=question_data)
result = response.json()
print(f"Answer: {result['answer']}")
print(f"Agent Analysis: {result['agent_analysis']}")
print(f"Processing Steps: {result['processing_steps']}")
```

The end-to-end request flow:

```mermaid
flowchart TD
A[User] --> B[FastAPI Server]
B --> C{Request Type}
C -->|Upload| D[Document Processing]
D --> E[Text Extraction]
E --> F[Gemini Embeddings]
F --> G[Typesense Storage]
C -->|Question| H[Multi-Agent System]
H --> I[Query Analyzer Agent]
I --> J[Search Strategy Agent]
J --> K[Document Retrieval Agent]
K --> L[Answer Synthesis Agent]
K --> M[Typesense Hybrid Search]
M --> N[Context Engineering]
N --> L
L --> O[Enhanced Response]
O --> P[Memory Storage]
P --> B
G --> M
style A fill:#e1f5fe
style B fill:#f3e5f5
style H fill:#fff3e0
style G fill:#e8f5e8
style O fill:#f0f4c3
```
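The upload path in the diagram (text extraction → Gemini embeddings → Typesense storage) can be approximated like this; a rough sketch assuming the `google-generativeai` SDK, a `documents` collection, and naive fixed-size chunking (all illustrative choices, not necessarily what the repo does):

```python
import google.generativeai as genai  # genai.configure(api_key=...) must be called first

def index_document(client, doc_id: str, text: str, chunk_size: int = 1000) -> None:
    # client: a typesense.Client instance (see the hybrid search sketch above).
    # Split extracted text into chunks, embed each chunk with Gemini,
    # and store chunk text + vector in Typesense for later hybrid search.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for n, chunk in enumerate(chunks):
        emb = genai.embed_content(
            model="models/text-embedding-004",
            content=chunk,
            task_type="retrieval_document",
        )["embedding"]
        client.collections["documents"].documents.create({
            "id": f"{doc_id}_{n}",
            "content": chunk,
            "embedding": emb,
        })
```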
- FastAPI: REST API framework
- LangGraph: Multi-agent orchestration and workflow management
- Gemini 2.5 Flash: LLM for intelligent reasoning and response generation
- Gemini Embeddings: Vector embeddings for semantic search
- Typesense: Hybrid search engine and persistent memory storage
- Context Engineering: Intelligent token management and optimization
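A sketch of the kind of token-aware packing that context engineering implies: keep the highest-ranked chunks that fit a fixed budget before prompting Gemini. The characters-per-token heuristic below is illustrative; the real implementation may count tokens exactly:

```python
def build_context(chunks: list, max_tokens: int = 4000, chars_per_token: int = 4) -> str:
    """Pack relevance-ranked chunks into an approximate token budget."""
    budget = max_tokens * chars_per_token  # rough token estimate via characters
    selected, used = [], 0
    for chunk in chunks:  # chunks assumed sorted by relevance, best first
        if used + len(chunk) > budget:
            break
        selected.append(chunk)
        used += len(chunk)
    return "\n\n".join(selected)
```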
- Intelligent Query Analysis: Understands user intent and query complexity
- Adaptive Search Strategies: Chooses optimal search approach per query type
- Context-Aware Responses: Maintains conversation context across multiple turns
- Scalable Memory: Persistent storage that survives application restarts (sketched below)
- Production Ready: Comprehensive error handling and clean architecture
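How persistent conversation memory could look on top of Typesense: each turn is stored as a document keyed by `session_id` and read back in time order. A minimal sketch; the `conversations` collection name and schema are illustrative, and the collection must be created with matching fields first:

```python
import time
import typesense

client = typesense.Client({
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",  # replace with your TYPESENSE_API_KEY
    "connection_timeout_seconds": 5,
})

def save_turn(session_id: str, question: str, answer: str) -> None:
    # Persist one conversation turn so history survives restarts
    client.collections["conversations"].documents.create({
        "session_id": session_id,
        "question": question,
        "answer": answer,
        "timestamp": int(time.time()),
    })

def load_history(session_id: str, limit: int = 10) -> list:
    # Fetch the most recent turns for a session, newest first
    hits = client.collections["conversations"].documents.search({
        "q": "*",
        "query_by": "question",
        "filter_by": f"session_id:={session_id}",
        "sort_by": "timestamp:desc",
        "per_page": limit,
    })["hits"]
    return [h["document"] for h in hits]
```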
See FUTURE_IMPLEMENTATION.md for planned monitoring, evaluation, and advanced learning features.