A production-ready document Q&A system using cooperating AI agents, Gemini AI, LangGraph orchestration, and Typesense hybrid search with persistent memory and intelligent context engineering.
- Multi-Agent System: 4 specialized AI agents working together using LangGraph
- Document Upload: Support for PDF, DOCX, and text files
- Hybrid Search: Combines semantic (vector) and keyword search using Typesense (see the sketch after this list)
- Intelligent Context Engineering: Token-aware context management and optimization
- Persistent Memory: Conversation history stored in Typesense for scalability
- Session Management: Multi-user support with session-based conversations
- Agent Transparency: Detailed processing steps and agent analysis in responses
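Hybrid search sends a single Typesense request that blends a keyword query with a nearest-neighbour vector query. A minimal sketch, assuming a `documents` collection with `content` and `embedding` fields and a query embedding already produced with Gemini (the collection and field names are illustrative, not necessarily the ones used in this repo):

```python
import typesense

# Connect to the local Typesense node started via docker compose
client = typesense.Client({
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",  # replace with your TYPESENSE_API_KEY
    "connection_timeout_seconds": 5,
})

def hybrid_search(question: str, query_embedding: list, k: int = 10) -> list:
    # Combine keyword relevance (q / query_by) with vector similarity (vector_query)
    vector_str = ",".join(str(x) for x in query_embedding)
    response = client.multi_search.perform({
        "searches": [{
            "collection": "documents",
            "q": question,
            "query_by": "content",
            "vector_query": f"embedding:([{vector_str}], k:{k})",
        }]
    }, {})
    return [hit["document"] for hit in response["results"][0]["hits"]]
```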
- Query Analyzer Agent: Analyzes user intent, extracts key concepts, determines query complexity
- Search Strategy Agent: Chooses optimal search approach based on query analysis
- Document Retrieval Agent: Executes searches using determined strategy
- Answer Synthesis Agent: Combines information and generates comprehensive responses
Query → Analyzer → Strategy → Retrieval → Synthesis → Response
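A minimal sketch of how this pipeline can be expressed as a LangGraph `StateGraph`; the state fields and node functions below are illustrative stand-ins for the real agents (each would call Gemini internally):

```python
from typing import List, TypedDict
from langgraph.graph import StateGraph, END

class QAState(TypedDict, total=False):
    question: str
    intent: str
    key_concepts: List[str]
    search_strategy: str
    documents: List[str]
    answer: str

def analyze_query(state: QAState) -> dict:
    # Query Analyzer Agent: would prompt Gemini to classify intent and extract concepts
    return {"intent": "definition", "key_concepts": [state["question"]]}

def choose_strategy(state: QAState) -> dict:
    # Search Strategy Agent: pick a search mode based on the analysis
    return {"search_strategy": "semantic_focused" if state["intent"] == "definition" else "hybrid"}

def retrieve_documents(state: QAState) -> dict:
    # Document Retrieval Agent: would run the Typesense hybrid search here
    return {"documents": ["...retrieved chunks..."]}

def synthesize_answer(state: QAState) -> dict:
    # Answer Synthesis Agent: would prompt Gemini with the retrieved context
    return {"answer": f"Answer built from {len(state['documents'])} chunks"}

graph = StateGraph(QAState)
graph.add_node("query_analyzer", analyze_query)
graph.add_node("search_strategy", choose_strategy)
graph.add_node("document_retrieval", retrieve_documents)
graph.add_node("answer_synthesis", synthesize_answer)
graph.set_entry_point("query_analyzer")
graph.add_edge("query_analyzer", "search_strategy")
graph.add_edge("search_strategy", "document_retrieval")
graph.add_edge("document_retrieval", "answer_synthesis")
graph.add_edge("answer_synthesis", END)

qa_app = graph.compile()
print(qa_app.invoke({"question": "What is machine learning?"})["answer"])
```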
- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Start Typesense

  ```bash
  docker compose up typesense -d
  ```

- Environment Configuration

  ```bash
  cp .env.example .env
  # Edit .env with your Gemini API key
  ```

- Run Application

  ```bash
  python main.py
  ```

To run everything with Docker instead:

```bash
# Build the image
docker build -t smart-document-qa .

# Then run it
docker run -p 8000:8000 --env-file .env smart-document-qa
```

```bash
# Start both Typesense and the app together
docker compose up --build
```

Recommended: Use `docker compose up --build` for the easiest setup!
- `POST /upload` - Upload documents
- `POST /ask` - Ask questions about documents (returns enhanced response with agent analysis)
- `GET /sessions/{session_id}/history` - Get conversation history
- `DELETE /sessions/{session_id}` - Clear session
The `/ask` endpoint now returns detailed agent analysis:

```json
{
  "answer": "Comprehensive answer based on retrieved documents",
  "session_id": "user123",
  "processing_steps": ["query_analyzed", "strategy_determined", "documents_retrieved", "answer_synthesized"],
  "agent_analysis": {
    "intent": "definition",
    "key_concepts": ["artificial intelligence", "machine learning"],
    "search_strategy": "semantic_focused"
  }
}
```

```python
import requests
# Upload document
files = {'file': open('document.pdf', 'rb')}
response = requests.post('http://localhost:8000/upload', files=files)
# Ask question with enhanced response
question_data = {
    "question": "What is the main topic of the document?",
    "session_id": "user123"
}
response = requests.post('http://localhost:8000/ask', json=question_data)
result = response.json()
print(f"Answer: {result['answer']}")
print(f"Agent Analysis: {result['agent_analysis']}")
print(f"Processing Steps: {result['processing_steps']}")
```

The end-to-end request flow:

```mermaid
flowchart TD
A[User] --> B[FastAPI Server]
B --> C{Request Type}
C -->|Upload| D[Document Processing]
D --> E[Text Extraction]
E --> F[Gemini Embeddings]
F --> G[Typesense Storage]
C -->|Question| H[Multi-Agent System]
H --> I[Query Analyzer Agent]
I --> J[Search Strategy Agent]
J --> K[Document Retrieval Agent]
K --> L[Answer Synthesis Agent]
K --> M[Typesense Hybrid Search]
M --> N[Context Engineering]
N --> L
L --> O[Enhanced Response]
O --> P[Memory Storage]
P --> B
G --> M
style A fill:#e1f5fe
style B fill:#f3e5f5
style H fill:#fff3e0
style G fill:#e8f5e8
style O fill:#f0f4c3
```
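The upload path in the diagram (text extraction → Gemini embeddings → Typesense storage) can be approximated like this; a rough sketch assuming the `google-generativeai` SDK, a `documents` collection, and naive fixed-size chunking (all illustrative choices, not necessarily what the repo does):

```python
import google.generativeai as genai  # genai.configure(api_key=...) must be called first

def index_document(client, doc_id: str, text: str, chunk_size: int = 1000) -> None:
    # client: a typesense.Client instance (see the hybrid search sketch above).
    # Split extracted text into chunks, embed each chunk with Gemini,
    # and store chunk text + vector in Typesense for later hybrid search.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for n, chunk in enumerate(chunks):
        emb = genai.embed_content(
            model="models/text-embedding-004",
            content=chunk,
            task_type="retrieval_document",
        )["embedding"]
        client.collections["documents"].documents.create({
            "id": f"{doc_id}_{n}",
            "content": chunk,
            "embedding": emb,
        })
```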
- FastAPI: REST API framework
- LangGraph: Multi-agent orchestration and workflow management
- Gemini 2.5 Flash: LLM for intelligent reasoning and response generation
- Gemini Embeddings: Vector embeddings for semantic search
- Typesense: Hybrid search engine and persistent memory storage
- Context Engineering: Intelligent token management and optimization
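A sketch of the kind of token-aware packing that context engineering implies: keep the highest-ranked chunks that fit a fixed budget before prompting Gemini. The characters-per-token heuristic below is illustrative; the real implementation may count tokens exactly:

```python
def build_context(chunks: list, max_tokens: int = 4000, chars_per_token: int = 4) -> str:
    """Pack relevance-ranked chunks into an approximate token budget."""
    budget = max_tokens * chars_per_token  # rough token estimate via characters
    selected, used = [], 0
    for chunk in chunks:  # chunks assumed sorted by relevance, best first
        if used + len(chunk) > budget:
            break
        selected.append(chunk)
        used += len(chunk)
    return "\n\n".join(selected)
```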
- Intelligent Query Analysis: Understands user intent and query complexity
- Adaptive Search Strategies: Chooses optimal search approach per query type
- Context-Aware Responses: Maintains conversation context across multiple turns
- Scalable Memory: Persistent storage that survives application restarts (sketched below)
- Production Ready: Comprehensive error handling and clean architecture
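How persistent conversation memory could look on top of Typesense: each turn is stored as a document keyed by `session_id` and read back in time order. A minimal sketch; the `conversations` collection name and schema are illustrative, and the collection must be created with matching fields first:

```python
import time
import typesense

client = typesense.Client({
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",  # replace with your TYPESENSE_API_KEY
    "connection_timeout_seconds": 5,
})

def save_turn(session_id: str, question: str, answer: str) -> None:
    # Persist one conversation turn so history survives restarts
    client.collections["conversations"].documents.create({
        "session_id": session_id,
        "question": question,
        "answer": answer,
        "timestamp": int(time.time()),
    })

def load_history(session_id: str, limit: int = 10) -> list:
    # Fetch the most recent turns for a session, newest first
    hits = client.collections["conversations"].documents.search({
        "q": "*",
        "query_by": "question",
        "filter_by": f"session_id:={session_id}",
        "sort_by": "timestamp:desc",
        "per_page": limit,
    })["hits"]
    return [h["document"] for h in hits]
```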
See FUTURE_IMPLEMENTATION.md for planned monitoring, evaluation, and advanced learning features.