Advanced document analysis system using OpenAI Agents SDK with autonomous multi-agent orchestration, RAG pipeline, and interactive UI.
This system implements a sophisticated multi-agent architecture using OpenAI Agents SDK (v0.6.1) for intelligent PDF document analysis. It features autonomous intent detection, retrieval-augmented generation (RAG), specialized reasoning agents, and an interactive Streamlit interface with citation highlighting.
User Query → Planner Agent (Intent Detection)
↓
Appropriate Agent Chain
↓
RAG Agent (Retrieval + Generation)
↓
Specialized Reasoning Agent
↓
Response with Cited Evidence
- Planner Agent - Autonomous orchestrator using handoffs
- RAG Agent - Retrieval-augmented generation with FAISS
- Summarization Agent - Full-document summarization
- Comparator Agent - Cross-document comparison analysis
- Timeline Builder Agent - Chronological event organization
- Aggregator Agent - Multi-source information synthesis
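Each specialist is an ordinary SDK Agent with its own instructions. As a rough sketch of what one of them might look like (the instruction text below is illustrative, not the project's actual prompt from agents/specialized_agents.py):

```python
from agents import Agent

# Hedged sketch: the real prompt lives in agents/specialized_agents.py.
comparator_agent = Agent(
    name="Comparator Agent",
    instructions=(
        "Compare the passages you are given across documents. "
        "Call out methodological and factual differences and cite the "
        "source document for every claim."
    ),
)
```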
| Component | Technology |
|---|---|
| Agent Framework | OpenAI Agents SDK v0.6.1 |
| LLM | OpenAI (provider-agnostic) |
| Vector Database | FAISS (IndexFlatIP) |
| Embeddings | sentence-transformers (384-dim) |
| PDF Processing | pdfplumber + PyMuPDF |
| UI Framework | Streamlit |
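Under the hood, retrieval pairs a 384-dimensional sentence-transformers encoder with a FAISS IndexFlatIP index; with L2-normalized embeddings, inner product equals cosine similarity. A minimal sketch of that setup — the all-MiniLM-L6-v2 model name is an assumption based on the 384-dim figure, and the project's actual wrapper lives in utils/vector_store.py:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumption: all-MiniLM-L6-v2 is one common 384-dim encoder; the project may use another.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

chunks = ["First chunk of a PDF...", "Second chunk of a PDF..."]
vectors = encoder.encode(chunks, normalize_embeddings=True)  # unit vectors, so IP == cosine

index = faiss.IndexFlatIP(vectors.shape[1])  # 384 dimensions
index.add(np.asarray(vectors, dtype="float32"))

query = encoder.encode(["main finding?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), 2)
print(ids[0], scores[0])  # ranked chunk indices with cosine scores
```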
✅ Autonomous Intent Detection - No manual mode selection
✅ RAG Pipeline - Semantic search with grounded responses
✅ Multi-Document Analysis - Cross-document retrieval
✅ Citation Tracking - Every answer includes ranked evidence
✅ Interactive PDF Viewer - Click-to-navigate with highlighting
✅ Agent Orchestration - Dynamic agent chaining
✅ Tool Calling - Agents call Python functions (@function_tool)
✅ Autonomous Handoffs - LLM-driven delegation (no manual routing)
✅ Global State Management - Tools access shared Vector Store
✅ Evidence Highlighting - Yellow highlights on cited passages (see the sketch after this list)
✅ Execution Tracing - Transparent agent workflow via Runner logs
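Evidence highlighting maps cited passages back onto the PDF; PyMuPDF's text search and highlight annotations are one way to do that. A sketch of the idea — the helper below is hypothetical, and the actual logic sits in the Streamlit viewer code:

```python
import fitz  # PyMuPDF

def highlight_evidence(pdf_path: str, page_number: int, cited_text: str, out_path: str) -> None:
    """Hypothetical helper: highlight every occurrence of a cited passage on one page."""
    doc = fitz.open(pdf_path)
    page = doc[page_number]
    for rect in page.search_for(cited_text):    # bounding boxes of each match
        page.add_highlight_annot(rect)          # highlight annotation (yellow by default)
    doc.save(out_path)
    doc.close()
```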
- Python 3.9+
- OpenAI API key or a Gemini API key
# Clone repository
git clone <repository-url>
cd pdf_agent_system
# Install dependencies
pip install -r requirements.txt

# Copy environment template
cp .env.example .env
# Edit .env and add your API key
OPENAI_API_KEY=your_key_here

streamlit run app.py

Access the application at: http://localhost:8501
pdf_agent_system/
├── agents/
│ ├── __init__.py
│ ├── tools.py # Standalone tools for SDK agents
│ ├── rag_agent.py # RAG Agent definition
│ ├── summarization_agent.py # Summarization Agent definition
│ ├── specialized_agents.py # Reasoning Agents (Comparator, Timeline, etc.)
│ └── planner_agent.py # Orchestrator with Handoffs
├── utils/
│ ├── __init__.py
│ ├── state.py # Singleton for tool access
│ ├── pdf_processor.py # PDF extraction + chunking
│ └── vector_store.py # FAISS vector database
├── config/
│ ├── __init__.py
│ └── settings.py # Configuration
├── app.py # Streamlit UI
├── requirements.txt # Dependencies
├── .env.example # Configuration template
├── .gitignore # Git ignore rules
└── README.md # This file
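SDK tools are plain module-level functions, so they reach the loaded index through the shared singleton in utils/state.py. A minimal sketch of that pattern (attribute names are assumptions; the real module may differ):

```python
# utils/state.py (sketch) -- shared state that @function_tool functions read at call time
class GlobalState:
    def __init__(self):
        self.vector_store = None   # set once the FAISS index is built from uploaded PDFs
        self.documents = {}        # doc_id -> parsed PDF metadata (assumed shape)

# Module-level singleton imported elsewhere as global_state
global_state = GlobalState()
```

agents/tools.py would import global_state so that tools like retrieve_documents (shown below) can query the vector store with nothing more than the query string.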
We use the native Agent and Runner primitives:
from agents import Agent, Runner
# Agents invoke tools and hand off to others
result = Runner.run_sync(planner_agent, user_query)
print(result.final_output)

Tools are defined using the @function_tool decorator and access shared state:
@function_tool
def retrieve_documents(query: str):
"""Retrieve relevant chunks"""
return global_state.vector_store.search(query)The Planner Agent uses instructions and the handoffs list to route dynamically:
planner_agent = Agent(
name="Planner",
instructions="Route queries to the correct specialist...",
handoffs=[rag_agent, summarization_agent, comparator_agent]
)User: "What are the main findings in the research paper?"
System Flow:
- Planner delegates to RAG Agent
- RAG Agent calls 'retrieve_documents' tool
- Agent generates answer with citations
Output:
Answer: "The research identifies three main findings: [1] X, [2] Y, [3] Z"
User: "Compare the methodologies across these papers"
System Flow:
- Planner delegates to RAG Agent
- RAG Agent retrieves methodology sections
- RAG Agent hands off to Comparator Agent (see the sketch below)
- Comparator Agent analyzes differences
Output:
Structured comparison with specific examples
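For that second flow to work, the RAG Agent itself needs the Comparator Agent in its handoffs list, not just the Planner. A hedged sketch of that wiring, reusing the names from this README (the instruction text and tool list are illustrative):

```python
from agents import Agent

# Sketch only: mirrors the flow above, where the RAG Agent can escalate to the Comparator.
# retrieve_documents and comparator_agent are the objects sketched earlier in this README.
rag_agent = Agent(
    name="RAG Agent",
    instructions=(
        "Use retrieve_documents to fetch relevant chunks, answer with ranked citations, "
        "and hand off to the Comparator Agent when the user asks for a cross-document comparison."
    ),
    tools=[retrieve_documents],
    handoffs=[comparator_agent],
)
```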
# Required
OPENAI_API_KEY=sk-your-key-here
# Optional (defaults shown)
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K_RETRIEVAL=5

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o-mini | $0.150 | $0.600 |
| gpt-4o | $2.50 | $10.00 |
- Per Query: ~2,000 input + ~500 output tokens ≈ $0.0006 on gpt-4o-mini (2,000 × $0.15/1M + 500 × $0.60/1M)
- Per Session: ~10 queries = ~$0.006
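A minimal sketch of how config/settings.py might read the environment variables above — assuming python-dotenv, which is a common choice; the actual module may differ:

```python
# config/settings.py (sketch) -- names match .env.example; python-dotenv is an assumption
import os
from dotenv import load_dotenv

load_dotenv()  # copy values from .env into the process environment

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")            # required
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))       # optional, defaults mirror this README
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))
TOP_K_RETRIEVAL = int(os.getenv("TOP_K_RETRIEVAL", "5"))
```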
- OpenAI - Agents SDK framework
- Facebook Research - FAISS vector search
- Sentence Transformers - Embedding models
- Streamlit - Interactive UI framework
✨ Built with OpenAI Agents SDK v0.6.1 | Multi-Agent Architecture ✨