Developed by Luminary AI Solutions LLC
The Open Deep Research API Framework provides a powerful, modular, multi-agency foundation for building sophisticated AI-powered research systems. Instead of a single monolithic application, the framework lets you create distinct Agencies (e.g., Deep Research, Financial Analysis, Code Review), each orchestrating multiple specialized Large Language Model (LLM) Agents. Agencies leverage shared core Tools that LLMs can invoke (currently no tools are implemented) and Services that are invoked programmatically. Currently implemented are web search, advanced content scraping, chunking, and ranking, which combine to generate comprehensive, cited reports, often streamed over WebSockets.
Build Your Own Research AI! This framework is designed for extension. Easily add new agencies, agents, or services to tackle diverse research domains.
- Highly Modular Design: Built for extension! Easily add new specialized research Agencies (e.g., for finance, legal) in `app/agencies/` or enhance existing ones with new Agents.
- Multi-Agency Architecture: Organizes research tasks using specialized Agencies. Each agency runs independently with its own set of agents and orchestration logic.
- Shareable Core Tools and Services: Common tasks like web search (`app/services/search`), content scraping (`app/services/scraper`), text chunking (`app/services/chunking`), PDF handling (`app/services/scraping_utils`), and ranking (`app/agencies/services/ranking.py`) are isolated in `app/services/` or `app/agencies/services/`, ready to be reused by any agency. An LLM tool directory also exists, but no tools are implemented yet.
- Agency-Specific Orchestration: Each agency defines its own workflow logic in its `orchestrator.py` file, allowing for diverse and complex research processes tailored to the domain.
- Clear Agent Roles: Agents within an agency typically have defined responsibilities (e.g., Planning, Summarizing, Writing, Refining), simplifying development, testing, and maintenance (`app/agencies/<agency_name>/agents.py`).
- Structured LLM Interaction: Leverages Pydantic (`app/core/schemas.py`, `app/agencies/<agency_name>/schemas.py`) to define clear input/output schemas for LLM agents, ensuring reliable and validated data flow.
- Integrated Web Search: Built-in support for Serper (`app/services/search/serper_service.py`), easily adaptable for other providers.
- Advanced Content Scraping: Uses Crawl4AI (`app/services/scraper/crawl4ai_scraper.py`) for robust web content extraction, including utilities for handling complex sites and formats like PDFs.
- Content Reranking: Employs reranking models (e.g., via the Together AI API) to prioritize the most relevant search results and text chunks for LLM context (`app/agencies/services/ranking.py`, used by `helpers.py`).
- Asynchronous Streaming: Provides real-time progress updates via WebSockets (see `websocket_guide.md`).
- Configurable: Shared provider credentials live in environment variables, while each agency/task can use its own model IDs via env-backed config.
- Optional State Persistence: Can track task status and store final results using Firestore (`firestore_schema.md`).
The framework promotes modularity through a clear separation of concerns:
- Agencies (`app/agencies/<agency_name>/`): Self-contained units focused on a specific research domain. Each agency typically contains:
  - `orchestrator.py`: Defines the main workflow and sequence of steps for the agency.
  - `agents.py`: Implements the specialized LLM-powered agents (e.g., Planner, Writer) used in the orchestration.
  - `schemas.py`: Defines Pydantic models for the agency's specific data structures and agent outputs.
  - `prompts.py` (optional): Stores prompts used by the agents.
  - `helpers.py` (optional): Contains utility functions specific to the agency's workflow, often combining calls to shared services.
- Services (`app/services/`, `app/agencies/services/`): Reusable, often non-LLM components providing core functionality such as search, scraping, chunking, and ranking. These are designed to be stateless and callable by any agency's orchestrator or helpers.
- Core (`app/core/`): Contains application-wide configuration (`config.py`), common Pydantic schemas (`schemas.py`), exception handling (`exceptions.py`), and the FastAPI application setup (`main.py`).
- Pydantic Schemas: Act as the "glue" defining the data contracts between agents, services, and the API layer, ensuring consistency and enabling validation.
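The separation above can be sketched as a minimal, hypothetical agency. Everything here (the `PlanStep`/`Report` types, `planner_agent`, `search_service`, `run_agency`) is illustrative, not the framework's actual API, and plain dataclasses stand in for the Pydantic models to keep the sketch dependency-free:

```python
import asyncio
from dataclasses import dataclass

# Stand-ins for an agency's schemas.py (the real framework uses Pydantic
# models; dataclasses keep this sketch self-contained).
@dataclass
class PlanStep:
    description: str

@dataclass
class Report:
    text: str

# agents.py: a real agent would call an LLM here; this stub just
# fabricates a plan so the orchestration shape is visible.
async def planner_agent(query: str) -> list[PlanStep]:
    return [PlanStep(description=f"search: {query}")]

# Shared, stateless helpers (search, scraping, ...) live in app/services/
# and are imported by any agency's orchestrator or helpers.
async def search_service(step: PlanStep) -> str:
    return f"results for '{step.description}'"

# orchestrator.py: the agency's workflow - plan, gather, assemble.
async def run_agency(query: str) -> Report:
    plan = await planner_agent(query)
    results = await asyncio.gather(*(search_service(s) for s in plan))
    return Report(text="\n".join(results))

if __name__ == "__main__":
    print(asyncio.run(run_agency("quantum batteries")).text)
```

Because each service stub is an independent coroutine, the orchestrator can fan steps out with `asyncio.gather`, which mirrors how a real agency would run searches concurrently.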
The included `deep_research` agency (`app/agencies/deep_research/`) serves as an example implementation, orchestrating the following steps:
- Planning: The `Planner` agent generates a `WritingPlan` and `SearchTask`s.
- Initial Search: Calls the `SearchService`.
- Initial Reranking: Uses the `RankingService` (via `helpers.py`) to prioritize results.
- Content Processing: Calls the `ScraperService` (via `helpers.py`), then uses the `Summarizer` agent or the `ChunkingService` + `RankingService` (via `helpers.py`).
- Initial Writing: The `Writer` agent creates a draft using the processed content, potentially requesting more information via `SearchRequest` tags.
- Refinement Loop: Executes further searches if requested, processes the new content, and uses the `Refiner` agent with only the new information to update the draft iteratively.
- Final Assembly: Formats citations and adds a reference list using helper functions.
- Response & Persistence: Sends the final report and usage stats via WebSocket and saves to Firestore (if configured).
This detailed workflow is specific to the deep_research agency; other agencies can implement entirely different processes while reusing the core services.
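The write-and-refine loop at the heart of this workflow can be sketched as follows. The `<search_request .../>` tag syntax, the stub functions, and the cap of three rounds are illustrative assumptions, not the agency's actual conventions:

```python
import re

# Hypothetical tag a Writer/Refiner agent might emit to request more
# sources; the real deep_research agency defines its own format.
SEARCH_REQUEST = re.compile(r'<search_request query="([^"]+)"\s*/>')

def fake_writer(context: list[str]) -> str:
    # Stub: a real Writer agent is an LLM call that returns a draft,
    # embedding search-request tags when it lacks information.
    if len(context) < 2:
        return 'Draft... <search_request query="more detail"/>'
    return "Draft... complete."

def fake_search(query: str) -> str:
    return f"new content for: {query}"

def write_with_refinement(context: list[str], max_rounds: int = 3) -> str:
    draft = fake_writer(context)
    for _ in range(max_rounds):
        requests = SEARCH_REQUEST.findall(draft)
        if not requests:
            break  # the draft asked for nothing more, so we are done
        # Only the *new* information is added before refining the draft.
        context.extend(fake_search(q) for q in requests)
        draft = fake_writer(context)
    return draft
```

Bounding the loop with `max_rounds` is one simple way to guarantee termination even if the model keeps requesting more searches.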
- Framework: FastAPI
- Data Validation & Settings: Pydantic V2
- LLM Interaction: Primarily the OpenAI API client (via libraries like `openai`, or routing services like OpenRouter), adaptable to other providers that support structured output (JSON mode / tool calling). Pydantic enforces output structure.
- Multi-Agency Orchestration: Custom logic or Pydantic-AI within `app/agencies/`
- LLM Tools: To be implemented with Pydantic-AI in future agencies.
- Web Scraping: Crawl4AI (via `app/services/scraper/`)
- PDF Parsing: MarkItDown (via Crawl4AI or directly in `app/services/scraping_utils/`)
- Web Search: Serper API (via `app/services/search/`)
- Reranking: Together AI API (via `app/agencies/services/ranking.py`)
- Chunking: Custom implementation in `app/services/chunking/`
- State Persistence (Optional): Google Firestore
- Language: Python 3.10+
- Async: `asyncio`
Pydantic V2 is fundamental to the API's reliability and structure:
- API Layer: Validates incoming requests (`ResearchRequest`) and outgoing WebSocket messages (`WebSocketUpdateHandler`), ensuring schema adherence.
- Configuration: Manages application settings robustly (`app/core/config.py`).
- LLM Interaction: Defines precise Pydantic models (`app/agencies/deep_research/schemas.py`) used as the required output format for LLM agents (e.g., `PlannerOutput`, `WriterOutput`). This allows direct parsing and validation of agent responses, catching errors early.
- Internal Data Flow: Structures data passed between components (e.g., `SearchResult`, `Chunk`, `UsageStatistics` in `app/core/schemas.py`), reducing errors from inconsistent data handling.
This project relies heavily on the ability of modern LLMs (like OpenAI's GPT series) to generate output conforming to a specified structure, particularly JSON schemas derived from Pydantic models.
- Prompt Engineering: Prompts for agents (`app/agencies/deep_research/agents.py`) include instructions and often JSON schema definitions or examples to guide the LLM.
- API Features: We utilize LLM API features (e.g., OpenAI's `response_format={"type": "json_object"}` and function/tool calling where appropriate) to encourage structured output.
- Validation: The Pydantic models defined in `schemas.py` act as the final validator. The application attempts to parse the LLM's string output directly into the target Pydantic model. If parsing fails, the LLM did not adhere to the requested structure, and appropriate error handling or retries are triggered.
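The parse-or-retry pattern can be sketched with Pydantic V2's `model_validate_json`. The `PlannerOutput` fields here are a simplified stand-in for the real model in `schemas.py`, and the hard-coded "corrected" string merely marks where a real re-prompt of the LLM would go:

```python
from pydantic import BaseModel, ValidationError

# Simplified stand-in for the real PlannerOutput in schemas.py.
class PlannerOutput(BaseModel):
    search_queries: list[str]
    writing_plan: str

def parse_agent_output(raw: str, attempts_left: int = 2) -> PlannerOutput:
    try:
        # Pydantic V2 parses the LLM's raw string straight into the model;
        # invalid JSON or missing fields both raise ValidationError.
        return PlannerOutput.model_validate_json(raw)
    except ValidationError:
        if attempts_left <= 0:
            raise
        # In the real app this is where the LLM would be re-prompted with
        # the validation error; a canned response keeps the sketch runnable.
        corrected = '{"search_queries": ["fallback"], "writing_plan": "retry"}'
        return parse_agent_output(corrected, attempts_left - 1)
```

Note that `model_validate_json` folds JSON decoding and schema validation into one step, so a single `except ValidationError` handles both failure modes.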
This approach removes the need for external libraries like LiteLLM solely to manage multiple providers, focusing instead on provider-specific features for reliable structured output generation guided by Pydantic schemas.
- Clone the repository:

  ```bash
  git clone https://github.com/siddiki8/ODR-api.git
  cd ODR-api
  ```

- Install uv (recommended) if you do not have it:

  ```bash
  # Windows (PowerShell)
  irm https://astral.sh/uv/install.ps1 | iex

  # macOS / Linux
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- Install dependencies (uses `pyproject.toml` + `uv.lock`):

  ```bash
  uv sync
  uv run python -m playwright install --with-deps
  ```

  Alternatively, with plain pip: `pip install -r requirements.txt` (the file is generated from the lockfile via `uv export`), then `playwright install --with-deps`.
The API relies on environment variables for configuration, particularly API keys.
- Create a `.env` file in the project root directory.
- Copy `.env.example` to `.env` and fill in your credentials:

  ```bash
  # Shared provider credentials
  SERPER_API_KEY="your_serper_api_key"
  OPENROUTER_API_KEY="your_openrouter_api_key"
  COHERE_API_KEY="your_cohere_api_key"
  TOGETHER_API_KEY="your_together_api_key"

  # Optional: Firestore persistence (app runs fine without it)
  FIREBASE_SERVICE_ACCOUNT_KEY_JSON="/path/to/your/service-account-key.json"

  # Deep Research task-level model selection
  DEEP_RESEARCH_PLANNER_MODEL_ID="openrouter/openrouter/quasar-alpha"
  DEEP_RESEARCH_SUMMARIZER_MODEL_ID="google/gemini-2.0-flash-001"
  DEEP_RESEARCH_WRITER_MODEL_ID="google/gemini-2.5-pro-preview-03-25"
  DEEP_RESEARCH_REFINER_MODEL_ID="google/gemini-2.5-pro-preview-03-25"
  DEEP_RESEARCH_RERANK_PROVIDER="cohere"

  # CPE task-level model selection
  CPE_PLANNER_MODEL_ID="openrouter/openrouter/quasar-alpha"
  CPE_EXTRACTOR_MODEL_ID="google/gemini-2.5-pro-preview-03-25"
  ```

  See `.env.example` for all available options, including per-provider base URLs and scraper settings.
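Resolving task-level model IDs from the environment might look like the sketch below. The real framework does this with Pydantic settings in `app/core/config.py`; the `DeepResearchModels` dataclass and `load_deep_research_models` helper are invented for illustration, with defaults taken from the example above:

```python
import os
from dataclasses import dataclass

# Hypothetical env-backed config holder; the real app uses Pydantic
# settings in app/core/config.py instead of a dataclass.
@dataclass(frozen=True)
class DeepResearchModels:
    planner: str
    summarizer: str

def load_deep_research_models() -> DeepResearchModels:
    def env(name: str, default: str) -> str:
        # Fall back to a default when the variable is unset.
        return os.environ.get(name, default)
    return DeepResearchModels(
        planner=env("DEEP_RESEARCH_PLANNER_MODEL_ID",
                    "openrouter/openrouter/quasar-alpha"),
        summarizer=env("DEEP_RESEARCH_SUMMARIZER_MODEL_ID",
                       "google/gemini-2.0-flash-001"),
    )
```

Keeping credentials in shared variables but model IDs in per-task variables is what lets each agency swap models without code changes.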
- Run the FastAPI server:

  ```bash
  uv run uvicorn app.main:app --reload --port 8000
  ```

  If you use a classic venv instead of uv, run `uvicorn app.main:app --reload --port 8000` with that environment activated. The API will be available at `http://localhost:8000`; the interactive docs are at `http://localhost:8000/docs`.

- Interact via WebSocket:
  - The primary endpoint for the deep research agency is `/deep_research/ws/research`.
  - Connect using a WebSocket client (see `clients/ws_client.py`; run `python clients/ws_client.py "your query here"` from the repo root).
  - Send an initial JSON message matching the `ResearchRequest` schema (`app/core/schemas.py`):

    ```json
    {
      "query": "Your research query here",
      "max_search_tasks": null
    }
    ```

  - Receive JSON status updates conforming to the structure in `websocket_guide.md` (see also `docs/FRONTEND_API.md` for a concise frontend reference):

    ```jsonc
    {
      "step": "STEP_NAME",   // e.g., "PLANNING", "SEARCHING", "RANKING", "PROCESSING", "WRITING", "REFINING", "FINALIZING", "COMPLETE", "ERROR"
      "status": "STATUS",    // e.g., "START", "END", "IN_PROGRESS", "SUCCESS", "ERROR", "WARNING", "INFO"
      "message": "Human-readable status message",
      "details": { ... }     // Optional dictionary with context (structure varies)
    }
    ```

  - The final success message (`step: "COMPLETE"`, `status: "END"`) includes final usage statistics in `details`. The actual report content is sent earlier, in the `details` of the `step: "FINALIZING"`, `status: "END"` message.
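Client-side handling of these updates can be sketched with the standard library alone. The `"final_report"` key inside `details` is an assumption for illustration (the exact key is documented in `websocket_guide.md`), and the transport layer is omitted so only the message logic is shown:

```python
import json

def handle_update(raw: str, state: dict) -> bool:
    """Process one WebSocket message; return True when the stream is done.

    Collects the report from the FINALIZING/END message (the key name
    "final_report" is an assumed placeholder) and usage statistics from
    the COMPLETE/END message, per the schema described above.
    """
    msg = json.loads(raw)
    step, status = msg.get("step"), msg.get("status")
    if step == "FINALIZING" and status == "END":
        # The report itself arrives here, not in the COMPLETE message.
        state["report"] = msg.get("details", {}).get("final_report")
    if step == "COMPLETE" and status == "END":
        state["usage"] = msg.get("details")
        return True
    return step == "ERROR"  # terminal error also ends the stream
```

A real client would call this inside its receive loop (e.g., `clients/ws_client.py`-style) and stop reading once it returns `True`.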
- Company Profile Extractor (CPE) Agency:
  - WebSocket endpoint: `/cpe/ws/cpe`. Send a `CPERequest` JSON message with `query`, optional `location`, and `max_search_tasks`.
  - Results (an array of company profiles) arrive in the final `COMPLETE` message's `details`. See `clients/cpe_client.py` for a full example.

- Other Endpoints:
  - `GET /`: health check.
  - `GET /deep_research/result/{task_id}`: fetch a persisted deep research result from Firestore.
  - `POST /deep_research/stop/{task_id}`: request cancellation.
  - `GET /cpe/result/{task_id}`: fetch a persisted CPE result from Firestore.
  - `POST /cpe/stop/{task_id}`: request cancellation.
  - See `docs/FRONTEND_API.md` for the full endpoint reference.
Contributions are highly encouraged! This framework is designed to grow. Help us build a diverse ecosystem of powerful research agencies.
How to Contribute:
- Add a New Agency:
  - Create Directory: Make a new folder `app/agencies/your_agency_name/`.
  - Define Components: Inside, create `__init__.py`, `orchestrator.py`, `agents.py`, and `schemas.py`. Add `prompts.py` or `helpers.py` as needed.
  - Implement Logic:
    - Write your orchestration flow in `orchestrator.py`.
    - Define your agent logic in `agents.py`, leveraging LLMs and structured output via the Pydantic schemas defined in `schemas.py`.
    - Reuse core services from `app/services/` (e.g., `SearchService`, `WebScraper`) by importing and calling them in your orchestrator or helpers.
  - Define API Endpoint: Add FastAPI routes (e.g., a WebSocket endpoint) for your new agency in `app/main.py`, or create a dedicated router in your agency directory and include it in `app/main.py`.
  - Add Configuration: Update `app/core/config.py` if your agency requires specific settings.
  - Document: Add a README or update this one explaining your agency's purpose and workflow.
- Add a New Service:
  - Create a new module or directory under `app/services/` (for general services) or `app/agencies/services/` (if strongly tied to agent concepts like ranking).
  - Implement your service logic (e.g., connecting to a new search API, implementing a data analysis tool).
  - Ensure it is easily callable, ideally stateless, and preferably asynchronous.
  - Add any necessary configuration to `app/core/config.py`.
- Enhance Existing Components: Improve agents, services, error handling, add tests, or refine documentation.
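A new service following these conventions (stateless, async, configured at construction) might look like the sketch below. `WordCountService` is invented purely for illustration and is not part of the framework:

```python
import asyncio

class WordCountService:
    """Illustrative shared service: stateless, async, config passed in.

    A real service would wrap I/O (a search API, a scraper, ...); counting
    words keeps the sketch self-contained.
    """

    def __init__(self, min_length: int = 1) -> None:
        # In the framework this setting would come from app/core/config.py.
        self.min_length = min_length

    async def run(self, text: str) -> int:
        # Real services would await network or disk I/O here.
        await asyncio.sleep(0)
        return sum(1 for w in text.split() if len(w) >= self.min_length)
```

Because the service holds no per-request state, any agency's orchestrator or helpers can share a single instance safely across concurrent tasks.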
General Contribution Steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature-name`).
- Make your changes.
- Add tests for your changes (highly recommended!).
- Ensure code passes linting and formatting checks (e.g., using Ruff/Black).
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/your-feature-name`).
- Open a Pull Request against the main repository.
Other Areas for Contribution:
- LLM Support: Improve compatibility or add configuration options for more LLM providers (especially those with strong structured output support).
- Specialized Scrapers: Add robust scrapers for specific sites or content types in `app/services/scraping_utils/`.
- Error Handling & Resilience: Refine exception handling, retries, and state management across the framework.
- Testing: Add more comprehensive unit, integration, and agent simulation tests.
- Documentation: Improve READMEs, code comments, architecture diagrams, or API documentation (`websocket_guide.md`, `firestore_schema.md`, `docs/FRONTEND_API.md`).
This project builds upon concepts and architectures explored in academic research. If you use or extend this work, please consider citing the relevant papers, including:
```bibtex
@misc{alzubi2025opendeepsearchdemocratizing,
      title={Open Deep Search: Democratizing Search with Open-source Reasoning Agents},
      author={Salaheddin Alzubi and Creston Brooks and Purva Chiniya and Edoardo Contente and Chiara von Gerlach and Lucas Irwin and Yihan Jiang and Arda Kaz and Windsor Nguyen and Sewoong Oh and Himanshu Tyagi and Pramod Viswanath},
      year={2025},
      eprint={2503.20201},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.20201},
}
```

Luminary AI Solutions LLC - info@luminarysolutions.ai - luminarysolutions.ai
