
🔬 DeepResearcher

Next-Generation Research Intelligence

Unleashing the power of advanced AI for comprehensive analysis, multi-dimensional reasoning, and breakthrough insights


🌟 What is DeepResearcher?

DeepResearcher represents a paradigm shift in AI-driven research methodology. Developed by Menlo Research, this cutting-edge research agent transcends traditional information retrieval by employing sophisticated multi-hop reasoning, contextual synthesis, and adaptive learning mechanisms.

Our model doesn't just find answers—it understands, connects, and discovers.


Empowering researchers, analysts, and innovators to unlock deeper understanding through intelligent automation.

⚙️ Setup

  1. Clone the Repository:

    git clone https://github.com/menloresearch/deep-research.git
    cd deep-research
  2. Set up Python Environment (using uv): If you use pyenv, it will automatically pick up the version from .python-version.

    # Create a virtual environment
    uv venv
    # Activate the virtual environment
    source .venv/bin/activate
    # Install dependencies
    uv pip install -r requirements.txt # Or `uv sync` to install from pyproject.toml and uv.lock

    Note: This installs dependencies specified in pyproject.toml and locked in uv.lock.

  3. Configure Environment Variables: Copy the example environment file and fill in your API keys:

    cp .env.example .env
    nano .env  # Or your preferred editor

    You'll need to provide keys for services like TAVILY_API_KEY, HF_TOKEN, etc., as used by the tools.
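    For reference, a filled-in .env might look like this (placeholder values; the authoritative list of variables is in .env.example):

    # Tavily web-search API key
    TAVILY_API_KEY=tvly-xxxxxxxxxxxxxxxx
    # Hugging Face access token (model/dataset downloads)
    HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx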

💡 Running the Demo

You can experience the Deep Research model in action in two primary ways:

1. Local Demo with demo_app.py (Recommended for Development)

This sets up a Gradio interface to interact with the agent.

  • Step A: Start the VLLM Model Server (Critical!) The demo application (src/demo_app.py) expects a VLLM-compatible API server running locally. We use a fine-tuned Qwen3-14B model. Open a new terminal and run:

    # (Activate your virtual environment if not already done: source .venv/bin/activate)
    vllm serve jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step \
        --port 8000 \
        --tensor-parallel-size 1 \
        --max-model-len 8192 # Adjust as per your model's needs
        # Add --enable-prefix-caching or other VLLM optimizations if desired

    Note: Ensure you have VLLM installed (uv pip install vllm) and sufficient GPU memory. The model jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step is specifically configured in demo_app.py. If you change it, update the script.
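    Once the server is up, you can sanity-check it from Python. VLLM exposes an OpenAI-compatible API, so the standard openai client works against it; the following is a minimal sketch (the prompt is arbitrary, and you may need `uv pip install openai` first):

    from openai import OpenAI

    # VLLM's OpenAI-compatible endpoint; no real API key is needed locally.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step",
        messages=[{"role": "user", "content": "In one sentence, what is retrieval augmented generation?"}],
        max_tokens=128,
    )
    print(response.choices[0].message.content)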

  • Step B: Run the Gradio Demo Application. In another terminal (with the virtual environment activated):

    python src/demo_app.py

    This will launch a Gradio web interface, typically at http://127.0.0.1:7860. Open this URL in your browser to interact with the agent.

2. Via Jan App (Production Model)

Our production-ready Deep Research model is integrated into the Jan App.

🏋️ Training

📚 Data Preparation & RAG System

The Deep Research model relies on a robust Retrieval Augmented Generation (RAG) system. We primarily use the Musique dataset.

  • Prepare Data & Build Index: The Makefile provides a convenient way to download the Musique dataset, preprocess it, and build a FAISS index:

    make data

    This command will:

    1. Download the Musique dataset (raw files into ./data/raw/).
    2. Process it into a corpus.jsonl file (./data/processed/corpus.jsonl).
    3. Build a FAISS index using intfloat/e5-base-v2 embeddings (index stored in ./index_musique_db/); see the sketch below for what these steps do under the hood.
  • Run the RAG Server: Once the data is prepared and the index is built, you can start the RAG server; demo_app.py and the evaluation scripts can query it when configured to do so (they may otherwise use their own RAG clients):

    bash src/rag_setup/rag_server.sh

    This server uses FlashRAG components and serves retrieved documents based on semantic similarity.
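To make the data-preparation and retrieval steps concrete, here is a minimal sketch of what the index build and the semantic-similarity lookup amount to. It assumes a corpus.jsonl whose lines carry a "contents" field (the field name, paths, and query are assumptions for illustration, not the exact pipeline code), and it requires sentence-transformers and faiss-cpu:

    import json
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/e5-base-v2")

    # Load passages from the processed corpus (field name assumed).
    with open("data/processed/corpus.jsonl") as f:
        passages = [json.loads(line)["contents"] for line in f]

    # e5 models expect "passage: " / "query: " prefixes; normalize so that
    # inner product equals cosine similarity.
    emb = model.encode([f"passage: {p}" for p in passages], normalize_embeddings=True)

    index = faiss.IndexFlatIP(emb.shape[1])
    index.add(emb)

    # Retrieve the top-3 passages for a (made-up) multi-hop question.
    query = model.encode(["query: Where was the composer of Für Elise born?"], normalize_embeddings=True)
    scores, ids = index.search(query, 3)
    for score, i in zip(scores[0], ids[0]):
        print(f"{score:.3f}  {passages[i][:80]}")

Normalizing the embeddings and using an inner-product index makes the lookup a cosine-similarity search, which is the semantic similarity the RAG server ranks documents by.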

📊 Evaluation

To evaluate the model's performance on benchmarks like SimpleQA and WebWalkerQA, please refer to the detailed instructions in: ➡️ src/evaluate/README.md

This guide covers:

  • Setting up a VLLM server specifically for evaluation.
  • Running the Gradio-based evaluation interface (src/evaluate/eval_app.py).
  • Executing benchmark-specific grading scripts.

🙏 Acknowledgements

This project builds upon the great work of the open-source community. We'd like to thank:

  • Verifiers for their foundational training code, which significantly inspired our work in verifiers-deepresearch/.
  • Search-R1 for their insightful RAG (Retrieval Augmented Generation) methodologies.
  • The Hugging Face team for smolagents.
  • The AutoGen team (Microsoft) for utility scripts like mdconvert.
