Unleashing the power of advanced AI for comprehensive analysis, multi-dimensional reasoning, and breakthrough insights
DeepResearcher represents a paradigm shift in AI-driven research methodology. Developed by Menlo Research, this cutting-edge research agent transcends traditional information retrieval by employing sophisticated multi-hop reasoning, contextual synthesis, and adaptive learning mechanisms.
Our model doesn't just find answers—it understands, connects, and discovers.
Empowering researchers, analysts, and innovators to unlock deeper understanding through intelligent automation.
1. Clone the Repository:

   ```bash
   git clone https://github.com/menloresearch/deep-research.git
   cd deep-research
   ```
2. Set up Python Environment (using `uv`): If you use `pyenv`, it will automatically pick up the version from `.python-version`.

   ```bash
   # Create a virtual environment
   uv venv

   # Activate the virtual environment
   source .venv/bin/activate

   # Install dependencies
   uv pip install -r requirements.txt  # Or `uv pip install` if pyproject.toml is primary
   ```

   Note: This installs dependencies specified in `pyproject.toml` and locked in `uv.lock`.
3. Configure Environment Variables: Copy the example environment file and fill in your API keys:

   ```bash
   cp .env.example .env
   nano .env  # Or your preferred editor
   ```

   You'll need to provide keys for services like `TAVILY_API_KEY`, `HF_TOKEN`, etc., as used by the tools.
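To confirm the keys are actually picked up before launching anything, a minimal sanity check like the one below works. It assumes `python-dotenv` is available (`uv pip install python-dotenv` if it isn't already a dependency), and the key list is just an example:

```python
# check_env.py -- quick sanity check that required API keys are set.
# Assumes python-dotenv is installed; adjust REQUIRED_KEYS to match
# the tools you actually enable.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

REQUIRED_KEYS = ["TAVILY_API_KEY", "HF_TOKEN"]  # example set, not exhaustive
missing = [key for key in REQUIRED_KEYS if not os.getenv(key)]
if missing:
    raise SystemExit(f"Missing keys in .env: {', '.join(missing)}")
print("All required keys are set.")
```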
You can experience the Deep Research model in action in two primary ways: through the local Gradio demo described below, or via the Jan App.

The local demo sets up a Gradio interface to interact with the agent.
- Step A: Start the VLLM Model Server (Critical!)

  The demo application (`src/demo_app.py`) expects a VLLM-compatible API server running locally. We use a fine-tuned Qwen3-14B model. Open a new terminal and run:

  ```bash
  # (Activate your virtual environment if not already done: source .venv/bin/activate)
  vllm serve jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step \
      --port 8000 \
      --tensor-parallel-size 1 \
      --max-model-len 8192  # Adjust as per your model's needs
  # Add --enable-prefix-caching or other VLLM optimizations if desired
  ```

  Note: Ensure you have VLLM installed (`uv pip install vllm`) and sufficient GPU memory. The model `jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step` is specifically configured in `demo_app.py`. If you change it, update the script. A quick smoke test for this server is sketched after these steps.
- Step B: Run the Gradio Demo Application

  In another terminal (with the virtual environment activated):

  ```bash
  python src/demo_app.py
  ```

  This will launch a Gradio web interface, typically at `http://127.0.0.1:7860`. Open this URL in your browser to interact with the agent.
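Before opening the demo, you can verify the VLLM server from Step A is responding. The sketch below relies on the OpenAI-compatible endpoint that `vllm serve` exposes and assumes the `openai` Python client package is installed; the API key is a placeholder, since the local server doesn't check it:

```python
# smoke_test_vllm.py -- one-shot request against the local VLLM server.
from openai import OpenAI

# `vllm serve` exposes an OpenAI-compatible API under /v1 on the chosen port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="jan-hq/Qwen3-14B-v0.2-deepresearch-no-think-100-step",
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
    max_tokens=8,
)
print(response.choices[0].message.content)
```

If this prints a reply, the demo application should be able to reach the server as well.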
Alternatively, our production-ready Deep Research model is integrated into the Jan App.
The Deep Research model relies on a robust Retrieval Augmented Generation (RAG) system. We primarily use the Musique dataset.
1. Prepare Data & Build Index:

   The `Makefile` provides a convenient way to download the Musique dataset, preprocess it, and build a FAISS index:

   ```bash
   make data
   ```

   This command will:

   - Download the Musique dataset (raw files into `./data/raw/`).
   - Process it into a `corpus.jsonl` file (`./data/processed/corpus.jsonl`).
   - Build a FAISS index using `intfloat/e5-base-v2` embeddings (index stored in `./index_musique_db/`).
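   To get a feel for what the index provides, here is a minimal retrieval sketch. It is illustrative only, not the project's actual pipeline: it assumes `sentence-transformers` and `faiss-cpu` are installed, and the index file name inside `./index_musique_db/` is hypothetical.

   ```python
   # retrieval_sketch.py -- illustrative FAISS lookup; not the repo's code.
   import faiss
   import numpy as np
   from sentence_transformers import SentenceTransformer

   model = SentenceTransformer("intfloat/e5-base-v2")
   index = faiss.read_index("index_musique_db/e5_flat.index")  # hypothetical file name

   # e5 models expect a "query: " prefix for queries ("passage: " for documents).
   query = "query: Who directed the film that won Best Picture in 1998?"
   embedding = model.encode([query], normalize_embeddings=True)

   scores, doc_ids = index.search(np.asarray(embedding, dtype="float32"), k=5)
   print(doc_ids[0], scores[0])  # ids assumed to follow corpus insertion order
   ```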
2. Run the RAG Server:

   Once the data is prepared and the index is built, you can start the RAG server (which `demo_app.py` and evaluation scripts might query if configured to do so, or if they implement their own RAG client):

   ```bash
   bash src/rag_setup/rag_server.sh
   ```

   This server uses `FlashRAG` components and serves retrieved documents based on semantic similarity.
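If you want to query the running server directly, a minimal client might look like the sketch below. The port, route, and payload shape are assumptions; check `src/rag_setup/rag_server.sh` and the server source for the actual contract:

```python
# rag_client_sketch.py -- hypothetical client for the local RAG server.
import requests

resp = requests.post(
    "http://localhost:8001/search",  # hypothetical host, port, and route
    json={"query": "example multi-hop question", "top_k": 5},  # assumed payload
    timeout=30,
)
resp.raise_for_status()
for doc in resp.json():  # assumed response shape: a JSON list of documents
    print(doc)
```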
To evaluate the model's performance on benchmarks like SimpleQA and WebWalkerQA, please refer to the detailed instructions in:
➡️ `src/evaluate/README.md`
This guide covers:
- Setting up a VLLM server specifically for evaluation.
- Running the Gradio-based evaluation interface (`src/evaluate/eval_app.py`).
- Executing benchmark-specific grading scripts.
This project builds upon the great work of the open-source community. We'd like to thank:
- Verifier for their foundational training code, which significantly inspired our work in `verifiers-deepresearch/`.
- Search-R1 for their insightful RAG (Retrieval Augmented Generation) methodologies.
- The Hugging Face team for `smol-agents`.
- The AutoGen team (Microsoft) for utility scripts like `mdconvert`.