This is the official code repository for SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research.
SearchSwarm trains a main research agent to use subagents as an active context-management mechanism. The main agent decomposes long-horizon research tasks, dispatches bounded evidence-gathering subtasks, receives compact citation-grounded reports, and synthesizes the final answer under a finite context budget.
📃 Project Page | 🤗 Model Weights | 🤗 SFT Dataset | 📑 Paper
SearchSwarm focuses on delegation intelligence in agentic LLMs:
- Subagents as context management: subagents work in independent contexts and return compact, evidence-grounded reports to the main agent.
- Harness-guided trajectory synthesis: the harness encourages decomposition, comprehensive subagent briefing, verification, and citation-grounded reporting.
- High-quality SFT data for delegation: cleaned trajectories teach when to delegate, how to brief, and how to verify returned findings.
- Strong lightweight performance: SearchSwarm-30B-A3B achieves state-of-the-art results among comparable 30B-A3B open-source lightweight research agents.
See the paper for the complete comparison tables and evaluation details.
The harness reads configuration from harness/.env. Start from the example file:
cd harness
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your model path, dataset path, and API keys.
The repository ships only a tiny synthetic example dataset under harness/eval_data/example/ to demonstrate the expected schema. Real benchmark data is not redistributed; obtain benchmark files from their official sources, convert them to the supported JSONL schema, and point DATASET to your local file:
{"task_question": "<question>", "ground_truth": "<answer>", "file_name": "", "metadata": {}}
Use this mode when the main model and subagent model are served by an OpenAI-compatible endpoint.
cd harness
cp .env.example .env
# Set MODEL_MODE=api, API_BASE_URL, API_KEY, MODEL_PATH, DATASET, OUTPUT_PATH.
bash run_react_infer.sh
Use this mode when running the model locally on eight vLLM servers.
cd harness
cp .env.example .env
# Set MODEL_MODE=local and MODEL_PATH.
bash deploy_model.sh
bash run_react_infer.sh
deploy_model.sh starts one vLLM server per GPU on ports 6001-6008. If both the main agent and subagents use API mode, you can skip deployment.
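As a rough illustration of the local layout described above, the following sketch lists the eight endpoints and probes one of them. The host `127.0.0.1`, the helper names `local_endpoints` and `is_up`, and the choice of the `/v1/models` route (part of the OpenAI-compatible API) as a readiness probe are assumptions made for illustration; the harness may discover or balance its servers differently.

```python
import urllib.error
import urllib.request

# Ports 6001-6008 come from the description of deploy_model.sh above.
# The host and both helper names are hypothetical.
HOST = "127.0.0.1"
PORTS = range(6001, 6009)


def local_endpoints(host=HOST, ports=PORTS):
    """Return one OpenAI-compatible base URL per local server."""
    return [f"http://{host}:{port}/v1" for port in ports]


def is_up(base_url, timeout=2.0):
    """Return True if GET {base_url}/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `is_up(url)` for each `url` in `local_endpoints()` after `deploy_model.sh` and before `run_react_infer.sh` gives a quick readiness check.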
For full harness configuration, including ENABLE_SUB_AGENT, SEARCH_MODE, TOOL_TYPE, subagent budgets, and LLM-as-judge settings, see harness/README.md.
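Before pointing DATASET at a converted benchmark file, it can help to check every line against the JSONL schema shown earlier. The sketch below is one possible check, not part of the harness: the helper names are hypothetical, and treating all four fields as required with the types seen in the example record is an assumption.

```python
import json

# Field names and types are taken from the example record shown earlier.
# Treating all four as required is an assumption of this sketch.
REQUIRED_FIELDS = {
    "task_question": str,
    "ground_truth": str,
    "file_name": str,
    "metadata": dict,
}


def validate_record(record):
    """Return a list of problems found in one parsed JSONL record."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    return problems


def validate_jsonl(lines):
    """Yield (line_number, problems) for every invalid line."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            yield number, [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(record, dict):
            yield number, ["record is not a JSON object"]
            continue
        problems = validate_record(record)
        if problems:
            yield number, problems
```

Passing an open file object to `validate_jsonl` works because iterating over a text file yields its lines; an empty result means every line matched the assumed schema.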
The training scripts run full-parameter SFT with ms-swift's Megatron backend.
SearchSwarm-SFT stores one bundle per row: a main-agent conversation plus the subagent conversations it dispatched (the messages and subagents columns). train/convert_share_to_cached.py streams the parquet and unrolls it into flat ms-swift messages records, one for the main trajectory and one for each subagent trajectory:
cd train
hf download SearchSwarm/SearchSwarm-SFT --repo-type dataset --local-dir SearchSwarm-SFT
python convert_share_to_cached.py --parquet SearchSwarm-SFT/train.parquet --out data.jsonl
The parquet must be read in streaming mode: pandas.read_parquet and pyarrow.parquet.read_table fail on its single 2.1 GB row group. See train/README.md for details and the pre-tokenization step.
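The unrolling step described above can be pictured with the minimal sketch below. It is not the code in convert_share_to_cached.py: the function names are hypothetical, and the row layout (each entry of the subagents column being a dict with its own messages list) is an assumption made for illustration. The sketch also leaves out the streaming parquet read.

```python
import json


def unroll_bundle(row):
    """Turn one bundle row into flat {"messages": [...]} records.

    Assumed (hypothetical) row layout:
      row["messages"]  -> list of chat messages for the main agent
      row["subagents"] -> list of dicts, each with its own "messages" list
    """
    # The main-agent trajectory always comes first.
    records = [{"messages": row["messages"]}]
    # One extra record per dispatched subagent; tolerate a missing or null column.
    for sub in row.get("subagents") or []:
        records.append({"messages": sub["messages"]})
    return records


def to_jsonl_lines(rows):
    """Yield one JSON line per flat record, across all bundle rows."""
    for row in rows:
        for record in unroll_bundle(row):
            yield json.dumps(record, ensure_ascii=False)
```

Under these assumptions, a bundle with two subagents produces three output lines: one for the main trajectory and one for each subagent.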
This validates the environment and launch chain with a small model and the bundled debug data. It is not a production SearchSwarm training run.
cd train
bash setup_env.sh
bash train_megatron.sh
Production-scale 30B-A3B training is designed for a multi-node GPU cluster. The repository provides three launch paths:
- train_megatron_ray.sh: Ray-based dispatch for cloud clusters without inter-node SSH.
- train_megatron_multinode.sh: SSH / torchrun path for traditional clusters.
- train_megatron_shared_fs.sh: shared-filesystem rendezvous path for schedulers such as Kubernetes jobs or cloud batch.
See train/README.md for the full setup, dataset preparation and pre-tokenization, parallelism defaults, and launcher-specific instructions.
The repository intentionally does not bundle full benchmark test sets such as BrowseComp, BrowseComp-ZH, GAIA, or xbench-DeepSearch. Please obtain these datasets from their official sources and follow their redistribution / no-train policies.
@misc{searchswarm2026,
title = {SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research},
author = {Ning, Pu and Chen, Quan and Tao, Kun and Tang, Xinyu and Wang, Tianshu and Cao, Qianggang and Kong, Xinyu and Wen, Zujie and Zhang, Zhiqiang and Zhou, Jun},
year = {2026},
note = {Under review}
}
This repository builds on open-source infrastructure from the agent and LLM training ecosystem, including vLLM, ms-swift, Megatron-LM, Qwen-Agent, Serper, and Jina.




