For a more detailed report, please refer to the survey *A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications* (https://arxiv.org/abs/2506.12594).
Category | Subcategory | Name | URL |
---|---|---|---|
Open-source | Agent Framework | XAgent | https://github.com/OpenBMB/XAgent |
Open-source | Agent Framework | AutoGen | https://github.com/microsoft/autogen |
Open-source | Agent Framework | Qwen-Agent | https://github.com/QwenLM/Qwen-Agent |
Open-source | Agent Framework | OpenAI Agents SDK | https://github.com/openai/openai-agents-python |
Open-source | Agent Framework | n8n | https://github.com/n8n-io/n8n |
Open-source | Agent Framework | AutoChain | https://github.com/Forethought-Technologies/AutoChain |
Open-source | Agent Framework | AgentGPT | https://github.com/reworkd/AgentGPT |
Open-source | Agent Framework | Open-operator | https://github.com/browserbase/open-operator |
Open-source | Agent Framework | BabyAGI | https://github.com/yoheinakajima/babyagi |
Open-source | Agent Framework | AutoGPT | https://github.com/Significant-Gravitas/AutoGPT |
Open-source | Agent Framework | MetaGPT | https://github.com/geekan/MetaGPT |
Open-source | Agent Framework | Llama_index | https://github.com/run-llama/llama_index |
Open-source | Agent Framework | LangGraph | https://github.com/langchain-ai/langgraph |
Open-source | Agent Framework | Google ADK | https://google.github.io/adk-docs/ |
Open-source | Agent Framework | CrewAI | https://github.com/crewAIInc/crewAI |
Open-source | Agent Framework | Agno | https://github.com/agno-agi/agno |
Open-source | Agent Framework | Temporal | https://github.com/temporalio/temporal |
Open-source | Agent Framework | Orkes | https://orkes.io/use-cases/agentic-workflows |
Open-source | Agent Framework | Pydantic-AI | https://github.com/pydantic/pydantic-ai |
Open-source | Agent Framework | Letta | https://github.com/letta-ai/letta |
Open-source | Agent Framework | Mastra | https://github.com/mastra-ai/mastra |
Open-source | Agent Framework | Semantic Kernel | https://github.com/microsoft/semantic-kernel |
Open-source | Agent Orchestration Platform | Dify | https://github.com/langgenius/dify |
Closed-source | Agent Orchestration Platform | Coze Space | https://www.coze.cn/space-preview |
Open-source | Agent Orchestration Platform | Flowise | https://flowiseai.com/ |
Closed-source | AI Assistant Tools | NotebookLM | https://notebooklm.google/ |
Closed-source | AI Assistant Tools | MGX.dev | https://mgx.dev |
Closed-source | AI Assistant Tools | You.com | https://you.com/about |
Closed-source | AI Assistant Tools | Microsoft Copilot | https://www.microsoft.com/en-us/microsoft-copilot/organizations |
Closed-source | Workflow | Claude Research | https://www.anthropic.com/news/research |
Open-source | Workflow | Google-gemini/gemini-fullstack-langgraph-quickstart | https://github.com/google-gemini/gemini-fullstack-langgraph-quickstart |
Open-source | Workflow | Dzhng/deep-research | https://github.com/dzhng/deep-research |
Open-source | Workflow | Jina-AI/node-DeepResearch | https://github.com/jina-ai/node-DeepResearch |
Open-source | Workflow | LangChain-AI/open_deep_research | https://github.com/langchain-ai/open_deep_research |
Open-source | Workflow | TheBlewish/Automated-AI-Web-Researcher-Ollama | https://github.com/TheBlewish/Automated-AI-Web-Researcher-Ollama |
Open-source | Workflow | Btahir/open-deep-research | https://github.com/btahir/open-deep-research |
Open-source | Workflow | Nickscamara/open-deep-research | https://github.com/nickscamara/open-deep-research |
Open-source | Workflow | Mshumer/OpenDeepResearcher | https://github.com/mshumer/OpenDeepResearcher |
Open-source | Workflow | Grapeot/deep_research_agent | https://github.com/grapeot/deep_research_agent |
Open-source | Workflow | Smolagents/open_deep_research | https://github.com/huggingface/smolagents/tree/main/examples/open_deep_research |
Open-source | Workflow | Assafelovic/GPT-Researcher | https://github.com/assafelovic/gpt-researcher/ |
Open-source | Workflow | HKUDS/Auto-Deep-Research | https://github.com/HKUDS/Auto-Deep-Research |
Open-source | Workflow | AgentLaboratory | https://github.com/SamuelSchmidgall/AgentLaboratory |
Closed-source | Multi-modal Agent UI | Manus | https://manus.im/ |
Closed-source | Multi-modal Agent UI | Flowith (Oracle Mode) | https://flowith.net/ |
Open-source | Multi-modal Agent UI | OpenManus | https://github.com/FoundationAgents/OpenManus |
Open-source | Multi-modal Agent UI | Camel-AI/OWL | https://github.com/camel-ai/owl |
Open-source | Multi-modal Agent UI | UI-TARS Desktop | https://github.com/bytedance/UI-TARS-desktop |
Open-source | Multi-modal Agent UI | Nanobrowser | https://github.com/nanobrowser/nanobrowser |
Open-source | Multi-modal Agent UI | JARVIS | https://github.com/microsoft/JARVIS |
Closed-source | Multi-modal Agent UI | Devin | https://devin.ai/ |
Closed-source | Foundation Models | OpenAI Deep Research | https://openai.com/index/introducing-deep-research/ |
Closed-source | Foundation Models | Gemini Deep Research | https://blog.google/products/gemini/google-gemini-deep-research/ |
Closed-source | Foundation Models | Perplexity Deep Research | https://www.perplexity.ai/hub/blog/introducing-perplexity-deep-research |
Closed-source | Foundation Models | Grok 3 Beta | https://x.ai/news/grok-3 |
Closed-source | Foundation Models | AutoGLM-Research | https://autoglm-research.zhipuai.cn/ |
Open-source | Foundation Models | DeepSeek-R1 | https://arxiv.org/abs/2501.12948 |
Closed-source | Developer Tools | Vercel | https://vercel.com/ |
Closed-source | Developer Tools | Bolt | https://bolt.new/ |
Closed-source | Developer Tools | Cursor | https://www.cursor.com/ |
Closed-source | Developer Tools | GitHub Copilot | https://github.com/features/copilot |
Open-source | Developer Tools | Cline | https://github.com/cline/cline |
Open-source | Developer Tools | GPT-pilot | https://github.com/Pythagora-io/gpt-pilot |
Open-source | Developer Tools | Restate | https://restate.dev/ |
Open-source | Developer Tools | OpenAI Codex | https://github.com/openai/codex |
Closed-source | Research/Academic Search | Elicit | https://elicit.com/ |
Closed-source | Research/Academic Search | ResearchRabbit | https://www.researchrabbit.ai/ |
Closed-source | Research/Academic Search | STORM | https://storm.genie.stanford.edu/ |
Closed-source | Research/Academic Search | Consensus | https://consensus.app/ |
Closed-source | Research/Academic Search | Scite | https://scite.ai/ |
Closed-source | Research/Academic Search | Scispace | https://scispace.com/ |
Closed-source | Research/Academic Search | FutureHouse Platform | https://www.futurehouse.org/research-announcements/launching-futurehouse-platform-ai-agents |
Open-source | Research/Academic Search | PaperQA | https://github.com/Future-House/paper-qa |
Open-source | Research/Academic Search | HKUDS/AI-Researcher | https://github.com/HKUDS/AI-Researcher |
Open-source | Model Training Frameworks | Agent-RL/ReSearch | https://github.com/Agent-RL/ReSearch |
Open-source | Model Training Frameworks | DSPy | https://github.com/stanfordnlp/dspy |
Open-source | Model Training Frameworks | Gair-NLP/DeepResearcher | https://github.com/GAIR-NLP/DeepResearcher |
Open-source | Model Training Frameworks | ModelTC/lightllm | https://github.com/ModelTC/lightllm |
Open-source | Other LLM Tools | Ollama | https://github.com/ollama/ollama |
Open-source | Other LLM Tools | vLLM | https://github.com/vllm-project/vllm |
Open-source | Other LLM Tools | WebLLM | https://github.com/mlc-ai/web-llm |
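
Most of the Workflow entries above implement the same core loop: a model plans a search query, reads the results, folds them into working notes, and repeats until it has enough to write a report. Below is a minimal sketch of that pattern; the `llm` and `search_web` callables are hypothetical stand-ins, not the API of any listed repo.

```python
# Minimal sketch of the iterative deep-research loop that most of the
# "Workflow" repos above implement. `llm` and `search_web` are
# hypothetical stand-ins for a chat-model call and a search-API call.
from typing import Callable


def deep_research(question: str,
                  llm: Callable[[str], str],
                  search_web: Callable[[str], str],
                  max_steps: int = 5) -> str:
    notes: list[str] = []
    for _ in range(max_steps):
        # Ask the model what to look up next, given the notes so far.
        query = llm(f"Question: {question}\nNotes so far: {notes}\n"
                    "Reply with ONE web search query, or DONE if enough.")
        if query.strip() == "DONE":
            break
        # Search, then fold the relevant evidence back into the notes.
        results = search_web(query)
        notes.append(llm(f"Summarize what in the following is relevant "
                         f"to {question!r}:\n{results}"))
    # Synthesize the final report from the accumulated notes.
    return llm(f"Write a cited report answering {question!r} "
               f"using these notes:\n{notes}")
```

The listed projects differ mainly in how they parallelize this loop, ground their citations, and recover from dead-end queries.
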
Category | Paper Title | URL |
---|---|---|
AI Agent Frameworks & Development | AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | https://arxiv.org/pdf/2502.05957 |
AI Agent Frameworks & Development | Building effective agents | https://www.anthropic.com/engineering/building-effective-agents |
AI Agent Frameworks & Development | OpenAgents: An Open Platform for Language Agents in the Wild | https://arxiv.org/pdf/2310.10634 |
AI Agent Frameworks & Development | Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research | https://arxiv.org/pdf/2502.04644 |
AI Agent Frameworks & Development | AutoGLM: Autonomous Foundation Agents for GUIs | https://arxiv.org/pdf/2411.00820 |
AI Agent Frameworks & Development | TapeAgents: A Holistic Framework for Agent Development and Optimization | https://arxiv.org/pdf/2412.08445 |
AI Agent Frameworks & Development | How to think about agent frameworks | https://blog.langchain.dev/how-to-think-about-agent-frameworks/ |
AI for Scientific Research | Towards an AI Co-Scientist | https://storage.googleapis.com/coscientist_paper/ai_coscientist.pdf |
AI for Scientific Research | DeepResearcher: Scaling Deep Research via Reinforcement Learning | https://arxiv.org/pdf/2504.03160 |
AI for Scientific Research | AI Achieves Silver-Medal Standard Solving IMO Problems | https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/ |
AI for Scientific Research | Accelerating Scientific Research Through Multi-LLM Frameworks | https://arxiv.org/pdf/2502.07960 |
AI for Scientific Research | The AI Scientist: Fully Automated Open-Ended Scientific Discovery | https://arxiv.org/pdf/2408.06292 |
AI for Scientific Research | Transforming Science with LLMs: Survey on AI-Assisted Discovery | https://arxiv.org/pdf/2502.05151 |
AI for Scientific Research | AI's Deep Research Revolution in Biomedical Literature | https://journals.lww.com/jcma/citation/9900/ai_s_deep_research_revolution__transforming.508.aspx |
AI for Scientific Research | Unlocking AI Researchers' Potential in Scientific Discovery | https://arxiv.org/pdf/2503.05822 |
AI for Scientific Research | Empowering Biomedical Discovery with AI Agents | https://arxiv.org/pdf/2404.02831 |
AI for Scientific Research | Automated Scientific Discovery Systems | https://arxiv.org/abs/2305.02251 |
LLM Tool Integration & API Control | ToolLLM: Mastering 16K+ Real-World APIs | https://arxiv.org/pdf/2307.16789 |
LLM Tool Integration & API Control | MetaGPT: Multi-Agent Collaborative Framework | https://arxiv.org/pdf/2308.00352 |
LLM Tool Integration & API Control | AutoGen: Next-Gen LLM Apps via Multi-Agent Conversation | https://arxiv.org/pdf/2308.08155 |
LLM Tool Integration & API Control | LLaVA-Plus: Creating Multimodal Agents with Tools | https://arxiv.org/pdf/2311.05437 |
LLM Tool Integration & API Control | ChemCrow: Augmenting LLMs with Chemistry Tools | https://arxiv.org/pdf/2304.05376 |
LLM Tool Integration & API Control | TORL: Scaling Tool-Integrated Reinforcement Learning | https://arxiv.org/pdf/2503.23383 |
Deep Research Systems | OpenAI's 'Deep Research' Tool: Usefulness for Scientists | https://www.nature.com/articles/d41586-025-00377-9 |
Deep Research Systems | OpenAI's Deep Research: Functionality and Applications | https://www.youreverydayai.com/openais-deep-research-how-it-works-and-what-to-use-it-for/ |
Deep Research Systems | Deep Research System Card | https://cdn.openai.com/deep-research-system-card.pdf |
Deep Research Systems | Gemini Launches Deep Research on Gemini 2.5 Pro | https://www.ctol.digital/news/gemini-deep-research-launch-2-5-pro-vs-openai/ |
Deep Research Systems | Deep Research Now Available on Gemini 2.5 Pro Experimental | https://blog.google/products/gemini/deep-research-gemini-2-5-pro-experimental/ |
Deep Research Systems | ChatGPT's Deep Research vs. Google's Gemini 1.5 Pro: Comparison | https://whitebeardstrategies.com/ai-prompt-engineering/chatgpts-deep-research-vs-googles-gemini-1-5-pro-with-deep-research-a-detailed-comparison/ |
Deep Research Systems | ChatGPT Deep Research vs Perplexity: Comparative Analysis | https://blog.getbind.co/2025/02/03/chatgpt-deep-research-is-it-better-than-perplexity/ |
Deep Research Systems | Sonar by Perplexity [Technical Documentation] | https://docs.perplexity.ai/guides/model-cards#research-models |
RAG Technology | Ragnarök: Reusable RAG Framework for TREC 2024 | http://arxiv.org/pdf/2406.16828 |
RAG Technology | From Documents to Dialogue: KG-RAG Enhanced AI Assistants | https://arxiv.org/pdf/2502.15237 |
RAG Technology | GEAR-Up: AI-Augmented Scholarly Search for Systematic Reviews | https://arxiv.org/pdf/2312.09948 |
RAG Technology | Survey on RAG for Large Language Models | https://arxiv.org/pdf/2405.06211 |
RAG Technology | Knowledge Retrieval Based on Generative AI | https://arxiv.org/pdf/2501.04635 |
LLM Reasoning & Optimization | Self-Consistency Improves Chain-of-Thought Reasoning | https://arxiv.org/pdf/2203.11171 |
LLM Reasoning & Optimization | Chain-of-Thought Prompting Elicits Reasoning in LLMs | https://arxiv.org/pdf/2201.11903 |
LLM Reasoning & Optimization | Training LLMs to Follow Instructions with Human Feedback | https://arxiv.org/pdf/2203.02155 |
LLM Reasoning & Optimization | Debate Enhances Weak-to-Strong Generalization | https://arxiv.org/pdf/2501.13124 |
LLM Reasoning & Optimization | Mask-DPO: Factuality Alignment for LLMs | https://arxiv.org/pdf/2503.02846 |
LLM Reasoning & Optimization | QuestBench: Can LLMs Ask Optimal Questions? | https://arxiv.org/abs/2503.22674 |
Multi-Agent Systems | AgentVerse: Multi-Agent Collaboration and Emergent Behaviors | https://arxiv.org/pdf/2308.10848 |
Multi-Agent Systems | MetaAgents: Human Behavior Simulation for Task Coordination | https://arxiv.org/pdf/2310.06500 |
Multi-Agent Systems | CAMEL: Communicative Agents for LLM Society Exploration | https://arxiv.org/pdf/2303.17760 |
Multi-Agent Systems | Many Heads Improve Scientific Idea Generation | https://arxiv.org/pdf/2410.09403 |
Multi-Agent Systems | Why Multi-Agent LLM Systems Fail | https://arxiv.org/pdf/2503.13657 |
Multi-Agent Systems | Multi-Agent System for Cosmological Parameter Analysis | https://arxiv.org/pdf/2412.00431 |
Code & Software Development | CodeA11y: Accessible Web Development with AI | https://arxiv.org/pdf/2502.10884 |
Code & Software Development | AutoDev: Automated AI-Driven Development | https://arxiv.org/pdf/2403.08299 |
Code & Software Development | ChatDev: Communicative Agents for Software Development | https://aclanthology.org/2024.acl-long.810.pdf |
Code & Software Development | Natural Language as a Programming Language | https://drops.dagstuhl.de/storage/00lipics/lipics-vol071-snapl2017/LIPIcs.SNAPL.2017.4/LIPIcs.SNAPL.2017.4.pdf |
Code & Software Development | AIDE: AI-Driven Code Exploration | https://arxiv.org/pdf/2502.13138 |
Code & Software Development | AI-Assisted Programming: Big Code NLP | https://arxiv.org/pdf/2307.02503 |
Code & Software Development | AI-Assisted SQL Authoring at Industry Scale | https://arxiv.org/pdf/2407.13280 |
Code & Software Development | Steward: Natural Language Web Automation | https://arxiv.org/pdf/2409.15441 |
Domain-Specific AI Tools | MatPilot: AI Materials Scientist | https://arxiv.org/pdf/2411.08063 |
Domain-Specific AI Tools | EvoPat: Multi-LLM Patent Summarization Agent | https://arxiv.org/pdf/2412.18100 |
Domain-Specific AI Tools | ChartCitor: Fine-Grained Chart Attribution Framework | https://arxiv.org/pdf/2502.00989 |
Domain-Specific AI Tools | PatentGPT: Knowledge-Based Patent Drafting | https://arxiv.org/pdf/2409.00092 |
Domain-Specific AI Tools | SciAgents: Multi-Agent Scientific Discovery | https://arxiv.org/pdf/2409.05556 |
Domain-Specific AI Tools | Dolphin: Closed-Loop Open-Ended Auto-Research | https://arxiv.org/pdf/2501.03916 |
Domain-Specific AI Tools | SeqMate: Automating RNA Sequencing with LLMs | https://arxiv.org/pdf/2407.03381 |
Domain-Specific AI Tools | Knowledge Synthesis of Photosynthesis via LLMs | https://arxiv.org/pdf/2502.01059 |
Domain-Specific AI Tools | GeoLLM: Geospatial Knowledge Extraction from LLMs | https://arxiv.org/pdf/2310.06213 |
HCI & AI User Experience | System Usability Scale: Evolution and Future | https://doi.org/10.1080/10447318.2018.1455307 |
HCI & AI User Experience | CARE: Collaborative AI Reading Environment | https://arxiv.org/pdf/2302.12611 |
HCI & AI User Experience | VISAR: Visual Argumentative Writing Assistant | https://arxiv.org/pdf/2304.07810 |
HCI & AI User Experience | AdaptoML-UX: User-Centered AutoML Toolkit | https://arxiv.org/pdf/2410.17469 |
HCI & AI User Experience | AI Assistants for Semi-Automated Data Wrangling | https://arxiv.org/pdf/2211.00192 |
HCI & AI User Experience | Documentation Matters: Human-Centered AI Systems | https://arxiv.org/pdf/2102.12592 |
HCI & AI User Experience | Need Help? Proactive Programming Assistants | https://arxiv.org/abs/2410.04596 |
HCI & AI User Experience | Large-Scale Survey on AI Programming Assistant Usability | https://arxiv.org/abs/2303.17125 |
AI Evaluation & Benchmarking | TruthfulQA: Measuring Model Mimicry of Human Falsehoods | https://arxiv.org/pdf/2109.07958 |
AI Evaluation & Benchmarking | HotpotQA: Dataset for Multi-hop Question Answering | https://arxiv.org/pdf/1809.09600 |
AI Evaluation & Benchmarking | WebArena: Web Agent Benchmark | https://github.com/web-arena-x/webarena |
AI Evaluation & Benchmarking | Measuring Short-Form Factuality in LLMs | https://cdn.openai.com/papers/simpleqa.pdf |
AI Evaluation & Benchmarking | Survey on LLM-Generated Text Detection | https://arxiv.org/pdf/2310.14724 |
AI Evaluation & Benchmarking | Evaluating AI-Assisted Code Generation Tools | https://arxiv.org/pdf/2304.10778 |
AI Evaluation & Benchmarking | Benchmarking ChatGPT, Codeium, and GitHub Copilot | https://arxiv.org/pdf/2409.19922 |
AI Evaluation & Benchmarking | FinEval: Chinese Financial Knowledge Benchmark | https://arxiv.org/pdf/2308.09975 |
AI Evaluation & Benchmarking | Knowledge-Based Evaluation Methodology for AI Assistants | https://arxiv.org/pdf/2406.05603 |
AI Evaluation & Benchmarking | GRADE Guidelines: Rating Evidence Quality | https://pubmed.ncbi.nlm.nih.gov/21208779/ |
AI Evaluation & Benchmarking | Holistic Evaluation of Language Models | https://arxiv.org/pdf/2211.09110 |
AI Evaluation & Benchmarking | AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | https://arxiv.org/pdf/2304.06364 |
AI Evaluation & Benchmarking | GAIA: A Benchmark for General AI Assistants | https://arxiv.org/pdf/2311.12983 |
AI Evaluation & Benchmarking | MMLU Benchmark: Testing LLMs' Multi-Task Capabilities | https://www.bracai.eu/post/mmlu-benchmark |
AI Evaluation & Benchmarking | Enabling AI Scientists to Recognize Innovation: A Domain-Agnostic Algorithm for Assessing Novelty | https://arxiv.org/pdf/2503.01508 |
AI Evaluation & Benchmarking | The Impact of AI and Peer Feedback on Research Writing Skills: A Study Using the CGScholar Platform Among Kazakhstani Scholars | https://arxiv.org/pdf/2503.05820 |
AI Evaluation & Benchmarking | Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform | https://arxiv.org/pdf/2502.21266 |
AI Evaluation & Benchmarking | EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants | https://arxiv.org/pdf/2502.20309 |
AI Evaluation & Benchmarking | Bridging Logic Programming and Deep Learning for Explainability through ILASP | https://arxiv.org/pdf/2502.09227 |
AI Evaluation & Benchmarking | Self-Explanation in Social AI Agents | https://arxiv.org/pdf/2501.13945 |
AI Evaluation & Benchmarking | Fine-Grained Appropriate Reliance: Human-AI Collaboration with a Multi-Step Transparent Decision Workflow for Complex Task Decomposition | https://arxiv.org/pdf/2501.10909 |
AI Evaluation & Benchmarking | CATER: Leveraging LLM to Pioneer a Multidimensional, Reference-Independent Paradigm in Translation Quality Evaluation | https://arxiv.org/pdf/2412.11261 |
AI Evaluation & Benchmarking | GigaCheck: Detecting LLM-generated Content | https://arxiv.org/pdf/2410.23728 |
AI Evaluation & Benchmarking | Vital Insight: Assisting Experts' Context-Driven Sensemaking of Multi-modal Personal Tracking Data Using Visualization and Human-In-The-Loop LLM Agents | https://arxiv.org/pdf/2410.14879 |
AI Evaluation & Benchmarking | Aligning AI-driven discovery with human intuition | https://arxiv.org/pdf/2410.07 |
AI Evaluation & Benchmarking | Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics | https://arxiv.org/pdf/2502.15815 |
AI Evaluation & Benchmarking | Insect-Foundation: A Foundation Model and Large Multimodal Dataset for Vision-Language Insect Understanding | https://arxiv.org/pdf/2502.09906 |
AI Evaluation & Benchmarking | Minerva: A Programmable Memory Test Benchmark for Language Models | https://arxiv.org/pdf/2502.03358 |
AI Evaluation & Benchmarking | UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models | https://arxiv.org/pdf/2502.00334 |
AI Evaluation & Benchmarking | Learning to Coordinate with Experts | https://arxiv.org/pdf/2502.09583 |
AI Evaluation & Benchmarking | Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs | https://arxiv.org/pdf/2502.15224 |
AI Evaluation & Benchmarking | How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation | https://arxiv.org/pdf/2412.18573 |
AI Evaluation & Benchmarking | LLM4DS: Evaluating Large Language Models for Data Science Code Generation | https://arxiv.org/pdf/2411.11908 |
AI Evaluation & Benchmarking | RedCode: Risky Code Execution and Generation Benchmark for Code Agents | https://arxiv.org/pdf/2411.07781 |
AI Evaluation & Benchmarking | SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey | https://arxiv.org/pdf/2411.00172 |
AI Evaluation & Benchmarking | INQUIRE: A Natural World Text-to-Image Retrieval Benchmark | https://arxiv.org/pdf/2411.02537 |
AI Evaluation & Benchmarking | AAAR-1.0: Assessing AI's Potential to Assist Research | https://arxiv.org/pdf/2410.22394 |
AI Evaluation & Benchmarking | AutoPenBench: Benchmarking Generative Agents for Penetration Testing | https://arxiv.org/pdf/2410.03225 |
AI Evaluation & Benchmarking | CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs | https://arxiv.org/pdf/2410.01999 |
AI Evaluation & Benchmarking | UniSumEval: Towards Unified, Fine-Grained, Multi-Dimensional Summarization Evaluation for LLMs | https://arxiv.org/pdf/2409.19898 |
AI Evaluation & Benchmarking | CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data | https://arxiv.org/pdf/2409.13903 |
AI Evaluation & Benchmarking | ChemDFM-X: Towards Large Multimodal Model for Chemistry | https://arxiv.org/pdf/2409.13194 |
AI Evaluation & Benchmarking | DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? | https://arxiv.org/pdf/2409.07703 |
AI Evaluation & Benchmarking | GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI | https://arxiv.org/pdf/2408.03361 |
AI Evaluation & Benchmarking | MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding | https://arxiv.org/pdf/2407.04903 |
AI Evaluation & Benchmarking | SciCode: A Research Coding Benchmark Curated by Scientists | https://arxiv.org/pdf/2407.13168 |
AI Evaluation & Benchmarking | MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows | https://arxiv.org/pdf/2406.06357 |
AI Evaluation & Benchmarking | Turing Tests For An AI Scientist | https://arxiv.org/pdf/2405.13352 |
AI Evaluation & Benchmarking | LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model | https://arxiv.org/pdf/2402.02544 |
AI Evaluation & Benchmarking | OceanGPT: A Large Language Model for Ocean Science Tasks | https://arxiv.org/pdf/2310.02031 |
AI Evaluation & Benchmarking | LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles | https://arxiv.org/pdf/2308.10855 |
AI Evaluation & Benchmarking | BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | https://arxiv.org/pdf/2308.05960 |
AI Evaluation & Benchmarking | MegaWika: Millions of Reports and Their Sources Across 50 Diverse Languages | https://arxiv.org/pdf/2307.07049 |
AI Evaluation & Benchmarking | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | https://arxiv.org/pdf/2209.09513 |
AI Evaluation & Benchmarking | Benchmarking Agentic Workflow Generation | https://arxiv.org/abs/2410.07869 |
AI Evaluation & Benchmarking | TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks | https://arxiv.org/abs/2412.14161 |
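
Agent benchmarks in this category (e.g., GAIA, SimpleQA) typically score a system by normalized or quasi-exact match against gold answers. Below is a minimal harness sketch under that assumption; `agent` is a hypothetical callable, and real harnesses add timeouts, sandboxed tools, and per-difficulty breakdowns.

```python
# Minimal sketch of a GAIA/SimpleQA-style scoring loop: run the agent
# on each task and compare normalized answers. `agent` is hypothetical.
from typing import Callable


def normalize(text: str) -> str:
    # Collapse case and whitespace so trivial formatting differences
    # do not count as errors.
    return " ".join(text.lower().split())


def evaluate(agent: Callable[[str], str], tasks: list[dict]) -> float:
    correct = sum(
        normalize(agent(task["question"])) == normalize(task["answer"])
        for task in tasks
    )
    return correct / len(tasks)


if __name__ == "__main__":
    # Dry run with a toy task and a constant agent.
    toy = [{"question": "What is 2 + 2?", "answer": "4"}]
    print(evaluate(lambda q: "4", toy))  # 1.0
```
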
```bibtex
@misc{xu2025comprehensive,
  title={A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications},
  author={Renjun Xu and Jingwen Peng},
  year={2025},
  eprint={2506.12594},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}
```