Tracking the systems that automate scientific research — from single-purpose code agents to full idea-to-paper pipelines
Autonomous research systems have gone from weekend experiments to NeurIPS Spotlight papers in under two years. This repository catalogues 30+ active projects across the full spectrum — lightweight literature scrapers, multi-agent experiment runners, and end-to-end systems that can take a vague research direction and output a reviewable manuscript — together with a capability comparison matrix, a pipeline map, a tool selection guide, and in-depth technical reports for the most impactful systems.
Latest Additions (2026-04-08): Tier updates across all sections (ARIS 5.8k★→🏆, Aider 43k★→🏆, Tongyi DeepResearch 18.6k★→🏆, SWE-agent 19k★→🏆), 7 new in-depth reports (Aider, ARIS, Tongyi DeepResearch, DeepResearchAgent, Idea2Paper, SciAgents, AIDE), Capability Matrix corrected to 18 rows
Understanding where each tool fits in the research process is key to choosing the right one.
╔════════════════════════════════════════════════════════════════════════════════╗
║ THE AUTONOMOUS RESEARCH PIPELINE ║
╠══════════════╦════════════════╦════════════════╦════════════╦══════════════════╣
║ DISCOVER ║ SYNTHESIZE ║ HYPOTHESIZE ║ EXECUTE ║ WRITE & REVIEW ║
║ ║ ║ ║ ║ ║
║ Idea2Paper ║ STORM ║ AI-Scientist ║ OpenHands ║ AI-Scientist ║
║ SciAgents ║ GPT Researcher ║ AI-Researcher ║ SWE-agent ║ Agent Lab ║
║ ResAgent ║ PaperQA2 ║ Agent Lab ║ Aider ║ AI-Researcher ║
║ ║ OpenScholar ║ autoresearch ║ AIDE ║ ║
║ ║ DeerFlow ║ ║ ║ ║
╠══════════════╩════════════════╩════════════════╩════════════╩══════════════════╣
║ ◄── FULL PIPELINE (End-to-End) ──► ║
║ autoresearch · AI-Scientist v1/v2 · AI-Researcher · Agent Laboratory · Biomni ║
╚════════════════════════════════════════════════════════════════════════════════╝
- 📊 Capability Matrix
- 🚀 End-to-End Research Systems
- 🔍 Literature Review & Deep Research
- ⚗️ Experiment Automation & Code Agents
- ✍️ Idea Generation & Writing Assistants
- 📐 Benchmarks & Evaluation Suites
- 🎓 Academic Surveys & Papers
- 🧾 In-Depth Analysis Reports
- 🧭 How to Choose the Right Tool
- 🤝 Contributing
- 📈 Star History
## 📊 Capability Matrix

The Tier column groups systems by overall impact and maturity — this same tier label appears in every section table below, so you can quickly cross-reference.
| Tier | System | Lit Review | Hypothesis | Code Exec | Paper Writing | Peer Review | Multimodal | Fully Local |
|---|---|---|---|---|---|---|---|---|
| 🏆 | OpenHands | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ |
| 🏆 | autoresearch | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| 🏆 | DeerFlow | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | |
| 🏆 | STORM | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | |
| 🏆 | GPT Researcher | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | |
| 🏆 | SWE-agent | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ |
| 🏆 | deep-research | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | |
| 🏆 | AI-Scientist | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | |
| 🏆 | RD-Agent | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | |
| 🏆 | Open Deep Research | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | |
| 🏆 | PaperQA2 | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| 🏆 | MiroThinker | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | |
| 🏆 | ARIS | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| 🏆 | Agent Laboratory | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | |
| 🏆 | AI-Scientist-v2 | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | |
| 🏆 | AI-Researcher | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| 🌟 | EvoScientist | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | |
| 🌟 | Biomni | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | |
Tier legend: 🏆 Landmark — defined or significantly shaped the field · 🌟 Flagship — mature, widely adopted, strong results · 🔬 Notable — active, specialized, or emerging
Capability legend: ✅ Native · ⚠️ Partial / requires setup · ❌ Not supported
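To make the matrix actionable, here is a tiny Python sketch that filters it. The `MATRIX` dict transcribes four rows from the table above, and `find` is an illustrative helper, not part of any listed project:

```python
# Four rows transcribed from the capability matrix above.
# True = native support (✅), False = not supported (❌); blank cells are omitted.
MATRIX = {
    "OpenHands":     {"code_exec": True, "paper_writing": False, "fully_local": True},
    "ARIS":          {"code_exec": True, "paper_writing": True,  "fully_local": False},
    "PaperQA2":      {"lit_review": True, "code_exec": False,    "fully_local": True},
    "AI-Researcher": {"code_exec": True, "paper_writing": True,  "fully_local": True},
}

def find(**required):
    """Return systems whose transcribed capabilities include all required flags."""
    return sorted(
        name for name, caps in MATRIX.items()
        if all(caps.get(flag) for flag in required)
    )

# Systems that both execute code and run fully locally (per the rows above):
print(find(code_exec=True, fully_local=True))  # ['AI-Researcher', 'OpenHands']
```

The same pattern extends to the full 18-row matrix if you transcribe the remaining rows.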
## 🚀 End-to-End Research Systems

Systems that automate the full research lifecycle: discovery → hypothesis → experiments → manuscript. The most ambitious category — each one aims to replace or augment the entire scientific process.
| Tier | Project | Stars | Core Approach | Notes | Report |
|---|---|---|---|---|---|
| 🏆 | autoresearch<br>Andrej Karpathy | — | 630-line agent; reads its own training script, forms hypotheses, modifies code, runs hundreds of experiments overnight | Minimal & self-contained; seminal proof-of-concept | 📄 |
| 🏆 | AI-Scientist<br>SakanaAI · 2024 | — | Template-driven idea generation → experiment loop → LaTeX write-up → agentic peer review | First comprehensive end-to-end system; multiple ML research templates | 📄 |
| 🏆 | RD-Agent<br>Microsoft Research · 2025 | — | Dual-agent R&D automation: Research agent (ideation) + Development agent (implementation) with iterative loops | #1 on MLE-bench (30.22%); NeurIPS 2025; data-centric multi-domain framework | 📄 |
| 🏆 | AI-Scientist-v2<br>SakanaAI · 2025 | — | BFTS (best-first agentic tree search) + AIDE for code generation | First AI-written paper accepted through standard peer review | 📄 |
| 🌟 | DATAGEN<br>starpig1129 · 2025 | — | Multi-agent orchestration: hypothesis generation → data analysis → visualization → report generation | LangChain + LangGraph; advanced state tracking via Note Taker agent | 📄 |
| 🏆 | AI-Researcher<br>HKUDS · NeurIPS 2025 Spotlight | — | LiteLLM multi-provider + Docker-sandboxed execution + Gradio UI | Broadest LLM compatibility; strong reproducibility focus | 📄 |
| 🏆 | Agent Laboratory<br>SamuelSchmidgall · 2024 | — | Role-specialized multi-agent: Professor → PhD Student → Reviewer | arXiv + HuggingFace integration for literature and datasets | 📄 |
| 🌟 | EvoScientist<br>EvoScientist Team · 2026 | — | Six-agent team (plan, research, code, analyze, write, review) with RL self-improvement | ICAIS 2025 Best Paper; #1 on DeepResearch Bench II; human-on-the-loop paradigm | 📄 |
| 🌟 | Biomni<br>Stanford SNAP · 2025 | — | Biomedical datalake + know-how library + sandboxed code execution | Domain-specialized for biology & medicine; multimodal inputs | 📄 |
| 🔬 | MedResearcher-R1<br>AQ-MedAI · 2025 | — | KG-grounded multi-hop QA synthesis + trajectory generation for medical AI training | SOTA on MedBrowseComp; open 32B model + full training data released | 📄 |
| 🔬 | BioAgents<br>bio-xyz · 2025 | — | Specialized literature + analysis agents for biological sciences | SOTA on BixBench (48.78% open-answer); configurable dual-agent backend | 📄 |
| 🏆 | ARIS<br>wanshuiyin | — | Claude Code + MCP servers; runs overnight unattended | Cross-model review loops; Zotero + Obsidian integration | 📄 |
| 🌟 | Idea2Paper<br>AgentAlphaAGI | — | Multi-agent + Knowledge Graph alignment for novelty checking | Semantic Scholar + arXiv grounding; idea → draft pipeline | 📄 |
## 🔍 Literature Review & Deep Research

Systems specialized in information gathering, synthesis, and structured report generation. The entry point for most research workflows — and often the most practical category for daily use.
| Tier | Project | Stars | Core Approach | Notes | Report |
|---|---|---|---|---|---|
| 🏆 | deep-research<br>dzhng (Aomni) · 2025 | — | Recursive depth/breadth search with Firecrawl + LLM extraction; <500 LoC reference scaffold | Most-forked deep-research scaffold; direct inspiration for Open Deep Research and DeerFlow | 📄 |
| 🏆 | STORM<br>Stanford OVAL · NAACL 2024 | — | Multi-perspective question asking + DSPy pipeline | Generates full Wikipedia-style articles with citations; Co-STORM for collaborative mode | 📄 |
| 🏆 | GPT Researcher<br>assafelovic · 2023 | — | Parallel web scraping agents + LangGraph orchestration | Outputs 5–6 page cited report (PDF / Docx / MD); MCP server support | 📄 |
| 🏆 | MiroThinker<br>MiroMind AI · 2025 | — | RL-trained open-source agent (30B / 235B) with 256K context + 300 tool calls | SOTA on BrowseComp (88.2 H1, 74.0 open); step-verifiable long-chain reasoning | 📄 |
| 🌟 | CognitiveKernel-Pro<br>Tencent AI Lab · 2025 | — | SFT-trained Qwen3-8B + Playwright web engine + multi-agent (web/file/main) | Outperforms RL-trained WebDancer/WebSailor on GAIA using an SFT-only recipe; fully open model & data | 📄 |
| 🏆 | DeerFlow<br>ByteDance · 2025 | — | Sub-agent orchestration with persistent memory + InfoQuest + LangGraph | Uniquely combines deep research with code generation in one pipeline | 📄 |
| 🔬 | Deeper-Seeker<br>HarshJ23 · 2024 | — | Iterative research with follow-up questions + multi-step query generation + report synthesis | OSS alternative to OpenAI's Deep Research; Exa integration for web search | 📄 |
| 🌟 | PaperQA2<br>Future House · ICLR 2024 | — | Iterative RAG over full-text PDFs using a tantivy search index | Highest-accuracy Q&A over local scientific papers; outperforms Perplexity Pro | 📄 |
| 🌟 | OpenScholar<br>Asai et al. · Nature 2024 | — | Dense retrieval (Contriever) over 45M open-access papers | Outperforms PaperQA2 on scientific Q&A; evidence-grounded answers | 📄 |
| 🌟 | Open Deep Research<br>LangChain · 2025 | — | LangGraph workflow + MCP tool plugins + LangSmith tracing | Reference implementation from LangChain; highly configurable | 📄 |
| 🌟 | ToolUniverse<br>Harvard Medical School · 2025 | — | AI-Tool Interaction Protocol; 1,000+ tools (ML models, datasets, APIs, packages) | Universal LLM support (Claude, GPT, Gemini, Qwen, DeepSeek); 68+ pre-built research skills | 📄 |
| 🏆 | Tongyi DeepResearch<br>Alibaba NLP · 2025 | — | RL-trained agentic LLM (30.5B, GRPO) | SOTA on long-horizon information-seeking benchmarks; open-weight model | 📄 |
| 🌟 | DeepResearchAgent<br>Skywork AI | — | Hierarchical multi-agent + Autogenesis self-evolution | Planning agent coordinates specialized lower-level agents | 📄 |
| 🔬 | II-Researcher<br>Intelligent Internet · 2025 | — | BAML-structured LLM functions + multi-provider web search + async reflection loop | 84.12% on the Frames multi-hop benchmark; MCP server support; pip-installable | 📄 |
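Most systems in this table, deep-research most explicitly, share a recursive depth/breadth control flow: answer the query, ask follow-up questions, and recurse on each one. A minimal sketch of that pattern with the LLM and search backends stubbed out (`expand_query` and `search` are placeholders, not any project's real API):

```python
def expand_query(query: str, breadth: int) -> list[str]:
    # Placeholder: a real system would ask an LLM for follow-up queries here.
    return [f"{query} / follow-up {i}" for i in range(breadth)]

def search(query: str) -> str:
    # Placeholder: a real system would call a search/scrape backend here.
    return f"findings for: {query}"

def deep_research(query: str, depth: int, breadth: int) -> list[str]:
    """Recursively widen (breadth) and deepen (depth) a research question,
    accumulating the findings from every visited query."""
    findings = [search(query)]
    if depth > 0:
        for sub in expand_query(query, breadth):
            findings += deep_research(sub, depth - 1, breadth)
    return findings

# depth=2, breadth=2 visits 1 + 2 + 4 = 7 queries in total.
print(len(deep_research("sparse attention", depth=2, breadth=2)))  # 7
```

With depth d and breadth b the loop visits (b^(d+1) − 1)/(b − 1) queries, which is why real scaffolds keep both parameters small.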
## ⚗️ Experiment Automation & Code Agents

The "hands" of an autonomous research pipeline. These systems write, execute, debug, and iterate on code — essential when a hypothesis needs to become a running experiment.
| Tier | Project | Stars | Core Approach | Notes | Report |
|---|---|---|---|---|---|
| 🏆 | OpenHands<br>All-Hands-AI · 2024 | — | Composable Python agent library; file editing + terminal + web browsing | 72% on SWE-Bench Verified — best-in-class; production-ready UI | 📄 |
| 🏆 | SWE-agent<br>Princeton NLP · 2024 | — | Agent-Computer Interface (ACI) giving structured file/bash/edit access | ~19% on SWE-Bench (full); widely used as a research baseline | 📄 |
| 🏆 | Aider<br>Aider-AI · 2023 | — | AI pair programming in the terminal with native Git integration | ~18% on SWE-Bench; fastest daily iteration loop; supports 60+ models | 📄 |
| 🔬 | AutoGPT<br>Significant Gravitas · 2023 | — | Plugin-based autonomous agent platform + Forge builder framework | Historically seminal; sparked the autonomous agent movement | 📄 |
| 🌟 | AIDE<br>WecoAI · 2024 | — | Tree search over the ML solution space with iterative code refinement | ML-experiment-specific; used internally by AI-Scientist-v2 | 📄 |
| 🔬 | AutoDidact<br>dCaples · 2025 | — | GRPO RL + self-generated Q&A pairs to bootstrap research-agent LLMs on custom corpora | Doubles Llama-8B accuracy in 1 hr on a single RTX 4090; fully local open-source pipeline | 📄 |
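Architectures differ, but every agent in this table runs the same outer loop: propose an action, execute it in a sandboxed environment, observe the result, and repeat until the task resolves or a step budget runs out. A deliberately toy sketch of that loop; `Env` and `propose_action` are illustrative stand-ins, not any listed project's API:

```python
from dataclasses import dataclass, field

@dataclass
class Env:
    """Toy stand-in for a sandboxed shell/editor environment."""
    target: int
    state: int = 0
    log: list = field(default_factory=list)

    def step(self, action: str) -> str:
        self.log.append(action)
        if action == "increment":
            self.state += 1
        return "done" if self.state == self.target else f"state={self.state}"

def propose_action(observation: str) -> str:
    # Placeholder policy: a real agent would query an LLM with the
    # observation, file context, and task description.
    return "increment"

def run_agent(env: Env, max_steps: int = 10) -> bool:
    """Observe → act loop with a hard step budget, as most code agents use."""
    observation = "start"
    for _ in range(max_steps):
        observation = env.step(propose_action(observation))
        if observation == "done":
            return True
    return False  # budget exhausted without resolving the task

print(run_agent(Env(target=3)))  # True
```

The step budget is what separates "runs overnight" systems from runaway loops; every agent above exposes an equivalent knob.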
## ✍️ Idea Generation & Writing Assistants

Systems focused on the creative and communicative ends of research: surfacing novel hypotheses, structuring arguments, and drafting manuscripts.
| Tier | Project | Stars | Core Approach | Notes | Report |
|---|---|---|---|---|---|
| 🌟 | Idea2Paper<br>AgentAlphaAGI | — | Multi-agent pipeline with Knowledge Graph novelty alignment | Semantic Scholar + arXiv grounding; raw idea → structured research proposal | 📄 |
| 🌟 | SciAgents<br>MIT · 2024 | — | Multi-agent system with an ontology graph for scientific reasoning | Generates multi-step reasoning chains grounded in domain ontologies | 📄 |
| 🏆 | ARIS<br>wanshuiyin | — | Claude Code + MCP servers running overnight without supervision | Cross-model review loop; integrates Zotero, Obsidian, Kimi, DeepSeek | 📄 |
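The novelty checking described above can be illustrated, very loosely, as similarity scoring against an indexed corpus. This toy sketch substitutes keyword Jaccard overlap for the real knowledge-graph or embedding machinery; the function names and mini-corpus are invented for illustration:

```python
def keywords(text: str) -> set:
    # Crude tokenizer stand-in; real systems extract entities or embeddings.
    return set(text.lower().split())

def novelty(idea: str, corpus: list) -> float:
    """1 minus the maximum Jaccard overlap with any indexed abstract.
    Real systems use KG alignment or dense retrieval instead of word overlap."""
    kw = keywords(idea)
    overlap = max(
        (len(kw & keywords(doc)) / len(kw | keywords(doc)) for doc in corpus),
        default=0.0,
    )
    return 1.0 - overlap

corpus = [
    "graph neural networks for molecule property prediction",
    "transformers for protein structure",
]
# An idea identical to an indexed abstract scores zero novelty.
print(novelty("graph neural networks for molecule property prediction", corpus))  # 0.0
```

The interesting engineering in Idea2Paper and SciAgents is exactly what this sketch omits: grounding the comparison in structured literature rather than surface word overlap.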
## 📐 Benchmarks & Evaluation Suites

Principled evaluation frameworks for measuring the capabilities of autonomous research systems.
| Benchmark | Maintained By | What It Measures | Link |
|---|---|---|---|
| SWE-Bench | Princeton NLP | Software engineering task resolution on real GitHub issues | github.com/princeton-nlp/SWE-bench |
| SWE-Bench Verified | OpenAI | Human-verified subset of SWE-Bench (cleaner signal) | openai.com/research |
| MLE-Bench | OpenAI | ML engineering quality on Kaggle competition tasks | github.com/openai/mle-bench |
| CORE-Bench | — | Computational reproducibility of published research | — |
| AI-Scientist Eval | SakanaAI | Paper quality via automated + human review | AI-Scientist |
| MLGym | Meta AI Research | 13 open-ended AI research tasks (CV, NLP, RL, game theory) for benchmarking research agents | github.com/facebookresearch/MLGym · arXiv:2502.14499 |
| DeepResearch Bench | Ayanami et al. | Comprehensive multi-domain benchmark for deep research agent quality | github.com/Ayanami0730/deep_research_bench |
| BixBench | bio-xyz | Biology-focused tool-use benchmark for research agents | github.com/bio-xyz/BioAgents |
| MedBrowseComp | AQ-MedAI | Medical knowledge synthesis via multi-hop web retrieval | github.com/AQ-MedAI/MedResearcher-R1 |
💡 Contributions to this section are especially welcome — if you know of additional evaluation suites for research agents, please open an issue or submit a PR.
## 🎓 Academic Surveys & Papers

| Year | Title | Venue | Authors | Link |
|---|---|---|---|---|
| 2024 | The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | arXiv | Lu et al. (SakanaAI) | arXiv:2408.06292 |
| 2024 | From Copilot to Pilot: Towards AI-Driven Autonomous Scientific Research | arXiv | Guo et al. | arXiv:2409.14526 |
| 2024 | Agent Laboratory: Using LLM Agents as Research Assistants | arXiv | Schmidgall et al. | arXiv:2501.04227 |
| 2024 | STORM: Assisting in Writing Wikipedia-like Articles From Scratch | NAACL | Shao et al. (Stanford) | arXiv:2402.14207 |
| 2024 | OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs | Nature | Asai et al. | Nature |
| 2024 | PaperQA2: Accurate Scientific QA through Iterative Literature Search | ICLR | Skarlinski et al. | arXiv:2312.07559 |
| 2025 | Towards Automated Research: A Survey of AI Agents for Scientific Discovery | arXiv | Various | — |
| 2025 | The AI Scientist-v2: Workshop-Level AI Research Automation | arXiv | Lu et al. (SakanaAI) | arXiv:2504.08066 |
| 2025 | EvoScientist: Automated Scientific Discovery with Evolvable Multi-Agent Collaboration | ICAIS 2025 Best Paper | EvoScientist Team | — |
| 2025 | MLGym: A New Framework and Benchmark for Advancing AI Research Agents | arXiv | Nathani et al. (Meta) | arXiv:2502.14499 |
| 2025 | SciAgents: Accelerating Scientific Discovery with Multi-Agent Intelligent Graph Reasoning | Advanced Materials | Buehler et al. (MIT) | — |
| 2025 | Tongyi DeepResearch: Reinforcement Learning for Deep Research Agents | arXiv | Alibaba NLP | — |
## 🧾 In-Depth Analysis Reports

The reports/ folder is the core value of this repository. Each file contains a structured 10-section analysis: architecture internals, component breakdowns, benchmark context, and honest assessment of strengths and limitations.
| Tier | Report | System | Category | Key Topics Covered |
|---|---|---|---|---|
| 🏆 | ai-scientist.md | AI-Scientist | End-to-End | LaTeX pipeline, template-driven idea gen, agentic review loop |
| 🏆 | ai-scientist-v2.md | AI-Scientist v2 | End-to-End | BFTS tree search, AIDE integration, peer review milestone |
| 🏆 | ai-researcher.md | AI-Researcher | End-to-End | LiteLLM multi-provider, Docker sandbox, NeurIPS 2025 |
| 🏆 | agent-laboratory.md | Agent Laboratory | End-to-End | Role-specialized agents, arXiv + HuggingFace integration |
| 🌟 | biomni.md | Biomni | End-to-End | Biomedical datalake, know-how library, multimodal inputs |
| 🔬 | bioagents.md | BioAgents | End-to-End | Specialized literature + analysis agents, BixBench SOTA (48.78%) |
| 🏆 | storm.md | STORM | Literature | DSPy pipeline, multi-perspective QA, Co-STORM |
| 🏆 | gpt-researcher.md | GPT Researcher | Literature | Parallel scraping, LangGraph orchestration, MCP |
| 🏆 | deerflow.md | DeerFlow | Literature | ByteDance InfoQuest, sub-agent memory, code execution |
| 🌟 | paperqa2.md | PaperQA2 | Literature | Iterative retrieval, tantivy indexing, ICLR results |
| 🌟 | openscholar.md | OpenScholar | Literature | 45M paper index, Contriever dense retrieval, Nature paper |
| 🌟 | open-deep-research.md | Open Deep Research | Literature | LangChain MCP integration, LangSmith tracing |
| 🏆 | openhands.md | OpenHands | Code Agent | 72% SWE-Bench Verified, composable agent architecture |
| 🏆 | swe-agent.md | SWE-agent | Code Agent | Agent-Computer Interface (ACI), Princeton NLP design |
| 🔬 | autogpt.md | AutoGPT | Code Agent | Historical context, Forge platform, Agent Protocol |
| 🏆 | autoresearch.md | autoresearch | End-to-End | 630-line self-referential experiment loop, Karpathy design philosophy |
| 🏆 | deep-research.md | deep-research | Literature | Recursive depth/breadth scaffold, Firecrawl+Exa, TypeScript reference |
| 🌟 | cognitivekernel-pro.md | CognitiveKernel-Pro | Literature | SFT-trained Qwen3-8B, Playwright web engine, Tencent AI Lab |
| 🌟 | datagen.md | DATAGEN | End-to-End | Multi-agent hypothesis gen, data analysis pipeline, state tracking |
| 🔬 | medresearcher-r1.md | MedResearcher-R1 | End-to-End | Medical KG-grounded trajectory synthesis, 32B model, MedBrowseComp SOTA |
| 🏆 | mirothinker.md | MiroThinker | Literature | RL-trained 30B/235B open models, 88.2 BrowseComp, interactive scaling |
| 🔬 | deeper-seeker.md | Deeper-Seeker | Literature | Iterative research, follow-up questions, multi-step synthesis |
| 🔬 | autodidact.md | AutoDidact | Code Agent | GRPO self-bootstrapping, Llama-8B, single-GPU research agent training |
| 🔬 | ii-researcher.md | II-Researcher | Literature | BAML structured LLM functions, 84.12% Frames, async multi-provider search |
| 🏆 | aider.md | Aider | Code Agent | AI pair programming, 60+ LLM models, SWE-Bench ~18%, Git-native commits |
| 🏆 | aris.md | ARIS | End-to-End | Claude Code + MCP overnight agent, cross-model review, Zotero + Obsidian |
| 🏆 | tongyi-deepresearch.md | Tongyi DeepResearch | Literature | RL-trained 30.5B (GRPO), SOTA long-horizon info-seeking, open-weight |
| 🌟 | deep-research-agent.md | DeepResearchAgent | Literature | Hierarchical multi-agent, Autogenesis self-evolution, Skywork AI |
| 🌟 | idea2paper.md | Idea2Paper | Idea Generation | Multi-agent + KG novelty alignment, Semantic Scholar + arXiv pipeline |
| 🌟 | sciagents.md | SciAgents | Idea Generation | Ontology graph + multi-agent reasoning, MIT Buehler lab |
| 🌟 | aide.md | AIDE | Code Agent | Tree-search over ML solution space, iterative code refinement, WecoAI |
## 🧭 How to Choose the Right Tool

Answer the questions below in order — each branch ends at a concrete recommendation.
── START HERE ────────────────────────────────────────────────────────────────
Q1: What is your end goal?
│
├─ (A) Produce a full research paper / manuscript
│ └─ go to Q2
│
├─ (B) Survey a topic, synthesize literature, or generate a research report
│ └─ go to Q5
│
├─ (C) Run, debug, or automate code / ML experiments
│ └─ go to Q8
│
└─ (D) Generate or refine novel research ideas
   └─ go to Q10
───────────────────────────────────────────────────────────────────────────────
A: FULL PAPER / MANUSCRIPT
───────────────────────────────────────────────────────────────────────────────
Q2: What research domain are you in?
│
├─ General ML / Computer Science
│ └─ go to Q3
│
├─ Biomedical / Life Sciences
│ └─ ✅ Biomni (Stanford SNAP; biomedical datalake + know-how library)
│
└─ Other / interdisciplinary
└─ go to Q3 (general systems are still useful starting points)
Q3: How much control / human involvement do you want?
│
├─ Fully autonomous — I want to set it running overnight
│ └─ go to Q4
│
└─ Semi-autonomous — I want to steer hypothesis and review results
└─ ✅ Agent Laboratory (role-based: Professor → PhD Student → Reviewer;
human can intervene at each stage)
Q4: Do you prioritize pipeline maturity or LLM flexibility?
│
├─ Mature pipeline, proven end-to-end results
│ └─ ✅ AI-Scientist v1 / v2 (SakanaAI; produced first peer-reviewed AI paper)
│
└─ Broadest LLM provider support + reproducible Docker environment
└─ ✅ AI-Researcher (HKUDS; LiteLLM + Docker; NeurIPS 2025 Spotlight)
───────────────────────────────────────────────────────────────────────────────
B: LITERATURE SURVEY / RESEARCH REPORT
───────────────────────────────────────────────────────────────────────────────
Q5: Where does your source material come from?
│
├─ The open web (news, blogs, general knowledge)
│ └─ go to Q6
│
├─ My own PDF collection (papers I've already downloaded)
│ └─ ✅ PaperQA2 (iterative full-text RAG; highest accuracy on local PDFs)
│
└─ Academic papers at large scale (no local download needed)
└─ ✅ OpenScholar (45M open-access papers; Contriever dense retrieval;
        Nature 2024; outperforms PaperQA2 on scientific Q&A)
Q6: What output format do you need?
│
├─ A structured, Wikipedia-style article with cited sections
│ └─ ✅ STORM (Stanford OVAL; DSPy pipeline; Co-STORM for collaboration;
NAACL 2024)
│
├─ A concise 5–6 page factual report (PDF / Word / Markdown)
│ └─ ✅ GPT Researcher (parallel web agents + LangGraph; MCP support;
fastest route to a cited report)
│
└─ A report that also includes runnable code or data analysis
└─ go to Q7
Q7: Do you need a production-grade, configurable pipeline?
│
├─ Yes — I'm building this into a product or workflow
│ └─ ✅ Open Deep Research (LangChain; MCP tool plugins; LangSmith
tracing; designed as a reference implementation)
│
└─ No — I need something working quickly out of the box
└─ ✅ DeerFlow (ByteDance; LangGraph + memory + code execution;
research + code in one pipeline)
───────────────────────────────────────────────────────────────────────────────
C: CODE / EXPERIMENT AUTOMATION
───────────────────────────────────────────────────────────────────────────────
Q8: What is your primary metric for choosing?
│
├─ Raw benchmark performance on software engineering tasks
│ └─ ✅ OpenHands (72% on SWE-Bench Verified; best-in-class;
composable Python library + Web UI)
│
├─ Structured, auditable, research-friendly interface
│ └─ ✅ SWE-agent (Princeton NLP; Agent-Computer Interface (ACI);
widely used as research baseline)
│
├─ Daily pair-programming with Git integration (low overhead)
│ └─ ✅ Aider (terminal-native; Git-native commits; supports 60+ models)
│
└─ ML-experiment-specific iteration (Kaggle / benchmark tasks)
└─ go to Q9
Q9: Is your task similar to Kaggle-style ML competitions?
│
├─ Yes
│ └─ ✅ AIDE (WecoAI; tree-search over solution space;
used internally by AI-Scientist-v2)
│
└─ No — I want a pioneering framework to understand the space
└─ ✅ AutoGPT (historically seminal; Forge builder; broad plugin ecosystem)
───────────────────────────────────────────────────────────────────────────────
D: NOVEL IDEA GENERATION
───────────────────────────────────────────────────────────────────────────────
Q10: What kind of grounding do you need for the ideas?
│
├─ Literature-grounded novelty checking (Semantic Scholar + arXiv KG)
│ └─ ✅ Idea2Paper (KG alignment; raw idea → structured proposal)
│
├─ Domain ontology-based scientific reasoning
│ └─ ✅ SciAgents (MIT; multi-agent + ontology graphs)
│
├─ Iterative critique loops against academic concept databases
│ └─ ✅ ResearchAgent (lightweight; good for early-stage idea exploration)
│
└─ Fully autonomous overnight ideation with cross-model review
└─ ✅ ARIS (Claude Code + MCP; runs unattended; Zotero + Obsidian)
── STILL UNSURE? ─────────────────────────────────────────────────────────────
→ Check the Capability Matrix above to compare any two systems side-by-side
→ Read the in-depth reports in reports/ for architecture and limitation details
──────────────────────────────────────────────────────────────────────────────
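If you prefer the guide in executable form, the branches above collapse into a small lookup function. This is only a transcription of a few branches of the tree, not an official API:

```python
def recommend(goal: str, **prefs) -> str:
    """Transcribes selected branches of the decision tree above."""
    if goal == "paper":                       # Branch A
        if prefs.get("domain") == "biomedical":
            return "Biomni"
        if not prefs.get("autonomous", True):
            return "Agent Laboratory"
        return "AI-Researcher" if prefs.get("llm_flexibility") else "AI-Scientist v1/v2"
    if goal == "report":                      # Branch B (partial)
        if prefs.get("sources") == "local_pdfs":
            return "PaperQA2"
        return "GPT Researcher"               # common default for a cited web report
    if goal == "code":                        # Branch C (partial)
        return "OpenHands" if prefs.get("benchmark_performance") else "Aider"
    if goal == "ideas":                       # Branch D (partial)
        return "ARIS" if prefs.get("overnight") else "Idea2Paper"
    raise ValueError(f"unknown goal: {goal}")

print(recommend("paper", domain="biomedical"))       # Biomni
print(recommend("code", benchmark_performance=True)) # OpenHands
```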
## 🤝 Contributing

- Add a project — open an issue or submit a PR to the appropriate section table
- Write an analysis report — see the report template, create `reports/<slug>.md`, and update the Reports table above
- Fix outdated info — broken links, stale star counts, new benchmark scores
- Suggest new sections — open a Discussion
Please read CONTRIBUTING.md before submitting.
## 📈 Star History

Star growth of the leading research-specific tools since their respective launch dates.
AutoGPT (170k+ ⭐) is excluded from the chart to keep the research tools readable — view full comparison including AutoGPT →
Maintained by Peizheng Li · Licensed under MIT
If this repository helped your research, please consider giving it a ⭐