🔬 Awesome Auto Research

🤖 A curated list of open-source projects that automate scientific research — from literature review to idea generation, experiment execution, paper writing, and peer review.

📅 Star counts last verified: 2026-05-17

📑 Table of Contents

🧪 End-to-End Autonomous Research Systems
📚 Deep Research & Literature Synthesis
⚙️ Automated Experiment & Code Agent
🔧 Research Skills & Plugin Collections
📋 Awesome Lists & Surveys
💡 How This Differs from General AI Agent Lists
🤝 Contributing

🧪 End-to-End Autonomous Research Systems

Projects that automate the full research lifecycle: idea → experiment → paper.

Project	Framework / Tools	Supported LLM APIs	Description
autoresearch	Custom (PyTorch, nanochat)	Anthropic Claude, OpenAI Codex	By Andrej Karpathy. 630-line AI agent that reads its own training script, forms hypotheses, modifies code, runs experiments, and evaluates results — hundreds of experiments overnight.
AI-Scientist	Custom (templates, LaTeX pipeline)	OpenAI, Anthropic Claude, DeepSeek, Gemini, OpenRouter, open-weight models	The first comprehensive system for fully automated open-ended scientific discovery. Automates idea generation, coding, experiments, and manuscript writing.
RD-Agent	Custom + LiteLLM, Docker, Streamlit, Qlib	OpenAI (GPT-4o/o1/o3), Azure OpenAI, DeepSeek; any LiteLLM provider	Microsoft. Automates R&D processes — factor/model evolution for quant, Kaggle automation, paper-to-code implementation. Top MLE-bench agent.
AutoResearchClaw	OpenClaw + Docker, LaTeX (NeurIPS/ICML/ICLR), OpenAlex, Semantic Scholar	OpenAI (GPT-4o), OpenRouter, DeepSeek, MiniMax; Claude/Gemini/Kimi via ACP	Fully autonomous research: idea → literature retrieval → sandbox experiments → multi-agent peer review → LaTeX paper output.
ARIS	Claude Code + MCP servers (Codex, llm-chat, Zotero, Obsidian)	Anthropic Claude, OpenAI GPT, GLM-5, MiniMax, Kimi, Qwen, DeepSeek, LongCat; any OpenAI-compatible	Claude Code skills for autonomous ML research: cross-model review loops, idea discovery, experiment automation, and paper writing.
AI-Scientist-v2	Custom (BFTS agentic tree search, AIDE)	OpenAI (o1/o3/GPT-4o), Anthropic (Bedrock), Gemini	Upgraded version using agentic tree search. Generated the first AI-written workshop paper accepted through peer review.
Agent Laboratory	Custom multi-agent (arXiv, HuggingFace, LaTeX)	OpenAI (o1/o3/GPT-4o), DeepSeek	End-to-end autonomous research workflow with specialized agents for literature review, experimentation, and report writing.
AI-Researcher	Custom + LiteLLM, Docker, Gradio	Anthropic, OpenAI, Gemini, DeepSeek, OpenRouter, GitHub AI (via LiteLLM)	NeurIPS 2025 Spotlight. Fully autonomous system covering literature review, hypothesis generation, algorithm implementation, and manuscript preparation.
claude-scholar	Claude Code / Codex CLI / OpenCode, Zotero MCP, Obsidian, LaTeX	Anthropic Claude, OpenAI (via Codex)	Semi-automated academic research assistant covering ideation → coding → experiments → writing → publication.
Biomni	Custom biomedical agent + code execution, datalake, know-how library	Anthropic Claude, OpenAI, Azure OpenAI, Gemini, Groq, AWS Bedrock, custom OpenAI-compatible APIs	Stanford. General-purpose biomedical AI agent that autonomously executes research tasks across biology and medicine, combining LLM reasoning, retrieval, and tool/code use.
EvoScientist	LangChain + DeepAgents, Docker (Python 3.11 + Node.js 24)	Anthropic Claude, OpenAI, Google Gemini, MiniMax, NVIDIA NIM	Self-evolving AI Scientists. Six-agent team with persistent memory autonomously explores and iteratively improves. Built-in messaging channels (Slack/Discord/Telegram/Feishu/WeChat).
DeepScientist	Custom (Bayesian optimization, Findings Memory, Research Map), Git worktrees, LaTeX	OpenAI (Codex CLI), Anthropic Claude, Moonshot Kimi, OpenCode; local backends	Local-first autonomous research studio. Findings Memory + Bayesian optimization orchestrate baseline reproduction → branched experiments → LaTeX paper drafts.
DATAGEN	LangChain + LangGraph, MCP servers, Firecrawl	OpenAI, Anthropic Claude, Gemini, Ollama, Groq	AI-driven multi-agent research assistant automating hypothesis generation, data analysis, visualization, and report writing.
Idea2Paper	AgentAlpha framework (multi-agent), vector DB, knowledge graph (KG)	DeepSeek V3/R1, Claude 3.5, GPT-4o; Search: Semantic Scholar, arXiv API	Research idea exploration engine. Orchestrates multi-agent workflows for deep literature mining and KG alignment, refining raw ideas into novel, structured research proposals.
InternAgent	Custom (Aider for codegen, persistent memory), Conda; Google Search, Semantic Scholar	OpenAI (incl. OpenAI-compatible), Anthropic Claude	Shanghai AI Lab. Unified agentic framework for long-horizon autonomous discovery across physics, biology, earth, and life sciences — reaction yield, molecular dynamics, protein engineering, climate diagnostics.

📚 Deep Research & Literature Synthesis

Projects focused on automated information gathering, literature review, and report generation.

Project	Framework / Tools	Supported LLM APIs	Description
DeerFlow	LangChain + LangGraph, InfoQuest	Any OpenAI-compatible API (GPT-4, Gemini via OpenRouter, etc.)	ByteDance. Open-source SuperAgent harness. Orchestrates sub-agents, memory, and sandboxes for deep research, code generation, and report writing.
STORM	DSPy + LiteLLM, Streamlit	All LiteLLM models (OpenAI, Azure, etc.); Search: You.com, Bing, Google, Brave, Tavily, SearXNG	Stanford. LLM-powered knowledge curation system that generates full-length Wikipedia-like articles with citations. Features Co-STORM.
GPT Researcher	LangGraph, MCP, FastAPI, NextJS	OpenAI, Anthropic Claude, Gemini; any OpenAI-compatible API	Autonomous agent for deep web & local research. Generates 5-6 page factual reports with citations in PDF/Docx/Markdown.
ChatPaper	PyMuPDF, arxiv.py, Flask, Docker	OpenAI (GPT-3.5/4)	Uses ChatGPT to summarize arXiv papers, with professional translation, paper polishing, peer review analysis, and reviewer response generation.
Tongyi DeepResearch	Custom (ReAct, IterResearch, GRPO RL); Serper, Jina, SandboxFusion	OpenAI-compatible, OpenRouter; Tongyi-30B-A3B, Dashscope/Bailian	Alibaba. Agentic LLM (30.5B params, 3.3B activated) for long-horizon deep information-seeking. SOTA on multiple benchmarks.
Open Deep Research	LangChain + LangGraph, MCP, LangSmith	OpenAI (GPT-5/4.1), Anthropic (Sonnet 4), OpenRouter, Ollama (local)	LangChain. Open-source deep research framework with configurable MCP tools and search APIs.
PaperQA2	Custom + LiteLLM, Pydantic, tantivy	OpenAI, Anthropic, Gemini, Ollama, llama.cpp; any LiteLLM provider	High-accuracy RAG for scientific documents. Dynamically retrieves full-text papers and iterates on answers. Published at ICLR.
local-deep-research	LangChain + LangGraph, FastAPI, FAISS, SQLCipher, SearXNG	Ollama, LM Studio, llama.cpp (local); OpenAI, Anthropic, Gemini, OpenRouter	Local-first deep research agent reaching ~95% on SimpleQA with local LLMs. Integrates arXiv/PubMed/Semantic Scholar/Wikipedia and 10+ other sources with encrypted storage.
DeepResearchAgent	Custom (Autogenesis self-evolution), MMEngine configs	OpenRouter (multi-model access)	Skywork. Hierarchical multi-agent system with top-level planning agent coordinating specialized lower-level agents.
Auto-Deep-Research	AutoAgent Framework + LiteLLM, Docker	Anthropic, OpenAI, Gemini, Mistral, Groq, OpenRouter, DeepSeek; any OpenAI-compatible	Open-source, cost-efficient alternative to OpenAI's Deep Research. Universal LLM compatibility, zero-config launch. Strong GAIA Benchmark results.
OpenScholar	Custom RAG (PyTorch, HuggingFace, Contriever)	OpenAI (GPT-4o), Llama 3.1 8B (self-hosted); Semantic Scholar API, You.com	Retrieval-augmented LM searching 45M open-access papers. Published in Nature. Outperforms PaperQA2 and Perplexity Pro.
ChatReviewer	Python, tiktoken, Docker, HuggingFace Spaces	OpenAI (GPT-3.5/4)	Uses ChatGPT to analyze paper strengths/weaknesses, provide improvement suggestions, and auto-generate reviewer responses. Companion to ChatPaper.
OpenResearcher	Megatron-LM (training), vLLM (serving), HuggingFace, Tevatron, BM25 + Qwen3-Embedding, Serper	OpenResearcher-30B-A3B (open-weight release); OpenAI API (scoring)	Fully open training + inference pipeline for long-horizon deep research. Releases 30B-A3B model, surpassing GPT-4.1 and Claude Opus 4 on BrowseComp-Plus.

⚙️ Automated Experiment & Code Agent

Projects that automate coding, experiment execution, and iterative optimization. These serve as the "hands" of auto-research systems.

Project	Framework / Tools	Supported LLM APIs	Description
AutoGPT	Custom (Agent Builder, workflow blocks), Docker	OpenAI, Anthropic, Groq, Llama, AI/ML API (300+ models)	One of the earliest autonomous AI agent frameworks. Includes Forge for agent creation, benchmarking suite, and user-friendly UI.
OpenHands	Custom agentic framework, composable Python lib	Anthropic Claude, OpenAI GPT, MiniMax; any LLM	AI-driven software development platform. Autonomous coding agents that edit files, run commands, and browse the web. 72% on SWE-Bench Verified.
Aider	Custom (AI pair-programming CLI), Git integration	Anthropic Claude, OpenAI, DeepSeek, OpenRouter, Ollama; nearly any LLM	AI pair programming in your terminal. Supports multi-file edits, git integration. Widely used as the coding backbone in research pipelines.
SWE-agent	Custom (YAML-config-driven), purpose-built for research	OpenAI (GPT-4o), Anthropic (Sonnet 4, Claude 3.7); configurable	Princeton. Turns LLMs into software engineering agents that fix real GitHub issues. Pioneered the SWE-Bench benchmark.
PaperBanana	Streamlit, OpenRouter	OpenAI, Anthropic, Gemini (via OpenRouter)	Reference-driven multi-agent framework for automated academic illustration. 5 specialized agents (Retriever, Planner, Stylist, Visualizer, Critic) produce publication-quality diagrams.
MLE-agent	Python, Kaggle integration, arXiv, Papers with Code	OpenAI, Anthropic Claude, Ollama (Llama3), Mistral	Intelligent companion for ML engineering and research. Integrates with arXiv and Papers with Code for better code/research plans. Auto-debugging.
AIDE	Python, Streamlit, Docker	OpenAI (GPT-4-turbo/4o), Anthropic Claude, Gemini, Ollama (local)	AI-Driven Exploration in the Space of Code. LLM agent that writes, evaluates, and improves ML code via agentic tree search. The paper reports 4x more Kaggle medals than the best linear agent. Hosted platform: Weco AI.

🔧 Research Skills & Plugin Collections

Reusable skill sets and plugin ecosystems that integrate with coding agents (Claude Code, Codex, Gemini CLI, etc.) to enable research workflows.

Project	Framework / Tools	Supported LLM APIs	Description
scientific-agent-skills	PyTorch Lightning, scikit-learn, BioPython, RDKit, DeepChem, Scanpy, OpenMM	Agent-agnostic (Claude Code, Cursor, Codex, Gemini CLI)	133 ready-to-use scientific skills across bioinformatics, drug discovery, clinical research, medical imaging, and materials science.
AI-Research-SKILLs	DeepSpeed, vLLM, LangChain, W&B, MLflow, and 80+ frameworks	Agent-agnostic (Claude Code, Codex, Gemini CLI, Qwen Code)	86 skills across 22 categories covering the full AI research lifecycle: literature review, idea generation, experimentation, and paper authoring.
OpenClaw-Medical-Skills	BioPython, GATK, Scanpy, RDKit, DeepChem, OpenMM, AlphaFold, pysam, MDAnalysis	Claude-based agents via OpenClaw / NanoClaw frameworks	869 medical AI skills spanning clinical reports, genomics, drug discovery, bioinformatics, structural biology, and biomedical databases.

📋 Awesome Lists & Surveys

Curated collections and survey papers on the auto-research landscape.

Project	Stars	Description
awesome-autoresearch		Curated index of autonomous improvement loops, research agents, and autoresearch-style systems inspired by Karpathy's autoresearch. 50+ entries.
awesome-ai-for-science		Curated list of AI tools, libraries, papers, datasets, and frameworks for scientific discovery across physics, chemistry, biology, and materials.
Autonomous-Agents		Daily-updated curated collection of research papers on autonomous LLM agents. Covers multi-agent systems, scientific computing, robotics, and more.
Awesome-Deep-Research		Curated collection of deep research agents — industry products, open-source implementations, 70+ recent papers, and benchmarks through early 2026.

💡 How This Differs from General AI Agent Lists

This list focuses specifically on automating the scientific research process — not general-purpose AI agents. We include projects that target one or more stages of the research lifecycle:

📖 Literature Review → 💡 Idea Generation → 🔍 Novelty Check → 📐 Experiment Design →
💻 Code Implementation → 🚀 Experiment Execution → 📊 Result Analysis → ✍️ Paper Writing → 📝 Peer Review

General-purpose coding agents (OpenHands, Aider, SWE-agent) are included because they serve as critical infrastructure for the experiment execution stage.
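The scope rule above can be sketched in Python. This is only an illustration, assuming the nine stages are modeled as an ordered enum; the names `Stage`, `in_scope`, and `format_coverage` are hypothetical and are not part of this repository.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The nine lifecycle stages, numbered in the order this list gives them."""
    LITERATURE_REVIEW = 1
    IDEA_GENERATION = 2
    NOVELTY_CHECK = 3
    EXPERIMENT_DESIGN = 4
    CODE_IMPLEMENTATION = 5
    EXPERIMENT_EXECUTION = 6
    RESULT_ANALYSIS = 7
    PAPER_WRITING = 8
    PEER_REVIEW = 9


def in_scope(stages: set[Stage]) -> bool:
    """A project is in scope when it targets one or more lifecycle stages."""
    return len(stages) >= 1


def format_coverage(stages: set[Stage]) -> str:
    """Render the covered stages in lifecycle order, joined with arrows."""
    ordered = sorted(stages)  # IntEnum members sort by their numeric value
    return " → ".join(s.name.replace("_", " ").title() for s in ordered)


# General-purpose coding agents qualify through the experiment execution stage.
coding_agent = {Stage.EXPERIMENT_EXECUTION}
print(in_scope(coding_agent), format_coverage(coding_agent))
```

A project with no research-lifecycle stage (an empty set here) falls outside the list, which is the distinction from general AI agent lists.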

🤝 Contributing

PRs welcome! Please ensure the project:

Has 500+ GitHub stars (or is exceptionally notable with a top-venue publication)
Is directly related to automating scientific research
Is open-source with an active repository

Please keep entries sorted by star count (descending) within each category.
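Before opening a PR, the star threshold and the sort order can be checked with a short script. The sketch below is a hedged illustration: the `Entry` record, the helper names, and the sample star counts are hypothetical placeholders, not a format or tool this repository provides. The other two criteria (relevance and an active open-source repository) still need a manual check.

```python
from dataclasses import dataclass

MIN_STARS = 500  # threshold stated in the contributing rules


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one table row."""
    name: str
    stars: int
    top_venue: bool = False  # stands in for the top-venue publication exception


def is_eligible(entry: Entry) -> bool:
    """Apply the star threshold, allowing the top-venue exception."""
    return entry.stars >= MIN_STARS or entry.top_venue


def is_sorted_by_stars(entries: list[Entry]) -> bool:
    """True when star counts never increase from one row to the next."""
    return all(a.stars >= b.stars for a, b in zip(entries, entries[1:]))


def insert_sorted(entries: list[Entry], new: Entry) -> list[Entry]:
    """Return a new list with `new` placed so descending order is preserved."""
    if not is_eligible(new):
        raise ValueError(f"{new.name} does not meet the inclusion criteria")
    # sorted() is stable, so rows with equal star counts keep their relative order.
    return sorted(entries + [new], key=lambda e: e.stars, reverse=True)


# Placeholder rows; the star counts are made up for illustration.
rows = [Entry("project-a", 3000), Entry("project-b", 800)]
rows = insert_sorted(rows, Entry("project-c", 1200))
print([e.name for e in rows], is_sorted_by_stars(rows))
```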

📄 License

CC0 1.0 Universal
