
handsome-rich/Awesome-Auto-Research-Tools


🔬 Awesome Auto Research


🤖 A curated list of open-source projects that automate scientific research — from literature review to idea generation, experiment execution, paper writing, and peer review.

📅 Star counts last verified: 2026-05-17




🧪 End-to-End Autonomous Research Systems

Projects that automate the full research lifecycle: idea → experiment → paper.

| Project | Stars | Framework / Tools | Supported LLM APIs | Description |
|---|---|---|---|---|
| autoresearch | | Custom (PyTorch, nanochat) | Anthropic Claude, OpenAI Codex | By Andrej Karpathy. 630-line AI agent that reads its own training script, forms hypotheses, modifies code, runs experiments, and evaluates results, running hundreds of experiments overnight. |
| AI-Scientist | | Custom (templates, LaTeX pipeline) | OpenAI, Anthropic Claude, DeepSeek, Gemini, OpenRouter, open-weight models | The first comprehensive system for fully automated open-ended scientific discovery. Automates idea generation, coding, experiments, and manuscript writing. |
| RD-Agent | | Custom + LiteLLM, Docker, Streamlit, Qlib | OpenAI (GPT-4o/o1/o3), Azure OpenAI, DeepSeek; any LiteLLM provider | Microsoft. Automates R&D processes: factor/model evolution for quant, Kaggle automation, paper-to-code implementation. Top MLE-bench agent. |
| AutoResearchClaw | | OpenClaw + Docker, LaTeX (NeurIPS/ICML/ICLR), OpenAlex, Semantic Scholar | OpenAI (GPT-4o), OpenRouter, DeepSeek, MiniMax; Claude/Gemini/Kimi via ACP | Fully autonomous research: idea → literature retrieval → sandbox experiments → multi-agent peer review → LaTeX paper output. |
| ARIS | | Claude Code + MCP servers (Codex, llm-chat, Zotero, Obsidian) | Anthropic Claude, OpenAI GPT, GLM-5, MiniMax, Kimi, Qwen, DeepSeek, LongCat; any OpenAI-compatible | Claude Code skills for autonomous ML research: cross-model review loops, idea discovery, experiment automation, and paper writing. |
| AI-Scientist-v2 | | Custom (BFTS agentic tree search, AIDE) | OpenAI (o1/o3/GPT-4o), Anthropic (Bedrock), Gemini | Upgraded version using agentic tree search. Generated the first AI-written workshop paper accepted through peer review. |
| Agent Laboratory | | Custom multi-agent (arXiv, HuggingFace, LaTeX) | OpenAI (o1/o3/GPT-4o), DeepSeek | End-to-end autonomous research workflow with specialized agents for literature review, experimentation, and report writing. |
| AI-Researcher | | Custom + LiteLLM, Docker, Gradio | Anthropic, OpenAI, Gemini, DeepSeek, OpenRouter, GitHub AI (via LiteLLM) | NeurIPS 2025 Spotlight. Fully autonomous system covering literature review, hypothesis generation, algorithm implementation, and manuscript preparation. |
| claude-scholar | | Claude Code / Codex CLI / OpenCode, Zotero MCP, Obsidian, LaTeX | Anthropic Claude, OpenAI (via Codex) | Semi-automated academic research assistant covering ideation → coding → experiments → writing → publication. |
| Biomni | | Custom biomedical agent + code execution, datalake, know-how library | Anthropic Claude, OpenAI, Azure OpenAI, Gemini, Groq, AWS Bedrock, custom OpenAI-compatible APIs | Stanford. General-purpose biomedical AI agent that autonomously executes research tasks across biology and medicine, combining LLM reasoning, retrieval, and tool/code use. |
| EvoScientist | | LangChain + DeepAgents, Docker (Python 3.11 + Node.js 24) | Anthropic Claude, OpenAI, Google Gemini, MiniMax, NVIDIA NIM | Self-evolving AI Scientists. Six-agent team with persistent memory autonomously explores and iteratively improves. Built-in messaging channels (Slack/Discord/Telegram/Feishu/WeChat). |
| DeepScientist | | Custom (Bayesian optimization, Findings Memory, Research Map), Git worktrees, LaTeX | OpenAI (Codex CLI), Anthropic Claude, Moonshot Kimi, OpenCode; local backends | Local-first autonomous research studio. Findings Memory + Bayesian optimization orchestrate baseline reproduction → branched experiments → LaTeX paper drafts. |
| DATAGEN | | LangChain + LangGraph, MCP servers, Firecrawl | OpenAI, Anthropic Claude, Gemini, Ollama, Groq | AI-driven multi-agent research assistant automating hypothesis generation, data analysis, visualization, and report writing. |
| Idea2Paper | | AgentAlpha Framework (Multi-Agent), Vector DB, Knowledge Graph (KG) | DeepSeek V3/R1, Claude 3.5, GPT-4o; Semantic Scholar, ArXiv API | Advanced research idea exploration engine. Orchestrates multi-agent workflows for deep literature mining and KG alignment; refines raw ideas into novel, structured research proposals. |
| InternAgent | | Custom (Aider for codegen, persistent memory), Conda; Google Search, Semantic Scholar | OpenAI (incl. OpenAI-compatible), Anthropic Claude | Shanghai AI Lab. Unified agentic framework for long-horizon autonomous discovery across physics, biology, earth, and life sciences, including reaction yield, molecular dynamics, protein engineering, and climate diagnostics. |

📚 Deep Research & Literature Synthesis

Projects focused on automated information gathering, literature review, and report generation.

| Project | Stars | Framework / Tools | Supported LLM APIs | Description |
|---|---|---|---|---|
| DeerFlow | | LangChain + LangGraph, InfoQuest | Any OpenAI-compatible API (GPT-4, Gemini via OpenRouter, etc.) | ByteDance. Open-source SuperAgent harness. Orchestrates sub-agents, memory, and sandboxes for deep research, code generation, and report writing. |
| STORM | | DSPy + LiteLLM, Streamlit | All LiteLLM models (OpenAI, Azure, etc.); Search: You.com, Bing, Google, Brave, Tavily, SearXNG | Stanford. LLM-powered knowledge curation system that generates full-length Wikipedia-like articles with citations. Features Co-STORM. |
| GPT Researcher | | LangGraph, MCP, FastAPI, NextJS | OpenAI, Anthropic Claude, Gemini; any OpenAI-compatible API | Autonomous agent for deep web and local research. Generates 5-6 page factual reports with citations in PDF/Docx/Markdown. |
| ChatPaper | | PyMuPDF, arxiv.py, Flask, Docker | OpenAI (GPT-3.5/4) | Uses ChatGPT to summarize arXiv papers and provide professional translation, paper polishing, peer review analysis, and reviewer response generation. |
| Tongyi DeepResearch | | Custom (ReAct, IterResearch, GRPO RL); Serper, Jina, SandboxFusion | OpenAI-compatible, OpenRouter; Tongyi-30B-A3B, Dashscope/Bailian | Alibaba. Agentic LLM (30.5B params, 3.3B activated) for long-horizon deep information-seeking. SOTA on multiple benchmarks. |
| Open Deep Research | | LangChain + LangGraph, MCP, LangSmith | OpenAI (GPT-5/4.1), Anthropic (Sonnet 4), OpenRouter, Ollama (local) | LangChain. Open-source deep research framework with configurable MCP tools and search APIs. |
| PaperQA2 | | Custom + LiteLLM, Pydantic, tantivy | OpenAI, Anthropic, Gemini, Ollama, llama.cpp; any LiteLLM provider | High-accuracy RAG for scientific documents. Dynamically retrieves full-text papers and iterates on answers. Published at ICLR. |
| local-deep-research | | LangChain + LangGraph, FastAPI, FAISS, SQLCipher, SearXNG | Ollama, LM Studio, llama.cpp (local); OpenAI, Anthropic, Gemini, OpenRouter | Local-first deep research agent reaching ~95% on SimpleQA with local LLMs. Integrates arXiv/PubMed/Semantic Scholar/Wikipedia and 10+ other sources with encrypted storage. |
| DeepResearchAgent | | Custom (Autogenesis self-evolution), MMEngine configs | OpenRouter (multi-model access) | Skywork. Hierarchical multi-agent system in which a top-level planning agent coordinates specialized lower-level agents. |
| Auto-Deep-Research | | AutoAgent Framework + LiteLLM, Docker | Anthropic, OpenAI, Gemini, Mistral, Groq, OpenRouter, DeepSeek; any OpenAI-compatible | Open-source, cost-efficient alternative to OpenAI's Deep Research. Universal LLM compatibility, zero-config launch. Strong GAIA Benchmark results. |
| OpenScholar | | Custom RAG (PyTorch, HuggingFace, Contriever) | OpenAI (GPT-4o), Llama 3.1 8B (self-hosted); Semantic Scholar API, You.com | Retrieval-augmented LM searching 45M open-access papers. Published in Nature. Outperforms PaperQA2 and Perplexity Pro. |
| ChatReviewer | | Python, tiktoken, Docker, HuggingFace Spaces | OpenAI (GPT-3.5/4) | Uses ChatGPT to analyze paper strengths/weaknesses, provide improvement suggestions, and auto-generate reviewer responses. Companion to ChatPaper. |
| OpenResearcher | | Megatron-LM (training), vLLM (serving), HuggingFace, Tevatron, BM25 + Qwen3-Embedding, Serper | OpenResearcher-30B-A3B (open-weight release); OpenAI API (scoring) | Fully open training + inference pipeline for long-horizon deep research. Releases a 30B-A3B model that surpasses GPT-4.1 and Claude Opus 4 on BrowseComp-Plus. |

⚙️ Automated Experiment & Code Agent

Projects that automate coding, experiment execution, and iterative optimization. These serve as the "hands" of auto-research systems.

| Project | Stars | Framework / Tools | Supported LLM APIs | Description |
|---|---|---|---|---|
| AutoGPT | | Custom (Agent Builder, workflow blocks), Docker | OpenAI, Anthropic, Groq, Llama, AI/ML API (300+ models) | One of the earliest autonomous AI agent frameworks. Includes Forge for agent creation, benchmarking suite, and user-friendly UI. |
| OpenHands | | Custom agentic framework, composable Python lib | Anthropic Claude, OpenAI GPT, MiniMax; any LLM | AI-driven software development platform. Autonomous coding agents that edit files, run commands, and browse the web. 72% on SWE-Bench Verified. |
| Aider | | Custom (AI pair-programming CLI), Git integration | Anthropic Claude, OpenAI, DeepSeek, OpenRouter, Ollama; nearly any LLM | AI pair programming in your terminal. Supports multi-file edits and git integration. Widely used as the coding backbone in research pipelines. |
| SWE-agent | | Custom (YAML-config-driven), purpose-built for research | OpenAI (GPT-4o), Anthropic (Sonnet 4, Claude 3.7); configurable | Princeton. Turns LLMs into software engineering agents that fix real GitHub issues. Pioneered the SWE-Bench benchmark. |
| PaperBanana | | Streamlit, OpenRouter | OpenAI, Anthropic, Gemini (via OpenRouter) | Reference-driven multi-agent framework for automated academic illustration. 5 specialized agents (Retriever, Planner, Stylist, Visualizer, Critic) produce publication-quality diagrams. |
| MLE-agent | | Python, Kaggle integration, arXiv, Papers with Code | OpenAI, Anthropic Claude, Ollama (Llama3), Mistral | Intelligent companion for ML engineering and research. Integrates with arXiv and Papers with Code for better code/research plans. Auto-debugging. |
| AIDE | | Python, Streamlit, Docker | OpenAI (GPT-4-turbo/4o), Anthropic Claude, Gemini, Ollama (local) | AI-Driven Exploration in the Space of Code. LLM agent that writes, evaluates, and improves ML code via agentic tree search. [paper] 4x more Kaggle medals than best linear agent. Hosted platform: Weco AI. |

🔧 Research Skills & Plugin Collections

Reusable skill sets and plugin ecosystems that integrate with coding agents (Claude Code, Codex, Gemini CLI, etc.) to enable research workflows.

| Project | Stars | Framework / Tools | Supported LLM APIs | Description |
|---|---|---|---|---|
| scientific-agent-skills | | PyTorch Lightning, scikit-learn, BioPython, RDKit, DeepChem, Scanpy, OpenMM | Agent-agnostic (Claude Code, Cursor, Codex, Gemini CLI) | 133 ready-to-use scientific skills across bioinformatics, drug discovery, clinical research, medical imaging, and materials science. |
| AI-Research-SKILLs | | DeepSpeed, vLLM, LangChain, W&B, MLflow, and 80+ frameworks | Agent-agnostic (Claude Code, Codex, Gemini CLI, Qwen Code) | 86 skills across 22 categories covering the full AI research lifecycle: literature review, idea generation, experimentation, and paper authoring. |
| OpenClaw-Medical-Skills | | BioPython, GATK, Scanpy, RDKit, DeepChem, OpenMM, AlphaFold, pysam, MDAnalysis | Claude-based agents via OpenClaw / NanoClaw frameworks | 869 medical AI skills spanning clinical reports, genomics, drug discovery, bioinformatics, structural biology, and biomedical databases. |

📋 Awesome Lists & Surveys

Curated collections and survey papers on the auto-research landscape.

| Project | Stars | Description |
|---|---|---|
| awesome-autoresearch | | Curated index of autonomous improvement loops, research agents, and autoresearch-style systems inspired by Karpathy's autoresearch. 50+ entries. |
| awesome-ai-for-science | | Curated list of AI tools, libraries, papers, datasets, and frameworks for scientific discovery across physics, chemistry, biology, and materials. |
| Autonomous-Agents | | Daily-updated curated collection of research papers on autonomous LLM agents. Covers multi-agent systems, scientific computing, robotics, and more. |
| Awesome-Deep-Research | | Curated collection of deep research agents: industry products, open-source implementations, 70+ recent papers, and benchmarks through early 2026. |

💡 How This Differs from General AI Agent Lists

This list focuses specifically on automating the scientific research process — not general-purpose AI agents. We include projects that target one or more stages of the research lifecycle:

📖 Literature Review → 💡 Idea Generation → 🔍 Novelty Check → 📐 Experiment Design →
💻 Code Implementation → 🚀 Experiment Execution → 📊 Result Analysis → ✍️ Paper Writing → 📝 Peer Review

General-purpose coding agents (OpenHands, Aider, SWE-agent) are included because they serve as critical infrastructure for the experiment execution stage.
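
As a rough illustration of the scope rule above, the lifecycle can be modeled as an ordered enumeration, with a project counting as in scope when it targets at least one stage. This is only a sketch: the names `Stage` and `in_scope` are hypothetical and are not part of this repository.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Research lifecycle stages, in the order listed above."""
    LITERATURE_REVIEW = 1
    IDEA_GENERATION = 2
    NOVELTY_CHECK = 3
    EXPERIMENT_DESIGN = 4
    CODE_IMPLEMENTATION = 5
    EXPERIMENT_EXECUTION = 6
    RESULT_ANALYSIS = 7
    PAPER_WRITING = 8
    PEER_REVIEW = 9


def in_scope(stages):
    """Return True if the project targets one or more lifecycle stages."""
    return any(isinstance(s, Stage) for s in stages)


# A general-purpose coding agent qualifies through the execution stage alone.
coding_agent_stages = {Stage.EXPERIMENT_EXECUTION}
print(in_scope(coding_agent_stages))
print(in_scope(set()))
```

The first call reflects why tools such as OpenHands, Aider, and SWE-agent appear in the list; the second shows that a project tagged with no stage would fall outside it.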


🤝 Contributing

PRs welcome! Please ensure the project:

  • Has 500+ GitHub stars (or is exceptionally notable with a top-venue publication)
  • Is directly related to automating scientific research
  • Is open-source with an active repository

Please keep entries sorted by star count (descending) within each category.
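
The criteria and ordering rule above could be checked before opening a PR with something like the following sketch. Everything here is hypothetical: the `Entry` fields, the helper names, and the sample data are illustrative assumptions, and "exceptionally notable" is reduced to a single `top_venue_paper` flag, which simplifies what is ultimately a maintainer judgment.

```python
from dataclasses import dataclass

MIN_STARS = 500  # threshold stated in the contribution guidelines


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one candidate project."""
    name: str
    stars: int
    automates_research: bool
    open_source_and_active: bool
    top_venue_paper: bool = False


def is_eligible(entry):
    """Apply the three listed criteria to one candidate."""
    notable = entry.stars >= MIN_STARS or entry.top_venue_paper
    return notable and entry.automates_research and entry.open_source_and_active


def sort_category(entries):
    """Order entries by star count, descending, as requested for each category."""
    return sorted(entries, key=lambda e: e.stars, reverse=True)


# Made-up candidates, not real star counts.
candidates = [
    Entry("small-but-published", 300, True, True, top_venue_paper=True),
    Entry("popular-tool", 1200, True, True),
    Entry("too-small", 300, True, True),
    Entry("off-topic", 9000, False, True),
]
accepted = sort_category([c for c in candidates if is_eligible(c)])
print([c.name for c in accepted])
```

With this sample data only the first two candidates pass, and the one with more stars is listed first.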




📄 License

CC0 1.0 Universal
