aiming-lab/AutoResearchClaw: Fully autonomous research from idea to paper. 🦞

Chat an Idea. Get a Paper. Fully Autonomous.

Just chat with OpenClaw: "Research X" → done.


⚡ One Command. One Paper.

pip install -e . && researchclaw run --topic "Your research idea here" --auto-approve

🤔 What Is This?

You think it. AutoResearchClaw writes it.

Drop a research topic — get back a full academic paper with real literature from arXiv & Semantic Scholar, hardware-aware sandbox experiments (GPU/MPS/CPU auto-detected), statistical analysis, multi-agent peer review, and conference-ready LaTeX targeting NeurIPS/ICML/ICLR. No babysitting. No copy-pasting. No hallucinated references.

📄	`paper_draft.md`	Full academic paper (Introduction, Related Work, Method, Experiments, Results, Conclusion)
📐	`paper.tex`	Conference-ready LaTeX (NeurIPS / ICLR / ICML templates)
📚	`references.bib`	Real BibTeX references from Semantic Scholar and arXiv — auto-pruned to match inline citations
🔍	`verification_report.json`	4-layer citation integrity + relevance verification (arXiv, CrossRef, DataCite, LLM)
🧪	`experiment runs/`	Generated code + sandbox results + structured JSON metrics
📊	`charts/`	Auto-generated condition comparison charts with error bars and confidence intervals
📝	`reviews.md`	Multi-agent peer review with methodology-evidence consistency checks
🧬	`evolution/`	Self-learning lessons extracted from each run
📦	`deliverables/`	All final outputs in one folder — compile-ready for Overleaf

The pipeline runs end-to-end without human intervention. When experiments fail, it self-heals. When hypotheses don't hold, it pivots. When citations are fake, it kills them.
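To make "auto-pruned to match inline citations" concrete, here is a minimal sketch of one way such pruning could work. It is not the project's implementation: it assumes citations appear as `\cite{key}` (or variants such as `\citep{a, b}`) without optional arguments, that every BibTeX entry starts at the beginning of a line, and the names `cited_keys` and `prune_bib` are hypothetical.

```python
import re

# Matches \cite{...}, \citep{...}, \citet{...} without optional arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\{([^}]*)\}")
# Matches the head of a BibTeX entry, e.g. "@article{key,".
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex):
    """Collect every citation key used in cite-style commands."""
    keys = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def prune_bib(bib, tex):
    """Keep only the BibTeX entries whose key is cited in tex."""
    keep = cited_keys(tex)
    # Split at each '@' that begins a line; entry bodies stay intact.
    entries = [e for e in re.split(r"(?m)^(?=@)", bib) if e.strip()]
    kept = []
    for entry in entries:
        match = ENTRY_RE.match(entry)
        if match and match.group(2) in keep:
            kept.append(entry.strip())
    return "\n\n".join(kept) + ("\n" if kept else "")
```

Calling `prune_bib(open("references.bib").read(), open("paper.tex").read())` would return the reduced bibliography as a string; entries that fail to parse are dropped rather than kept, which is a design choice of this sketch.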

🚀 Quick Start

# 1. Clone & install
git clone https://github.com/aiming-lab/AutoResearchClaw.git
cd AutoResearchClaw
python3 -m venv .venv && source .venv/bin/activate
pip install -e .

# 2. Configure
cp config.researchclaw.example.yaml config.arc.yaml
# Edit config.arc.yaml — set your LLM API endpoint and key

# 3. Run
export OPENAI_API_KEY="sk-..."
researchclaw run --config config.arc.yaml --topic "Your research idea" --auto-approve

Output → artifacts/rc-YYYYMMDD-HHMMSS-<hash>/deliverables/ — compile-ready LaTeX, BibTeX, experiment code, charts.
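Because the run folder name embeds a `YYYYMMDD-HHMMSS` timestamp, the newest run can be found by sorting folder names as plain text. The helper below is a small sketch under that assumption; `latest_deliverables` is a hypothetical name and not part of the `researchclaw` package.

```python
from pathlib import Path
from typing import Optional


def latest_deliverables(artifacts_root: str = "artifacts") -> Optional[Path]:
    """Return the deliverables/ folder of the most recent run, or None."""
    root = Path(artifacts_root)
    if not root.is_dir():
        return None
    # rc-YYYYMMDD-HHMMSS-<hash> sorts chronologically as plain text.
    runs = sorted(p for p in root.glob("rc-*") if (p / "deliverables").is_dir())
    return runs[-1] / "deliverables" if runs else None
```

Runs that have not produced a `deliverables/` folder yet are skipped, so a run still in progress does not shadow the last finished one.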

📝 Minimum required config

project:
  name: "my-research"

research:
  topic: "Your research topic here"

llm:
  base_url: "https://api.openai.com/v1"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]

experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
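
Before launching a run, it can help to check that a config contains the fields shown in the minimal example. The sketch below works on an already-parsed config dictionary (for instance the result of loading the YAML file with a YAML parser). The key list is copied from the example above; whether the tool enforces exactly this set is an assumption, and both function names are hypothetical.

```python
import os

# Dotted paths taken from the minimal example; whether the tool itself
# enforces exactly this set is an assumption.
REQUIRED_KEYS = [
    "project.name",
    "research.topic",
    "llm.base_url",
    "llm.api_key_env",
    "llm.primary_model",
    "experiment.mode",
]


def missing_keys(cfg, required=REQUIRED_KEYS):
    """Return the dotted paths that are absent or empty in cfg."""
    missing = []
    for path in required:
        node = cfg
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if node in (None, ""):
            missing.append(path)
    return missing


def api_key_is_set(cfg, env=os.environ):
    """True if the environment variable named by llm.api_key_env is non-empty."""
    name = cfg.get("llm", {}).get("api_key_env", "")
    return bool(name) and bool(env.get(name))
```

An empty list from `missing_keys(cfg)` together with `api_key_is_set(cfg)` returning `True` suggests the config is at least structurally complete.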

🧠 What Makes It Different

Capability	How It Works
🔄 PIVOT / REFINE Loop	Stage 15 autonomously decides: PROCEED, REFINE (tweak params), or PIVOT (new direction). Artifacts auto-versioned.
🤖 Multi-Agent Debate	Hypothesis generation, result analysis, and peer review each use structured multi-perspective debate.
🧬 Self-Learning	Lessons extracted per run (decision rationale, runtime warnings, metric anomalies) with 30-day time-decay. Future runs learn from past mistakes.
📚 Knowledge Base	Every run builds structured KB across 6 categories (decisions, experiments, findings, literature, questions, reviews).
🛡️ Sentinel Watchdog	Background quality monitor: NaN/Inf detection, paper-evidence consistency, citation relevance scoring, anti-fabrication guard.
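
The table does not say how the 30-day time-decay is computed. As a rough illustration only, the sketch below reads "30-day" as a half-life and weights lessons with exponential decay, so a lesson counts half as much after 30 days. The function names and the `(text, created_at)` lesson shape are hypothetical.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # assumption: "30-day" is read as a half-life


def lesson_weight(created_at, now=None, half_life_days=HALF_LIFE_DAYS):
    """Exponential decay: weight 1.0 when fresh, 0.5 after one half-life."""
    now = now or datetime.now(timezone.utc)
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank_lessons(lessons, now=None):
    """Sort (text, created_at) pairs so the freshest lessons come first."""
    return sorted(lessons, key=lambda item: lesson_weight(item[1], now), reverse=True)
```

A hard cutoff (dropping lessons older than 30 days) would be an equally plausible reading; the decay form was chosen here because it degrades gradually.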

🦞 OpenClaw Integration

AutoResearchClaw is an OpenClaw-compatible service. Install it in OpenClaw and launch autonomous research with a single message — or use it standalone via CLI, Claude Code, or any AI coding assistant.

🚀 Use with OpenClaw (Recommended)

If you already use OpenClaw as your AI assistant:

1️⃣  Share the GitHub repo URL with OpenClaw
2️⃣  OpenClaw auto-reads RESEARCHCLAW_AGENTS.md → understands the pipeline
3️⃣  Say: "Research [your topic]"
4️⃣  Done — OpenClaw clones, installs, configures, runs, and returns results

That's it. OpenClaw handles git clone, pip install, config setup, and pipeline execution automatically. You just chat.

💡 What happens under the hood

OpenClaw reads RESEARCHCLAW_AGENTS.md → learns the research orchestrator role
OpenClaw reads README.md → understands installation and pipeline structure
OpenClaw copies config.researchclaw.example.yaml → config.yaml
Asks for your LLM API key (or uses your environment variable)
Runs pip install -e . + researchclaw run --topic "..." --auto-approve
Returns the paper, LaTeX, experiments, and citations

🔌 OpenClaw Bridge (Advanced)

For deeper integration, AutoResearchClaw includes a bridge adapter system with 6 optional capabilities:

# config.arc.yaml
openclaw_bridge:
  use_cron: true              # ⏰ Scheduled research runs
  use_message: true           # 💬 Progress notifications (Discord/Slack/Telegram)
  use_memory: true            # 🧠 Cross-session knowledge persistence
  use_sessions_spawn: true    # 🔀 Spawn parallel sub-sessions for concurrent stages
  use_web_fetch: true         # 🌐 Live web search during literature review
  use_browser: false          # 🖥️ Browser-based paper collection

Each flag activates a typed adapter protocol. When OpenClaw provides these capabilities, the adapters consume them without code changes. See docs/integration-guide.md for full details.
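
To illustrate what a "typed adapter protocol" might look like, here is a sketch for the `use_message` flag using Python's `typing.Protocol`. The class and function names are hypothetical and are not taken from the AutoResearchClaw codebase; see docs/integration-guide.md for the real interfaces.

```python
from typing import Dict, Optional, Protocol


class MessageAdapter(Protocol):
    """Hypothetical interface for the `use_message` capability."""

    def notify(self, text: str) -> None: ...


class ConsoleMessageAdapter:
    """Fallback that satisfies MessageAdapter by printing locally."""

    def notify(self, text: str) -> None:
        print(f"[researchclaw] {text}")


def select_message_adapter(
    bridge_cfg: Dict[str, bool],
    host_adapter: Optional[MessageAdapter] = None,
) -> MessageAdapter:
    """Use the host-provided adapter only when the flag is on and one exists."""
    if bridge_cfg.get("use_message") and host_adapter is not None:
        return host_adapter
    return ConsoleMessageAdapter()
```

Because `Protocol` relies on structural typing, any object with a matching `notify` method can be passed in as the host adapter without inheriting from a base class, which fits the "without code changes" idea described above.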

🛠️ Other Ways to Run

Method	How
Standalone CLI	`researchclaw run --topic "..." --auto-approve`
Python API	`from researchclaw.pipeline import Runner; Runner(config).run()`
Claude Code	Reads `RESEARCHCLAW_CLAUDE.md` — just say "Run research on [topic]"
OpenCode	Reads `.claude/skills/` — same natural language interface
Any AI CLI	Provide `RESEARCHCLAW_AGENTS.md` as context → agent auto-bootstraps

🔬 Pipeline: 23 Stages, 8 Phases

Phase A: Research Scoping          Phase E: Experiment Execution
  1. TOPIC_INIT                      12. EXPERIMENT_RUN
  2. PROBLEM_DECOMPOSE               13. ITERATIVE_REFINE  ← self-healing

Phase B: Literature Discovery      Phase F: Analysis & Decision
  3. SEARCH_STRATEGY                 14. RESULT_ANALYSIS    ← multi-agent
  4. LITERATURE_COLLECT  ← real API  15. RESEARCH_DECISION  ← PIVOT/REFINE
  5. LITERATURE_SCREEN   [gate]
  6. KNOWLEDGE_EXTRACT               Phase G: Paper Writing
                                     16. PAPER_OUTLINE
Phase C: Knowledge Synthesis         17. PAPER_DRAFT
  7. SYNTHESIS                       18. PEER_REVIEW        ← evidence check
  8. HYPOTHESIS_GEN    ← debate      19. PAPER_REVISION

Phase D: Experiment Design         Phase H: Finalization
  9. EXPERIMENT_DESIGN   [gate]      20. QUALITY_GATE      [gate]
 10. CODE_GENERATION                 21. KNOWLEDGE_ARCHIVE
 11. RESOURCE_PLANNING               22. EXPORT_PUBLISH     ← LaTeX
                                     23. CITATION_VERIFY    ← relevance check

Gate stages (5, 9, 20) pause for human approval or auto-approve with --auto-approve. On rejection, the pipeline rolls back.

Decision loops: Stage 15 can trigger REFINE (→ Stage 13) or PIVOT (→ Stage 8), with automatic artifact versioning.
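
The gate and decision-loop rules above can be sketched as a tiny transition function. The stage numbers and jump targets come from the text; the function names are hypothetical, and rollback on gate rejection is left out because the text does not say which stage the pipeline rolls back to.

```python
GATE_STAGES = {5, 9, 20}  # stages that wait for approval
LAST_STAGE = 23
DECISION_STAGE = 15
JUMP_TARGETS = {"REFINE": 13, "PIVOT": 8}  # where Stage 15 can send the run


def next_stage(current, decision=None):
    """Return the stage to run after `current`, or None when finished."""
    if current == DECISION_STAGE and decision in JUMP_TARGETS:
        return JUMP_TARGETS[decision]
    return current + 1 if current < LAST_STAGE else None


def needs_approval(stage, auto_approve=False):
    """Gate stages pause unless --auto-approve was given."""
    return stage in GATE_STAGES and not auto_approve
```

With this model, `next_stage(15, "PIVOT")` sends the run back to hypothesis generation (Stage 8), while any other decision at Stage 15 simply continues to Stage 16.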

📋 What Each Phase Does

Phase	What Happens
A: Scoping	LLM decomposes the topic into a structured problem tree with research questions
A+: Hardware	Auto-detects GPU (NVIDIA CUDA / Apple MPS / CPU-only), warns if local hardware is limited, adapts code generation accordingly
B: Literature	Multi-source search (arXiv-first, then Semantic Scholar) for real papers, screens by relevance, extracts knowledge cards
C: Synthesis	Clusters findings, identifies research gaps, generates testable hypotheses via multi-agent debate
D: Design	Designs experiment plan, generates hardware-aware runnable Python (GPU tier → package selection), estimates resource needs
E: Execution	Runs experiments in sandbox, detects NaN/Inf and runtime bugs, self-heals code via targeted LLM repair
F: Analysis	Multi-agent analysis of results; autonomous PROCEED / REFINE / PIVOT decision with rationale
G: Writing	Outlines → section-by-section drafting (5,000-6,500 words) → peer reviews (with methodology-evidence consistency) → revises with length guard
H: Finalization	Quality gate, knowledge archival, LaTeX export with conference template, citation integrity + relevance verification
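
The hardware step (A+) can be approximated with a few lines of PyTorch. This is a sketch that assumes PyTorch is the detection mechanism, which the table does not confirm; `detect_device` is a hypothetical name.

```python
def detect_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch  # optional dependency; absence is treated as CPU-only
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The returned string could then drive choices such as experiment scale or which packages the generated code imports.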

✨ Key Features

Feature	Description
📚 Multi-Source Literature	Real papers from arXiv (primary) + Semantic Scholar — query expansion, deduplication, circuit breaker with graceful degradation
🔍 4-Layer Citation Verification	arXiv ID check → CrossRef/DataCite DOI → Semantic Scholar title match → LLM relevance scoring. Hallucinated refs auto-removed.
🖥️ Hardware-Aware Execution	Auto-detects GPU (NVIDIA CUDA / Apple MPS / CPU-only) and adapts code generation, imports, and experiment scale accordingly
🧪 Sandbox Experiments	AST-validated code, immutable harness, NaN/Inf fast-fail, self-healing repair, iterative refinement (up to 10 rounds), partial result capture
📝 Conference-Grade Writing	NeurIPS/ICML/ICLR templates, section-by-section drafting (5,000-6,500 words), anti-fabrication guard, revision length guard, anti-disclaimer enforcement
📐 Template Switching	`neurips_2025`, `iclr_2026`, `icml_2026` — Markdown → LaTeX with math, tables, figures, cross-refs, `\cite{}`
🚦 Quality Gates	3 human-in-the-loop gates (Stages 5, 9, 20) with rollback. Skip with `--auto-approve`.
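
As an example of what "AST-validated code" can mean, the sketch below parses generated source with Python's `ast` module and reports imports outside an allowlist. It is an illustration rather than the project's validator: it only inspects `import` statements, so dynamic imports such as `__import__("os")` would need extra checks, and `disallowed_imports` is a hypothetical name.

```python
import ast

# Mirrors the `allowed_imports` example from the configuration reference.
ALLOWED_IMPORTS = {"math", "random", "json", "csv", "numpy", "torch", "sklearn"}


def disallowed_imports(source, allowed=ALLOWED_IMPORTS):
    """Parse source and return root package names that are not allowed."""
    tree = ast.parse(source)  # raises SyntaxError on invalid code
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return sorted(found - set(allowed))
```

An empty result means every import statement in the snippet stays within the allowlist; a `SyntaxError` from `ast.parse` is itself a useful early rejection signal.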

⚙️ Configuration Reference


# === Project ===
project:
  name: "my-research"              # Project identifier
  mode: "docs-first"               # docs-first | semi-auto | full-auto

# === Research ===
research:
  topic: "..."                     # Research topic (required)
  domains: ["ml", "nlp"]           # Research domains for literature search
  daily_paper_count: 8             # Target papers per search query
  quality_threshold: 4.0           # Minimum quality score for papers

# === Runtime ===
runtime:
  timezone: "America/New_York"     # For timestamps
  max_parallel_tasks: 3            # Concurrent experiment limit
  approval_timeout_hours: 12       # Gate stage timeout
  retry_limit: 2                   # Retry count on stage failure

# === LLM ===
llm:
  provider: "openai-compatible"    # Provider type
  base_url: "https://..."          # API endpoint (required)
  api_key_env: "OPENAI_API_KEY"    # Env var for API key (required)
  api_key: ""                      # Or hardcode key here
  primary_model: "gpt-4o"          # Primary model
  fallback_models: ["gpt-4o-mini"] # Fallback chain
  s2_api_key: ""                   # Semantic Scholar API key (optional, higher rate limits)

# === Experiment ===
experiment:
  mode: "sandbox"                  # simulated | sandbox | ssh_remote
  time_budget_sec: 600             # Max execution time per run (default: 600s)
  max_iterations: 10               # Max optimization iterations
  metric_key: "val_loss"           # Primary metric name
  metric_direction: "minimize"     # minimize | maximize
  sandbox:
    python_path: ".venv/bin/python"
    gpu_required: false
    allowed_imports: [math, random, json, csv, numpy, torch, sklearn]
    max_memory_mb: 4096
  ssh_remote:
    host: ""                       # GPU server hostname
    gpu_ids: []                    # Available GPU IDs
    remote_workdir: "/tmp/researchclaw_experiments"

# === Export ===
export:
  target_conference: "neurips_2025"  # neurips_2025 | iclr_2026 | icml_2026
  authors: "Anonymous"
  bib_file: "references"

# === Prompts ===
prompts:
  custom_file: ""                  # Path to custom prompts YAML (empty = defaults)

# === Security ===
security:
  hitl_required_stages: [5, 9, 20] # Stages requiring human approval
  allow_publish_without_approval: false
  redact_sensitive_logs: true

# === Knowledge Base ===
knowledge_base:
  backend: "markdown"              # markdown | obsidian
  root: "docs/kb"

# === Notifications ===
notifications:
  channel: "console"               # console | discord | slack
  target: ""

# === OpenClaw Bridge ===
openclaw_bridge:
  use_cron: false                  # Scheduled research runs
  use_message: false               # Progress notifications
  use_memory: false                # Cross-session knowledge persistence
  use_sessions_spawn: false        # Spawn parallel sub-sessions
  use_web_fetch: false             # Live web search
  use_browser: false               # Browser-based paper collection
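
The `metric_key` and `metric_direction` settings suggest how the best iteration is chosen. The sketch below shows one plausible selection rule, assuming each iteration reports a dictionary of metrics and that NaN/Inf values disqualify a run; `best_run` is a hypothetical helper, not a `researchclaw` function.

```python
import math


def best_run(runs, metric_key="val_loss", metric_direction="minimize"):
    """Pick the run with the best finite metric; return None if none qualify."""
    if metric_direction not in ("minimize", "maximize"):
        raise ValueError(f"unknown metric_direction: {metric_direction!r}")
    valid = [
        r for r in runs
        if isinstance(r.get(metric_key), (int, float)) and math.isfinite(r[metric_key])
    ]
    if not valid:
        return None
    chooser = min if metric_direction == "minimize" else max
    return chooser(valid, key=lambda r: r[metric_key])
```

Filtering non-finite values before comparing avoids the case where a NaN silently wins or loses a `min`/`max` comparison.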

🙏 Acknowledgments

Inspired by:

🔬 AI Scientist (Sakana AI) — Automated research pioneer
🧠 AutoResearch (Andrej Karpathy) — End-to-end research automation
🌐 FARS (Analemma) — Fully Automated Research System

📄 License

MIT — see LICENSE for details.

📌 Citation

If you find AutoResearchClaw useful, please cite:

@misc{liu2026autoresearchclaw,
  author       = {Liu, Jiaqi and Xia, Peng and Han, Siwei and Qiu, Shi and Zhang, Letian and Chen, Guiming and Tu, Haoqin and Yang, Xinyu and Zhou, Jiawei and Zhu, Hongtu and Li, Yun and Zheng, Zeyu and Xie, Cihang and Ding, Mingyu and Yao, Huaxiu},
  title        = {AutoResearchClaw: Fully Autonomous Research from Idea to Paper},
  year         = {2026},
  organization = {GitHub},
  url          = {https://github.com/aiming-lab/AutoResearchClaw},
}

Built with 🦞 by the AutoResearchClaw team
