From literature review to publication-ready manuscript — fully autonomous, zero human intervention.
Nigmat Rahim · Peking University · nigmatrahim@stu.pku.edu.cn
🔬 8 Specialized Agents · 📊 9 Data Sources · 📝 Publication-Ready
BioAgent is a fully autonomous AI research system that conducts end-to-end bioinformatics research. Given a research question, it autonomously:
- Reviews literature across PubMed, ClinicalTrials, ClinVar, gnomAD, OncoKB, KEGG, UniProt, GWAS Catalog, and ArXiv
- Identifies gaps and generates testable hypotheses with novelty scoring
- Acquires real datasets from 9 biomedical data repositories (never fabricates data)
- Executes computational analyses with auto-generated Python code in a sandboxed environment
- Writes a complete manuscript in IMRAD format with proper PMID citations
- Creates publication-quality figures (Nature theme, 300 DPI, Okabe-Ito colour-blind palette)
- Self-reviews across 5 dimensions and iteratively revises until the quality threshold is met (score ≥ 7 out of 10, at most 3 review rounds)
- Exports to Markdown + LaTeX (Bioinformatics OUP format) + BibTeX
No human in the loop required. One command. One research question. One complete manuscript.
# Install
git clone https://github.com/Nigmat-future/bioagent.git && cd bioagent
pip install -e ".[dev]"
# Configure
echo "BIOAGENT_ANTHROPIC_API_KEY=your-key" > .env
# Run
bioagent research "What is the mechanistic role of BRAF V600E in melanoma pathogenesis?"
🐳 Docker
docker build -t bioagent:latest .
docker run --rm -e BIOAGENT_ANTHROPIC_API_KEY=$KEY bioagent:latest research "Your question"
🔒 Reproducible Install (pinned deps)
pip install -r requirements-lock.txt
pip install -e .

                    ┌─────────────────────────────────────────────────┐
                    │                📋 ResearchState                 │
                    │  (papers, data, hypotheses, results, figures,   │
                    │   paper_sections, review_feedback, …)           │
                    └─────────────────┬───────────────────────────────┘
                                      │ shared blackboard
                               ┌──────▼──────┐
                    ┌──────────│ 🎯 Orchestr.│◄──────────────────────┐
                    │          │    Agent    │                       │
                    │          └──────┬──────┘                       │
                    │                 │ LLM-directed routing         │
    ┌───────┬───────┼────────┬────────┼─────────┬──────────┬─────────┤
    ▼       ▼       ▼        ▼        ▼         ▼          ▼         ▼
┌───────┐┌──────┐┌──────┐┌───────┐┌────────┐┌────────┐┌────────┐┌──────┐
│📚 Lit ││🔍 Gap││🧪Plan││⚗️ Exp ││💾 Data ││📈Anlst ││✍️Writer││🎨 Fig│
│ Agent ││ Anal.││ Agent││Design ││Acquir. ││ Agent  ││ Agent  ││Agent │
│       ││      ││      ││       ││        ││        ││        ││      │
│BioMCP ││ LLM  ││Hyp+  ││  LLM  ││9 tools ││Sandbox ││ IMRAD  ││Nature│
│+ArXiv ││      ││rubric││       ││3-tier  ││+debug  ││+cites  ││theme │
└───────┘└──────┘└──────┘└───────┘└────────┘└───┬────┘└────────┘└──────┘
                                                │
                                         ┌──────▼──────┐
                                         │ ✅ Validate │──── retry ──┐
                                         └──────┬──────┘             │
                                                ▼                    │
                                         ┌─────────────┐             │
                                         │ 🔄 Iterate  │─────────────┘
                                         └─────────────┘
                                                │
                                         ┌──────▼──────┐
                                         │ 📝 Review   │  score ≥ 7 → ✅ DONE
                                         │  (5 dims)   │◄── revise ──┐
                                         └──────┬──────┘             │
                                                │ < 7, round < 3     │
                                                └────────────────────┘
                                                │
                                                ▼
                                        ┌────────────────┐
                                        │   📦 Export    │
                                        │   MD + LaTeX   │
                                        │   + BibTeX     │
                                        └────────────────┘
The pipeline is a 14-node LangGraph StateGraph with conditional orchestrator routing, a code-execution retry loop (max 5 iterations), and a review-revision loop (max 3 rounds).
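The two bounded loops can be sketched as plain routing functions. This is an illustrative sketch under stated assumptions, not BioAgent's actual code: the function names and returned labels are hypothetical, and what happens when a limit is exhausted is an assumption. Only the limits themselves (5 iterations, 3 review rounds, pass score of 7) come from this README.

```python
# Hypothetical routing helpers; names and labels are illustrative only.
MAX_ITERATIONS = 5      # code-execution retry limit stated in this README
MAX_REVIEW_ROUNDS = 3   # review-revision limit stated in this README
PASS_SCORE = 7.0        # "score >= 7" threshold shown in the diagram


def route_after_validation(succeeded: bool, iteration: int) -> str:
    """Decide what follows a code-execution attempt."""
    if succeeded:
        return "continue"
    if iteration < MAX_ITERATIONS:
        return "iterate"      # retry with a debugged script
    return "give_up"          # assumption: stop retrying once the limit is hit


def route_after_review(score: float, review_round: int) -> str:
    """Decide what follows a self-review round."""
    if score >= PASS_SCORE:
        return "export"
    if review_round < MAX_REVIEW_ROUNDS:
        return "revise"
    return "export"           # assumption: export the best draft when rounds run out
```

In LangGraph, decision functions of this shape are usually attached to a node with `StateGraph.add_conditional_edges`, where they receive the shared state rather than bare arguments.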
Key design choices:
| Agent | Tools & Integrations | What It Does |
|---|---|---|
| 🎯 OrchestratorAgent | LLM-directed routing | Determines the next research phase (12 valid phases) with loop detection and anti-backtrack logic |
| 📚 LiteratureAgent | BioMCP (PubMed · ClinicalTrials · ClinVar · gnomAD · OncoKB · Reactome · KEGG · UniProt · GWAS) + ArXiv | Systematic literature review with structured summaries and gap identification |
| 🧪 PlannerAgent | BioMCP biological context | Hypothesis generation with novelty/testability scoring + detailed experiment design |
| 💾 DataAcquisitionAgent | 9 tools across GEO · cBioPortal · GDC/TCGA · NCBI · ENCODE · ENA · ArrayExpress · 10x CDN · Direct URL | Real dataset download with 3-tier fallback (API → REST/FTP → manual instructions). Mirror-first routing. Never fabricates data. |
| 📈 AnalystAgent | Python sandbox + 8 bioinformatics templates (scRNA-seq · DE · GWAS · survival · enrichment) | Auto-generates analysis code, executes in sandbox, auto-debugs on failure (up to 5 retries) |
| ✍️ WriterAgent | — | Writes publication-quality IMRAD sections with proper PMID citations and data provenance |
| 🎨 VisualizationAgent | Python sandbox + Nature matplotlib theme | Publication figures: 300 DPI, Okabe-Ito colour-blind palette, PDF + PNG output |
| 📝 ReviewAgent | — | 5-dimension self-review (novelty · rigor · clarity · completeness · reproducibility) with revision gating |
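The 3-tier fallback described for the DataAcquisitionAgent (API → REST/FTP → manual instructions) might look roughly like the following. This is a minimal sketch, assuming each tier is a callable that returns a local path or raises. The `acquire` function, the tier labels, and the result fields are hypothetical and not taken from the BioAgent code base.

```python
from typing import Callable, Sequence, Tuple

# Each tier is a (label, fetch) pair; fetch returns a local file path on
# success and raises on failure. All names here are hypothetical.
Tier = Tuple[str, Callable[[str], str]]


def acquire(dataset_id: str, tiers: Sequence[Tier]) -> dict:
    """Try each tier in order; if all fail, return manual instructions."""
    errors = []
    for label, fetch in tiers:
        try:
            return {"status": "ok", "tier": label, "path": fetch(dataset_id)}
        except Exception as exc:  # any failure moves on to the next tier
            errors.append(f"{label}: {exc}")
    # Final tier: tell the user what to do instead of inventing data.
    return {
        "status": "manual",
        "instructions": f"Download {dataset_id} manually into workspace/data/",
        "errors": errors,
    }
```

The key property is that the last tier reports failure and asks for a manual download, so no code path produces synthetic data.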
Evaluated on three real-world bioinformatics case studies using the same base model and prompts:
| Case Study | v0.1 Score | v0.3 Score | Δ | Highlights |
|---|---|---|---|---|
| TP53 Pan-Cancer | 1.06 | 8.42 | +7.36 | Full IMRAD draft · 4,439 words · 6 figures · Self-review 7/10 |
| scRNA PBMC 3k | 6.44 | 7.64 | +1.20 | Complete single-cell pipeline with clustering + markers |
| BRAF V600E Melanoma | 5.30 | 6.90 | +1.60 | 12 figures · 5 IMRAD sections · 2h 47m runtime |
Weighted composite scores (0–10) across 6 evaluation dimensions. See benchmarks/ for methodology.
📐 Evaluation Dimensions
| Dimension | Metrics |
|---|---|
| Literature Coverage | Precision / Recall vs. gold-standard PMIDs |
| Hypothesis Quality | Novelty, testability, literature grounding |
| Analysis Correctness | Code execution success rate, statistical validity |
| Writing Completeness | Section coverage, word count, Flesch readability |
| Figure Quality | Count, caption coverage, file presence |
| Efficiency | Token usage, cost (USD), self-review score |
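A weighted composite over the six dimensions can be computed as a weighted mean. The sketch below is only an illustration: the actual weights and per-dimension scoring live in benchmarks/ and are not reproduced here, so the equal weights, the example scores, and the `composite_score` name are all placeholders.

```python
def composite_score(scores: dict, weights: dict) -> float:
    """Weighted mean of per-dimension scores, each on a 0-10 scale."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return round(weighted_sum / total_weight, 2)


# Placeholder inputs: equal weights and made-up scores, for illustration only.
dimensions = [
    "literature_coverage", "hypothesis_quality", "analysis_correctness",
    "writing_completeness", "figure_quality", "efficiency",
]
example_weights = {name: 1.0 for name in dimensions}
example_scores = dict(zip(dimensions, [8, 7, 9, 6, 7, 5]))
print(composite_score(example_scores, example_weights))  # 7.0
```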
# Run a research session
bioagent research "What are the most effective BRAF inhibitors for melanoma?"
# Specify topic explicitly
bioagent research "Analyze PBMC single-cell heterogeneity" --topic "scRNA-seq PBMC"
# Export completed session
bioagent export --thread <thread-id> --format both # Markdown + LaTeX
# Session management
bioagent status --thread <thread-id> # Check progress
bioagent resume --thread <thread-id> # Resume paused session

from bioagent.graph.research_graph import compile_research_graph
from bioagent.tools.execution.sandbox import ensure_workspace

ensure_workspace()
graph = compile_research_graph()
state = {
    "research_topic": "BRAF V600E in melanoma",
    "research_question": "What is the mechanistic role of BRAF V600E?",
    "current_phase": "literature_review",
}
for event in graph.stream(state, config={"configurable": {"thread_id": "session-001"}}):
    print(f"Phase: {event.get('current_phase')}")
🔧 Full programmatic example
See examples/quickstart.py for a complete working example with all state fields.
workspace/
├── 📁 data/                ← Downloaded datasets (CSV, HDF5, FASTQ, ...)
├── 📁 scripts/             ← Auto-generated Python analysis code
├── 📁 figures/             ← Publication-ready figures (PDF + PNG, 300 DPI)
└── 📁 output/
    ├── 📄 manuscript.md    ← Markdown manuscript
    ├── 📄 manuscript.tex   ← LaTeX (Bioinformatics OUP format)
    ├── 📄 references.bib   ← BibTeX bibliography
    └── 📄 provenance.json  ← Full audit trail (model, seed, hashes, timings)
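To show what an audit trail with file hashes could involve, here is a minimal sketch that hashes every file under a workspace directory. The real schema of provenance.json is not documented in this README, so the field names and the `build_provenance` helper are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_provenance(workspace: Path, model: str, seed: int) -> dict:
    """Record model, seed, and a SHA-256 per file (field names are assumed)."""
    files = sorted(p for p in workspace.rglob("*") if p.is_file())
    return {
        "model": model,
        "seed": seed,
        "hashes": {p.relative_to(workspace).as_posix(): sha256_of(p) for p in files},
    }
```

Comparing the recorded hashes between two runs is one way to check that the same input data and generated scripts were used.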
All settings use the BIOAGENT_ prefix. Create a .env file or set environment variables:
Core Settings
| Variable | Default | Description |
|---|---|---|
| BIOAGENT_ANTHROPIC_API_KEY | — | Anthropic API key (required) |
| BIOAGENT_PRIMARY_MODEL | claude-sonnet-4-5-20250929 | Primary LLM model |
| BIOAGENT_FALLBACK_MODEL | gpt-4.1 | Fallback model |
| BIOAGENT_MAX_TOKENS | 4096 | Max output tokens per LLM call |
| BIOAGENT_MAX_TOOL_CALLS | 20 | Max tool-use iterations per agent |
Budget & Limits
| Variable | Default | Description |
|---|---|---|
| BIOAGENT_TOKEN_BUDGET | 500000 | Total token budget (0 = unlimited) |
| BIOAGENT_COST_BUDGET_USD | 10.0 | Cost budget in USD (0 = unlimited) |
| BIOAGENT_CODE_TIMEOUT | 120 | Code execution timeout (seconds) |
| BIOAGENT_MAX_ITERATIONS | 5 | Max code execution retries |
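The "0 = unlimited" convention for the two budgets is easy to get wrong, so a small sketch may help. The helper names below are hypothetical. Only the default values (500000 tokens, 10.0 USD) and the meaning of 0 come from the table above.

```python
def within_budget(used: float, budget: float) -> bool:
    """True while usage is below the budget; a budget of 0 means unlimited."""
    if budget == 0:
        return True
    return used < budget


def can_continue(tokens_used: int, cost_used_usd: float,
                 token_budget: int = 500_000,
                 cost_budget_usd: float = 10.0) -> bool:
    """A run may proceed only while both budgets still have headroom."""
    return (within_budget(tokens_used, token_budget)
            and within_budget(cost_used_usd, cost_budget_usd))
```

For example, `can_continue(600_000, 2.5, token_budget=0)` is true because the token budget is disabled, while the same usage against the default token budget is not.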
Infrastructure
| Variable | Default | Description |
|---|---|---|
| BIOAGENT_WORKSPACE_DIR | workspace | Working directory for outputs |
| BIOAGENT_CHECKPOINT_DIR | checkpoints | SQLite checkpoint directory |
| BIOAGENT_USE_SQLITE_CHECKPOINTS | true | Enable session persistence |
| BIOAGENT_HUMAN_IN_LOOP | false | Require human approval per phase |
| BIOAGENT_RANDOM_SEED | 42 | Random seed for reproducibility |
| BIOAGENT_TLS_VERIFY | true | TLS certificate verification |
| BIOAGENT_LOG_LEVEL | INFO | Logging verbosity |
Network & Resilience
| Variable | Default | Description |
|---|---|---|
| BIOAGENT_MIN_DOWNLOAD_MBPS | 2.0 | Minimum download speed floor (Mbps) |
| BIOAGENT_DOWNLOAD_MAX_RETRIES | 4 | Download retry attempts |
| BIOAGENT_TMP_STALE_HOURS | 24 | Stale temp file cleanup threshold (hours) |
| BIOAGENT_PREFER_MIRRORS | true | Prefer EBI/ENA mirrors over NCBI |
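As an illustration of how prefixed settings with defaults resolve, the sketch below reads a handful of the documented variables from the process environment using only the standard library. It is not BioAgent's configuration loader: the `Settings` class and `load_settings` function are hypothetical, and the sketch does not parse a .env file.

```python
import os
from dataclasses import dataclass

_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name: str, default: bool) -> bool:
    """Parse a boolean environment variable, falling back to the default."""
    raw = os.environ.get(name)
    return default if raw is None else raw.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class Settings:
    max_tokens: int
    code_timeout: int
    cost_budget_usd: float
    human_in_loop: bool
    prefer_mirrors: bool


def load_settings(prefix: str = "BIOAGENT_") -> Settings:
    """Read a few documented variables, using the documented defaults."""
    env = os.environ
    return Settings(
        max_tokens=int(env.get(prefix + "MAX_TOKENS", "4096")),
        code_timeout=int(env.get(prefix + "CODE_TIMEOUT", "120")),
        cost_budget_usd=float(env.get(prefix + "COST_BUDGET_USD", "10.0")),
        human_in_loop=env_bool(prefix + "HUMAN_IN_LOOP", False),
        prefer_mirrors=env_bool(prefix + "PREFER_MIRRORS", True),
    )
```

Setting `BIOAGENT_HUMAN_IN_LOOP=true` before calling `load_settings()` would flip `human_in_loop` while leaving every other field at its default.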
# Single case
python benchmarks/run_benchmark.py --case braf_melanoma
# All benchmark cases
python benchmarks/run_benchmark.py --case all --output benchmarks/results/
# Resume a failed run from checkpoint
python benchmarks/resume_run.py --thread-id <id>

We welcome contributions! See docs/CONTRIBUTING.md for guidelines and docs/DEVELOPMENT.md for the architecture deep-dive and debugging guide.
# Development setup
pip install -e ".[dev]"
pre-commit install
# Run tests
pytest # Fast tests only
pytest -m "not api and not slow" # CI default
pytest --cov=bioagent --cov-report=html # With coverage

If you use BioAgent in your research, please cite:
@article{rahim2026bioagent,
  title   = {BioAgent: An Autonomous Multi-Agent System for
             End-to-End Bioinformatics Research},
  author  = {Rahim, Nigmat},
  journal = {Bioinformatics},
  year    = {2026},
  note    = {Under review. Preprint: \url{https://github.com/Nigmat-future/bioagent}}
}

Released under the MIT License.