An agentic single-cell RNA-seq analysis system for 10x Genomics Chromium data.
scAgent is an AI assistant that helps wet-lab biologists perform, understand, and interpret scRNA-seq experiments through natural language conversation. It enforces best practices, tracks full provenance, and produces reproducible analyses — no programming knowledge required.
- Python ≥ 3.11
- Feynman — the AI agent runtime
curl -fsSL https://feynman.is/install | bash
feynman setup # authenticate with your Claude subscription
- Optional: Claude Code also works as the agent runtime (scAgent's .pi/config is compatible with any pi-based agent)
git clone https://github.com/deepmind11/scAgent.git
cd scAgent
python -m venv .venv
source .venv/bin/activate
pip install -e .
# Launch scAgent (works from anywhere)
scagent
# With specific model/thinking settings
scagent --model opus --thinking max
# Continue a previous session
scagent -c
# Resume/pick a session
scagent -r
On first launch, scAgent will ask you about your experiment:
- Load your data: point it at a filtered_feature_bc_matrix.h5 from Cell Ranger
- Describe your experiment: tissue, organism, experimental design
- Start analyzing: ask questions in natural language
Example prompts:
> Load data/pbmc10k/filtered_feature_bc_matrix.h5 and run QC
> What cell types are in my data?
> Compare gene expression between clusters 0 and 3
> Run pseudobulk DE between treatment and control
> Generate a methods section for my paper
scAgent is evaluated on SC-Bench (Workman et al., 2026, LatchBio), a benchmark of 394 verifiable problems derived from practical scRNA-seq workflows. The current top baseline model on SC-Bench scores 52.8%. The 7 canonical Chromium evaluations are bundled in eval/evals_canonical_chromium/.
| Task | Result |
|---|---|
| Normalization | ✅ Pass |
| HVG / Feature Selection | ✅ Pass |
| Clustering | ✅ Pass |
| Cell Type Annotation | ✅ Pass |
| Differential Expression | ✅ Pass |
| QC (cell filtering) | ❌ Fail |
| Trajectory Analysis | ❌ Fail |
QC fails because the agent applies standard textbook cutoffs (max 5,000 genes) to an already-cleaned dataset — it needs to inspect distributions before filtering and recognize when data is pre-filtered. Trajectory achieves 0.5 recall on terminal marker recovery — the agent identifies the correct CAF differentiation axis (universal fibroblast → myCAF) and recovers Ly6c1 but misses Acta2, a canonical myCAF marker. Improving pseudotime terminal group detection and marker ranking is next.
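A data-driven alternative to a fixed cutoff is to derive bounds from the distribution itself, for example by flagging cells that lie more than a few median absolute deviations (MADs) from the median. The sketch below is illustrative only and is not scAgent's implementation; `mad_bounds`, `fraction_removed`, the choice of `nmads=5`, and the toy gene counts are all assumptions.

```python
from statistics import median


def mad_bounds(values, nmads=5.0):
    """Return (lower, upper) bounds at median -/+ nmads * MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med - nmads * mad, med + nmads * mad


def fraction_removed(values, lower, upper):
    """Fraction of cells that a [lower, upper] filter would discard."""
    dropped = sum(1 for v in values if v < lower or v > upper)
    return dropped / len(values)


# Toy genes-per-cell values standing in for a dataset that was cleaned upstream.
genes_per_cell = [5200, 5400, 5600, 5800, 6000, 6100, 6300, 6500, 6700, 6900]

# A fixed textbook-style window (200 to 5,000 genes) versus MAD-derived bounds.
fixed_loss = fraction_removed(genes_per_cell, 200, 5000)
lower, upper = mad_bounds(genes_per_cell)
adaptive_loss = fraction_removed(genes_per_cell, lower, upper)

print(f"fixed cutoff removes {fixed_loss:.0%}, MAD bounds remove {adaptive_loss:.0%}")
```

On this toy input the fixed window discards every cell while the MAD bounds discard none, which is the failure mode described above: a cutoff that is reasonable for raw data can be destructive on pre-filtered data.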
The eval runs the full LLM agent end-to-end: the agent receives a task prompt, reasons about what analysis to perform, calls tools, and produces a structured answer that is graded automatically.
For future versions of scAgent, the goal is to embed first-principles reasoning directly into the system, enabling the agent to infer insights from the data itself rather than relying solely on standard community-defined thresholds and heuristics.
At present, scAgent does not implement its own agent loop and instead relies on Pi-Monos' agentic framework. A key direction moving forward is to develop a single-cell–specific agent loop that more closely mirrors the reasoning workflow of a computational biologist. This includes structuring the analysis into distinct phases—such as data exploration, quality control, normalization, and interpretation—allowing the agent to make context-aware decisions at each step rather than applying static pipelines.
pip install -e ".[eval]"
python eval/run_llm_benchmark.py # default: claude-opus-4-6
python eval/run_llm_benchmark.py --model claude-sonnet-4-5 # or any model
Results are saved to eval/results/.
The evaluations use SC-Bench by LatchBio (eval-graders). The canonical eval JSONs are included under Apache 2.0.
@article{scbench2026,
title={scBench: Evaluating AI Agents on Single-Cell RNA-seq Analysis},
author={Workman, Kenny and Yang, Zhen and Muralidharan, Harihara and Abdulali, Aidan and Le, Hannah},
year={2026},
note={LatchBio}
}
Full pipeline from raw counts to publication: QC → normalization → HVG → PCA → batch integration (Harmony, scVI, BBKNN, Scanorama) → clustering (Leiden/Louvain) → cell type annotation (CellTypist) → differential expression (pseudobulk DESeq2/edgeR, Wilcoxon) → pathway enrichment (GSEA, ClusterProfiler) → trajectory inference (PAGA + DPT + scVelo) → compositional analysis (scCODA + Milo) → cell communication (LIANA+) → perturbation analysis (guide assignment + DE) → immune repertoire (Scirpy: clonotype, diversity) → multimodal CITE-seq (CLR + WNN).
Each tool is defined by a JSON schema in tools/ with parameter types, constraints, and literature-backed defaults. 21 Python tool wrappers in scagent/tools/ implement the analysis logic with input validation, guard rails, plotting, and structured provenance output. Default parameters and analysis guidelines are derived from Best Practices for Single Cell Analysis across Modalities (Heumos et al., 2023), the sc-best-practices.org online book (Theis Lab), and the 10x Genomics Analysis Guide. Experiment metadata collection follows the minSCe guidelines (Füllgrabe et al., 2020). Per-step reference summaries are in best_practices/reference/.
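To make the schema idea concrete, here is a minimal sketch of how a schema with types, constraints, and defaults could be used to validate a request before any analysis runs. The schema layout, the `leiden_clustering` name, the field names, and every numeric value are hypothetical and are not taken from the files in tools/; `resolve_params` is likewise an illustrative helper.

```python
import json

# Hypothetical tool schema; structure and values are for illustration only.
LEIDEN_SCHEMA = json.loads("""
{
  "name": "leiden_clustering",
  "parameters": {
    "resolution": {"type": "number", "minimum": 0.1, "maximum": 3.0, "default": 1.0},
    "n_neighbors": {"type": "integer", "minimum": 2, "maximum": 100, "default": 15}
  }
}
""")

_TYPES = {"number": (int, float), "integer": (int,)}


def resolve_params(schema, user_params):
    """Fill in defaults, then check each value's type and range."""
    specs = schema["parameters"]
    unknown = set(user_params) - set(specs)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    resolved = {}
    for name, spec in specs.items():
        value = user_params.get(name, spec["default"])
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _TYPES[spec["type"]]):
            raise TypeError(f"{name} must be of type {spec['type']}")
        if not spec["minimum"] <= value <= spec["maximum"]:
            raise ValueError(
                f"{name}={value} is outside [{spec['minimum']}, {spec['maximum']}]"
            )
        resolved[name] = value
    return resolved


print(resolve_params(LEIDEN_SCHEMA, {"resolution": 0.8}))
```

Keeping defaults and constraints in data rather than in code means the same file can drive both validation and the explanation shown to the user.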
When you load data — whether raw from Cell Ranger or a half-processed .h5ad from a collaborator — the inspector (scagent/inspector.py) automatically determines what has already been done: is adata.X raw counts, log-normalized, or scaled? Are there PCA embeddings? Clustering labels? Batch columns? The agent uses this to reason about what steps are needed rather than assuming it controls the data from the start.
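The kind of check involved can be sketched with a simple heuristic over a sample of matrix entries. This is an assumption about how such an inspection might work, not the logic in scagent/inspector.py; `guess_matrix_state` and its labels are hypothetical names.

```python
def guess_matrix_state(values):
    """Classify a sample of expression values.

    Returns 'raw_counts', 'log_normalized', or 'scaled'. This is a heuristic:
    for example, depth-normalized data that was never log-transformed would
    also be labeled 'log_normalized' here.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v < 0 for v in values):
        return "scaled"          # z-scored data contains negative entries
    if all(float(v).is_integer() for v in values):
        return "raw_counts"      # non-negative integers look like UMI counts
    return "log_normalized"      # non-negative, non-integer values


print(guess_matrix_state([0, 1, 3, 0, 12]))
print(guess_matrix_state([0.0, 0.69, 1.39, 2.2]))
print(guess_matrix_state([-0.5, 0.2, 1.1]))
```

In practice the sample would come from the stored matrix (for instance the nonzero entries of a sparse adata.X), and the result would be combined with checks for embeddings, cluster labels, and batch columns.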
The dependency module (scagent/dependencies.py) encodes what each analysis step requires — both prerequisite steps and data conditions. If you ask for differential expression but haven't clustered yet, scAgent can check what's missing (check_prerequisites), plan the minimal steps to get there (plan_steps), or auto-run the prerequisites (ensure_ready_for).
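A minimal sketch of the prerequisite idea follows. The function names `check_prerequisites` and `plan_steps` come from the description above, but their signatures, the `REQUIRES` table, and the step names are assumptions made for illustration; data conditions are left out.

```python
# Hypothetical prerequisite table: each step lists the steps it depends on.
REQUIRES = {
    "qc": [],
    "normalization": ["qc"],
    "hvg": ["normalization"],
    "pca": ["hvg"],
    "neighbors": ["pca"],
    "clustering": ["neighbors"],
    "differential_expression": ["clustering"],
}


def check_prerequisites(step, done):
    """Return the direct prerequisites of `step` that are not yet completed."""
    return [dep for dep in REQUIRES[step] if dep not in done]


def plan_steps(step, done):
    """Return the minimal ordered list of steps needed to be able to run `step`."""
    plan = []

    def visit(name):
        if name in done or name in plan:
            return
        for dep in REQUIRES[name]:
            visit(dep)           # schedule dependencies first
        plan.append(name)

    visit(step)
    return plan


done = {"qc", "normalization", "hvg", "pca"}
print(check_prerequisites("differential_expression", done))
print(plan_steps("differential_expression", done))
```

With PCA already done, the plan contains only the neighbor graph, clustering, and the requested differential expression step, rather than rerunning the whole pipeline.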
Every experiment has a paradigm — one of 7 supported analysis types. The analysis DAG in scagent/dag.py generates a paradigm-specific step ordering with validated dependencies:
| Paradigm | Key Steps |
|---|---|
| cell_atlas | Standard QC → clustering → annotation |
| disease_vs_healthy | + pseudobulk DE + composition + enrichment |
| developmental_trajectory | + PAGA topology → DPT pseudotime → scVelo |
| perturbation_screen | + guide assignment → perturbation DE |
| temporal_longitudinal | + mandatory batch correction + time-course DE |
| immune_repertoire | + VDJ loading → clonotype → diversity |
| multimodal | + protein loading → CLR → WNN joint graph |
The DAG prevents invalid operations (pseudobulk DE on single-condition data, clustering on UMAP coordinates) and tracks progress through each step.
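One way to picture a paradigm-specific ordering is a shared base graph extended with paradigm-specific steps and sorted topologically. The sketch below uses Python's standard `graphlib` module; the step names, the `BASE` and `EXTRA` tables, and the `validate_step` guard are assumptions for illustration and do not reflect the contents of scagent/dag.py.

```python
from graphlib import TopologicalSorter

# Hypothetical step graphs: each step maps to the set of steps it depends on.
BASE = {
    "qc": set(),
    "normalization": {"qc"},
    "hvg": {"normalization"},
    "pca": {"hvg"},
    "clustering": {"pca"},
    "annotation": {"clustering"},
}
EXTRA = {
    "cell_atlas": {},
    "disease_vs_healthy": {
        "pseudobulk_de": {"annotation"},
        "composition": {"annotation"},
        "enrichment": {"pseudobulk_de"},
    },
}


def step_order(paradigm):
    """Return one valid execution order for the given paradigm."""
    graph = {**BASE, **EXTRA[paradigm]}
    return list(TopologicalSorter(graph).static_order())


def validate_step(step, n_conditions):
    """Reject operations that make no sense for the experiment design."""
    if step == "pseudobulk_de" and n_conditions < 2:
        raise ValueError("pseudobulk DE needs at least two conditions")


order = step_order("disease_vs_healthy")
print(order)
validate_step("pseudobulk_de", n_conditions=2)  # passes silently
```

Steps that do not depend on each other (here, composition and pseudobulk DE) may appear in either relative order; the sorter only guarantees that every step follows its dependencies.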
You can run an analysis, then go back to any checkpoint and branch off in a different direction. scagent/state.py implements lazy-checkpointed branching — AnnData objects (200MB–2GB) live in memory during normal work and only get written to disk when you fork, switch branches, or end a session. The branch tree lives in .scagent/branches/.
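The lazy-checkpointing pattern can be sketched as follows. This is a simplified illustration, not the code in scagent/state.py: `BranchStore` and its methods are hypothetical, a plain dict stands in for the AnnData object, and `pickle` stands in for whatever on-disk format the real implementation uses.

```python
import pickle
import tempfile
from pathlib import Path


class BranchStore:
    """Keep the active object in memory; write to disk only on fork or switch."""

    def __init__(self, root, data, branch="main"):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.active = branch
        self.data = data
        self.writes = 0          # counts disk writes, to show the laziness

    def _path(self, branch):
        return self.root / f"{branch}.pkl"

    def _flush(self):
        with open(self._path(self.active), "wb") as fh:
            pickle.dump(self.data, fh)
        self.writes += 1

    def fork(self, new_branch):
        """Checkpoint the current branch, then continue under a new name."""
        self._flush()
        self.active = new_branch  # in-memory state is the fork's starting point

    def switch(self, branch):
        """Save the current branch and load a previously checkpointed one."""
        self._flush()
        with open(self._path(branch), "rb") as fh:
            self.data = pickle.load(fh)
        self.active = branch


with tempfile.TemporaryDirectory() as tmp:
    store = BranchStore(tmp, {"resolution": 0.8})
    store.data["clusters"] = 12       # normal work: nothing written to disk
    store.fork("high_res")            # first disk write happens here
    store.data["resolution"] = 1.5    # change applies to the new branch only
    store.switch("main")              # saves high_res, reloads main
    print(store.active, store.data, store.writes)
```

The design trades durability for speed: edits made between checkpoints exist only in memory, which is why a flush at session end matters.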
Real-world scRNA-seq analysis happens over weeks — you run QC on Monday, come back to clustering on Thursday, and a reviewer asks about your normalization choice a month later. scAgent maintains cross-session memory via MemPalace (ChromaDB-backed) so nothing is lost between sessions. Every analysis decision, parameter choice, and conversation is stored with metadata (branch, timestamp, analysis phase). When the agent needs to recall a past decision — "why did we choose resolution 0.8?" — it does a semantic search and retrieves the relevant context, even from weeks ago.
Every tool invocation is recorded as a W3C PROV-O graph in JSON-LD — inputs, parameters, outputs, software versions, timestamps. Full traceability.
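As a rough picture of what one such record might look like, the sketch below builds a small JSON-LD graph using PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:startedAtTime`, `prov:endedAtTime`). The `sc:` namespace, the identifiers, the `prov_record` helper, and the example parameter and version values are assumptions, not the format written by scagent/provenance.py.

```python
import json
from datetime import datetime, timezone


def prov_record(tool, params, input_id, output_id, versions, started, ended):
    """Describe one tool invocation as a PROV-O activity in JSON-LD."""
    activity_id = f"sc:activity/{tool}"
    return {
        "@context": {
            "prov": "http://www.w3.org/ns/prov#",
            "sc": "https://example.org/scagent#",  # hypothetical namespace
        },
        "@graph": [
            {"@id": input_id, "@type": "prov:Entity"},
            {
                "@id": activity_id,
                "@type": "prov:Activity",
                "prov:used": {"@id": input_id},
                "prov:startedAtTime": started.isoformat(),
                "prov:endedAtTime": ended.isoformat(),
                # JSON-LD 1.1 JSON literals keep nested values opaque.
                "sc:parameters": {"@type": "@json", "@value": params},
                "sc:softwareVersions": {"@type": "@json", "@value": versions},
            },
            {
                "@id": output_id,
                "@type": "prov:Entity",
                "prov:wasGeneratedBy": {"@id": activity_id},
            },
        ],
    }


record = prov_record(
    tool="leiden",
    params={"resolution": 0.8},              # illustrative value
    input_id="sc:data/neighbors_graph",
    output_id="sc:data/leiden_labels",
    versions={"scanpy": "1.10.0"},           # illustrative value
    started=datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc),
    ended=datetime(2026, 1, 5, 9, 1, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Chaining records so that one activity's output entity is the next activity's input is what makes the full analysis traceable from result back to raw data.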
The export module (scagent/export.py) generates from the provenance chain:
- Methods section: camera-ready prose for a paper, auto-generated from provenance records
- Reproducibility package: a self-contained directory with methods.md, params.json, README.md, and replay.py (a script that re-runs the entire analysis from raw data using the recorded parameters)
Share the reproducibility package with a collaborator or reviewer and they can reproduce your exact analysis independently — same parameters, same tool versions, same results.
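Generating methods prose from recorded steps can be as simple as filling sentence templates with the stored parameters. The sketch below assumes a flat list of records with `tool` and `params` keys; that record shape, the `TEMPLATES` table, and `methods_text` are hypothetical and are not the logic in scagent/export.py.

```python
# Hypothetical sentence templates keyed by tool name.
TEMPLATES = {
    "normalize": (
        "Counts were normalized to {target_sum} per cell and log1p-transformed."
    ),
    "leiden": (
        "Cells were clustered with the Leiden algorithm (resolution {resolution})."
    ),
}


def methods_text(records):
    """Render provenance records, in order, as one methods paragraph."""
    sentences = [TEMPLATES[r["tool"]].format(**r["params"]) for r in records]
    return " ".join(sentences)


# Illustrative records; the parameter values are examples only.
records = [
    {"tool": "normalize", "params": {"target_sum": 10000}},
    {"tool": "leiden", "params": {"resolution": 0.8}},
]
print(methods_text(records))
```

Because the text is derived from the same records that drive the replay script, the written methods cannot drift from the parameters that were actually used.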
scAgent explains what it's doing and why at every step. It presents QC distributions and plots, shows parameter choices with rationale, displays top marker genes per cluster with supporting evidence, and asks for confirmation before advancing. No programming knowledge is assumed.
scAgent/
├── scagent/ # Core Python package
│ ├── cli.py # CLI entry point (scagent command)
│ ├── state.py # Branch & snapshot management
│ ├── memory.py # Long-term memory (MemPalace/ChromaDB)
│ ├── provenance.py # W3C PROV-O provenance tracking (JSON-LD)
│ ├── dag.py # Paradigm-aware analysis DAG (7 paradigms)
│ ├── context.py # Experiment context & metadata
│ ├── knowledge.py # Marker gene database
│ ├── inspector.py # AnnData state inspection
│ ├── dependencies.py # Prerequisite checking
│ ├── export.py # Methods section & repro package generation
│ └── tools/ # 21 analysis tool implementations
│ ├── trajectory.py # PAGA + DPT + scVelo
│ ├── composition.py # scCODA + Milo (via pertpy)
│ ├── communication.py # LIANA+ consensus L-R
│ ├── perturbation.py # Guide assignment + perturbation DE
│ ├── repertoire.py # Scirpy: VDJ + clonotype analysis
│ ├── multimodal.py # CITE-seq: CLR + WNN
│ └── ... # QC, clustering, DE, annotation, etc.
├── tools/ # Tool registry (34 JSON schemas)
├── .pi/ # Agent configuration
│ ├── SYSTEM.md # System prompt (identity + rules)
│ ├── settings.json # Model defaults
│ └── skills/ # 26 step-specific instruction sets
├── best_practices/ # Literature-backed reference guides
│ ├── sc_best_practices.pdf # Heumos et al. 2023
│ ├── best_practice_10xGenomics_scRNAseq.pdf # 10x official guide
│ └── reference/ # Per-step best practice summaries
├── eval/ # SC-Bench evaluation framework & results
├── tests/ # 25 test files, 128+ tests
└── pyproject.toml # Package metadata & dependencies
scAgent is built on Feynman, an open-source AI research agent. The .pi/ directory configures the agent:
- System prompt (SYSTEM.md): defines scAgent's identity, rules (never cluster on UMAP, always pseudobulk for cross-condition DE, etc.), and interaction style
- Skills: 19 step-specific instruction sets loaded contextually when the agent reaches each analysis phase
- Tool schemas: JSON definitions for every tool with parameter constraints and defaults
When you chat with scAgent, it:
- Identifies what you're asking for
- Checks prerequisites against the analysis DAG
- Loads the relevant skill for detailed instructions
- Executes the analysis step with validated parameters
- Records provenance and presents results with interpretation
pip install -e ".[dev]"
pytest tests/ -v
Integration tests require a dataset at data/pbmc10k/filtered_feature_bc_matrix.h5. Download it from 10x Genomics.
See outputs/architecture.md for the full system design document.
- SC-Bench by LatchBio — evaluation framework
- Feynman — agent runtime
- Scanpy — core analysis engine
- CellTypist — cell type annotation
- minSCe guidelines (Füllgrabe et al., 2020) — experiment metadata standards for scRNA-seq reporting