scAgent

An agentic single-cell RNA-seq analysis system for 10x Genomics Chromium data.

scAgent is an AI assistant that helps wet-lab biologists perform, understand, and interpret scRNA-seq experiments through natural language conversation. It enforces best practices, tracks full provenance, and produces reproducible analyses — no programming knowledge required.

Prerequisites

Python ≥ 3.11

Feynman — the AI agent runtime

curl -fsSL https://feynman.is/install | bash
feynman setup   # authenticate with your Claude subscription

Optional: Claude Code — also works as the agent runtime (scAgent's .pi/ config is compatible with any pi-based agent)

Installation

git clone https://github.com/deepmind11/scAgent.git
cd scAgent
python -m venv .venv
source .venv/bin/activate
pip install -e .

Quick Start

# Launch scAgent (works from anywhere)
scagent

# With specific model/thinking settings
scagent --model opus --thinking max

# Continue a previous session
scagent -c

# Resume/pick a session
scagent -r

On first launch, scAgent will ask you about your experiment:

Load your data — point it at a filtered_feature_bc_matrix.h5 from Cell Ranger
Describe your experiment — tissue, organism, experimental design
Start analyzing — ask questions in natural language

Example prompts:

> Load data/pbmc10k/filtered_feature_bc_matrix.h5 and run QC
> What cell types are in my data?
> Compare gene expression between clusters 0 and 3
> Run pseudobulk DE between treatment and control
> Generate a methods section for my paper

Evaluation: SC-Bench

scAgent is evaluated on SC-Bench (Workman et al., 2026, LatchBio), a benchmark of 394 verifiable problems derived from practical scRNA-seq workflows. The current top baseline model on SC-Bench scores 52.8%. The 7 canonical Chromium evaluations are bundled in eval/evals_canonical_chromium/.

Task	Result
Normalization	✅ Pass
HVG / Feature Selection	✅ Pass
Clustering	✅ Pass
Cell Type Annotation	✅ Pass
Differential Expression	✅ Pass
QC (cell filtering)	❌ Fail
Trajectory Analysis	❌ Fail

QC fails because the agent applies standard textbook cutoffs (max 5,000 genes) to an already-cleaned dataset — it needs to inspect distributions before filtering and recognize when data is pre-filtered. Trajectory achieves 0.5 recall on terminal marker recovery — the agent identifies the correct CAF differentiation axis (universal fibroblast → myCAF) and recovers Ly6c1 but misses Acta2, a canonical myCAF marker. Improving pseudotime terminal group detection and marker ranking is next.

The eval runs the full LLM agent end-to-end: the agent receives a task prompt, reasons about what analysis to perform, calls tools, and produces a structured answer that is graded automatically.

For future versions of scAgent, the goal is to embed first-principles reasoning directly into the system, enabling the agent to infer insights from the data itself rather than relying solely on standard community-defined thresholds and heuristics.

At present, scAgent does not implement its own agent loop and instead relies on Pi-Monos' agentic framework. A key direction moving forward is to develop a single-cell–specific agent loop that more closely mirrors the reasoning workflow of a computational biologist. This includes structuring the analysis into distinct phases—such as data exploration, quality control, normalization, and interpretation—allowing the agent to make context-aware decisions at each step rather than applying static pipelines.

pip install -e ".[eval]"
python eval/run_llm_benchmark.py                        # default: claude-opus-4-6
python eval/run_llm_benchmark.py --model claude-sonnet-4-5  # or any model

Results are saved to eval/results/.

The evaluations use SC-Bench by LatchBio (eval-graders). The canonical eval JSONs are included under Apache 2.0.

@article{scbench2026,
  title={scBench: Evaluating AI Agents on Single-Cell RNA-seq Analysis},
  author={Workman, Kenny and Yang, Zhen and Muralidharan, Harihara and Abdulali, Aidan and Le, Hannah},
  year={2026},
  note={LatchBio}
}

Key Features

34 Analysis Tools Across 7 Paradigms

Full pipeline from raw counts to publication: QC → normalization → HVG → PCA → batch integration (Harmony, scVI, BBKNN, Scanorama) → clustering (Leiden/Louvain) → cell type annotation (CellTypist) → differential expression (pseudobulk DESeq2/edgeR, Wilcoxon) → pathway enrichment (GSEA, ClusterProfiler) → trajectory inference (PAGA + DPT + scVelo) → compositional analysis (scCODA + Milo) → cell communication (LIANA+) → perturbation analysis (guide assignment + DE) → immune repertoire (Scirpy: clonotype, diversity) → multimodal CITE-seq (CLR + WNN).

Each tool is defined by a JSON schema in tools/ with parameter types, constraints, and literature-backed defaults. 21 Python tool wrappers in scagent/tools/ implement the analysis logic with input validation, guard rails, plotting, and structured provenance output. Default parameters and analysis guidelines are derived from Best Practices for Single Cell Analysis across Modalities (Heumos et al., 2023), the sc-best-practices.org online book (Theis Lab), and the 10x Genomics Analysis Guide. Experiment metadata collection follows the minSCe guidelines (Füllgrabe et al., 2020). Per-step reference summaries are in best_practices/reference/.

State-Aware Data Inspector

When you load data — whether raw from Cell Ranger or a half-processed .h5ad from a collaborator — the inspector (scagent/inspector.py) automatically determines what has already been done: is adata.X raw counts, log-normalized, or scaled? Are there PCA embeddings? Clustering labels? Batch columns? The agent uses this to reason about what steps are needed rather than assuming it controls the data from the start.

Dependency Resolution

The dependency module (scagent/dependencies.py) encodes what each analysis step requires — both prerequisite steps and data conditions. If you ask for differential expression but haven't clustered yet, scAgent can check what's missing (check_prerequisites), plan the minimal steps to get there (plan_steps), or auto-run the prerequisites (ensure_ready_for).

Paradigm-Aware Analysis DAG

Every experiment has a paradigm — one of 7 supported analysis types. The analysis DAG in scagent/dag.py generates a paradigm-specific step ordering with validated dependencies:

Paradigm	Key Steps
`cell_atlas`	Standard QC → clustering → annotation
`disease_vs_healthy`	+ pseudobulk DE + composition + enrichment
`developmental_trajectory`	+ PAGA topology → DPT pseudotime → scVelo
`perturbation_screen`	+ guide assignment → perturbation DE
`temporal_longitudinal`	+ mandatory batch correction + time-course DE
`immune_repertoire`	+ VDJ loading → clonotype → diversity
`multimodal`	+ protein loading → CLR → WNN joint graph

The DAG prevents invalid operations (pseudobulk DE on single-condition data, clustering on UMAP coordinates) and tracks progress through each step.

State Management & Branching

You can run an analysis, then go back to any checkpoint and branch off in a different direction. scagent/state.py implements lazy-checkpointed branching — AnnData objects (200MB–2GB) live in memory during normal work and only get written to disk when you fork, switch branches, or end a session. The branch tree lives in .scagent/branches/.

Long-Term Memory

Real-world scRNA-seq analysis happens over weeks — you run QC on Monday, come back to clustering on Thursday, and a reviewer asks about your normalization choice a month later. scAgent maintains cross-session memory via MemPalace (ChromaDB-backed) so nothing is lost between sessions. Every analysis decision, parameter choice, and conversation is stored with metadata (branch, timestamp, analysis phase). When the agent needs to recall a past decision — "why did we choose resolution 0.8?" — it does a semantic search and retrieves the relevant context, even from weeks ago.

Provenance & Reproducibility

Every tool invocation is recorded as a W3C PROV-O graph in JSON-LD — inputs, parameters, outputs, software versions, timestamps. Full traceability.

The export module (scagent/export.py) generates from the provenance chain:

Methods section — camera-ready prose for a paper, auto-generated from provenance records
Reproducibility package — a self-contained directory with methods.md, params.json, README.md, and replay.py (a script that re-runs the entire analysis from raw data using the recorded parameters)

Share the reproducibility package with a collaborator or reviewer and they can reproduce your exact analysis independently — same parameters, same tool versions, same results.

Biologist-Friendly

scAgent explains what it's doing and why at every step. It presents QC distributions and plots, shows parameter choices with rationale, displays top marker genes per cluster with supporting evidence, and asks for confirmation before advancing. No programming knowledge is assumed.

Project Structure

scAgent/
├── scagent/              # Core Python package
│   ├── cli.py            #   CLI entry point (scagent command)
│   ├── state.py          #   Branch & snapshot management
│   ├── memory.py         #   Long-term memory (MemPalace/ChromaDB)
│   ├── provenance.py     #   W3C PROV-O provenance tracking (JSON-LD)
│   ├── dag.py            #   Paradigm-aware analysis DAG (7 paradigms)
│   ├── context.py        #   Experiment context & metadata
│   ├── knowledge.py      #   Marker gene database
│   ├── inspector.py      #   AnnData state inspection
│   ├── dependencies.py   #   Prerequisite checking
│   ├── export.py         #   Methods section & repro package generation
│   └── tools/            #   21 analysis tool implementations
│       ├── trajectory.py     # PAGA + DPT + scVelo
│       ├── composition.py    # scCODA + Milo (via pertpy)
│       ├── communication.py  # LIANA+ consensus L-R
│       ├── perturbation.py   # Guide assignment + perturbation DE
│       ├── repertoire.py     # Scirpy: VDJ + clonotype analysis
│       ├── multimodal.py     # CITE-seq: CLR + WNN
│       └── ...               # QC, clustering, DE, annotation, etc.
├── tools/                # Tool registry (34 JSON schemas)
├── .pi/                  # Agent configuration
│   ├── SYSTEM.md         #   System prompt (identity + rules)
│   ├── settings.json     #   Model defaults
│   └── skills/           #   26 step-specific instruction sets
├── best_practices/       # Literature-backed reference guides
│   ├── sc_best_practices.pdf        # Heumos et al. 2023
│   ├── best_practice_10xGenomics_scRNAseq.pdf  # 10x official guide
│   └── reference/        #   Per-step best practice summaries
├── eval/                 # SC-Bench evaluation framework & results
├── tests/                # 25 test files, 128+ tests
└── pyproject.toml        # Package metadata & dependencies

How It Works

scAgent is built on Feynman, an open-source AI research agent. The .pi/ directory configures the agent:

System prompt (SYSTEM.md) — defines scAgent's identity, rules (never cluster on UMAP, always pseudobulk for cross-condition DE, etc.), and interaction style
Skills — 19 step-specific instruction sets loaded contextually when the agent reaches each analysis phase
Tool schemas — JSON definitions for every tool with parameter constraints and defaults

When you chat with scAgent, it:

Identifies what you're asking for
Checks prerequisites against the analysis DAG
Loads the relevant skill for detailed instructions
Executes the analysis step with validated parameters
Records provenance and presents results with interpretation

Running Tests

pip install -e ".[dev]"
pytest tests/ -v

Integration tests require a dataset at data/pbmc10k/filtered_feature_bc_matrix.h5. Download from 10x Genomics.

Architecture

See outputs/architecture.md for the full system design document.

Acknowledgments

SC-Bench by LatchBio — evaluation framework
Feynman — agent runtime
Scanpy — core analysis engine
CellTypist — cell type annotation
minSCe guidelines (Füllgrabe et al., 2020) — experiment metadata standards for scRNA-seq reporting

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

scAgent

Prerequisites

Installation

Quick Start

Evaluation: SC-Bench

Key Features

34 Analysis Tools Across 7 Paradigms

State-Aware Data Inspector

Dependency Resolution

Paradigm-Aware Analysis DAG

State Management & Branching

Long-Term Memory

Provenance & Reproducibility

Biologist-Friendly

Project Structure

How It Works

Running Tests

Architecture

Acknowledgments

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 81 Commits
.github/workflows		.github/workflows
.pi		.pi
best_practices/reference		best_practices/reference
eval		eval
hooks		hooks
outputs		outputs
scagent		scagent
tests		tests
tools		tools
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
scagent.sh		scagent.sh

Folders and files

Latest commit

History

Repository files navigation

scAgent

Prerequisites

Installation

Quick Start

Evaluation: SC-Bench

Key Features

34 Analysis Tools Across 7 Paradigms

State-Aware Data Inspector

Dependency Resolution

Paradigm-Aware Analysis DAG

State Management & Branching

Long-Term Memory

Provenance & Reproducibility

Biologist-Friendly

Project Structure

How It Works

Running Tests

Architecture

Acknowledgments

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages