BCER

Brain–Cerebellum–Extremity–Reflector
An agent framework for reliable execution of long-horizon MRI analysis workflows.

BCER (Brain–Cerebellum–Extremity–Reflector) separates planning from execution. The Brain produces a constrained sketch over a tool catalogue. The Cerebellum is a deterministic runtime that binds symbolic artifacts to concrete files. MRI tools are called as typed Extremities. A two-tier Reflector adds bounded local recovery: it repairs recoverable failures and halts on nonrecoverable ones.
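
To make the planning/execution split concrete, the following is a minimal sketch of symbolic artifact binding: the plan refers to artifacts only by symbolic name, and the runtime resolves each name to a file path at execution time. Every name here (`Step`, `bind_and_run`, the `$name` token syntax, the fake tools) is a hypothetical illustration and not the repository's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One planned tool call; inputs and output are symbols, not file paths."""
    tool: str
    inputs: List[str]
    output: str


def bind_and_run(steps: List[Step],
                 tools: Dict[str, Callable[..., str]],
                 initial: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order, resolving each symbol to a concrete path."""
    bindings = dict(initial)  # symbol -> concrete path
    for step in steps:
        missing = [s for s in step.inputs if s not in bindings]
        if missing:
            # The planner referenced an artifact that no earlier step produced.
            raise KeyError(f"{step.tool}: unbound symbols {missing}")
        paths = [bindings[s] for s in step.inputs]
        bindings[step.output] = tools[step.tool](*paths)
    return bindings


# Hypothetical stand-in tools: each returns the path of the file it "wrote".
def fake_segment(image_path: str) -> str:
    return image_path.replace(".nii.gz", "_seg.nii.gz")


def fake_report(image_path: str, mask_path: str) -> str:
    return image_path.rsplit("/", 1)[0] + "/report.json"


plan = [
    Step("segment", ["$t2w"], "$mask"),
    Step("report", ["$t2w", "$mask"], "$report"),
]
result = bind_and_run(
    plan,
    {"segment": fake_segment, "report": fake_report},
    {"$t2w": "case01/t2w.nii.gz"},
)
print(result["$mask"])
```

Because the plan never contains a path, a planner in this sketch cannot hallucinate a filename; it can only name a symbol, and an unbound symbol fails fast before any tool runs.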

This repository accompanies the MICCAI paper. It contains:

  • the BCER Brain planner, Cerebellum executor, Extremity tool layer, and Reflector recovery loop,
  • a strict tool registry with 21 MRI-domain tools across prostate, brain, and cardiac workflows,
  • a benchmark harness with 4 controller arms,
  • a smoke benchmark that verifies the installation without medical data.

                   ┌──────────────────────────────┐
   user goal ────▶ │  Brain (constrained sketch)   │
                   └──────────────┬───────────────┘
                                  │  sketch JSON
                                  ▼
                   ┌──────────────────────────────┐
                   │  Compiler (sketch → DAG)      │
                   └──────────────┬───────────────┘
                                  │  validated DAG
                                  ▼
                   ┌──────────────────────────────┐         ┌──────────────────┐
                   │  Cerebellum (executor)        │ ──────▶ │  Extremity tools  │
                   └──────────────┬───────────────┘         └──────────────────┘
                                  │  on failure
                                  ▼
                   ┌──────────────────────────────┐
                   │  Reflector  Tier-1 (rules)    │
                   │             Tier-2 (LLM)      │
                   └──────────────────────────────┘
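
The failure path at the bottom of the diagram can be sketched as a bounded retry loop. This is an illustration under stated assumptions, not the repository's Reflector: the rule table, the `Halt` exception, the `max_repairs` budget, and the choice of which errors count as nonrecoverable are all hypothetical, and Tier-2 (an LLM in BCER) is represented by an optional plain callable.

```python
from typing import Callable, Dict, Optional


class Halt(Exception):
    """Raised when a failure is nonrecoverable or the repair budget is spent."""


# Tier-1 (hypothetical): map an error type to a parameter patch.
RULES: Dict[type, Dict[str, object]] = {
    MemoryError: {"batch_size": 1},
}
# Hypothetical choice of errors that no local repair can fix.
NONRECOVERABLE = (FileNotFoundError,)

Tier2 = Callable[[Exception, Dict[str, object]], Optional[Dict[str, object]]]


def run_with_reflector(tool: Callable[..., str],
                       params: Dict[str, object],
                       tier2: Optional[Tier2] = None,
                       max_repairs: int = 2) -> str:
    """Call tool(**params); on failure, patch params and retry a bounded number of times."""
    params = dict(params)
    for attempt in range(max_repairs + 1):
        try:
            return tool(**params)
        except NONRECOVERABLE as exc:
            raise Halt(f"nonrecoverable: {exc!r}") from exc
        except Exception as exc:
            if attempt == max_repairs:
                raise Halt(f"repair budget exhausted: {exc!r}") from exc
            patch = RULES.get(type(exc))       # Tier-1: rule lookup
            if patch is None and tier2 is not None:
                patch = tier2(exc, params)     # Tier-2: fallback proposer
            if patch is None:
                raise Halt(f"no repair available: {exc!r}") from exc
            params.update(patch)
    raise Halt("unreachable")  # the loop always returns or raises


def flaky_segment(batch_size: int = 4) -> str:
    """Stand-in tool that fails until batch_size is reduced to 1."""
    if batch_size > 1:
        raise MemoryError("out of memory")
    return f"segmented with batch_size={batch_size}"


outcome = run_with_reflector(flaky_segment, {"batch_size": 4})
print(outcome)
```

The point of the bound is that recovery stays local: the loop either produces a result within `max_repairs` patches or stops with an explicit halt, rather than replanning indefinitely.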

Quickstart

Five minutes, no medical data, no GPU, no model weights.

git clone https://github.com/Albertlongzi/BCER.git
cd BCER
conda env create -f envs/base.yml
conda activate bcer-base
pip install -e .

python -m benchmark.smoke

Expected last lines:

[smoke] step 1/3 dummy_load_case        OK
[smoke] step 2/3 dummy_segment          OK
[smoke] step 3/3 dummy_generate_report  OK
[smoke] PASS  3/3 tool dispatches succeeded.

If you see PASS 3/3, the tool registry, dispatcher, and run isolation are all working.
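
If you want to check the smoke result from a script (for example in CI), a small wrapper along these lines may help. It assumes the summary line is printed to stdout in the format shown above and that a passing run exits with code 0; neither is confirmed by this README, and `smoke_passed` and `run_smoke` are hypothetical helper names.

```python
import re
import subprocess
import sys

# Matches the summary line shown above, e.g. "[smoke] PASS  3/3 tool dispatches succeeded."
PASS_RE = re.compile(r"^\[smoke\] PASS\s+(\d+)/(\d+)\b")


def smoke_passed(output: str) -> bool:
    """Return True if the last non-empty line reports PASS with n/n dispatches."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return False
    match = PASS_RE.match(lines[-1])
    return bool(match) and match.group(1) == match.group(2)


def run_smoke() -> bool:
    """Run the smoke benchmark in a child process and check its output.

    Assumes stdout carries the summary and exit code 0 means success.
    """
    proc = subprocess.run(
        [sys.executable, "-m", "benchmark.smoke"],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0 and smoke_passed(proc.stdout)


sample = """\
[smoke] step 1/3 dummy_load_case        OK
[smoke] step 2/3 dummy_segment          OK
[smoke] step 3/3 dummy_generate_report  OK
[smoke] PASS  3/3 tool dispatches succeeded.
"""
print(smoke_passed(sample))
```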


Install

The agent framework only needs Python and three lightweight dependencies. The heavier tool tiers are optional — install only what you plan to use.

| Tier | Env file | Adds | When to install |
| --- | --- | --- | --- |
| Base | envs/base.yml | pydicom, SimpleITK, nibabel | Always |
| Inference | envs/inference.yml | torch, MONAI | To run segmentation tools |
| Reconstruction | envs/recon.yml | pygrappa, h5py | To process raw cardiac k-space |
| Radiomics | envs/radiomics.yml | pyradiomics, scikit-image | For ROI feature extraction |

Tool dispatch is controlled by BCER_TOOL_DISPATCH:

  • inprocess (default) — every tool runs in the current Python process.
  • auto — tier 1/2/3 tools run as subprocesses in their own conda env; base tools stay in process.
  • subprocess — every tool runs as a subprocess, for maximum isolation.

See docs/TOOL_ENV_ANALYSIS.md for the architecture rationale.
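
The three modes can be read as a small decision rule. The sketch below is an illustration of that rule only, not the dispatcher in `commands/`: the function names and the `"base"` tier label are assumptions made for the example, while the environment variable name, the mode names, and the default come from the list above.

```python
import os
from typing import Mapping, Optional

VALID_MODES = ("inprocess", "auto", "subprocess")


def dispatch_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Read BCER_TOOL_DISPATCH, defaulting to 'inprocess' when unset."""
    env = os.environ if env is None else env
    mode = env.get("BCER_TOOL_DISPATCH", "inprocess")
    if mode not in VALID_MODES:
        raise ValueError(f"BCER_TOOL_DISPATCH must be one of {VALID_MODES}, got {mode!r}")
    return mode


def runs_in_subprocess(tool_tier: str, mode: str) -> bool:
    """Decide whether a tool of the given tier runs outside the current process.

    tool_tier is 'base' for base tools; any other label is treated as a
    heavier tier (hypothetical labelling for this sketch).
    """
    if mode == "subprocess":
        return True                  # everything is isolated
    if mode == "auto":
        return tool_tier != "base"   # only non-base tiers are isolated
    return False                     # inprocess: nothing is isolated


print(runs_in_subprocess("inference", dispatch_mode({"BCER_TOOL_DISPATCH": "auto"})))
```

In practice, `auto` is the mode that lets the base environment stay small, since heavier dependencies such as torch only need to exist in the environment their tier runs in.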


Benchmark

The benchmark exercises one (task, arm) cell per run.

Controller arms. Each arm is a different planning/recovery strategy:

| Paper label | CLI flag |
| --- | --- |
| BCER | --arm bcer (alias for bcer_sketch) |
| ReAct | --arm react |
| ReAct + symbolic binding | --arm react_token |
| ReAct + binding + bounded reflector | --arm react_token_reflector |

Run one task/arm cell against a manifest you built locally:

python benchmark/benchmark_runner.py \
    --manifest benchmark/cases_manifest.jsonl \
    --task long_prostate_full \
    --arm bcer \
    --runs-root runs

See benchmark/README.md and docs/METRICS.md for the full CLI and metric definitions.
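
Since each invocation covers a single (task, arm) cell, comparing arms means running the command once per arm. A minimal driver might look like the following. It uses only the flags shown above; `build_command` and `run_all_arms` are hypothetical helpers, and the meaning of the runner's exit code is not documented here, so the sketch just records it.

```python
import subprocess
import sys
from typing import Dict, List

# The four --arm values from the table above.
ARMS = ["bcer", "react", "react_token", "react_token_reflector"]


def build_command(task: str,
                  arm: str,
                  manifest: str = "benchmark/cases_manifest.jsonl",
                  runs_root: str = "runs") -> List[str]:
    """Build the argv for one (task, arm) cell."""
    return [
        sys.executable, "benchmark/benchmark_runner.py",
        "--manifest", manifest,
        "--task", task,
        "--arm", arm,
        "--runs-root", runs_root,
    ]


def run_all_arms(task: str) -> Dict[str, int]:
    """Run every arm for one task sequentially; return each exit code."""
    return {arm: subprocess.run(build_command(task, arm)).returncode for arm in ARMS}


# Print the commands without executing them.
for arm in ARMS:
    print(" ".join(build_command("long_prostate_full", arm)[1:]))
```

Call `run_all_arms("long_prostate_full")` from the repository root once a manifest exists.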


Datasets

BCER does not ship any medical data. We evaluated the framework on the following public datasets, listed here with their links and access conditions:

| Domain | Dataset | Link | Access |
| --- | --- | --- | --- |
| Prostate | fastMRI Prostate | https://fastmri.med.nyu.edu | release agreement; NYU/Meta |
| Brain | BraTS 2021 (RSNA-ASNR-MICCAI) | https://www.synapse.org/Synapse:syn25829067 | Synapse account |
| Cardiac (cine) | ACDC | https://www.creatis.insa-lyon.fr/Challenge/acdc | registration required |
| Cardiac (raw k-space) | CMRxRecon 2025 | https://cmrxrecon.github.io | challenge release |

Once you have one or more datasets locally:

export BCER_PROSTATE_ROOT=/path/to/fastMRI_prostate
export BCER_BRATS_ROOT=/path/to/brats2021
export BCER_ACDC_ROOT=/path/to/acdc
export BCER_CARDIAC_RAW_ROOT=/path/to/cmrxrecon   # optional

python scripts/manifest_builder.py \
    --prostate-root "$BCER_PROSTATE_ROOT" \
    --brain-root    "$BCER_BRATS_ROOT" \
    --cardiac-root  "$BCER_ACDC_ROOT" \
    --output benchmark/cases_manifest.jsonl

The manifest builder is layout-tolerant: it scans for NIfTI / DICOM / HDF5 and infers modalities from filenames and DICOM headers. See docs/DATASETS.md for the expected directory layout per domain and how to bring a non-standard dataset.
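
Before building a manifest, it can be useful to check what a dataset root actually contains. The sketch below counts files by format using filename suffixes only. It is not the logic of `scripts/manifest_builder.py`: the suffix table and function names are assumptions for illustration, and DICOM files that lack a `.dcm` extension, which the real builder can identify from headers, will be missed here.

```python
from collections import Counter
from pathlib import Path
from typing import Optional

# Hypothetical suffix table; matching is case-insensitive.
SUFFIX_TO_FORMAT = {
    ".nii.gz": "nifti",
    ".nii": "nifti",
    ".dcm": "dicom",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
}


def detect_format(path: Path) -> Optional[str]:
    """Guess the imaging format from the filename suffix, or None if unknown."""
    name = path.name.lower()
    for suffix, fmt in SUFFIX_TO_FORMAT.items():
        if name.endswith(suffix):
            return fmt
    return None


def count_formats(root: Path) -> Counter:
    """Walk root recursively and count recognised files per format."""
    counts: Counter = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            fmt = detect_format(path)
            if fmt is not None:
                counts[fmt] += 1
    return counts


print(detect_format(Path("case01/T2W.nii.gz")))
```

For example, `count_formats(Path(os.environ["BCER_ACDC_ROOT"]))` (after `import os`) gives a quick sanity check that the root points at imaging data rather than an empty or wrong directory.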


Documentation

| Document | Covers |
| --- | --- |
| docs/TOOLS.md | Tool call convention, the 21 registered tools, how to add new ones |
| docs/METRICS.md | SR / TCR / ERR / safe-halt definitions and how to read a result |
| docs/DATASETS.md | Supported data layouts and manifest format |
| docs/TOOL_ENV_ANALYSIS.md | Environment tiering and subprocess dispatch architecture |
| benchmark/README.md | Benchmark CLI and outputs |
| docs/cardiac_acdc_classification_rules.md | Rule-based cardiac classifier |

In development

We are actively exploring the next version of BCER as a more interactive MRI workstation built around:

  • a web UI with chat, action graph, artifact viewer, and node inspector panes,
  • natural-language conversation for proposing and revising MRI workflows,
  • visible action-graph planning before deterministic execution,
  • human-in-the-loop review, patching, and rerun controls,
  • evidence-linked artifacts and reports.

An early scaffold is already in the repository: a mock FastAPI backend, a static workstation UI shell, a shared ActionGraph schema, a deterministic executor store, an OpenAI-compatible Brain client, and a read-only bridge into the existing tool catalogue. This line of work is still exploratory and not yet ready for general use.


Project layout

agent/             planner, sketch compiler, executor, reflector, rule engine
benchmark/         runner, summariser, smoke benchmark, paper-arm definitions
commands/          tool registry, dispatcher, schema validation
core/              project paths, parser, plan DAG, domain config
llm/               LLM backend adapters (OpenAI, Anthropic, Gemini, vLLM)
mri_agent_shell/   interactive CLI shell + cerebellum runtime + dummy tools
runtime/           memory, finalisation, artifact index, sandbox
tools/             21 imaging tool wrappers + subprocess entry point
scripts/           manifest builder and one-off utilities
envs/              tiered conda env files
docs/              user-facing documentation
configs/           task contracts and tool runtime tier config

Status

BCER is a research framework, not a clinically validated product. It is intended for studying the reliability of agent execution on multi-step imaging workflows. It does not replace expert radiological review.

This is an early open-source release. The codebase is still being organized and refined, some areas may not be fully polished, and not every path has been tested end to end. You may encounter bugs or rough edges. Issues and pull requests are welcome.

License

MIT — see LICENSE.
