Skip to content

andinet/nodule-scout

Repository files navigation

nodule-scout

A small, runnable research-orchestrator agent that helps scope a medical-AI product during early ideation. Given a broad clinical question, it dispatches specialized tool-calling sub-agents at real public FDA/NIH APIs, grounds the results in internal clinical-advisory notes, and emits a structured, fully-cited gap-analysis — refusing to ship any claim it can't trace back to a fetched source.

It is specialized to AI lung-nodule detection/triage on chest CT (predicate space: Fujifilm Synapse Lung Nodule AI, Coreline AVIEW, Riverain ClearRead CT, RevealAI-Lung, Infervision, …). It's the companion code for a blog post on agentic AI in the Software-as-a-Medical-Device (SaMD) lifecycle — Agentic AI is coming to the SaMD lifecycle.

⚠️ This is a conceptual ideation aid. It is not validated SaMD and produces no clinical guidance. The advisory-notes corpus under data/ is synthetic.

What it does

 clinical question ─▶ ORCHESTRATOR (Claude Opus 4.8)
                        decompose → dispatch → ground → synthesize
        ┌───────────────────┼────────────────────┐
   pubmed-researcher   maude-analyst      predicate-analyst      + RAG over
   (PubMed E-utils)   (openFDA MAUDE)    (openFDA 510(k))        advisory notes
        └─────────── Source Ledger (every fetched record) ──────┘
                                │
                 guardrails: scope · citation · human-in-the-loop
                                │
                   outputs/gap_analysis.md + .json + trace.jsonl
  • Three real data sources, no API key required — PubMed E-utilities, openFDA MAUDE (device adverse events), openFDA 510(k) (predicate clearances).
  • Source Ledger — every record any tool fetches is stored under a stable cite_id (PMID:########, K######, MAUDE:########, NOTE:<id>).
  • Citation guardrail — the gap analysis is produced through a submit_gap_analysis tool; a deterministic validator rejects the submission if any claim carries no citation or cites an id absent from the ledger. This is how "every claim cites a verifiable source" is enforced — not by trusting the model.
  • Scope guard — an off-domain question (e.g. breast MRI triage) is refused before any spend.
  • Human-in-the-loop — the run writes a *.draft first; nothing is finalized without explicit approval (--approve).

Built on the Claude Agent SDK (claude-agent-sdk). Authenticate with an ANTHROPIC_API_KEY or a logged-in claude CLI session — see Authentication.

See a real, committed run in examples/sample-run/ (46 source records, 22 claims, 50 citations, 0 guardrail violations).

Quickstart

pip install -e .            # or: pip install claude-agent-sdk httpx pydantic rank-bm25 python-frontmatter
cp .env.example .env        # then set ANTHROPIC_API_KEY (or skip it and use a claude CLI login)

# Deterministic, code-orchestrated run (reproducible; recommended):
python run.py "What are the unmet needs for an AI lung-nodule triage tool on chest CT?" --approve

# Faithful multi-subagent run (Task fan-out to PubMed/MAUDE/510k sub-agents):
python run.py --agentic --approve

Outputs land in outputs/: gap_analysis.md, gap_analysis.json (analysis + full evidence index + attribution report), and trace.jsonl.

Authentication

Running the orchestrator (run.py) needs the Claude Code CLI installed (the Agent SDK drives it) plus one of these two ways to authenticate — pick either:

  • API key — set ANTHROPIC_API_KEY in .env (copied from .env.example). run.py loads .env automatically. Best if you don't use Claude Code.
  • CLI login — run claude once to log in; leave ANTHROPIC_API_KEY unset.

run.py prints which one it's using at startup. The three data-source keys (NCBI_API_KEY, OPENFDA_API_KEY) are optional — the FDA/NIH APIs are free and keyless; keys only raise rate limits. The tests need no keys and no Claude login (they hit the data APIs anonymously and test the guardrails offline).

Two orchestration modes

run_deterministic (default) run_agentic (--agentic)
Retrieval code calls the 4 tools directly orchestrator dispatches 3 sub-agents via the Task tool
Reproducible yes no (model-driven fan-out)
Stability robust experimental¹

Both share the ledger, tools, guardrails, and HITL gate in src/phase01/runtime.py.

¹ --agentic drives the Agent SDK's Task sub-agent dispatch, which is model-driven and, combined with hooks under a non-interactive process on some platforms, can occasionally hit an SDK↔CLI transport error. The deterministic mode is the recommended, reproducible path.

Layout

src/phase01/
  models.py       SourceRecord / SourceLedger / GapAnalysis (Claim requires >=1 cite)
  tools/          pubmed.py · maude.py · fda510k.py · retrieval.py (BM25) · http.py
  runtime.py      in-process SDK tools, Source Ledger, scope hook, HITL gate
  orchestrator.py run_agentic (subagents) + run_deterministic
  guardrails.py   check_scope · validate_attribution (the citation gate)
  render.py       markdown + JSON artifact with a References section
data/advisory_notes/   synthetic advisory-board corpus (RAG source)
examples/sample-run/   a real, committed gap analysis
run.py            CLI entrypoint + human-in-the-loop
tests/            guardrail unit tests + live-API smoke tests

Tests

PYTHONPATH=src python -m pytest tests/ -q

Covers: an injected uncited/unresolvable citation is rejected, an off-domain question is flagged, and each live API returns records.

Data sources

License

MIT — see LICENSE.

About

An illustrative research-orchestrator agent that turns a clinical question into a fully-cited gap analysis from live FDA + NIH data, built on the Claude Agent SDK.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages