🚑 B A Y M A X

An evidence-to-action ER nurse assistant simulation

🔗 Live: https://baymax-bice.vercel.app

Patient said: “My mother’s legs are swollen.” Baymax retrieved the heart-failure precedent nobody mentioned.

Baymax is not impressive because it has more data. It is impressive when another perspective changes what it decides to do.

         👁 Patient world      👁 Drug world
          "What is similar?"   "What is hidden?"
                \              /
                 └──🧠 Brain──┘
        👃 Nose ──┤ changes action when perspective changes
                   ↓
              🛑 Brakes → ⚡ Nerves → 🤝 Hands → 🛡 Verify

Watch Him Think

make sync
make audit
make ui

Open http://localhost:8000/ui/. The live surface reads the generated audit receipt, narrates what Baymax noticed, admits uncertainty, shows what acted or stopped, and only claims verification when the receipt proves it.

Four Golden Cases

CASE 1 · RETRIEVAL DISCOVERY
"My mother's legs are swollen"
→ raw words miss the target
→ evidence-language query retrieves a CHF precedent
→ Baymax surfaces unmentioned evidence before answering
→ next question: short of breath at rest or when lying flat?

CASE 2 · ATTENTION FLIP
routine medication refill
→ served NOSE assigns ESI-attention tier 4
→ expensive eyes stay closed
→ no action attempted

CASE 3 · DECISION FLIP
same critical chest-pain patient
→ bed available: assign_bed
→ ER gridlock: divert
→ capacity perspective changes the action

CASE 4 · BRAKE SAVE
abdominal pain after Ibuprofen
→ patient-only: discharge_plan
→ drug eye: 16/17 exact-Ibuprofen reports marked serious
→ population signal only; no causality claim
→ brake blocks autonomous discharge
→ clinician-review nerve receives and ACKs the case

That is the recruiter path. Four cases show that Baymax discovers hidden evidence, allocates attention, changes decisions when another perspective matters, and stops itself before an unsafe autonomous action.

Technical Audit

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
make test
make audit

The final receipt is outputs/baymax_audit.json. It contains the discovery receipt and action trajectories:

attention skip: NOSE stops a routine case before expensive evidence work
cross-domain brake: another perspective changes action into human review
bounded action: durable state changes and the outcome is independently re-read

It also records exact source commits, organ grades, the solo-engineer A+ ceiling, and the remaining shortest path.

Simulated Deployment Readiness

Baymax includes a deliberately thin FDE-adjacent rehearsal package:

make readiness

It verifies five acceptance gates against the live audit receipt, then injects a false-success receipt and proves the release contract detects it and requires rollback. The result is eligible only for simulated shadow, never presented as hospital deployment or adoption.

Honesty Ledger

outputs/baymax_audit.json ends with a honesty_ledger: every organ maps to one capability lane, may only claim the one sentence its live proof earns, and Baymax ships at the strength of its weakest load-bearing organ — never above it.

Organ	Capability lane	Verdict gate
NOSE	signal-routing	served patient signal; ≥95% serious recall gate
Left eye	data-truth	55,500 rows scanned this run
Right eye	evidence-retrieval	5,000 openFDA reports scanned; population signal, not causality
Brain / Brakes / Hands	action-engine	decision change, autonomous block, and durable outcome re-read
Nerves	clinical-handoff	receiver ACK proven; cross-service recovery not yet
Mouth	explanation	states the FAERS causality boundary

A live gate that fails drops the organ to ❌, and ❌ leaves are never claimed. The headline verdict is the minimum verdict across load-bearing organs, pinned in CI by tests/test_audit.py::test_honesty_ledger_gates_headline_at_weakest_load_bearing_leaf.

Organ Sources

Organ	Public source	What this repo audits
NOSE	healthcare-genai-engineer + healthcare-signal-platform	Served patient attention signal plus a separate five-signal openFDA batch proof
Left eye	healthcare-ai-data-engineer	Full 55,500-row synthetic encounter corpus and reconciliation
Right eye	healthcare-da + healthcare-signal-platform	Real openFDA evidence plus governed semantic-layer lineage
Brain + hands	healthcare-genai-engineer	Triage, Bed Ops decision, durable action, ACK, and outcome verification

Honest Boundary

Baymax is a closed-loop ER simulation, not a deployed clinical system. The action engine changes durable SQLite state representing a Bed Ops disposition; it does not write to a hospital EHR or bed-management platform.

The per-case NOSE is deliberately cheap: a versioned deterministic ESI attention signal decides whether to open both eyes. On 497 labelled synthetic patient cases it preserves 95.39% serious-case recall while skipping 4.83% of expensive paths. The five statistical/ML/retrieval signals are separately batch-evaluated on 5,000 real openFDA reports; this repo does not pretend those models score patient queries online.

Repo Map

baymax/
├── baymax/demo.py              ✅ plays the three memorable cases
├── baymax/audit.py             ✅ generates deep evidence receipts
├── baymax/served_nose.py       ✅ serves and evaluates pre-eye attention
├── deployment-readiness/       ✅ simulated acceptance, rollback, and handoff
├── cases/legendary_cases.json  ✅ the public three-case screenplay
├── docs/baymax/constitution/   📖 versioned architecture health records
├── tests/test_audit.py         ✅ pins every movie outcome in CI
├── scripts/sync_sources.sh     ✅ fetches the four public sibling sources
├── outputs/                    🟡 audit + simulated rollout receipts
├── .github/workflows/audit.yml ✅ regenerates proof on every change
├── Makefile                    ✅ one-command audit entry points
├── SPEC.md                     📖 scope and proof boundary
└── README.md                   📖 recruiter audit map

retrieval discovery: symptom-language expansion retrieves an unmentioned CHF precedent

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🚑 B A Y M A X

Watch Him Think

Four Golden Cases

Technical Audit

Simulated Deployment Readiness

Honesty Ledger

Organ Sources

Honest Boundary

Repo Map

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
.github/workflows		.github/workflows
backend/healthcare-ai-data-engineer		backend/healthcare-ai-data-engineer
baymax		baymax
cases		cases
deployment-readiness		deployment-readiness
docs/baymax/constitution		docs/baymax/constitution
evidence/l2-decision-safety		evidence/l2-decision-safety
outputs		outputs
scripts		scripts
tests		tests
ui/baymax		ui/baymax
.gitattributes		.gitattributes
.gitignore		.gitignore
.vercelignore		.vercelignore
Makefile		Makefile
README.md		README.md
demo.gif		demo.gif
requirements.txt		requirements.txt
vercel.json		vercel.json

Folders and files

Latest commit

History

Repository files navigation

🚑 B A Y M A X

Watch Him Think

Four Golden Cases

Technical Audit

Simulated Deployment Readiness

Honesty Ledger

Organ Sources

Honest Boundary

Repo Map

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages