Generative dynamics models of rare-disease patient trajectories. In the lineage of Dreamer, Sora, and Genie — specialised to clinical event streams, grounded in a biomedical knowledge graph, verified by an agentic swarm, and validated on real Brazilian SUS (DATASUS) data.
Research preview. Not a medical device.
Authors: Dimas Timmers (Raras Health), Alexandre Melo Kawassaki (Raras Health; Hospital Israelita Albert Einstein; A.C.Camargo Cancer Center), Joao Bosco Oliveira (Co-Human Genomics). Correspondence: dimas@raras.ai
Cite as: Timmers D, Kawassaki AM, Oliveira JB. GEMEO: the first patient world model for rare disease, grounding generative clinical trajectories in the genome and a biomedical knowledge graph. Zenodo. DOI: 10.5281/zenodo.20092130 (concept; always resolves to the latest version).
GEMEO is not a single model — it is a three-pillar architecture (Propose → Simulate → Verify) for patient world models, plus a family of open instances validated on Brazilian SUS rare-disease data.
Patient history
│
▼ Pillar A — Graph Proposer KG zero-shot → first-onset candidates
│ (PrimeKG; Marfan → FBN1, the causal gene)
▼ Pillar B — World-Model Scorer Diffusion-Forcing transformer +
│ recurrence-aware loss → predicts NOVEL events, not repeats
▼ Pillar C — Swarm Verifier case-adaptive multi-agent panel,
│ 3-valued voting, traceable KG evidence paths
▼
Ranked new-onset forecast + intervention plan, with evidence
Targets Level 3 (counterfactual rollout) on the NeurIPS 2025 clinical world-model rubric (arXiv 2511.16333), closing the four gaps that survey names.
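The Propose → Simulate → Verify flow in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not the GEMEO implementation: `Candidate`, `three_valued_vote`, `forecast`, the vote labels (`support` / `refute` / `abstain`), and the plurality rule are all assumptions made for the example, and the toy proposer, scorer, and agents stand in for the real pillars.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names here are hypothetical; they do not mirror the GEMEO codebase.

@dataclass
class Candidate:
    code: str                 # candidate first-onset event
    evidence: List[str]       # KG path that motivated it (Pillar A)
    score: float = 0.0        # world-model score (Pillar B)
    verdict: str = "abstain"  # swarm verdict (Pillar C)

def three_valued_vote(votes: List[str]) -> str:
    """Plurality over 'support' / 'refute'; ties and all-abstain yield 'abstain'."""
    support, refute = votes.count("support"), votes.count("refute")
    if support > refute:
        return "support"
    if refute > support:
        return "refute"
    return "abstain"

def forecast(history: List[str], propose: Callable, score: Callable,
             agents: List[Callable]) -> List[Candidate]:
    seen = set(history)
    # Pillar A: propose candidates, keeping only events not yet in the history.
    candidates = [c for c in propose(history) if c.code not in seen]
    for c in candidates:
        c.score = score(history, c.code)                        # Pillar B
        c.verdict = three_valued_vote(
            [agent(history, c) for agent in agents])            # Pillar C
    kept = [c for c in candidates if c.verdict != "refute"]
    return sorted(kept, key=lambda c: c.score, reverse=True)

# Toy stand-ins for the three pillars (made-up scores and rules).
def toy_propose(history):
    return [
        Candidate("aortic_dilation", ["Marfan", "FBN1", "aortic_dilation"]),
        Candidate("ectopia_lentis", ["Marfan", "FBN1", "ectopia_lentis"]),
        Candidate("marfan_dx", ["Marfan"]),  # already in history, filtered out
    ]

def toy_score(history, code):
    return {"aortic_dilation": 0.7, "ectopia_lentis": 0.4}.get(code, 0.0)

agents = [
    lambda h, c: "support" if len(c.evidence) >= 3 else "abstain",
    lambda h, c: "refute" if c.code == "ectopia_lentis" else "support",
    lambda h, c: "abstain",
]

ranked = forecast(["marfan_dx"], toy_propose, toy_score, agents)
for c in ranked:
    print(c.code, c.score, c.verdict, " > ".join(c.evidence))
```

Each surviving candidate carries its evidence path alongside its score, so the ranked output stays traceable back to the graph.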
Run it on your own data: ADAPTING_TO_A_NEW_DATABASE.md shows how to instantiate `gemeo-<your-substrate>` on any MEDS v0.4.1 EHR in ~5 min of GPU time.
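To make the MEDS substrate concrete, here is a minimal sketch of what an event stream looks like and how it groups into per-patient trajectories. It assumes the core MEDS columns (`subject_id`, `time`, `code`, `numeric_value`); the `MedsEvent` type, the `to_trajectories` helper, and the code vocabulary are hypothetical and are not part of this repository.

```python
from datetime import datetime
from typing import Dict, List, Optional, TypedDict

class MedsEvent(TypedDict):
    subject_id: int
    time: Optional[datetime]        # None marks a static, time-independent fact
    code: str
    numeric_value: Optional[float]

def to_trajectories(events: List[MedsEvent]) -> Dict[int, List[MedsEvent]]:
    """Group events by subject and sort them: static events first, then by time."""
    by_subject: Dict[int, List[MedsEvent]] = {}
    for ev in events:
        by_subject.setdefault(ev["subject_id"], []).append(ev)
    for evs in by_subject.values():
        evs.sort(key=lambda e: (e["time"] is not None, e["time"] or datetime.min))
    return by_subject

# Made-up events with an illustrative code vocabulary.
events: List[MedsEvent] = [
    {"subject_id": 1, "time": datetime(2021, 3, 1), "code": "DX//Q87.4", "numeric_value": None},
    {"subject_id": 1, "time": None, "code": "SEX//F", "numeric_value": None},
    {"subject_id": 1, "time": datetime(2020, 1, 15), "code": "PROC//echocardiogram", "numeric_value": None},
    {"subject_id": 2, "time": datetime(2022, 5, 5), "code": "LAB//creatinine", "numeric_value": 0.9},
]

trajectories = to_trajectories(events)
print([e["code"] for e in trajectories[1]])
```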
| Task | GEMEO | Strong baseline | Margin |
|---|---|---|---|
| New-onset prediction (Top-1) | 53.7% | 38.2% (frequency) | +15.5 pp |
| Will-change (AUROC) | 0.906 | 0.889 (count-based) | +0.017 |
| Transition-within-12mo (AUROC) | 0.827 | 0.790 (count-based) | +0.037 |
| Treatment discontinuation (AUROC) | 0.838 | 0.696 (count-based) | +0.142 |
GEMEO leads on every novelty and long-context task in the table above. Because the recurrence-aware objective trains the model to predict novel events rather than repeats, these margins reflect real signal, not autocorrelation. The world model's learned representation pulls furthest ahead exactly where the 2026 EHR literature predicts it should: on context-rich tasks such as treatment discontinuation (dropout drives bad outcomes in rare disease), where it beats count-based methods by +0.142 AUROC (arXiv 2511.00782).
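The difference between predicting a repeat and predicting a novel event can be shown with a small sketch. This is only the evaluation-side idea (restrict the candidates to events absent from the patient's history); the actual recurrence-aware training loss is not specified here, and `top1`, `top1_new_onset`, and the toy scores are hypothetical.

```python
from typing import Dict, Optional, Sequence

def top1(scores: Dict[str, float]) -> Optional[str]:
    """Plain argmax over all candidate events."""
    return max(scores, key=scores.get) if scores else None

def top1_new_onset(scores: Dict[str, float], history: Sequence[str]) -> Optional[str]:
    """Argmax restricted to events the patient has not had yet."""
    seen = set(history)
    novel = {code: s for code, s in scores.items() if code not in seen}
    return top1(novel)

# Toy example: event "A" dominates the history, so it dominates naive scores.
history = ["A", "A", "B"]
scores = {"A": 0.60, "B": 0.25, "C": 0.10, "D": 0.05}

naive_pick = top1(scores)                     # a repeat of the history
novel_pick = top1_new_onset(scores, history)  # best event not seen so far
print(naive_pick, novel_pick)
```

A model that only learns to echo the history would score well under the naive pick and poorly under the restricted one, which is why new-onset metrics are the harder test.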
architecture/ GEMEO Architecture spec v1 + v2, diagram, gemeo_bench.py (conformance CLI)
pillars/ Pillar A (KG proposer) + Pillar C (swarm verifier) + demos
reference_impl/ the Diffusion-Forcing world-model code (AGPL-3.0)
benchmark/ RareBench-BR Trajectory v2 — datasheet, leaderboard, baselines
paper/ the paper (md + pdf) + figures
reproducers/ Modal scripts for every experiment (~$6 total GPU)
- Architecture + conformance suite: https://huggingface.co/Raras-AI/gemeo-arch
- Flagship model (recurrence-aware): https://huggingface.co/Raras-AI/gemeo-sus
- Application layer (6 inference modes): https://huggingface.co/Raras-AI/gemeo-twin-stack
- Benchmark (first rare-disease trajectory benchmark): https://huggingface.co/datasets/Raras-AI/rarebench-br-trajectory
- Diffusion-Forcing backbone with per-token σ
- Gated cross-attention to a real biomedical KG (PrimeKG)
- MEDS v0.4.1 event substrate
- Bootstrap-then-learn pattern per inference mode
- Bidirectional health-system grounding (PCDT / formulary)
- Audit-driven training
- (v2) Recurrence-aware onset objective (defeats autocorrelation)
- (v2) Competing-risks first-onset head
- (v2) KG zero-shot onset proposer
- (v2) Agentic verification with traceable evidence
Run `python architecture/gemeo_bench.py check <checkpoint>` to test conformance.
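As an illustration of the first element in the list above, the "per-token σ" idea from Diffusion Forcing is that every token in a sequence gets its own independently sampled noise level, rather than one level shared by the whole sequence. The sketch below assumes a variance-preserving parameterisation and a uniform σ schedule; both are choices made for the example, and `noise_per_token` is a hypothetical helper, not a function from the reference implementation.

```python
import math
import random
from typing import List, Optional, Tuple

def noise_per_token(
    tokens: List[List[float]],
    rng: random.Random,
    sigmas: Optional[List[float]] = None,
) -> Tuple[List[List[float]], List[float]]:
    """Corrupt each token embedding with its own noise level sigma in [0, 1)."""
    if sigmas is None:
        # Assumption: one independent uniform draw per token.
        sigmas = [rng.random() for _ in tokens]
    noisy = []
    for vec, s in zip(tokens, sigmas):
        keep = math.sqrt(1.0 - s * s)  # variance-preserving mix (assumed)
        noisy.append([keep * v + s * rng.gauss(0.0, 1.0) for v in vec])
    return noisy, sigmas

rng = random.Random(0)
tokens = [[1.0, 1.0, 1.0, 1.0] for _ in range(5)]  # 5 tokens, 4 dimensions each

noisy, sigmas = noise_per_token(tokens, rng)
clean, _ = noise_per_token(tokens, rng, sigmas=[0.0] * 5)  # sigma = 0 leaves tokens intact
print([round(s, 2) for s in sigmas])
```

With σ near 0 a token acts as clean context and with σ near 1 it is almost pure noise, so a single sequence can mix observed history with events still to be generated.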
Open the recipe, keep the spice. See LICENSING.md for full terms.
- Source code (architecture, pillars, reference implementation, reproducers): AGPL-3.0 — adopt and build on it freely, with attribution. A reference architecture only becomes a standard if people can use it.
- Model weights (`gemeo-sus`, …) and the RareBench-BR benchmark: CC-BY-NC 4.0 on Hugging Face. No commercial reuse of our trained artifacts/data without a separate agreement.
- Held back on purpose (not in this repo): the proprietary DATASUS ETL (raw SIH/SIA/APAC/SIM extraction, CNS-hash linkage, trajectory construction), the cohort/preprocessing heuristics that confer an advantage, and the future multimodal (Mayo) substrate. The public `meds_export.py` consumes an already-built trajectory file; it does not produce one.
@misc{gemeo_2026,
title = {GEMEO: Validated on Brazilian SUS Data},
author = {Timmers, Dimas and Kawassaki, Alexandre and the Raras AI team},
year = {2026},
url = {https://github.com/rarasAI/gemeo}
}
Builds on: Diffusion Forcing (Chen NeurIPS 2024), PrimeKG (Chandak Nature 2023), RAVEN (arXiv 2603.24562), CAMP (arXiv 2604.00085), DeepRare (Nature 2026), the clinical world-model rubric (arXiv 2511.16333).