Skills: Rare Disease

Gully Burns edited this page Apr 16, 2026 · 4 revisions

Archived. The rare disease functionality has been split into two skills: Skills: ALG Precision Therapeutics (mechanism investigation) and Skills: DisMech (disease mechanism knowledge graph). Content below is preserved for reference.

The /rare-disease skill builds a comprehensive 360° knowledge graph for any rare disease starting from a MONDO ID. It pulls curated phenotypes, causal genes, similar diseases, clinical trials, and drug candidates from Monarch Initiative, ClinicalTrials.gov, and ChEMBL. The agent synthesizes the mechanism, diagnostic criteria, therapeutic landscape, and research gaps.

Philosophy: Disease-Centric

This skill starts from a known disease (MONDO ID) and builds outward to a full 360° profile. This is the opposite of a patient-centric workflow (which starts from symptoms and works toward a diagnosis):

| Patient-centric | rare-disease |
|---|---|
| Patient case → diagnosis | MONDO ID → full disease profile |
| Symptom-driven | Disease-driven |
| Diagnostic reasoning chain | Phenome + genome + therapeutome |
| Single patient focus | Population-level disease characterization |

The skill uses a structured mechanism model (total-loss, partial-loss, gain-of-function, dominant-negative, toxification) defined in the TypeDB schema, making mechanism classification a first-class concept rather than free text.
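
As an illustration of what a first-class mechanism classification can look like on the client side, the following minimal Python sketch validates a label against the five mechanism types listed above before it is passed to `--mechanism-type`. The names `MechanismType` and `parse_mechanism_type` are hypothetical and are not taken from `rare_disease.py`; the authoritative definition remains the TypeDB schema.

```python
from enum import Enum


class MechanismType(Enum):
    """Hypothetical client-side mirror of the five mechanism types on this page."""

    TOTAL_LOSS = "total-loss"
    PARTIAL_LOSS = "partial-loss"
    GAIN_OF_FUNCTION = "gain-of-function"
    DOMINANT_NEGATIVE = "dominant-negative"
    TOXIFICATION = "toxification"


def parse_mechanism_type(label: str) -> MechanismType:
    """Normalize a free-text label and return the matching member, or raise ValueError."""
    # Accept spacing and underscore variants such as "Total Loss" or "total_loss".
    normalized = label.strip().lower().replace("_", "-").replace(" ", "-")
    try:
        return MechanismType(normalized)
    except ValueError:
        allowed = ", ".join(m.value for m in MechanismType)
        raise ValueError(
            f"unknown mechanism type {label!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown labels early keeps free-text variants out of the graph, which is the practical benefit of a closed vocabulary over free text.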

Data Sources

| Source | What It Provides | API |
|---|---|---|
| Monarch Initiative | HPO phenotypes, causal genes, SemSim phenotypic similarity | api.monarchinitiative.org |
| ClinicalTrials.gov | Active and completed trials by disease name | CTG API v2 |
| ChEMBL | Drug candidates targeting causal genes | chembl.ebi.ac.uk |

5-Phase Curation Workflow

Phase 1: Foraging — Identify the Disease

# Search by disease name to find the MONDO ID
uv run python .claude/skills/rare-disease/rare_disease.py search-disease \
    --query "NGLY1 deficiency"

Output: list of MONDO IDs + names. Pick the right one.
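
Before passing the chosen ID to the next phase, it can help to check that it is a well-formed MONDO CURIE (the prefix `MONDO:` followed by seven digits). The sketch below is a hypothetical helper, not part of the skill; it also accepts the underscore form (`MONDO_0800044`) that appears in ontology URLs.

```python
import re

# A MONDO CURIE is "MONDO:" followed by a seven-digit local identifier.
_MONDO_CURIE = re.compile(r"MONDO:\d{7}")


def normalize_mondo_id(raw: str) -> str:
    """Return the ID in canonical CURIE form, or raise ValueError if malformed."""
    text = raw.strip().upper().replace("MONDO_", "MONDO:")
    if not _MONDO_CURIE.fullmatch(text):
        raise ValueError(f"not a valid MONDO ID: {raw!r}")
    return text
```

This only checks the format; whether the ID exists is still determined by the `search-disease` results.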

Phase 2: Ingestion — Pull Curated Knowledge

# Initialize disease entity from MONDO ID
uv run python .claude/skills/rare-disease/rare_disease.py init-disease \
    --mondo-id "MONDO:0800044"
# → returns disease-id: rd-disease-abc123

DISEASE_ID=rd-disease-abc123

# Ingest HPO phenotypes from Monarch
uv run python .claude/skills/rare-disease/rare_disease.py ingest-phenotypes \
    --disease $DISEASE_ID

# Ingest causal and associated genes
uv run python .claude/skills/rare-disease/rare_disease.py ingest-genes \
    --disease $DISEASE_ID

# Ingest MONDO hierarchy (parent classes)
uv run python .claude/skills/rare-disease/rare_disease.py ingest-hierarchy \
    --disease $DISEASE_ID

# Ingest phenotypically similar diseases (requires ingest-phenotypes first)
uv run python .claude/skills/rare-disease/rare_disease.py ingest-similar \
    --disease $DISEASE_ID --limit 20

# Ingest clinical trials
uv run python .claude/skills/rare-disease/rare_disease.py ingest-clintrials \
    --disease $DISEASE_ID

# Ingest drug candidates via ChEMBL (requires ingest-genes first)
uv run python .claude/skills/rare-disease/rare_disease.py ingest-drugs \
    --disease $DISEASE_ID

Phase 3: Sensemaking — Agent Reads Artifacts

# List artifacts needing analysis
uv run python .claude/skills/rare-disease/rare_disease.py list-artifacts \
    --disease $DISEASE_ID

# Get the MONDO record artifact (raw Monarch JSON)
uv run python .claude/skills/rare-disease/rare_disease.py show-artifact \
    --id <artifact-id>

Ask the agent: "Analyze this MONDO record and extract: synonyms, inheritance pattern, age of onset, prevalence, OMIM/ORPHA cross-references, and any mentioned causal genes."

Phase 4: Analysis — Agent Synthesizes Notes

# Disease overview note
uv run python .claude/skills/rare-disease/rare_disease.py add-note \
    --about $DISEASE_ID \
    --type disease-overview \
    --name "NGLY1 Deficiency Overview" \
    --content "NGLY1 deficiency (MONDO:0800044) is an ultra-rare autosomal recessive..."

# Mechanism note (uses mechanism-type and functional-impact args)
uv run python .claude/skills/rare-disease/rare_disease.py add-note \
    --about $DISEASE_ID \
    --type mechanism \
    --mechanism-type total-loss \
    --functional-impact absence \
    --content "NGLY1 encodes N-glycanase 1, the only known enzyme that cleaves N-glycans..."

# Therapeutic landscape note
uv run python .claude/skills/rare-disease/rare_disease.py add-note \
    --about $DISEASE_ID \
    --type therapeutic-landscape \
    --content "No approved therapies. Clinical trials exploring proteasome pathway..."

Phase 5: Reporting — Query the Knowledge Graph

# Full disease profile
uv run python .claude/skills/rare-disease/rare_disease.py show-disease \
    --id $DISEASE_ID

# Phenotypes grouped by frequency tier
uv run python .claude/skills/rare-disease/rare_disease.py show-phenome \
    --id $DISEASE_ID --min-freq frequent

# Drug targets, indicated drugs, clinical trials
uv run python .claude/skills/rare-disease/rare_disease.py show-therapeutome \
    --id $DISEASE_ID

# Phenotypically similar diseases (SemSim)
uv run python .claude/skills/rare-disease/rare_disease.py show-similar \
    --id $DISEASE_ID

# MONDO hierarchy (parent classes, subtypes)
uv run python .claude/skills/rare-disease/rare_disease.py show-hierarchy \
    --id $DISEASE_ID

Sensemaking Workflows

Mechanism Analysis

After ingest-genes:

  1. Load the MONDO record: show-artifact --id <mondo-artifact-id>
  2. Ask the agent: "Based on this MONDO record and your knowledge of [gene], classify the mechanism of harm: total-loss, partial-loss, gain-of-function, dominant-negative, or toxification. What is the functional impact?"
  3. Store: add-note --type mechanism --mechanism-type total-loss --functional-impact absence

Therapeutic Landscape

After ingest-drugs + ingest-clintrials:

  1. show-therapeutome --id $DISEASE_ID — current drug + trial landscape
  2. Ask the agent: "Analyze the drug targets and clinical trials for [disease]. What is the therapeutic rationale? Are there approved therapies? What are the most promising investigational approaches?"
  3. Store: add-note --type therapeutic-landscape

Differential Diagnosis

After ingest-similar:

  1. show-similar --id $DISEASE_ID — phenotypically similar diseases
  2. show-phenome --id $DISEASE_ID --min-freq frequent — core phenotype profile
  3. Ask the agent: "Compare the phenotype of [disease] to these similar diseases. What distinguishing features separate them? What diagnostic tests would differentiate?"
  4. Store: add-note --type differential

Command Reference

Discovery Commands

| Command | Description | Key Args |
|---|---|---|
| search-disease | Search Monarch for diseases by name | --query, --limit |
| list-diseases | List all diseases in the KG | |

Ingestion Commands

| Command | Description | Dependencies |
|---|---|---|
| init-disease | Create disease entity from MONDO ID | |
| ingest-phenotypes | Pull HPO associations from Monarch | init-disease |
| ingest-genes | Pull causal/associated genes | init-disease |
| ingest-hierarchy | Parse MONDO parent classes | init-disease |
| ingest-similar | Phenotypic similarity via SemSim | ingest-phenotypes |
| ingest-clintrials | Query ClinicalTrials.gov | init-disease |
| ingest-drugs | Drug candidates via ChEMBL | ingest-genes |
| build-corpus | Print epmc-search commands for literature | init-disease |
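
The dependencies above form a small directed acyclic graph, so a valid run order can be derived by topological sorting. The following sketch uses Python's standard `graphlib` module (Python 3.9+); the `DEPENDENCIES` mapping is copied from the table, while `ingestion_order` is a hypothetical helper and not a command provided by the skill.

```python
from graphlib import TopologicalSorter

# Command -> prerequisite commands, copied from the Dependencies column above.
DEPENDENCIES = {
    "init-disease": set(),
    "ingest-phenotypes": {"init-disease"},
    "ingest-genes": {"init-disease"},
    "ingest-hierarchy": {"init-disease"},
    "ingest-similar": {"ingest-phenotypes"},
    "ingest-clintrials": {"init-disease"},
    "ingest-drugs": {"ingest-genes"},
    "build-corpus": {"init-disease"},
}


def ingestion_order(targets):
    """Return the targets plus all transitive prerequisites in a valid run order."""
    needed = set()
    stack = list(targets)
    while stack:
        cmd = stack.pop()
        if cmd not in DEPENDENCIES:
            raise KeyError(f"unknown command: {cmd}")
        if cmd not in needed:
            needed.add(cmd)
            stack.extend(DEPENDENCIES[cmd])
    # TopologicalSorter expects node -> predecessors and yields predecessors first.
    graph = {cmd: DEPENDENCIES[cmd] for cmd in needed}
    return list(TopologicalSorter(graph).static_order())


print(ingestion_order(["ingest-drugs"]))
# → ['init-disease', 'ingest-genes', 'ingest-drugs']
```

Commands with no dependency between them (for example `ingest-hierarchy` and `ingest-clintrials`) may appear in either order.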

Query Commands

| Command | Description | Key Args |
|---|---|---|
| show-disease | Full disease profile | --id |
| show-phenome | Phenotypes by frequency tier | --id, --min-freq |
| show-therapeutome | Drugs, targets, trials | --id |
| show-similar | Similar diseases by SemSim score | --id |
| show-hierarchy | MONDO parent/child classes | --id |
| list-artifacts | List artifacts | --disease |
| show-artifact | Get artifact content | --id |

Annotation Commands

| Command | Description | Key Args |
|---|---|---|
| add-note | Create interpretive note | --about, --type, --content |
| tag | Add tag to any entity | --entity, --tag |
| search-tag | Find entities by tag | --tag |

Note Types

| Type | Purpose |
|---|---|
| disease-overview | High-level disease summary |
| mechanism | Mechanism of harm (use --mechanism-type, --functional-impact) |
| phenotypic-spectrum | Phenotype variability analysis |
| diagnostic-criteria | Diagnostic criteria synthesis |
| differential | Differential diagnosis |
| therapeutic-landscape | Drug and trial landscape |
| expert-landscape | Research groups and experts |
| research-gaps | Open questions |
| natural-history | Progression, prognosis, survival |

HPO Frequency Qualifiers

| HP Code | Label | Frequency |
|---|---|---|
| HP:0040280 | obligate | 100% |
| HP:0040281 | very-frequent | 80–99% |
| HP:0040282 | frequent | 30–79% |
| HP:0040283 | occasional | 5–29% |
| HP:0040284 | very-rare | 1–4% |
| HP:0040285 | excluded | 0% |
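
The `show-phenome --min-freq frequent` option treats these qualifiers as an ordered scale. Below is a minimal sketch of that kind of thresholding, assuming the filter is inclusive (a tier passes if it is at least as frequent as the threshold). `tiers_at_or_above` and `filter_phenotypes` are hypothetical names rather than functions from `rare_disease.py`, and only the four highest tiers are modeled here.

```python
# Frequency tiers from most to least frequent, using labels from the table above.
# Only the four highest tiers are listed; lower tiers would be appended the same way.
TIERS_DESCENDING = ["obligate", "very-frequent", "frequent", "occasional"]


def tiers_at_or_above(min_freq: str) -> list[str]:
    """Return every tier that is at least as frequent as min_freq."""
    if min_freq not in TIERS_DESCENDING:
        raise ValueError(f"unknown frequency tier: {min_freq!r}")
    cutoff = TIERS_DESCENDING.index(min_freq)
    return TIERS_DESCENDING[: cutoff + 1]


def filter_phenotypes(phenotypes, min_freq):
    """Keep (hpo_id, tier) pairs whose tier passes the inclusive threshold."""
    keep = set(tiers_at_or_above(min_freq))
    return [(hpo_id, tier) for hpo_id, tier in phenotypes if tier in keep]


print(tiers_at_or_above("frequent"))
# → ['obligate', 'very-frequent', 'frequent']
```

Phenotypes whose tier is missing or not in the list are dropped by `filter_phenotypes`; a real implementation would need to decide how to treat unannotated frequencies.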

TypeDB Schema Highlights

Key entity types in namespaces/rd.tql:

domain-thing
├── rd-disease          # MONDO disease (mondo-id, omim-id, orpha-id, inheritance-pattern)
├── rd-gene             # Causal/associated gene (gene-symbol, hgnc-id, ensembl-id)
├── rd-protein          # Protein product (uniprot-id)
├── rd-phenotype        # HPO phenotype (hpo-id, hpo-label)
├── rd-drug             # Therapeutic compound (chembl-id, mechanism-of-action, development-stage)
├── rd-clinical-trial   # Clinical trial (nct-id, trial-phase, trial-status)
└── rd-disease-model    # Experimental model (model-type, model-species)

collection
├── rd-investigation    # Investigation container
└── rd-patient-cohort   # Set of patients sharing criteria

Key relations:

| Relation | Roles | Key Attributes |
|---|---|---|
| rd-disease-has-phenotype | disease, phenotype | frequency-qualifier, evidence-code |
| rd-gene-causes-disease | gene, disease | confidence |
| rd-gene-associated-with | gene, disease | association-type, confidence |
| rd-disease-subclass-of | child-disease, parent-disease | |
| rd-disease-similar-to | disease-a, disease-b | similarity-score |
| rd-drug-targets | drug, target-gene, target-protein | mechanism-of-action |
| rd-drug-indicated-for | drug, indication | development-stage |
| rd-trial-studies | trial, disease | |

TypeDB Query Examples

# All diseases with their causal genes
match
    $d isa rd-disease;
    (gene: $g, disease: $d) isa rd-gene-causes-disease;
fetch {
    "disease": $d.name,
    "mondo_id": $d.rd-mondo-id,
    "gene": $g.rd-gene-symbol
};

# Phenotypes for a disease, filtered to frequent+
match
    $d isa rd-disease, has rd-mondo-id "MONDO:0800044";
    (disease: $d, phenotype: $p) isa rd-disease-has-phenotype,
        has rd-frequency-qualifier $freq;
    { $freq == "obligate"; } or { $freq == "very-frequent"; } or { $freq == "frequent"; };
fetch {
    "hpo_id": $p.rd-hpo-id,
    "label": $p.rd-hpo-label,
    "frequency": $freq
};

# Drugs targeting causal genes of a disease
match
    $d isa rd-disease, has id "<disease-id>";
    (gene: $g, disease: $d) isa rd-gene-causes-disease;
    (drug: $dr, target-gene: $g) isa rd-drug-targets;
fetch {
    "drug": $dr.name,
    "chembl_id": $dr.rd-chembl-id,
    "target_gene": $g.rd-gene-symbol,
    "moa": $dr.rd-mechanism-of-action
};

Cross-Skill Integration

Build a Literature Corpus

build-corpus prints ready-to-run epmc-search CLI commands for 3–5 targeted literature queries (disease overview, gene mechanism, gene therapy, natural history). It does not run the searches itself; copy and paste the printed commands to execute them:

uv run python .claude/skills/rare-disease/rare_disease.py build-corpus \
    --disease $DISEASE_ID
# → prints epmc-search commands to run

End-to-End Example: NGLY1 Deficiency

# 1. Find the disease
uv run python .claude/skills/rare-disease/rare_disease.py search-disease \
    --query "NGLY1 deficiency"
# → MONDO:0800044

# 2. Initialize
uv run python .claude/skills/rare-disease/rare_disease.py init-disease \
    --mondo-id "MONDO:0800044"
# → disease_id: rd-disease-<hash>

DISEASE_ID="rd-disease-<hash>"  # replace with the ID printed by init-disease

# 3. Ingest all knowledge sources
uv run python .claude/skills/rare-disease/rare_disease.py ingest-phenotypes --disease $DISEASE_ID
# → ~167 phenotypes (alacrima, global developmental delay, seizures...)

uv run python .claude/skills/rare-disease/rare_disease.py ingest-genes --disease $DISEASE_ID
# → NGLY1 as causal gene (HGNC:17646)

uv run python .claude/skills/rare-disease/rare_disease.py ingest-hierarchy --disease $DISEASE_ID
# → congenital disorder of deglycosylation → rare disease

uv run python .claude/skills/rare-disease/rare_disease.py ingest-similar --disease $DISEASE_ID
# → similar diseases by HPO profile

uv run python .claude/skills/rare-disease/rare_disease.py ingest-clintrials --disease $DISEASE_ID
# → active trials (including Grace Science Foundation trial)

uv run python .claude/skills/rare-disease/rare_disease.py ingest-drugs --disease $DISEASE_ID
# → drug candidates targeting NGLY1

# 4. Build literature corpus
uv run python .claude/skills/rare-disease/rare_disease.py build-corpus --disease $DISEASE_ID
# → epmc-search commands to copy-paste

# 5. Query and report
uv run python .claude/skills/rare-disease/rare_disease.py show-disease --id $DISEASE_ID
uv run python .claude/skills/rare-disease/rare_disease.py show-phenome --id $DISEASE_ID --min-freq frequent
uv run python .claude/skills/rare-disease/rare_disease.py show-therapeutome --id $DISEASE_ID

API Rate Limits and Caching

| API | Auth | Rate Limit / Caching |
|---|---|---|
| Monarch Initiative | None | Reasonable for research use |
| ClinicalTrials.gov | None | 10 req/sec |
| ChEMBL | None | Responses >50KB cached to ~/.alhazen/cache/json/ |
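
For scripts that call these APIs directly, a client-side throttle is one way to stay under a limit such as the 10 req/sec listed for ClinicalTrials.gov. The sketch below is a hypothetical helper, not code from the skill; it spaces calls at least `1 / max_per_second` apart, and the `clock` and `sleep` parameters are injectable only to make it testable.

```python
import time


class MinIntervalThrottle:
    """Space successive calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self._interval
        return max(delay, 0.0)


throttle = MinIntervalThrottle(max_per_second=10)
throttle.wait()  # call before each request to the rate-limited API
```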

Large API responses are automatically cached to disk and referenced via cache-path in TypeDB rather than stored inline.
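
The caching behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: the 50KB threshold is read as 50 × 1024 bytes, the content-hash file name is invented for the example, and `store_response` is a hypothetical function name. The `cache-path` key mirrors the attribute name mentioned above.

```python
import hashlib
import json
from pathlib import Path

# Assumption: ">50KB" is interpreted as more than 50 * 1024 bytes of encoded JSON.
CACHE_THRESHOLD_BYTES = 50 * 1024


def store_response(payload, cache_dir="~/.alhazen/cache/json"):
    """Return {"content": ...} for small payloads, {"cache-path": ...} for large ones."""
    text = json.dumps(payload, sort_keys=True)
    data = text.encode("utf-8")
    if len(data) <= CACHE_THRESHOLD_BYTES:
        return {"content": text}  # small enough to store inline
    directory = Path(cache_dir).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    # Content-addressed file name (an assumption; the real naming scheme is not shown here).
    path = directory / f"{hashlib.sha256(data).hexdigest()[:16]}.json"
    path.write_bytes(data)
    return {"cache-path": str(path)}
```

Storing only a path for large responses keeps the graph small while leaving the raw JSON available for the sensemaking phase.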
