fix(health): recognise slim HF mirror for chrombpnet + add alphagenome_pt probe#77

Merged
lucapinello merged 1 commit into main from fix/health-probe-recognize-slim-mirror-and-pt
Apr 30, 2026
@lucapinello
Contributor

fix(health): recognise slim HF mirror for chrombpnet + add alphagenome_pt probe

Two stale checks in chorus/core/weights_probe.py were causing chorus health to falsely report healthy 0.4.0 installs as ⚠ Not installed.

Symptom

On a fresh chorus setup --oracle all against 0.4.0:

✓ alphagenome: Healthy
⚠ alphagenome_pt: Not installed — run `chorus setup alphagenome_pt`
✓ borzoi: Healthy
⚠ chrombpnet: Not installed — run `chorus setup chrombpnet`
✓ enformer: Healthy
✓ legnet: Healthy
✓ sei: Healthy

…even though both flagged oracles load and predict cleanly (oracle.load_pretrained_model() succeeds in <8 s, end-to-end oracle._predict() returns the expected outputs). I.e. the runtime path is fine; only the probe is stale.

Root cause

  1. chrombpnet probe (_probe_chrombpnet) was looking at the legacy ENCODE-tarball-extracted directory:

    default = CHORUS_DOWNLOADS_DIR / "chrombpnet" / "DNASE_K562"
    if not default.exists() or not any(default.iterdir()):
        return (False, [str(default)])

    In 0.3.0+ that directory is empty by default: chrombpnet weights stream from the slim HF mirror at lucapinello/chorus-chrombpnet-slim (see #59, "ChromBPNet HF slim mirror + chrombpnet_nobias default (0.3.0)", and #60, "fix(chrombpnet): F1 - huggingface_hub missing in env yaml + v30 scorched-earth audit"). The legacy directory only gets populated if a user explicitly requests model_type='chrombpnet' (bias-aware) or fold ≠ 0, both of which fall through to ENCODE.

  2. alphagenome_pt was missing from _ARTIFACT_PROBES entirely. The oracle was added to the registry in #62 ("AlphaGenome PyTorch backend (opt-in spike)"), but the health probe was never updated. The probe dispatcher therefore had no entry for it, and every lookup fell through to the missing-marker code path.
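
To make the second root cause concrete, here is a minimal sketch of a dict-based probe dispatcher. Only the name _ARTIFACT_PROBES and the (ok, paths) return shape come from this PR; the function bodies, the registry tuple, and the fallback are hypothetical stand-ins, and the real missing-marker code path is not reproduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

ProbeResult = Tuple[bool, List[str]]  # (installed?, paths that were checked)


def _probe_enformer() -> ProbeResult:
    # Hypothetical stand-in: pretend the artifacts were found.
    return (True, ["<enformer weights dir>"])


# Oracle name -> probe function. An oracle with no entry here cannot be
# reported as healthy, however well it works at runtime.
_ARTIFACT_PROBES: Dict[str, Callable[[], ProbeResult]] = {
    "enformer": _probe_enformer,
}


def probe(oracle: str) -> ProbeResult:
    fn = _ARTIFACT_PROBES.get(oracle)
    if fn is None:
        # Assumption: modelled as a plain "not installed" result.
        return (False, [])
    return fn()


def oracles_without_probe(registry: Iterable[str]) -> List[str]:
    """List registered oracles that have no probe entry."""
    return sorted(set(registry) - set(_ARTIFACT_PROBES))
```

A check along the lines of oracles_without_probe(...) is one possible way to catch a newly registered oracle whose probe was forgotten; it is not part of this PR.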

Fix

  1. _probe_chrombpnet: accept either the slim-mirror snapshot cache (manifest.json present under models--lucapinello--chorus-chrombpnet-slim/snapshots/<rev>/) or the legacy ENCODE directory.
  2. New _probe_alphagenome_pt: checks the HF cache at models--gtca--alphagenome_pytorch/snapshots/<rev>/*.safetensors (the upstream port's published weights).
  3. New _hf_cache_dir() helper: both new probes route through it so they honour HF_HOME / HF_HUB_CACHE env vars via huggingface_hub.constants.HF_HUB_CACHE, with a defensive fallback to ~/.cache/huggingface/hub.
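
The three fix items can be sketched roughly as follows. This is an illustration, not the merged code: the function names drop the leading underscore to mark them as stand-ins, the cache and legacy_dir parameters are added so the sketch can be exercised without a real cache, and the (ok, paths) return shape is assumed from the old snippet quoted under Root cause.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def hf_cache_dir() -> Path:
    """Resolve the HF hub cache, with a fallback to the documented default."""
    try:
        # HF_HUB_CACHE already reflects the HF_HOME / HF_HUB_CACHE env vars.
        from huggingface_hub import constants
        return Path(constants.HF_HUB_CACHE)
    except Exception:  # huggingface_hub missing or unusable
        return Path.home() / ".cache" / "huggingface" / "hub"


def snapshot_contains(repo_dir: str, pattern: str,
                      cache: Optional[Path] = None) -> Tuple[bool, Path]:
    """True if any snapshots/<rev>/ directory holds a file matching pattern."""
    snapshots = (cache or hf_cache_dir()) / repo_dir / "snapshots"
    found = snapshots.is_dir() and any(snapshots.glob(f"*/{pattern}"))
    return found, snapshots


def probe_chrombpnet(legacy_dir: Path,
                     cache: Optional[Path] = None) -> Tuple[bool, List[str]]:
    # Accept either the slim-mirror snapshot or a non-empty legacy directory.
    slim_ok, slim_path = snapshot_contains(
        "models--lucapinello--chorus-chrombpnet-slim", "manifest.json", cache)
    legacy_ok = legacy_dir.is_dir() and any(legacy_dir.iterdir())
    return (slim_ok or legacy_ok, [str(slim_path), str(legacy_dir)])


def probe_alphagenome_pt(cache: Optional[Path] = None) -> Tuple[bool, List[str]]:
    ok, path = snapshot_contains(
        "models--gtca--alphagenome_pytorch", "*.safetensors", cache)
    return (ok, [str(path)])
```

Globbing over snapshots/*/ avoids hard-coding a revision hash, so any cached revision counts as installed.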

Verification

End-to-end on macOS arm64:

$ chorus health
✓ alphagenome: Healthy
✓ alphagenome_pt: Healthy
✓ borzoi: Healthy
✓ chrombpnet: Healthy
✓ enformer: Healthy
✓ legnet: Healthy
✓ sei: Healthy

7/7 ✓, was 5/7 before.

pytest -m "not integration and not slow": 368 passed, 1 skipped, 5 deselected (no regression).

Out of scope

  • A regression test for the probe logic. Would need either an integration-marked test that actually invokes chorus health after setup, or a unit test that mocks the HF cache path. Tracked as a follow-up; the fix itself is small enough to inspect by reading.
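
For the unit-test option, one possible shape is sketched below. It builds a fake HF cache layout in a temporary directory and swaps out the cache-dir helper. The two functions at the top are hypothetical stand-ins for the helpers in chorus/core/weights_probe.py; a real test would import that module and patch its _hf_cache_dir attribute instead (for example with pytest's monkeypatch.setattr). The revision name and weights filename are made up.

```python
from pathlib import Path
from typing import List, Tuple
from unittest import mock


# Hypothetical stand-ins for the real helpers under test.
def _hf_cache_dir() -> Path:
    return Path.home() / ".cache" / "huggingface" / "hub"


def _probe_alphagenome_pt() -> Tuple[bool, List[str]]:
    snapshots = _hf_cache_dir() / "models--gtca--alphagenome_pytorch" / "snapshots"
    found = snapshots.is_dir() and any(snapshots.glob("*/*.safetensors"))
    return (found, [str(snapshots)])


def make_fake_snapshot(cache: Path, repo_dir: str, filename: str,
                       rev: str = "abc123") -> Path:
    """Create <cache>/<repo_dir>/snapshots/<rev>/<filename> as an empty file."""
    target = cache / repo_dir / "snapshots" / rev
    target.mkdir(parents=True)
    (target / filename).touch()
    return target


def probe_with_cache(cache: Path) -> Tuple[bool, List[str]]:
    # Patch the helper rather than the environment: huggingface_hub reads
    # HF_HOME / HF_HUB_CACHE into constants when it is first imported.
    with mock.patch.dict(globals(), {"_hf_cache_dir": lambda: cache}):
        return _probe_alphagenome_pt()
```

A test would then assert that probe_with_cache(tmp) is falsy on an empty directory and truthy after make_fake_snapshot(tmp, "models--gtca--alphagenome_pytorch", "model.safetensors").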

🤖 Generated with Claude Code

…e_pt probe

Two stale checks in `chorus/core/weights_probe.py` were causing
`chorus health` to falsely report healthy 0.4.0 installs as
"⚠ Not installed":

1. **chrombpnet probe** only looked at the legacy
   `downloads/chrombpnet/DNASE_K562/` ENCODE-tarball-extracted
   directory. In 0.3.0+ that directory is empty by default — chrombpnet
   weights stream from the slim HF mirror at
   `models--lucapinello--chorus-chrombpnet-slim` (#59 / #60). Fix:
   accept *either* the slim-mirror snapshot cache (manifest.json
   present) *or* the legacy ENCODE directory. The runtime path was
   already 0.3.0-correct; only the probe was stale.

2. **alphagenome_pt** was missing from `_ARTIFACT_PROBES` entirely
   (only added to the oracle registry in #62; the health probe was
   never updated). Fix: new `_probe_alphagenome_pt()` checks the HF
   cache at `models--gtca--alphagenome_pytorch/snapshots/<rev>/
   *.safetensors`.

Also factored a small helper `_hf_cache_dir()` so both new probes
honour `HF_HOME` / `HF_HUB_CACHE` env vars via
`huggingface_hub.constants.HF_HUB_CACHE`, with a defensive fallback
to the documented default `~/.cache/huggingface/hub`.

Verified end-to-end on this machine: a fresh `chorus health` now
reports 7/7 ✓ Healthy, where before it reported 5/7 with two
false-alarm warnings on chrombpnet + alphagenome_pt.

`pytest -m "not integration and not slow"`: 368 passed, 1 skipped,
5 deselected (no regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lucapinello lucapinello merged commit 4d307ce into main Apr 30, 2026
1 check passed
@lucapinello lucapinello deleted the fix/health-probe-recognize-slim-mirror-and-pt branch April 30, 2026 21:03