Add multi-architecture support for ARM macOS #4
Merged
lucapinello merged 3 commits into main from … on Mar 18, 2026
Detect system architecture at runtime and adapt oracle environment YAML configs before creating conda environments. This allows the canonical Linux x86_64 YAML files to work on Apple Silicon by substituting incompatible packages (e.g. TensorFlow 2.8 -> 2.15.1, removing CUDA packages, pre-building igraph/leidenalg via conda).

- New chorus/core/platform.py: PlatformInfo detection, declarative adaptation rules per oracle+platform, YAML config transformer
- Modified manager.py: applies adaptations in create_environment(), runs post-install pip steps (e.g. modisco-lite --no-deps)
- Adaptations defined for chrombpnet, enformer, borzoi, sei, legnet on macos_arm64; all other oracle+platform combos pass through unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
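The detect-then-adapt flow described in this commit can be sketched roughly as below. This is a minimal illustration, not the contents of chorus/core/platform.py: the `RULES` table, its `replace`/`remove` layout, the `adapt_config` and `detect_platform` names, and the choice of enformer as the example oracle are all assumptions made here. It works on an already-parsed environment config (a plain dict) so that it needs only the standard library.

```python
import platform
import re
from copy import deepcopy
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformInfo:
    system: str   # e.g. "Darwin" or "Linux", as returned by platform.system()
    machine: str  # e.g. "arm64" or "x86_64", as returned by platform.machine()

    @property
    def key(self) -> str:
        # Normalise OS and architecture names into a key such as "macos_arm64".
        os_name = {"Darwin": "macos", "Linux": "linux"}.get(self.system, self.system.lower())
        arch = {"aarch64": "arm64", "AMD64": "x86_64"}.get(self.machine, self.machine)
        return f"{os_name}_{arch}"


def detect_platform() -> PlatformInfo:
    return PlatformInfo(platform.system(), platform.machine())


# Hypothetical rule table keyed by (oracle, platform key). Illustrative only.
RULES = {
    ("enformer", "macos_arm64"): {
        "replace": {"tensorflow": "tensorflow=2.15.1"},
        "remove": {"cudatoolkit"},
    },
}


def adapt_config(config: dict, oracle: str, info: PlatformInfo) -> dict:
    """Return a platform-adapted copy of a parsed conda environment config."""
    rules = RULES.get((oracle, info.key))
    if rules is None:
        return config  # no rule for this combination: pass through unchanged
    adapted = deepcopy(config)
    deps = []
    for dep in adapted.get("dependencies", []):
        if not isinstance(dep, str):
            deps.append(dep)  # nested entries such as {"pip": [...]} are kept as-is
            continue
        name = re.split(r"[=<>!~\s]", dep, maxsplit=1)[0]
        if name in rules.get("remove", ()):
            continue
        deps.append(rules.get("replace", {}).get(name, dep))
    adapted["dependencies"] = deps
    return adapted
```

Keeping the rules in a lookup table means the canonical config is never edited on disk; a combination with no entry is returned untouched, which matches the pass-through behaviour the commit message describes.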
- Fix _pkg_name to handle conda single '=' version pins (e.g. cudatoolkit=11.7)
- Remove 'pytorch' conda channel for sei/borzoi/legnet on macOS ARM (pytorch packages available on conda-forge; pytorch channel blocked on some networks)
- Relax sei PyTorch <2.0 upper bound on ARM (PyTorch 2.x is compatible)
- Add setuptools<81 for enformer (tensorflow_hub needs pkg_resources)

Tested all 5 oracles on Apple Silicon:
✓ borzoi: Healthy + prediction OK
✓ chrombpnet: Healthy + prediction OK
✓ enformer: Healthy + prediction OK
✓ legnet: Healthy + prediction OK
✓ sei: Healthy + prediction OK

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
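The single '=' pin issue is easy to see in isolation: pip-style specs use '==', while conda also accepts a single '=' or a space between name and version. The helper below is a sketch of one way to extract the package name from either form. It is not the project's actual `_pkg_name`; the function name `pkg_name` and the regular expression are assumptions for illustration.

```python
import re

# A version operator (=, ==, >=, <=, !=, ~=, <, >) or bare whitespace
# marks the end of the package name.
_VERSION_SPLIT = re.compile(r"\s*(?:[=<>!~]=?|\s)")


def pkg_name(spec: str) -> str:
    """Return the package name from a conda- or pip-style dependency spec."""
    return _VERSION_SPLIT.split(spec.strip(), maxsplit=1)[0].lower()


for spec in ("cudatoolkit=11.7", "tensorflow==2.8.0", "numpy>=1.21", "python"):
    print(spec, "->", pkg_name(spec))
```

A splitter that only looks for '==' would return the whole string for `cudatoolkit=11.7`, so a removal rule keyed on `cudatoolkit` would silently never match.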
End-to-end tests that instantiate each oracle in its conda environment, load a pretrained model, and run predict() on a genomic region from chr1. Covers chrombpnet, enformer, borzoi, sei, and legnet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lucapinello added a commit that referenced this pull request on Apr 15, 2026
The 2026-04-16 deep application audit proposed 6 fixes; commit 5ebb328 implemented 5 of them. This commit adds the missing LOW-priority Fix #4: a short application note on why AlphaGenome DNASE and ChromBPNet ATAC can report different effects for the same variant in the same cell type.

Three reasons documented:
- different training data (DNase vs Tn5)
- different receptive fields (1 Mb vs 2 kb)
- different effect aggregation (binned sum vs peak height)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
lucapinello pushed a commit that referenced this pull request on Apr 17, 2026
Fresh-install audit at e99fd66 verifying all 4 v10 fixes on a truly clean slate. Teardown: 14.2 GB including tfhub_modules/ this time.

All 4 v10 fixes verified live:
- Fix #1 (tfhub recovery): code path exists + first-install smoke passes on wiped tfhub cache.
- Fix #2 (IGV HF fallback): 0/16 HTMLs fell back to CDN on the same SSL-MITM network that had 6/16 fallbacks in v10.
- Fix #3 (FTO README): accurate HepG2 framing + adipose assay_ids block for the ideal run.
- Fix #4 (bgzip PATH): 0 'bgzip is not installed' lines across 235 notebook cells (v10 had 20/34/60 per notebook).

One minor regression exposed: Fix #4 makes tabix findable, which reveals a pre-existing bug where download_gencode leaves a stale .tbi file that coolbox's `tabix -p gff` rejects with "index file exists". Workaround = delete .tbi; NB1 retry succeeded. Proposed 3-line follow-up fix to annotations.py documented in the report.

Also verified:
- 308/308 pytest on fresh env (17.3 s)
- 6/6 oracle smoke (7 min 2 s), the first Enformer fresh-install with wiped tfhub cache
- 12/12 regen within AlphaGenome CPU non-determinism tolerance
- 0 orphan HTMLs after parallel regen
- 3 notebooks: 0 errors, 0 warnings, 0 bgzip spam
- 16/16 HTMLs clean in Selenium
- FTO README spot-check confirms Fix #3 committed correctly

After 11 audit passes, the last two have surfaced no actual chorus bugs, only environmental quirks (tfhub cache, SSL MITM, PATH inheritance, stale .tbi).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lucapinello pushed a commit that referenced this pull request on Apr 17, 2026
…efreshes GTF

v11 audit exposed a pre-existing bug masked by the pre-Fix-#4 state where tabix was not on PATH: when download_annotation refreshes a GTF, leftover coolbox artefacts (file.gtf.bgz + file.gtf.bgz.tbi from a previous session) point at byte offsets in the old .bgz that no longer match the new one. coolbox then calls tabix -p gff file.bgz on its next GTF() read, tabix refuses to overwrite without -f, and the notebook cell crashes with:

CalledProcessError: Command '['tabix', '-p', 'gff', ...]' returned non-zero exit status 1.

Fix: in AnnotationManager.download_annotation, after sort_annotation writes the fresh GTF, unlink any stale .bgz / .bgz.tbi / .gz.tbi sharing the same stem. coolbox then regenerates them cleanly on first GTF() call. Three extra unlink() calls on a not-hot path.

Unit test: TestStaleGTFIndexCleanup in tests/test_error_recovery.py mocks requests.get + sort_annotation, primes the annotations dir with stale .bgz and .tbi, and verifies both are removed after download_annotation returns.

Verified: pytest -m "not integration" → 309 passed (was 308).

Without this fix, a notebook run after any annotation refresh (download_gencode() called twice across sessions, or a newer GENCODE version pulled) hits the tabix error on first coolbox visualization cell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
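The cleanup step described in this commit can be sketched as follows. This is an illustration under stated assumptions rather than the code in AnnotationManager.download_annotation: the function name `remove_stale_indexes` is hypothetical, and the sketch assumes the derived files are named by appending the suffix to the full GTF file name (as in file.gtf.bgz and file.gtf.bgz.tbi above).

```python
from pathlib import Path

# Derived artefacts that can go stale when the GTF itself is rewritten.
STALE_SUFFIXES = (".bgz", ".bgz.tbi", ".gz.tbi")


def remove_stale_indexes(gtf_path: Path) -> list[Path]:
    """Delete leftover compressed/index files next to a freshly written GTF.

    Returns the paths that were actually removed.
    """
    removed = []
    for suffix in STALE_SUFFIXES:
        # Assumption: "file.gtf" -> "file.gtf.bgz", "file.gtf.bgz.tbi", ...
        candidate = gtf_path.with_name(gtf_path.name + suffix)
        if candidate.exists():
            candidate.unlink()
            removed.append(candidate)
    return removed
```

Called right after the fresh GTF is written, this leaves the GTF itself in place and removes only the derived files, so the next reader rebuilds the compressed copy and its index from the current content.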
This was referenced Apr 17, 2026
Summary
- chorus/core/platform.py module detects system architecture (ARM/x86_64), OS (macOS/Linux), and CUDA availability at runtime
- EnvironmentManager.create_environment() now adapts conda YAML configs on the fly for the detected platform: no separate YAML files needed, Linux x86_64 configs remain the canonical source
- … (tests/test_smoke_predict.py)

Key design decisions
- … PlatformInfo.key matches (e.g., macos_arm64)
- … PLATFORM_ADAPTATIONS dict
- … (pip install --no-deps for modisco-lite on ARM)
- … anaconda.org by removing pytorch/nvidia channels (conda-forge has ARM builds)

Test plan
- … predict() on genomic regions
- pytest tests/test_smoke_predict.py -v -s: 5/5 passed

🤖 Generated with Claude Code