Add AlphaGenome oracle with cross-platform support #5
Merged
lucapinello merged 10 commits into main on Mar 18, 2026
Detect system architecture at runtime and adapt oracle environment YAML configs before creating conda environments. This allows the canonical Linux x86_64 YAML files to work on Apple Silicon by substituting incompatible packages (e.g. TensorFlow 2.8 -> 2.15.1, removing CUDA packages, pre-building igraph/leidenalg via conda).

- New chorus/core/platform.py: PlatformInfo detection, declarative adaptation rules per oracle+platform, YAML config transformer
- Modified manager.py: applies adaptations in create_environment(), runs post-install pip steps (e.g. modisco-lite --no-deps)
- Adaptations defined for chrombpnet, enformer, borzoi, sei, legnet on macos_arm64; all other oracle+platform combos pass through unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
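A minimal sketch of the adaptation idea described in this commit, not the actual contents of chorus/core/platform.py. The `PlatformInfo` name comes from the commit message; its fields, the `RULES` table, the `demo_oracle` entry, and `adapt_config` are hypothetical and only illustrate a declarative rule lookup keyed by oracle and platform.

```python
import platform
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformInfo:
    system: str   # e.g. "Darwin" or "Linux"
    machine: str  # e.g. "arm64" or "x86_64"

    @property
    def key(self) -> str:
        # Collapse (system, machine) into the short keys used by the rule table.
        if self.system == "Darwin" and self.machine == "arm64":
            return "macos_arm64"
        if self.system == "Linux" and self.machine == "x86_64":
            return "linux_x86_64"
        return "other"


def detect_platform() -> PlatformInfo:
    return PlatformInfo(platform.system(), platform.machine())


# Hypothetical rule table: (oracle, platform key) -> edits to apply.
RULES = {
    ("demo_oracle", "macos_arm64"): {
        "replace": {"tensorflow": "tensorflow==2.15.1"},
        "remove": {"cudatoolkit"},
    },
}


def _name(spec: str) -> str:
    # Package name is everything before the first version operator or space.
    return re.split(r"[=<>!~\s]", spec, maxsplit=1)[0]


def adapt_config(config: dict, oracle: str, info: PlatformInfo) -> dict:
    rule = RULES.get((oracle, info.key))
    if rule is None:
        return config  # no rule: pass through unchanged
    deps = []
    for dep in config.get("dependencies", []):
        if not isinstance(dep, str):
            deps.append(dep)  # nested sections such as {"pip": [...]} are kept
            continue
        if _name(dep) in rule["remove"]:
            continue
        deps.append(rule["replace"].get(_name(dep), dep))
    return {**config, "dependencies": deps}
```

Keeping the rules as data rather than code means a new oracle or platform only needs a new table entry, and combinations without an entry are returned untouched.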
- Fix _pkg_name to handle conda single '=' version pins (e.g. cudatoolkit=11.7)
- Remove 'pytorch' conda channel for sei/borzoi/legnet on macOS ARM (pytorch packages available on conda-forge; pytorch channel blocked on some networks)
- Relax sei PyTorch <2.0 upper bound on ARM (PyTorch 2.x is compatible)
- Add setuptools<81 for enformer (tensorflow_hub needs pkg_resources)

Tested all 5 oracles on Apple Silicon:

✓ borzoi: Healthy + prediction OK
✓ chrombpnet: Healthy + prediction OK
✓ enformer: Healthy + prediction OK
✓ legnet: Healthy + prediction OK
✓ sei: Healthy + prediction OK

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
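For illustration, a package-name parser that copes with both conda-style single `=` pins and pip-style operators could look like the sketch below. It is an assumption about the shape of the fix, not the repository's `_pkg_name`; the function name `pkg_name` and the channel-prefix handling are hypothetical.

```python
import re

# A package name is a run of letters, digits, '_', '.', or '-' at the start.
_NAME = re.compile(r"^\s*([A-Za-z0-9_.\-]+)")


def pkg_name(spec: str) -> str:
    """Return the lowercased package name from a conda or pip spec string."""
    # Drop an optional conda channel prefix such as "conda-forge::numpy".
    spec = spec.split("::", 1)[-1]
    match = _NAME.match(spec)
    if match is None:
        raise ValueError(f"cannot parse package spec: {spec!r}")
    return match.group(1).lower()
```

Matching the name from the left, instead of splitting on `==`, treats `cudatoolkit=11.7`, `tensorflow==2.8.0`, and `setuptools<81` uniformly.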
End-to-end tests that instantiate each oracle in its conda environment, load a pretrained model, and run predict() on a genomic region from chr1. Covers chrombpnet, enformer, borzoi, sei, and legnet. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add AlphaGenome oracle (JAX, 1Mb context, 5,731 tracks at 1bp resolution) with metadata, conda environment, platform adaptations, and templates
- Fix OraclePrediction.end property (was returning .start)
- Fix OraclePredictionTrack.score() (was returning None)
- Fix PATH corruption in environment runner
- Add SPLICE_SITES and PRO_CAP track types to result.py
- Add comprehensive oracle showcase notebook (all 6 oracles, all operations)
- Expand test suite to 80 tests covering score(), subset(), platform, etc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix wrong notebook filename in README (gata1_comprehensive_analysis -> comprehensive_oracle_showcase)
- Remove outdated "Coming soon" comment for Borzoi (fully implemented)
- Fix "Enchanced" typo in README
- Fix Sei error message saying "Enformer" instead of "Sei"
- Fix Sei _cl2ind() using self._classes_list when already loaded
- Fix Sei download: re-download corrupt/truncated archives instead of failing
- Fix "Dowloading" typos in Sei log messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
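The "re-download corrupt/truncated archives" behaviour can be sketched as follows. This is not the Sei oracle's code: `archive_is_intact`, `ensure_archive`, and the `download` callback are hypothetical names, and the integrity check only walks the tar headers, so it catches unreadable files and many truncations but does not verify member contents.

```python
import tarfile
from pathlib import Path
from typing import Callable


def archive_is_intact(path: Path) -> bool:
    """Return True if the tar archive opens and its headers can be walked."""
    try:
        with tarfile.open(path) as tar:
            for _ in tar:  # iterating forces each member header to be read
                pass
        return True
    except (tarfile.TarError, EOFError, OSError):
        return False


def ensure_archive(path: Path, download: Callable[[Path], None],
                   max_attempts: int = 2) -> Path:
    """Reuse a good cached archive; otherwise delete it and download again."""
    if path.exists() and archive_is_intact(path):
        return path
    for _ in range(max_attempts):
        path.unlink(missing_ok=True)  # discard the corrupt or partial file
        download(path)
        if archive_is_intact(path):
            return path
    raise RuntimeError(f"could not obtain a valid archive at {path}.")
```

Passing the downloader in as a callback keeps the retry logic independent of how the file is fetched, which also makes it easy to exercise without network access.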
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes 9 issues discovered during a complete fresh-install validation on Linux with NVIDIA A100 GPUs (CUDA 13.1):

Environment & dependency fixes:
- Add nvidia channel to chorus-borzoi.yml for CUDA package resolution
- Add borzoi linux_x86_64_cuda platform adaptation to remove conflicting cudatoolkit/cuda-nvcc (PyTorch bundles its own CUDA runtime)
- Add setuptools<81 to chorus-enformer.yml (tensorflow_hub needs pkg_resources)
- Add compilers to chorus-alphagenome.yml (sorted_nearest requires C build on Python 3.11 where no pre-built conda package exists)
- Add oxbow to base environment.yml (required by coolbox for tab file reading)

Runtime fixes:
- Add LD_PRELOAD for env's libstdc++ in runner.py run_script_in_environment() (health checks failed with CXXABI_1.3.15 not found on system libstdc++)
- Increase dependency check timeout from 30s to 120s (TensorFlow import >30s)

AlphaGenome HuggingFace auth:
- Add pre-auth check in both _load_direct() and load_template.py to read HF_TOKEN env var before model download (prevents EOFError in subprocess)
- Add Step 1b to LINUX_TEST_INSTRUCTIONS.md documenting HF auth setup
- Fix AlphaGenome assay ID format in GPU validation script

Validated: 6/6 envs installed, 6/6 healthy, 80/80 tests pass, notebook executes (25 cells, 0 errors, 8 visualizations) on both macOS and Linux.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
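The two runtime ideas above (preloading the environment's own libstdc++ and failing early when `HF_TOKEN` is absent) can be sketched roughly as below. The helper names `env_with_libstdcxx` and `require_hf_token` are hypothetical, the `lib/libstdc++.so.6` location is an assumption about the conda environment layout, and the real logic in runner.py and the AlphaGenome loader may differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def env_with_libstdcxx(env_prefix: Path,
                       base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the environment and preload the conda env's libstdc++ if present."""
    env = dict(os.environ if base_env is None else base_env)
    lib = env_prefix / "lib" / "libstdc++.so.6"  # assumed location
    if lib.is_file():
        existing = env.get("LD_PRELOAD", "")
        # LD_PRELOAD entries may be separated by spaces or colons.
        env["LD_PRELOAD"] = f"{lib} {existing}".strip()
    return env


def require_hf_token(env: Mapping[str, str]) -> str:
    """Fail with a clear message instead of prompting inside a subprocess."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export it before downloading the model."
        )
    return token
```

The returned dictionary would be passed as `env=` to `subprocess.run`, so the parent process's own environment is never mutated.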
- Add Prerequisites section (Miniforge, disk space, platform support)
- Add verification step to installation instructions
- List all 6 oracle setup commands (was only showing 2)
- Fix AlphaGenome auth to recommend HF_TOKEN env var over conda activate
- Add missing use_environment=True to AlphaGenome code examples
- Fix Borzoi specs (524,288 bp input, 7,610 tracks)
- Improve CUDA/GPU troubleshooting for cross-platform accuracy
- Clarify first-run timing for model weight downloads
- Fix typo: vizualization -> visualization

Validated from scratch: 6/6 envs installed, 6/6 healthy, 80/80 tests passed, notebook 25 cells / 0 errors / 8 visualizations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move coolbox/oxbow to pip in environment.yml (conda version requires pybbi which has no ARM64 build)
- Remove gtfsort from conda deps (Linux-only bioconda package) and add Python fallback in sort_gtf() for macOS
- Fix AlphaGenome JAX Metal crash (default_memory_space not supported) by setting JAX_PLATFORMS=cpu on macOS before import
- Update README to document Metal limitations for AlphaGenome

Validated: 80/80 tests pass, notebook 25 cells/0 errors/8 plots.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
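The JAX workaround relies on `JAX_PLATFORMS` being read when JAX initialises, so the variable has to be set before `import jax`. A minimal sketch, assuming a hypothetical helper name `force_cpu_jax_on_macos` and assuming the value is set unconditionally on macOS (the commit does not say whether a user-provided value is preserved):

```python
import os
import platform
from typing import MutableMapping, Optional


def force_cpu_jax_on_macos(
    environ: Optional[MutableMapping[str, str]] = None,
    system: Optional[str] = None,
) -> bool:
    """Set JAX_PLATFORMS=cpu on macOS; return True if the variable was set."""
    environ = os.environ if environ is None else environ
    system = platform.system() if system is None else system
    if system == "Darwin":
        # Avoids the Metal backend, which lacks default_memory_space support.
        environ["JAX_PLATFORMS"] = "cpu"
        return True
    return False
```

In use, `force_cpu_jax_on_macos()` would be called at the top of the loading module, followed by `import jax`; the optional arguments exist only so the behaviour can be checked without touching the real process environment.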
FRESH_INSTALL_TEST.md covers both Linux x86_64 and macOS ARM64 in a single document, making LINUX_TEST_INSTRUCTIONS.md redundant. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This was referenced Apr 24, 2026
lucapinello added a commit that referenced this pull request on Apr 24, 2026
Uniform treatment of user-facing error messages across the CLI and oracle surfaces, closing the last v26 audit item. No behavioural changes, just readability and actionability.

CLI (cli/_tokens.py, cli/main.py, cli/_setup_prefetch.py):
- All `logger.error(...)` messages now end with a period.
- HF-token-rejected errors point at https://huggingface.co/settings/tokens (retry hint).
- `chorus remove --oracle`, `chorus genome download`, `chorus setup` errors include the exact follow-up command to try.
- `_setup_prefetch.py` return-tuple error strings are capitalised and period-terminated for uniform rendering under main.py's `" - {err}"` loop.

Oracles (oracles/*.py, not _source/):
- All "Failed to load X model in environment" errors now name the conda env (`chorus-X`) and point at `chorus health --oracle X`.
- All "Failed to load X model: {e}" errors end with a period (dropping superfluous `str(e)` since f-string formatting handles __str__ automatically).
- ChromBPNet's 6 `ValueError` calls for bad assay/cell/fold combos become `InvalidAssayError`, matching how Enformer / Borzoi / SEI / AlphaGenome handle the same class of user error. Dual `ChorusError, ValueError` inheritance (from v26 P2 #19) means `except ValueError` still works.
- AlphaGenome's HF-auth error message ends with a period.

Tests: 340 passed, 1 skipped on fast suite.

Co-authored-by: lp698 <lp698@dimm2fv07n65x.partners.org>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
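The dual-inheritance point can be illustrated with a small standalone sketch. The class names `ChorusError` and `InvalidAssayError` come from the commit message, but the definitions below and the `check_assay` helper are assumptions for illustration, not the repository's code.

```python
class ChorusError(Exception):
    """Base class for library errors (definition assumed for illustration)."""


class InvalidAssayError(ChorusError, ValueError):
    """Raised for an unsupported assay; also catchable as ValueError."""


def check_assay(assay: str, supported: set) -> str:
    # Hypothetical validation helper showing where the error would be raised.
    if assay not in supported:
        raise InvalidAssayError(
            f"Unknown assay {assay!r}. Supported: {sorted(supported)}."
        )
    return assay


def describe(assay: str) -> str:
    # Existing callers that catch ValueError keep working unchanged.
    try:
        return check_assay(assay, {"ATAC", "DNASE"})
    except ValueError as err:
        return f"rejected: {err}"
```

Because `ValueError` appears in the method resolution order after `ChorusError`, both `except ChorusError` and `except ValueError` handlers catch the same exception instance.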
Summary
Key changes
- chorus/oracles/alphagenome.py + metadata/templates
- chorus/core/platform.py: platform detection and per-oracle dependency adaptation
- environment.yml: coolbox moved to pip (ARM64 compat), gtfsort made optional with Python fallback
- … (… default_memory_space support)
- … (examples/comprehensive_oracle_showcase.ipynb) demonstrating all 6 oracles
- … (FRESH_INSTALL_TEST.md)

Test plan
- chorus setup, chorus health, chorus list all work cross-platform

🤖 Generated with Claude Code