v1.4.0

@dial481 released this 09 Jun 10:32
· 1 commit to main since this release
f3b5efd

Added

  • AlphaMissense variant pathogenicity enrichment. New AlphaMissenseAnnotator enriches annotations with missense variant pathogenicity scores from DeepMind's AlphaMissense (71M variants, CC BY 4.0). The pre-built SQLite cache is downloaded from HuggingFace via db update. Adds an AM Score column to terminal, HTML, and JSON reports. PharmGKB rows show AM scores as neutral, with a caveat that the score reflects protein structure impact only (tooltip in HTML, dimmed * footnote in terminal, am_caveat field in JSON). Use the --no-alphamissense flag to skip enrichment.
  • Config file system. config.toml with per-source on/off toggles and a license.commercial = true safety switch that auto-disables non-commercial sources (SNPedia). New allelix config show/set/reset CLI commands. CLI flags override the config per invocation.
  • scripts/build_alphamissense_cache.py — AlphaMissense cache build script with Zenodo HTTPS streaming (default) and local TSV modes. Joins against gnomAD cache for coordinate-to-rsID mapping.
  • AlphaMissense CC BY 4.0 attribution in HTML and JSON reports.
  • Magnitude scoring legend in HTML report (collapsible, per-source scoring tables for ClinVar, PharmGKB, GWAS, SNPedia).
  • Source floor note in HTML report when per-source magnitude minimums are active.
  • Repute row background tints in HTML report (red for pathogenic/risk, green for protective/benign) derived from existing significance field.
  • Sortable columns in HTML report (magnitude, gene, source, AM score) via inline JavaScript.
  • ADR-0027 documenting the AlphaMissense enrichment cache architecture.
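
The coordinate-to-rsID mapping used by the cache build script can be pictured as a left join between AlphaMissense rows and gnomAD rows on chromosome, position, ref, and alt. The sketch below is illustrative only: the table names, column names, and the helper functions `normalize_chrom` and `build_cache` are assumptions, not the actual schema or code in scripts/build_alphamissense_cache.py.

```python
import sqlite3


def normalize_chrom(chrom):
    """Strip a leading 'chr' so that 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def build_cache(am_rows, gnomad_rows):
    """Build an in-memory cache; all names here are hypothetical.

    am_rows:     iterable of (chrom, pos, ref, alt, am_score)
    gnomad_rows: iterable of (chrom, pos, ref, alt, rsid)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gnomad "
        "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, rsid TEXT)"
    )
    # Composite primary key: multi-allelic sites (same position, different
    # alt allele) are stored as separate rows.
    conn.execute(
        "CREATE TABLE alphamissense ("
        " chrom TEXT, pos INTEGER, ref TEXT, alt TEXT,"
        " rsid TEXT, am_score REAL,"
        " PRIMARY KEY (chrom, pos, ref, alt))"
    )
    conn.execute(
        "CREATE TEMP TABLE staging "
        "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, am_score REAL)"
    )
    conn.executemany(
        "INSERT INTO gnomad VALUES (?, ?, ?, ?, ?)",
        [(normalize_chrom(c), p, r, a, rs) for c, p, r, a, rs in gnomad_rows],
    )
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?, ?, ?, ?)",
        [(normalize_chrom(c), p, r, a, s) for c, p, r, a, s in am_rows],
    )
    # LEFT JOIN keeps variants with no gnomAD match; their rsid stays NULL.
    conn.execute(
        "INSERT INTO alphamissense "
        "SELECT s.chrom, s.pos, s.ref, s.alt, g.rsid, s.am_score "
        "FROM staging AS s LEFT JOIN gnomad AS g "
        "ON g.chrom = s.chrom AND g.pos = s.pos "
        "AND g.ref = s.ref AND g.alt = s.alt"
    )
    conn.commit()
    return conn
```

With this layout, a lookup by rsID only finds variants that matched a gnomAD row, while lookups by coordinate still work for every AlphaMissense variant.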

Fixed

  • HTML report table overflows viewport, columns clipped on left (#20). Added overflow-x: auto container, sticky rsID column, max-width on description cells, refs collapsed into <details> toggle, conditional Review Status column (hidden when all empty), stat card flex-wrap.
  • AlphaMissense build script has zero unit-test coverage (#24). Added 25 tests covering TSV parsing, gnomAD rsID join, chr prefix normalization, --no-gnomad NULL-rsid path, multi-allelic composite PK, batched insert, and end-to-end integration.
  • Download integrity: Content-Length check after downloads catches truncated files.
  • Disk space preflight before decompressing .sqlite.gz caches now requires free space of 5x the gz size (accounts for the gz file and the decompressed tmp file being on disk simultaneously).
  • _connection() guards on gnomAD and AlphaMissense annotators raise FileNotFoundError with actionable message when cache is missing.
  • Dead cache_exists() removed from gnomAD and AlphaMissense loaders.
  • Legacy caches stamp remote signal instead of re-downloading on db update.
  • README database sizes updated to match actual on-disk measurements.
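
The download integrity check and the disk space preflight described above might look roughly like the following. This is a minimal sketch under stated assumptions: the function names are hypothetical, the expected size is assumed to come from the server's Content-Length header, and only the 5x factor is taken from the release notes.

```python
import shutil
from pathlib import Path


def check_content_length(path, expected_bytes):
    """Raise OSError if the file on disk is not the size the server promised.

    expected_bytes is assumed to be the Content-Length value, or None if
    the server did not send one (in which case the check is skipped).
    """
    actual = Path(path).stat().st_size
    if expected_bytes is not None and actual != expected_bytes:
        raise OSError(
            f"truncated download: got {actual} of {expected_bytes} bytes"
        )


def required_free_bytes(gz_size, factor=5):
    """Free space to require before decompressing a .sqlite.gz file."""
    return gz_size * factor


def preflight_disk_space(gz_path, factor=5):
    """Raise OSError if the target filesystem lacks the required headroom."""
    gz_path = Path(gz_path).resolve()
    needed = required_free_bytes(gz_path.stat().st_size, factor)
    free = shutil.disk_usage(gz_path.parent).free
    if free < needed:
        raise OSError(
            f"need {needed} bytes free to decompress {gz_path.name}, "
            f"only {free} available"
        )
    return needed
```

Running both checks before decompression means a truncated or oversized cache fails early with a clear message, not partway through writing the decompressed file.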

Changed

  • db update display includes gnomAD and AlphaMissense in "Analyzing against" annotator list.
  • Both build scripts (build_gnomad_cache.py, build_alphamissense_cache.py) run VACUUM for smaller output files.
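
For readers unfamiliar with VACUUM, the sketch below shows how a build script could compact a SQLite file after its final insert. The function name `vacuum_database` is hypothetical and this is not the code in either build script. It relies only on the standard sqlite3 module.

```python
import os
import sqlite3


def vacuum_database(db_path):
    """Run VACUUM on a SQLite file and return (size_before, size_after)."""
    before = os.path.getsize(db_path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    after = os.path.getsize(db_path)
    return before, after
```

VACUUM rebuilds the database file and releases free pages, so the gain is largest when the build deleted or rewrote many rows before finishing.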