gnomAD population allele frequencies. New GnomadAnnotator enriches
report annotations with population frequency context from gnomAD v4.1 exomes
(~16M variants, 730K individuals). Pre-built cache downloaded from HuggingFace
via db update. Frequency column in terminal, HTML, and JSON reports. --no-gnomad flag to skip.
CPIC fallback for PharmGKB. db update succeeds when the CPIC API is
unreachable — reuses cached allele function data. Recovery auto-triggers on
next successful check.
Graceful db update. Individual annotator download failures print an error
and continue to remaining annotators instead of aborting the entire update.
scripts/build_gnomad_cache.py — streaming VCF build script. Downloads ~120GB
gnomAD exome VCFs over HTTPS (or reads local files with --local-dir), never
saves VCFs to disk, outputs ~6GB SQLite (~3GB gzipped).
JSON report schema_version bumped to "2" (added allele_frequency field).
Diff engine accepts both v1 and v2 baselines.
gnomAD ODbL v1.0 attribution in HTML and JSON reports.
CI workflow (.github/workflows/ci.yml) — lint + test on push/PR to main.
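The "Graceful db update" entry above describes a continue-on-failure loop. A minimal sketch of that pattern follows, assuming each annotator exposes a zero-argument download callable; the names `update_all` and `downloaders` are hypothetical and not taken from the project.

```python
from typing import Callable, Dict, List


def update_all(downloaders: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every downloader in order and return the names of those that failed.

    A failure in one annotator is reported and recorded, then the loop moves
    on to the remaining annotators instead of aborting the whole update.
    """
    failed: List[str] = []
    for name, download in downloaders.items():
        try:
            download()
        except Exception as exc:  # deliberately broad: any failure is non-fatal here
            print(f"error: {name} update failed: {exc}")
            failed.append(name)
    return failed
```

Returning the list of failed names lets the caller decide on an exit code or a retry without losing the downloads that did succeed.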
Fixed
Offline claim in README corrected: analysis runs offline by default with opt-out
freshness check, not opt-in network access.
__del__ partial-init crash on PharmGKB constructor failure.
.gitignore updated for GWAS Catalog test data.
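The `__del__` fix above addresses a common Python pitfall: if `__init__` raises before an attribute is assigned, the finalizer still runs on the half-built object. The sketch below shows one usual guard, assuming the annotator holds a SQLite connection; the class `Annotator` and the attribute `_conn` are hypothetical stand-ins, not the project's actual code.

```python
import sqlite3
from typing import Optional


class Annotator:
    def __init__(self, db_path: Optional[str]) -> None:
        # If this raises, self._conn is never assigned, but __del__ may
        # still be called on the partially initialised object.
        if db_path is None:
            raise ValueError("db_path is required")
        self._conn: Optional[sqlite3.Connection] = sqlite3.connect(db_path)

    def close(self) -> None:
        # getattr with a default tolerates a missing attribute, so close()
        # is safe after a failed constructor and safe to call twice.
        conn = getattr(self, "_conn", None)
        if conn is not None:
            conn.close()
            self._conn = None

    def __del__(self) -> None:
        self.close()
```

Routing cleanup through an idempotent `close()` keeps the finalizer trivial and lets callers release the connection explicitly.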
Changed
Pre-push hook reduced to version-tag check only (CI runs the full suite).
Technical
Composite primary key (chrom, pos, ref, alt) on gnomad_frequencies —
preserves multi-allelic sites (the previous rsID-only PK silently dropped ~20% of records).
Coordinate columns indexed for future AlphaMissense/CADD integration.
MAX(af) GROUP BY rsid in lookup queries handles multiple rows per rsID.
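The schema and lookup described above can be sketched with the standard `sqlite3` module. The table name, the key columns, and the `rsid` and `af` columns follow the entries above; the column types, the index name `idx_gnomad_rsid`, the helper `lookup_max_af`, and the sample rows are assumptions made for illustration.

```python
import sqlite3
from typing import Dict, List

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gnomad_frequencies (
        chrom TEXT    NOT NULL,
        pos   INTEGER NOT NULL,
        ref   TEXT    NOT NULL,
        alt   TEXT    NOT NULL,
        rsid  TEXT,
        af    REAL,
        PRIMARY KEY (chrom, pos, ref, alt)  -- one row per allele, not per rsID
    )
    """
)
# Hypothetical index name; speeds up the rsID lookups below.
conn.execute("CREATE INDEX idx_gnomad_rsid ON gnomad_frequencies (rsid)")

# Made-up rows: rs1 is multi-allelic (two ALT alleles at the same position).
conn.executemany(
    "INSERT INTO gnomad_frequencies VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("1", 1000, "A", "G", "rs1", 0.01),
        ("1", 1000, "A", "T", "rs1", 0.25),
        ("2", 2000, "C", "T", "rs2", 0.5),
    ],
)


def lookup_max_af(db: sqlite3.Connection, rsids: List[str]) -> Dict[str, float]:
    """Return the highest allele frequency recorded for each requested rsID."""
    if not rsids:
        return {}
    placeholders = ",".join("?" for _ in rsids)
    query = (
        "SELECT rsid, MAX(af) FROM gnomad_frequencies "
        f"WHERE rsid IN ({placeholders}) GROUP BY rsid"
    )
    return dict(db.execute(query, rsids).fetchall())
```

With an rsID-only primary key the second `rs1` row could not coexist with the first, which is the kind of record loss the composite key avoids. The `MAX(af)` aggregate then collapses the remaining multiple rows per rsID into a single value at query time.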