Allelix v1.0.0

@dial481 dial481 released this 06 Jun 12:17
· 7 commits to main since this release

Open-source genotype analysis toolkit. Takes raw DNA files from consumer
testing services and runs them against public variant databases, producing
source-attributed research reports. The open-source Promethease replacement.

Supported input formats

  • 23andMe
  • AncestryDNA
  • Family Tree DNA (FTDNA)
  • MyHeritage DNA
  • Living DNA
  • MyHappyGenes (Tempus)

Auto-detection identifies the format from file structure. --format overrides
when needed.
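
As a rough illustration of structure-based detection, the sketch below guesses a format from the leading lines of a file. It is not Allelix's actual detector: `guess_format` is a hypothetical helper, the header markers and column counts are assumptions based on commonly seen export layouts, and only three of the supported formats are covered.

```python
from typing import Iterable, Optional


def guess_format(lines: Iterable[str], max_lines: int = 50) -> Optional[str]:
    """Guess a raw-data format from the leading lines of a file (hypothetical helper)."""
    for i, line in enumerate(lines):
        if i >= max_lines:
            break
        text = line.strip()
        if not text:
            continue
        if text.startswith("#"):
            # Comment headers often name the vendor (assumed markers).
            lowered = text.lower()
            if "23andme" in lowered:
                return "23andme"
            if "ancestrydna" in lowered:
                return "ancestrydna"
            continue
        # First non-comment line: fall back to the column layout.
        if "\t" in text:
            columns = text.split("\t")
            # Assumed: five tab-separated columns means split alleles
            # (AncestryDNA-style), four means one combined genotype column.
            if len(columns) == 5:
                return "ancestrydna"
            if len(columns) == 4:
                return "23andme"
        elif "," in text:
            # Assumed: four comma-separated columns (FTDNA-style CSV).
            # Other CSV exports look similar and would need a further check.
            if len(text.split(",")) == 4:
                return "ftdna"
        return None
    return None
```

When a guess like this comes back empty or wrong, an explicit override such as --format is the natural fallback.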

Annotation sources

Database      License          Lookup method
ClinVar       Public domain    Position-based, dual-build (GRCh37 + GRCh38)
PharmGKB      CC BY-SA 4.0     rsID-based, CPIC allele function filtering
GWAS Catalog  Public domain    rsID-based, p-value + effect size scoring
SNPedia       CC BY-NC-SA 3.0  rsID-based (--exclude-snpedia for commercial use)

All databases are downloaded by the user via allelix db update and cached
locally. Analysis runs offline with zero network access.

Key features

  • Build auto-detection. Detects genome build (GRCh37/GRCh38) from position
    data, not file headers. Warns on header/data mismatch.
  • Offline-first. No telemetry, no uploads, no analytics. Genotype data
    never touches a network.
  • Streaming parsers. Input is parsed as a stream, so even 17 MB files
    never load fully into memory.
  • Three report formats. Terminal (Rich), self-contained HTML, machine-readable
    JSON (schema v1).
  • Report diff. --diff previous.json compares the current run against a
    prior report and surfaces new, removed, and changed annotations.
  • Focused reports. allelix methylation and allelix pharmacogenomics for
    targeted analysis.
  • Freshness detection. db update checks remote signals (MD5, ETag) and
    only re-downloads when the upstream source has changed.
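
The build auto-detection bullet above can be pictured as a vote over markers whose coordinates differ between builds. The sketch below is one way to do that, not necessarily how Allelix implements it: `REFERENCE_POSITIONS`, `detect_build`, and `check_header` are hypothetical names, and the marker IDs and coordinates are placeholders rather than real variant positions.

```python
from collections import Counter
from typing import Dict, Mapping, Optional

# Hypothetical reference table: marker ID -> known position per build.
# IDs and coordinates are placeholders, not real variant positions.
REFERENCE_POSITIONS: Dict[str, Dict[str, int]] = {
    "marker_a": {"GRCh37": 1_000, "GRCh38": 1_250},
    "marker_b": {"GRCh37": 5_000, "GRCh38": 5_400},
    "marker_c": {"GRCh37": 9_000, "GRCh38": 9_000},  # identical in both: uninformative
}


def detect_build(observed: Mapping[str, int]) -> Optional[str]:
    """Return the build whose reference positions match the most observed markers."""
    votes: Counter = Counter()
    for marker, position in observed.items():
        per_build = REFERENCE_POSITIONS.get(marker, {})
        matches = [build for build, ref in per_build.items() if ref == position]
        # Only markers that match exactly one build carry information.
        if len(matches) == 1:
            votes[matches[0]] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def check_header(header_build: Optional[str], observed: Mapping[str, int]) -> Optional[str]:
    """Return a warning string when the header claim disagrees with the data."""
    detected = detect_build(observed)
    if header_build and detected and header_build != detected:
        return f"header says {header_build}, positions look like {detected}"
    return None
```

Voting on the data rather than trusting the header is what lets a header/data mismatch be reported instead of silently producing wrong position-based matches.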

Install

pip install allelix
allelix db update
allelix analyze your_file.txt

Requires Python 3.11+.

Known limitations

  • GRCh36 (hg18). Allelix has no GRCh36 ClinVar cache, so ClinVar
    annotations are skipped entirely for GRCh36 files. PharmGKB, GWAS Catalog,
    and SNPedia use rsID-only lookups and are unaffected.
  • Star alleles. CYP2D6, CYP2C19, and other genes annotated by haplotype
    in PharmGKB are underserved: only SNV-level annotations are matched.
  • VCF. Not yet supported. Planned for v2.0.
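
The GRCh36 limitation follows from the lookup methods: position-based sources need a cache for the file's build, while rsID-based sources do not depend on coordinates. A minimal sketch of that rule, with `POSITION_BASED`, `RSID_BASED`, and `usable_sources` as hypothetical names rather than Allelix internals:

```python
from typing import Dict, List, Set

# Builds for which a position cache exists, per position-based source
# (taken from the annotation sources listed in these notes).
POSITION_BASED: Dict[str, Set[str]] = {"ClinVar": {"GRCh37", "GRCh38"}}
# Sources keyed by rsID only.
RSID_BASED: List[str] = ["PharmGKB", "GWAS Catalog", "SNPedia"]


def usable_sources(build: str) -> List[str]:
    """List the sources that can annotate a file on the given build."""
    sources = [name for name, builds in POSITION_BASED.items() if build in builds]
    # rsID lookups ignore coordinates, so they apply to every build.
    return sources + RSID_BASED
```

Under these assumptions, `usable_sources("GRCh36")` leaves out ClinVar and keeps the three rsID-based sources.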

License

AGPL-3.0-or-later. Third-party databases retain their original licenses on the
user's machine.

Test suite

794 tests, 94% coverage.