Allelix v1.0.0
Open-source genotype analysis toolkit. Takes raw DNA files from consumer
testing services and runs them against public variant databases, producing
source-attributed research reports. The open-source Promethease replacement.
Supported input formats
- 23andMe
- AncestryDNA
- Family Tree DNA (FTDNA)
- MyHeritage DNA
- Living DNA
- MyHappyGenes (Tempus)
Auto-detection identifies the format from the file structure. Pass --format to
override it when needed.
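As an illustration of structure-based detection, the sketch below classifies a file from its leading comment lines and the shape of its first data row. The signature table and the detect_format helper are assumptions made for this example and are not taken from the Allelix source; real vendor exports vary between versions, so treat the keywords and column counts as placeholders.

```python
from typing import Iterable, Optional

# Hypothetical signatures: (format name, header keyword, delimiter, column count).
# These only approximate common vendor exports and are NOT Allelix's own rules.
SIGNATURES = [
    ("23andme", "23andme", "\t", 4),
    ("ancestrydna", "ancestrydna", "\t", 5),
    ("myheritage", "myheritage", ",", 4),
]


def detect_format(lines: Iterable[str]) -> Optional[str]:
    """Guess the vendor from comment lines plus the first non-comment row."""
    header_text = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header_text.append(line.lower())
            continue
        # First non-comment line: compare its shape against each signature.
        comments = " ".join(header_text)
        for name, keyword, delimiter, n_columns in SIGNATURES:
            if keyword in comments and len(line.split(delimiter)) == n_columns:
                return name
        return None  # unrecognized structure; the caller can fall back to --format
    return None  # empty input or comments only
```

Only the first few lines are read, so detection stays cheap even on large files.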
Annotation sources
| Database | License | Lookup method |
|---|---|---|
| ClinVar | Public domain | Position-based, dual-build (GRCh37 + GRCh38) |
| PharmGKB | CC BY-SA 4.0 | rsID-based, CPIC allele function filtering |
| GWAS Catalog | Public domain | rsID-based, p-value + effect size scoring |
| SNPedia | CC BY-NC-SA 3.0 | rsID-based (--exclude-snpedia for commercial use) |
All databases are downloaded by the user via allelix db update and cached
locally. Analysis runs offline with zero network access.
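A local cache only needs refreshing when the upstream file changes. The sketch below shows one way such a decision could be made from remote signals like an MD5 checksum or an HTTP ETag. The Signals class and needs_download function are hypothetical names for this example, not Allelix's API, and the sketch deliberately leaves out the network call that would fetch the remote values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    """Change signals recorded for one source (field names are hypothetical)."""
    etag: Optional[str] = None
    md5: Optional[str] = None


def needs_download(cached: Optional[Signals], remote: Signals) -> bool:
    """Return True when the upstream file appears to have changed."""
    if cached is None:
        return True  # nothing cached yet
    compared = False
    for field in ("md5", "etag"):
        old, new = getattr(cached, field), getattr(remote, field)
        if old is not None and new is not None:
            compared = True
            if old != new:
                return True  # a signal both sides have now disagrees
    # With no signal in common, freshness cannot be shown, so download.
    return not compared
```

Defaulting to a download when no signal can be compared favors correctness over bandwidth; the opposite default would also be defensible.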
Key features
- Build auto-detection. Detects genome build (GRCh37/GRCh38) from position
data, not file headers. Warns on header/data mismatch.
- Offline-first. No telemetry, no uploads, no analytics. Genotype data
never touches a network.
- Streaming parsers. 17 MB files never load fully into memory.
- Three report formats. Terminal (Rich), self-contained HTML, machine-readable
JSON (schema v1).
- Report diff. --diff previous.json compares the current run against a
prior report and surfaces new, removed, and changed annotations.
- Focused reports. allelix methylation and allelix pharmacogenomics for
targeted analysis.
- Freshness detection. db update checks remote signals (MD5, ETag) and
only re-downloads when the upstream source has changed.
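To make the streaming and build-detection ideas concrete, here is a minimal sketch that parses rows lazily and votes on the genome build by comparing observed positions with known per-build positions. It assumes a tab-separated layout of rsID, chromosome, position, and genotype. The names stream_calls, detect_build, and REFERENCE_POSITIONS are hypothetical, and the coordinates are placeholders rather than real genomic positions.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, Optional


class Call(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    genotype: str


def stream_calls(lines: Iterable[str]) -> Iterator[Call]:
    """Yield one call per data row without holding the whole file in memory."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or not fields[2].isdigit():
            continue  # skip column-header or malformed rows
        yield Call(fields[0], fields[1], int(fields[2]), fields[3])


# Placeholder coordinates for illustration only; NOT real genomic positions.
REFERENCE_POSITIONS = {
    "rsEXAMPLE1": {"GRCh37": 1000, "GRCh38": 1500},
    "rsEXAMPLE2": {"GRCh37": 2000, "GRCh38": 2600},
    "rsEXAMPLE3": {"GRCh37": 3000, "GRCh38": 3700},
}


def detect_build(calls: Iterable[Call]) -> Optional[str]:
    """Vote on the build by matching observed positions to reference markers."""
    votes = Counter()
    for call in calls:
        for build, pos in REFERENCE_POSITIONS.get(call.rsid, {}).items():
            if call.pos == pos:
                votes[build] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Passing an open file handle to stream_calls reads it line by line. Comparing the detected build with whatever the file header declares would be one way to produce a header/data mismatch warning.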
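The report diff can be pictured as a set comparison over annotation identifiers. The sketch below assumes each report is a flat list of annotation objects carrying a unique "id" field; the README does not show the schema v1 layout, so that structure and the diff_reports name are assumptions for illustration only.

```python
from typing import Any, Dict, List


def diff_reports(
    previous: List[Dict[str, Any]],
    current: List[Dict[str, Any]],
) -> Dict[str, List[str]]:
    """Compare two annotation lists keyed by a hypothetical 'id' field."""
    old = {annotation["id"]: annotation for annotation in previous}
    new = {annotation["id"]: annotation for annotation in current}
    shared = old.keys() & new.keys()
    return {
        "new": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(key for key in shared if old[key] != new[key]),
    }
```

Sorting the identifiers keeps the output stable between runs, which matters when the diff itself is stored or compared.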
Install
pip install allelix
allelix db update
allelix analyze your_file.txt
Requires Python 3.11+.
Known limitations
- GRCh36 (hg18). Allelix has no GRCh36 ClinVar cache, so ClinVar
annotations are skipped entirely for GRCh36 files. PharmGKB, GWAS Catalog,
and SNPedia use rsID-only lookups and are unaffected.
- Star alleles. CYP2D6, CYP2C19, and other genes annotated by haplotype
in PharmGKB are underserved: only SNV-level annotations are matched.
- VCF. Not yet supported. Planned for v2.0.
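The GRCh36 behavior follows from how the two lookup styles are keyed: position-based lookups need a cache for the file's build, while rsID-based lookups do not depend on coordinates. The sketch below illustrates that split with small in-memory tables. The table contents, the table names, and the annotate function are all made up for this example and do not reflect Allelix's internal storage.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the local caches; every entry here is invented.
CLINVAR_BY_POSITION: Dict[str, Dict[Tuple[str, int], str]] = {
    "GRCh37": {("1", 1000): "example ClinVar record"},
    "GRCh38": {("1", 1500): "example ClinVar record"},
}
RSID_SOURCES: Dict[str, Dict[str, str]] = {
    "PharmGKB": {"rsEXAMPLE1": "example PharmGKB record"},
    "GWAS Catalog": {"rsEXAMPLE1": "example GWAS record"},
}


def annotate(build: str, rsid: str, chrom: str, pos: int) -> List[Tuple[str, str]]:
    """Collect (source, record) hits for one variant."""
    hits: List[Tuple[str, str]] = []
    # Position-based lookup: only possible when a cache exists for this build.
    clinvar = CLINVAR_BY_POSITION.get(build)  # None for GRCh36
    if clinvar is not None:
        record = clinvar.get((chrom, pos))
        if record is not None:
            hits.append(("ClinVar", record))
    # rsID-based lookups ignore coordinates, so every build is served.
    for source, table in RSID_SOURCES.items():
        record = table.get(rsid)
        if record is not None:
            hits.append((source, record))
    return hits
```

With these tables, a GRCh36 call for rsEXAMPLE1 still returns the PharmGKB and GWAS Catalog hits but no ClinVar hit.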
License
AGPL-3.0-or-later. Third-party databases retain their original licenses on the
user's machine.
Test suite
794 tests, 94% coverage.