A Python tool for genomic variant annotation and database management. Provides a CLI for annotating VCF files and managing annotation databases with support for multiple file formats and genome assemblies.
- 📁 Multi-format Support: VCF, CSV, TSV, and Parquet files
- 💬 Interactive CLI: User-friendly command-line interface
- 🗄️ Database Management: Persistent annotation source storage
- 🧬 VCF Processing: Advanced annotation with SnpEff and our custom annotation core
- 🔬 Assembly Support: GRCh37 and GRCh38
```bash
./scripts/setup.sh
```

Or, alternatively:

```bash
make setup
```

```bash
# Create and activate conda environment
conda create -n pyannotator python=3.9
conda activate pyannotator

# Install dependencies
pip install -r requirements.txt
conda install -c bioconda bcftools

# Setup database
cd prisma
python -m prisma generate
python -m prisma db push
cd ..
```

For detailed installation instructions, troubleshooting, and platform-specific setup, see SETUP.md.
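
If the manual steps above fail partway, it can help to confirm the external tools are visible before retrying. The following is a minimal sketch, not part of the project: the `check_prerequisites` helper is hypothetical, and treating Python 3.9 as a minimum is an assumption based on the version used in the `conda create` command.

```python
import shutil
import sys

# Tools referenced by the setup steps above; adjust if your install differs.
REQUIRED_TOOLS = ("conda", "bcftools")


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=(3, 9)):
    """Return a dict mapping each check name to True (found) or False (missing).

    Hypothetical helper: min_python=(3, 9) is an assumption taken from the
    `conda create ... python=3.9` line, not a documented requirement.
    """
    # shutil.which returns None when the executable is not on PATH.
    results = {tool: shutil.which(tool) is not None for tool in tools}
    results["python"] = sys.version_info >= min_python
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated `pyannotator` environment, since `bcftools` is installed into that environment by the `conda install` step.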
```bash
# Load and index VCF (using SnpEff)
python pyannotator.py load-variant-file test.vcf.gz --snpeff --name "my_variant_file" --assembly GRCh38.105

# Load annotation database
python pyannotator.py load-database annotations.csv --name "my_db" --assembly GRCh38.105

# List databases
python pyannotator.py list-databases

# Annotate with interactive database selection
python pyannotator.py annotate input.vcf --assembly GRCh38.105 --interactive

# OR annotate with specific databases
python pyannotator.py annotate input.vcf --assembly GRCh38.105 --databases "DB1" "DB2"
```

| Command | Description |
|---|---|
| `annotate ...` | Annotate VCF file with selected databases |
| `load-variant-file <file>` | Load and index variant file |
| `load-database <file>` | Load and register annotation database |
| `list-databases` | Show registered databases |
| `delete-all-databases` | Remove all databases |
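
To run the `annotate` command from a script or pipeline, the CLI can be wrapped with `subprocess`. This is a sketch under assumptions: the helper names `build_annotate_command` and `run_annotate` are hypothetical, and only the flags shown in the examples above (`--assembly`, `--databases`, `--interactive`) are used. See COMMANDS.md for the authoritative option list.

```python
import subprocess
import sys


def build_annotate_command(vcf_path, assembly, databases=None, script="pyannotator.py"):
    """Build the argument list for `pyannotator.py annotate`.

    Hypothetical helper. Falls back to --interactive when no databases are
    given, mirroring the two usage examples in this README.
    """
    cmd = [sys.executable, script, "annotate", str(vcf_path), "--assembly", assembly]
    if databases:
        # Each database name is passed as its own argument after --databases.
        cmd += ["--databases", *databases]
    else:
        cmd.append("--interactive")
    return cmd


def run_annotate(vcf_path, assembly, databases=None):
    """Run the annotate command; raises CalledProcessError on a non-zero exit."""
    cmd = build_annotate_command(vcf_path, assembly, databases)
    return subprocess.run(cmd, check=True)
```

For example, `run_annotate("input.vcf", "GRCh38.105", ["DB1", "DB2"])` issues the same call as the last shell example. Passing the arguments as a list avoids shell quoting issues with database names that contain spaces.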
| Command | Description |
|---|---|
| `load-transcripts` | Fetch transcripts from Ensembl and store in database |
| `list-transcripts` | List transcripts stored in database |
| `clean-transcripts` | Clean/delete transcripts from database |
| `select-database` | Interactive single database selection |
| `select-databases` | Interactive multiple database selection |
For detailed command options and examples, see COMMANDS.md.
Uses Prisma with SQLite. See DATABASE.md for schema details and CRUD operations.
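
Because the backing store is SQLite, the database file can also be inspected directly with Python's standard library, independent of Prisma. The sketch below only lists table names. The `list_tables` helper is hypothetical, and the location of the database file is not stated here, so the caller must supply it (it is whatever the Prisma datasource is configured to use).

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of user tables in a SQLite database file.

    Hypothetical helper. Opens the file read-only so that a wrong path
    raises an error instead of silently creating an empty database.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

This is meant for quick read-only checks. Schema changes should still go through `python -m prisma db push` as shown in the setup steps.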
- Setup Guide - Comprehensive installation and setup instructions
- Commands Reference - Detailed command options and examples
- Database Integration - Schema and CRUD operations