A powerful toolkit for genomic variant annotation and clinical interpretation.
- Comprehensive Annotation: ClinVar, gnomAD, COSMIC, dbSNP integration
- Functional Prediction: Gene symbols, consequences, pathogenicity scores
- Multiple Output Formats: VCF, TSV, JSON
- Command Line Interface: Easy-to-use CLI with progress bars
- Modular Design: Each tool can be used independently
- Academic Ready: Designed for research and publication
Install with pip:
pip install varannote

Or install from source:
git clone https://github.com/AtaUmutOZSOY/VarAnnote.git
cd VarAnnote
pip install -e .

VarAnnote automatically configures PATH on Windows during installation. If you encounter any issues:
- Restart your terminal after installation; this is usually enough.
- Alternative: use python -m, which does not depend on the varannote command being on PATH:
python -m varannote --help
python -m varannote annotate input.vcf --output output.vcf
- Manual setup (if needed):
python -m varannote setup-path
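
The fallback described above can also be applied from a script. The following is a minimal sketch, not part of VarAnnote itself: the helper name `resolve_varannote_command` is hypothetical, and it only relies on the two invocation forms shown in this README (`varannote` and `python -m varannote`).

```python
import shutil
import sys


def resolve_varannote_command(which=shutil.which):
    """Return the argv prefix to use for invoking VarAnnote.

    Prefers the `varannote` executable when it is found on PATH; otherwise
    falls back to `python -m varannote` with the current interpreter.
    The `which` parameter exists only so the lookup can be swapped in tests.
    """
    executable = which("varannote")
    if executable:
        return [executable]
    return [sys.executable, "-m", "varannote"]
```

Appending `["--version"]` to the returned list and passing it to `subprocess.run` gives the same check as the commands in the next block, regardless of whether PATH was configured.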
# Test installation
varannote --version
# or
python -m varannote --version
# Test with help
varannote --help

# Annotate variants with default databases
varannote annotate test_variants.vcf --output annotated.vcf
# Use specific databases
varannote annotate input.vcf -d clinvar -d gnomad --output result.vcf
# Output in different formats
varannote annotate input.vcf --format tsv --output result.tsv
varannote annotate input.vcf --format json --output result.json

# Predict pathogenicity using ensemble model
varannote pathogenicity variants.vcf --model ensemble
# Use specific model with custom threshold
varannote pathogenicity variants.vcf --model cadd --threshold 0.7

varannote --help # Show all commands
varannote annotate --help # Annotation help
varannote pathogenicity --help # Pathogenicity prediction help
varannote pharmacogenomics --help # Pharmacogenomics analysis help
varannote population-freq --help # Population frequency help
varannote compound-het --help # Compound heterozygote detection help
varannote segregation --help # Family segregation analysis help

| Command | Description |
|---|---|
| annotate | Comprehensive variant annotation |
| pathogenicity | Pathogenicity prediction |
| pharmacogenomics | Drug-gene interaction analysis |
| population-freq | Population frequency calculation |
| compound-het | Compound heterozygote detection |
| segregation | Family segregation analysis |

| Option | Description |
|---|---|
| --output, -o | Output file path |
| --format, -f | Output format (vcf, tsv, json) |
| --genome, -g | Reference genome (hg19, hg38) |
| --verbose, -v | Enable verbose output |

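
When annotating many files from a script, it can help to assemble the command line in one place. The sketch below builds an argument list for `annotate` from the options in the table above. The function name `build_annotate_command` is hypothetical, and it assumes that `--genome` is accepted by `annotate` alongside the `--output`, `--format`, `-d`, and `--verbose` options shown elsewhere in this README.

```python
import sys

# Allowed values, taken from the options table above.
VALID_FORMATS = {"vcf", "tsv", "json"}
VALID_GENOMES = {"hg19", "hg38"}


def build_annotate_command(input_path, output_path, fmt=None, genome=None,
                           databases=(), verbose=False):
    """Build an argv list for `python -m varannote annotate`.

    Optional flags are only added when a value is given, so VarAnnote's own
    defaults apply otherwise.
    """
    if fmt is not None and fmt not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if genome is not None and genome not in VALID_GENOMES:
        raise ValueError(f"unsupported genome: {genome!r}")

    cmd = [sys.executable, "-m", "varannote", "annotate", str(input_path),
           "--output", str(output_path)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if genome is not None:
        cmd += ["--genome", genome]
    for db in databases:
        cmd += ["-d", db]  # one -d flag per database, as in the usage examples
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; passing a list instead of a shell string avoids quoting problems with file names that contain spaces.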
Input:
- VCF files (.vcf, .vcf.gz)
- Standard VCF format with CHROM, POS, REF, ALT fields

Output:
- VCF: Annotated VCF with INFO fields
- TSV: Tab-separated values for analysis
- JSON: Structured data for programmatic use
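
To check that an input file carries the CHROM, POS, REF, and ALT fields mentioned above before running VarAnnote, a few lines of standard-library Python are enough. This is an illustrative sketch, not VarAnnote's own parser (`utils/vcf_parser.py`); the names `Variant`, `parse_vcf_lines`, and `open_vcf` are hypothetical. It relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...).

```python
import gzip
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per ALT allele from tab-separated VCF data lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines, "##" meta lines, and the "#CHROM" header line.
        if not line or line.startswith("#"):
            continue
        chrom, pos, _variant_id, ref, alt = line.split("\t")[:5]
        # Multi-allelic records list ALT alleles separated by commas.
        for allele in alt.split(","):
            yield Variant(chrom, int(pos), ref, allele)


def open_vcf(path):
    """Open a plain or gzip-compressed VCF file in text mode."""
    path = str(path)
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")
```

For example, `with open_vcf("test_variants.vcf") as handle: variants = list(parse_vcf_lines(handle))` reads every variant into memory, which is fine for small test files.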
| Database | Description | Fields Added |
|---|---|---|
| ClinVar | Clinical significance | clinvar_significance, clinvar_id |
| gnomAD | Population frequencies | gnomad_af, gnomad_ac, gnomad_an |
| COSMIC | Cancer mutations | cosmic_id, cosmic_count |
| dbSNP | Variant identifiers | dbsnp_id |
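
The fields in the table lend themselves to simple downstream filtering. The sketch below selects rare, ClinVar-pathogenic variants from JSON output. It makes several assumptions that this README does not confirm: that the JSON output is a list of objects keyed by the field names above, that `clinvar_significance` holds labels such as "Pathogenic" or "Likely_pathogenic", and that `gnomad_af` is a number or null. The sample records and the function names are invented for illustration.

```python
import json

# Labels treated as pathogenic; the exact strings VarAnnote emits are an assumption.
PATHOGENIC_LABELS = {
    "pathogenic",
    "likely pathogenic",
    "pathogenic/likely pathogenic",
}


def is_rare_pathogenic(record, max_af=0.01):
    """True if the record is labeled pathogenic and is rare or absent in gnomAD."""
    label = str(record.get("clinvar_significance") or "")
    label = label.replace("_", " ").strip().lower()
    af = record.get("gnomad_af")
    # A missing frequency is treated as "not observed", hence rare.
    is_rare = af is None or float(af) <= max_af
    return label in PATHOGENIC_LABELS and is_rare


def filter_records(records, max_af=0.01):
    return [r for r in records if is_rare_pathogenic(r, max_af)]


# Invented sample data; real identifiers and values will differ.
SAMPLE_JSON = """
[
  {"clinvar_id": "example-1", "clinvar_significance": "Pathogenic", "gnomad_af": 0.0001},
  {"clinvar_id": "example-2", "clinvar_significance": "Benign", "gnomad_af": 0.2},
  {"clinvar_id": "example-3", "clinvar_significance": "Likely_pathogenic", "gnomad_af": null},
  {"clinvar_id": "example-4", "clinvar_significance": "Pathogenic", "gnomad_af": 0.05}
]
"""

selected = filter_records(json.loads(SAMPLE_JSON))
print([r["clinvar_id"] for r in selected])  # ['example-1', 'example-3']
```

Treating a missing allele frequency as rare is a design choice; stricter pipelines may prefer to exclude such records or flag them for manual review.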
varannote annotate test_variants.vcf --output annotated.vcf --verbose

Output:
Annotating variants from test_variants.vcf
Using genome: hg38
Databases: clinvar, gnomad, dbsnp
Initialized VariantAnnotator with genome: hg38
Reading variants from test_variants.vcf
Found 5 variants to annotate
Annotating variants [####################################] 100%
Annotation complete: 5 variants processed
Output saved to: annotated.vcf

varannote annotate test_variants.vcf --format tsv --output results.tsv

varannote pathogenicity test_variants.vcf --model ensemble --threshold 0.6

VarAnnote/
├── setup.py                  # Package configuration
├── requirements.txt          # Dependencies
├── README.md                 # This file
├── test_variants.vcf         # Test data
└── varannote/
    ├── __init__.py           # Main package
    ├── cli.py                # Command line interface
    ├── core/                 # Core functionality
    │   ├── annotator.py      # Variant annotation engine
    │   └── pathogenicity.py  # Pathogenicity prediction
    ├── tools/                # Individual tools
    │   ├── annotator.py      # Annotation tool
    │   └── ...               # Other tools
    └── utils/                # Utilities
        ├── vcf_parser.py     # VCF file parser
        └── annotation_db.py  # Database interface

# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run with coverage
pytest --cov=varannote tests/

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request

If you use VarAnnote in your research, please cite:
Özsoy, A. U. (2025). VarAnnote: Comprehensive Variant Analysis & Annotation Suite (Version 1.0.0) [Computer software]. https://doi.org/10.5281/zenodo.15615370

@software{ozsoy2025varannote,
  author = {Özsoy, Ata Umut},
  title = {VarAnnote: Comprehensive Variant Analysis \& Annotation Suite},
  url = {https://github.com/AtaUmutOZSOY/VarAnnote},
  doi = {10.5281/zenodo.15615370},
  version = {1.0.0},
  year = {2025}
}

A. U. Özsoy, "VarAnnote: Comprehensive Variant Analysis & Annotation Suite," Version 1.0.0, 2025, doi: 10.5281/zenodo.15615370. [Online]. Available: https://github.com/AtaUmutOZSOY/VarAnnote

This project is licensed under the MIT License - see the LICENSE file for details.
- Author: Ata Umut ÖZSOY
- Email: ataumut7@gmail.com
- GitHub: https://github.com/AtaUmutOZSOY/VarAnnote
- BioPython community for sequence analysis tools
- gnomAD consortium for population frequency data
- ClinVar team for clinical variant curation
- COSMIC database for cancer mutation data