The AGVD Variant Query Tool is a command-line utility for querying variant information against the African Genome Variation Database (AGVD). It supports input from VCF, CSV, TSV, or Excel files and provides threshold-based filtering and clustering of variants using AGVD's GraphQL API.
- Supports VCF, CSV, TSV, and Excel input formats
- Accepts both `rsID` and `CHR_POS_REF_ALT` variant formats
- Submits queries in batches for improved performance
- Optional local caching for repeated queries
- Dry-run mode for validation without querying
- Exports enriched results and JSON summary
- Multithreaded for faster processing
- Supports "peek" query mode for quick variant lookups

Requirements:

- Python 3.7+
- Dependencies (installed via `pip install -r requirements.txt`):
  - pandas
  - tqdm
  - pysam
  - requests
  - openpyxl

Run the tool with:

    python agvd \
        --KEY YOUR_AGVD_API_KEY \
        --INFILE path/to/input.vcf \
        --OUTPUT path/to/output.csv \
        --THRESHOLD 0.01

| Argument | Description |
|---|---|
| `--BATCH` | Batch size for API queries (default: 1000) |
| `--COLUMN` | Column name with variant IDs (CSV/TSV/Excel only) |
| `--CHR` | Chromosome column name |
| `--POS` | Position column name |
| `--REF` | Reference allele column name |
| `--ALT` | Alternate allele column name |
| `--dry-run` | Validates the file without submitting queries |
| `--verbose` | Enables debug-level logging |
| `--cache` | Enables local query caching |
| `--threads` | Number of threads to use for parallel processing |
| `--peek` | Provide a list of variant IDs (or input file) to run a quick lookup without thresholding |
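
To illustrate what `--BATCH` controls, here is a minimal sketch of splitting a list of variant IDs into fixed-size batches before they are sent to the API. The helper name `chunk_variants` is hypothetical and not part of the tool; the tool's own batching may be implemented differently.

```python
from typing import Iterator, List, Sequence


def chunk_variants(variant_ids: Sequence[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield successive batches of at most batch_size IDs (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(variant_ids), batch_size):
        yield list(variant_ids[start:start + batch_size])


# 2500 made-up IDs split with the documented default batch size of 1000
ids = [f"rs{i}" for i in range(1, 2501)]
batches = list(chunk_variants(ids, batch_size=1000))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

A larger batch means fewer requests but a bigger payload per request, so the best value likely depends on input size and network conditions.
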

VCF input: a standard `.vcf` file with `#CHROM`, `POS`, `REF`, and `ALT` fields.

CSV, TSV, or Excel input: either

- a single column of variant IDs in `rsID` or `CHR_POS_REF_ALT` format, or
- separate columns, named with `--CHR`, `--POS`, `--REF`, and `--ALT`
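
The two tabular layouts can be related with a short sketch: one function builds a `CHR_POS_REF_ALT` string from separate columns, and another guesses which format an ID is in. Both function names, the regular expressions, and the underscore separator (taken literally from the format name) are assumptions for illustration; the tool's actual parsing rules may be stricter or looser.

```python
import re

RSID_PATTERN = re.compile(r"^rs\d+$")
# Assumed shape: optional "chr" prefix, chromosome, position, ref, alt, joined by "_"
POSITIONAL_PATTERN = re.compile(
    r"^(?:chr)?([0-9]{1,2}|X|Y|MT?)_(\d+)_([ACGT]+)_([ACGT]+)$",
    re.IGNORECASE,
)


def classify_variant_id(value):
    """Return 'rsID', 'CHR_POS_REF_ALT', or 'unknown' (hypothetical helper)."""
    text = value.strip()
    if RSID_PATTERN.match(text):
        return "rsID"
    if POSITIONAL_PATTERN.match(text):
        return "CHR_POS_REF_ALT"
    return "unknown"


def build_variant_id(row, chr_col="CHR", pos_col="POS", ref_col="REF", alt_col="ALT"):
    """Join four column values into one CHR_POS_REF_ALT string (hypothetical helper)."""
    return "_".join(str(row[col]).strip() for col in (chr_col, pos_col, ref_col, alt_col))


row = {"CHR": "1", "POS": 12345, "REF": "A", "ALT": "G"}
variant_id = build_variant_id(row)
print(variant_id, classify_variant_id(variant_id))  # 1_12345_A_G CHR_POS_REF_ALT
print(classify_variant_id("rs123"))                 # rsID
```
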

The tool writes two outputs:

- A file containing the original input plus:
  - `AGVDCUTOFF`: status based on the MAF threshold
  - `African_MAF`: MAF value
  - `<Cluster>_MAF`: MAF per population cluster
- A `_summary.json` file with success/failure statistics

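
As a rough sketch of working with the enriched output, the snippet below reads CSV rows and keeps those whose `African_MAF` is present and below a chosen cutoff. The sample rows, the `ID` column name, and the `rare_variants` helper are made up for illustration, and selecting values *below* the cutoff is an assumption about how you might use the column, not a statement of how the tool computes `AGVDCUTOFF`.

```python
import csv
import io

# Illustrative data only; a real output file holds the original input columns
# plus the AGVD columns described above.
sample_output = io.StringIO(
    "ID,African_MAF\n"
    "rs123,0.002\n"
    "rs456,0.15\n"
    "rs789,\n"
)


def rare_variants(handle, threshold):
    """Return rows whose African_MAF is present and below threshold (hypothetical helper)."""
    selected = []
    for row in csv.DictReader(handle):
        raw = (row.get("African_MAF") or "").strip()
        if not raw:
            continue  # no MAF reported for this variant
        if float(raw) < threshold:
            selected.append(row)
    return selected


rare = rare_variants(sample_output, threshold=0.01)
print([row["ID"] for row in rare])  # ['rs123']
```
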
The peek mode lets you quickly retrieve availability and access URLs for variants without threshold-based filtering.

    python agvd --peek --INFILE variants.txt
    python agvd --peek rs123 rs456 chr1:12345:A:G

Returns:

    [
      {
        "id": "rs123",
        "status": "available",
        "url": "https://agvd.afrigen-d.org/variant?id=rs123"
      },
      {
        "id": "1-12345-A-G",
        "status": "unavailable",
        "url": null
      }
    ]

You can also call this in Python:

    from agvd.query import peek_variants

    # Look up availability for a mix of rsID and positional variant IDs
    results = peek_variants(["rs123", "chr1:12345:A:G"])

To test locally:

    python agvd \
        -k test_key \
        -i examples/test.csv \
        -o out.csv \
        -t 0.05 \
        --verbose

To profile performance:

    python -m cProfile agvd ...

MIT License © 2025 AGVD Team

For support or questions, please contact: agvd@afrigen-d.org