We provide a fast and free haplogroup classification web service. You can upload your mtDNA profiles aligned to rCRS or RSRS (beta) and receive mitochondrial haplogroups in return. FASTA, VCF and hsd input formats are accepted. So far, HaploGrep and the updated HaploGrep 2 have been cited over 400 times (Google Scholar, June 2018). Please join our HaploGrep Google User Group for future updates and ongoing discussions.
We also provide a command line version for local usage. Download and execute the latest release (v2.1.16).
java -jar haplogrep-2.1.16.jar --in <input> --format vcf/fasta/hsd --out haplogroups.txt
HaploGrep requires Java 8 and works on Windows, Linux and Mac.
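If you want to call the command line version from a script, the invocation above can be wrapped as in the following sketch. It uses only the three options shown in the command (--in, --format, --out). The function name build_command, the file names, and the assumption that the jar sits in the working directory are illustrative and not part of HaploGrep.

```python
import os
import shutil
import subprocess

JAR = "haplogrep-2.1.16.jar"  # assumed to be in the current working directory


def build_command(input_path, input_format, output_path, jar=JAR):
    """Assemble the argument list for the documented HaploGrep invocation."""
    if input_format not in {"vcf", "fasta", "hsd"}:
        raise ValueError(f"unsupported input format: {input_format}")
    return [
        "java", "-jar", jar,
        "--in", input_path,
        "--format", input_format,
        "--out", output_path,
    ]


# Hypothetical file names, used only for illustration.
command = build_command("samples.vcf", "vcf", "haplogroups.txt")
print(" ".join(command))

# Run only when Java is on the PATH and the jar is actually present.
if shutil.which("java") and os.path.exists(JAR):
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.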
Input File Formats
The recommended input format is VCF or FASTA. For alignment, bwa version 0.7.17 is used.
You can also specify your profiles in hsd format, a simple tab-delimited file format consisting of 4 columns (ID, Range, Haplogroup and Polymorphisms). For readability, the polymorphisms are also tab-delimited, so a line can have more than 4 columns. An hsd example can be found here.
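As a rough illustration of the hsd layout described above, the following sketch builds two hsd lines. The sample IDs, the polymorphism strings and the "?" placeholder in the Haplogroup column are assumptions made for this example. Whether a header line is expected is not covered here, so none is written. Please compare with the linked hsd example before relying on it.

```python
def hsd_line(sample_id, ranges, haplogroup, polymorphisms):
    """Build one hsd row: ID, Range, Haplogroup, then one column per polymorphism."""
    range_field = ";".join(ranges)  # multiple ranges are separated by a semicolon
    return "\t".join([sample_id, range_field, haplogroup, *polymorphisms])


# Illustrative values only; "?" as a placeholder haplogroup is an assumption.
rows = [
    hsd_line("sample1", ["1-16569"], "?", ["73G", "263G", "750G"]),
    hsd_line("sample2", ["16024-16569", "1-576"], "?", ["16519C", "73G"]),
]

for row in rows:
    print(row)
```

Each polymorphism ends up in its own column, so the first row has 6 tab-separated columns and the second has 5.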
- By default, HaploGrep expects your data to be aligned against rCRS (which is included in the human references hg19 and hg38). If your data is aligned against RSRS, add the --rsrs parameter (Default: off). Please read this blog post carefully before adding this option.
- To change the metric to Hamming Distance or Jaccard, add the --metric parameter (Default: Kulczynski Measure).
- To add additional output columns (e.g. found or remaining polymorphisms), add the --extend-report flag (Default: off).
- The Phylotree version used can be changed with the --phylotree parameter (Default: 17).
- If you are using genotyping arrays, add the --chip parameter to limit the range to array SNPs only (Default: off, VCF only). To get the same behaviour for hsd files, include in the range only the variants that are on the array or in the region you have sequenced (e.g. control region). Ranges are separated by a semicolon (;), and both ranges and single positions are allowed (e.g. 1-576; 34).
- Create a graph of all input samples by using the --lineage parameter (Default: off). The output is a Graphviz DOT file. You can then use Graphviz (sudo apt-get install graphviz) to convert the DOT file to, e.g., a PDF (dot <dot-file> -Tpdf > graph.pdf).
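The range syntax mentioned for hsd files (semicolon-separated entries, each either a range or a single position, e.g. 1-576; 34) can be read with a few lines of code. The following is a minimal sketch and not HaploGrep's own parser. The function names are made up for this example, and ranges that wrap around the circular genome are rejected rather than handled.

```python
def parse_range(spec):
    """Parse a range string such as '1-576; 34' into a list of (start, end) tuples."""
    intervals = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)  # a single position
        if start > end:
            raise ValueError(f"wrap-around ranges are not handled: {part}")
        intervals.append((start, end))
    return intervals


def in_range(position, intervals):
    """Return True if the position is covered by at least one interval."""
    return any(start <= position <= end for start, end in intervals)


intervals = parse_range("1-576; 34")
print(intervals)
print(in_range(263, intervals), in_range(750, intervals))
```

A check like in_range can help to drop variants outside the sequenced region before writing them to an hsd file.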
mtDNA reference sequences
Several mtDNA references exist; HaploGrep supports rCRS and RSRS. Please check out our blog post to learn more about this topic.
If you are using HaploGrep for genotyping array data, please have a look at the --chip parameter above.
Heteroplasmies (VCF only)
Heteroplasmies are often stored as heterozygous genotypes (0/1). If an HF field (= Heteroplasmy Frequency of the variant allele; introduced by MToolBox) is specified in the VCF header, we add variants with an HF > 0.96 to the input profile.
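To make the HF rule above more concrete, the following sketch applies the 0.96 threshold to single VCF data lines. It is a simplified illustration, not HaploGrep's implementation. It assumes a single-sample VCF in which HF is listed in the FORMAT column, with one value per alternate allele. The function name and the example records are made up.

```python
HF_THRESHOLD = 0.96  # threshold stated in the text above


def variants_above_threshold(vcf_line, threshold=HF_THRESHOLD):
    """Return (pos, ref, alt, hf) tuples for alternate alleles with HF > threshold.

    Assumes one sample column and HF as a key in the FORMAT column.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    pos, ref, alts = int(fields[1]), fields[3], fields[4].split(",")
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    if "HF" not in format_keys:
        return []  # no heteroplasmy frequency available for this record
    hf_values = sample_values[format_keys.index("HF")].split(",")
    kept = []
    for alt, hf in zip(alts, hf_values):
        if hf not in (".", "") and float(hf) > threshold:
            kept.append((pos, ref, alt, float(hf)))
    return kept


# Made-up records for illustration.
lines = [
    "chrM\t73\t.\tA\tG\t.\tPASS\t.\tGT:DP:HF\t0/1:1500:0.98",
    "chrM\t152\t.\tT\tC\t.\tPASS\t.\tGT:DP:HF\t0/1:1400:0.35",
]
for line in lines:
    print(variants_above_threshold(line))
```

In this example, only the first record passes the threshold. The second one, with an HF of 0.35, is left out.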
Please have a look at mtDNA-Server to check for heteroplasmies and contamination in your NGS data.
Check out our blog regarding mtDNA topics.