We provide a fast and free haplogroup classification service. You can upload your mtDNA profiles (vcf or hsd format) and receive mitochondrial haplogroups in return. So far, HaploGrep and the updated HaploGrep 2 have been cited over 400 times (Google Scholar - June 2018). Please join our HaploGrep Google User Group for future updates and ongoing discussions.
Command-line Version for local usage
Download and execute the latest release (v2.1.9).
java -jar haplogrep-2.1.9.jar --in <input> --format vcf/hsd --out haplogroups.txt
HaploGrep requires Java 8 and works for Windows, Linux and Mac operating systems.
- For additional output columns (e.g. found or remaining polymorphisms), please add the --extend-report flag (Default: off).
- To change the metric to Hamming or Jaccard, add the --metric parameter (Default: kulczynski).
- The Phylotree version used can be changed with the --phylotree parameter (Default: 17).
- If your variants are from genotyping arrays, please add the --chip parameter (Default: off). The range will then be limited to array SNPs only. This only works for VCF input. To get the same behaviour for hsd files, restrict the range to the positions covered by the array or to the region you have sequenced (e.g. the control region). Ranges are separated by a semicolon (;), and both ranges and single positions are allowed (e.g. 1-576; 34).
- To output the complete path from the rCRS root to your input sample, use the --lineage parameter (Default: off). We provide a textual format (*.lineage.txt) and a Graphviz DOT format. You can upload the HaploGrep *.graphviz.txt file here or process it with the Graphviz library.
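The options above can be combined in a single call. The following is a minimal Python sketch that assembles such a command line. The helper name `build_haplogrep_command` is hypothetical and not part of HaploGrep. The lowercase spelling of the metric values and the treatment of --lineage as a plain on/off switch are assumptions based on the defaults listed above, so please check them against the release you are using.

```python
def build_haplogrep_command(jar, in_file, out_file, fmt="vcf",
                            extend_report=False, metric=None,
                            phylotree=None, chip=False, lineage=False):
    """Assemble the argument list for a local HaploGrep run.

    Hypothetical helper: only the flags named in the list above are used.
    """
    cmd = ["java", "-jar", jar,
           "--in", in_file, "--format", fmt, "--out", out_file]
    if extend_report:
        cmd.append("--extend-report")
    if metric is not None:
        # Assumed value spelling, e.g. "hamming" or "jaccard".
        cmd += ["--metric", metric]
    if phylotree is not None:
        cmd += ["--phylotree", str(phylotree)]
    if chip:
        # The text above states that --chip only works for VCF input.
        if fmt != "vcf":
            raise ValueError("--chip is only supported for VCF input")
        cmd.append("--chip")
    if lineage:
        # Assumed to be a simple switch (Default: off).
        cmd.append("--lineage")
    return cmd


if __name__ == "__main__":
    command = build_haplogrep_command(
        "haplogrep-2.1.9.jar", "samples.vcf", "haplogroups.txt",
        extend_report=True, phylotree=17)
    print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` once Java 8 and the jar file are available locally.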
The default input format is VCF. You can also specify your profiles in hsd format, which is a simple tab-delimited file format consisting of 4 columns (ID, Range, Haplogroup and Polymorphisms). For readability, the polymorphisms are also tab-delimited, so a line usually has more than 4 columns. An hsd example can be found here.
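As an illustration of the hsd layout, here is a small Python sketch that parses a range string, keeps only the polymorphisms inside that range, and writes one tab-delimited line. All function names are hypothetical. The sketch assumes that each polymorphism starts with its numeric position (e.g. 73G) and that an unknown haplogroup can be written as "?". Both are assumptions, not statements about the HaploGrep parser. Ranges that wrap around the end of the molecule are not handled.

```python
import re


def parse_range(range_text):
    """Parse a range such as '1-576; 34' into (start, end) tuples."""
    intervals = []
    for part in range_text.split(";"):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
        else:
            # A single position is treated as a range of length one.
            start = end = int(part)
        intervals.append((start, end))
    return intervals


def position_of(polymorphism):
    """Return the leading numeric position of a polymorphism like '73G'."""
    match = re.match(r"\d+", polymorphism)
    if match is None:
        raise ValueError(f"no position found in {polymorphism!r}")
    return int(match.group())


def restrict_to_range(polymorphisms, range_text):
    """Keep only polymorphisms whose position lies inside the range."""
    intervals = parse_range(range_text)
    return [p for p in polymorphisms
            if any(s <= position_of(p) <= e for s, e in intervals)]


def format_hsd_line(sample_id, range_text, polymorphisms, haplogroup="?"):
    """Join ID, Range, Haplogroup and the polymorphisms with tabs."""
    return "\t".join([sample_id, range_text, haplogroup, *polymorphisms])


if __name__ == "__main__":
    control_region = "1-576; 16024-16569"
    kept = restrict_to_range(["73G", "263G", "750G", "16519C"], control_region)
    print(format_hsd_line("Sample1", control_region, kept))
```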
Several mtDNA references exist; HaploGrep currently assumes that everything is aligned to rCRS. Please check out our blog post to learn more about this topic.
If you are using HaploGrep for genotyping array data, please have a look at the --chip parameter above.
Heteroplasmies (VCF only)
Heteroplasmies are often stored as heterozygous genotypes (0/1). If an HF field (= Heteroplasmy Frequency of variant allele; introduced by MToolBox) is specified in the VCF header, we add variants with HF > 0.96 to the input profile.
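The threshold rule can be sketched in Python as follows. This is not the HaploGrep implementation, only an illustration of the HF > 0.96 filter on a single-sample VCF data line. The function name `variants_above_threshold` is hypothetical, and the sketch assumes that HF appears as a per-sample FORMAT key with one comma-separated value per alternate allele.

```python
HF_THRESHOLD = 0.96  # threshold stated in the text above


def variants_above_threshold(vcf_line, threshold=HF_THRESHOLD):
    """Return (pos, ref, alt, hf) tuples whose HF exceeds the threshold.

    Expects one tab-delimited VCF data line with a single sample column.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    pos, ref, alts = int(fields[1]), fields[3], fields[4].split(",")
    # Pair the FORMAT keys (column 9) with the sample values (column 10).
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    if "HF" not in sample:
        # No HF value: this sketch makes no decision for such records.
        return []
    hfs = [float(x) for x in sample["HF"].split(",")]
    # Strict comparison, matching "HF > 0.96".
    return [(pos, ref, alt, hf)
            for alt, hf in zip(alts, hfs) if hf > threshold]


if __name__ == "__main__":
    line = "chrM\t73\t.\tA\tG\t.\tPASS\t.\tGT:DP:HF\t0/1:100:0.98"
    print(variants_above_threshold(line))
```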
Please have a look at mtDNA-Server to check for heteroplasmies and contamination in your NGS data.
Check out our blog regarding mtDNA topics.