Expansion Hunter: a tool for estimating repeat sizes

A number of regions in the human genome consist of repetitions of a short unit sequence (commonly a trimer). Such repeat regions can expand to a size much larger than the read length and thereby cause disease. Fragile X Syndrome, ALS, and Huntington's Disease are well-known examples.

Expansion Hunter aims to estimate sizes of such repeats by performing a targeted search through a BAM/CRAM file for reads that span, flank, and are fully contained in each repeat.
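
The three read categories mentioned above can be illustrated with a minimal sketch. This is not Expansion Hunter's implementation: the function name `classify_read`, the 0-based half-open coordinate convention, and the "unrelated" label are assumptions made purely for illustration, and the sketch presumes each read already has a reliable alignment interval, which is often not the case for reads that lie entirely inside an expanded repeat.

```python
def classify_read(read_start, read_end, repeat_start, repeat_end):
    """Classify a read relative to a repeat region.

    Hypothetical helper for illustration only. All coordinates are
    0-based, half-open intervals on the same reference sequence.
    """
    # No overlap with the repeat at all.
    if read_end <= repeat_start or read_start >= repeat_end:
        return "unrelated"

    starts_before = read_start < repeat_start
    ends_after = read_end > repeat_end

    if starts_before and ends_after:
        # Read covers the whole repeat plus sequence on both sides.
        return "spanning"
    if starts_before or ends_after:
        # Read covers one repeat boundary and part of the repeat.
        return "flanking"
    # Read lies entirely within the repeat.
    return "in-repeat"


if __name__ == "__main__":
    repeat = (100, 160)  # assumed example locus, not real coordinates
    for read in [(80, 180), (80, 130), (110, 150), (0, 50)]:
        print(read, classify_read(*read, *repeat))
```

Spanning reads bound the repeat size directly, whereas flanking and in-repeat reads carry information about repeats longer than a single read; how that information is combined into a size estimate is covered in the method description cited at the end of this document.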

Linux and macOS operating systems are currently supported.


Expansion Hunter is provided under the terms and conditions of the GPLv3 license. It relies on several third-party packages provided under other open source licenses; please see COPYRIGHT.txt for additional details.


Installation instructions, a usage guide, and a description of the file formats are available on the wiki.


A detailed description of the method can be found in:

Dolzhenko et al., Detection of long repeat expansions from PCR-free whole-genome sequence data, Genome Research 2017