This repository has been archived by the owner on Jan 31, 2020. It is now read-only.

dindel-tgi

This is a fork of dindel-1.01 by The Genome Institute at Washington University. It is licensed under the GNU GPLv3.

Dindel was originally developed by Cornelis Albers together with Gerton Lunter (Wellcome Trust Centre for Human Genetics, University of Oxford) and Richard Durbin (Wellcome Trust Sanger Institute).

Dindel is a program for calling small indels from short-read ("next-generation") sequence data. It is currently designed to handle only Illumina data.

Dindel takes BAM files of mapped Illumina reads, detects small indels, and produces a VCF file of all the variant calls. It is written in C++ and runs on Linux and Mac computers (it has not been tested on Windows).


Dindel requires a BAM file containing the read alignments as input. It extracts candidate indels from the BAM file, then realigns the reads to candidate haplotypes built from these candidate indels, in windows of ~120 bp. If there is sufficient evidence for a haplotype that differs from the reference, it calls an indel.

Dindel can test candidate indels discovered with other methods, for instance longer deletions found by split-read methods or indels obtained through assembly methods. Dindel will then realign both mapped and unmapped reads to see if the candidate indel is supported by the reads.

Dindel produces a VCF file with the indel calls. Genotype likelihoods can be obtained from intermediate files generated by Dindel.

There is basic support for outputting a realigned BAM file for each realignment window. These realigned BAM files can be used to call SNPs near (candidate) indels.

You can find more information, including a manual and the original source tarball, on the Wellcome Trust Sanger Institute's Dindel page. You may also be interested in the Genome Research 2010 article Dindel: Accurate indel calls from short-read data (DOI: 10.1101/gr.112326.110).

This fork currently contains a patch that accepts sequence data carrying the 0x800 flag (supplementary alignment), which was introduced in newer versions of the SAM specification. It also removes the bundled copy of Boost and adds samtools-0.1.19 as a submodule.
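
To illustrate what the 0x800 flag means (this is not the patch itself): the FLAG column of a SAM record is a bit field, and a supplementary alignment has bit 0x800 (decimal 2048) set. A minimal shell sketch, where `is_supplementary` is a hypothetical helper name and not part of dindel:

```shell
# Hypothetical helper, not part of dindel: succeed if the SAM FLAG value
# passed as $1 has the supplementary-alignment bit (0x800 = 2048) set.
is_supplementary() {
    [ $(( $1 & 0x800 )) -ne 0 ]
}

# Example: 2064 = 0x800 (supplementary) + 0x10 (read on reverse strand)
if is_supplementary 2064; then
    echo "supplementary"
else
    echo "not supplementary"
fi
```

Tools written before this bit was defined may reject or misinterpret such records, which is presumably why the fork needed a patch.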

Compiling

This software depends on Boost, which is best installed with your package manager of choice.

This software also depends on SAMtools for SAM format support, and on SeqAn for aligning candidate haplotypes to the reference sequence with the Needleman-Wunsch algorithm. Both of these dependencies are included in this repo (SAMtools as a git submodule).

If libboost is installed and the samtools submodule is in place, you should be able to compile the dindel binary using make.
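
The build steps described above might be sketched as follows. This is a sketch under assumptions: it is meant to be run from the root of the checkout, `has_uninitialized_submodule` and `build_dindel` are hypothetical helper names, and only the default make target is used. It relies on `git submodule status` prefixing uninitialized submodules with `-`.

```shell
# Hypothetical helper: succeed if the given `git submodule status` output
# lists at least one uninitialized submodule (a line starting with "-").
has_uninitialized_submodule() {
    printf '%s\n' "$1" | grep -q '^-'
}

# Hypothetical wrapper: fetch the samtools submodule if needed, then build.
# Assumes the current directory is the root of the dindel-tgi checkout.
build_dindel() {
    if has_uninitialized_submodule "$(git submodule status)"; then
        git submodule update --init || return 1
    fi
    make
}
```

Calling `build_dindel` from the repository root would then initialize the submodule only when it is missing and run make afterwards.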

Packaging

Install dependencies, clone the repository, and run dpkg-buildpackage:

$ sudo apt-get update
$ sudo apt-get install debhelper git-core libboost-dev libboost-program-options-dev libncurses5-dev
$ git clone https://github.com/genome/dindel-tgi.git dindel-tgi1.01-wu1-1.01-wu1
$ cd dindel-tgi1.01-wu1-1.01-wu1
$ git checkout 1.01-wu1-3
$ dpkg-buildpackage