VarDict
zhongwulai Bug fixes and updates to be consistent with VarDictJava
A few changes:
1. The default filter -F is "0x504" instead of "0x500".  Now any
secondary alignment, PCR or optical duplicates, as well as unmapped
reads are filtered.
2. Complex variant extension $VEXT now defaults to 2 instead of 3 to stay
within a codon length.  It can be changed with option "-X".
3. Phasing is now also extended to non-consecutive mismatches, allowing
$VEXT matches in between (lines 1777-1795).
4. Fix the output column issue in paired analysis in lines 523-530, 557
and 561.
5. Fix CIGAR modification issue in line 1091.
6. Fix the indels at exon edges in RNA-seq caused by aligner errors in
hisat2, which can produce negative coverages in some cases. They'll be
ignored.  (lines 1511-1514 and 1615-1618)
7. Prevent realignment from scooping up soft-clipped reads because of a
couple of rogue alignments caused by sequencing errors (added lines 4329
and 4345).
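As a quick check of item 1, the new default is the bitwise OR of the three flag bits named there. This is a minimal sketch that assumes the standard SAM flag values (0x4 unmapped, 0x100 secondary alignment, 0x400 PCR or optical duplicate); the variable names are illustrative and do not come from vardict.pl.

```shell
# SAM flag bits as defined in the SAM specification; names are illustrative.
UNMAPPED=$((0x4))
SECONDARY=$((0x100))
DUPLICATE=$((0x400))

# Old default: secondary alignments and duplicates only.
OLD_FILTER=$(printf '0x%x' $((SECONDARY | DUPLICATE)))
# New default: unmapped reads are filtered as well.
NEW_FILTER=$(printf '0x%x' $((SECONDARY | DUPLICATE | UNMAPPED)))

echo "old -F default: $OLD_FILTER"
echo "new -F default: $NEW_FILTER"
```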
Latest commit 8279565 Oct 10, 2018
Stat Embed Stat::Basic module for vardict2mut.pl (fix #50) Aug 8, 2017
git Project updated Jun 21, 2018
.gitignore gitignored idea and macos files Aug 10, 2017
LICENSE Initial commit Jul 15, 2014
README.rst Update README.rst Apr 12, 2017
checkCov.pl 1. Remove dupdates on the fly with -t option. 2. Improved amplicon-aw… Jul 22, 2014
checkSNV.pl Modified scripts to use /usr/bin/env instead of hard coded paths to P… Apr 4, 2014
hg19_5k_150bpOL_seg.txt.gz BED file for hg19 Sep 18, 2014
joinLine Changed run permissions for perl scripts Nov 25, 2014
minicheckCov.sh Initial Commit Mar 21, 2014
minicheckSNV.sh Initial Commit Mar 21, 2014
minivardict.sh Various updates Nov 26, 2014
minivardict_ion.sh Improved indel calling and somatic calling Jun 18, 2014
my.bam Add example bam and bed files to test installation Oct 16, 2015
my.bam.bai Add example bam and bed files to test installation Oct 16, 2015
my.bed Add example bam and bed files to test installation Oct 16, 2015
pickLine Changed run permissions for perl scripts Nov 25, 2014
runVarDict_BAM.sh Some minor updates Mar 4, 2016
sample2vardict.pl Several indel improvements and bug fixes Nov 11, 2014
splitBed.pl splitBed.pl permissions change to executable for group and user Nov 17, 2014
splitBed2.pl Fix a bug for amplicon variant calling Sep 18, 2014
testsomatic.R Some minor updates Mar 4, 2016
teststrandbias.R teststrandbias.R compatibility with VarDictJava 1.5.5 Sep 21, 2018
var2vcf_paired.pl var2vcf_paired: add AF, DP and VD into INFO (which were declared in h… Aug 10, 2017
var2vcf_valid.pl ALT="." and GT="0/0" were added back in when no variant is present. May 10, 2018
vardict Generated symlink vardict to point to vardict.pl Apr 4, 2014
vardict.pl Bug fixes and updates to be consistent with VarDictJava Oct 10, 2018
vardict2fm.pl Add vardict2sgz.pl and vardict2fm.pl Sep 10, 2016
vardict2mut.pl CLNSIG check logic moved to vcf2txt.pl Aug 11, 2018
vardict2sgz.pl Add vardict2sgz.pl and vardict2fm.pl Sep 10, 2016
vardict_sv.pl VarDict for structural variants Apr 27, 2018
vcf2txt.pl vcf2txt.pl: address new Clinvar format for CLNSIG Aug 11, 2018
waitVardict.pl Modified scripts to use /usr/bin/env instead of hard coded paths to P… Apr 4, 2014

README.rst

VarDict

VarDict is an ultra sensitive variant caller for both single and paired sample variant calling from BAM files. VarDict implements several novel features such as amplicon bias aware variant calling from targeted sequencing experiments, rescue of long indels by realigning bwa soft clipped reads and better scalability than many Java based variant callers.

Because VarDict's philosophy is to call "everything", several downstream strategies have been developed to filter the calls down to, for example, the most likely cancer driving events. These strategies are based on evidence in different databases and/or quality metrics. http://bcb.io/2016/04/04/vardict-filtering/ provides an overview of how to develop further filters for VarDict. The script at https://github.com/AstraZeneca-NGS/VarDict/blob/master/vcf2txt.pl can be used to put the variants into context by including information from dbSNP, Cosmic and ClinVar. We are open to suggestions from the community on how best to narrow down to the variants of most interest.

A Java based drop-in replacement for vardict.pl is being developed at https://github.com/AstraZeneca-NGS/VarDictJava. The Java implementation is approximately 10 times faster than the original Perl implementation and does not depend on samtools.

To enable amplicon aware variant calling (single sample mode only; not supported in paired variant calling), please make sure the bed file has 8 columns, with the 7th and 8th columns containing the insert interval (which is therefore a subset of the interval given by the 2nd and 3rd columns). The bed files typically look similar to the two overlapping intervals below:

chr1 115247094 115247253 NRAS 0 . 115247117 115247232

chr1 115247202 115247341 NRAS 0 . 115247224 115247323

For more information on amplicon aware calling please see https://github.com/AstraZeneca-NGS/VarDict/wiki/Amplicon-Mode-in-VarDict
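The 8-column layout described above can be sanity-checked before a run. The following is only a sketch: `check_amplicon_bed` is a hypothetical helper, not part of VarDict, and it assumes whitespace-separated columns with no header or track lines. It uses the two example intervals from this README as input.

```shell
# check_amplicon_bed FILE
# Hypothetical helper (not part of VarDict): reports lines that do not have
# 8 whitespace-separated columns, or whose insert interval (columns 7-8)
# is not contained in the amplicon interval (columns 2-3).
check_amplicon_bed() {
    awk '
        NF == 0 { next }    # skip blank lines
        NF != 8 {
            printf "line %d: expected 8 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        ($7 + 0 < $2 + 0) || ($8 + 0 > $3 + 0) || ($7 + 0 > $8 + 0) {
            printf "line %d: insert %s-%s not inside %s-%s\n", NR, $7, $8, $2, $3
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demonstration with the two example intervals shown above.
tmp=$(mktemp)
printf '%s\n' \
    'chr1 115247094 115247253 NRAS 0 . 115247117 115247232' \
    'chr1 115247202 115247341 NRAS 0 . 115247224 115247323' > "$tmp"
check_amplicon_bed "$tmp" && echo "amplicon BED OK"
rm -f "$tmp"
```

The function returns a non-zero status when any line fails, so it can gate a pipeline with `check_amplicon_bed /path/to/my.bed || exit 1`.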

VarDict is fully integrated in e.g. bcbio-nextgen, see https://github.com/chapmanb/bcbio-nextgen

Please cite VarDict:

Lai Z, Markovets A, Ahdesmaki M, Chapman B, Hofmann O, McEwen R, Johnson J, Dougherty B, Barrett JC, and Dry JR. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research. Nucleic Acids Res. 2016, pii: gkw227.

The article can be accessed through: http://nar.oxfordjournals.org/cgi/content/full/gkw227?ijkey=Tk8eKQcYwNlQRNU&keytype=ref

Coded by Zhongwu Lai 2014.

Requirements

  • Perl (uses /usr/bin/env perl)
  • R (uses /usr/bin/env R)
  • samtools (must be in path, not required if using the Java implementation in place of vardict.pl)
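Whether these requirements are reachable can be verified up front. The sketch below assumes a POSIX shell; `check_tools` is a hypothetical helper, not something shipped with VarDict, and the names passed to it are up to the caller.

```shell
# check_tools NAME...
# Hypothetical pre-flight helper: prints each name that cannot be found on
# PATH to stderr and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing from PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call, using tool and script names mentioned in this README:
# check_tools perl R samtools vardict teststrandbias.R var2vcf_valid.pl
```

When the Java implementation is used in place of vardict.pl, samtools can be left out of the list.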

Quick start

Make sure the VarDict folder (scripts vardict.pl, vardict, testsomatic.R, teststrandbias.R, var2vcf_valid.pl and var2vcf_paired.pl) is in your PATH before running the following commands.

  • Running in single sample mode:

    AF_THR="0.01" # minimum allele frequency
    vardict -G /path/to/hg19.fa -f $AF_THR -N sample_name -b /path/to/my.bam -c 1 -S 2 -E 3 -g 4 /path/to/my.bed | teststrandbias.R | var2vcf_valid.pl -N sample_name -E -f $AF_THR
    
  • Paired variant calling:

    AF_THR="0.01" # minimum allele frequency
    vardict -G /path/to/hg19.fa -f $AF_THR -N tumor_sample_name -b "/path/to/tumor.bam|/path/to/normal.bam" -c 1 -S 2 -E 3 -g 4 /path/to/my.bed | testsomatic.R | var2vcf_paired.pl -N "tumor_sample_name|normal_sample_name" -f $AF_THR
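The single sample command above can be wrapped in a small function so that the reference, BAM, BED file and sample name are passed once. This is a sketch only: `vardict_single` and the `DRY_RUN` variable are hypothetical and not part of VarDict. The options `-c 1 -S 2 -E 3 -g 4` are copied from the command above; they are understood here to give the BED column numbers for chromosome, start, end and gene name.

```shell
# vardict_single REF BAM BED SAMPLE [AF_THR]
# Hypothetical wrapper around the single sample command shown above.
# With DRY_RUN=1 it only prints the pipeline instead of running it.
# Note: plain POSIX sh has no local variables, so these names are global.
vardict_single() {
    ref=$1 bam=$2 bed=$3 sample=$4 af=${5:-0.01}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "vardict -G $ref -f $af -N $sample -b $bam -c 1 -S 2 -E 3 -g 4 $bed | teststrandbias.R | var2vcf_valid.pl -N $sample -E -f $af"
        return 0
    fi
    vardict -G "$ref" -f "$af" -N "$sample" -b "$bam" \
        -c 1 -S 2 -E 3 -g 4 "$bed" \
        | teststrandbias.R \
        | var2vcf_valid.pl -N "$sample" -E -f "$af"
}

# Print the command that would be run, without needing VarDict installed.
DRY_RUN=1 vardict_single /path/to/hg19.fa /path/to/my.bam /path/to/my.bed sample_name
```

Without `DRY_RUN=1` the function runs the pipeline and writes the VCF to standard output, so it can be redirected to a file as usual.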
    

Contributors

License

The code is freely available under the MIT license.