Amplicon Mode in VarDict

Martin Haagmans edited this page Apr 25, 2018 · 2 revisions

Description

VarDict has built-in support for calling variants from PCR-based target enrichment (amplicon) data. The goals of amplicon-based variant calling are:

  1. Identify and filter out variants that show amplicon bias. If a variant is covered by more than one amplicon, it has to be detectable in all of them. Otherwise, it is considered amplicon-biased and filtered out.

  2. Discount PCR primers by soft clipping. No preprocessing is needed to remove or soft-clip PCR primers. Given the amplicon design, VarDict clips the primers automatically.

  3. Avoid mis-paired PCR products, especially when there are mismatches in the primers.
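
The amplicon-bias rule in goal 1 can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not VarDict's actual implementation; the function name `is_amplicon_biased` and its input layout are hypothetical, and the real caller may apply additional thresholds.

```python
from typing import Dict


def is_amplicon_biased(detected_by_amplicon: Dict[str, bool]) -> bool:
    """Return True if a variant looks amplicon-biased.

    `detected_by_amplicon` maps each amplicon covering the variant
    position (hypothetical IDs) to whether the variant was detected
    in reads from that amplicon.
    """
    # The rule only applies when more than one amplicon covers the variant.
    if len(detected_by_amplicon) < 2:
        return False
    # Biased if at least one covering amplicon fails to show the variant.
    return not all(detected_by_amplicon.values())


# Variant seen in both overlapping amplicons: kept.
consistent = is_amplicon_biased({"amp1": True, "amp2": True})
# Variant seen in only one of two overlapping amplicons: filtered.
biased = is_amplicon_biased({"amp1": True, "amp2": False})
```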

Running VarDict in amplicon mode

VarDict's amplicon mode is automatically triggered when an 8-column BED file is supplied. Below is an example:

chr1    46713345        46713502        RAD54L  0       .       46713363        46713481
chr1    46713418        46713578        RAD54L  0       .       46713435        46713561

The 8 columns are:

  1. chr (required)

  2. PCR start (including primer) (required)

  3. PCR end (including primer) (required)

  4. Gene info (can be .)

  5. Score (can be .)

  6. Strand (can be .)

  7. Insert start (excluding primer) (required)

  8. Insert end (excluding primer) (required)
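
To make the column layout concrete, here is a minimal Python sketch that parses one line of such a BED file. The `Amplicon` type and `parse_amplicon_line` function are hypothetical names, not part of VarDict, and the check that the insert lies inside the PCR region is an assumed sanity check that follows from the column definitions rather than a documented VarDict requirement.

```python
from typing import NamedTuple


class Amplicon(NamedTuple):
    chrom: str
    pcr_start: int     # column 2, including primer
    pcr_end: int       # column 3, including primer
    gene: str          # column 4, may be "."
    score: str         # column 5, may be "."
    strand: str        # column 6, may be "."
    insert_start: int  # column 7, excluding primer
    insert_end: int    # column 8, excluding primer


def parse_amplicon_line(line: str) -> Amplicon:
    """Parse one whitespace-separated line of an 8-column amplicon BED."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}")
    chrom, pcr_start, pcr_end, gene, score, strand, ins_start, ins_end = fields
    amp = Amplicon(chrom, int(pcr_start), int(pcr_end), gene, score, strand,
                   int(ins_start), int(ins_end))
    # Assumed sanity check: the insert must sit inside the PCR product.
    if not (amp.pcr_start <= amp.insert_start < amp.insert_end <= amp.pcr_end):
        raise ValueError("insert coordinates must lie within the PCR region")
    return amp


amp = parse_amplicon_line(
    "chr1    46713345        46713502        RAD54L  0       .       46713363        46713481"
)
# The primer footprints are the gaps between the PCR and insert coordinates.
left_primer_len = amp.insert_start - amp.pcr_start
right_primer_len = amp.pcr_end - amp.insert_end
```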

If you split the BED file into smaller files to run VarDict in parallel, make sure that all overlapping PCR amplicons end up in the same BED file. If the gene info column is present, you can use the splitBed.pl script in the repository to split the BED file. It keeps amplicons from the same gene in the same file, on the assumption that amplicons from different genes do not overlap.
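
If the gene column is not usable, one alternative is to group amplicons by overlap before splitting. The sketch below is not splitBed.pl and is not part of VarDict; `cluster_overlapping` is a hypothetical helper that clusters amplicons whose PCR regions overlap, so that each cluster can be written to a single file. It conservatively treats amplicons that merely touch (start equal to the previous end) as overlapping.

```python
from typing import List, Tuple

# (chrom, pcr_start, pcr_end); extra columns could be carried along unchanged.
Region = Tuple[str, int, int]


def cluster_overlapping(amplicons: List[Region]) -> List[List[Region]]:
    """Group amplicons into clusters of mutually chained overlaps."""
    clusters: List[List[Region]] = []
    current: List[Region] = []
    current_chrom, current_end = None, None
    for amp in sorted(amplicons, key=lambda a: (a[0], a[1])):
        chrom, start, end = amp
        if current and chrom == current_chrom and start <= current_end:
            # Overlaps (or touches) the running cluster: extend it.
            current.append(amp)
            current_end = max(current_end, end)
        else:
            # Start a new cluster on a new chromosome or after a gap.
            if current:
                clusters.append(current)
            current = [amp]
            current_chrom, current_end = chrom, end
    if current:
        clusters.append(current)
    return clusters


clusters = cluster_overlapping([
    ("chr1", 46713418, 46713578),
    ("chr1", 46713345, 46713502),
    ("chr2", 100, 200),
])
```

Each resulting cluster must stay together; clusters can then be distributed across the smaller BED files in any way.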

The var2vcf_valid.pl script automatically recognizes amplicon mode from the VarDict output, so no special parameter is needed.