GermlineEnrichment

Description

Diagnostic NGS pipeline for calling SNPs, indels, CNVs, SVs and LOH from germline panel/exome data (Illumina paired-end).

Requires variables files. See https://github.com/mcgml/MakeVariableFiles

Launch by running qsub 1_GermlineEnrichment.sh from the sample directory. Assumes Torque/PBS is installed.

Caveats

  • BQSR requires at least 100M bases after filtering to build an accurate model. As a rough guide, it should not be used for designs smaller than 0.5 Mb.
  • Script 2 requires a PED file. By default, one is created that assumes all samples are unrelated. Downstream filtering treats samples as unrelated unless relationships are specified in the PED file.
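
As a rough aid for the BQSR caveat, the size of a design can be checked before recalibration is enabled. The sketch below is not part of the pipeline: it assumes the target regions are available as BED-style (chrom, start, end) tuples, and the function names and the toy design are hypothetical. It merges overlapping targets, sums the covered bases, and compares the total against the 0.5 Mb guide.

```python
from collections import defaultdict

# Rough guide from the caveat above: designs under 0.5 Mb are too small for BQSR.
MIN_DESIGN_BP = 500_000


def design_size_bp(intervals):
    """Total bases covered by (chrom, start, end) intervals, BED-style half-open.

    Overlapping or adjacent intervals on the same chromosome are merged first
    so that no base is counted twice.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    total = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:  # overlaps or touches the current span
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


def bqsr_advisable(intervals, min_bp=MIN_DESIGN_BP):
    """True when the design meets the rough 0.5 Mb size guide."""
    return design_size_bp(intervals) >= min_bp


# Hypothetical toy design: two overlapping targets on chr1 and one on chr2.
toy = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]
print(design_size_bp(toy))   # 250
print(bqsr_advisable(toy))   # False
```

This only covers the design-size rule of thumb. Whether a run actually yields 100M bases after filtering also depends on sequencing depth, which this sketch does not estimate.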

Outputs

  • BAM alignment
  • VCF files
  • QC metrics
  • Tabix indexed coverage per base

Relatedness

  • Not suitable for panel analysis
Same sample   1st degree   2nd degree   3rd degree   Unrelated
~0.5          ~0.25        ~0.125       ~0.0625      0-0.04
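
The expected values in the table can be turned into a simple lookup for flagging unexpected sample pairs. This is a minimal sketch, not the pipeline's own logic: the function name is hypothetical, and the boundaries between related categories are an assumption (the geometric midpoint of neighbouring expected values), since the table only gives approximate centres.

```python
import math

# Expected values copied from the table above, highest first.
EXPECTED = [
    ("Same sample", 0.5),
    ("1st degree", 0.25),
    ("2nd degree", 0.125),
    ("3rd degree", 0.0625),
]
UNRELATED_MAX = 0.04  # upper end of the "Unrelated" range in the table


def classify_relatedness(value):
    """Assign a relatedness estimate to the closest category in the table.

    Assumption: boundaries between related categories sit at the geometric
    midpoint of neighbouring expected values (e.g. sqrt(0.5 * 0.25) ~ 0.354).
    """
    if value <= UNRELATED_MAX:
        return "Unrelated"
    for (label, expected), (_, next_expected) in zip(EXPECTED, EXPECTED[1:]):
        if value >= math.sqrt(expected * next_expected):
            return label
    return EXPECTED[-1][0]  # above 0.04 but below the 2nd/3rd degree boundary


for estimate in (0.49, 0.26, 0.11, 0.06, 0.01):
    print(estimate, classify_relatedness(estimate))
```

Pairs whose category disagrees with the PED file (for example, a supposedly unrelated pair classified as 1st degree) are the ones worth reviewing by hand.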

Expected variant metrics

SNVs

Type   Variants   TiTv
WGS    ~4.4M      2.0-2.1
WES    ~41k       3.0-3.3

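If the TiTv ratio falls below the expected range, the callset likely contains more false positives: random errors are equally likely to be any base change, and because there are twice as many possible transversions as transitions, they pull the ratio towards 0.5.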
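
To make the metric concrete, the sketch below computes a TiTv ratio from (ref, alt) allele pairs and checks it against the lower bound in the table. It is illustrative only: the function names are hypothetical, and in practice the pairs would come from the biallelic SNVs in the pipeline's VCF output.

```python
BASES = {"A", "C", "G", "T"}

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

# Expected ranges from the table above.
EXPECTED_TITV = {"WGS": (2.0, 2.1), "WES": (3.0, 3.3)}


def titv_ratio(snvs):
    """Transition/transversion ratio for an iterable of (ref, alt) pairs."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, ambiguous bases and non-variant records
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions found; ratio is undefined")
    return ti / tv


def titv_below_expected(ratio, data_type):
    """True when the ratio is under the lower bound for 'WGS' or 'WES'."""
    low, _high = EXPECTED_TITV[data_type]
    return ratio < low


# Toy callset: three transitions and one transversion.
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
ratio = titv_ratio(calls)
print(ratio)                               # 3.0
print(titv_below_expected(ratio, "WES"))   # False
```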

INDELs

Indel frequency   Insertion to deletion ratio
Common            ~1
Rare              0.2-0.5

A significant deviation from the expected ratios listed in the table above could indicate a bias resulting from artifactual variants.
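
The insertion to deletion ratio can be computed in the same spirit. The sketch below classifies each (ref, alt) pair by allele length and reports the ratio; the function names are hypothetical. How indels are split into common and rare (for example, by population allele frequency) is not specified here, so that split is left to the caller.

```python
# Expected range for rare indels, from the table above.
RARE_RANGE = (0.2, 0.5)


def insertion_deletion_ratio(indels):
    """Insertion-to-deletion ratio for an iterable of (ref, alt) alleles."""
    insertions = deletions = 0
    for ref, alt in indels:
        if len(alt) > len(ref):
            insertions += 1
        elif len(alt) < len(ref):
            deletions += 1
        # equal-length records (SNVs, MNPs) are ignored
    if deletions == 0:
        raise ValueError("no deletions found; ratio is undefined")
    return insertions / deletions


def in_range(ratio, expected=RARE_RANGE):
    """True when the ratio lies inside the expected (low, high) range."""
    low, high = expected
    return low <= ratio <= high


# Toy rare-indel callset: one insertion and three deletions.
rare = [("A", "AT"), ("AT", "A"), ("ACG", "A"), ("GCC", "G")]
ratio = insertion_deletion_ratio(rare)
print(round(ratio, 2))   # 0.33
print(in_range(ratio))   # True
```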