Genotyping pipeline for short-read Illumina data generated from whole genome sequencing

Scripts for processing raw reads to generate a set of high-quality genotypes, as in "Genomic signatures of extensive inbreeding in Isle Royale wolves, a population on the threshold of extinction" by Robinson et al. (2019). The pipeline is adapted from the GATK3 Best Practices.

Required software:

Other requirements:


1. FastqToSam

Convert raw read data to an unmapped BAM file.
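
As a rough illustration of this step, the sketch below builds a Picard FastqToSam command line for one paired-end library. It is not taken from the repository's scripts: the helper name `fastq_to_sam_cmd`, the jar path, and the file and read-group names are placeholders, and the legacy `KEY=value` argument style is assumed because it matches Picard releases contemporary with GATK3.

```python
def fastq_to_sam_cmd(fq1, fq2, out_bam, sample, read_group, library,
                     picard_jar="picard.jar"):
    """Build a Picard FastqToSam command (hypothetical helper, paired-end input)."""
    return [
        "java", "-jar", picard_jar, "FastqToSam",
        f"FASTQ={fq1}",                  # forward reads
        f"FASTQ2={fq2}",                 # reverse reads
        f"OUTPUT={out_bam}",             # unmapped BAM to create
        f"SAMPLE_NAME={sample}",
        f"READ_GROUP_NAME={read_group}",
        f"LIBRARY_NAME={library}",
        "PLATFORM=illumina",
    ]


# Placeholder file and sample names, for illustration only.
cmd = fastq_to_sam_cmd("s1_R1.fastq.gz", "s1_R2.fastq.gz", "s1_unmapped.bam",
                       sample="s1", read_group="s1_rg1", library="s1_lib1")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.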

2. MarkIlluminaAdapters

Mark Illumina adapter sequences in the unmapped BAM file.

3. AlignCleanBam

Efficiently align reads to a reference genome.

4. MarkDuplicates

Mark (and optionally remove) duplicate reads.
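
The idea behind duplicate marking can be sketched in a few lines. This is a simplified model, not Picard's implementation: it assumes reads are already reduced to a hypothetical dict with `chrom`, `pos`, `strand`, `mate_pos`, and `qual_sum` fields, groups reads that share the same alignment coordinates, and keeps the one with the highest summed base quality.

```python
from collections import defaultdict


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates (simplified sketch)."""
    groups = defaultdict(list)
    for read in reads:
        # Reads with identical coordinates and orientation are duplicate candidates.
        key = (read["chrom"], read["pos"], read["strand"], read["mate_pos"])
        groups[key].append(read)

    duplicates = set()
    for group in groups.values():
        # Keep the read with the highest summed base quality; flag the rest.
        best = max(group, key=lambda r: r["qual_sum"])
        duplicates.update(r["name"] for r in group if r is not best)
    return duplicates


reads = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+", "mate_pos": 350, "qual_sum": 3200},
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+", "mate_pos": 350, "qual_sum": 3500},
    {"name": "r3", "chrom": "chr1", "pos": 180, "strand": "+", "mate_pos": 420, "qual_sum": 3100},
]
print(mark_duplicates(reads))  # {'r1'}
```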

5. RemoveBadReads (optional)

Remove reads with low mapping quality or that do not align as a proper pair (as indicated by SAM flags). This optional step reduces the size of the BAM file by eliminating reads that are not wanted in downstream processing.
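
The filtering logic can be expressed directly on the SAM flag and mapping quality fields. The sketch below is illustrative only: the function name `keep_read` and the default threshold of 30 are assumptions, not values taken from the pipeline. The bit values come from the SAM specification (0x2 for "properly aligned pair", 0x4 for "unmapped").

```python
PROPER_PAIR = 0x2   # each segment properly aligned according to the aligner
UNMAPPED = 0x4      # segment unmapped


def keep_read(flag, mapq, min_mapq=30):
    """Return True if a read passes the proper-pair and mapping-quality checks."""
    if flag & UNMAPPED:
        return False
    if not flag & PROPER_PAIR:
        return False
    return mapq >= min_mapq


print(keep_read(99, 60))   # True: paired, proper pair, first in pair
print(keep_read(65, 60))   # False: paired but not a proper pair
print(keep_read(99, 10))   # False: mapping quality below the threshold
```

With samtools, a roughly comparable selection is `samtools view -b -f 2 -q 30 in.bam`, where `-f` requires flag bits and `-q` sets the minimum mapping quality.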

6. BaseQualityScoreRecalibration

Recalibrate base quality scores so that the reported scores agree with the empirically observed error rates.
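
To make the comparison between reported and empirical quality concrete, the sketch below converts an observed mismatch rate into a Phred-scaled score using Q = -10 * log10(error rate). It is a bare-bones illustration with made-up counts; GATK's recalibration model additionally bins by covariates such as read group and sequence context, which is not shown here.

```python
import math


def empirical_quality(mismatches, observations):
    """Phred-scaled quality implied by an observed mismatch rate."""
    if mismatches <= 0 or observations <= 0:
        # This sketch applies no smoothing, so zero counts are rejected.
        raise ValueError("need positive mismatch and observation counts")
    return -10 * math.log10(mismatches / observations)


# Hypothetical bin: bases reported as Q30 that mismatch the reference
# at known non-variant sites 10 times out of 1000.
reported_q = 30
emp_q = empirical_quality(mismatches=10, observations=1000)
print(reported_q, round(emp_q, 1))  # 30 20.0
```

In this example the reported score overstates base accuracy by 10 Phred units, which is the kind of gap recalibration is meant to close.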

7. HaplotypeCaller

Generate a gVCF file for each BAM file.

8. GenotypeGVCFs

Generate a VCF file from gVCF files.
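
Steps 7 and 8 fit together as per-sample calling followed by joint genotyping. The sketch below builds GATK3-style command lines for both; the helper names, jar path, and file names are placeholders, and the repository's own scripts may pass additional options that are not shown.

```python
def haplotype_caller_cmd(ref, bam, out_gvcf, gatk_jar="GenomeAnalysisTK.jar"):
    """Per-sample calling in gVCF mode (hypothetical helper)."""
    return ["java", "-jar", gatk_jar, "-T", "HaplotypeCaller",
            "-R", ref, "-I", bam,
            "--emitRefConfidence", "GVCF",
            "-o", out_gvcf]


def genotype_gvcfs_cmd(ref, gvcfs, out_vcf, gatk_jar="GenomeAnalysisTK.jar"):
    """Joint genotyping across all per-sample gVCFs (hypothetical helper)."""
    cmd = ["java", "-jar", gatk_jar, "-T", "GenotypeGVCFs", "-R", ref]
    for gvcf in gvcfs:
        cmd += ["--variant", gvcf]   # one --variant argument per sample
    cmd += ["-o", out_vcf]
    return cmd


samples = ["s1", "s2"]
gvcfs = [f"{s}.g.vcf.gz" for s in samples]
for s, g in zip(samples, gvcfs):
    print(" ".join(haplotype_caller_cmd("ref.fa", f"{s}.bam", g)))
print(" ".join(genotype_gvcfs_cmd("ref.fa", gvcfs, "joint.vcf.gz")))
```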

9. TrimAlternates and VariantAnnotator

Remove alleles from the VCF file that don't appear in any genotypes, and add desired annotations to the INFO field.
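
What trimming alternates means for a single site can be shown with a small sketch. It assumes genotypes are already parsed into tuples of allele indices (0 for REF, `None` for a missing call); the function name is hypothetical. A real tool also has to update per-allele fields such as AD and PL, which this sketch ignores.

```python
def trim_alternates(alts, genotypes):
    """Drop ALT alleles that no genotype uses and renumber the rest."""
    used = sorted({a for gt in genotypes for a in gt if a is not None and a > 0})
    # Map old allele indices to new ones; REF (0) always stays 0.
    remap = {old: new for new, old in enumerate(used, start=1)}
    remap[0] = 0
    new_alts = [alts[i - 1] for i in used]
    new_gts = [tuple(None if a is None else remap[a] for a in gt)
               for gt in genotypes]
    return new_alts, new_gts


alts = ["T", "G", "C"]
genotypes = [(0, 3), (3, 3), (0, 0), (None, None)]
print(trim_alternates(alts, genotypes))
# (['C'], [(0, 1), (1, 1), (0, 0), (None, None)])
```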

10. Variant Effect Predictor

Add mutation impact to the INFO field.

11. VariantFiltration and custom filtering

Apply site- and genotype-level filters.
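
The difference between the two filter levels is that a site-level filter labels the whole record, while a genotype-level filter masks individual calls. The sketch below illustrates this on a parsed record; the field names, the filter label `LowQual`, and every threshold are placeholders chosen for the example, not the criteria used by the pipeline.

```python
def filter_site(site, min_qual=30.0, min_dp=6, max_dp=100):
    """Apply one site-level and one genotype-level filter (illustrative thresholds)."""
    genotypes = []
    for g in site["genotypes"]:
        if min_dp <= g["dp"] <= max_dp:
            genotypes.append(g)
        else:
            # Genotype-level filter: mask the call but keep the record.
            genotypes.append({"gt": (None, None), "dp": g["dp"]})
    # Site-level filter: label the whole record.
    status = "PASS" if site["qual"] >= min_qual else "LowQual"
    return {"qual": site["qual"], "filter": status, "genotypes": genotypes}


site = {"qual": 812.4, "genotypes": [{"gt": (0, 1), "dp": 14},
                                     {"gt": (1, 1), "dp": 3}]}
result = filter_site(site)
print(result["filter"])                        # PASS
print([g["gt"] for g in result["genotypes"]])  # [(0, 1), (None, None)]
```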

