QC training developed by Pathogen Informatics at the Wellcome Sanger Institute.
This is a training module on quality control (QC) of sequence data, provided as a set of Jupyter notebooks. Start with the index notebook, which will guide you through the rest of the training.
Key aims of the training include:
- Describe the different NGS data formats available (FASTQ, SAM/BAM, CRAM, VCF/BCF)
- Perform conversions between the different data formats
- Perform a QC assessment of high throughput sequence data
- Identify possible contamination in high throughput sequence data
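As a rough illustration of the kind of QC assessment the aims above refer to, the following Python sketch reads FASTQ records and computes the mean base quality of a read. It is not taken from the notebooks: the function names `parse_fastq` and `mean_phred` are hypothetical, and it assumes four-line FASTQ records with Phred+33 quality encoding.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines.

    Assumes the common four-line layout: '@' header, sequence,
    '+' separator, quality string. Wrapped (multi-line) records
    are not handled.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\r\n")
        plus = next(it, "").rstrip("\r\n")
        qual = next(it, "").rstrip("\r\n")
        if (not header.startswith("@")
                or not plus.startswith("+")
                or len(seq) != len(qual)):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Return the mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return mean(ord(c) - offset for c in qual)


if __name__ == "__main__":
    # A made-up record for illustration only.
    example = ["@read1\n", "ACGT\n", "+\n", "II!!\n"]
    for name, seq, qual in parse_fastq(example):
        print(name, len(seq), mean_phred(qual))
```

In practice, dedicated QC tools report far more than a per-read mean; this only shows how the quality line of a FASTQ record maps to numeric scores.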
Please report any issues on the issues page or by email to path-help@sanger.ac.uk.
The tutorial runs in a container with the following software installed:
- bcftools 1.10.2
- bwa 0.7.17
- kraken 1.1.1
- perl 5.30.0
- picard 2.22.2
- samtools 1.10
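If you want to confirm that the tools listed above are reachable from inside the container, a minimal Python sketch such as the following may help. The helper `missing_tools` is hypothetical, and the executable names are an assumption: they are taken to match the package names in the list, which may not hold for every tool (picard, for example, is sometimes shipped as a jar file rather than a `picard` command).

```python
import shutil

# Assumed executable names, taken from the software list above.
EXPECTED_TOOLS = ["bcftools", "bwa", "kraken", "perl", "picard", "samtools"]


def missing_tools(tools=EXPECTED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected tools were found on PATH.")
```

This only checks that each command can be found; it does not verify the installed versions.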