David Jones edited this page Aug 9, 2021 · 3 revisions

Here you will find basic descriptions of the intent of the generic scripts. Please see the help/manual entries for full details, as these will be the most current source of information.

bam_stats (c)

Generates read and mapping statistics for a BAM file. BAMs with read-group headers are handled appropriately, with the information split into individual rows of the resulting *.bas file (the bas concept comes from the vr-pipe project).

The intent is to capture as many of the commonly requested statistics as possible in a single pass of a BAM file. If you have ideas for new statistics, please create an issue in the tracker.

Data not linked to a read-group is collected into a single bin. Unless you know the file contains only one lane of sequencing, you should consider these statistics suspect.
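The per-read-group binning described above can be illustrated with a minimal Python sketch. This is not the bam_stats implementation (which is written in C and reports many more statistics); it works on SAM-format text lines, and the names `tally_by_read_group`, `read_group_of` and the `NO_RG` label are assumptions made for illustration.

```python
from collections import defaultdict

UNMAPPED = 0x4              # SAM flag bit: segment unmapped
NO_RG = "(no read group)"   # hypothetical label for the catch-all bin

def read_group_of(fields):
    """Return the RG:Z value from a SAM record's optional fields, if any."""
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return NO_RG

def tally_by_read_group(sam_lines):
    """Count total and mapped reads per read group in a single pass."""
    stats = defaultdict(lambda: {"reads": 0, "mapped": 0})
    for line in sam_lines:
        if line.startswith("@"):        # header line, not a read
            continue
        fields = line.rstrip("\n").split("\t")
        row = stats[read_group_of(fields)]
        row["reads"] += 1
        if not int(fields[1]) & UNMAPPED:
            row["mapped"] += 1
    return dict(stats)

# Two reads tagged with lane1 (one unmapped) and one read with no RG tag.
EXAMPLE_SAM = [
    "@RG\tID:lane1",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:lane1",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:lane1",
    "r3\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
]

for group, row in tally_by_read_group(EXAMPLE_SAM).items():
    print(group, row["reads"], row["mapped"])
```

Reads without an RG tag all land in the same row, which is why that row is only trustworthy when the file is known to hold a single lane.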

bamToBw.pl

Generates a BigWig file from BAM/CRAM, with parallel processing where possible. This uses cgpBigWig to do the heavy lifting. Custom filtering of reads is available.

diff_bams (c)

Compares 2 BAM files at the record level. Checks that the stable elements of the header match (SQ entries and their order) and skips potentially unstable header items (PG lines may have different file paths, etc.). Each read is compared for mapping and flag information. You can optionally skip reads with poor MAPQ values, as these can be volatile.
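As a rough illustration of this comparison logic, the Python sketch below works on SAM-format text rather than real BAM files and is not the diff_bams code. The function names, the choice to compare whole @SQ lines, the assumption that both files list reads in the same order, and the rule of skipping a pair only when both records fall below `min_mapq` are all assumptions made for the example.

```python
def stable_header(header_lines):
    """Keep only the @SQ lines, in order; @PG and other lines are ignored."""
    return [h for h in header_lines if h.startswith("@SQ")]

def record_key(fields):
    """Read name, flag, reference name and position of a SAM record."""
    return (fields[0], int(fields[1]), fields[2], int(fields[3]))

def bams_match(header_a, records_a, header_b, records_b, min_mapq=0):
    """Return True if the stable header parts and the records agree."""
    if stable_header(header_a) != stable_header(header_b):
        return False
    if len(records_a) != len(records_b):
        return False
    for line_a, line_b in zip(records_a, records_b):
        a, b = line_a.split("\t"), line_b.split("\t")
        # Assumed rule: ignore the pair when both MAPQ values are poor.
        if int(a[4]) < min_mapq and int(b[4]) < min_mapq:
            continue
        if record_key(a) != record_key(b):
            return False
    return True

# Headers differ only in the @PG command line; r2 maps poorly in both files.
HDR_A = ["@HD\tVN:1.6", "@SQ\tSN:chr1\tLN:1000", "@PG\tID:bwa\tCL:/path/a"]
HDR_B = ["@HD\tVN:1.6", "@SQ\tSN:chr1\tLN:1000", "@PG\tID:bwa\tCL:/path/b"]
RECS_A = ["r1\t0\tchr1\t100\t60\t4M", "r2\t0\tchr1\t300\t3\t4M"]
RECS_B = ["r1\t0\tchr1\t100\t60\t4M", "r2\t0\tchr1\t900\t1\t4M"]

print(bams_match(HDR_A, RECS_A, HDR_B, RECS_B))               # strict comparison
print(bams_match(HDR_A, RECS_A, HDR_B, RECS_B, min_mapq=10))  # skip poor MAPQ
```

In this example the strict comparison fails because r2 is placed differently, while the thresholded comparison passes because r2 has a low MAPQ in both inputs.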

mismatchQc (c)

Applies the mismatch QC marking (the QC vendor fail flag plus an additional mismatchQC tag set to YES) to reads that fail mismatch QC.

mmFlagmodifier (c)

Removes or reinstates the QC vendor fail flag on reads in a BAM/CRAM file when the mismatch QC fail tag is present.
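Both mismatchQc and mmFlagmodifier operate on the QC vendor fail flag, which is bit 0x200 of the SAM FLAG field. The Python sketch below shows only the bit manipulation involved; it is not taken from either tool. The function names and the `has_mismatch_tag` parameter are hypothetical, and the actual name of the mismatch QC tag is not stated here, so it is left out.

```python
QC_FAIL = 0x200  # SAM flag bit: not passing quality controls (vendor fail)

def set_qc_fail(flag):
    """Return the flag with the QC fail bit switched on."""
    return flag | QC_FAIL

def clear_qc_fail(flag):
    """Return the flag with the QC fail bit switched off."""
    return flag & ~QC_FAIL

def modify_flag(flag, has_mismatch_tag, reinstate):
    """Hypothetical helper: only reads carrying the mismatch QC fail tag
    are touched; their QC fail bit is either reinstated or removed."""
    if not has_mismatch_tag:
        return flag
    return set_qc_fail(flag) if reinstate else clear_qc_fail(flag)

flag = 99                                   # paired, proper pair, mate reverse, first in pair
failed = set_qc_fail(flag)                  # mark the read as QC failed
restored = modify_flag(failed, True, reinstate=False)
print(flag, failed, restored)
```

Because only one bit changes, removing and later reinstating the flag leaves every other property of the read untouched.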