SAMtools recipes

Thomas Cokelaer edited this page Apr 28, 2023 · 1 revision

Remove duplicates

If your BAM file has no duplicates marked, you can use samtools to mark and remove them as follows.

First, sort the BAM file by query name::

samtools sort -@ 4 -n -o data.sorted.byquery.bam data.bam

Then run fixmate. The -m option adds the mate score tags that markdup relies on, and -r drops secondary and unmapped reads::

samtools fixmate -m -r data.sorted.byquery.bam fixmate.bam

Sort back into coordinate order::

samtools sort -@ 4 fixmate.bam -o fixmate.sorted.bycoord.bam

Finally, remove the duplicates (with markdup)::

samtools markdup -s -r fixmate.sorted.bycoord.bam data.markdup.bam
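
The four steps above can be chained in a small shell function. This is only a sketch: the names `run` and `dedup_bam`, the `DRY_RUN` variable, and the intermediate file names derived from the output name are assumptions made for illustration and are not part of samtools. With `DRY_RUN=1` the commands are printed instead of executed, which is handy for checking them before a long run.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools): runs the four steps above in order.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: dedup_bam INPUT.bam OUTPUT.bam [THREADS]
dedup_bam() {
    in_bam=$1
    out_bam=$2
    threads=${3:-4}
    # Intermediate files reuse the output name without its .bam suffix (an assumption).
    prefix=${out_bam%.bam}

    # 1. sort by query name, 2. fix mates, 3. sort by coordinate, 4. remove duplicates.
    # Each step runs only if the previous one succeeded.
    run samtools sort -@ "$threads" -n -o "$prefix.byquery.bam" "$in_bam" &&
    run samtools fixmate -m -r "$prefix.byquery.bam" "$prefix.fixmate.bam" &&
    run samtools sort -@ "$threads" -o "$prefix.bycoord.bam" "$prefix.fixmate.bam" &&
    run samtools markdup -s -r "$prefix.bycoord.bam" "$out_bam"
}
```

For example, `DRY_RUN=1 dedup_bam data.bam data.markdup.bam` prints the four samtools commands, and dropping `DRY_RUN=1` runs them. The intermediate BAM files are left in place so that a failed step can be inspected.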