
isabella232/doppelmark

 
 


doppelmark duplicate marking tool

doppelmark is a high-performance tool for marking PCR and optical (pad-hopping) duplicate reads in sequencing data. It is functionally equivalent to the Picard and sambamba duplicate marking tools, but runs much more efficiently and takes advantage of multi-core hardware. For some workloads and hardware, doppelmark is 100x faster than Picard and 7x faster than sambamba.

doppelmark achieves its speedup by dividing the input into shards and processing the shards in parallel. Processing a shard covers input decompression, duplicate marking, and compression of the resulting output data. Duplicates are detected without sorting all records. For a detailed description of the algorithm and design, see doc.go.
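The shard-and-parallelize idea can be illustrated with a small Go sketch. This is not doppelmark's implementation (doc.go describes that). It is a simplified model under stated assumptions: the `read` type, the `markShard`, `splitByPos`, and `markAll` functions, and the "keep the highest-quality read per position and strand" rule are all hypothetical. The sketch also ignores read pairs, optical duplicates, compression, and reads whose mates fall in another shard.

```go
package main

import (
	"fmt"
	"sync"
)

// read is a hypothetical, heavily simplified alignment record.
type read struct {
	Name    string
	Pos     int  // 5' alignment position
	Reverse bool // true if aligned to the reverse strand
	Qual    int  // summed base quality, used to pick the read to keep
	Dup     bool // set by markShard
}

// key identifies reads that are treated as duplicates of each other
// in this simplified model.
type key struct {
	pos     int
	reverse bool
}

// markShard flags duplicates within one shard. Among reads sharing a key,
// the highest-quality read is kept; on a tie, the earlier read is kept.
func markShard(shard []read) {
	best := make(map[key]int) // key -> index of the best read seen so far
	for i := range shard {
		k := key{shard[i].Pos, shard[i].Reverse}
		j, seen := best[k]
		switch {
		case !seen:
			best[k] = i
		case shard[i].Qual > shard[j].Qual:
			shard[j].Dup = true
			best[k] = i
		default:
			shard[i].Dup = true
		}
	}
}

// splitByPos partitions coordinate-sorted reads into shards that each cover
// shardSize positions, so reads at the same position share a shard.
func splitByPos(reads []read, shardSize int) [][]read {
	var shards [][]read
	start := 0
	for i := 1; i <= len(reads); i++ {
		if i == len(reads) || reads[i].Pos/shardSize != reads[start].Pos/shardSize {
			shards = append(shards, reads[start:i])
			start = i
		}
	}
	return shards
}

// markAll processes the shards with a fixed pool of worker goroutines.
// Shards are disjoint sub-slices, so workers never touch the same record.
func markAll(reads []read, shardSize, workers int) {
	if workers < 1 {
		workers = 1
	}
	ch := make(chan []read)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := range ch {
				markShard(s)
			}
		}()
	}
	for _, s := range splitByPos(reads, shardSize) {
		ch <- s
	}
	close(ch)
	wg.Wait()
}

func main() {
	reads := []read{
		{"a", 100, false, 30, false},
		{"b", 100, false, 40, false},
		{"c", 100, true, 20, false},
		{"d", 2500, false, 10, false},
		{"e", 2500, false, 10, false},
	}
	markAll(reads, 1000, 4)
	for _, r := range reads {
		fmt.Println(r.Name, r.Dup)
	}
}
```

Because each shard is a disjoint slice of the input, the workers need no locking around the records themselves. Only a per-shard map is kept in memory, so no global sort or global index is required in this model.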


About

NGS duplicate marking


Languages

  • Go 100.0%