Improve concurrency in multithreaded BGZF writer (bgzf_mt) #51
This PR includes two commits that significantly increase compression throughput with the multithreaded BGZF writer (bgzf_mt), improving the performance of samtools sort, merge, and view -b in multithreaded mode.
The first commit yields most of the performance benefit: the writer keeps ingesting data while multithreaded compression of the previous batch proceeds in the background. This change adds little complexity and should be fairly uncontroversial.
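To illustrate the idea behind the first commit, here is a minimal Python sketch of overlapping ingestion with background compression. It is not the bgzf_mt code: the class `OverlappedWriter`, the constants `BLOCK_SIZE` and `BATCH_BLOCKS`, and the use of plain gzip members in place of real BGZF blocks (which carry an extra header field and an EOF marker) are all assumptions made for the example.

```python
import gzip
import io
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024   # hypothetical block size for this sketch
BATCH_BLOCKS = 4         # hypothetical number of blocks per batch


class OverlappedWriter:
    """Toy writer: keeps accepting data while the previous batch compresses."""

    def __init__(self, sink, threads=4):
        self.sink = sink
        self.pool = ThreadPoolExecutor(max_workers=threads)
        self.pending = bytearray()   # bytes not yet handed to the pool
        self.in_flight = []          # futures for the batch being compressed

    def write(self, data):
        self.pending += data
        while len(self.pending) >= BLOCK_SIZE * BATCH_BLOCKS:
            self._submit_batch(BLOCK_SIZE * BATCH_BLOCKS)

    def _submit_batch(self, nbytes):
        batch = bytes(self.pending[:nbytes])
        del self.pending[:nbytes]
        # Only now wait for the previous batch. Until this point the caller
        # was free to keep filling self.pending while the pool compressed.
        self._drain()
        self.in_flight = [
            self.pool.submit(gzip.compress, batch[i:i + BLOCK_SIZE])
            for i in range(0, len(batch), BLOCK_SIZE)
        ]

    def _drain(self):
        # Write compressed blocks in submission order, from the calling thread.
        for fut in self.in_flight:
            self.sink.write(fut.result())
        self.in_flight = []

    def close(self):
        if self.pending:
            self._submit_batch(len(self.pending))
        self._drain()
        self.pool.shutdown()


sink = io.BytesIO()
writer = OverlappedWriter(sink)
payload = b"ACGT" * 500_000
writer.write(payload)
writer.close()
# Concatenated gzip members decompress back to the original stream.
assert gzip.decompress(sink.getvalue()) == payload
```

In this sketch all writes to the sink still happen on the calling thread; only compression runs in the background.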
The second commit, which yields a smaller improvement, is a bit riskier because it introduces writing to the file handle from different threads. As the code stands, the writes happen to be serialized without an explicit mutex, but that ordering is implicit and could become a maintenance concern going forward.
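For comparison, here is what an explicit ordering guard could look like, again as a Python sketch rather than anything in this PR. The class `OrderedSink` and its sequence-number scheme are hypothetical; the point is only that a condition variable makes the write order a stated invariant instead of an emergent one.

```python
import io
import threading


class OrderedSink:
    """Lets many threads write to one handle, strictly in sequence order."""

    def __init__(self, handle):
        self.handle = handle
        self.next_seq = 0
        self.cond = threading.Condition()

    def write(self, seq, payload):
        with self.cond:
            # Block until every earlier sequence number has been written.
            self.cond.wait_for(lambda: self.next_seq == seq)
            self.handle.write(payload)
            self.next_seq += 1
            self.cond.notify_all()


out = io.BytesIO()
ordered = OrderedSink(out)
blocks = [bytes([65 + i]) * 3 for i in range(8)]
# Start the threads in reverse so they reach write() out of order.
threads = [
    threading.Thread(target=ordered.write, args=(i, blocks[i]))
    for i in reversed(range(len(blocks)))
]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert out.getvalue() == b"".join(blocks)
```

The cost of such a guard is one lock acquisition per block, which may or may not matter relative to compression time.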