Make calmd run faster on non-position-sorted data #1723

Merged
whitwham merged 1 commit into samtools:develop on Oct 3, 2022


daviesrob
Member

Adds caching so that calmd doesn't continuously have to reread reference sequences if the input does not go through them sequentially. As caching requires extra memory, it is only enabled if calmd detects an attempt to go backwards in the reference dictionary. Thus processing a position-sorted file uses no more space than before, while processing other orderings is now much faster.

Fixes #1595
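
For readers who want a feel for the approach, here is a minimal sketch of "only cache once the input steps backwards". It is not the code from this commit: `ref_cache`, `ref_cache_get`, `demo_load` and the `n_loads` counter are hypothetical names invented for illustration, and `demo_load` stands in for whatever expensive call actually fetches a reference sequence. The caller is assumed to skip unmapped reads (negative reference ids).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loader type: returns a malloc'd sequence for reference `tid`. */
typedef char *(*load_fn)(int tid, void *data);

typedef struct {
    int n_ref;      /* number of references in the dictionary */
    int last_tid;   /* most recent reference id requested, -1 at start */
    int caching;    /* stays 0 until a backwards step is seen */
    char *current;  /* the single sequence held while not caching */
    char **cache;   /* one slot per reference, allocated only when caching */
    load_fn load;
    void *load_data;
    int n_loads;    /* number of loader calls, kept only for illustration */
} ref_cache;

static void ref_cache_init(ref_cache *rc, int n_ref, load_fn load, void *data) {
    memset(rc, 0, sizeof(*rc));
    rc->n_ref = n_ref;
    rc->last_tid = -1;
    rc->load = load;
    rc->load_data = data;
}

/* Return the sequence for `tid`; the cache owns the returned pointer. */
static char *ref_cache_get(ref_cache *rc, int tid) {
    assert(tid >= 0 && tid < rc->n_ref);

    /* First backwards step: switch caching on and keep what we already hold. */
    if (!rc->caching && tid < rc->last_tid) {
        char **slots = calloc((size_t)rc->n_ref, sizeof(*slots));
        if (slots) {  /* on allocation failure, carry on without caching */
            slots[rc->last_tid] = rc->current;
            rc->current = NULL;
            rc->cache = slots;
            rc->caching = 1;
        }
    }

    if (rc->caching) {
        if (!rc->cache[tid]) {
            rc->cache[tid] = rc->load(tid, rc->load_data);
            rc->n_loads++;
        }
        rc->last_tid = tid;
        return rc->cache[tid];
    }

    /* Not caching: hold exactly one sequence, as for position-sorted input. */
    if (tid != rc->last_tid || !rc->current) {
        free(rc->current);
        rc->current = rc->load(tid, rc->load_data);
        rc->n_loads++;
    }
    rc->last_tid = tid;
    return rc->current;
}

static void ref_cache_destroy(ref_cache *rc) {
    if (rc->cache) {
        for (int i = 0; i < rc->n_ref; i++) free(rc->cache[i]);
        free(rc->cache);
    }
    free(rc->current);
}

/* Hypothetical loader used for demonstration: builds a tiny fake sequence. */
static char *demo_load(int tid, void *data) {
    (void)data;
    char *s = malloc(32);
    if (s) snprintf(s, 32, "SEQ%d", tid);
    return s;
}
```

With this sketch, requesting references in the order 0, 0, 1, 2 calls the loader three times and never allocates the slot array. A later request for reference 1 is the first backwards step, so caching switches on at that point and subsequent revisits of 1 or 2 are served from memory.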

@whitwham whitwham merged commit 06bb098 into samtools:develop Oct 3, 2022
@daviesrob daviesrob deleted the calmd_name_sorted branch October 3, 2022 15:32

Successfully merging this pull request may close these issues.

samtools calmd is pretty slow