Reduce GC to speed up unzip operations #57

Merged 1 commit on Mar 16, 2022

cmdcolin
Contributor

This can significantly speed up unzipping of large BAM files.

Tested with a file created for jb2profile, generated with wgsim as deep-coverage (2400x) short reads:

With this branch: 14 seconds
Without this branch: 78 seconds

The speedup comes from avoiding slice in favor of subarray and working with the lower-level Uint8Array API (Buffer would probably work too, but Uint8Array is reliable and not polyfilled).

With this change, very little GC occurs while unzipping.
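
For readers unfamiliar with the difference, here is a minimal sketch of why subarray produces less garbage than slice. It is an illustration only, not the code from this PR; `sumChunks` and the sample data are hypothetical. On a Uint8Array, slice allocates a new backing buffer and copies the bytes, while subarray only creates a small view object over the existing buffer.

```typescript
// Illustration only; not code from this PR.
const data = new Uint8Array([10, 20, 30, 40, 50, 60])

// slice() allocates a new 3-byte buffer and copies bytes 2..4 into it
const copied = data.slice(2, 5)

// subarray() creates a view over the same underlying ArrayBuffer (no byte copy)
const viewed = data.subarray(2, 5)

console.log(copied.buffer === data.buffer) // false
console.log(viewed.buffer === data.buffer) // true

// Hypothetical helper: walk a buffer in fixed-size chunks without copying.
// Each iteration creates only a lightweight view, not a new byte buffer.
function sumChunks(input: Uint8Array, chunkSize: number): number[] {
  const sums: number[] = []
  for (let pos = 0; pos < input.length; pos += chunkSize) {
    // subarray clamps the end index to input.length
    const chunk = input.subarray(pos, pos + chunkSize)
    sums.push(chunk.reduce((total, byte) => total + byte, 0))
  }
  return sums
}

console.log(sumChunks(data, 4)) // [ 100, 110 ]
```

One caveat of the view approach: because a subarray shares memory with its source, writes through the view are visible in the original array, so it is only a safe substitute for slice when the caller does not rely on having an independent copy.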

@cmdcolin cmdcolin changed the title Use subarray instead of slice to speed up bgzf-filehandle Reduce GC to speed up unzip operations Mar 16, 2022
@cmdcolin cmdcolin merged commit 92cc043 into master Mar 16, 2022
@cmdcolin cmdcolin deleted the subarray_vs_slice branch March 16, 2022 15:26
@rbuels
Contributor

rbuels commented Mar 16, 2022

Nice work!

@cmdcolin
Contributor Author

Interestingly, this doesn't have an enormous impact on the jb2profile results. It was 5x faster when I was zoomed in very far (few bp per px), but when zoomed out even to 1kbp it becomes pretty slow again (which is probably related to factors other than this PR).
