Paired read matcher can use enormous memory for large input with many chimeric reads #69

Closed
bolosky opened this issue Feb 17, 2016 · 1 comment

bolosky commented Feb 17, 2016

The paired read matcher reads an input SAM/BAM file in order and emits matched pairs of reads. For a sorted input with lots of chimerically mapped reads, the two ends of a pair can be far apart in the file, and in the interim SNAP stores the first end in memory (not only uncompressed, but in a format that is actually pretty wasteful of buffer space).

This can use an inordinate amount of memory for large input files with a high chimeric read fraction. We will need to find some way to mitigate this, probably by spilling to disk.
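
To make the spill-to-disk idea concrete, here is a minimal sketch in Python. It is not SNAP's code and not necessarily how the eventual fix works; `match_pairs`, `max_pending`, and `n_buckets` are hypothetical names, and reads are modeled as simple `(name, record)` tuples rather than real SAM/BAM records. The assumption is that once too many first ends are waiting, the oldest ones are written to temporary bucket files chosen by a hash of the read name, so both ends of any spilled pair land in the same bucket and can be matched in a second pass.

```python
import os
import pickle
import tempfile
import zlib


def match_pairs(reads, max_pending=1_000_000, n_buckets=64):
    """Yield (first_end, second_end) records from an iterable of (name, record).

    Illustrative only: at most `max_pending` unmatched reads stay in memory;
    older ones are spilled to bucket files and matched after the input ends.
    Reads whose mate never appears are silently dropped.
    """
    pending = {}  # read name -> record still waiting for its mate
    with tempfile.TemporaryDirectory() as spill_dir:
        buckets = [
            open(os.path.join(spill_dir, f"bucket{i}.bin"), "w+b")
            for i in range(n_buckets)
        ]
        try:
            def spill(name, record):
                # Both ends of a pair share a name, so they hash to one bucket.
                index = zlib.crc32(name.encode()) % n_buckets
                pickle.dump((name, record), buckets[index])

            for name, record in reads:
                if name in pending:
                    # Mate is still in memory: emit the pair right away.
                    yield pending.pop(name), record
                    continue
                pending[name] = record
                if len(pending) > max_pending:
                    # Dicts keep insertion order, so this is the oldest read,
                    # the one most likely to have a distant (chimeric) mate.
                    oldest = next(iter(pending))
                    spill(oldest, pending.pop(oldest))

            # Anything left may have a mate that was already spilled.
            for name, record in pending.items():
                spill(name, record)
            pending.clear()

            # Second pass: each bucket is matched on its own in memory.
            for bucket in buckets:
                bucket.seek(0)
                waiting = {}
                while True:
                    try:
                        name, record = pickle.load(bucket)
                    except EOFError:
                        break
                    if name in waiting:
                        yield waiting.pop(name), record
                    else:
                        waiting[name] = record
        finally:
            for bucket in buckets:
                bucket.close()
```

In this sketch, pairs whose ends are close together are emitted immediately in input order, while pairs that were spilled come out at the end in bucket order. The tradeoff is that memory is bounded by `max_pending` plus the largest single bucket, at the cost of extra disk I/O for the distant pairs.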

bolosky commented Nov 24, 2020

Fixed in 1.0.

@bolosky bolosky closed this as completed Nov 24, 2020