Refactored GenomicAnnotation to reduce memory usage #395
…notation in memory
Oh, I missed this issue. Why would memory usage double when reading from the pickled object? Isn't the pickled object just the loaded annotation, saved?
I looked at this the whole morning but still don't have a good conclusion. It seems to happen only with the annotation file, not the genome FASTA: reading directly from FASTA uses just as much memory as loading from the pickled one.
Wow, interesting. Maybe a very specific data structure is not agreeing with pickling...
Quoting from the PR that I accidentally opened for no reason..
I think we can close #394 for now. We should optimize the memory usage further, but it is not the top priority right now.
sorry i forgot about this. no thoughts whatsoever XD sounds like work
The `GenomicAnnotation` class was refactored slightly. When loading data from GTF, only the useful annotation information is kept. All features of the same transcript now share the same ID objects by reference. The annotation itself uses about 6.5 GB of memory when read directly from GTF and 10.5 GB when loaded from the pickled object. Issue #394 is still open.
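
For readers unfamiliar with the "same object by reference" idea, here is a minimal sketch of how repeated ID strings can be pooled so that every feature of a transcript points at one shared `str` object. The names `Feature`, `load_features`, `id_pool`, and `make_id` are hypothetical and chosen only for illustration; this is not the code in this PR, and the real `GenomicAnnotation` fields may differ.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical stand-in for one annotation record (e.g. an exon)."""
    transcript_id: str
    start: int
    end: int


def load_features(records):
    """Build features, reusing one str object per distinct transcript ID."""
    id_pool = {}  # maps an ID value to the first str object seen with that value
    features = []
    for transcript_id, start, end in records:
        # setdefault returns the pooled object if an equal ID was seen before,
        # so the freshly parsed duplicate string can be garbage collected.
        shared_id = id_pool.setdefault(transcript_id, transcript_id)
        features.append(Feature(shared_id, start, end))
    return features


def make_id(n):
    """Simulate parsing: each call builds a new str, as reading a line would."""
    return "ENST" + str(n).zfill(11)


records = [(make_id(1), 100, 200), (make_id(1), 300, 400), (make_id(2), 50, 80)]
features = load_features(records)

# Both features of the first transcript now reference the same ID object.
print(features[0].transcript_id is features[1].transcript_id)
```

With millions of features, storing one ID string per transcript instead of one per feature can save a noticeable amount of memory, which is the kind of saving the refactoring aims for. This sketch does not explain the extra memory seen when loading from the pickled object.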