-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated Scheme and Sequence type to summary and optimized MLST to use thread pools #85
Conversation
This is awesome @jennifertran, thanks. I'll try to find some time to look over this soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is great @jennifertran, thanks so much for taking a look into improving the scheduling of mlst
👍.
There's a few comments I have for you listed below, but overall this is great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jennifertran for making the changes. I just have a few more comments.
This is awesome. Thanks so much @jennifertran. It works great now, and is much much faster. 👍 |
Based on Issue #81 and #79
Problem
Based on the issue #81 , since the
mlst --threads
command wasn't doing what we expected to do, there was still a huge bottleneck within the code since each file has to go through mlst one at a time, it wasn't utilizing it resources enough.Solution
Partition the files evenly based on the number of threads available, instantiate the threads in a thread pool and assign an mlst instance to a thread.
Implementation
The implementation is pretty similar to how blast does it. It uses the
ThreadPoolExecutor
class to assign the thread to a process.Testing
Been checking the
time
system instaramr
to see if there was a significant performance change given 50 files and use thetop
command to see if the CPU's were being utilized.