Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated Scheme and Sequence type to summary and optimized MLST to use thread pools #85

Merged
merged 4 commits into from
Jul 30, 2019

Conversation

jennifertran
Copy link
Contributor

@jennifertran jennifertran commented Jun 29, 2019

Based on Issue #81 and #79

Problem

Based on the issue #81 , since the mlst --threads command wasn't doing what we expected to do, there was still a huge bottleneck within the code since each file has to go through mlst one at a time, it wasn't utilizing it resources enough.

Solution

Partition the files evenly based on the number of threads available, instantiate the threads in a thread pool and assign an mlst instance to a thread.

Implementation

The implementation is pretty similar to how blast does it. It uses the ThreadPoolExecutor class to assign the thread to a process.

Testing

Been checking the time system in staramr to see if there was a significant performance change given 50 files and use the top command to see if the CPU's were being utilized.

@apetkau
Copy link
Member

apetkau commented Jun 29, 2019

This is awesome @jennifertran, thanks. I'll try to find some time to look over this soon.

Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is great @jennifertran, thanks so much for taking a look into improving the scheduling of mlst 👍.

There's a few comments I have for you listed below, but overall this is great.

staramr/blast/JobHandler.py Show resolved Hide resolved
staramr/blast/JobHandler.py Outdated Show resolved Hide resolved
staramr/blast/JobHandler.py Outdated Show resolved Hide resolved
staramr/results/AMRDetectionSummary.py Show resolved Hide resolved
@jennifertran jennifertran self-assigned this Jul 25, 2019
Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jennifertran for making the changes. I just have a few more comments.

staramr/blast/JobHandler.py Show resolved Hide resolved
staramr/blast/JobHandler.py Outdated Show resolved Hide resolved
staramr/blast/JobHandler.py Outdated Show resolved Hide resolved
@apetkau
Copy link
Member

apetkau commented Jul 30, 2019

This is awesome. Thanks so much @jennifertran. It works great now, and is much much faster. 👍

@apetkau apetkau merged commit e6abbb2 into development Jul 30, 2019
@apetkau apetkau deleted the mlst_summary branch July 30, 2019 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants