multithreads running problem #161

Closed

JaneDY opened this issue Jun 25, 2019 · 8 comments

JaneDY commented Jun 25, 2019

Hello!
I set --cpu 6 when I qsub my hmmscan command, but it runs on only 1 CPU. Does anyone know why, and how to run the command in parallel? Thanks a lot for any help.

cryptogenomicon (Member) commented Jun 25, 2019

It's not clear from your question whether you mean that your job was literally allocated only 1 core (by your cluster management software), or that you looked at CPU utilization statistics and saw ~100%, not ~600%.

If the former: see the documentation for your cluster management software (i.e. qsub). You have to give the --cpu <n> option to hmmscan, and you also have to give an argument to qsub to request cores. This argument will probably be something like -pe smp <n>.
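For example, a minimal sketch of that pairing, assuming a Grid Engine cluster whose parallel environment is named `smp` (the PE name varies by site, and `Pfam-A.hmm` / `query.fasta` are placeholder file names):

```bash
#!/bin/bash
# Sketch of an SGE submission script; check your site's docs for the
# actual parallel environment name ("smp" here is an assumption).
#$ -pe smp 2         # ask the scheduler for 2 cores...
#$ -cwd

# ...and keep hmmscan's --cpu in step with what was requested.
hmmscan --cpu 2 Pfam-A.hmm query.fasta > query.out
```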

If the latter: hmmscan is I/O-intensive, and unless you have atypically fast disks, it will typically saturate these days at 1-2 cores, making it usually pointless (or even counterproductive) to ask for more. The default of 2 cores is set to be about right for typical use. I wouldn't recommend trying to use 6 cores (unless you have disks that can sustain ~3-6x typical SSD read speeds).

raw937 commented Apr 8, 2021

Hello,

I'm having a similar issue. I am trying to give it more --cpu, and it appears to make no difference whether I use 1 or 28; I get no speedup. I am unsure whether the multithreading is working. Do I need to configure it differently?

cryptogenomicon (Member) commented

Try 2. Above 2, there will typically be rapidly diminishing returns, depending on your hardware.

raw937 commented Apr 9, 2021

Thank you for your reply. It's an honor to discuss this with you.
We mostly use Slurm. Should we still include the qsub options?
We are hoping to use more threads to get a speedup for testing.
Could we use 2 nodes with max threads?

Again thank you so much for your time.

npcarter (Member) commented Apr 9, 2021 via email

If you plot hmmsearch/phmmer/nhmmer search time as a function of the number of cores (--cpu) used, going from one to two typically cuts search time by almost 50%. Going from 2 to 3 typically gives very little benefit, and after that search time tends to go up somewhat as you add cores.

The reason for this is that converting the raw data in a FASTA file into the data structures HMMER uses takes about half as much time as searching a database, and HMMER 3 only allocates one thread to processing the database file. Therefore, once you have two worker threads (--cpu 2) consuming the data that the parsing thread generates, adding more worker threads doesn't help, as they wind up spending all of their time waiting for the parsing thread to generate sequences for them to search. As the number of worker threads gets large, performance will decrease below the 2-thread point, because the worker threads waiting for data from the parsing thread create contention for lock data structures, which slows things down.

The problem is even worse for hmmscan, as it takes about 200x as much data to represent an HMM as it does to represent a sequence of the same length. As a result, hmmscan is performance-limited by the time to read and parse its input database, and typically sees no benefit from additional worker threads.

Given that, the best way to take advantage of machines with many CPU cores is to run multiple searches in parallel, each using 1-2 worker threads. That's easy if you have many searches to do. If you have few searches to perform, you could try chopping your database into pieces that you search simultaneously and using the -Z option to set the database size to the size of your original database.

This I/O limit on performance is something of an artifact of how much disk speeds have improved since HMMER 3 was released. At the time it was released, disk bandwidths were typically about 200 MB/sec, and that was the bottleneck on HMMER's performance. Today, SSDs with bandwidths of 5-7 GB/sec are common, which makes the method we use to parse database files the bottleneck. For HMMER 4, we have a combination of a new database format and a better parser design, which should break this bottleneck, but HMMER 4 is not ready for general use.

Hope that helps,

-Nick
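To make the chop-and-search-in-parallel suggestion concrete, here is a minimal sketch, assuming a Slurm cluster (per the question above). The file names (`profile.hmm`, `targets.fasta`), the chunk count, and the use of seqkit as the FASTA splitter are illustrative assumptions, not part of HMMER:

```bash
#!/bin/bash
# Sketch: search one large sequence database as 4 chunks in parallel,
# each chunk as its own 2-thread Slurm job.
# Assumptions: seqkit is installed; profile.hmm and targets.fasta are
# placeholder file names.

DB=targets.fasta
NSEQ=$(grep -c '^>' "$DB")   # sequence count of the FULL database, for -Z

# Split the FASTA into 4 roughly equal parts (any splitter works;
# seqkit split2 is shown here as one option).
seqkit split2 -p 4 -O chunks "$DB"

# -Z pins the effective database size, so each chunk's E-values come
# out the same as a single search over the whole database.
for chunk in chunks/*.fasta; do
    sbatch --cpus-per-task=2 --wrap \
      "hmmsearch --cpu 2 -Z $NSEQ profile.hmm $chunk > ${chunk%.fasta}.out"
done
```

Because -Z fixes the statistics, the per-chunk outputs can be concatenated and re-sorted afterwards as if they came from one search.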

raw937 commented Apr 9, 2021

Thank you very much Nick!

nylander added a commit to nylander/birdscanner2 that referenced this issue May 28, 2024
Change default threads for nhmmer from 10 to 2. See <EddyRivasLab/hmmer#161>
jieegao commented Oct 11, 2024

> For HMMER 4, we have a combination of a new database format and a better parser design, which should break this bottleneck, but HMMER 4 is not ready for general use.

Is the HMMER 4 develop branch ready for general use now? How could I use it for multithreaded runs? Thanks.

cryptogenomicon (Member) commented

Sorry, no, HMMER4 is not ready.
