FATAL: J state unsupported #238
Comments
HMMER depends on IEEE754-compliant floating point arithmetic, and |
Thanks! This is a helpful explanation. I'd used -ffast-math because it afforded a small 1-2% speedup (I'm trying to squeeze as much performance as I can out of HMMER, since it is a core component of a few QC and annotation pipelines I'm kicking the tires on).

Indeed, once -ffast-math is removed, the error disappears. Interestingly, every other combination of compiler optimization flags I've tried, including code profiling options and even the Intel compiler, ran without this issue. In practice I've rarely come across a case that depends so intimately on IEEE754, but they clearly exist (compiling glibc is another prominent example: it stops in its tracks without compiling if -ffast-math is detected)! Feel free to close this.

The only other interesting problem I've had is when HMMER is using more than 128 threads on a system, split across HMMER instances. I get strange non-deterministic segfaults, even with the standard conda build, whenever more than 128 threads are in use in total (across concurrent hmmsearch runs, not within a single instance of hmmsearch). But since I can't fathom what threading model could possibly lead to this behavior (signed char as the thread count? Or uint8_t but not accounting for the I/O thread?), and because I haven't seen other reports of this with the standard conda build, I am hesitant to report the behavior without clear confirmation that it is indeed an issue with HMMER and not some other aspect of the pipeline/stack.
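A minimal sketch of why -ffast-math can matter for this kind of code, written in Python (whose floats follow IEEE 754) rather than HMMER's C. It assumes, for illustration only, a log-space scorer that uses -inf to mark impossible steps; `logsum` and `path_score` are hypothetical names, not HMMER functions. GCC's -ffast-math turns on -ffinite-math-only, which lets the compiler assume infinities and NaNs never occur, so comparisons like the ones below may be optimized away in compiled C. The thread does not confirm that this is the exact mechanism behind the "J state unsupported" error.

```python
import math

NEG_INF = float("-inf")  # log(0): marks an impossible step (illustrative assumption)


def logsum(a, b):
    """Return log(exp(a) + exp(b)), treating -inf as log(0)."""
    # These equality checks against -inf are exactly the kind of test that a
    # finite-math-only compiler is allowed to assume is always false.
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def path_score(step_scores):
    """Sum log-space step scores; one -inf step makes the whole path -inf."""
    total = 0.0
    for score in step_scores:
        total += score  # IEEE 754: finite + (-inf) == -inf
    return total


if __name__ == "__main__":
    print(path_score([-1.0, NEG_INF, -0.5]))  # an impossible path
    print(logsum(NEG_INF, -2.0))              # the impossible branch is ignored
```

Under strict IEEE 754 rules the impossible path stays impossible and never wins a comparison against a real score. Once the compiler is told that infinities cannot occur, that guarantee is gone.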
One possibility for why you’re seeing unpredictable segfaults when running many threads is that the amount of memory used in HMMER’s forward and backward stages is highly variable, growing as the product of sequence length and HMM length. This can cause a machine to run out of RAM on some runs but not others, depending on whether a number of high-RAM computations happen at the same time or not.
Nick Carter
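A rough sketch of the scaling described above, where memory grows as the product of sequence length and HMM length. The function names are hypothetical, and the defaults for `cells_per_position` and `bytes_per_cell` are assumptions made for illustration (three 4-byte values per matrix cell); they are not figures taken from the thread.

```python
def dp_matrix_bytes(seq_len, hmm_len, cells_per_position=3, bytes_per_cell=4):
    """Estimate bytes for one full dynamic-programming matrix.

    The matrix has (seq_len + 1) rows and (hmm_len + 1) columns, so the
    estimate grows with the product of the two lengths.
    """
    return (seq_len + 1) * (hmm_len + 1) * cells_per_position * bytes_per_cell


def concurrent_peak_gib(jobs, **kwargs):
    """Estimate peak GiB if every (seq_len, hmm_len) job in `jobs` peaks at once."""
    total = sum(dp_matrix_bytes(seq_len, hmm_len, **kwargs) for seq_len, hmm_len in jobs)
    return total / 2**30


if __name__ == "__main__":
    # Hypothetical workload: 128 concurrent searches, each against a
    # 10,000-residue sequence with a 2,000-position model.
    print(concurrent_peak_gib([(10_000, 2_000)] * 128))
```

Because the estimate is a product, a handful of unusually long sequences hitting long models at the same moment can dominate the total, which is why the failure would look random from run to run.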
Interesting, what kind of RAM use are we talking about here (as a ballpark)? My system has 2TB of RAM and it is memory-defragmented before any HPC run (e.g. all caches are dropped and no other processes/ramdisks are loaded beyond the basic Fedora OS). Working disk space is also large at ~90TB free (in case of large temporary files). On laptops with 4GB RAM running 8 instances (4C/8T CPU) of the prokka and checkm pipelines I'm using this for, I haven't run into this issue. This new rig has 500x the memory for 32x the number of processes. I should try with just 129 vs 128 processes and see if there is a definite break there.
Yeah, with that much RAM I doubt you're having that sort of out-of-RAM
error. On the searches I run, I see some that need more than 16GB
single-threaded, but I doubt you'd see enough of those searches happening
at the same time to overwhelm a 2TB machine.
What sorts of core counts per run are you using? HMMER 3 doesn't get much
performance benefit from more than 2 cores per search due to file parsing
limitations. We're working on that for HMMER 4, but using more than 2
cores/search on HMMER 3 is generally a waste.
-Nick
Thanks, this is great context. Yes, I had been running up to 256 separate instances of hmmsearch --cpu 1. (This way it multi-threads quite well indeed!) But as there is another hidden I/O thread running anyway, and I can stagger other processes to run concurrently, I now limit runs to 128 instances of hmmsearch --cpu 1. Because it is all running asynchronously and being aggregated in the background, there is minimal (or negative!) performance loss from dropping hyperthreading!
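One way to cap the number of concurrent single-threaded instances, and to probe whether 128 vs 129 is a real breaking point, could look like the sketch below. It is an assumption about how such a driver might be written, not the pipeline actually used in this thread: `run_capped`, `hmmsearch_cmd`, `count_segfaults`, and the file paths are hypothetical. On POSIX systems, `subprocess` reports a child killed by a signal as a negative return code, so a segfault shows up as `-signal.SIGSEGV`.

```python
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor


def hmmsearch_cmd(hmm_path, seq_path, out_path):
    """Build one single-threaded hmmsearch command line (paths are placeholders)."""
    return ["hmmsearch", "--cpu", "1", "-o", out_path, hmm_path, seq_path]


def run_capped(commands, max_concurrent=128):
    """Run each command as its own process, at most `max_concurrent` at a time.

    Returns the exit codes in the same order as `commands`.
    """
    def run_one(cmd):
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return done.returncode

    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(run_one, commands))


def count_segfaults(exit_codes):
    """Count processes that were killed by SIGSEGV (POSIX return-code convention)."""
    return exit_codes.count(-signal.SIGSEGV)
```

Running the same command list once with `max_concurrent=128` and once with `max_concurrent=129`, then comparing `count_segfaults` on the results, would show whether the failures really begin at that boundary.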
Actually, this led to some head-scratching, so I decided to investigate what was going on with my hmmscan threading, where running fewer instances would yield higher performance... then I spotted it. Opened a new discussion, #240.
I've compiled v3.3.2 of the source code with native compiler flags for the Zen v2 architecture. I've also added -ffast-math.
I get this occasionally while running:
FATAL: J state unsupported
Fatal exception (source file p7_trace.c, line 163):
realloc for size 0 failed
sh: line 1: 3642453 Aborted (core dumped) hmmalign --outformat Pfam /tmpus/b26dfdff-eb39-47f3-8950-3ae0f3af697c tempg_2/out/storage/aai_qa/SGB-03449/PF03710.10.unaligned.faa > tempg_2/out/storage/aai_qa/SGB-03449/PF03710.10.aligned.faa
As well as a bunch of other "FATAL: J state unsupported" sprinkled throughout.
What is a J state and why is it failing? Code that prints this:
src/tracealign.c: case p7T_J: p7_Die("J state unsupported");
Other info:
256 thread CPU system
2TB RAM
Running in the prokka pipeline on rep genomes from the SGB (Pasolli et al).