FATAL: J state unsupported #238

Closed

GabeAl opened this issue Apr 24, 2021 · 7 comments

Comments

GabeAl commented Apr 24, 2021

I've compiled v3.3.2 of the source code with native compiler flags for the Zen 2 architecture, and I've also added -ffast-math.

I get this occasionally while running:
FATAL: J state unsupported
Fatal exception (source file p7_trace.c, line 163):
realloc for size 0 failed
sh: line 1: 3642453 Aborted (core dumped) hmmalign --outformat Pfam /tmpus/b26dfdff-eb39-47f3-8950-3ae0f3af697c tempg_2/out/storage/aai_qa/SGB-03449/PF03710.10.unaligned.faa > tempg_2/out/storage/aai_qa/SGB-03449/PF03710.10.aligned.faa

A number of other "FATAL: J state unsupported" messages are sprinkled throughout the logs as well.

What is a J state, and why is it failing? The code that prints this message is:
src/tracealign.c: case p7T_J: p7_Die("J state unsupported");

Other info:
256-thread CPU system
2 TB RAM

Running in the prokka pipeline on representative genomes from the SGB set (Pasolli et al.).

cryptogenomicon (Member) commented
HMMER depends on IEEE754-compliant floating point arithmetic, and -ffast-math allows the compiler to make unsafe, noncompliant optimizations. Why did you use -ffast-math? What happens if you just compile the code normally, and what is the result of make check with and without your custom compiler options?
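
For concreteness, a minimal sketch of that comparison, run from the top of the HMMER 3.3.2 source tree and assuming the usual autoconf targets (configure, make check, make distclean); -march=znver2 is an assumption standing in for "native Zen 2 flags":

```sh
./configure CFLAGS="-O3 -march=znver2"              # IEEE754-safe build
make -j && make check                               # baseline: test suite should pass

make distclean                                       # start over with the suspect flag added
./configure CFLAGS="-O3 -march=znver2 -ffast-math"  # -ffast-math relaxes IEEE754 guarantees
make -j && make check                               # failures here would point at -ffast-math
```

If the suite only fails in the -ffast-math build, that isolates the flag as the cause.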

GabeAl (Author) commented Apr 27, 2021

Thanks! This is a helpful explanation. I'd used it because it afforded a small 1-2% speedup (I'm trying to squeeze as much performance as I can out of it, since it is a core component of a few QC and annotation pipelines I'm kicking the tires on).

Indeed, once -ffast-math is removed, the error disappears. Interestingly, every other combination of compiler optimization flags I've tried, including code-profiling options and even the Intel compiler, proceeds without this issue. In practice I've rarely come across code that depends so intimately on IEEE754 semantics, but such cases clearly exist; glibc is another prominent example, and its build actually stops in its tracks if -ffast-math is detected. Feel free to close this.

The only other interesting problem I've had is when HMMER uses more than 128 threads on the system, split across instances. I get strange, non-deterministic segfaults even with the standard conda build whenever more than 128 threads are in use in total (across concurrent hmmsearch runs, not within a single instance of hmmsearch). But I can't fathom what threading model could possibly lead to this behavior (a signed char as the thread count? Or a uint8_t that doesn't account for the I/O thread?), and I haven't seen other reports of this with the standard conda build, so I'm hesitant to report it without clear confirmation that it is indeed an issue with HMMER and not some other part of the pipeline/stack.

npcarter (Member) commented Apr 27, 2021 via email

GabeAl (Author) commented Apr 27, 2021

Interesting; roughly what kind of RAM use are we talking about here? My system has 2 TB of RAM and is memory-defragmented before any HPC run (all caches are dropped, and no other processes or ramdisks are loaded beyond the basic Fedora OS). Working disk space is also large, with ~90 TB free, in case of large temporary files. On laptops with 4 GB of RAM running 8 instances (on a 4C/8T CPU) of the prokka and checkm pipelines I use this for, I haven't run into this issue. This new rig has 500x the memory for 32x the number of processes. I should try with 129 vs. 128 processes and see if there is a definite break there.
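
A sketch of that 128-vs-129 test; Pfam-A.hmm and genome_*.faa are hypothetical placeholder inputs. It launches exactly N single-threaded hmmsearch jobs and counts abnormal exits:

```sh
N=129                                  # repeat with N=128 for comparison
for i in $(seq 1 "$N"); do
    hmmsearch --cpu 1 Pfam-A.hmm "genome_${i}.faa" > "out_${i}.txt" &
done

fail=0
for pid in $(jobs -p); do
    wait "$pid" || fail=$((fail + 1))  # a SIGSEGV surfaces as a non-zero exit status (139)
done
echo "N=$N: $fail job(s) exited abnormally"
```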

npcarter (Member) commented Apr 27, 2021 via email

GabeAl (Author) commented Apr 28, 2021

Thanks, this is great context. Yes, I had been running up to 256 separate instances of hmmsearch --cpu 1. (This way it multi-threads quite well indeed!)

But since there is another hidden I/O thread running anyway, and I can stagger other processes to run concurrently, I now limit runs to 128 instances of hmmsearch --cpu 1. Because everything runs asynchronously and is aggregated in the background, there is minimal (or even negative!) performance loss from dropping the hyperthreading.
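
As an illustration, one way to cap the concurrency like this, assuming GNU xargs and placeholder paths (proteins/, results/, and Pfam-A.hmm are hypothetical):

```sh
mkdir -p results
find proteins -name '*.faa' -print0 |
  xargs -0 -P 128 -I{} sh -c \
    'hmmsearch --cpu 1 --tblout "results/$(basename "{}" .faa).tbl" Pfam-A.hmm "{}" > /dev/null'
```

xargs -P keeps at most 128 jobs running and starts the next one as soon as a slot frees up, which matches the asynchronous, staggered pattern described above.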

GabeAl (Author) commented Apr 29, 2021

> there is minimal (or even negative!) performance loss from dropping the hyperthreading

Actually, this led to some head-scratching, so I decided to investigate what was going on with my hmmscan threading, where running fewer instances would yield higher performance... and then I spotted it. I've opened a new discussion: #240.

GabeAl closed this as completed May 2, 2021