Netlib BLAS test xblat3d using BLIS on Intel Broadwell incorrectly signals IEEE_UNDERFLOW_FLAG IEEE_DENORMAL #486
Comments
Sorry to hear about these errors, @akesandgren.
Can you clarify what you mean by "netlib blas test"? Are you referring to test drivers in netlib LAPACK?
Skylake or SkylakeX? It may also be helpful if you can tell us which operation is triggering the error. |
netlib blas test == BLAS/TESTING when downloading LAPACK from netlib. I haven't had time to dig into the details before, but a quick test-one-routine-at-a-time run reveals that it is DGEMM. |
I'm not able to reproduce this on Xeon E5-2680 v4 with the dblat3 driver included with BLIS and |
Can't reproduce with the Netlib driver with |
Can you provide more details about your processor, environment (OS, compiler), BLIS configuration, and flags used for the Netlib driver? |
Duh, dblat3d. Let me try that one. |
OK, AFAICT dblat3d is the input file for dblat3 so I was in the right place. We're back to needing more details as above. |
Ah, forgot about this one. Will try to provide more info soon. |
Hardware: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
This sets the correct flags for the testing routines and makes them link with -lblis.
This will spew out:
If you do not see this then we need to dive into details on how BLIS is built. |
I was finally able to reproduce this and tracked it down to a bug that was fixed in 0.8.1. Easy fix! |
OK, can you give me the commit or describe the fix? I'm curious by nature, and about BLAS-related stuff in particular... |
It's the 0.8.1 tag, or you can check out the current HEAD. The fix has to do with not reading one element past the end of the rows/columns of small matrices (in a way which cannot result in a segfault, but can introduce a NaN or Inf value into a horizontal addition). The particular commit for the fix is b43dae9. |
Ahh reading out-of-bounds... was kind of guessing at that, not the first time :-) |
Nope, still getting
with BLIS 0.8.1, now with GCC 10.3.0. We build BLIS with this configure line:
and with
|
Are those CFLAGS for BLIS or xblat3d? |
The
is what we build BLIS with.
I build the BLAS/TESTING with FFLAGS_NOOPT since it is the testing code and it should not have any optimizations done.
|
Found it. This one isn't a read-past-the-end, but a place where some vestigial code does a horizontal add on junk data and then throws it away. |
Are you sure? I still get the same if I add that commit on top of 0.8.1. Are there more commits between 0.8.1 and that commit that I need? |
It's possible that was only the first location of several issues. Let me delve deeper. |
That's what you get for fixing bugs in 5-minute spurts between other work :). |
Know the problem well... |
OK, I think I really fixed it this time. |
Yes, with those two commits on top of 0.8.1 the problem is finally gone. Thanks, awaiting 0.8.2 eagerly... |
Fixes flame#486. Change-Id: I568386b5d67a698ea9c0b6b17f133df86c2894bd
Building netlib blas test and linking with BLIS signals an incorrect IEEE_UNDERFLOW_FLAG IEEE_DENORMAL warning when running xblat3d on Intel Broadwell.
Flags for building BLAS test routines:
-O0 -frecursive -std=legacy -mieee-fp -fno-trapping-math -fno-math-errno -march=native
BLIS built by EasyBuild in gobff/2020b toolchain.
./xblat3d < dblat3.in
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
This does not happen with refblas or OpenBlas/0.3.12.
It does not happen when running on AMD EPYC 7302P, nor on Skylake.