openlibm fenv changes broke CUDA #38427

Closed
maleadt opened this issue Nov 13, 2020 · 4 comments · Fixed by JuliaMath/openlibm#219 or #38466

@maleadt
Member

maleadt commented Nov 13, 2020

On latest nightly as well as master + USE_BINARYBUILDER_OPENLIBM=false DEPS_GIT=openlibm:

Thread 1 "julia-debug" received signal SIGSEGV, Segmentation fault.
0x00007ffff7b855d0 in fesetenv () from /home/tim/Julia/julia/build/release/usr/bin/../lib/libopenlibm.so
(gdb) bt
#0  0x00007ffff7b855d0 in fesetenv () from /home/tim/Julia/julia/build/release/usr/bin/../lib/libopenlibm.so
#1  0x00007fff824578ae in ?? () from /usr/lib/libnvidia-ptxjitcompiler.so.1
#2  0x00007fff8244b8fc in __cuda_CallJitEntryPoint () from /usr/lib/libnvidia-ptxjitcompiler.so.1
#3  0x00007fff9196bc30 in ?? () from /usr/lib/libcuda.so.1
#4  0x00007fff91995e23 in ?? () from /usr/lib/libcuda.so.1
#5  0x00007fff91793fd0 in ?? () from /usr/lib/libcuda.so.1
#6  0x00007fff91731853 in ?? () from /usr/lib/libcuda.so.1
#7  0x00007fff917e992e in cuModuleLoadDataEx () from /usr/lib/libcuda.so.1

Reverting JuliaMath/openlibm#213 by using OPENLIBM_SHA1=878948d3dd6bc940f65582b8cae286a06d89ad81 fixes the issue (this also requires changing rounding.jl to look for fesetround and fegetround in libjulia again, otherwise bootstrap fails). Reproduces with CUDA.jl's examples/vadd.jl (requiring https://github.com/JuliaGPU/GPUCompiler.jl/pull/69/files for 1.6 compatibility).

Feel free to transfer this issue to https://github.com/JuliaMath/openlibm; I opened it here because it ultimately manifests as CUDA.jl not working on latest nightly.

@maleadt maleadt added the kind:regression Regression in behavior compared to a previous version label Nov 13, 2020
@maleadt maleadt added this to the 1.6 features milestone Nov 13, 2020
@maleadt
Member Author

maleadt commented Nov 13, 2020

Looks like the openlibm fesetenv implementation is broken. It doesn't handle FE_DFL_ENV (((const fenv_t *) -1)), instead accessing the environment pointer directly: https://github.com/JuliaMath/openlibm/blob/65d7406056d4bdd0ec0da05694364333c4d44331/include/openlibm_fenv_amd64.h#L194. FE_NOMASK_ENV also isn't handled. For reference, the glibc implementation: https://github.com/bminor/glibc/blob/5500cdba4018ddbda7909bc7f4f9718610b43cf0/sysdeps/x86_64/fpu/fesetenv.c#L29-L111. Why are we using our own?
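
To illustrate the sentinel problem described above, here is a minimal sketch. It is not openlibm's or glibc's actual code. Every `demo_`-prefixed name, the two-field struct, and the constant values are hypothetical stand-ins chosen for illustration; real `fenv_t` layouts and default values are platform-specific. The `-2` value for the no-mask sentinel is an assumption modelled on the glibc x86 headers. The point is only that an implementation has to compare the incoming pointer against the sentinel values before dereferencing it:

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for fenv_t. */
typedef struct {
    unsigned int control;  /* stand-in for control/rounding bits */
    unsigned int status;   /* stand-in for exception flags */
} demo_fenv_t;

/* glibc-style sentinels: these are NOT valid addresses and must never be
 * dereferenced. The intptr_t cast just avoids int-to-pointer warnings. */
#define DEMO_FE_DFL_ENV    ((const demo_fenv_t *) (intptr_t) -1)
#define DEMO_FE_NOMASK_ENV ((const demo_fenv_t *) (intptr_t) -2)

/* Illustrative placeholder values, not real hardware defaults. */
static const demo_fenv_t demo_default_env = { 0x1f80u, 0u };
static const demo_fenv_t demo_nomask_env  = { 0x0000u, 0u };

/* Stand-in for the floating-point environment the "hardware" holds. */
static demo_fenv_t demo_current_env;

/* Install an environment. Returns 0 on success, nonzero on a NULL pointer. */
int demo_fesetenv(const demo_fenv_t *envp)
{
    if (envp == NULL)
        return 1;

    /* Map sentinels to real objects BEFORE touching *envp. Skipping this
     * step and reading *envp directly is what crashes when a caller built
     * against glibc passes (const fenv_t *) -1. */
    if (envp == DEMO_FE_DFL_ENV)
        envp = &demo_default_env;
    else if (envp == DEMO_FE_NOMASK_ENV)
        envp = &demo_nomask_env;

    demo_current_env = *envp;
    return 0;
}

/* Accessor so callers can inspect the installed control bits. */
unsigned int demo_current_control(void)
{
    return demo_current_env.control;
}
```

With this sketch, `demo_fesetenv(DEMO_FE_DFL_ENV)` installs the placeholder default environment instead of reading from address `-1`, while a pointer to a caller-owned `demo_fenv_t` is copied as before.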

@staticfloat
Sponsor Member

I believe we did this because we wanted feature parity across all platforms; otherwise some math functions were not being looked up properly on every platform. @vtjnash do you remember why we did this?

@vtjnash
Sponsor Member

vtjnash commented Nov 13, 2020

I don't know why. We added that code to openlibm so that the function was available on Windows. Not sure why, since it should be in the mingw crt https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-crt/misc/fesetenv.c

maleadt added a commit to maleadt/openlibm that referenced this issue Nov 16, 2020
The implementation of `fesetenv` cannot be portable, as the value of
`FE_DFL_ENV` differs between platforms. On FreeBSD, it is an actual
environment. With glibc however, it's a sentinel -1 handled in the
implementation of its floating point functions.

Because openlibm is based on FreeBSD's libm, it assumes `FE_DFL_ENV` to be
an actual environment. That assumption breaks code that was compiled
against glibc, e.g., `libcuda`:

```
Thread 1 "julia-debug" received signal SIGSEGV, Segmentation fault.
0x00007ffff7b855d0 in fesetenv () from /home/tim/Julia/julia/build/release/usr/bin/../lib/libopenlibm.so
(gdb) bt
```

This reverts commit 5a27b4c.

Fixes JuliaLang/julia#38427.
@maleadt maleadt reopened this Nov 17, 2020
@DilumAluthge
Member

I'm re-opening this as possibly related to #39462
