dlopen of libgomp 13.1.0 and 13.2.0 with RTLD_DEEPBIND on Python fail with segmentation fault on Ubuntu 22.04 #114
Comments
Indeed, it seems that the problematic piece of code was only introduced in libgomp 13: gcc-mirror/gcc@9f2fca5. |
Based on the patch you found, it seems to have something to do with parsing I saw that the ipopt-feedstock sets
In particular, from the commit you linked that introduced the new facility for host vs. device, it seems to me that:
|
I am not sure this is related to ipopt/spral. The environment in which this happens reported in #114 (comment) is created with
Just to understand, which stack? The problem occurs just by combining libgomp and python, and I do not think that python depends on libgomp. |
OK, sorry about that. I followed your "downstream issue" a bit, that's why I got to ipopt. If this happens purely with python+libgomp, then I'm more stumped (I thought it was something about setting/parsing the |
I found another issue that contains a segfault in libgomp's initialize_env(): weechat/weechat#2009. If I understood it correctly, it again happens with libgomp 13.2.0, but on Fedora 39. |
I reproduced the issue on Debian and Ubuntu releases whose apt packages contain gomp 13, while earlier releases with gomp 12 all pass fine: https://github.com/traversaro/reproduce-python-gomp-deepbind-issue/actions/runs/6172933871. On the other hand, Fedora 38 has gomp 13.2.0 but does not reproduce the error; similarly, the latest Arch does not reproduce the problem either. |
The issue here seems to happen even with PHP. I wonder if it happens in general when using |
I tested with casadi, and the issue did not happen when using a simple C++ example (I tested https://github.com/casadi/casadi/blob/main/docs/examples/cplusplus/ipopt_nl.cpp). |
Just to be sure I created a minimal C-based test, and indeed the issue does not appear to happen with that, see https://github.com/traversaro/reproduce-python-gomp-deepbind-issue/actions/runs/6174144211 and https://github.com/traversaro/reproduce-python-gomp-deepbind-issue/blob/main/test.c . |
I was able to reproduce the problem without libgomp, just with a manually coded shared lib, i.e.
While in normal use:
For some reason the So perhaps we should move the issue to Python feedstock? |
Ok, I think this is the combination of two different behaviour/problems:
P2 can be reproduced easily on libgomp >= 13 with this MWE:

    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main () {
        /* Empty the environment before loading libgomp. */
        clearenv();
        void * handle = dlopen("libgomp.so.1", RTLD_NOW);
        if (handle) {
            fprintf(stderr, "dlopen of libgomp.so.1 done correctly.\n");
            return EXIT_SUCCESS;
        } else {
            /* Report a failed dlopen as a failure of the test. */
            fprintf(stderr, "dlopen of libgomp.so.1 failed with error: %s.\n", dlerror());
            return EXIT_FAILURE;
        }
    }

To run:
I will open a bug upstream in GCC for P2. |
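For context on why `clearenv()` matters in the MWE above: according to its man page, `clearenv()` sets the external variable `environ` to NULL, which would explain a crash in any code that walks `environ` without a NULL check. The following is a minimal sketch, not taken from this thread, that checks this from Python with `ctypes`. It runs in a child process so the parent's environment is left alone. `environ_after_clearenv` is a hypothetical helper name, and the snippet assumes Linux.

```python
import subprocess
import sys

# Code run in a child interpreter, so that clearing the C-level
# environment cannot affect the calling process.
CHILD_CODE = """
import ctypes
libc = ctypes.CDLL(None)   # handle for the process-wide symbol scope
libc.clearenv()            # same call as in the C MWE
environ = ctypes.c_void_p.in_dll(libc, "environ")
print("NULL" if environ.value is None else hex(environ.value))
"""


def environ_after_clearenv():
    """Return "NULL" or the hex value of `environ` after clearenv() in a child."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    print("environ after clearenv():", environ_after_clearenv())
```

This only shows the state that `clearenv()` leaves behind; it does not load libgomp itself.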
The issue was fixed upstream for GCC14, see:
The patch is huge, but ignoring the indentation changes it can be summarized as a single-line change, which is better suited for a backport because it reduces the risk of patch conflicts. |
Great job!
Proof of that statement, using GitHub's UI. |
It turns out that this also worked fine in gomp <= 12 and does not work in gomp 13, so I opened an issue for that as well: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111556 . However, to be honest, I am not sure whether this is a problem in libgomp, in glibc, or simply a consequence of how ELF and the POSIX spec interact. |
Solution to issue cannot be found in the documentation.
Issue
If I try to dlopen libgomp 13.* with RTLD_DEEPBIND from a Python environment, I get a segfault. A simple reproducer is just the command:
python -c "import ctypes; import os; ctypes._dlopen(os.environ['CONDA_PREFIX']+'/lib/libgomp.so.1', os.RTLD_DEEPBIND)"
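Since the reproducer kills the interpreter that runs it, it can be convenient to run it in a child process and inspect how the child exited. The following is a minimal sketch, not part of the original report: `try_dlopen` is a hypothetical helper name, it relies on the same private `ctypes._dlopen` call as the command above, and it assumes Linux, where `os.RTLD_DEEPBIND` is available.

```python
import signal
import subprocess
import sys

# Code run in the child: argv[1] is the library, argv[2] the name of an os.RTLD_* flag.
CHILD_CODE = (
    "import ctypes, os, sys\n"
    "ctypes._dlopen(sys.argv[1], getattr(os, sys.argv[2]))\n"
)


def try_dlopen(library, flag_name="RTLD_DEEPBIND"):
    """dlopen `library` in a fresh interpreter and describe how that child exited."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, library, flag_name],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # A negative return code means the child was killed by that signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    # Otherwise Python raised, e.g. OSError when the library cannot be loaded.
    lines = proc.stderr.strip().splitlines()
    return "error: " + (lines[-1] if lines else "exit code %d" % proc.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py /path/to/lib/libgomp.so.1
    for flag in ("RTLD_NOW", "RTLD_DEEPBIND"):
        print(flag, "->", try_dlopen(sys.argv[1], flag))
```

Running it with both flags makes it easy to compare the `RTLD_DEEPBIND` case against a plain `RTLD_NOW` load of the same library.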
The issue does not appear if:
The backtrace is the following:
and seems to indicate that something is going wrong around https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.2.0/libgomp/env.c#L2062 . I have a few ideas to investigate this further, like debugging the value of the environ global variable, but I am not sure when I will have time for this, so in the meantime I opened this issue.
Downstream issue: conda-forge/casadi-feedstock#91 .
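One possible way to start debugging the `environ` global variable from inside Python is sketched below. It is only an illustration, not something done in this report: it looks up `environ` once through the process-wide symbol scope and once through a handle to libc itself, and reports the address and current value seen through each. `describe_environ` and `environ_views` are hypothetical helper names, and the idea that a lookup through the libc handle roughly approximates what a library opened with `RTLD_DEEPBIND` would see is an assumption. The snippet assumes Linux.

```python
import ctypes
import ctypes.util


def describe_environ(handle):
    """Return the address of the `environ` symbol and the pointer stored in it."""
    var = ctypes.c_void_p.in_dll(handle, "environ")
    return {"symbol_address": ctypes.addressof(var), "value": var.value}


def environ_views():
    """Look up `environ` via the global scope and via libc's own handle."""
    libc_name = ctypes.util.find_library("c")  # e.g. "libc.so.6" on glibc systems
    return {
        "global scope": describe_environ(ctypes.CDLL(None)),
        "libc handle": describe_environ(ctypes.CDLL(libc_name)),
    }


if __name__ == "__main__":
    for label, info in environ_views().items():
        value = "NULL" if info["value"] is None else hex(info["value"])
        print(f"{label}: &environ={hex(info['symbol_address'])} environ={value}")
```

If the two lookups report different addresses or values, that would be a hint worth following up; identical output does not rule anything out.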
Installed packages
Environment info