Segmentation fault when mixing OpenGL and OpenCL #28

Closed
athas opened this issue Jan 3, 2020 · 6 comments

Contributor

athas commented Jan 3, 2020

I'm mostly curious about whether other users of nixos-rocm have seen this. When mixing OpenGL and OpenCL in the same program, I get a segmentation fault deep inside LLVM.

glut.c for reproduction:

#include <GL/glut.h>
#include <CL/cl.h>
#include <assert.h>

int main(int argc, char *argv[]) {
  cl_int err;
  cl_platform_id platform;
  err = clGetPlatformIDs(1, &platform, NULL);
  assert(err == CL_SUCCESS);

  glutInit(&argc, argv);

  return 0;
}

Compile with:

$ nix-shell -p rocm-opencl-runtime -p freeglut -p mesa_glu --run 'gcc glut.c -o glut -lOpenCL -lOpenGL -lglut -g'

I get a segfault on glutInit(). If I do glutInit() first, then I get a segfault on clGetPlatformIDs().

This happens with both ROCm 2.10 and the WIP ROCm 3.0.

Contributor Author

athas commented Jan 3, 2020

At some point in the past (a few months ago), this worked, but I don't recall the exact versions anymore.

Collaborator

acowley commented Jan 4, 2020

Your reproducer segfaults for me, too, but I don't have much clue as to where the problem is. I guess we've got a few different versions of LLVM at play, so I tried rebuilding mesa with rocm-llvm, but that failed with what looks like a breaking change in LLVM. I'm not sure if that's the right avenue to pursue, but it's the first thing that came to mind.

Contributor Author

athas commented Jan 4, 2020

It works with NixOS 19.09 (that is, stable) and master nixos-rocm.

Contributor Author

athas commented Mar 16, 2020

I have a vague suspicion about what may be going on. Even the following program crashes when linked with -lOpenCL and run in oclgrind:

#include <GL/glut.h>
#include <assert.h>

int main(int argc, char *argv[]) {
  glutInit(&argc, argv);
  return 0;
}

The stack trace is very suspicious:

#0  0x00007ffff74f7f57 in llvm::StringMapImpl::LookupBucketFor(llvm::StringRef) () from /nix/store/vjm3m5w63kpl1bkw0gspsadnvi6zqs3i-oclgrind-19.10/lib64/liboclgrind.so
#1  0x00007ffff74cacb5 in llvm::cl::Option::setArgStr(llvm::StringRef) () from /nix/store/vjm3m5w63kpl1bkw0gspsadnvi6zqs3i-oclgrind-19.10/lib64/liboclgrind.so
#2  0x00007fffeee4e94d in __static_initialization_and_destruction_0(int, int) [clone .constprop.0] () from /nix/store/y6i4y94y5c7qqnvjx5z97dmrc25icsyj-llvm-9.0.1-lib/lib/libLLVM-9.so
#3  0x00007ffff7fe301a in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffcf08, env=env@entry=0x7fffffffcf18) at dl-init.c:72
#4  0x00007ffff7fe3116 in call_init (env=0x7fffffffcf18, argv=0x7fffffffcf08, argc=1, l=<optimized out>) at dl-init.c:30
#5  _dl_init (main_map=main_map@entry=0x448f50, argc=1, argv=0x7fffffffcf08, env=0x7fffffffcf18) at dl-init.c:119
#6  0x00007ffff7fe6f23 in dl_open_worker (a=a@entry=0x7fffffffb940) at dl-open.c:510
#7  0x00007ffff7decdac in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:196
#8  0x00007ffff7fe67fa in _dl_open (file=0x7fffffffbbc0 "/run/opengl-driver/lib/dri/radeonsi_dri.so", mode=-2147483390, caller_dlopen=0x7ffff45ceace <loader_open_driver+414>,
    nsid=<optimized out>, argc=1, argv=0x7fffffffcf08, env=0x7fffffffcf18) at dl-open.c:592
#9  0x00007ffff7cba246 in dlopen_doit (a=a@entry=0x7fffffffbb60) at dlopen.c:66
#10 0x00007ffff7decdac in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffbb00, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:196
#11 0x00007ffff7dece1f in __GI__dl_catch_error (objname=0x4052b0, errstring=0x4052b8, mallocedp=0x4052a8, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:215
#12 0x00007ffff7cba8f5 in _dlerror_run (operate=operate@entry=0x7ffff7cba1f0 <dlopen_doit>, args=args@entry=0x7fffffffbb60) at dlerror.c:170
#13 0x00007ffff7cba2c6 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#14 0x00007ffff45ceace in loader_open_driver () from /run/opengl-driver/lib/libGLX_mesa.so.0
#15 0x00007ffff45bbd31 in driOpenDriver () from /run/opengl-driver/lib/libGLX_mesa.so.0
#16 0x00007ffff45c4340 in dri3_create_screen () from /run/opengl-driver/lib/libGLX_mesa.so.0
#17 0x00007ffff45b0fa9 in __glXInitialize () from /run/opengl-driver/lib/libGLX_mesa.so.0
#18 0x00007ffff45acb84 in GetGLXPrivScreenConfig () from /run/opengl-driver/lib/libGLX_mesa.so.0
#19 0x00007ffff45ad0fe in glXQueryExtensionsString () from /run/opengl-driver/lib/libGLX_mesa.so.0
#20 0x00007ffff7eaee6e in fgPlatformInitialize () from /nix/store/dwkgb62gdlwx6pyka0yr40q6lwvxf624-freeglut-3.2.1/lib/libglut.so.3
#21 0x00007ffff7ea500c in glutInit () from /nix/store/dwkgb62gdlwx6pyka0yr40q6lwvxf624-freeglut-3.2.1/lib/libglut.so.3
#22 0x0000000000401052 in main (argc=<optimized out>, argv=<optimized out>) at glut.c:5

What is the LLVM embedded in liboclgrind.so doing there?! I suspect that maybe the radeonsi driver is doing some process inspection to find LLVM, and something cannot handle two LLVMs co-existing in the same process. I have no idea how that would happen, though. Also, this stack trace is different to what I get with ROCm. But this may ultimately be a Mesa bug, not a ROCm one.

Contributor Author

athas commented Mar 18, 2020

It seems to work if I compile Mesa with LLVM 8 rather than LLVM 9.

@athas athas mentioned this issue Apr 14, 2020
Contributor Author

athas commented Apr 15, 2020

Seems fixed by #35, but it might also be a change in Mesa for all I know.

@athas athas closed this as completed Apr 15, 2020