glfwTerminate causes segfault, at XCloseDisplay on master #1146

Closed
chalecampb opened this issue Nov 27, 2017 · 11 comments

@chalecampb

commented Nov 27, 2017

Hi guys,

Versions: Ubuntu 16.04 LTS, with CMake 3.5.1, git 2.7.4, GCC 5.4, NVidia 384.98

I just came from the nanogui repo, where I found that their two-year-old example repository (nanogui-test) works as shipped, but starts segfaulting on exit once GLFW is updated to the latest version. I wanted to bring it to your attention.

The glfw submodule in nanogui-test is pegged at bde2fa0. Syncing to that commit resolves the issue.

I checked where the segfault occurred; the last break I set was at

/glfw/src/x11_init.c:783

Shortly after that the line XCloseDisplay(_glfw.x11.display); causes a segfault.

I saw that there are existing issues describing a segfault when terminate is called twice, but as you can see in the nanogui-test code, the application calls glfwTerminate only once, on exit (from nanogui::shutdown()). Also, gdb showed no second call to terminate; it only ever failed at the first one.
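As a side note, a double call can also be ruled out without a debugger by routing shutdown through a small guard that counts requests and forwards only the first one. The sketch below is illustrative only: `shutdown_guard`, `shutdown_guard_init`, `shutdown_guard_run` and `fake_terminate` are hypothetical names that exist in neither GLFW nor nanogui, and `fake_terminate` merely stands in for glfwTerminate so the example has no external dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard state; none of these names exist in GLFW or nanogui. */
typedef struct {
    void (*terminate)(void); /* would be glfwTerminate in a real program */
    bool done;               /* true once terminate has been forwarded */
    int requests;            /* how many times shutdown was requested */
} shutdown_guard;

/* Counting stand-in for glfwTerminate, used only for demonstration. */
int fake_terminate_calls = 0;
void fake_terminate(void) { fake_terminate_calls++; }

void shutdown_guard_init(shutdown_guard *g, void (*terminate)(void)) {
    assert(g != NULL && terminate != NULL);
    g->terminate = terminate;
    g->done = false;
    g->requests = 0;
}

/* Forwards to the terminate function on the first call only.
 * Returns true if this call actually ran it, false otherwise. */
bool shutdown_guard_run(shutdown_guard *g) {
    assert(g != NULL);
    g->requests++;
    if (g->done)
        return false;
    g->done = true;
    g->terminate();
    return true;
}
```

If `requests` ever exceeds 1 at exit, something requested shutdown more than once; in this report that was not the case, so the guard would only confirm what gdb already showed.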

I'll try to find specifically when the commits stop working. If you have any recommendations let me know.

Thanks!
-Alex

EDIT: Changed NVidia driver version to 384.98

@elmindreda elmindreda self-assigned this Nov 27, 2017

@elmindreda elmindreda added verified and removed verified labels Nov 27, 2017

@elmindreda

Member

commented Nov 27, 2017

Are you asking GLFW to create the context using EGL?

@chalecampb

Author

commented Nov 27, 2017

Thanks for your quick response.

All the code interacting with GLFW is in https://github.com/wjakob/nanogui/blob/master/src/screen.cpp

It states that the default is an OpenGL 3.3 context (line 121).

The example code does not pass the major and minor so the context should be 3.3.

In any case, EGL is not found anywhere in the calling code so I have to say, probably not, unless there is something tricky going on. I would have expected glfwGetEGLDisplay somewhere if it were using EGL. Admittedly I am not a GL buff, but I am happy to do any digging.

@elmindreda

Member

commented Nov 27, 2017

Thanks!

A call stack for the segfault would be invaluable.

If you're able, also try running the current glfwinfo and see if that also segfaults. It has switches that let you create any kind of context and framebuffer (see glfwinfo -h).

The output of ldd $(which glxinfo) might also be of use.

@chalecampb

Author

commented Nov 28, 2017

Hi again,

Stack trace is

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff2283a90 in ?? ()
(gdb) bt
#0  0x00007ffff2283a90 in ?? ()
#1  0x00007ffff2029e2a in ?? () from /usr/lib/nvidia-384/libnvidia-glsi.so.384.98
#2  0x00007ffff2032f45 in ?? () from /usr/lib/nvidia-384/libnvidia-glsi.so.384.98
#3  0x00007ffff62df642 in XCloseDisplay () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#4  0x00007ffff7b5a07e in _glfwPlatformTerminate () from /home/alex/Projects/Source/nanogui-test/ext/nanogui/libnanogui.so
#5  0x00007ffff7b53b78 in glfwTerminate () from /home/alex/Projects/Source/nanogui-test/ext/nanogui/libnanogui.so
#6  0x0000000000401a81 in main (argc=<optimized out>, argv=<optimized out>) at /home/alex/Projects/Source/nanogui-test/test.cpp:62

@chalecampb

Author

commented Nov 28, 2017

I was able to get a list of all GLFW calls: I started the application, closed it immediately, got the segfault, and processed the output.

https://gist.github.com/chalecampb/0e2b95ff6201779685e26dd5b912aacb

And based on those calls, internally something is routing the GLFW calls to EGL. So it does sound like you guessed the issue correctly. I am not sure how that is happening; I will dig more. Even so, I am interested in understanding why that could cause the failure.

@elmindreda

Member

commented Nov 28, 2017

> And based on those calls, internally something is routing the GLFW calls to EGL.

It seems from the gdb call stack like you're not using a debug build (-g -O0), so the trace tool you're using may be guessing wrong. I don't see any calls to _glfwChooseVisualEGL or _glfwCreateContextEGL which would be there if EGL was used for context creation.

@chalecampb

Author

commented Nov 28, 2017

@elmindreda

Member

commented Nov 28, 2017

There was an issue with recent Nvidia EGL that was fixed yesterday with 9e6c0c7.

@chalecampb

Author

commented Nov 28, 2017

@chalecampb

Author

commented Nov 29, 2017

Hi,

I wasn't able to check out that reference, but I did see that it was a one-line move in x11_init.c, where the segfault was happening. So I just made the change by hand and can confirm that it resolves the issue.

Thanks!
-Alex
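The thread does not show the line that was moved, so the following is not the GLFW patch. It is only a toy model of one common reason teardown order matters, and it assumes (hypothetically) that a driver library registers cleanup work that runs when the display connection is closed. Under that assumption, unloading the library before closing the display leaves the cleanup pointing at code that is no longer mapped, which would look much like the frame with no symbol reached through XCloseDisplay in the trace above. Every type and function name here is made up for illustration and none of it is Xlib, EGL or GLFW API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dynamically loaded driver library. */
typedef struct {
    bool loaded;      /* false once the library has been "unloaded" */
    int cleanup_runs; /* times its close hook ran while still loaded */
} fake_driver;

/* Hypothetical stand-in for a display connection; not the Xlib API. */
typedef struct {
    fake_driver *hook_owner; /* driver whose hook runs on close, or NULL */
    bool open;
} fake_display;

void fake_display_open(fake_display *dpy) {
    assert(dpy != NULL);
    dpy->hook_owner = NULL;
    dpy->open = true;
}

/* Loading the driver registers its cleanup hook with the display. */
void fake_driver_load(fake_driver *drv, fake_display *dpy) {
    assert(drv != NULL && dpy != NULL);
    drv->loaded = true;
    drv->cleanup_runs = 0;
    dpy->hook_owner = drv;
}

void fake_driver_unload(fake_driver *drv) {
    assert(drv != NULL);
    drv->loaded = false;
}

/* Closes the display and runs the registered hook.
 * Returns false if the hook belonged to an already unloaded driver;
 * in a real process that would be a jump into unmapped memory. */
bool fake_display_close(fake_display *dpy) {
    assert(dpy != NULL);
    bool safe = true;
    if (dpy->hook_owner != NULL) {
        if (dpy->hook_owner->loaded)
            dpy->hook_owner->cleanup_runs++;
        else
            safe = false;
    }
    dpy->open = false;
    return safe;
}

/* Runs a full setup and teardown in the requested order. */
bool teardown(bool unload_before_close) {
    fake_display dpy;
    fake_driver drv;
    fake_display_open(&dpy);
    fake_driver_load(&drv, &dpy);
    if (unload_before_close) {
        fake_driver_unload(&drv);
        return fake_display_close(&dpy);
    }
    bool safe = fake_display_close(&dpy);
    fake_driver_unload(&drv);
    return safe;
}
```

In this model `teardown(true)` reports the unsafe order and `teardown(false)` the safe one. Whether 9e6c0c7 changes the order in exactly this way should be checked against the commit itself.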

@chalecampb chalecampb closed this Nov 29, 2017

@elmindreda

Member

commented Nov 29, 2017

Yay! Thank you for the follow-up!
