virtualgl-2.4-2 crashes wine #15

Closed
danielausparis opened this issue Feb 22, 2016 · 30 comments

@danielausparis

My setup: 64-bit Arch Linux with Bumblebee/optirun on an NVidia card. The latest upgrade to version 2.4-2 makes the setup unusable; the Wine log says: "err:wgl:has_opengl glAccum not found in libGL, disabling OpenGL."

I was able to work around the issue temporarily by downgrading to version 2.4.1-1.

Many thanks for looking into this issue, which probably affects many users.

@dcommander
Member

2.4.1-1 is not a downgrade from 2.4-2. It is an upgrade. The number after the "-" is the build number, which is not controlled by us (it's an Arch Linux-specific build number.) The number before the "-" is the VirtualGL version number.
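
To make the numbering concrete, here is a minimal Python sketch that splits an Arch package version into the VirtualGL version and the Arch build number, then compares only the VirtualGL part. The function names are hypothetical, and the comparison is a simplification: it assumes purely numeric, dot-separated versions with a single "-" before the build number, and it is not how pacman itself orders versions.

```python
def split_arch_version(full):
    """Split 'pkgver-pkgrel' into (upstream version, Arch build number)."""
    pkgver, _, pkgrel = full.rpartition("-")
    return pkgver, pkgrel


def upstream_key(full):
    """Numeric tuple for the upstream part only, e.g. '2.4.1-1' -> (2, 4, 1)."""
    pkgver, _ = split_arch_version(full)
    return tuple(int(part) for part in pkgver.split("."))


if __name__ == "__main__":
    # Compare the versions mentioned in this thread by upstream version only.
    for a, b in [("2.4-2", "2.4.1-1"), ("2.4.1-1", "2.5-1")]:
        older = upstream_key(a) < upstream_key(b)
        print(f"{a} is {'older' if older else 'not older'} than {b}")
```

Under these assumptions, going from 2.4-2 to 2.4.1-1 moves the upstream version from 2.4 to 2.4.1, which is why it counts as an upgrade.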

@danielausparis
Author

Commander yes Sir, indeed I am sorry, I mixed up the versions. The upgrade causing the trouble was Arch version 2.5-1, and a downgrade to Arch version 2.4.1-1 solved the issue. In VirtualGL terms this means the upgrade from 2.4.1 to 2.5 caused the issue.
Thank you for your kind attention.

@dcommander
Member

OK, that makes more sense. :) I'll see if I can reproduce it.

@dcommander
Member

Can you tell me which version of WINE you're using?

@danielausparis
Author

Sir my version of Wine is 1.9.4.

@dcommander
Member

Not able to reproduce this, unfortunately. I tried on RHEL 6 with a custom build of WINE 1.9.4. I tried on Fedora 22 with WINE 1.9.3 from the official WINE YUM repository. No ability or desire to test on Arch.

Is it a specific application that isn't working or all applications?

@danielausparis
Author

Many thanks for your kind efforts. My use case is a game running on Wine; I have no other software available.
The issue seems related to the Bumblebee/optirun stack, as far as I can see from the Arch fora.


@dcommander
Member

64-bit:

wget https://sourceforge.net/projects/virtualgl/files/2.5/VirtualGL-Utils64-2.5.exe
7z e VirtualGL-Utils64-2.5.exe
vglrun wine ./wglspheres.exe

32-bit:

wget https://sourceforge.net/projects/virtualgl/files/2.5/VirtualGL-Utils-2.5.exe
7z e VirtualGL-Utils-2.5.exe
vglrun wine ./wglspheres.exe

@dcommander
Member

Seems this may be related to the nVidia GL-vendor-neutral driver:

https://bugs.archlinux.org/task/48109

but I installed the latest nVidia driver (361.28) on my machine and enabled GLVND, and VirtualGL 2.5 works fine. So this still seems Arch-specific to me.

@danielausparis
Author

Many thanks again for your kind efforts! You are right, the nVidia driver seems faulty. However it seems that I am in the strange case of the last post (Kasei Wang) of the Arch bug page. I'll investigate more and keep you posted, of course. Thumbs up for open source!

@dcommander
Member

Cool. Keep me posted. If there's something I can do on my end, I'm happy to do it.

@kaseiwang

If "optirun glxgears" works and "optirun wine wglspheres.exe" fails, does that mean this is a Wine bug?

@danielausparis
Author

Well, in my case it seemed that the upgrade of virtualgl from 2.4.1 to 2.5 was pivotal, because its downgrade allowed my system to work again. However, the reading of the related posts leaves me confused, because multiple potential causes may interfere (new NVidia driver etc.) and we have no really clear explanation yet available.

@dcommander
Member

I strongly suspect that the issue is manifesting only with applications that load libGL functions with dlopen()/dlsym(), which is why WINE and Steam are failing but normal applications aren't. And I strongly suspect that it is a packaging problem on the part of Arch, since I can't repro it on other platforms using the same driver.
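
As a rough illustration of that loading pattern, the following Python sketch uses ctypes, which on Linux goes through dlopen() to load a library and dlsym() to resolve a symbol. It is not WINE's code; the helper name `has_symbol` is hypothetical, and `libGL.so.1` and `glAccum` are taken from the names mentioned in this thread.

```python
import ctypes


def has_symbol(library, symbol):
    """Check whether `symbol` can be resolved in `library`.

    Returns None if the library itself cannot be loaded, otherwise True/False.
    ctypes.CDLL() loads the library (dlopen() on Linux); attribute lookup on
    the resulting handle resolves the symbol (dlsym() on Linux).
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return None
    return hasattr(handle, symbol)


if __name__ == "__main__":
    print("glAccum via dlopen/dlsym:", has_symbol("libGL.so.1", "glAccum"))
```

Running this both directly and under the wrapper being tested (for example with optirun or vglrun in front of the Python command) may show whether the lookup result changes, though a scripted check like this is only an approximation of what WINE does.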

I would try uninstalling the Arch version of the nVidia driver and installing the driver using nVidia's installer (from nvidia.com) and see if that improves the situation.

@kaseiwang

Primus and VGL 2.4.1 work fine on Arch with libglvnd. I think it's a VGL 2.5 bug.

@dcommander
Member

Primus works fundamentally differently than VirtualGL. It attempts to provide a complete libGL substitute, and primusrun manipulates LD_LIBRARY_PATH to point to this substitute library so that applications will load it instead of the system's libGL. We could do that too, and in fact, one of the early prototypes I worked on with IBM in 2003-- which ultimately became DCV, a product now owned and maintained by NICE-- takes that approach. The old Chromium3D software also took that approach and had similar restrictions to Primus (only worked with well-behaved OpenGL applications that didn't do front buffer rendering or use any esoteric extensions.) As explained in the VirtualGL background article (http://www.virtualgl.org/About/Background), the main reason why I opted for the LD_PRELOAD technique instead was compatibility. When you supply a complete substitute libGL, you have to implement every function that an application might call, including any vendor extensions, and you have to keep a close watch for changes to the OpenGL spec. By using LD_PRELOAD, VirtualGL only has to implement a handful of functions (mostly from the slower-moving GLX spec), and the rest are passed through automatically to the system's libGL. That is, VGL is a much more vendor-transparent and future-proof solution, but given that Primus is mainly designed for Bumblebee, vendor transparency isn't as important.
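
The difference between the two approaches can be sketched in terms of the environment each launcher prepares for the application. This is a simplified Python illustration, not the actual vglrun or primusrun scripts: the helper names are hypothetical, the library list passed to `preload_env` and the directory passed to `substitute_env` are assumptions chosen for the example, and the real launchers do more than this.

```python
import os


def preload_env(interposers, base=None):
    """LD_PRELOAD approach: interposer libraries are loaded ahead of the
    system libGL, and everything they do not define falls through to it."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_PRELOAD")
    parts = list(interposers) + ([existing] if existing else [])
    env["LD_PRELOAD"] = ":".join(parts)
    return env


def substitute_env(libgl_dir, base=None):
    """Substitute-library approach: a directory holding a replacement libGL
    is searched before the system library directories."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = libgl_dir + (":" + existing if existing else "")
    return env


if __name__ == "__main__":
    # Library names and directory below are illustrative assumptions.
    print(preload_env(["libdlfaker.so", "libvglfaker.so"], base={}))
    print(substitute_env("/opt/example/primus/lib", base={}))
```

In the first case the application still ends up talking to the system's libGL for most calls; in the second it never loads the system's libGL directly, so the substitute has to cover everything the application needs.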

In short, I could easily envision a scenario under which Arch's implementation of libglvnd breaks VirtualGL but not Primus. VirtualGL 2.5 introduced a new method of symbol loading. In VirtualGL 2.4.x, all of the underlying libGL and libX11 symbols were loaded within the body of XOpenDisplay(). In VGL 2.5, they are instead loaded on demand using glXGetProcAddress() (for OpenGL/GLX symbols) and dlsym() (for other symbols.)
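
To compare the two lookup paths outside of VirtualGL, something like the following Python sketch may help. It resolves a symbol from libGL once through dlsym() (via ctypes attribute lookup) and once through glXGetProcAddressARB(). The helper name `compare_lookups` is hypothetical, and this only approximates what the faker does internally.

```python
import ctypes


def compare_lookups(symbol, library="libGL.so.1"):
    """Look up `symbol` through dlsym() and through glXGetProcAddressARB().

    Returns None if the library cannot be loaded. Otherwise returns a dict
    recording whether each path produced a result; the glXGetProcAddress
    entry is None when that entry point itself is not exported.
    """
    try:
        libgl = ctypes.CDLL(library)
    except OSError:
        return None

    # Path 1: dlsym(), comparable to the VGL 2.4.x-style lookup.
    via_dlsym = hasattr(libgl, symbol)

    # Path 2: glXGetProcAddressARB(), comparable to the VGL 2.5 lookup.
    try:
        get_proc = libgl.glXGetProcAddressARB
    except AttributeError:
        return {"dlsym": via_dlsym, "glXGetProcAddress": None}
    get_proc.restype = ctypes.c_void_p      # NULL is returned as None
    get_proc.argtypes = [ctypes.c_char_p]
    address = get_proc(symbol.encode("ascii"))
    return {"dlsym": via_dlsym, "glXGetProcAddress": address is not None}


if __name__ == "__main__":
    print(compare_lookups("glAccum"))
```

Note that a non-NULL address from glXGetProcAddressARB() does not by itself prove the function is usable, so a difference between the two columns is a hint rather than a diagnosis.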

Some things that might be useful to try-- these could give me a clue as to what's going on:

  • Try export VGL_FAKEXCB=0. This will disable the XCB interposer (which is needed for Qt5 applications but probably not anything else.) One difference in VGL 2.5 is that the XCB interposer is now run-time-enabled by default, whereas in VGL 2.4.x you had to do a custom build of the code and run the application with vglrun +xcb to enable the XCB interposer.

  • If you're willing to get your hands dirty, edit server/faker-sym.cpp and comment the lines as follows:

    //#if sun
    
    // For whatever reason, on Solaris, if a function doesn't exist in libGL,
    // glXGetProcAddress() will return the address of VGL's interposed
    // version, which causes an infinite loop until the program blows its stack
    // and segfaults.  Thus, we use the old reliable dlsym() method.
    dlerror();  // Clear error state
    sym=dlsym(gldllhnd, (char *)name);
    err=dlerror();
    
    //#else
    
    //sym=(void *)__glXGetProcAddress((const GLubyte *)name);
    
    //#endif
    

    This will cause VirtualGL to use dlsym() to load the libGL functions (as VGL 2.4.x did) rather than glXGetProcAddress(), which may make a difference. Maybe there is something fishy about glXGetProcAddress() in Arch's libglvnd implementation.

The reason why I suspect that this is Arch-specific has to do with the comments in the Arch bug report: https://bugs.archlinux.org/task/48109. Some of those users are reporting issues with Primus as well, and the comments suggest that Arch's distribution of libglvnd may be incomplete. That is why I'm suggesting that you try the driver from nVidia, because I can test that same driver, whereas I cannot test the Arch-specific implementation.

@kaseiwang

I tried editing server/faker-sym.cpp. Wine still fails on nvidia 358.16 without libglvnd. glxgears works.

@danielausparis
Author

Commander, Arch policy is normally never to modify upstream software, so Arch's libglvnd can be considered the vanilla upstream libglvnd. Packaging in this context consists only of copying the compiled files to the corresponding locations.
Anyway, your in-depth analysis sounds very accurate; this new symbol loading seems highly relevant to our difficulties. Apparently some applications do not use the on-demand loading as expected. Many thanks for this progress.

@dcommander
Member

OK, thanks for the info. The main reason why I'm blaming Arch is that I have tried reproducing the issue on two other platforms with libglvnd, without success. Is there by chance a live boot image for Arch? That could be a straightforward way for me to get it up and running on one of my nVidia-equipped machines.

@danielausparis
Author

I found a new live arch that seems appealing: https://sourceforge.net/projects/archex/

@kaseiwang

A VGL build from 6aea81a works; a build from f0243dc fails. The results are the same on nvidia 358.16 without libglvnd.
Does this help you confirm your analysis?

@danielausparis
Author

Just an Arch-related hint for you: the live Arch should be on a dvd on which you boot, and from which you would install on a usb key. This way you will have a true Arch install that is updatable via pacman (this update should be your first task on your fresh install: "pacman -Syu" as root).

@dcommander
Member

@kasei-wang That is expected. 6aea81a is an early commit in the 2.5.x development branch, from before the new symbol loading mechanism was developed, so that commit will behave just like 2.4.1.

@dcommander
Member

I am certain that this is due to the same issue as #16-- that is, Arch is not correctly distributing the VirtualGL fakers. They seem to be providing two copies of libvglfaker-nodl.so, as evidenced by the fact that /usr/lib32/libvglfaker.so does not depend on libGL.so.1 like it should. Unfortunately, however, the workaround described in #16 doesn't work in this case, because WINE is using a more "pure" approach for loading symbols-- strictly using dlopen()/dlsym() instead of the more hybrid approach used by Steam.
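
One way to check that dependency on an installed system is to read the dynamic section of the faker library and look for libGL.so.1 among its NEEDED entries. The Python sketch below assumes that binutils' `readelf` is installed; the helper names are hypothetical, and the path is the one mentioned above.

```python
import re
import subprocess

# Matches lines such as:
#  0x0000000000000001 (NEEDED)             Shared library: [libGL.so.1]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")


def needed_libraries(readelf_output):
    """Extract the NEEDED sonames from `readelf -d` output."""
    return NEEDED_RE.findall(readelf_output)


def depends_on(path, soname="libGL.so.1"):
    """Return True if the shared object at `path` lists `soname` as NEEDED."""
    output = subprocess.run(
        ["readelf", "-d", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return soname in needed_libraries(output)


if __name__ == "__main__":
    path = "/usr/lib32/libvglfaker.so"
    print(path, "depends on libGL.so.1:", depends_on(path))
```

If this reports False for libvglfaker.so, that would be consistent with the package shipping the -nodl variant under both names.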

@danielausparis
Author

Excellent, Commander. Many thanks for the investigation. As per the dependency chains published in the Arch package specs, there are two providers for libvglfaker: lib32-virtualgl and virtualgl. Both are required by bumblebee, the lib32-virtualgl link being optional. Having two providers is not catastrophic per se; however, I noticed a dependency change between virtualgl-2.4-(1/2) and virtualgl-2.5-1: the old ones did not provide libvglfaker, whereas the new one does. So my machine actually does NOT have libvglfaker at all, because I downgraded virtualgl to begin with, due to the crash. And my optirun/wine works perfectly with the old virtualgl, without libvglfaker.
Should I unblock the virtualgl upgrade anyway and try all the new versions? I could always backtrack in case of problems.
Also, we might ask the package maintainer for further investigation.

@dcommander
Member

Yes, you're going to need to pursue this with the package maintainer. Since they effectively only released the -nodl faker, their implementation of VirtualGL will not work with any application that loads libGL symbols indirectly, and that encompasses a wide variety of applications (including lots of native Linux games-- not just Steam.)

@danielausparis
Author

The issue has been assigned to the package maintainer by Arch bug management.

@tidux

tidux commented Jul 24, 2017

This isn't just Arch specific. I'm hitting the same issue with Nvidia driver 375, and VirtualGL and VirtualGL32 2.5.1 on Debian 9.1.

@dcommander
Member

@tidux I doubt it's the same issue, although it might have the same symptoms. This specific issue was due to incorrect packaging of VirtualGL by the Arch distribution. WINE requires VirtualGL's libdl interposer, which wasn't being activated properly under Arch. If you are encountering problems with our official packages, I can look into that. Please post a new bug report with the specific version of WINE and the nVidia driver you are using, as well as the application you are running when you encounter the error and the specific error message.
