Steam Linux Runtime on Guix does not find driver dependencies #478
Comments
I did some more digging and I think I found the actual problem. I ran the capture-libs command within pressure vessel (with the So what I think is happening is that pressure vessel is not adding the LLVM libraries to the overrides (no LLVM ones listed in the output or Steam's system information) needed to load some of the Mesa drivers, like Now, one option is for Guix to build LLVM and Mesa with the appropriate shared library option. While Guix might decide this is better (separating out the more "developer" build of LLVM for those that need it rather than for everything) and join most other distros, this still seems like a general problem that someone may run into compiling their own stuff. It would also seem appropriate for pressure vessel to pull in linked dependencies for host graphics drivers. (For reference, this has come up on Guix but seems to have gotten lost: https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00147.html and https://issues.guix.gnu.org/42576) As far as I can tell, there's no way to force the LLVM libs to end up in the override, is there? Can (should?) pressure vessel be more robust in how it handles dependencies for overrides it brings in, especially for these few critical components? Thanks! |
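For anyone who wants to check which libraries a host driver links against, here is a minimal Python sketch that parses `ldd` output and lists the LLVM dependencies and any that do not resolve. The helper names are made up for illustration, and this is not how pressure-vessel itself inspects drivers.

```python
import re
import subprocess
from typing import Dict, List, Optional

# Matches "name => /resolved/path (0x...)" and "name => not found" lines.
# Lines without "=>" (the vDSO, the dynamic linker itself) are skipped.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|/\S*)")


def parse_ldd(output: str) -> Dict[str, Optional[str]]:
    """Map each dependency name to its resolved path, or None if unresolved."""
    deps: Dict[str, Optional[str]] = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match is None:
            continue
        name, target = match.groups()
        deps[name] = None if target == "not found" else target
    return deps


def llvm_deps(deps: Dict[str, Optional[str]]) -> List[str]:
    """Names of dependencies that look like LLVM libraries."""
    return sorted(name for name in deps if "LLVM" in name)


def unresolved(deps: Dict[str, Optional[str]]) -> List[str]:
    """Names of dependencies the dynamic linker could not find."""
    return sorted(name for name, path in deps.items() if path is None)


def ldd_output(library_path: str) -> str:
    """Run ldd on a library and return its text output (not called here)."""
    result = subprocess.run(
        ["ldd", library_path], capture_output=True, text=True, check=False
    )
    return result.stdout
```

Running `unresolved(parse_ldd(ldd_output(path)))` on a driver such as radeonsi_dri.so, once on the host and once inside the container, should show whether the LLVM libraries are what goes missing.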
(Apologies for any spam, but hoping the process is helpful for others as well.) Building LLVM with Adding the flag
But in launching with an xterm in the environment and running the command from the log won't find
while it still exists from the host, e.g.
So the search is only within the container and not from host at this stage, but just to confirm that this is still accessible:
The first part of an
and for this
Note that all these
|
I spoke too soon, seems moving to just The real potential issue, finally: Looking back at the logs and going back through the changes, does seem that without LLVM compiled with the dynlib option, that pressure vessel won't see/capture the individual LLVM libraries needed by Mesa. So I see it fail on In any event, I hope this is helpful documentation and maybe of some use to someone else trying to debug such things (there's a lot to wrap your head around here!). |
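To see which captured libraries are actually usable in an overrides directory, something like the following Python sketch may help. It assumes entries in that directory can be symlinks whose targets might not exist in the container, and the function names are hypothetical.

```python
import os
from typing import Iterable, List


def dangling_symlinks(directory: str) -> List[str]:
    """Return names in `directory` that are symlinks with a missing target."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.path.exists() follows symlinks, so it is False for a dangling link.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(name)
    return broken


def missing_files(directory: str, expected: Iterable[str]) -> List[str]:
    """Return the expected file names that are absent from `directory`."""
    present = set(os.listdir(directory))
    return sorted(set(expected) - present)
```

Symlink targets resolve differently on the host and in the container, so this is only meaningful when run from inside the container (for example from the xterm mentioned earlier).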
As an immediate response before looking into the technical details: @podiki or @kisak-valve: Please change the title of this issue to make it obvious that it is on Guix. "Steam Linux Runtime on Guix does not find driver dependencies" would probably be a good title. pressure-vessel is very sensitive to the exact "shape" of non-FHS distributions like NixOS and Guix, and I would not expect it to work unmodified. We added some code to pressure-vessel specifically to support NixOS, and we will probably need to do the same for Guix. If a Guix user will test it for me (by replacing |
You've jumped straight in to very specific details from the various logs, but you're not giving me an overview of the bigger picture. Please could you provide the information from the issue reporting template, in particular the Steam system information (Help -> System Information) and the It might be that you are correct about the failure mode, but to solve it we will probably need to "zoom out" and look at some more basic facts about Guix. |
Yes there is -
It is deliberate that we use that flag. For loadable modules that do not normally appear in the default search path for libraries (and in particular Mesa DRI drivers), we need to run capsule-capture-libs twice: once with This is because if If you had attached a whole log instead of selecting the parts you thought were most relevant, we'd be able to see the second invocation (with
Most of the commands in the log are not running (and not necessarily even runnable!) in the container created by pressure-vessel, because they are gathering information that we will need before we can create our container. The log does tell you when it hands over to the It's extra-confusing on NixOS and Guix, because on a typical FHS-ish distribution like Debian or Fedora, we only have one layer of containers:
but on NixOS, and presumably Guix too, there's an additional layer of container to turn it into a more FHS-shaped environment that will not break all our assumptions:
|
Here are some things that are unusual about NixOS, and will probably be equally unusual about Guix:
|
In addition to all the other information I've asked for, it would probably be useful if you could show me a file listing ( Ideally, I want pressure-vessel to "just work" by default on as many OSs as possible, and not need local workarounds - but for more-rarely-used OSs like NixOS and Guix, I have to depend on users of those OSs to provide the information that is needed to make that possible. FHS and nearly-FHS distros like Debian, Fedora, Arch, or even Exherbo can usually be made to work as-is, with pressure-vessel responding to what the distro does rather than the other way round. Non-FHS distros like NixOS and Guix are just too unusual for that approach to work (and Steam won't work there anyway), so it's pragmatic for those distros to provide a closer-to-FHS container to run Steam in, which lets me concentrate on making pressure-vessel work as intended in that closer-to-FHS container instead of trying to make it work on the unmodified host system. In particular, distros where users are expected to compile libraries from source for themselves (as opposed to binary-package-based distros like Debian and Fedora) are disproportionately "expensive" to test in a centralized way - it would take too much disk space and CPU time to bring up test systems with multiple source-based distros and keep them up to date - so we're reliant on users of these distros to bring issues to our attention. Similarly, if things don't work in a distro that aims to be highly customizable, we are unlikely to be able to reproduce the exact customizations that a particular user has made. |
Hi @smcv! And thanks for looking into this, much appreciated. Apologies for a flurry of messages as I sort of talked to myself in trying to debug without giving you the big picture. Long response here to the points you raised. I've linked a log of the failure case and will provide more logs and info you requested in a later follow-up. Thanks!
Thanks and I would be happy to help test any fixes. I think the NixOS changes are generally helpful on the Guix side as they share very similar overall structure. I should note that I can provide a few files and easy directions for reproducing the same environment in either a VM or on top of a regular distro, which will remain isolated.
I will add that in a followup (not at that computer until later, though I do have some of the logs). I'll give the System Information and
This is helpful, thank you. I think what I ran into that was most confusing and led me astray in my earlier issue report was that I didn't have the right debug flags to see the details of the different Here is a log where capturing fails with the I'll provide the System Information and other info for this scenario when I'm at the original computer, with a fresh log just in case too.
Indeed, there's a lot of layers here! Containers within containers (as I understand it, what made Flatpak Steam tricky too).
Yes, that would be great. Perhaps pressure-vessel can have a list of such "stores" that should be read-only mounted. (Aside: I don't think there should be any difficulty if you have Guix (or Nix) as an additional package manager on top of another distro, as everything is only located in the store or linked to it. So it is self-contained in a way, and running something from Guix should stay within Guix. The possible exception is maybe graphics drivers or other things like that where you need the actual real host version? I'm not sure, and that is extra confusion. I just tested running Steam from Guix on top of Arch and it worked, seemed to only use the Guix libraries including Mesa. But that is for another time.)
As far as I know there is only
Yes, this is a sticking point. By default Guix's I've been trying to make this work for us (you can see the discussion at this issue on nonguix, where Nonguix is because Steam is non-free and GNU Guix is a free software distribution). I don't know much in this arena, but my efforts came up empty so far, just not sure that something like So what I'm doing now is reverting to an earlier version of the FHS container we had, where we have a more vanilla
This (reliance on the cache) might be why my efforts for tricking Guix's standard
Similar to what I/we've done here too. Besides there not being
I believe it is "Guix container technology," so just Guile code. I think this code is where most of it lives. In any event, the same idea of controlling what directories are exposed, environment variables, a separate profile of available libraries and programs, etc.
I'll provide this later from the original computer to be sure. The configuration options in LLVM from Guix's standard (see here) are changing
I understand and agree on the approach. An FHS container is handy for Guix to have for exactly these reasons, and I think having it mimic something like Arch or Ubuntu as closely as possible makes a lot of sense on both sides.
I understand, but let me also say that we provide binary substitutes with Guix (and now Nonguix packages) and focus on reproducibility. Which actually makes this, I would say, a very good testing system since you can easily deploy the same environment repeatedly and on different hardware (or VM images). Guix is certainly not big enough or the ideal target here (as one that follows free software guidelines), but did want to mention that. It is fun technology. Thanks again for all your time on this, I've learned a lot about how this works! I'll provide some more logs and happy to test any development builds or provide info for reproducibility. |
I ran Steam with Here are the logs and And here are the same logs for the non-working configuration. My test game (Ape Out) was just hanging it seemed like, so I also tried to run Hades (which gave an error). I can redo this in a different way if it didn't capture what you want. I wasn't sure if the log option had changed something, so I ran without it as I had before: this log shows the output. The difference is noted in the previous comment, where the working configuration has LLVM with the dylib option configured (and Mesa built against that LLVM) and the non-working is original (no dylib but (I'm aware there are some other errors popping up; still working on the container and packaging but that's for later.) |
Sorry, that is definitely not going to work for us, because when you run a game (or Proton) in the Steam Linux Runtime environment, the executable you're using comes from the game (or Proton or pressure-vessel or the Steam Linux Runtime) - but to use your graphics drivers from the host system, we will usually also need to use your glibc. We also cannot work around this by overriding If Guix is patching glibc anyway, would it be possible to make it try a constant (hard-coded) path such as Or the patched glibc could look at a Guix-specific environment variable, perhaps something like The only other option that I can see is for the closer-to-FHS environment that is used to run Steam to have its own build of glibc that has one of these configurations/patches, even if the normal glibc that is used to run other programs does not. |
In general, when getting this stuff working on an unusual distro, it's best to use a native Linux game and configure it to use the "Steam Linux Runtime" compatibility tool. This removes a lot of complexity from the overall system by not needing to use Proton to emulate Windows, and lets you concentrate on the pressure-vessel parts that are distro-specific. I used Floating Point for a lot of the early pressure-vessel development, because it's tiny and can run on weak hardware. When you have that working, the next step is something like TF2 (OpenGL) or Artifact (Vulkan), and then start trying Proton.
I'm sure, but if the choice is between spending time making Steam and pressure-vessel work better on mainstream distros like Debian/Fedora/Arch, or spending the same amount of time making it work better on Guix, I have to do the one that benefits more people, and that's the mainstream distros. |
I think we are going to need a working Similarly, we need your glibc to read a If your
Please try downloading https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/jobs/69589/artifacts/raw/_build/production/pressure-vessel-bin.tar.gz, using it to replace your This is built from a branch which adds the same special cases for |
Some of your logs seem to be getting truncated by Github, and some of them seem to be 0 bytes long. It might help to attach them as attachments instead of uploading them as a Gist; or it might help to stop applying I haven't yet looked into why The |
We sort of do, but it's in code rather than data. We'll probably make it iterate through an array if a third one is needed (Nix, Guix and some third thing). |
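To illustrate the "iterate through an array" idea from the user's side, here is a small Python sketch that checks a list of candidate store directories and builds a value for PRESSURE_VESSEL_FILESYSTEMS_RO. This is not pressure-vessel's code: the candidate list (only /gnu/store is named in this thread) and the ":" separator for multiple paths are assumptions.

```python
import os
from typing import Callable, Iterable, List

# Assumed candidate store locations; only /gnu/store appears in this thread.
CANDIDATE_STORES = ("/nix", "/gnu/store")


def existing_stores(
    candidates: Iterable[str] = CANDIDATE_STORES,
    exists: Callable[[str], bool] = os.path.isdir,
) -> List[str]:
    """Return the candidate store directories that exist on this system."""
    return [path for path in candidates if exists(path)]


def filesystems_ro_value(stores: Iterable[str]) -> str:
    """Join store paths into one value (assumes a ':'-separated list)."""
    return ":".join(stores)
```

With only Guix installed, `filesystems_ro_value(existing_stores())` would give `/gnu/store`, matching the value set by hand in the original report.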
I'll respond to your questions and get the complete logs to you in a later comment, but I wanted to make sure the lede didn't get buried earlier: Steam and Proton do work on Guix for me. This is with the changes of:
From what you say looks like the separate |
Sorry about the logs, somehow completely missed the attaching option here. These are the logs from the working version (LLVM with
And these from the non-working version (LLVM with |
On the ld cache issue: I don't see Guix changing this without some good reason, though maybe they could provide an official variant that uses the normal However, it is just a one-line change to also make have a |
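For checking what a given loader cache actually contains, a rough Python sketch that parses the text printed by `ldconfig -p` might look like this. The helper names are hypothetical, and whether `ldconfig` in a Guix FHS container reads the cache you expect is exactly the open question here, so treat it as a debugging aid only.

```python
import re
from typing import List, Tuple

# Matches entries such as "\tlibGL.so.1 (libc6,x86-64) => /usr/lib/libGL.so.1".
# The summary header line has no leading whitespace and is skipped.
_CACHE_LINE = re.compile(r"^\s+(\S+)\s+\(([^)]*)\)\s+=>\s+(\S+)")


def parse_ldconfig(output: str) -> List[Tuple[str, str, str]]:
    """Return (soname, flags, path) tuples from `ldconfig -p` output."""
    entries = []
    for line in output.splitlines():
        match = _CACHE_LINE.match(line)
        if match is not None:
            entries.append(match.groups())
    return entries


def cache_paths(output: str, soname: str) -> List[str]:
    """Return every cached path recorded for `soname`, in cache order."""
    return [path for name, _flags, path in parse_ldconfig(output) if name == soname]
```

An empty result for a library that exists on disk suggests the cache being read does not list it.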
I ran with the pressure-vessel build you linked earlier. While some warnings did go away (in the System Information no more warnings like
Here is the System Information output: working-config-2.txt And the log (only ran with (If I reran with the same runtime you linked but still adding the EDIT: Floating Point works with your runtime and the original version (both without the additional |
This usually means your We "promoted" the beta version of "Steam Linux Runtime - soldier" to stable status recently, so perhaps your Steam client picked up that update and overwrote the modified The way to use a modified
The two configurations that I would expect to work are:
Running the production version of pressure-vessel on Guix, without the After testing a special build of pressure-vessel, you can get back to the production version with: right-click on You can find the version number of the production version that Steam is currently shipping by looking at
In case I didn't make it clear enough: if you just install a native Linux game like Floating Point and run it, it runs with the traditional To test pressure-vessel with native Linux games, you have to use: right-click on game -> If you have one configuration that works and one configuration that doesn't, it's useful if you can share the two logs for comparison.
OK, that seems like a pragmatic approach, and I think it's similar to how this already works on NixOS. |
Hmm, this seems wrong:
And in your Compare with the working build, which successfully finds its LLVM libraries (in this case one big library rather than a lot of little libraries):
|
You are correct, the runtime you provided was overwritten, sorry I missed that. I've redone the test and it works as expected, no need for the System Information output: working-config-3.txt Log of running Hades (default DirectX mode), no more "unlikely to appear" messages: slr-app1145360-t20211208T210238.log |
Wow. One of those classic, can't believe I didn't see it earlier obvious mistakes: The answer to this weird LLVM lib problem was right there in front of us (really me, I should have seen this): if you'll take a look at the top line of the (In case anyone cares about the details: Guix by default will install the most recent version of a package available, so Sorry about all that! And sorry to myself for all this running around. I can confirm using the proper For me this closes the original issue of driver loading in pressure-vessel on Guix. The problem in general was providing all the libraries needed to load graphics drivers, in this case LLVM for Mesa, where of course you need to match the right version. Let me summarize where we are overall with pressure-vessel on Guix:
With this, Steam and pressure-vessel (and Proton) work in my testing. There may be other bugs I need to find and fix (running into a |
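Since the root cause was an LLVM version that did not match what Mesa was built against, a check along these lines could catch it earlier. This is a Python sketch under assumptions: the file-name patterns (`libLLVM-12.so`, `libLLVMCore.so.11`) are common LLVM naming conventions used for illustration, not names taken from the logs in this issue.

```python
import re
from typing import Iterable, List, Optional

# "libLLVM-12.so" style (single shared library with the version in the name).
_DASHED = re.compile(r"^libLLVM-(\d+)")
# "libLLVMCore.so.11" or "libLLVM.so.18.1" style (version after ".so.").
_SUFFIXED = re.compile(r"^libLLVM\w*\.so\.(\d+)")


def llvm_major(name: str) -> Optional[int]:
    """Extract the LLVM major version from a library file name, if any."""
    for pattern in (_DASHED, _SUFFIXED):
        match = pattern.match(name)
        if match is not None:
            return int(match.group(1))
    return None


def version_mismatches(needed: Iterable[str], provided: Iterable[str]) -> List[str]:
    """Return needed LLVM libraries whose major version nobody provides."""
    available = {llvm_major(name) for name in provided} - {None}
    return sorted(
        name
        for name in needed
        if llvm_major(name) is not None and llvm_major(name) not in available
    )
```

Here `needed` would be the driver's dependencies and `provided` the LLVM libraries available in the container's profile.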
Now that you've confirmed that the test-build helps, I'll get that change merged soon. Expect to see it in a future beta of
Yes. Half of the change I asked you to test is effectively the same as |
With the change for the filesystem mounting coming from upstream, this issue is closed for me. But if it is better to keep it open until a beta comes out that has the change, feel free to leave it open. Thanks again for navigating this with me and making the change! |
I see this was merged upstream, so I'll go ahead and close this. If this should remain open until an actual beta or stable release is out, feel free to reopen this until then, @kisak-valve. |
This change has gone out in It is likely to be "promoted" into the stable releases that are used by default at some point in the future. |
Update: please see most recent comment for what I think is the actual issue here, though these other comments are hopefully helpful for reproducing and determining what is happening in the actual code.
Pressure Vessel can load the host's GL drivers from Mesa and will note that it is capturing these drivers to place in overrides. However, not all of them are actually there once it comes time for the game to launch (i.e. after LIBGL_DRIVERS_PATH is overwritten).
For example, here is the relevant output for the 64bit radeonsi_dri driver, which exists for 32bit as well as other drivers like swrast:
And then later the errors from libGL (I believe the failing to find configuration files is innocuous or similar to #432 in Nix):
And indeed, looking at the overrides folder in the container shows some of the files missing:
This happens if I make these files available in another location like /lib/dri and set LIBGL_DRIVERS_PATH, where I can see pressure vessel picking them up from this path, but with the same results. I thought this was due to symlinking before (there are only a couple of different drivers for Mesa, most of these _dri.so files are symlinks for me) but still happens building Mesa in a more standard way, as in the above output.
The same file is loaded successfully earlier, before the final game launching step, e.g.
Some technical details of my setup below, but any clues I could look into? Why is this happening for some of the drivers but not all? In particular it is happening for the exact file that is loaded earlier, but not sure how this would affect the capturing. How can I debug what is happening, as there are no errors around the capsule-capture-libs step, which sounds like it is doing exactly what it should, and does get all the other drivers?
Some technical details:
This is on GNU Guix, which has a similar format to NixOS. What is relevant is that everything is stored in /gnu/store and then symlinked. So Steam is launched in a container that mimics a typical FHS setup (similar to how Nix does it) creating all the usual folders in a (Guix, not bwrap) container. /gnu/store is visible and otherwise works in Steam. Steam has no problem finding things in /gnu/store like this and launches, installs games, but fails in the final container for launching a game. I run Steam with
LIBGL_DEBUG=verbose PRESSURE_VESSEL_FILESYSTEMS_RO=/gnu/store G_MESSAGES_DEBUG=1 PRESSURE_VESSEL_VERBOSE=1 STEAM_LINUX_RUNTIME_VERBOSE=1
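For reproducing this setup from a script, the same debug variables can be layered on top of the current environment. The following is a minimal Python sketch: the variable names and values are the ones quoted above, while the helper names and the `steam` command are assumptions (the real launcher inside the Guix container may be called something else).

```python
import os
import subprocess
from typing import Dict, Mapping, Sequence

# Debug settings exactly as quoted in the report above.
DEBUG_VARS = {
    "LIBGL_DEBUG": "verbose",
    "PRESSURE_VESSEL_FILESYSTEMS_RO": "/gnu/store",
    "G_MESSAGES_DEBUG": "1",
    "PRESSURE_VESSEL_VERBOSE": "1",
    "STEAM_LINUX_RUNTIME_VERBOSE": "1",
}


def debug_env(base: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of `base` with the debug variables applied on top."""
    env = dict(base)
    env.update(DEBUG_VARS)
    return env


def launch(command: Sequence[str] = ("steam",)) -> int:
    """Run the (assumed) Steam launcher with the debug environment."""
    result = subprocess.run(list(command), env=debug_env(os.environ), check=False)
    return result.returncode
```

Calling `launch()` is roughly equivalent to prefixing the command with the variables on a shell command line.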
I should note I'm working on updating a Steam package to have more recent Proton working (older ones work, before the runtime container was added in Proton 5.13). I think I've successfully set everything up to work, getting around the same issues raised for NixOS in e.g. NixOS/nixpkgs#100655 But this is where I'm stuck and not sure what else I can change on the host OS side that would have an effect.