
Flesh out NVIDIA support for biarch and multiarch systems #4207

Merged
merged 5 commits into from Nov 15, 2017

Conversation

ikeydoherty
Contributor

This change implements proper support for NVIDIA on both biarch and multiarch systems, chiefly:

  • Access to the multilib (emul32) variant of the libraries within the Snaps
  • Exposes Vulkan ICD files to the Snap (/var/lib/snapd/lib/vulkan)
  • Inserts new multilib directory into SNAP_LIBRARY_PATH by default
  • Refactors the mount code to allow future extensions (i.e. for OpenCL/ICD sharing)
  • Ensures parity of library availability between biarch and multiarch systems

This refactors mount-support-nvidia somewhat to allow us to expose the
32-bit NVIDIA drivers (from `/usr/lib32`) to the guest snap on both multiarch
and biarch systems, in their `/var/lib/snapd/lib/gl32` directory.

The `SNAP_LIBRARY_PATH` is modified to include the new directory, so that
32-bit processes in normal snapcraft packages using the helpers do not need
to do anything to benefit from them.
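The effect on the search path can be sketched as follows. The directory values are the ones named in this PR; the way the path is composed with any pre-existing `LD_LIBRARY_PATH` is an illustrative assumption, not snapd's actual code:

```shell
# Sketch (assumption, not snapd's real code): how the new gl32 directory
# ends up on a snap process's library search path.
SNAP_LIBRARY_PATH="/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void"
EXISTING_LD_PATH=""  # whatever LD_LIBRARY_PATH the process already had
# Prepend SNAP_LIBRARY_PATH, appending the old value only if it was non-empty:
COMBINED="${SNAP_LIBRARY_PATH}${EXISTING_LD_PATH:+:$EXISTING_LD_PATH}"
echo "$COMBINED"
```

Mixing 32-bit and 64-bit directories on one path is safe because the dynamic linker skips libraries whose ELFCLASS does not match the process.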

This particularly benefits the LSI project by making the 32-bit drivers
available to Steam.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
This builds upon the NVIDIA support by copying the ICD files from the
static host location `/usr/share/vulkan/icd.d` into a new tmpfs under
`/var/lib/snapd/lib/vulkan` so that interested parties can make use
of them from within the snap environment.

This can be used by interested snaps by setting VK_ICD_FILENAMES to
this directory, and is a requirement for correctly implementing full
Vulkan support in the linux-steam-integration snap.
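Setting the variable could look like the sketch below. The ICD JSON file name varies by distribution, so the sketch globs for it; a temporary directory with a fake ICD stands in for the real `/var/lib/snapd/lib/vulkan` mount so the snippet is self-contained:

```shell
# Sketch: point the Vulkan loader at the ICDs exposed under
# /var/lib/snapd/lib/vulkan. The temp dir here is a stand-in so the
# snippet runs anywhere; on a real system use the snapd path directly.
ICD_DIR=$(mktemp -d)              # stand-in for /var/lib/snapd/lib/vulkan
touch "$ICD_DIR/10_nvidia.json"   # Solus-style NVIDIA ICD file name
# Join every ICD JSON into the colon-separated list the loader expects:
VK_ICD_FILENAMES=$(ls "$ICD_DIR"/*.json | paste -sd: -)
export VK_ICD_FILENAMES
echo "$VK_ICD_FILENAMES"
```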

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
Additionally, let's ensure we have parity with the NVIDIA bind-mounts on
multiarch by ensuring that our vdpau driver for NVIDIA is also accessible
under the rootfs.

To make VDPAU accessible, one must set `VDPAU_DRIVER_PATH` to include
the `/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl/vdpau` paths within
the snap to ensure that vdpau drivers for both biarch and multiarch hosts
can be found trivially.
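As a concrete sketch, the snap-side setting described above is just:

```shell
# Sketch: the VDPAU driver search path from this PR's description, so
# libvdpau can locate the NVIDIA backend on both biarch and multiarch hosts.
VDPAU_DRIVER_PATH="/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl/vdpau"
export VDPAU_DRIVER_PATH
echo "$VDPAU_DRIVER_PATH"
```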

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
@ikeydoherty
Contributor Author

Note that there is probably going to be a seccomp build failure in the tests. I saw this locally even on a clean git checkout, as it wanted a daemon user to exist, and then some setuid setup for it, etc. The test is non-portable and causes ancillary failures.

@codecov-io

codecov-io commented Nov 13, 2017

Codecov Report

Merging #4207 into master will increase coverage by 0.07%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master   #4207      +/-   ##
=========================================
+ Coverage   75.72%   75.8%   +0.07%     
=========================================
  Files         437     439       +2     
  Lines       37869   37995     +126     
=========================================
+ Hits        28678   28801     +123     
- Misses       7190    7192       +2     
- Partials     2001    2002       +1
Impacted Files Coverage Δ
interfaces/builtin/opengl.go 100% <ø> (ø) ⬆️
snap/snapenv/snapenv.go 95.12% <100%> (ø) ⬆️
overlord/ifacestate/helpers.go 59.6% <0%> (-0.67%) ⬇️
overlord/snapstate/handlers.go 65.75% <0%> (-0.03%) ⬇️
corecfg/refresh.go 70% <0%> (ø)
overlord/snapstate/cookies.go 69.69% <0%> (ø)
overlord/state/change.go 95.83% <0%> (+0.24%) ⬆️
overlord/snapstate/snapmgr.go 84% <0%> (+1.89%) ⬆️
corecfg/corecfg.go 50% <0%> (+5.55%) ⬆️
cmd/snap-update-ns/utils.go 100% <0%> (+14.06%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d82a909...0d9a468. Read the comment docs.

@ikeydoherty
Contributor Author

Validation instructions for LSI to check this change

  • Have the NVIDIA proprietary drivers installed and active, with the 32-bit driver available.
  • Ensure snapd is fully restarted with the changes and the AppArmor profiles are reloaded.
  • Ensure any previous LSI/runtime installs are removed first.
  • Make sure you get to the initial Steam login UI. Gratz, multilib works!
wget https://packages.solus-project.com/lsi/1/solus-runtime-gaming_0.0.0_amd64.snap
wget https://packages.solus-project.com/lsi/1/linux-steam-integration_0.6_amd64.snap
sudo snap install --dangerous solus-runtime-gaming*.snap
sudo snap install --dangerous --devmode linux-steam-integration*.snap
snap run linux-steam-integration

Collaborator

@bboozzoo bboozzoo left a comment


Looks good to me otherwise.

# Support reading the Vulkan ICD files
/var/lib/snapd/lib/vulkan/ r,
/var/lib/snapd/lib/vulkan/** r,
/var/lib/snapd/hostfs/usr/share/vulkan/icd.d/10_nvidia*.json r,
Collaborator


`/var/lib/snapd/hostfs/usr/share/vulkan/icd.d/*.json r,` should catch ICDs from all drivers.

Contributor Author


And we don't want all drivers, that's the point. We only want the NVIDIA drivers exposed, not the Mesa Vulkan drivers :)

Contributor Author


TL;DR: a runtime should provide its own Mesa and Vulkan loader (libvulkan.so.1), like solus-runtime-gaming does, but we accept host Vulkan ICDs to allow the NVIDIA Vulkan stack to work properly.

Collaborator


Maybe I'm missing something here, but we already allow use of the host's GL implementation. I'd guess the same fits Vulkan: we should probably allow access to all ICDs, otherwise the client ends up with access to only a single implementation.

If you want to use a specific ICD, there's the VK_ICD_FILENAMES environment variable.

Contributor Author


No, we don't - we only allow the host GL libraries for NVIDIA. Allowing Mesa to cross-contaminate would be utterly catastrophic. Likewise, exposing the wrong Vulkan ICDs, which could clash with the internal ICDs provided by the snap's own Mesa build, would cause things to break.

solus-runtime-gaming already ships its own Mesa and default Vulkan ICDs for the open-source Vulkan drivers. We glob this Vulkan directory to add the host NVIDIA ICDs, as that's the only host GL library we'd ever share. FWIW, I'm fully aware of VK_ICD_FILENAMES; please read the actual commit messages where I documented that.

Collaborator


Makes sense now. Thanks for taking the time to explain.

Contributor Author


It's ugly, evil stuff and I wish we didn't have to support it, but there it is :/ FWIW, I'll be following up with a highly similar change at some point to expose the OpenCL ICD definitions (primarily to get CUDA working), but that will require me to send a patch to Khronos to add something like an OCL_ICD_FILENAMES variable too. We'd want this for things like Blender+NVCC combinations, etc. Having the context here now will help when it comes to implementing that next step. :) It'd be down to the internal runtime to provide something like Beignet as the open-source solution.

Collaborator


I remember it being a bit of a hassle with OpenCL. If all else fails, you can always try mounting a tmpfs over the vendors directory and exposing just the file that you need. Ugly as hell, but you wouldn't need to go back and forth with the Khronos group :)

Contributor Author


This is true, but if necessary we can carry the relevant patch in the runtime; the ocl-icd package is only tiny, so patching it to actually support this is feasible.

@zyga
Collaborator

zyga commented Nov 13, 2017

Please gofmt the code. EDIT: I just did that.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
@ikeydoherty
Contributor Author

Sorry, I didn't even see the indent file there, nor find references to it. I sort of had to wing it with regard to style, as it's a weird one.

It turns out Ubuntu uses "nvidia" as the prefix to the file without a
"10_" style prefix, so we'll ensure we catch all NVIDIA ICD json files
with this new glob.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
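The widened glob can be sanity-checked against both naming schemes. The exact pattern snapd ended up with is an assumption here (something equivalent to `*nvidia*.json`):

```shell
# Sketch: confirm a widened *nvidia*.json glob matches both the Solus-style
# and Ubuntu-style NVIDIA ICD names while leaving other vendors' ICDs alone.
# The exact glob used by snapd is an assumption in this sketch.
dir=$(mktemp -d)
touch "$dir/10_nvidia.json" "$dir/nvidia_icd.json" "$dir/intel_icd.json"
# Glob expansion returns matches sorted; word splitting joins them with spaces:
matched=$(cd "$dir" && echo *nvidia*.json)
echo "$matched"
```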
Contributor

@mvo5 mvo5 left a comment


Looks great, thanks for working on this!

@ikeydoherty
Contributor Author

Oh my pleasure @mvo5 :) Can't wait to get this stuff out in the wild and in the store :D

Collaborator

@zyga zyga left a comment


LGTM, interesting work!

@@ -129,7 +129,7 @@ func (s *HTestSuite) TestSnapRunSnapExecEnv(c *C) {
 "SNAP_ARCH": arch.UbuntuArchitecture(),
 "SNAP_COMMON": "/var/snap/snapname/common",
 "SNAP_DATA": "/var/snap/snapname/42",
-"SNAP_LIBRARY_PATH": "/var/lib/snapd/lib/gl:/var/lib/snapd/void",
+"SNAP_LIBRARY_PATH": "/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void",
Collaborator


I wasn't aware 32-bit and 64-bit libraries can be on the same search path at once.

Contributor Author


Sure, the dynamic linker will just skip files with a mismatched ELFCLASS.

@mvo5 mvo5 merged commit 096ede7 into snapcore:master Nov 15, 2017
@ikeydoherty
Contributor Author

Yay - ty :)
