Flesh out NVIDIA support for biarch and multiarch systems #4207

Merged
merged 5 commits into from Nov 15, 2017

Conversation

Projects
None yet
5 participants
Contributor

ikeydoherty commented Nov 13, 2017

This change implements proper support for NVIDIA on both biarch and multiarch systems, chiefly:

  • Access to the multilib (emul32) variant of the libraries within the Snaps
  • Exposes Vulkan ICD files to the Snap (/var/lib/snapd/lib/vulkan)
  • Inserts new multilib directory into SNAP_LIBRARY_PATH by default
  • Refactors the mount code to allow future extensions (i.e. for OpenCL/ICD sharing)
  • Ensures parity between library availability of multiarch and biarch system

ikeydoherty added some commits Nov 13, 2017

cmd/snap-confine: Implement full 32-bit NVIDIA driver support
This refactors mount-support-nvidia somewhat to allow us to expose the
32-bit NVIDIA drivers (from `/usr/lib32`) to the guest snap on both multiarch
and biarch systems, in their `/var/lib/snapd/lib/gl32` directory.

The `SNAP_LIBRARY_PATH` is modified to include the new directory, so that
32-bit processes in normal snapcraft packages using the helpers do not need
to do anything to benefit from them.

This particularly benefits the LSI project by making the 32-bit drivers
available to Steam.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
cmd: Support exposing NVIDIA Vulkan ICD files to the snaps
This builds upon the NVIDIA support by copying the ICD files from the
static host location `/usr/share/vulkan/icd.d` into a new tmpfs under
`/var/lib/snapd/lib/vulkan` so that interested parties can make use
of them from within the snap environment.

This can be used by interested snaps by setting VK_ICD_FILENAMES to
this directory, and is a requirement for correctly implementing full
Vulkan support in the linux-steam-integration snap.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
cmd/snap-confine: Add missing bi-arch NVIDIA files
Additionally, let's ensure we have parity with the NVIDIA bind-mounts on
multiarch by ensuring that our vdpau driver for NVIDIA is also accessible
under the rootfs.

To make VDPAU accessible, one must set `VDPAU_DRIVER_PATH` to include
the `/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl/vdpau` paths within
the snap to ensure that vdpau drivers for both biarch and multiarch hosts
can be found trivially.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>
Contributor

ikeydoherty commented Nov 13, 2017

Note that there is probably going to be a seccomp build failure in the tests, I saw this locally even on clean git as it wanted a daemon user to exist, and then it wanted some setuid stuff for it, etc. The test is non portable and causes ancillary failures.

@ikeydoherty ikeydoherty referenced this pull request in solus-project/linux-steam-integration Nov 13, 2017

Closed

Snap version 0.6 fails to launch on Ubuntu 17.10 #35

codecov-io commented Nov 13, 2017

Codecov Report

Merging #4207 into master will increase coverage by 0.07%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #4207      +/-   ##
=========================================
+ Coverage   75.72%   75.8%   +0.07%     
=========================================
  Files         437     439       +2     
  Lines       37869   37995     +126     
=========================================
+ Hits        28678   28801     +123     
- Misses       7190    7192       +2     
- Partials     2001    2002       +1
Impacted Files Coverage Δ
interfaces/builtin/opengl.go 100% <ø> (ø) ⬆️
snap/snapenv/snapenv.go 95.12% <100%> (ø) ⬆️
overlord/ifacestate/helpers.go 59.6% <0%> (-0.67%) ⬇️
overlord/snapstate/handlers.go 65.75% <0%> (-0.03%) ⬇️
corecfg/refresh.go 70% <0%> (ø)
overlord/snapstate/cookies.go 69.69% <0%> (ø)
overlord/state/change.go 95.83% <0%> (+0.24%) ⬆️
overlord/snapstate/snapmgr.go 84% <0%> (+1.89%) ⬆️
corecfg/corecfg.go 50% <0%> (+5.55%) ⬆️
cmd/snap-update-ns/utils.go 100% <0%> (+14.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d82a909...0d9a468. Read the comment docs.

Contributor

ikeydoherty commented Nov 13, 2017

Validation instructions for LSI to check this change

  • Have NVIDIA proprietary drivers installed, active, and the 32-bit driver available.
  • Ensure snapd is fully restarted with the changes and apparmor is reloaded
  • Ensure any previous LSI/runtime installs are removed first..
  • Make sure you get to the initial steam login ui. Gratz, multilib works!
wget https://packages.solus-project.com/lsi/1/solus-runtime-gaming_0.0.0_amd64.snap
wget https://packages.solus-project.com/lsi/1/linux-steam-integration_0.6_amd64.snap
sudo snap install --dangerous solus-runtime-gaming*.snap
sudo snap install --dangerous --devmode linux-steam-integration*.snap
snap run linux-steam-integration

Looks good to me otherwise.

interfaces/builtin/opengl.go
+ # Support reading the Vulkan ICD files
+ /var/lib/snapd/lib/vulkan/ r,
+ /var/lib/snapd/lib/vulkan/** r,
+ /var/lib/snapd/hostfs/usr/share/vulkan/icd.d/10_nvidia*.json r,
@bboozzoo

bboozzoo Nov 13, 2017

Contributor

/var/lib/snapd/hostfs/usr/share/vulkan/icd.d/*.json r, should catch ICDs from all drivers.

@ikeydoherty

ikeydoherty Nov 13, 2017

Contributor

And we don't want all drivers, that's the point. We only want NVIDIA drivers exposed - not having mesa vulkan exposed :)

@ikeydoherty

ikeydoherty Nov 13, 2017

Contributor

TLDR a runtime should provide its own mesa and vulkan (libvulkan.so.1) like the solus-runtime-gaming does, but we accept host ICDs from vulkan to allow the NVIDIA vulkan stuff to work properly.

@bboozzoo

bboozzoo Nov 13, 2017

Contributor

Maybe I'm missing something here, but we're already allowing to use the host's GL implementation. I guess the same would be fitting for Vulkan. Meaning, we should probably allow to access all ICDs. otherwise the client would end up having access to a single implementation only.

If you'd want to use a specific ICD then there's VULKAN_ICD_FILENAMES environment library.

@ikeydoherty

ikeydoherty Nov 13, 2017

Contributor

No we don't - we only allow the host GL library for the NVIDIA library. Allowing mesa to cross-contaminate would be utterly catastrophic. Likewise exposing the invalid ICDs for Vulkan which have the potential to clash with the internal vulkan icds provided by the snaps own mesa build will cause things to break.

solus-runtime-gaming already has its own mesa, and default vulkan icds for the open source vulkan drivers. We glob this vulkan directory to add the host NVIDIA ICDs as thats the only host GL library we'd ever share. FWIW I'm fully aware of VK_ICD_FILENAMES, please read the actual commit messages where I documented that.

@bboozzoo

bboozzoo Nov 13, 2017

Contributor

Makes sense now. Thanks for taking the time to explain.

@ikeydoherty

ikeydoherty Nov 13, 2017

Contributor

Its ugly ugly evil stuff and I wish we didn't have to support it, but there it is :/ FWIW I'll be following up with a highly similar change at some point which will expose the OpenCL ICD definitions (primarily to have CUDA working) - but this will require me to work on a patch to send to Khronos to have something like a OCL_ICD_FILENAMES variable too. We'd want this for stuff like Blender+NVCC combinations, etc. Having the context here now though will help when it comes to implementing that next step. :) It'd be down to the internal runtime to provide something like beignet as the open source solution.

@bboozzoo

bboozzoo Nov 13, 2017

Contributor

I remember it to be a bit of a hassle with OpenCL. If all else fails you can always try mounting tmpfs over the vendors directory and expose the just file that you need. Ugly as hell, but you wouldn't need to through back and forth with the Khronos group :)

@ikeydoherty

ikeydoherty Nov 13, 2017

Contributor

This is true, but if necessary we can carry the relevant patch in the runtime, as the ocl-icd package is only tiny for the sake of the patch to actually support it

Contributor

zyga commented Nov 13, 2017

Please make fmt the code. EDIT: I just did that.

cmd: make fmt
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Contributor

ikeydoherty commented Nov 13, 2017

Sorry didn't even see the indent file there nor find references to it. Sorta had to wind it wrt. style as its a weird one.

cmd/snap-confine: Loosen the NVIDIA Vulkan ICD glob
It turns out Ubuntu uses "nvidia" as the prefix to the file without a
"10_" style prefix, so we'll ensure we catch all NVIDIA ICD json files
with this new glob.

Signed-off-by: Ikey Doherty <ikey@solus-project.com>

mvo5 approved these changes Nov 14, 2017

Looks great, thanks for working on this!

Contributor

ikeydoherty commented Nov 14, 2017

Oh my pleasure @mvo5 :) Can't wait to get this stuff out in the wild and in the store :D

zyga approved these changes Nov 14, 2017

LGTM, interesting work!

@@ -129,7 +129,7 @@ func (s *HTestSuite) TestSnapRunSnapExecEnv(c *C) {
"SNAP_ARCH": arch.UbuntuArchitecture(),
"SNAP_COMMON": "/var/snap/snapname/common",
"SNAP_DATA": "/var/snap/snapname/42",
- "SNAP_LIBRARY_PATH": "/var/lib/snapd/lib/gl:/var/lib/snapd/void",
+ "SNAP_LIBRARY_PATH": "/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void",
@zyga

zyga Nov 14, 2017

Contributor

I wasn't aware this can be in the same path at once.

@ikeydoherty

ikeydoherty Nov 14, 2017

Contributor

Sure, the linker will just skip invalid ELFCLASS files

@mvo5 mvo5 merged commit 096ede7 into snapcore:master Nov 15, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Contributor

ikeydoherty commented Nov 15, 2017

Yay - ty :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment