interfaces/builtin/opengl: allow access to Tegra X1 #6623
Conversation
Allow access to fuses and devices created by the Tegra X1 drivers, so CUDA can be used on this chip.
Looks good.
unix (bind,listen) type=seqpacket addr="@cuda-uvmfd-[0-9a-f]*",
/{dev,run}/shm/cuda.* rw,
/dev/nvhost-* rw,
/dev/nvmap rw,
It's annoying how many different devices there are for nvidia... Also note that the /dev/nvhost-* and /dev/nvmap devices have been prone to security issues, with papers written about them (this is only one of many CVEs: https://googleprojectzero.blogspot.com/2015/01/exploiting-nvmap-to-escape-chrome.html). My concerns regarding maintenance and direct access to the devices are expressed here: https://bugs.launchpad.net/ubuntu/+source/apparmor-easyprof-ubuntu/+bug/1197133 (though my main worry is direct access to all graphics devices, not just nvidia).
Unfortunately, access to these devices is a requirement for the hosts that have them, and while the accesses make me uncomfortable, there isn't anything we can do about it atm given the state of graphics driver access on Linux.
Regardless of my worry, this is going to need changes to setup_devices_cgroup() in cmd/snap-confine/udev-support.c (and the apparmor profile), no?
@jdstrand agreed about required changes to snap-confine's Nvidia device handling.
@zyga @jdstrand I have added the nvmap and nvhost-* devices to the udev plug so they get added to the cgroup of the confined process. Fortunately, in this case nvidia supports access from sysfs, so there is no need to modify udev-support.c.
One interesting thing that might be a bug: originally I was able to access the devices without changing openglConnectedPlugUDev, because actually no cgroup was being created for the snap processes. I think the reason was that the rules:
KERNEL=="renderD[0-9]*", TAG+="snap_matrix-mul_matrixMul"
KERNEL=="vchiq", TAG+="snap_matrix-mul_matrixMul"
SUBSYSTEM=="drm", KERNEL=="card[0-9]*", TAG+="snap_matrix-mul_matrixMul"
TAG=="snap_matrix-mul_matrixMul", RUN+="/usr/lib/snapd/snap-device-helper $env{ACTION} snap_matrix-mul_matrixMul $devpath $major:$minor"
did not trigger the call to snap-device-helper, as none of renderD[0-9]*, vchiq or the drm devices actually exist in the system. Maybe there should always be a rule here to make sure the call really happens (like KERNEL=="null"?)
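As context for the rules quoted above, here is a minimal sketch of how an interface's udev match snippets plus the snap's security tag become the tagging rules and the final RUN rule. This is illustrative only: `rulesFor` is a hypothetical helper, not snapd's actual API.

```go
package main

import "fmt"

// rulesFor mirrors, in spirit, how snapd turns an interface's udev match
// snippets into per-snap tagging rules, followed by one RUN rule that
// invokes snap-device-helper for anything carrying the snap's tag.
// Hypothetical helper for illustration; snapd's real code differs.
func rulesFor(tag string, matches []string) []string {
	var rules []string
	for _, m := range matches {
		// Each match snippet gains a TAG+= assignment with the snap's tag.
		rules = append(rules, fmt.Sprintf(`%s, TAG+="%s"`, m, tag))
	}
	// One trailing rule fires snap-device-helper for tagged devices only,
	// which is why nothing happens if none of the matched devices exist.
	rules = append(rules, fmt.Sprintf(
		`TAG=="%s", RUN+="/usr/lib/snapd/snap-device-helper $env{ACTION} %s $devpath $major:$minor"`,
		tag, tag))
	return rules
}

func main() {
	for _, r := range rulesFor("snap_matrix-mul_matrixMul", []string{
		`KERNEL=="nvhost-*"`,
		`KERNEL=="nvmap"`,
	}) {
		fmt.Println(r)
	}
}
```

Run against the nvhost-*/nvmap matches added in this PR, this prints tagging rules of the same shape as the ones quoted in the comment above.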
We've considered unconditional use of the device cgroup, but that is known to break at least certain software (eg, greengrass) which is why we've recently added the ability for an interface to opt out of setting up the device cgroup.
But if connectedPlugUDev is defined, shouldn't a cgroup always be created when the interface is connected? What I am saying is that, depending on whether some devices happen to exist in the system, the cgroup would be created or not. That would mean, for instance, that the same snap would not have a cgroup when installed on a Jetson TX1, but would have one on, say, a PC laptop with an nVidia GPU (talking about this interface before the PR).
No. The current design is that a device cgroup is used, and the devices added, only if there are tagged devices assigned to the snap, since from a security POV there is no point in using a device cgroup if nothing is tagged for the device (indeed, as mentioned previously, there are times we explicitly don't want the device cgroup). Your evaluation of the situation is accurate and is the current design. We may change that design in the future, but it is not considered a bug.
Where it can be a bug is with hot-pluggable devices such as joysticks, since a snap might be started with no gamepads plugged in and then a gamepad is plugged in after the fact. This was the precise scenario when we considered making the device cgroup mandatory, but felt at the time it was too risky a behavior change (it might be reconsidered at a future date). Feel free to add /dev/full to the opengl interface if you feel this is an issue for this interface.
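For readers unfamiliar with what happens once a device is tagged: on cgroup v1, a helper in the spirit of snap-device-helper writes a device entry into the snap's devices cgroup, allowing on "add" and denying on "remove". The sketch below shows only the string formats involved; the function names and the device numbers are illustrative assumptions, not taken from snapd.

```go
package main

import "fmt"

// cgroupEntry builds the "<type> <major>:<minor> rwm" line that the devices
// cgroup (v1) accepts in devices.allow / devices.deny. Illustrative only.
func cgroupEntry(devType string, major, minor int) string {
	return fmt.Sprintf("%s %d:%d rwm", devType, major, minor)
}

// targetFile picks the cgroup control file for a udev action: removals are
// denied, everything else (add, change, ...) is allowed. Hypothetical logic.
func targetFile(action string) string {
	if action == "remove" {
		return "devices.deny"
	}
	return "devices.allow"
}

func main() {
	// /dev/nvmap is a character device; 10:61 is just an example pair of
	// device numbers, as the real ones vary by kernel.
	fmt.Printf("echo %q > %s\n", cgroupEntry("c", 10, 61), targetFile("add"))
}
```

This also makes the behavior discussed above concrete: if no matched device exists at snap startup, nothing is tagged, no entries are written, and no device cgroup is set up at all.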
One reason why I originally advocated putting these NVIDIA device accesses inside a…
// will be added by snap-confine.
var openglConnectedPlugUDev = []string{
	`SUBSYSTEM=="drm", KERNEL=="card[0-9]*"`,
	`KERNEL=="vchiq"`,
	`KERNEL=="renderD[0-9]*"`,
	`KERNEL=="nvhost-*"`,
	`KERNEL=="nvmap"`,
Unless those are GPL and udev sees them, you need to also change snap-confine's udev-support.c.
I'm not sure if they are GPL or not, but udev does see them.
Are they represented in sysfs?
Yes: https://paste.ubuntu.com/p/GQVRy9ychP/ (see my previous comment #6623 (comment))
Unit tests need an update
…v rules Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Unit tests should pass now. I've also merged current master, which has the necessary fixes for the spread tests that were causing pain recently.
Right, this is a good idea. There are things that could be added there that should not be in the opengl interface (like being able to call mknod, which the CUDA libraries do on first use). The only issue is that I'm not sure how separable pure GPU devices are from CUDA devices. But of course the shared resources could be in both interfaces.
Thanks!
This isn't just about CUDA though. These accesses were needed, for example, for certain models of Google phones to run Ubuntu Touch (ie, these accesses are needed to use the GPU at all on those devices; CUDA also needs to use the GPU, but for different reasons, so splitting them out, at least for this case, doesn't make sense).
Thanks for the changes!
LGTM