snap-confine: cleanup incorrectly created nvidia udev tags #4042

Merged 2 commits into snapcore:master on Oct 13, 2017

Conversation

4 participants

@mvo5 mvo5 added this to the 2.28 milestone Oct 13, 2017

@zyga zyga requested a review from jdstrand Oct 13, 2017

codecov-io commented Oct 13, 2017

Codecov Report

Merging #4042 into master will decrease coverage by 0.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #4042      +/-   ##
==========================================
- Coverage   75.86%   75.84%   -0.02%     
==========================================
  Files         431      431              
  Lines       36922    36922              
==========================================
- Hits        28010    28005       -5     
- Misses       6958     6962       +4     
- Partials     1954     1955       +1
Impacted Files Coverage Δ
overlord/ifacestate/helpers.go 61.82% <0%> (-0.68%) ⬇️
overlord/snapstate/snapstate.go 80.04% <0%> (-0.25%) ⬇️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29f0971...8702efc.

zyga approved these changes Oct 13, 2017

LGTM, with one remark about the approach, in case jdstrand has concerns about glob.

+// See also:
+// https://forum.snapcraft.io/t/weird-udev-enumerate-error/2360/17
+void sc_maybe_fixup_udev()
+{

zyga Oct 13, 2017

Contributor

I would probably prefer a simple list of calls to unlinkat(tag_dir_fd, "+nvidia-foo"); (there are about 4 or 5 files to unlink) that is less sophisticated and operates only on the tags affecting the snap we are dealing with, but I don't have strong opinions about it.

I'm going to approve this in the interest of time, but I'd like to see the initialization comment addressed, and I would like to see a test case (i.e. spread) that proves this works correctly both when the files are there and when they are not. Both should be addressed before merging.

As for the other comment, I have a small preference that is listed inline. I also wouldn't mind the suggestion @zyga has, if others agree. Both of these could happen in a follow-up PR.

cmd/snap-confine/snap-confine.c
+void sc_maybe_fixup_udev()
+{
+ glob_t glob_res SC_CLEANUP(globfree) = {
+ .gl_pathv = NULL};

jdstrand Oct 13, 2017

Contributor

Can you also set .gl_pathc and .gl_offs to 0?


zyga Oct 13, 2017

Contributor

Hmm, aren't those set by the C standard? We initialize one field explicitly, and the rest are initialized to zero.

cmd/snap-confine/snap-confine.c
+ // killem'all
+ for (size_t i = 0; i < glob_res.gl_pathc; ++i) {
+ unlink(glob_res.gl_pathv[i]);
+ }

jdstrand Oct 13, 2017

Contributor

While it would seem to make logical sense for glob() to set gl_pathc to 0 when it returns GLOB_NOMATCH, I don't see in the man page that this is strictly guaranteed. I suspect your observed behavior is that it will be zero when nothing is found. I'd feel slightly more comfortable, and we'd get a teensy optimization, if you only ran the "killem'all" loop when err != GLOB_NOMATCH.

Contributor

jdstrand commented Oct 13, 2017

All that said, I'm now wondering why we are fixing this here, rather than in, say, snapd.core-fixup.sh...

Contributor

zyga commented Oct 13, 2017

@jdstrand I'm not sure snapd.core-fixup.sh operates on classic systems, where we are affected by this issue. EDIT: In addition, we need to run this when the new (restarted) snapd is starting up, not when the system is booting. After installing edge snapd the issue fixes itself upon system restart, because /run is just a tmpfs.

Contributor

jdstrand commented Oct 13, 2017

@zyga and @mvo - I should've said *like* snapd.core-fixup.sh. The point is, snap-confine is in the critical path for application launching. On the one hand, it triggered the issue with nvidia, so it makes some sense to have it clean up. On the other hand, this is a setuid binary and we're adding code that only ever needs to run once, but will run on every invocation of every command forever. Perhaps a one-off approach would be cleaner.

Contributor

zyga commented Oct 13, 2017

@jdstrand I agree. When we initially discussed this with @mvo5 we considered putting it in the udev backend initialization code, so that snapd does the cleanup. For opengl (desktop) apps this seems like a reasonable compromise: it has the advantage of not needing snap-confine changes, and it may be removed with the next epoch hop.

Contributor

jdstrand commented Oct 13, 2017

BTW, if using snap-confine, the apparmor profile will need to be adjusted for:

/run/udev/tags/snap_*/ r,
/run/udev/tags/snap_*/*nvidia* w,
Contributor

zyga commented Oct 13, 2017

@jdstrand we also talked about updating the apparmor profile, but @mvo5 tested this in practice, so one of the existing rules must already be granting us write access under /run.

Contributor

jdstrand commented Oct 13, 2017

To be clear, my approval means the code is doing what it advertises (with my recommendation for a teensy code change), so if you decide this is the best path forward, I'm not in your way, considering the urgency of the fix. I do think there needs to be some (more?) discussion about whether snap-confine is the right place. If you decide it is, I won't block. For example, we could ship this only in 2.28.5 and not include it in 2.29 or master, if you think that is best.

Contributor

jdstrand commented Oct 13, 2017

Oh, right, it calls out to the udev helper directly, so it needs to be able to write:

/run/udev/** rw,
Contributor

zyga commented Oct 13, 2017

I like the idea of not having this in master.

@mvo5 mvo5 merged commit f5f9252 into snapcore:master Oct 13, 2017

2 of 7 checks passed

artful-amd64 autopkgtest: running
artful-i386 autopkgtest: running
xenial-amd64 autopkgtest: running
xenial-i386 autopkgtest: running
xenial-ppc64el autopkgtest: running
continuous-integration/travis-ci/pr: The Travis CI build passed
zesty-amd64 autopkgtest: finished (success)