cmd/snap-confine: re-associate with pid-1 mount namespace if required #2624
zyga
added some commits
Dec 2, 2016
|
NOTE: I managed to avoid the kernel OOPS but the tests will still fail on recursive confinement. I asked @jdstrand to have a look after sharing some ideas on IRC. The essence of the remaining problem:
Then the log file contains:
the kernel ring buffer contains:
(I clear the buffer just before running this command) |
|
There is nothing in the logs that indicates a problem from AppArmor. Those ALLOWED messages appear because you are running strace under complain mode and the policy doesn't allow using strace, so the violation against policy is logged (but allowed). |
|
@jdstrand correct BUT the permission denied only happens if you do this through apparmor. Try the same idea ( |
|
Did you turn off kernel rate limiting? sudo sysctl -w kernel.printk_ratelimit=0 |
|
The kernel issue is now reported as https://bugs.launchpad.net/apparmor/+bug/1656121 |
zyga
added some commits
Jan 16, 2017
zyga
added
Blocked
Critical
labels
Jan 16, 2017
|
This is blocked on the kernel/apparmor bug and critical for CE for their product development. |
|
@zyga there's a kernel waiting for you to test in the launchpad bug. Give it a shot when you get some time. |
|
I tested the test kernel and the results are inconclusive. I could use some hands-on time with jj to debug this further. |
zyga
added some commits
Jan 18, 2017
pedronis
changed the title from
Re-associate with pid-1 mount namespace if required
to
cmd/snap-confine: re-associate with pid-1 mount namespace if required
Jan 20, 2017
zyga
added some commits
Jan 31, 2017
|
JJ has fixed all the issues in apparmor that affected this test. The next kernel release (~2.5 weeks away from now) should make this test green without using the test kernel. |
| +details: | | ||
| + snap-confine uses privately-shared /run/snapd/ns to store bind-mounted | ||
| + mount namespaces of each snap. In the case that snap-confine is invoked | ||
| + from the mount namespace it typically constructs that directory is not |
niemeyer
Feb 13, 2017
Contributor
Not reading properly: "it typically constructs that directory is not available"
| + operate in such an environment snap-confine must first re-associate its own | ||
| + process with another namespace in which the /run/snapd/ns directory is | ||
| + visible. The most obvious candidate is pid one, which definitely doesn't | ||
| + run in a snap-specific namespace, has a predictable PID and is long lived. |
niemeyer
Feb 13, 2017
Contributor
This is such a clear explanation of the rationale. Thanks! Can we please have this copied over above the function sc_reassociate_with_pid1_mount_ns in ns-support.c as well?
zyga
added some commits
Feb 14, 2017
|
Tests failed but this may have been caused by stale kernels used on test images. I'll do a local test in qemu |
|
I just tested this on the -67 kernel from xenial-proposed and it works :-) Looking forward to merging this. |
arapulido
commented
Mar 15, 2017
|
@zyga The -67 is now in xenial-updates. Can we land this? |
|
FYI, I'm going to review this PR later today. |
|
@arapulido oh, indeed. I'll check if we can get a clean run and if tyler agrees we can land this |
zyga
added this to the
2.24 milestone
Mar 15, 2017
|
This is now tagged to 2.24 since CE requested this to be fixed with high priority and we have a high chance of landing it as the fixed kernel is now out. |
mvo5
and others
added some commits
Mar 15, 2017
tyhicks
approved these changes
Mar 15, 2017
I don't love this change because it opens up a way for a confined application to possibly escape into the init namespace. However, I think the code is correct around the mount namespace handling in snap-confine and it should be safe to land.
I had one suggestion to lock down the AppArmor profile a bit and I'd like for you to quickly investigate if it is possible before you merge.
Ack from me.
| @@ -275,6 +275,10 @@ | ||
| /var/lib/snapd/hostfs/var/lib/lxd r, | ||
| # support for the mount namespace sharing | ||
| + capability sys_ptrace, | ||
| + # allow snap-confine to read /proc/1/ns/mnt | ||
| + ptrace peer=unconfined, |
tyhicks
Mar 15, 2017
Contributor
From my quick testing using readlink /proc/1/ns/mnt and nsenter --mount=/proc/1/ns/mnt, you can get away with specifying a single ptrace permission rather than granting all ptrace permissions:
ptrace trace peer=unconfined,
Also, I don't seem to need capability sys_ptrace, but trust that you added it after seeing a denial for that capability.
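The `readlink /proc/1/ns/mnt` check mentioned above can be sketched in C as follows. This is only an illustration of comparing two namespace links, not the code from this PR: `read_ns_link` and `same_mount_ns` are hypothetical names, and the 64-byte buffer is an assumption that comfortably fits targets of the form `mnt:[4026531840]`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from snap-confine): read the target of a
 * namespace symlink such as /proc/1/ns/mnt into buf.
 * Returns 0 on success, -1 on error or if the target may be truncated. */
int read_ns_link(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size >= 2);
    ssize_t n = readlink(path, buf, size - 1);
    if (n < 0 || (size_t)n >= size - 1)
        return -1;
    buf[n] = '\0'; /* readlink does not NUL-terminate */
    return 0;
}

/* Returns 1 if both links name the same namespace, 0 if they differ,
 * and -1 if either link cannot be read (for example, when AppArmor or
 * ptrace access checks deny reading /proc/1/ns/mnt). */
int same_mount_ns(const char *a, const char *b)
{
    char link_a[64], link_b[64];
    if (read_ns_link(a, link_a, sizeof link_a) < 0 ||
        read_ns_link(b, link_b, sizeof link_b) < 0)
        return -1;
    return strcmp(link_a, link_b) == 0;
}
```

A caller would compare `"/proc/1/ns/mnt"` with `"/proc/self/ns/mnt"`; a result of 0 would suggest the process is not in the mount namespace of pid 1.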
zyga
Mar 15, 2017
Contributor
Thank you for the review. I'll tighten the ptrace rule and see if I need sys_ptrace with the fixed kernel.
zyga
Mar 15, 2017
Contributor
I ran a quick test without `capability sys_ptrace` that showed green on the regression test. I chose to remove this extra permission. I think it was added when we tested various kernel fixes but I don't think it is required in practice now.
| + * lived. | ||
| + */ | ||
| + sc_reassociate_with_pid1_mount_ns(); | ||
| + const char *snap_name = getenv("SNAP_NAME"); |
zyga
Mar 15, 2017
Contributor
We don't need this. I think it was a part of a very very old fix that can now be simplified
zyga
added some commits
Mar 15, 2017
|
The failure in docker is unrelated to the test and looks network related. |
|
So I got |
|
Running the
EDIT: This occurs during this command in the test:
|
zyga
added some commits
Mar 16, 2017
zyga
removed
the
Blocked
label
Mar 16, 2017
|
Thanks for trying to remove the |
|
Failures in adt are caused by the -66 kernel (the fix is in -67) |
niemeyer
merged commit a0f74cd
into
snapcore:master
Mar 16, 2017
2 of 6 checks passed
zyga
deleted the
zyga:reassociate-fix
branch
Mar 16, 2017
|
@zyga - this PR needs to be reverted because the required kernel patches have been reverted. If 2.24 is released with this commit, we will see the OOPSes from https://bugs.launchpad.net/apparmor/+bug/1656121 in production. |
|
@jdstrand a full revert will be messy (lots of patches) but we can disable this easily via: https://github.com/snapcore/snapd/pull/3076/files |
zyga commented Jan 12, 2017
This patch changes the initialization process of mount namespace
handling code. If the snap-confine process is itself in a namespace
other than that of pid-1 (which happens when snaps with confinement
other than classic are trying to invoke other snaps) then snap-confine
will move itself to the mount namespace of pid-1 before attempting any
other actions.
This allows snap-confine to get access to the /run/snapd/ns directory, which
has private sharing and thus is not shared with any mount peer groups.
Fixes: https://bugs.launchpad.net/snap-confine/+bug/1644439
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
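
The re-association step described in this commit message can be sketched with `setns(2)`. The snippet below is a minimal illustration under stated assumptions, not the implementation of `sc_reassociate_with_pid1_mount_ns`: the helper name `join_mount_ns` is hypothetical, and error handling is reduced to returning -1 with `errno` set.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not the snap-confine code: move the calling process
 * into the mount namespace referenced by ns_path (e.g. "/proc/1/ns/mnt").
 * Returns 0 on success, -1 on failure with errno describing the cause. */
int join_mount_ns(const char *ns_path)
{
    assert(ns_path != NULL);
    /* The ns file is a symbolic link, so it is opened without O_NOFOLLOW. */
    int fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    int rc = setns(fd, CLONE_NEWNS);
    int saved_errno = errno; /* keep the setns error across close() */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

Calling `join_mount_ns("/proc/1/ns/mnt")` is expected to succeed only for a sufficiently privileged, single-threaded process: per `setns(2)`, joining a mount namespace requires `CAP_SYS_ADMIN` and `CAP_SYS_CHROOT`, and the caller must not share filesystem attributes with another process.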