
LXC and AppArmor -- a lost cause? #10166

Closed
mbiebl opened this issue Sep 24, 2018 · 19 comments

@mbiebl
Contributor

mbiebl commented Sep 24, 2018

All the recent back and forth trying to get the test suite to pass in an AA-confined LXC container makes me wonder if this is worth the trouble.
Should I/we give up on this idea and declare this setup unsupported?
Especially given the feedback from @brauner and @stgraber, which indicates that Ubuntu itself disables AA when running its own CI on LXC.
I'm a bit at a loss at the moment about what to do here. Should I continue to file bug reports when I run into failures using autopkgtest with the LXC backend and AA enabled?

@poettering
Member

Well, for all the failures between AA, LXC, and systemd, I am pretty sure some need to be fixed in AA (or its policy), others in LXC, and others still in systemd. If there are good reasons, I'm all for fixing the latter here upstream. That said, I don't run Ubuntu/LXC/AA myself, hence I am not going to be the biggest help, but I can certainly suggest fixes and such.

So, it's really up to you whether you want to spend the time on it; all I can offer is my technical input and maybe a patch or two for systemd once things have been sufficiently tracked down.

@brauner
Contributor

brauner commented Oct 1, 2018

Same here. I'm more than willing to help out.

@mbiebl
Contributor Author

mbiebl commented Oct 5, 2018

See #9700 (comment)
and #10011

With git master, running the test suite under LXC+AA triggers a huge number of failures.

@brauner
Contributor

brauner commented Oct 7, 2018

Two problems seem to have been identified so far:

  • changing mount propagation aka remounting certain paths
  • using DynamicUser feature in combination with containers

Both seem to be addressed by various commits. Have you tested with systemd git master, and how does the test suite fare?
If you haven't, please test. Additionally, please report back clear error messages so that we can zoom in on any additional issues.
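(For anyone following along, a rough way to reproduce this locally would be something like the sketch below; it assumes an autopkgtest LXC container has been built with autopkgtest-build-lxc and that AA confinement is left enabled, and the container name is purely illustrative:

$ autopkgtest-build-lxc debian sid amd64
$ autopkgtest systemd -- lxc autopkgtest-sid-amd64

)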

@mbiebl
Contributor Author

mbiebl commented Oct 7, 2018

@brauner the problem actually exists only in git master, not in v239. Here are logs from v239-1142-gad1bf59c6:
log.confined.txt -- AA enabled

log.unconfined.txt -- AA turned off in the LXC container via lxc.aa_profile = unconfined
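(For reference, that setting lives in the container's config file. A minimal sketch, assuming a classic LXC container under /var/lib/lxc; note that LXC 3.x renamed the key to lxc.apparmor.profile:

# /var/lib/lxc/<name>/config
lxc.aa_profile = unconfined

)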

@brauner
Contributor

brauner commented Oct 7, 2018 via email

@mbiebl
Contributor Author

mbiebl commented Oct 7, 2018

@brauner sure, no problem.
Attached is the dmesg output of a test-suite run:
dmesg.txt

I made a second run with auditd installed, which produced a more verbose log:
audit.log.txt

@brauner
Contributor

brauner commented Oct 8, 2018

So, for networkd there's the

type=AVC msg=audit(1538948870.928:1533): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-container-default-cgns" name="/" pid=11956 comm="(networkd)" flags="rw, rslave"

problem again. So that needs to be allowed in the profile.
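(For reference, the denied operation is systemd remounting / with recursive-slave propagation, MS_REC|MS_SLAVE, which shows up as "rw, rslave" above. Allowing it would mean adding a mount rule to the container profile, roughly like this sketch; the rule syntax follows LXC's own profile abstractions and is not the actual Ubuntu/Debian profile:

# inside the lxc-container-default-cgns profile body:
# allow changing mount propagation to (recursive) slave anywhere
mount options=(rw, make-rslave) -> **,

)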

localed-locale seems odd

FAIL: 'System Locale:' not found in:

Hm...
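(Presumably the localed-locale test greps localectl output for that line, so a quick manual check inside the container would be something like:

$ localectl | grep 'System Locale:'

If localed fails to start under the profile, that line never shows up.)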
@xnox, whom can @mbiebl talk to if he wants to find out how the autopkgtest suite is run on Ubuntu? Do you know what our AppArmor profile looks like for this?

@xnox
Member

xnox commented Oct 10, 2018

In Ubuntu (for the distro builds, not the ubuntu-ci that is visible on GitHub) we run autopkgtests across all 6 architectures in the current development release. All are executed "unconfined" inside OpenStack KVM instances, apart from armhf. The armhf tests are executed inside an LXC container, on top of an arm64 (xenial?!) instance, with some command-line options to make uname report armhf inside the armhf containers.

The code that does the slave setup/management is at https://git.launchpad.net/autopkgtest-cloud

I think you care about setup-adt-lxc.commands and the custom AA profile that is applied to LXC itself there....

https://git.launchpad.net/autopkgtest-cloud/tree/lxc-slave-admin/setup-adt-lxc.commands#n56
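(For the curious: the usual mechanism for the uname trick is the 32-bit personality, e.g. via the linux32 wrapper from util-linux, or LXC's lxc.arch config key; whether autopkgtest-cloud uses exactly one of these is visible in the linked scripts. As a quick illustration,

$ linux32 uname -m

runs uname with the PER_LINUX32 personality, so it reports a 32-bit machine string.)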

@poettering
Member

Let's close this one and instead focus on individual issues, e.g. #9700 and such. There's little actionable in this issue itself.

I figure part of the issue is outside of systemd's own scope anyway and needs to be fixed in AA/LXC on Debian. For everything else: please file individual issues instead.

@mbiebl
Contributor Author

mbiebl commented Oct 24, 2018

I don't plan to run autopkgtest with AA enabled on LXC in the future, nor to file further bug reports for it.

From what I understand, even Ubuntu (which is heavily invested in AA) either turns off AA completely or uses a custom AA profile.
This feels wrong to me: why should tests run in a separate, heavily modified environment compared to what's later run in production? Sure, we can make the tests pass this way, but it doesn't give any guarantees that the executables will actually run later in an AA-confined LXC container.

To me this looks like the combination LXC+AA is unsupported, so spending more time on it doesn't seem useful.

@xnox
Member

xnox commented Oct 26, 2018

@mbiebl some of the tests exercise things that are never executed in the initial namespace as root. And it is correct, for normal operation, to deny running things that can escape a container and affect the host.

Your assessment of Ubuntu CI is incomplete. By default, we run Ubuntu CI in full VMs, launched by OpenStack, with AA fully turned on and enforcing. That's our preferred choice on all architectures. Initially, when we were bringing up architectures, we did not have OpenStack for all of them and unfortunately had to resort to running tests in containers. This was the case for s390x, but it is now fixed. Of the 6 architectures Ubuntu CI is executed on today, only armhf remains in a container, on an arm64 KVM instance. This is because at the moment we still do not have GRUB on EFI working correctly on armhf with our patches. As soon as we have armhf cloud images working in our multi-arch OpenStack deployment, that architecture will switch to full VMs too. The profiles I have linked to are not the default, but the fallback we use in case of LXD/LXC confinement, and only when we must.

There are conflicting goals between what is possible in containers and what the systemd test suite needs to be able to execute, given that PID 1 on the host system typically has access to do more things than containers will ever be allowed to.

Running the test suite in containers is useful, but it doesn't sufficiently test all the things that need to be tested.

I don't understand why, for example, debian-ci does not use the autopkgtest ssh runner to spin up EC2 VMs and execute autopkgtests on them as root, given the EC2 resources available to the Debian project.
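(For reference, such an ssh-runner invocation would look roughly like this; the hostname and login are purely illustrative:

$ autopkgtest systemd -- ssh -H some-ec2-host.example.com -l root

)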

@xnox
Member

xnox commented Oct 26, 2018

Investigating things that should work in containers, and making sure they don't regress in containers, is useful. In Ubuntu there are other Jenkins instances that validate this, e.g. booting default cloud-image containers and checking that they are operational, boot non-degraded, and so on. But that's integration testing, rather than unit testing.

@evverx
Member

evverx commented Oct 26, 2018

By default, we run ubuntu CI in full VMs, launched by openstack, with AA fully turned on and enforcing.

I have never seen Ubuntu CI fail due to the AppArmor profile, which makes me think that either the profile is too relaxed to be meaningful or it is actually relaxed before the tests are run.

@mbiebl
Contributor Author

mbiebl commented Oct 26, 2018

@xnox thanks for the further clarifications regarding Ubuntu CI.
@evverx It's the combination LXC+AA that is problematic, and Ubuntu only runs armhf via LXC (with a modified AA policy), iiuc.

Me filing bug reports from time to time, whenever I happen to run the autopkgtests via LXC+AA, is not sustainable; we'd have to do that automatically for it to be useful.
Before that, we'd have to clarify whether the systemd autopkgtest suite is even supposed to work inside an AA-confined LXC container (or which parts of it are). This bug report was an attempt to clarify that, but tbh, I'm none the wiser.

@evverx
Member

evverx commented Oct 26, 2018

@mbiebl what I was trying to say is that any policy that blocks anything should at some point become problematic, given the pace at which new features are added to systemd and the nature of some of those features. The fact that Ubuntu CI has never failed due to the AppArmor profile (assuming it's enforced) surprises me, to say the least.

@mbiebl
Contributor Author

mbiebl commented Oct 26, 2018

@evverx LXC containers apply different AA policies than QEMU-based VMs do, at least that is my understanding. These files look specific to LXC:

$ find /etc/apparmor.d -name "*lxc*" 
/etc/apparmor.d/libvirt/TEMPLATE.lxc
/etc/apparmor.d/lxc-containers
/etc/apparmor.d/usr.bin.lxc-start
/etc/apparmor.d/lxc
/etc/apparmor.d/lxc/lxc-default-cgns
/etc/apparmor.d/lxc/lxc-default-with-mounting
/etc/apparmor.d/lxc/lxc-default
/etc/apparmor.d/lxc/lxc-default-with-nesting
/etc/apparmor.d/abstractions/lxc
/etc/apparmor.d/abstractions/libvirt-lxc
/etc/apparmor.d/local/usr.bin.lxc-start
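(To check which of these profiles are actually loaded and in which mode, something like the following should do, assuming the AppArmor userspace tools are installed:

$ sudo aa-status | grep -i lxc

)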

Then again, I'm far from an AA or LXC expert, so what I say might be totally bogus.

@mbiebl
Contributor Author

mbiebl commented Oct 27, 2018

The profiles I have linked to are not the default, but the fallback we use in case of LXD/LXC confinement, and only when we must.

@xnox I'm interested in what that means exactly. When and how exactly is the fallback profile used?
Isn't there a risk that you are then testing something that doesn't match the later production environment (in case the software runs in LXC/LXD)?

@xnox
Member

xnox commented Jan 11, 2019

The profiles I have linked to are not the default, but the fallback we use in case of LXD/LXC confinement, and only when we must.

@xnox I'm interested in what that means exactly. When and how exactly is the fallback profile used?
Isn't there a risk that you are then testing something that doesn't match the later production environment (in case the software runs in LXC/LXD)?

@mbiebl

I'm not sure how to word this better. We prefer to run autopkgtests in KVM VMs, as that most closely emulates bare-metal / production environments, and we have that available for all architectures apart from armhf. Eventually we will have it for armhf as well. While KVM VMs were not available, we utilised LXD containers for armhf testing of autopkgtests (for all Ubuntu packages, not just systemd). Based on trial and error, we have relaxed the LXD confinement to get more autopkgtests to pass, given that these LXD containers run on virtualized hosts and are quite restricted in other ways. But making such conscious confinement changes was done with the goal of bringing the armhf LXD-confined testing slightly closer to the KVM one. It is not the goal of armhf-LXD testing to exercise all autopkgtests in a "default LXD" environment.

I understand that the above changes make local reproducibility of armhf LXD test results harder.

But equally, it has never been my personal goal to make all upstream systemd tests work correctly under such a test environment.

There is no "fallback profile": either KVM is used on architectures for which we have OpenStack, or the relaxed LXD profile shown above is used on manually provisioned machines (currently armhf only).

Is the above clearer now?
