systemd-journald-audit.socket vs lxd #6519
Comments
My recommendation: simply block the socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) call through seccomp in lxd, and make it return -EPERM. That way, the case where audit is off in the kernel and the case where it is blocked in a container are handled in exactly the same way. Moreover, systemd will then just work, as it already makes the necessary checks. And lxd would just do what nspawn already does.
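To illustrate the caller's side of this suggestion, here is a minimal Python sketch (not systemd's actual code) of a client that treats a refused socket() call the same way as a kernel without audit support. The function name `open_audit_socket`, the `socket_factory` parameter, and the exact set of errno values treated as "unavailable" are assumptions made for illustration.

```python
import errno
import socket

AF_NETLINK = getattr(socket, "AF_NETLINK", 16)  # 16 on Linux
NETLINK_AUDIT = 9  # from <linux/netlink.h>

# Assumed set of errors meaning "audit is not available here":
# no kernel support, or the call was refused (e.g. by a seccomp filter).
AUDIT_UNAVAILABLE = {errno.EAFNOSUPPORT, errno.EPROTONOSUPPORT, errno.EPERM}


def open_audit_socket(socket_factory=socket.socket):
    """Return an audit netlink socket, or None if audit is unavailable.

    socket_factory is a hypothetical hook so the logic can be exercised
    without a real kernel; by default the real socket() call is used.
    """
    try:
        return socket_factory(AF_NETLINK, socket.SOCK_RAW, NETLINK_AUDIT)
    except OSError as exc:
        if exc.errno in AUDIT_UNAVAILABLE:
            return None  # same path for "off in kernel" and "blocked"
        raise  # anything else is unexpected and should surface
```

With a filter that returns EPERM from socket(), a caller written this way never reaches bind(), so the later failure cannot occur.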
Ack. Also, the audit/apparmor DENIED message is a red herring: it is about something else, caused by a wrong sock_type. Audit namespacing is being worked on in the kernel, so the argument against filtering is that auditing will start working on newer kernels. My other question, from IRC chats with stgraber, was this:
Somehow I would have expected the kernel to return EPERM when opening the socket in the first place, rather than waiting for bind() to fail with EPERM.
Please keep this issue open for comments for a little while, and maybe tag it as a discussion or something similar.
poettering added the journal and needs-discussion labels on Aug 7, 2017
Well, I am pretty sure that audit namespacing will mean using a new API (i.e. CLONE_NEWAUDIT or so), hence it appears to me that nothing is lost if auditing is blocked for now entirely, and only turned back on when it can actually work and the CLONE_NEWAUDIT stuff is used. And even if audit namespacing is piggybacked on some other clone() bit, then I still think it's the duty of the container manager to grok that, and unmask auditing in that case, and only do so when the kernel is safe to support it...
The plan is to piggyback onto the user namespace and thus be transparent. But indeed that is speculation until everything is merged and stable, and it has been "worked on" for a while now. Thus there are no expectations as to when audit namespacing will land.
xnox commented Aug 2, 2017
Submission type
systemd version the issue has been seen with
234
Used distribution
Ubuntu
In case of bug report: Expected behaviour you didn't see
System boots non-degraded
In case of bug report: Unexpected behaviour you saw
System boots degraded; the status of systemd-journald-audit.socket is failed (Result: resources)
In case of bug report: Steps to reproduce the problem
On Ubuntu:
$ lxc launch ubuntu-daily:a degradedboot
$ lxc exec degradedboot bash
$ systemctl status systemd-journald-audit.socket
....
The container in question is an AppArmor-protected, unprivileged (user-namespaced) lxd container (systemd-detect-virt reports lxc).
I did the manual check mentioned in #6508 (comment): no errno is set, and fd 4 is opened.
To debug this further, I tweaked the .socket unit to actually be related to some other unit, rather than be ordered before systemd-journald, as otherwise there are no useful logs to show why starting the socket unit failed.
Here are more detailed logs:
Which means we get EPERM upon the bind call. And on the host I get:
I'm trying to resolve a user experience issue: the default container comes up degraded. Thus at the moment I have no preference on how to solve this.
Given the above: should the audit checks try to bind() and watch for EPERM coming from the LSM? Or, for example, should the host LSM (apparmor) rules be tightened to prevent opening NETLINK_AUDIT if one will not be able to bind to it? By default lxd doesn't limit or filter the capabilities that are available in the container.
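The first option, probing with bind() and watching for EPERM, could look roughly like the Python sketch below. It is an illustration only, not what systemd does: the helper name `can_bind_audit` is hypothetical, the multicast group value is an assumption taken from the kernel's AUDIT_NLGRP_READLOG constant, and treating EACCES like EPERM is also an assumption.

```python
import errno

AUDIT_NLGRP_READLOG = 1  # audit multicast group, from <linux/audit.h>


def can_bind_audit(sock):
    """Try to join the audit read-log group on an already open socket.

    Returns True if bind() succeeds and False if it is denied.
    Any other error is re-raised, since it is not a permission problem.
    """
    # Netlink addresses are (pid, groups); groups is a bitmask, so
    # group N corresponds to bit N - 1.
    groups = 1 << (AUDIT_NLGRP_READLOG - 1)
    try:
        sock.bind((0, groups))  # pid 0 lets the kernel assign the port id
    except OSError as exc:
        if exc.errno in (errno.EPERM, errno.EACCES):
            return False
        raise
    return True
```

A caller would pass in the socket obtained from socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) and skip audit setup when the function returns False.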
Please advise on the best strategy. I'll ping the lxd and apparmor people to read this bug report, even though this is neither the lxd nor the apparmor bug tracker. Somehow I suspect that this can be fixed in any of the three projects, and all three can point fingers at the others =)
Also, as a side note: if the audit fd was not passed to journald, it will try to open it itself, but it will ignore a failure to do so. Thus maybe the audit.socket unit can be adjusted to somehow be "non-fatal", so that it does not cause a degraded state if and when bind() fails for it. However, ideally in the above scenario the audit.socket unit should not be started at all, given that, in a way, it is known in advance that it will fail for this use case.