systemd-journald-audit.socket vs lxd #6519

Closed
xnox opened this issue Aug 2, 2017 · 8 comments · Fixed by #27524

Comments

@xnox
Member

xnox commented Aug 2, 2017

Submission type

  • Bug report

systemd version the issue has been seen with

234

Used distribution

Ubuntu

In case of bug report: Expected behaviour you didn't see

System boots non-degraded

In case of bug report: Unexpected behaviour you saw

System boots degraded; the status of systemd-journald-audit.socket is failed (Result: resources)

In case of bug report: Steps to reproduce the problem

On Ubuntu:
$ lxc launch ubuntu-daily:a degradedboot
$ lxc exec degradedboot bash
$ systemctl status systemd-journald-audit.socket

....
The container in question is an apparmor-protected, unprivileged (user-namespaced) lxd container (systemd-detect-virt reports lxc).

I did the manual check mentioned in #6508 (comment): no errno is set, and fd 4 is opened.

To debug this further, I tweaked the .socket unit to be ordered relative to some other unit, rather than before systemd-journald, as otherwise there are no useful logs showing why starting the socket unit failed.

Here are more detailed logs:

# systemctl status systemd-journald-audit.socket
● systemd-journald-audit.socket - Journal Audit Socket
   Loaded: loaded (/lib/systemd/system/systemd-journald-audit.socket; static; vendor preset: enabled)
   Active: failed (Result: resources)
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
   Listen: audit 1 (Netlink)

Aug 02 16:08:19 normal systemd[1]: systemd-journald-audit.socket: Enqueued job systemd-journald-audit.socket/stop as 594
Aug 02 16:08:19 normal systemd[1]: systemd-journald-audit.socket: Job systemd-journald-audit.socket/stop finished, result=done
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: Trying to enqueue job systemd-journald-audit.socket/start/replace
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: Installed new job systemd-journald-audit.socket/start as 659
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: Enqueued job systemd-journald-audit.socket/start as 659
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: ConditionCapability=CAP_AUDIT_READ succeeded.
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: ConditionSecurity=audit succeeded.
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: Failed to listen on sockets: Operation not permitted
Aug 02 16:08:23 normal systemd[1]: systemd-journald-audit.socket: Job systemd-journald-audit.socket/start finished, result=failed
Aug 02 16:08:23 normal systemd[1]: Failed to listen on Journal Audit Socket.

This means we get EPERM on the bind() call. On the host I get:

[2956447.961322] audit: type=1400 audit(1501690831.694:2170): apparmor="DENIED" operation="file_lock" profile="lxd-normal_</var/lib/lxd>" pid=8307 comm="(t-daemon)" family="unix" sock_type="dgram" protocol=0 addr=none

I'm trying to resolve a user experience issue: the default container comes up degraded. At the moment I have no preference on how to solve this.

Following from the above: should the audit checks try to bind() and watch for EPERM coming from the LSM? Or, for example, should the host LSM (apparmor) rules be tightened to prevent opening NETLINK_AUDIT if one will not be able to bind to it? By default lxd doesn't limit / filter capabilities that are available in the container.

Please advise on the best strategy. I'll ping lxd / apparmor people to read this bug report, despite this being neither the lxd nor the apparmor bug tracker. Somehow I suspect that this can be fixed in any of the three projects, and all three can point fingers at the others =)

Also, as a side note, if the audit fd was not passed to journald, it will try to open it itself, but it will ignore a failure to do so. Thus maybe the audit.socket unit can be adjusted to somehow be "non-fatal", so that it does not cause a degraded state if and when bind() fails for it. However, ideally in the above scenario the audit.socket unit should not be started at all, given that it is, in a way, known in advance that it will fail for this use case.

@poettering
Member

My recommendation: simply block the socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) call through seccomp in lxd, and make it return -EPERM. That way the case where audit is off in the kernel and the case where it is blocked in a container are handled in exactly the same way. Moreover systemd will just work then, as it already makes the necessary checks. And lxd would just do what nspawn does already.

@xnox
Member Author

xnox commented Aug 2, 2017

Ack. Also, the audit/apparmor DENIED message is a red herring: it has the wrong sock_type, so it is for something else.

So audit namespacing is being worked on in the kernel, thus the argument against filtering it, is that it will start working on newer kernels.

My other question from irc chats with stgraber was this:

is it reasonable to get EPERM on bind() instead of earlier on fd creation with socket()? what good is an audit socket fd, which one will not be able to bind()?

Somehow I would have expected the kernel to return EPERM when opening the socket in the first place, rather than waiting for bind() to return it.
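To see which call the kernel rejects, a small probe can run socket() and bind() as separate steps. This is a minimal sketch, not systemd code: the function name probe_audit_socket is made up for illustration, NETLINK_AUDIT = 9 is taken from <linux/netlink.h>, and the group mask 1 is an assumption meant to mirror the "Listen: audit 1 (Netlink)" line in the status output above.

```python
import errno
import socket

NETLINK_AUDIT = 9  # protocol number from <linux/netlink.h>


def probe_audit_socket(group_mask=1):
    """Report which step fails when setting up an audit netlink socket.

    Returns ("ok", None) on success, otherwise (stage, errno_name) where
    stage is "socket" or "bind". group_mask is an assumed multicast mask.
    """
    af_netlink = getattr(socket, "AF_NETLINK", None)
    if af_netlink is None:
        # Platform without netlink sockets (not Linux).
        return ("socket", "EAFNOSUPPORT")

    # Step 1: creating the fd.
    try:
        sock = socket.socket(af_netlink, socket.SOCK_RAW, NETLINK_AUDIT)
    except OSError as e:
        return ("socket", errno.errorcode.get(e.errno, str(e.errno)))

    # Step 2: binding to the multicast group; Python takes (pid, groups).
    try:
        sock.bind((0, group_mask))
    except OSError as e:
        return ("bind", errno.errorcode.get(e.errno, str(e.errno)))
    finally:
        sock.close()

    return ("ok", None)


if __name__ == "__main__":
    print(probe_audit_socket())
```

Going by the logs in this report, a container like the one described would be expected to fail at the bind stage with EPERM. The result elsewhere depends on the kernel, the privileges of the caller, and any sandboxing in place.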

@xnox
Member Author

xnox commented Aug 2, 2017

Please keep this issue open for comments for a little while, and maybe tag it as a discussion or something similar.

@poettering
Member

So audit namespacing is being worked on in the kernel, thus the argument against filtering it, is that it will start working on newer kernels.

Well, I am pretty sure that audit namespacing will mean using a new API (i.e. CLONE_NEWAUDIT or so), hence it appears to me that nothing is lost if auditing is blocked for now entirely, and only turned back on when it can actually work and the CLONE_NEWAUDIT stuff is used. And even if audit namespacing is piggybacked on some other clone() bit, then I still think it's the duty of the container manager to grok that, and unmask auditing in that case, and only do so when the kernel is safe to support it...

@xnox
Member Author

xnox commented Aug 14, 2017

The plan is to piggyback onto the user namespace and thus be transparent. But indeed that is speculation until it is all merged and stable, and it has been "worked on" for a while now. Thus there are no expectations as to when audit namespacing will land.

@asbachb

asbachb commented Jan 7, 2020

I wonder if there's an actual workaround for that problem nowadays?

@llccd

llccd commented Mar 14, 2020

@asbachb simply masking it with systemctl mask systemd-journald-audit.socket could be a workaround.

@poettering
Member

I proposed a proper fix here:

#19443 (comment)

that should work even on lxc where sandboxing is not done.

enr0n added a commit to enr0n/systemd that referenced this issue May 4, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
systemd#19443 (comment)

Fixes: systemd#6519
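
The request described in this commit message can be sketched from user space as follows. This is an illustrative Python sketch under stated assumptions, not the code from the patch: the helper names are hypothetical, the numeric constants are copied from <linux/netlink.h> and <linux/audit.h>, and the reply is assumed to arrive as a single NLMSG_ERROR message carrying a negated errno.

```python
import errno
import socket
import struct

NETLINK_AUDIT = 9         # <linux/netlink.h>
AUDIT_GET_FEATURE = 1019  # <linux/audit.h>
NLMSG_ERROR = 2           # <linux/netlink.h>
NLM_F_REQUEST = 0x01      # <linux/netlink.h>

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSGHDR = struct.Struct("=IHHII")


def build_get_feature_request(seq=1):
    """Pack a header-only AUDIT_GET_FEATURE request (hypothetical helper)."""
    return NLMSGHDR.pack(NLMSGHDR.size, AUDIT_GET_FEATURE, NLM_F_REQUEST, seq, 0)


def reply_errno(data):
    """Return the positive errno of an NLMSG_ERROR reply, else None."""
    if len(data) < NLMSGHDR.size + 4:
        return None
    _length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(data)
    if msg_type != NLMSG_ERROR:
        return None
    # The error payload starts with a signed int holding -errno.
    (error,) = struct.unpack_from("=i", data, NLMSGHDR.size)
    return -error


def audit_refused(timeout=1.0):
    """True only if the kernel answers the request with ECONNREFUSED."""
    af_netlink = getattr(socket, "AF_NETLINK", None)
    if af_netlink is None:
        return False  # no netlink on this platform, nothing to conclude
    try:
        with socket.socket(af_netlink, socket.SOCK_RAW, NETLINK_AUDIT) as sock:
            sock.settimeout(timeout)
            # Address (0, 0) is (pid, groups): send to the kernel, no groups.
            sock.sendto(build_get_feature_request(), (0, 0))
            return reply_errno(sock.recv(8192)) == errno.ECONNREFUSED
    except OSError:
        return False  # send/receive failed or timed out; undetermined
```

A caller could skip the audit socket when audit_refused() returns True, which matches the ECONNREFUSED case the commit message describes. Other error replies are deliberately left uninterpreted here.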
enr0n added a commit to enr0n/systemd that referenced this issue May 5, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
systemd#19443 (comment)

Fixes: systemd#6519
bluca pushed a commit that referenced this issue May 5, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
#19443 (comment)

Fixes: #6519
bluca pushed a commit to bluca/systemd that referenced this issue Jun 2, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
systemd#19443 (comment)

Fixes: systemd#6519
(cherry picked from commit 362235b)
(cherry picked from commit 4be604e)
peckato1 pushed a commit to peckato1/systemd that referenced this issue Jun 12, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
systemd#19443 (comment)

Fixes: systemd#6519
(cherry picked from commit 362235b)
valentindavid pushed a commit to valentindavid/systemd that referenced this issue Aug 8, 2023
If a container manager does not follow the guidance in
https://systemd.io/CONTAINER_INTERFACE/ regarding audit capabilities,
then the current check may not be sufficient to determine that audit
will function properly. In particular, when calling bind() on the audit
fd, we will get EPERM if running in a user-namespaced container.

Expand the check to make an AUDIT_GET_FEATURE request on the audit fd to
test if it is working. If this fails with ECONNREFUSED, we know it is
because the kernel does not support the use of audit outside of the
initial user namespace.

Note that the approach of this patch was suggested here:
systemd#19443 (comment)

Fixes: systemd#6519
(cherry picked from commit 362235b)
(cherry picked from commit 4be604e)
(cherry picked from commit 7418088)

4 participants