test-oomd-util fails in container #20593

Closed
mwhudson opened this issue Aug 31, 2021 · 10 comments · Fixed by #20705

Comments

@mwhudson

systemd version the issue has been seen with

248

Used distribution

Ubuntu 20.04 host (with HWE kernel), Fedora 34 container

Linux kernel version used (uname -a)

Linux fedora34 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

CPU architecture issue was seen on

amd64 and arm64

Expected behaviour you didn't see

/usr/lib/systemd/tests/test-oomd-util passing

Unexpected behaviour you saw

/usr/lib/systemd/tests/test-oomd-util failing like so:

[root@fedora34 ~]# /usr/lib/systemd/tests/test-oomd-util 
Error converting 'MemTotal:       32495256 kB trailing' from /oomdgetsysctxtestfM7xTX to uint64_t: Invalid argument
Last pgscan 33 greater than current pgscan 2 for /zupa.slice. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 2 for /zupa.slice. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 1 for /herp.slice/derp.scope. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 1 for /herp.slice/derp.scope. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 2 for /zupa.slice. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 1 for /herp.slice/derp.scope. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 1 for /herp.slice/derp.scope. Using last pgscan of zero.
Last pgscan 33 greater than current pgscan 1 for /herp.slice/derp.scope. Using last pgscan of zero.
Bus n/a: changing state UNSET → OPENING
sd-bus: starting bus by connecting to /run/dbus/system_bus_socket...
Bus n/a: changing state OPENING → AUTHENTICATING
Bus n/a: changing state AUTHENTICATING → HELLO
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.8 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=1 signature=s error-name=n/a error-message=n/a
Bus n/a: changing state HELLO → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartTransientUnit cookie=3 reply_cookie=0 signature=ssa(sv)a(sa(sv)) error-name=n/a error-message=n/a
Got message type=method_return sender=:1.4 destination=:1.8 path=n/a interface=n/a member=n/a cookie=114 reply_cookie=3 signature=o error-name=n/a error-message=n/a
Got message type=signal sender=org.freedesktop.DBus destination=:1.8 path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=NameAcquired cookie=4294967295 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.8 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=2 signature= error-name=n/a error-message=n/a
Match type='signal',sender='org.freedesktop.systemd1',path='/org/freedesktop/systemd1',interface='org.freedesktop.systemd1.Manager',member='JobRemoved' successfully installed.
Got message type=signal sender=:1.4 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=119 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
Got result done/Success for job test-oomd-util-8d268a3c9d766cf0.scope
Bus n/a: changing state RUNNING → CLOSED
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
oomd attempting to kill 157 with KILL
oomd attempting to kill 158 with KILL
oomd attempting to kill 159 with KILL
oomd attempting to kill 160 with KILL
Error getting memory.current from /system.slice/test-oomd-util-8d268a3c9d766cf0.scope: No data available
Assertion 'oomd_cgroup_context_acquire(cgroup, &ctx) == 0' failed at src/oom/test-oomd-util.c:117, function test_oomd_cgroup_context_acquire_and_insert(). Aborting.
Aborted (core dumped)

Steps to reproduce the problem

  1. Get yourself an Ubuntu 20.04 system
  2. lxd init, accept all the defaults
  3. lxc launch images:fedora/34 -c security.nesting=true fedora34
  4. Install systemd-tests inside the container
  5. Run /usr/lib/systemd/tests/test-oomd-util

Additional program output to the terminal or log subsystem illustrating the issue

  • The "-c security.nesting=true" when launching the container seems to be necessary to get systemd in the container to set up a unified hierarchy as the failing bits of test-oomd-util don't run for a hybrid hierarchy
  • Although 20.04 uses a hybrid hierarchy by default, changing the kernel command line to force a unified hierarchy does not make the tests pass (so it's not a "hybrid outside" / "unified inside" sort of thing)
  • I tried with a Fedora 34 vm as a host and didn't see the failure. That has a 5.13 kernel though, which might be relevant?
@mwhudson
Author

Oh I think I mis-tested, this probably is a "hybrid outside" / "unified inside" thing (maybe it fails with a unified hierarchy on the outside too with an old enough kernel?)

@mwhudson
Author

Yes, so with kernel 5.4 the test fails even with unified cgroups on the host. With 5.11 or 5.13 it only fails with hybrid cgroups on the host.

@poettering
Member

/cc @anitazha

@poettering
Member

Which systemd version is used inside the container? Does it fail if you compile current git of systemd and run the test suite from that? "meson build && ninja -C build test"

@ddstreet
Contributor

Didn't I fix this with commit 1354002?

@mwhudson
Author

mwhudson commented Sep 1, 2021

Oh yes probably, argh.

@mwhudson mwhudson closed this as completed Sep 1, 2021
@slyon
Contributor

slyon commented Sep 10, 2021

Okay, so after playing around with this a bit more, I'd say this definitely is a "hybrid outside" / "unified inside" issue, and ddstreet's commit does not fix it. 1354002 fixed the situation for memory.swap.current, but this problem is about a missing memory.current, and I'm not sure we can just ignore the memory.current value as we did with memory.swap.current.

Therefore I think this issue should be re-opened. @anitazha

I can reproduce it on an Ubuntu Impish host (kernel 5.11.0-31-generic, systemd 248.3-1ubuntu3), running an Ubuntu Impish container (kernel 5.11.0-31-generic, systemd 249.743.g8fd4d27f3c+21.10.20210909132725 – daily upstream build from ppa:ubuntu-support-team/systemd) – or basically any container that uses the unified cgroups hierarchy (like Fedora 34, as mentioned before).

Interestingly, other places that try to read memory.current (cgtop.c/cgroup.c) fall back to memory.usage_in_bytes if the system does not detect an all-unified hierarchy. But the two files seem to be mutually exclusive:

On a host using the hybrid hierarchy I can only see memory.usage_in_bytes (same goes for hybrid container on hybrid host):

$ find /sys/fs/cgroup/ -wholename *system.slice/memory.current
$ find /sys/fs/cgroup/ -wholename *system.slice/memory.usage_in_bytes
/sys/fs/cgroup/memory/system.slice/memory.usage_in_bytes

=> Tests are skipped ("test-oomd-util: cgroups are not running in unified mode, skipping tests.")

On a host using the unified hierarchy I can only see memory.current (same goes for unified container on unified host and hybrid container on unified host):

$ find /sys/fs/cgroup/ -wholename *system.slice/memory.current
/sys/fs/cgroup/system.slice/memory.current
$ find /sys/fs/cgroup/ -wholename *system.slice/memory.usage_in_bytes

=> Tests pass as expected.

BUT: Inside a unified container on a hybrid host, I can see neither:

$ find /sys/fs/cgroup/ -wholename *system.slice/memory.current
$ find /sys/fs/cgroup/ -wholename *system.slice/memory.usage_in_bytes

=> This is a problem. The test fails because it cannot access memory.current, yet it is not skipped either.
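
The three cases above can be told apart by checking which accounting file a cgroup directory exposes. Below is a minimal shell sketch of that check. The function name `memory_accounting_file` is hypothetical and not part of systemd, and the cgroup directory is passed in explicitly because its location differs between hybrid and unified setups.

```shell
# Hypothetical helper (not part of systemd): report which memory
# accounting file the cgroup directory given as $1 exposes.
# Prints "memory.current" (cgroup2 name), "memory.usage_in_bytes"
# (cgroup1 name), or "none" when neither file exists.
memory_accounting_file() {
    dir=$1
    if [ -e "$dir/memory.current" ]; then
        echo "memory.current"
    elif [ -e "$dir/memory.usage_in_bytes" ]; then
        echo "memory.usage_in_bytes"
    else
        echo "none"
    fi
}
```

For example, `memory_accounting_file /sys/fs/cgroup/system.slice` could be run inside the container. Going by the find output above, the unified container on a hybrid host would be the case that prints "none".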

@yuwata
Member

yuwata commented Sep 10, 2021

Another issue, #20655, has already been opened and seems to be a duplicate of this one. From the logs in that issue:

Error getting memory.current from /system.slice/test-oomd-util-f69769737b554529.scope: No data available
Assertion 'oomd_cgroup_context_acquire(cgroup, &ctx) == 0' failed at src/oom/test-oomd-util.c:117, function test_oomd_cgroup_context_acquire_and_insert(). Aborting.

@yuwata
Member

yuwata commented Sep 10, 2021

At least #20705 can hide the issue, but I am not sure it is the correct way to fix it.

@ddstreet
Contributor

> BUT: Inside a unified container on a hybrid host, I can see neither:
>
> $ find /sys/fs/cgroup/ -wholename *system.slice/memory.current
> $ find /sys/fs/cgroup/ -wholename *system.slice/memory.usage_in_bytes

The container's cgroup2 won't have any controllers, because they are all in use by the host's cgroup1 setup. The host's hybrid cgroup1/cgroup2 setup means that all the subsystem controllers are explicitly mounted (and thus in use by cgroup1), while the cgroup2 mounted at /sys/fs/cgroup/unified/ lacks any controllers. Since the memory controller is what provides memory.current, that file won't appear in a container where only cgroup2 is mounted. Unfortunately, this means a container attempting to use only cgroup2 on a host where cgroup1 is in use won't work; the container will have to use cgroup1 or hybrid.

That is quite different from what 1354002 was fixing, as the memory controller can be present but the kernel might be configured without support for memory.swap.*
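
Whether a cgroup2 mount has the memory controller available can be checked by reading its `cgroup.controllers` file, which holds a space-separated list of controller names. Below is a minimal shell sketch of such a check. The function name `cgroup2_has_controller` is hypothetical, and this is only an illustration of the condition described above, not the change made in systemd.

```shell
# Hypothetical helper (not part of systemd): succeed if the cgroup2
# directory given as $1 lists the controller named $2 in its
# cgroup.controllers file. Fails if the file is missing, empty,
# or does not list that controller.
cgroup2_has_controller() {
    file=$1/cgroup.controllers
    [ -r "$file" ] || return 1
    # Word-split the space-separated controller list and compare names.
    for c in $(cat "$file"); do
        if [ "$c" = "$2" ]; then
            return 0
        fi
    done
    return 1
}
```

A caller could run `cgroup2_has_controller /sys/fs/cgroup memory` before relying on memory.current. Under the explanation above, this check would be expected to fail in a cgroup2-only container on a host whose controllers are bound to cgroup1.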

ddstreet pushed a commit to ddstreet/systemd that referenced this issue Sep 10, 2021
When running in a container where the host is using cgroup1, we can't
use unified cgroup, since the host kernel has the cgroup controllers
in use by cgroup1.

Fixes: systemd#20593
Fixes: systemd#20655
ddstreet pushed a commit to ddstreet/systemd that referenced this issue Sep 11, 2021
When running in a container where the host is using cgroup1, we can't
use unified cgroup, since the host kernel has the cgroup controllers
in use by cgroup1.

Fixes: systemd#20593
Fixes: systemd#20655
codepeon pushed a commit to codepeon/systemd that referenced this issue Nov 19, 2021