User Manager for user fails to start every second time #12386

Closed
buczek opened this issue Apr 25, 2019 · 4 comments · Fixed by #12833
Comments

buczek (Contributor) commented Apr 25, 2019

In our environment we use non-interactive, public-key-authenticated ssh sessions a lot for management purposes. For several weeks now, ssh commands (ssh HOST do-something) executed by management scripts have sometimes failed, with ssh returning a non-zero exit status but printing nothing on stderr.

We found that pam_systemd.so was failing and mitigated the problem by changing it from required to optional in pam.d/sshd (see the illustrative snippet after the log excerpt below). However, the logfiles still indicated a sporadic problem: for example, the user manager (systemd --user) is not started for every session. With systemd v242 the errors logged are:

Apr 21 20:01:12 phaedra.molgen.mpg.de systemd[1]: Starting User Manager for UID 0...
Apr 21 20:01:12 phaedra.molgen.mpg.de systemd[36132]: user@0.service: Failed to attach to cgroup /user.slice/user-0.slice/user@0.service: No such file or directory
Apr 21 20:01:12 phaedra.molgen.mpg.de systemd[36132]: user@0.service: Failed at step CGROUP spawning /lib/systemd/systemd: No such file or directory
Apr 21 20:01:12 phaedra.molgen.mpg.de systemd[1]: user@0.service: Failed with result 'protocol'.
Apr 21 20:01:12 phaedra.molgen.mpg.de systemd[1]: Failed to start User Manager for UID 0.
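
For reference, the pam.d/sshd mitigation mentioned above amounts to a one-word change along these lines (illustrative only; the exact file layout and module options differ between distributions):

# /etc/pam.d/sshd (excerpt), before:
session    required     pam_systemd.so
# after the mitigation, a pam_systemd failure is no longer fatal for the session:
session    optional     pam_systemd.so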

We figured out that this happens on some systems (10 out of 200, but the set changes) and, if it happens at all, typically on every second ssh session.

There is a long chain of problems involved. Here is an overview of what I think is happening:

  • The first session is started as expected. systemd[1] creates the cgroups and starts a user manager. An ssh HOST systemd-cgls may look like this:
-.slice
├─user.slice
│ └─user-0.slice
│   ├─session-35.scope
│   │ ├─6261 sshd: root@notty
│   │ └─6270 systemd-cgls
│   └─user@0.service
│     └─init.scope
│       ├─6264 /lib/systemd/systemd --user
│       └─6265 (sd-pam)
├─init.scope
[...]
  • However, when the session ends, there is a shutdown problem: systemd --user completes while (sd-pam) still exists. Both the user's systemd --user and systemd[1] try to remove the user@0.service cgroup, but both fail because there is still a process in it. Here's some output of systemd with log_level=debug, plus some lines added by my favorite high-level debugging tool, printf:
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: Received SIGCHLD from PID 11246 (systemd).
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: Child 11246 (systemd) died (code=exited, status=0/SUCCESS)
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Child 11246 belongs to user@0.service.
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Main process exited, code=exited, status=0/SUCCESS
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Succeeded.
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Changed stop-sigterm -> dead
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 unit_prune_cgroup
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=_systemd path=/user.slice/user-0.slice/user@0.service delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/unified/user.slice/user-0.slice/user@0.service/init.scope errno now 16
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim_everywhere: cg_trim return=-16
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Failed to destroy cgroup /user.slice/user-0.slice/user@0.service, ignoring: Device or resource busy
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 unit_prune_cgroup DONT DO IT
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Job 52904 user@0.service/stop finished, result=done
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: Stopped User Manager for UID 0.
  • unit_prune_cgroup() [1] gets an error from cg_trim_everywhere() and returns early without executing u->cgroup_realized = false;. The related error is only shown when systemd runs with log_level=debug (it appears in the output above because I changed the log level to warning).

  • After user@0.service is stopped, user-runtime-dir@0.service is about to be stopped, too. When this unit completes, user-0.slice is cleaned up recursively. This time it succeeds, because sd-pam has had some more time to exit:

Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: Stopped User Runtime Directory /run/user/0.
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user-0.slice changed active -> dead
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX 0x7d7b20 unit_prune_cgroup
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=_systemd path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/unified/user.slice/user-0.slice/user@0.service/init.scope errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/unified/user.slice/user-0.slice/user@0.service errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=name=systemd path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/systemd/user.slice/user-0.slice/user@0.service/init.scope errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/systemd/user.slice/user-0.slice/user@0.service errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim_everywhere: cg_trim return=0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=cpu path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=blkio path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=memory path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/memory/user.slice/user-0.slice/user@0.service errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=devices path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX cg_trim controller=pids path=/user.slice/user-0.slice delete_root=1
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX trim_cb: rmdir /sys/fs/cgroup/pids/user.slice/user-0.slice/user@0.service errno now 0
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: XXX 0x7d7b20 unit_prune_cgroup DO IT
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: user-0.slice: Job 52907 user-0.slice/stop finished, result=done
Apr 24 18:11:17 sigalrm.molgen.mpg.de systemd[1]: Removed slice User Slice of UID 0.
  • On the next ssh session, however, user@0.service still has u->cgroup_realized = true and doesn't attempt to set up the cgroups, although the tree no longer exists (see the sketch right after this walkthrough):
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: XXX Service Start 1
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: XXX 0x8e3970 Service Spawn 1
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: XXX Unit 0x8e3970 prepare exec 1
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 unit realize cgroup now 1
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 has mask realized cgroup_realized=1 cgroup_realized_mask=00000050 target_mask=00000050 mask_v1=0000007b c
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 result: 1
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: XXX 0x8e3970 unit_realize_cgroup_now: NOOP, because it says it is already is...
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: XXX Unit 0x8e3970 prepare exec 2
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: XXX 0x8e3970 Service Spawn 2
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Passing 0 fds to service
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: About to execute: /lib/systemd/systemd --user
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Forked /lib/systemd/systemd as 11349
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/user_400_2eservice interface=o
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/user_400_2eservice interface=o
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: user@0.service: Changed dead -> start
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[1]: Starting User Manager for UID 0...
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[11349]: user@0.service: Failed to attach to cgroup /user.slice/user-0.slice/user@0.service: No such file or directory
Apr 24 18:15:07 sigalrm.molgen.mpg.de systemd[11349]: user@0.service: Failed at step CGROUP spawning /lib/systemd/systemd: No such file or directory
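
The "NOOP" in the trace above comes from a short-circuit of this kind. The following is a simplified sketch based on the behavior described in this report, using a hypothetical helper name; the real check lives in src/core/cgroup.c and is more involved:

/* Hypothetical, simplified helper: realization is skipped when the unit already believes
 * its cgroup exists with the required controllers. After the failed prune above,
 * cgroup_realized is still true, so nothing is (re)created here, and the forked
 * systemd --user later fails to attach to the missing directory. */
static bool unit_cgroup_looks_realized(Unit *u, CGroupMask target_mask) {
        return u->cgroup_realized &&
                (u->cgroup_realized_mask & target_mask) == target_mask;
}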

There are multiple problems involved here. For example: Why doesn't ssh deliver an error message on stderr when a required PAM module fails? Should user@0.service terminate while the sd-pam process is still alive?

Anyway, what can be identified as a bug here is that u->cgroup_realized doesn't correctly track the state of the cgroup hierarchy. An easy fix might be to set u->cgroup_realized = false; at the beginning of unit_prune_cgroup(), before we actually attempt to delete the cgroup tree and possibly fail (perhaps after succeeding in parts).

But on top of that: if other units are allowed to recursively clean up the tree from a higher position (e.g. from the user slice), a unit can never know the state of the cgroup tree based on its own actions alone.

I will apply a quick fix for our environment, but I don't want to attempt a fix for upstream here, because I'm unsure of the design principles and the vision regarding u->cgroup_realized. Is it just some kind of cache that could be removed (looking at /sys/fs/cgroup should be fast enough in the context of spawning external programs), or is it required for another reason?

[1]

void unit_prune_cgroup(Unit *u) {
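
The function body is elided above; for context, the relevant v242 control flow looks roughly like the following. This is a simplified sketch reconstructed from the behavior described in this report, not the verbatim code from src/core/cgroup.c, and it relies on systemd's internal headers and helpers:

void unit_prune_cgroup(Unit *u) {
        int r;
        bool is_root_slice;

        assert(u);

        /* Removes the cgroup, if empty and possible, and stops watching it. */
        if (!u->cgroup_path)
                return;

        is_root_slice = unit_has_name(u, SPECIAL_ROOT_SLICE);

        r = cg_trim_everywhere(u->manager->cgroup_supported, u->cgroup_path, !is_root_slice);
        if (r < 0) {
                /* With (sd-pam) still alive, the rmdir() fails with EBUSY and we return
                 * right here, so the bookkeeping below never runs ... */
                log_unit_debug_errno(u, r, "Failed to destroy cgroup %s, ignoring: %m", u->cgroup_path);
                return;
        }

        if (is_root_slice)
                return;

        /* ... which means u->cgroup_realized stays true, even though a later recursive
         * cleanup of user-0.slice may remove the directory behind our back. */
        u->cgroup_path = mfree(u->cgroup_path);
        u->cgroup_realized = false;
        u->cgroup_realized_mask = 0;
        u->cgroup_enabled_mask = 0;
}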

buczek added a commit to buczek/systemd that referenced this issue Apr 25, 2019
When part of the cgroup hierarchy can not be deleted (e.g. because there
are still processes in it), do not exit unit_prune_cgroup early, but
continue so that u->cgroup_realized is reset.

Change the log severity to warning, so that we can monitor the issue of
locked cgroups.

Fixes systemd#12386
buczek (Contributor, Author) commented Apr 25, 2019

The patch buczek/systemd@f36092b fixed it for us. I don't suggest it as a solution, because I think people with a better overview can come up with something better.
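
For readers who don't want to open the branch: the change described by the commit message above amounts to roughly the following inside unit_prune_cgroup(), read in the context of the sketch under [1]. This is a sketch of the idea, not the verbatim patch:

        r = cg_trim_everywhere(u->manager->cgroup_supported, u->cgroup_path, !is_root_slice);
        if (r < 0)
                /* Log and fall through instead of returning early, so that the state
                 * reset below always runs and u->cgroup_realized reflects reality again. */
                log_unit_warning_errno(u, r, "Failed to destroy cgroup %s, ignoring: %m", u->cgroup_path);

        u->cgroup_realized = false;
        u->cgroup_realized_mask = 0;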

buczek (Contributor, Author) commented Apr 30, 2019

Addendum: It might play a role that we have systemd configured with -Ddefault-kill-user-processes=false.
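
(This meson option sets the build-time default for logind's KillUserProcesses= setting; expressed as runtime configuration it corresponds roughly to the following in /etc/systemd/logind.conf:)

[Login]
# Do not kill remaining user processes when the user logs out
# (the default we build with).
KillUserProcesses=no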

poettering (Member) commented:
I think your fix is actually ok. Can you please submit it as a PR?

poettering (Member) commented:
And the message should probably be downgraded to debug level in the ENOTEMPTY case (I assume that's the errno you see, right?)
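
(The errno in the trace above is actually EBUSY, errno 16, "Device or resource busy", rather than ENOTEMPTY. A severity selection along the suggested lines could look roughly like this, again only as a sketch in the context of the unit_prune_cgroup() excerpt under [1], not as the final upstream change:)

        if (IN_SET(r, -EBUSY, -ENOTEMPTY))
                /* Known, expected case: the cgroup still contains a process. */
                log_unit_debug_errno(u, r, "Failed to destroy cgroup %s, ignoring: %m", u->cgroup_path);
        else
                log_unit_warning_errno(u, r, "Failed to destroy cgroup %s, ignoring: %m", u->cgroup_path);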

poettering added this to the v243 milestone Jun 18, 2019
buczek added a commit to buczek/systemd that referenced this issue Jun 19, 2019
When part of the cgroup hierarchy cannot be deleted (e.g. because there
are still processes in it), do not exit unit_prune_cgroup early, but
continue so that u->cgroup_realized is reset.

Log the known case of non-empty cgroups at debug level and other errors
at warning level.

Fixes systemd#12386
buczek changed the title from "Use Manager for user fails to start every second time" to "User Manager for user fails to start every second time" on Jun 19, 2019
buczek added a commit to buczek/systemd that referenced this issue Jun 19, 2019
buczek added a commit to buczek/systemd that referenced this issue Jun 19, 2019
pull bot pushed a commit to Mattlk13/systemd that referenced this issue Jun 20, 2019
edevolder pushed a commit to edevolder/systemd that referenced this issue Jun 26, 2019
edevolder pushed a commit to edevolder/systemd that referenced this issue Jun 26, 2019
zachsmith pushed a commit to zachsmith/systemd that referenced this issue Jul 26, 2019
keszybz pushed a commit to systemd/systemd-stable that referenced this issue Sep 3, 2019 (cherry picked from commit 0219b35)
keszybz pushed a commit to systemd/systemd-stable that referenced this issue Sep 3, 2019 (cherry picked from commits 0219b35 and 7f7b786)
Yamakuzure pushed a commit to elogind/elogind that referenced this issue Sep 23, 2019
Yamakuzure pushed a commit to elogind/elogind that referenced this issue Sep 23, 2019 (cherry picked from commits 0219b3524f414e23589e63c6de6a759811ef8474 and 7f7b7865eefe6577fad4f94bdc5e7c83df62ac60)