services fail to track their tasks on daemon-reload #6299

Closed
cpaelzer opened this issue Jul 6, 2017 · 2 comments
Labels: already-implemented, bug 🐛, cgroups, pid1

Comments

@cpaelzer
Contributor

cpaelzer commented Jul 6, 2017

Submission type: Bug report

systemd version the issue has been seen with:

233-10 (Debian)
233-8ubuntu2 (Ubuntu)
232-21ubuntu2 (Ubuntu)

Used distribution

  • Debian and Ubuntu

Expected behaviour:

keeping track of Tasks on daemon-reload

Unexpected behaviour

Services lose track of their Tasks.
A line like "Tasks: 16 (limit: 32768)" is missing from the systemctl status output.

Steps to reproduce the problem

TL;DR: a systemctl daemon-reload triggers the bug. The following script demonstrates the issue:

  1. It checks all running services and shows that they "know" about their tasks.
  2. It calls daemon-reload.
  3. It shows that all services have "forgotten" their tasks.
  4. It restarts a service (any of them does the trick).
  5. It shows that all services now remember their tasks again.

#!/bin/bash
# Collect all running services, skipping user and getty units.
SERVICES="$(systemctl list-units --type=service | grep -v -E '(user|getty)' | awk '/running/{print $1}')"

# Print the "Tasks:" line of each service's status (empty if the line is missing).
status () {
    for srv in $SERVICES; do
        printf "%40s%s\n" "$srv : " "$(systemctl status "$srv" | grep Tasks)"
    done
}

status                          # 1. tasks are tracked
systemctl daemon-reload         # 2. trigger the bug
status                          # 3. Tasks lines are gone
# any service restart would do
systemctl restart cron.service  # 4. restart one service
status                          # 5. Tasks lines are back
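
The script above greps the human-readable status output. As a rough alternative sketch, the same check could be made against the TasksCurrent property via systemctl show. The helper names classify_tasks and unit_tasks_state are made up for illustration, and it is an assumption (not verified in this report) that the property reads as unset while a unit is in the bad state:

```shell
#!/bin/bash
# Hypothetical helper: classify a raw TasksCurrent value.
# Assumption: an unset counter shows up as "[not set]" or as the
# maximum 64-bit value, depending on the systemd version.
classify_tasks() {
    case "$1" in
        ''|'[not set]'|18446744073709551615|*[!0-9]*) echo untracked ;;
        *) echo tracked ;;
    esac
}

# Hypothetical helper: query one unit and classify the result.
# Needs a running systemd and a systemctl that supports --value.
unit_tasks_state() {
    classify_tasks "$(systemctl show -p TasksCurrent --value "$1" 2>/dev/null)"
}

# Example call (not executed here):
#   unit_tasks_state cron.service
```

Calling unit_tasks_state for each entry in $SERVICES before and after the daemon-reload should then give a compact tracked/untracked view instead of the grep on the status text.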

Sample output of the script (the three blocks are the three status calls: before the reload, after the reload, and after the restart):

./general-trigger-systemd.sh   
              accounts-daemon.service :     Tasks: 3 (limit: 4915)
                          atd.service :     Tasks: 1 (limit: 4915)
                         cron.service :     Tasks: 1 (limit: 4915)
                         dbus.service :     Tasks: 1 (limit: 4915)
                       iscsid.service :     Tasks: 2 (limit: 4915)
                     libvirtd.service :     Tasks: 16 (limit: 32768)
                 lvm2-lvmetad.service :     Tasks: 1 (limit: 4915)
                        lxcfs.service :     Tasks: 11 (limit: 4915)
                       polkit.service :     Tasks: 3 (limit: 4915)
                      rsyslog.service :     Tasks: 4 (limit: 4915)
                        snapd.service :     Tasks: 7 (limit: 4915)
                          ssh.service :     Tasks: 1 (limit: 4915)
             systemd-journald.service :     Tasks: 1 (limit: 4915)
               systemd-logind.service :     Tasks: 1 (limit: 4915)
             systemd-resolved.service :     Tasks: 1 (limit: 4915)
            systemd-timesyncd.service :     Tasks: 2 (limit: 4915)
                systemd-udevd.service :     Tasks: 1
                     virtlogd.service :     Tasks: 2 (limit: 4915)
              accounts-daemon.service : 
                          atd.service : 
                         cron.service : 
                         dbus.service : 
                       iscsid.service : 
                     libvirtd.service : 
                 lvm2-lvmetad.service : 
                        lxcfs.service : 
                       polkit.service : 
                      rsyslog.service : 
                        snapd.service : 
                          ssh.service : 
             systemd-journald.service : 
               systemd-logind.service : 
             systemd-resolved.service : 
            systemd-timesyncd.service : 
                systemd-udevd.service : 
                     virtlogd.service : 
              accounts-daemon.service :     Tasks: 3 (limit: 4915)
                          atd.service :     Tasks: 1 (limit: 4915)
                         cron.service :     Tasks: 1 (limit: 4915)
                         dbus.service :     Tasks: 1 (limit: 4915)
                       iscsid.service :     Tasks: 2 (limit: 4915)
                     libvirtd.service :     Tasks: 16 (limit: 32768)
                 lvm2-lvmetad.service :     Tasks: 1 (limit: 4915)
                        lxcfs.service :     Tasks: 11 (limit: 4915)
                       polkit.service :     Tasks: 3 (limit: 4915)
                      rsyslog.service :     Tasks: 4 (limit: 4915)
                        snapd.service :     Tasks: 7 (limit: 4915)
                          ssh.service :     Tasks: 1 (limit: 4915)
             systemd-journald.service :     Tasks: 1 (limit: 4915)
               systemd-logind.service :     Tasks: 1 (limit: 4915)
             systemd-resolved.service :     Tasks: 1 (limit: 4915)
            systemd-timesyncd.service :     Tasks: 2 (limit: 4915)
                systemd-udevd.service :     Tasks: 1
                     virtlogd.service :     Tasks: 2 (limit: 4915)
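
To pick the affected services out of output like the above without reading it by eye, a small filter can help. This is only a sketch: the function name missing_tasks is made up, and it assumes the exact "name : Tasks: ..." layout produced by the reproduction script.

```shell
#!/bin/bash
# Hypothetical filter: read the script output on stdin and print the
# names of services whose line carries no "Tasks:" entry.
missing_tasks() {
    awk '/\.service[ \t]+:/ && !/Tasks:/ {
        sub(/[ \t]+:[ \t]*$/, "")   # drop the trailing separator
        sub(/^[ \t]+/, "")          # drop the left padding
        print
    }'
}

# Example call (not executed here):
#   ./general-trigger-systemd.sh | missing_tasks
```

On the sample output, this would list every service from the middle block, since only those lines lack a Tasks entry.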

Additional notes:

  • I happened to notice that although the output of "systemctl status" does not list the tasks, other system tools such as systemd-cgtop work as expected.
  • This might sound like a "just bad output" issue, but I actually arrived at this analysis by tracking down a Debian bug that loses all libvirt-lxc containers on a service restart when coming from that bad state.
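
To cross-check the first note, the kernel's own counter for a unit's cgroup can be read directly and compared with what systemctl status shows. A minimal sketch follows; the function name cgroup_pids_current is made up, and the two candidate paths (unified layout, and a pids controller mounted at /sys/fs/cgroup/pids) are assumptions about how the hierarchy is mounted on a given system:

```shell
#!/bin/bash
# Hypothetical helper: print pids.current for a cgroup path such as
# /system.slice/cron.service. The second argument overrides the cgroup
# root (default /sys/fs/cgroup). Returns non-zero if no counter is found.
cgroup_pids_current() {
    local cgroup_path=$1 root=${2:-/sys/fs/cgroup}
    local f
    # Try the unified layout first, then a separate pids controller mount.
    for f in "$root$cgroup_path/pids.current" "$root/pids$cgroup_path/pids.current"; do
        if [ -r "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    return 1
}

# Example call (not executed here):
#   cgroup_pids_current "$(systemctl show -p ControlGroup --value cron.service)"
```

If this still prints a number while systemctl status shows no Tasks line, the kernel is still counting and only systemd's view of the unit has been lost.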

I might be overlooking something, but the issue seems fairly reproducible and it now exceeds my systemd-foo.
So I hope filing this bug helps get some experts to reproduce, comment on, and look into it.

@evverx
Member

evverx commented Jul 6, 2017

This seems to have been fixed by #5619, but the patch has not been backported yet. Is there any chance you could apply the patch and check that it works for you?

@evverx evverx added cgroups needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer pid1 labels Jul 6, 2017
@cpaelzer
Contributor Author

cpaelzer commented Jul 7, 2017

Hi,
thanks for pointing out the potential fix.
It applied with just a few offsets on the latest Ubuntu, so I built an experimental package in a PPA.

Testing with that package confirmed it is the fix I need.
It fixes the simple reproduction case listed above and also resolves the issue of the service losing all containers when restarted from the "bad state".

Thanks a lot. Closing, since the fix is known and good upstream.

@cpaelzer cpaelzer closed this as completed Jul 7, 2017
@evverx evverx added already-implemented bug 🐛 Programming errors, that need preferential fixing and removed needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer labels Jul 7, 2017