Unit state may be clobbered when unit symlink is added #14141

Open
OhNoMoreGit opened this issue Nov 25, 2019 · 0 comments
Labels: bug 🐛 (Programming errors, that need preferential fixing), pid1

systemd version the issue has been seen with

systemd v243

Used distribution

Fedora

Description of problem

Adding a new symlink that points to an active unit and then reloading systemd can clobber that unit's state. In some cases, this can cause systemd to crash.

Steps to reproduce the problem

First create a dummy systemd service, and have it refer to another not-yet-existing service:

# cat << EOF >/etc/systemd/system/foobar-a.service
[Unit]
Wants=foobar-b.service

[Service]
ExecStart=/bin/sleep 600
Restart=always
EOF

Make sure the service is loaded and started:

# systemctl daemon-reload
# systemctl start foobar-a.service

systemd will internally create a "not-found" unit to track foobar-b.service. We can see this in systemd-analyze dump:

# systemd-analyze dump
...
-> Unit foobar-a.service:
        Description: foobar-a.service
        Instance: n/a
        Unit Load State: loaded
        Unit Active State: active
...
-> Unit foobar-b.service:
        Description: foobar-b.service
        Instance: n/a
        Unit Load State: not-found
        Unit Active State: inactive
...
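
To avoid scanning the full dump by eye, the relevant fields can be filtered out. The following is a minimal sketch that assumes the dump layout shown above (a "-> Unit NAME:" header followed by "Unit Load State:" and "Unit Active State:" lines). unit_states is a hypothetical helper name, not part of systemd.

```shell
# Hypothetical helper: summarize load/active state per unit from
# `systemd-analyze dump` output read on stdin.
# Assumes the "-> Unit NAME:" / "Unit Load State:" / "Unit Active State:"
# layout shown in the dump excerpt.
unit_states() {
    awk '
        /^-> Unit / {
            name = $3
            sub(/:$/, "", name)          # strip the trailing colon
        }
        /Unit Load State:/   { loadstate[name] = $NF }
        /Unit Active State:/ { print name, loadstate[name], $NF }
    '
}

# Usage: systemd-analyze dump | unit_states | grep foobar
```

Each output line has the form "NAME LOAD-STATE ACTIVE-STATE", so the not-found placeholder for foobar-b.service is easy to spot.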

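For the reproduction to work, foobar-b.service must be ordered after foobar-a.service. This refers to the internal hashmap ordering, not to an After=/Before= dependency. If foobar-b.service is ordered first, run systemctl daemon-reexec to reinitialize everything; with luck, the hashmap ordering will then be different.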
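
The ordering check can be scripted. This sketch assumes that the order in which units appear in the systemd-analyze dump output reflects the ordering this step refers to; that is an assumption, not something the dump format guarantees. listed_after is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) if unit $2 is listed after unit $1
# in `systemd-analyze dump` output read on stdin.
# Assumption: dump order mirrors the internal hashmap iteration order.
listed_after() {
    awk -v first="$1" -v second="$2" '
        /^-> Unit / {
            name = $3
            sub(/:$/, "", name)          # strip the trailing colon
            n++
            if (name == first)  pos_first  = n
            if (name == second) pos_second = n
        }
        END { exit !(pos_first && pos_second && pos_second > pos_first) }
    '
}

# Usage:
#   systemd-analyze dump | listed_after foobar-a.service foobar-b.service \
#       || echo "wrong order, try: systemctl daemon-reexec"
```

The helper also fails when either unit is missing from the dump, so a failure should be checked by hand before re-executing the manager.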

Next, we create a symlink named foobar-b.service that points to the unit file of an existing, active service. foobar-a.service is convenient:

# ln -sf /etc/systemd/system/foobar-{a,b}.service
# systemctl daemon-reload

When the state for the not-found foobar-b.service is deserialized, it will clobber the state of foobar-a.service:

# systemctl status foobar-a.service
* foobar-a.service
   Loaded: loaded (/etc/systemd/system/foobar-a.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
 Main PID: 19170 (sleep)
   CGroup: /system.slice/foobar-a.service
           `-19170 /bin/sleep 600

Nov 25 09:58:42 test systemd[1]: Started foobar-a.service.

Note that systemd thinks the service is inactive, even though the main PID is still running!
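
The inconsistency can also be detected without reading the status output. This is a minimal sketch, assuming that the MainPID property still reports the PID shown in the status output above; state_mismatch is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) when the reported ActiveState is
# "inactive" but the given main PID still exists.
# $1 = ActiveState string, $2 = main PID (0 or empty means "none")
state_mismatch() {
    [ "$1" = "inactive" ] || return 1
    [ -n "$2" ] && [ "$2" != "0" ] || return 1
    kill -0 "$2" 2>/dev/null             # signal 0 only probes for existence
}

# Usage (as root, so that kill -0 is permitted for the service's process):
#   state=$(systemctl show --property=ActiveState --value foobar-a.service)
#   pid=$(systemctl show --property=MainPID --value foobar-a.service)
#   state_mismatch "$state" "$pid" && echo "foobar-a.service state clobbered"
```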

If the main process then exits (e.g. kill 19170), systemd will hit the assertion:

Code should not be reached 'Uh, main process died at wrong time.' at ../src/core/service.c:3490, function service_sigchld_event(). Aborting.

systemd then freezes execution.

(Also reported against RHEL 7 at https://bugzilla.redhat.com/show_bug.cgi?id=1760149, though that bug report was originally filed for a different issue.)

poettering added the bug 🐛 and pid1 labels on Nov 25, 2019