Submission type
systemd version the issue has been seen with
v237 and v233
Used distribution
Custom Distribution with Linux 4.9.87 x86_64 GNU/Linux
In case of bug report: Expected behaviour you didn't see
Systemd should restart a failed service
In case of bug report: Unexpected behaviour you saw
Systemd failed to restart a failed service
In case of bug report: Steps to reproduce the problem
Start a service using the following service file and observe its log in the journal. Even after the ExecStart script fails, systemd does not try to restart the service.
To hit the condition reliably, I used sleep in the scripts. The main script (ExecStart) must send the ready notification to systemd and then exit with a failure before the ExecStartPost script finishes. This is the precise condition under which the problem is observed.
=========== test.service ========
[Unit]
Description=Test Service
[Service]
Type=notify
NotifyAccess=all
StandardOutput=journal
ExecStart=/start.sh
ExecStartPost=/startpost.sh
ExecStopPost=/stoppost.sh
TimeoutStartSec=0
RemainAfterExit=yes
Restart=on-failure
StartLimitInterval=0
StartLimitBurst=0
[Install]
# cat /start.sh
#!/bin/bash
systemd-notify --ready --status="sleeping"
sleep 3s
exit 1
# cat /startpost.sh
#!/bin/bash
sleep 9s
exit 0
# cat /stoppost.sh
#!/bin/bash
exit 0
An illustration of the different scripts' running time
# job's timing
|=^=======3s==========|(start.sh - exit 1)
| (notify)
|====================9s===========|(startpost.sh - exit 0)
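The overlap shown in the illustration can be checked without systemd. The following is only a rough sketch: the two subshells are stand-ins for /start.sh and /startpost.sh, no notification is sent, and the SCALE variable is a hypothetical knob added here to shorten the 3s and 9s delays (SCALE=1 gives the delays from the report).

```shell
#!/bin/bash
# Sketch only: mimics the lifetimes of the two scripts, not systemd itself.
# SCALE is a hypothetical knob; SCALE=1 reproduces the 3s / 9s delays.
SCALE="${SCALE:-0.1}"

main_delay=$(awk -v s="$SCALE" 'BEGIN { print 3 * s }')
post_delay=$(awk -v s="$SCALE" 'BEGIN { print 9 * s }')

# Stand-in for /start.sh: it would notify systemd first, then sleep and fail.
( sleep "$main_delay"; exit 1 ) &
main_pid=$!

# Stand-in for /startpost.sh: outlives the main process and succeeds.
( sleep "$post_delay"; exit 0 ) &
post_pid=$!

# Collect the main script's exit status without aborting on failure.
main_status=0
wait "$main_pid" || main_status=$?

# Check whether the start-post stand-in is still alive when main has exited.
if kill -0 "$post_pid" 2>/dev/null; then
    post_alive_at_main_exit=yes
else
    post_alive_at_main_exit=no
fi

post_status=0
wait "$post_pid" || post_status=$?

echo "main exit status: $main_status (startpost still running: $post_alive_at_main_exit)"
echo "startpost exit status: $post_status"
```

If the stand-in for startpost is still running when the main stand-in exits with status 1, the ordering matches the window in which the problem was observed.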
Sample output
# systemctl status test
● test.service - Ram Service
Loaded: loaded (/etc/systemd/system/test.service; static; vendor preset: enabl
Active: active (exited) (Result: exit-code) since Tue 2018-05-08 14:55:00 IST
Process: 1935 ExecStartPost=/startpost.sh (code=exited, status=0/SUCCESS)
Process: 1931 ExecStart=/start.sh (code=exited, status=1/FAILURE)
Main PID: 1931 (code=exited, status=1/FAILURE)
Status: "sleeping"
CPU: 4ms
Systemd logs
May 08 14:54:54 host-0 systemd[1]: test.service: Child 1931 belongs to test.service
May 08 14:54:54 host-0 systemd[1]: test.service: Main process exited, code=exited,
May 08 14:55:00 host-0 systemd[1]: test.service: Child 1935 belongs to test.service
May 08 14:55:00 host-0 systemd[1]: test.service: Control process exited, code=exit
May 08 14:55:00 host-0 systemd[1]: test.service: Got final SIGCHLD for state start
May 08 14:55:00 host-0 systemd[1]: test.service: Changed start-post -> exited
May 08 14:55:00 host-0 systemd[1]: test.service: Job test.service/start finished, r
May 08 14:55:00 host-0 systemd[1]: Started Test Service.
May 08 14:55:00 host-0 systemd[1]: test.service: cgroup is empty
May 08 14:55:00 host-0 systemd[1]: test.service: Failed to send unit change signal
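To spot this state from a script, one option is to look at the unit's ActiveState, SubState and Result properties, which `systemctl show -p` prints as key=value lines. The helper below is a hypothetical sketch, not part of systemd; the sample input is written by hand from the "active (exited) (Result: exit-code)" line in the status output above, not captured from a real run.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the key=value properties on stdin
# describe a unit that is "active (exited)" but carries a failure result.
is_stuck_after_failure() {
    local active="" sub="" result="" key value
    while IFS='=' read -r key value; do
        case "$key" in
            ActiveState) active=$value ;;
            SubState)    sub=$value ;;
            Result)      result=$value ;;
        esac
    done
    [ "$active" = "active" ] && [ "$sub" = "exited" ] &&
        [ -n "$result" ] && [ "$result" != "success" ]
}

# On a live system the input would come from (not run here):
#   systemctl show -p ActiveState -p SubState -p Result test.service

# Hand-written sample matching the status output in this report.
sample=$'ActiveState=active\nSubState=exited\nResult=exit-code'
if printf '%s\n' "$sample" | is_stuck_after_failure; then
    stuck=yes
else
    stuck=no
fi

# Hand-written counter-example: a unit that exited cleanly.
healthy=$'ActiveState=active\nSubState=exited\nResult=success'
if printf '%s\n' "$healthy" | is_stuck_after_failure; then
    healthy_stuck=yes
else
    healthy_stuck=no
fi

echo "stuck=$stuck healthy_stuck=$healthy_stuck"
```

The check deliberately treats any non-success Result as a failure, since Restart=on-failure would be expected to act on it.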