Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core, job: fix breakage of ordering dependencies by systemctl reload command #13860

Merged
merged 1 commit into from Nov 4, 2019

Conversation

d-hatayama
Copy link
Contributor

No description provided.

@d-hatayama
Copy link
Contributor Author

d-hatayama commented Oct 30, 2019

I want to fix something to remove the label ci-fails/needs-rework, but I don't know what causes the label. There are many things that fail, I think some of them are not due to the patch set. Maybe, is it enough to deal with the failures on unit tests regarding job handling?

@anitazha
Copy link
Member

@d-hatayama You can click on "Details" below to see logs from the failed CI. I looked at a couple of the failing CIs and it is for unit tests. So if you fix the unit tests like you said it should be fine.

@d-hatayama
Copy link
Contributor Author

Thanks, @anitazha . It looks to me that the problematic patch is the 2nd one only, but this is actually not related to Issue #10464. I'll re-post the 2nd patch as another PR.

…command

Currently, systemctl reload command breaks ordering dependencies if it's
executed when its target service unit is in activating state.

For example, prepare A.service, B.service and C.target as follows:

    # systemctl cat A.service B.service C.target
    # /etc/systemd/system/A.service
    [Unit]
    Description=A

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/echo A1
    ExecStart=/usr/bin/sleep 60
    ExecStart=/usr/bin/echo A2
    ExecReload=/usr/bin/echo A reloaded
    RemainAfterExit=yes

    # /etc/systemd/system/B.service
    [Unit]
    Description=B
    After=A.service

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/echo B
    RemainAfterExit=yes

    # /etc/systemd/system/C.target
    [Unit]
    Description=C
    Wants=A.service B.service

Start them.

    # systemctl daemon-reload
    # systemctl start C.target

Then, we have:

    # LANG=C journalctl --no-pager -u A.service -u B.service -u C.target -b
    -- Logs begin at Mon 2019-09-09 00:25:06 EDT, end at Thu 2019-10-24 22:28:47 EDT. --
    Oct 24 22:27:47 localhost.localdomain systemd[1]: Starting A...
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: Child 967 belongs to A.service.
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: Main process exited, code=exited, status=0/SUCCESS
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: Running next main command for state start.
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: Passing 0 fds to service
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: About to execute: /usr/bin/sleep 60
    Oct 24 22:27:47 localhost.localdomain systemd[1]: A.service: Forked /usr/bin/sleep as 968
    Oct 24 22:27:47 localhost.localdomain systemd[968]: A.service: Executing: /usr/bin/sleep 60
    Oct 24 22:27:52 localhost.localdomain systemd[1]: A.service: Trying to enqueue job A.service/reload/replace
    Oct 24 22:27:52 localhost.localdomain systemd[1]: A.service: Merged into running job, re-running: A.service/reload as 1288
    Oct 24 22:27:52 localhost.localdomain systemd[1]: A.service: Enqueued job A.service/reload as 1288
    Oct 24 22:27:52 localhost.localdomain systemd[1]: A.service: Unit cannot be reloaded because it is inactive.
    Oct 24 22:27:52 localhost.localdomain systemd[1]: A.service: Job 1288 A.service/reload finished, result=invalid
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Passing 0 fds to service
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: About to execute: /usr/bin/echo B
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Forked /usr/bin/echo as 970
    Oct 24 22:27:52 localhost.localdomain systemd[970]: B.service: Executing: /usr/bin/echo B
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Failed to send unit change signal for B.service: Connection reset by peer
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Changed dead -> start
    Oct 24 22:27:52 localhost.localdomain systemd[1]: Starting B...
    Oct 24 22:27:52 localhost.localdomain echo[970]: B
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Child 970 belongs to B.service.
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Main process exited, code=exited, status=0/SUCCESS
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Changed start -> exited
    Oct 24 22:27:52 localhost.localdomain systemd[1]: B.service: Job 1371 B.service/start finished, result=done
    Oct 24 22:27:52 localhost.localdomain systemd[1]: Started B.
    Oct 24 22:27:52 localhost.localdomain systemd[1]: C.target: Job 1287 C.target/start finished, result=done
    Oct 24 22:27:52 localhost.localdomain systemd[1]: Reached target C.
    Oct 24 22:27:52 localhost.localdomain systemd[1]: C.target: Failed to send unit change signal for C.target: Connection reset by peer
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Child 968 belongs to A.service.
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Main process exited, code=exited, status=0/SUCCESS
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Running next main command for state start.
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Passing 0 fds to service
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: About to execute: /usr/bin/echo A2
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Forked /usr/bin/echo as 972
    Oct 24 22:28:47 localhost.localdomain systemd[972]: A.service: Executing: /usr/bin/echo A2
    Oct 24 22:28:47 localhost.localdomain echo[972]: A2
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Child 972 belongs to A.service.
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Main process exited, code=exited, status=0/SUCCESS
    Oct 24 22:28:47 localhost.localdomain systemd[1]: A.service: Changed start -> exited

The issue occurs not only in reload command, i.e.:

  - reload
  - try-restart
  - reload-or-restart
  - reload-or-try-restart commands

The cause of this issue is that job_type_collapse() doesn't take care of the
activating state.

Fixes: systemd#10464
@d-hatayama
Copy link
Contributor Author

I removed the 2nd patch, which I'll request another PR later. I also modified the patch description of the 1st patch.

@poettering poettering removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Nov 4, 2019
@poettering poettering merged commit d155979 into systemd:master Nov 4, 2019
@d-hatayama d-hatayama deleted the devel_issue_10464 branch November 13, 2019 12:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

4 participants