-
Notifications
You must be signed in to change notification settings - Fork 527
fix: more resilient etcd systemd #3809
fix: more resilient etcd systemd #3809
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3809 +/- ##
=======================================
Coverage 73.19% 73.19%
=======================================
Files 148 148
Lines 25394 25394
=======================================
Hits 18587 18587
Misses 5671 5671
Partials 1136 1136
Continue to review full report at Codecov.
|
/lgtm |
@Michael-Sinz: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Then again, I wrote that up in the bug report so someone else should also validate. |
f6d2b1f
to
bc5d722
Compare
bc5d722
to
ba5680e
Compare
Wants=network-online.target | ||
RequiresMountsFor=/var/lib/etcddisk | ||
After=network.target var-lib-etcddisk.mount | ||
Wants=network-online.target var-lib-etcddisk.mount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the discussion of systemd dependency fields interesting this morning, thanks @Michael-Sinz and @jackfrancis. So I understand this change, except for why var-lib-etcddisk.mount
needs to be in the Wants=
stanza. Why isn't it being in the After=
stanza sufficient?
Otherwise lgtm++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the After is all that is needed (if you read my bug report: #3807 you can see that)
However, if someone were to manually start etcd from some unknown state, After does not make sure the mount exists, so the Wants= says add the mount to the "must start" (After just means "after it ran")
Note that Wants= is not after, it just means "add to the list of things to start"
As I said in the bug report, the only things required to make it work for the boot scenarios is the After= and the removal of the RequiresMountsFor= - but that does not make it semantically the same for manual/other systemd operations.
It does not cost much to add the other two items (steps 3 and 4 from the bug report) and it makes this basically semantically the same as before but mechanically different (and thus fixing the problem)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be optional but I can imagine edge cases where it could be needed
As you said in the bug report, yep. Thanks for re-explaining!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jackfrancis, mboersma, Michael-Sinz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Reason for Change:
This PR addresses observed systemd failures to guarantee the etcd mount dependency upon
/var/lib/etcddisk
via theRequiresMountsFor
configuration option. In the real world, the systemdRequiresMountsFor
implementation is unable to detect the specified mount points despite the volume being mounted on the host, and thus blocks the start operation of the service indefinitely.All credit to @Michael-Sinz :)
Issue Fixed:
Fixes #3807
Requirements:
Notes: