Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NewSystemd handles UnitExists when starting units #290

Merged
merged 1 commit into from
Jul 16, 2023

Conversation

mmerkes
Copy link
Contributor

@mmerkes mmerkes commented Apr 23, 2023

As done in this runc PR
opencontainers/runc#3782,
UnitExists errors are no longer ignored when starting units. If it already exists, we attempt to reset the failed unit and retry once. Otherwise, we fail. See #279 for more context.

Fixes #279

In generally, I'm following runc's example and making this code more robust. See code and commit for more details. Ideally, we'd reuse the code, but don't currently have a dependency on runc as a library here and I don't have enough context to know if that would be reasonable or not.

@mmerkes mmerkes force-pushed the unitexists branch 2 times, most recently from 699e348 to 8565c75 Compare May 20, 2023 21:30
@mmerkes mmerkes changed the title [WIP] NewSystemd handles UnitExists when starting units NewSystemd handles UnitExists when starting units May 20, 2023
@mmerkes
Copy link
Contributor Author

mmerkes commented May 20, 2023

@kolyshkin Let me know if you have feedback here since you opened the original issue and made the runc change. Tx!

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, I am not a maintainer and can't say which uses/users this method has.

But nevertheless a fix needs to be made, because the current behavior (of not adding the container into a proper cgroup) is totally unacceptable (and can, in some circumstances, be qualified as a security issue).

So, I'm adding my non-binding LGTM, but this needs a review from the actual maintainers of this package.

.gitignore Outdated
@@ -1,2 +1,5 @@
example/example
cmd/cgctl/cgctl

*.swp
coverage.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These settings should be in your home directory

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash commits

As done in opencontainers/runc#3782,
UnitExists errors are no longer ignored when starting units.
This change makes the logic more robust like in runc:

1. Attempts to reset a failed unit if it already exists
2. Verifies via the unit status that it successfully starts
3. Waits longer for unit to start
4. Continues to ignore unit existing when pid is -1 to
accommodate kubelet use case
5. Otherwise, returns an error if it already exists

Signed-off-by: Matt Merkes <matt.merkes@gmail.com>

Update .gitignore

Signed-off-by: Matt <matt.merkes@gmail.com>
@mmerkes
Copy link
Contributor Author

mmerkes commented Jun 28, 2023

Please squash commits

Updated. Falsely assumed that the merge would squash. @AkihiroSuda

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mmerkes
Copy link
Contributor Author

mmerkes commented Jul 14, 2023

@AkihiroSuda This good to be merged?

@AkihiroSuda AkihiroSuda merged commit a0ae1c2 into containerd:main Jul 16, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NewSystemd must not ignore UnitExists
5 participants