kubelet: cgroups: be verbose about validation #108568
/sig node |
@rphillips @mrunalp this lgtm but wanted someone more familiar with cgroup manager (if that's either of you two, or others) |
-	if !cgroupManager.Exists(cgroupRoot) {
-		return nil, fmt.Errorf("invalid configuration: cgroup-root %q doesn't exist", cgroupRoot)
+	if err := cgroupManager.Validate(cgroupRoot); err != nil {
+		return nil, fmt.Errorf("invalid configuration: %w", err)
NB: this is terminal to the kubelet process, and not explaining what error caused the configuration to be invalid is poor UX.
7cb3880 to 0e475f3
/assign @kolyshkin |
/triage accepted |
Is it possible to not change the API? (I don't like two methods with different names that do the same thing, where the only difference is the return value.) Say, raise the log level in `Exists` to a warning, and add logging in the places where it is missing?
If not, maybe it's better to change all instances of `if cm.Exists()` to `if cm.Validate() == nil`.
Also, ideally I'd like this to go on top of #107149 since it incorporates some non-trivial changes. |
@kolyshkin no. I very strongly disagree with you. Logging critical errors that cause the process to exit at a high verbosity level and asking the user to re-run their setup in order to see the error should not be how we go about this. Also, if you notice, there were places where the logging did not expose the reason for |
@kolyshkin also while I understand the request re: the other PR, it looks very large, very old and it's not clear the timeline on which it would merge. |
d02257c to 1296562
@rphillips @kolyshkin updated to entirely remove the |
This is changing and removing an established API. Exists() should probably stay and Validate be added. Exists() can simply call Validate(). The internal code can call Validate() to propagate the errors. |
Gotcha. @rphillips that was the original factoring. Let me revert to that one... |
Previously, callers of `Exists()` would not know why the cGroup was or was not existing. In one call-site in particular, the `kubelet` would entirely fail to start if the cGroup validation did not succeed. In these cases we MUST explain what went wrong and pass that information clearly to the caller. Previously, some but not all of the reasons for invalidation were logged at a low log-level instead. This led to poor UX. The original method was retained on the interface so as to make this diff small. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
@rphillips reverted to the original state |
1296562 to 8f2bc39
/lgtm |
/assign @derekwaynecarr |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: smarterclayton, stevekuznetsov
Previously, callers of `Exists()` would not know why the cGroup was or was not existing. In one call-site in particular, the `kubelet` would entirely fail to start if the cGroup validation did not succeed. In these cases we MUST explain what went wrong and pass that information clearly to the caller. Previously, some but not all of the reasons for invalidation were logged at a low log-level instead. This led to poor UX.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
/kind bug
/cc @sjenning @derekwaynecarr @smarterclayton