KEP 127: add a metric, describe an error kubelet will return, and target one more beta #5413
cc @wojtek-t
@haircommander thanks! Left two simple comments, feel free to ignore the one about the metric :)
If that is the case, checking the pod events to see if they are failing for user namespaces reasons
(like the errors shown in this KEP) is advised, in which case it is recommended to rollback or
disable the feature gate.
If there are no successfully created user namespaced pods (but are pods that have been attempted to be created), then there may be an issue with user namespaces on that node.
What if we just name the metrics? Like the old text, but using the metrics
If there are no successfully created user namespaced pods (but are pods that have been attempted to be created), then there may be an issue with user namespaces on that node.
If the kubelet metric `started_user_namespaced_pods_errors_total` has a value close to `started_user_namespaced_pods_total`, it means most of the pods with userns started are failing. If that is the case, checking the pod events to see if they are failing for user namespaces reasons (like the errors shown in this KEP) is advised, in which case it is recommended to roll back or disable the feature gate.
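As a rough illustration of the check described in this suggestion, the sketch below computes the ratio of the two counters from a Prometheus text-format scrape. It is only a sketch under assumptions: the metric names are used exactly as written in this thread (the names the kubelet actually exposes may carry a prefix, so they are parameters), and the helper names `parse_counter` and `userns_error_ratio` are hypothetical.

```python
from typing import Optional


def parse_counter(metrics_text: str, name: str) -> float:
    """Sum every sample of the metric `name` in Prometheus text exposition format."""
    total = 0.0
    for raw in metrics_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # "<name>{labels} <value> [timestamp]"; split on the last "}"
            series, _, rest = line.rpartition("}")
            metric_name = series.split("{", 1)[0]
        else:
            # "<name> <value> [timestamp]"
            metric_name, _, rest = line.partition(" ")
        fields = rest.split()
        if metric_name == name and fields:
            total += float(fields[0])
    return total


def userns_error_ratio(
    metrics_text: str,
    # Assumption: names as quoted in the review thread; adjust to the exposed names.
    started: str = "started_user_namespaced_pods_total",
    errors: str = "started_user_namespaced_pods_errors_total",
) -> Optional[float]:
    """Return errors / started, or None if no user namespaced pod was attempted."""
    started_total = parse_counter(metrics_text, started)
    if started_total == 0:
        return None
    return parse_counter(metrics_text, errors) / started_total
```

A ratio close to 1 corresponds to the situation the suggestion describes: most pods started with user namespaces are failing, so the pod events are worth checking. What counts as "close" is left to the operator; no threshold is defined in this thread.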
updated!
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
-->
- Kubernetes 1.34: Feature goes GA
The kep.yaml says 1.35
oop thanks!
+1 to adding those metrics - they are definitely useful, very easy to reason about, and sound reasonably straightforward to add.
/approve PRR
Signed-off-by: Peter Hunt <pehunt@redhat.com>
8487267 to 9eead1a
LGTM, thanks a lot @haircommander !
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: haircommander, mrunalp, rata, wojtek-t
Support User Namespaces in pods #127