-
Notifications
You must be signed in to change notification settings - Fork 39.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kubectl/pkg/drain: Include namespace in evictPods return error #115594
kubectl/pkg/drain: Include namespace in evictPods return error #115594
Conversation
And also in the terminating-namespace log output. This makes it easier to track down drain-blocking pods, without having to hunt around in earlier logs for 'evicting pod ...' messages. Before this change, caller logs might look like: evicting pod {namespace}/{name} ... error when waiting for pod "{name}" terminating: global timeout reached: 20s With this change, they will look like: evicting pod {namespace}/{name} ... error when waiting for pod "{name}" in namespace "{namespace}" to terminate: global timeout reached: 20s
Hi @wking. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@@ -342,7 +342,7 @@ func (d *Helper) evictPods(pods []corev1.Pod, evictionGroupVersion schema.GroupV | |||
if err == nil { | |||
returnCh <- nil | |||
} else { | |||
returnCh <- fmt.Errorf("error when waiting for pod %q terminating: %v", pod.Name, err) | |||
returnCh <- fmt.Errorf("error when waiting for pod %q in namespace %q to terminate: %v", pod.Name, pod.Namespace, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not good at this, and only want to make it brief.
returnCh <- fmt.Errorf("error when waiting for pod %q in namespace %q to terminate: %v", pod.Name, pod.Namespace, err) | |
returnCh <- fmt.Errorf("error when waiting for pod %q/%q terminating: %v", pod.Namespace, pod.Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We expect the {namespace}/{name}
semantics to be implicitly clear without surrounding words? There are other */{name}
conventions around, such as:
$ kubectl get --help | grep /NAME
kubectl get [(-o|--output=)j...] (TYPE[.VERSION][.GROUP] [NAME | -l label] | TYPE[.VERSION][.GROUP]/NAME ...) [flags] [options]
and while most error-message consumers will be able to distinguish {namespace}/{name}
from {type}/{name}
, the fact that folks will see both makes me personally prefer using a separate word in the error message to make it clear which string is the namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubernetes/staging/src/k8s.io/kubectl/pkg/cmd/clusterinfo/clusterinfo_dump.go
Lines 268 to 270 in fca9f37
writer.Write([]byte(fmt.Sprintf("==== START logs for container %s of pod %s/%s ====\n", container.Name, pod.Namespace, pod.Name))) | |
defer writer.Write([]byte(fmt.Sprintf("==== END logs for container %s of pod %s/%s ====\n", container.Name, pod.Namespace, pod.Name))) | |
kubernetes/staging/src/k8s.io/kubectl/pkg/cmd/drain/drain.go
Lines 173 to 174 in fca9f37
fmt.Fprintf(o.Out, "pod %s/%s %s\n", pod.Namespace, pod.Name, verbStr) | |
} else { |
kubernetes/staging/src/k8s.io/kubectl/pkg/cmd/run/run.go
Lines 412 to 413 in fca9f37
Err: fmt.Errorf("pod %s/%s terminated (%s)\n%s", pod.Namespace, pod.Name, pod.Status.ContainerStatuses[0].State.Terminated.Reason, pod.Status.ContainerStatuses[0].State.Terminated.Message), | |
Code: int(rc), |
This style is widely used in the current code base.
And pod <ns>/<name>
is a warning message, not a structured output.
In kubectl get <x>/<y>
: this is not a log but a command, for the input part, the x has to be type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this concerns you, I would suggest doing a cleanup on the current code base(before it, I think this should be discussed in sig-cli.)
@@ -315,7 +315,7 @@ func (d *Helper) evictPods(pods []corev1.Pod, evictionGroupVersion schema.GroupV | |||
} else if apierrors.IsForbidden(err) && apierrors.HasStatusCause(err, corev1.NamespaceTerminatingCause) { | |||
// an eviction request in a deleting namespace will throw a forbidden error, | |||
// if the pod is not marked deleted, we retry until it is. | |||
fmt.Fprintf(d.ErrOut, "error when evicting pod %q (will retry after 5s): %v\n", activePod.Name, err) | |||
fmt.Fprintf(d.ErrOut, "error when evicting pod %q from terminating namespace %q (will retry after 5s): %v\n", activePod.Name, activePod.Namespace, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Fprintf(d.ErrOut, "error when evicting pod %q from terminating namespace %q (will retry after 5s): %v\n", activePod.Name, activePod.Namespace, err) | |
fmt.Fprintf(d.ErrOut, ""error when evicting pod %q/%q (will retry after 5s): %v\n", activePod.Namespace, activePod.Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't think that the fact that the namespace is terminating is interesting? If someone is investigating the logs for "why is ${POD}
failing to drain?", it seemed like ...terminating namespace "${NAMESPACE}"...
would help them focus on why that namespace termination was slow, instead of wasting their time poking at the pod specifics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/triage accepted
/priority backlog
/approve
/assign @pacoxu
for final tag after his comments are addressed
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: soltysh, wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
LGTM label has been added. Git tree hash: e610cc4f8eedbb8797e94180346f3927ffca5059
|
And also in the terminating-namespace log output. This makes it easier to track down drain-blocking pods, without having to hunt around in earlier logs for
evicting pod ...
messages. Before this change, caller logs might look like:With this change, they will look like:
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Makes it easier to identify and dig into drain-blocking pods.
Does this PR introduce a user-facing change?