Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve log for pod deletion poll loop #49597

Merged
merged 1 commit into from
Jul 28, 2017

Conversation

jeffvance
Copy link
Contributor

@jeffvance jeffvance commented Jul 26, 2017

What this PR does / why we need it:
It improves some logging related to waiting for a pod to reach a passed-in condition. Specifically, related to issue 49529 where better logging may help to debug the root cause.

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 26, 2017
@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Jul 26, 2017
@jeffvance
Copy link
Contributor Author

/assign @msau42
@msau42 I think this will help debug the above issue we discussed. Can you ptal? Thanks.

@jeffvance
Copy link
Contributor Author

/test-all

@jeffvance
Copy link
Contributor Author

/test pull-kubernetes-kubemark-e2e-gce

@jeffvance
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

1 similar comment
@jeffvance
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

@jeffvance
Copy link
Contributor Author

jeffvance commented Jul 26, 2017

@msau42 all tests pass now.
@erictune can you approve?

@copejon
Copy link
Contributor

copejon commented Jul 26, 2017

+1 Logging before condition() call. This will make tracing down the various polling errors less painful.

@@ -764,28 +764,28 @@ func waitForServiceAccountInNamespace(c clientset.Interface, ns, serviceAccountN
}

func WaitForPodCondition(c clientset.Interface, ns, podName, desc string, timeout time.Duration, condition podCondition) error {
Logf("Waiting up to %[1]v for pod %[2]s status to be %[3]s", timeout, podName, desc)
Logf("Waiting up to %v for pod %q in namespace %q to be %q", timeout, podName, ns, desc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave "status" in the message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no because leaving "status" in this message is misleading since the insert used is the passed-in desc. The passed in condition func may use the pod's Status to decide when to stop the loop but we don't know that here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's kind of strange because in the terminated case, the desc string == ""

Copy link
Contributor Author

@jeffvance jeffvance Jul 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the loop ends for the pod-delete calls, Status.Reason is typically "" but sometimes I see other reasons. The desc value is passed in by the caller and meant to be descriptive of the condition AFAICT. I've created a new pr to deal with better detection of pod deletion and in this pr the reason parm will not be checked if it is "".

@msau42
Copy link
Member

msau42 commented Jul 26, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 26, 2017
@jeffvance
Copy link
Contributor Author

@rmmh @erictune Can you approve? I have lgtm and all tests pass. Would like to get this in to help debug some gci-gce-slow flakes. Thanks.

@jeffvance
Copy link
Contributor Author

@spiffxp Any chance you can take a look and approve? I have the lgtm and reviews and all tests pass. I'd like this in so I can get better logging on some gci-gce-slow flakes. Thanks.

@spiffxp
Copy link
Member

spiffxp commented Jul 27, 2017

/approve

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jeffvance, msau42, spiffxp

Associated issue: 49529

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 27, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 49619, 49598, 47267, 49597, 49638)

@k8s-github-robot k8s-github-robot merged commit 5c874be into kubernetes:master Jul 28, 2017
k8s-github-robot pushed a commit that referenced this pull request Jul 31, 2017
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)

improve detectability of deleted pods

**What this PR does / why we need it**:
Adds comment to `waitForPodTerminatedInNamespace` to better explain how it's implemented.
~~It improves pod deletion detection in the e2e framework as follows:~~
~~1.  the `waitForPodTerminatedInNamespace` func looks for pod.Status.Phase ==  _PodFailed_ or _PodSucceeded_ since both values imply that all containers have terminated.~~
~~2.  the `waitForPodTerminatedInNamespace` func also ignores the pod's Reason if the passed-in `reason` parm is "". Reason is not really relevant to the pod being deleted or not, but if the caller passes a non-blank `reason` then it will be lower-cased, de-blanked and compared to the pod's Reason (also lower-cased and de-blanked). The idea is to make Reason checking more flexible and to prevent a pod from being considered running when all of its containers have terminated just because of a Reason mis-match.~~

Releated to pr [49597](#49597) and issue [49529](#49529).

**Release note**:
```release-note
NONE
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants