add timeout for polling NEG health status for pod readiness #834

Merged 1 commit on Aug 28, 2019

Conversation

freehan
Contributor

@freehan freehan commented Aug 27, 2019

Prevent pods from getting stuck in an unready state due to misconfiguration.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 27, 2019
// 3. syncPod patches the neg readiness condition to be false
reason = negNotReadyReason
message = fmt.Sprintf("Waiting for pod to become healthy in at least one of the NEG(s): %v", negs)
// check if the pod has been waiting for the endpoint to show up as Healthy in NEG for too long
Member

@bowei bowei Aug 27, 2019

Can we return early in each case so the code is easier to reason about?

if len(negs) == 0 {
  expectedCondition.Status = ...
  expectedCondition... = ...
  return r.ensurePodNegCondition(pod, expectedCondition)
}

and so forth (get rid of the else's)

pkg/neg/readiness/reflector.go
{
desc: "timeout waiting for endpoint to become healthy in NEGs",
mutateState: func() {
//pod := generatePodWithTimestamp(testNamespace, podName, true, false, false, now.Truncate(unreadyTimeout))
Member

delete

Type: shared.NegReadinessGate,
Reason: negReadyTimedOutReason,
Status: v1.ConditionTrue,
Message: fmt.Sprintf("Timeout waiting for pod to become healthy in at least one of the NEG(s): %v. Marking condition %q to True.", []string{"neg1", "neg2"}, shared.NegReadinessGate),
Member

This is pretty fragile; I guess it's OK since the code is so close to the test.
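
One possible way to reduce that fragility, sketched here as an illustration and not something this PR does: build the message in a single helper that both the reflector and its test call, so the test no longer repeats the format string. The helper name `timeoutMessage` and the `negReadinessGate` placeholder (standing in for `shared.NegReadinessGate`) are hypothetical.

```go
package main

import "fmt"

// negReadinessGate is a placeholder for shared.NegReadinessGate; the
// value used here is an assumption for illustration only.
const negReadinessGate = "example.com/neg-ready"

// timeoutMessage is a hypothetical helper that owns the format string, so
// production code and tests cannot drift apart.
func timeoutMessage(negs []string, gate string) string {
	return fmt.Sprintf(
		"Timeout waiting for pod to become healthy in at least one of the NEG(s): %v. Marking condition %q to True.",
		negs, gate)
}

func main() {
	fmt.Println(timeoutMessage([]string{"neg1", "neg2"}, negReadinessGate))
}
```

A test would then compare against `timeoutMessage([]string{"neg1", "neg2"}, shared.NegReadinessGate)` instead of a second copy of the literal.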

negReadyReason = "LoadBalancerNegReady"
negNotReadyReason = "LoadBalancerNegNotReady"
maxRetries = 15
negReadyReason = "LoadBalancerNegReady"
Member

might want to document each of these
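
A sketch of what documenting these constants could look like. The names and values are the ones visible in the diff above; the comment wording is a guess at intent, in particular the meaning of `maxRetries`, and is not taken from the merged code.

```go
package main

import "fmt"

const (
	// negReadyReason is the condition reason reported once the pod is
	// considered ready from the NEG's point of view.
	negReadyReason = "LoadBalancerNegReady"
	// negNotReadyReason is the condition reason reported while the pod is
	// still waiting to become healthy in at least one NEG.
	negNotReadyReason = "LoadBalancerNegNotReady"
	// maxRetries caps how many times processing is retried before giving
	// up (assumed meaning; the diff does not say).
	maxRetries = 15
)

func main() {
	fmt.Println(negReadyReason, negNotReadyReason, maxRetries)
}
```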

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 27, 2019
@freehan
Contributor Author

freehan commented Aug 27, 2019

Addressed the comments.

@bowei
Member

bowei commented Aug 28, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 28, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei, freehan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 010d95d into kubernetes:master Aug 28, 2019
k8s-ci-robot added a commit that referenced this pull request Sep 10, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
3 participants