kubelet: retry pod sandbox creation when containers were never created #79451

Merged: 1 commit, Jun 27, 2019

yujuhong
Contributor

If kubelet never gets past sandbox creation (i.e., never attempted to
create containers for a pod), it should retry the sandbox creation on
failure, regardless of the restart policy of the pod.

What type of PR is this?

/kind bug
What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #79398

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix a bug where kubelet would not retry pod sandbox creation when the restart policy of the pod is Never

@yujuhong added the kind/bug, priority/critical-urgent, and sig/node labels Jun 27, 2019
@k8s-ci-robot added the release-note, size/M, and cncf-cla: yes labels Jun 27, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Jun 27, 2019
@yujuhong
Contributor Author

The regression was introduced in 1.13. We need to patch 1.15, 1.14, and 1.13.

@yujuhong
Contributor Author

The original change was introduced in #68980 /cc @derekwaynecarr

@@ -472,11 +472,15 @@ func (m *kubeGenericRuntimeManager) computePodActions(pod *v1.Pod, podStatus *ku
 	// If we need to (re-)create the pod sandbox, everything will need to be
 	// killed and recreated, and init containers should be purged.
 	if createPodSandbox {
-		if !shouldRestartOnFailure(pod) && attempt != 0 {
+		if !shouldRestartOnFailure(pod) && attempt != 0 && len(podStatus.ContainerStatuses) != 0 {
Contributor

Should we also add a check for len(podStatus.InitContainerStatuses) != 0?

Contributor Author

This is a kubelet-internal type; it has no InitContainerStatuses field.

Contributor Author

The unit tests added show that this would work for init containers too.
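
A minimal sketch of the condition discussed in this thread, written as a standalone Go program rather than against the real kubelet packages. The types `restartPolicy` and `podStatus` and the function `shouldAttemptSandbox` are hypothetical stand-ins introduced for illustration; only the shape of the condition comes from the diff above. The sketch assumes that `shouldRestartOnFailure` is true for any restart policy other than Never, and that the internal status keeps init and regular containers in one list, which is why a single length check would cover both.

```go
package main

import "fmt"

// restartPolicy is a hypothetical stand-in for the pod's restart policy.
type restartPolicy string

const (
	restartAlways restartPolicy = "Always"
	restartNever  restartPolicy = "Never"
)

// podStatus is a hypothetical, trimmed-down model of the kubelet-internal
// status type. Assumption: one list holds the statuses of every container
// that was ever created for the pod, init containers included.
type podStatus struct {
	containerStatuses []string
}

// shouldRestartOnFailure mirrors the helper named in the diff.
// Assumption: it is true for every policy except Never.
func shouldRestartOnFailure(policy restartPolicy) bool {
	return policy != restartNever
}

// shouldAttemptSandbox reports whether sandbox creation should be tried
// (again). It gives up only when the pod must not restart, at least one
// sandbox attempt was already made, and some container was created.
func shouldAttemptSandbox(policy restartPolicy, attempt int, status podStatus) bool {
	if !shouldRestartOnFailure(policy) && attempt != 0 && len(status.containerStatuses) != 0 {
		return false
	}
	return true
}

func main() {
	noContainers := podStatus{}
	initOnly := podStatus{containerStatuses: []string{"init-1"}}

	// Never policy, earlier attempt failed before any container was created.
	fmt.Println(shouldAttemptSandbox(restartNever, 1, noContainers))
	// Never policy, an init container was already created.
	fmt.Println(shouldAttemptSandbox(restartNever, 1, initOnly))
	// Always policy, containers already created.
	fmt.Println(shouldAttemptSandbox(restartAlways, 1, initOnly))
}
```

Under these assumptions, only the second case declines to retry: a pod with restart policy Never keeps retrying sandbox creation until some container, init or regular, has been created.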

@mattjmcnaughton left a comment

/lgtm

Modulo the open question about init containers, this looks good to me! Thanks for the quick fix and for adding a test case.

@k8s-ci-robot added the lgtm label Jun 27, 2019
@yujuhong
Contributor Author

/retest

@yujuhong
Contributor Author

/cc @dashpole

@dashpole
Contributor

/lgtm
Thanks for the fix!

@yujuhong
Contributor Author

/retest

I don't think this change would affect pull-kubernetes-kubemark-e2e-gce-big.

@k8s-ci-robot k8s-ci-robot merged commit b51f621 into kubernetes:master Jun 27, 2019
k8s-ci-robot added a commit that referenced this pull request Jun 28, 2019
…51-upstream-release-1.14

Automated cherry pick of #79451: kubelet: retry pod sandbox creation when containers were
k8s-ci-robot added a commit that referenced this pull request Jun 28, 2019
…51-upstream-release-1.15

Automated cherry pick of #79451: kubelet: retry pod sandbox creation when containers were
@johnpipi
johnpipi commented Jan 6, 2020

Which version of Kubernetes includes this fix?

@ashb
ashb commented Oct 12, 2020

Which version of Kubernetes includes this fix?

1.16+

Labels
approved, area/kubelet, cncf-cla: yes, kind/bug, lgtm, priority/critical-urgent, release-note, sig/node, size/M

Successfully merging this pull request may close these issues.

kubelet won't retry PodSandbox creation for pods with restart policy "Never"