Fix eks restart pods helm #10351

Merged: 1 commit, Mar 5, 2020

@tom-hadlaw-hs tom-hadlaw-hs commented Feb 26, 2020

Signed-off-by: Tom Hadlaw <thomas.hadlaw@hootsuite.com>

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Thanks for contributing!

Follow-up bugfixes for my previous pull request. The first problem is that Helm breaks the value of the docker --format argument because it treats it as a template expression. As a result, the conditional is always false and the container restart never happens on EKS nodes.
Secondly, upon further testing, it seems that kubelet can fail if the container is removed: (rpc error: code = Unknown desc = unable to inspect docker image ...).
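
To illustrate the first problem: Helm charts are rendered with Go templates, so a literal `{{ ... }}` meant for `docker --format` is evaluated by Helm before the script ever runs on the node. One common workaround is to emit the braces through string literals, for example `{{ "{{" }}`. The helper below is a minimal Python sketch of that escaping. The function name `escape_for_helm` and the sample format string are hypothetical and are not taken from this PR's diff.

```python
import re

# Matches either Go-template delimiter.
_DELIMS = re.compile(r"\{\{|\}\}")


def escape_for_helm(text: str) -> str:
    """Wrap every '{{' and '}}' in a Go-template string literal so that
    Helm prints the braces literally instead of evaluating them."""
    # A single regex pass ensures that braces introduced by the
    # replacement itself are not escaped a second time.
    return _DELIMS.sub(lambda m: '{{ "%s" }}' % m.group(0), text)


# Hypothetical value intended for a docker --format argument.
docker_format = "{{.Id}}"
print(escape_for_helm(docker_format))
```

Pasting the escaped string into a chart template should leave the original `{{.Id}}` in the rendered script, where docker can interpret it.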

I'm not exactly sure why this didn't happen during my first round of testing, but I propose to just kill the container and let kubelet manage the remaining container state (docker kill just sends SIGKILL to the container process, whereas rm removes all the container state as well). The container does get cleaned up afterwards.
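
The difference between the two approaches can be sketched as a small command builder. This is only an illustration in Python: the function name `restart_command` and its `remove` flag are hypothetical, and the chart's actual node-init script is not reproduced in this conversation.

```python
from typing import List


def restart_command(container_id: str, remove: bool = False) -> List[str]:
    """Build the docker argv used to force a container restart.

    remove=False: `docker kill`, which stops the process (SIGKILL by
                  default) and leaves the container state in place.
    remove=True:  `docker rm -f`, which also deletes the container state.
    """
    if remove:
        return ["docker", "rm", "-f", container_id]
    return ["docker", "kill", container_id]


# Dry run with a placeholder ID: print the command instead of executing it.
print(" ".join(restart_command("abc123")))
```

To actually run the command, the returned list could be passed to `subprocess.run`. Defaulting to `remove=False` mirrors the proposal above of killing the container and leaving cleanup to kubelet.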

fixes: #9571



@tom-hadlaw-hs tom-hadlaw-hs requested a review from a team as a code owner February 26, 2020 19:24
@tom-hadlaw-hs tom-hadlaw-hs requested a review from a team February 26, 2020 19:24
@maintainer-s-little-helper

Release note label not set, please set the appropriate release note.

@coveralls

coveralls commented Feb 26, 2020

Coverage Status

Coverage increased (+0.04%) to 45.687% when pulling e611448 on tom-hadlaw-hs:fix-eks-restart-pods into f1a8fc1 on cilium:master.

Member

@jrajahalme jrajahalme left a comment

Looks good, but please add "Fixes: #" to the commit message so that we can track backports effectively.

@jrajahalme
Member

Also add the description to the commit message, thanks!

Member

@aanm aanm left a comment

it seems cilium.yaml was added by accident

@aanm aanm added the pending-review and release-note/bug labels Feb 27, 2020
@tom-hadlaw-hs
Contributor Author

@jrajahalme I added the original issue from the first pull request

Fixes issue where Helm breaks the node init script when restartPods is enabled.

Fixes issue where removing container can cause issues with Kubelet scheduling.

fixes: cilium#9571

Signed-off-by: Tom Hadlaw <thomas.hadlaw@hootsuite.com>
@aanm
Member

aanm commented Mar 4, 2020

test-me-please

@aanm aanm requested a review from jrajahalme March 4, 2020 09:32
@joestringer
Member

test-me-please

@tgraf tgraf merged commit 99e6bc8 into cilium:master Mar 5, 2020
1.8.0 automation moved this from In progress to Merged Mar 5, 2020
@joestringer joestringer added this to Needs backport from master in 1.7.2 Mar 5, 2020
@tklauser tklauser moved this from Needs backport from master to Backport done to v1.7 in 1.7.2 Mar 23, 2020
Labels: release-note/bug (This PR fixes an issue in a previous release of Cilium.)
Projects: 1.7.2 (Backport done to v1.7), 1.8.0 (Merged)
Issues this pull request may close: DaemonSet Pods scheduled onto new Nodes sometimes are not managed
7 participants