This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Referencing the drainTimeout value in the NodeDrainer daemonset #1731

HarryStericker
Contributor

In a previous PR (#1722) I attempted to stop the node drainer from being scheduled onto cordoned nodes once the timeout (300s/5m) had elapsed. This matters to me because otherwise nodeDrainer tries to redeploy itself onto the node after the 5 minutes have elapsed and goes into a restart loop, since the kubelet prevents anything from being scheduled there. When this happens, our alerts go off.

This proved more difficult than planned, and I cannot think of any way to do it other than placing a taint on all nodes that the daemonset does not tolerate. IMO that is not feasible and far too heavyweight.

We already have drainTimeout specified in cluster.yml, so it makes sense to use this value in the drain command as well. That way I can increase the timeout, allowing the node to drain and terminate within the allotted time.
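
To illustrate the idea, here is a minimal sketch of deriving the drain command's timeout from a configured value instead of hard-coding it. It is not the actual daemonset template changed in this PR: the helper name `build_drain_command`, the assumption that drainTimeout is expressed in minutes, and the other flags on the command line are all illustrative.

```python
import shlex
from typing import List


def build_drain_command(node_name: str, drain_timeout_minutes: int) -> List[str]:
    """Build a `kubectl drain` argv whose --timeout follows the configured value.

    Hypothetical helper: assumes drainTimeout is given in minutes and
    converts it to seconds for kubectl's --timeout duration flag.
    """
    if drain_timeout_minutes <= 0:
        raise ValueError("drain timeout must be a positive number of minutes")
    timeout_seconds = drain_timeout_minutes * 60
    return [
        "kubectl", "drain", node_name,
        "--ignore-daemonsets",  # illustrative flag, not taken from the PR
        "--force",              # illustrative flag, not taken from the PR
        f"--timeout={timeout_seconds}s",
    ]


if __name__ == "__main__":
    # Raising the configured value from 5 to 10 minutes changes the command
    # without touching the template that produces it.
    print(shlex.join(build_drain_command("example-node", 10)))
```

The point of the sketch is only that the timeout lives in one place, so raising it in the configuration also raises it in the drain command.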

@k8s-ci-robot added the "cncf-cla: yes" label (Indicates the PR's author has signed the CNCF CLA.) on Sep 11, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign danielfm
You can assign the PR to them by writing /assign @danielfm in a comment when ready.

@k8s-ci-robot added the "size/XS" label (Denotes a PR that changes 0-9 lines, ignoring generated files.) on Sep 11, 2019
@codecov-io

Codecov Report

Merging #1731 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1731   +/-   ##
=======================================
  Coverage   25.14%   25.14%           
=======================================
  Files          98       98           
  Lines        5027     5027           
=======================================
  Hits         1264     1264           
  Misses       3621     3621           
  Partials      142      142

Last update 9fe0077...484b691.

@dominicgunn added this to the v0.15.0 milestone on Sep 12, 2019
@dominicgunn
Contributor

/lgtm

@k8s-ci-robot added the "lgtm" label (Indicates that a PR is ready to be merged.) on Sep 12, 2019
@HarryStericker changed the title from "Utilizing the parameterized drain timeout value" to "Referencing the drainTimeout value in the NodeDrainer daemonset" on Sep 12, 2019
@davidmccormick merged commit 985e0a7 into kubernetes-retired:master on Sep 18, 2019
@davidmccormick
Contributor

Many thanks for your contribution! 🙏
