pkg/daemon: use upstream kubectl/pkg/drain #1571
Conversation
/retest
Get rid of openshift/kubernetes-drain and use the upstream kubectl/pkg/drain pkg. MAO has switched to it as well, and it's the right way forward since the old drain pkgs in openshift and cluster-api are obsolete. Signed-off-by: Antonio Murdaca <runcom@linux.com>
Can confirm that existing drain behaviour is the same.
/retest
/lgtm
/test e2e-gcp-op
/test e2e-gcp-upgrade
/retest Please review the full test history for this PR and help us cut down flakes.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ashcrow, runcom, sinnykumari, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
IgnoreAllDaemonSets: true,
DeleteLocalData:     true,
GracePeriodSeconds:  -1,
Timeout:             20 * time.Second,
It seems this isn't enough for certain pods like the router (we bumped it to 90s in master, and MAO still has it at 20s). This is related to https://bugzilla.redhat.com/show_bug.cgi?id=1843998: the router can't be evicted because this drain doesn't respect its 60m graceful termination period, and the error we get is about violating the PDB.
@michaelgugino @enxebre do you know more about this behavior, or what's best here? This patch regresses the behavior: before it we waited "forever", and now only 20s/90s. Do you have any input?
cc @danehans
Timeout here is just how long the drain library will run before giving up (and then drain will be run again on a subsequent requeue).
GracePeriodSeconds is the grace period that the eviction API will set if PDBs allow deleting the pod; -1 (IIRC) means use the pod's own specified grace period.
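That reading of the -1 sentinel can be sketched as a tiny, runnable example (my own model of the documented semantics, not the library's code; `effectiveGracePeriod` is a hypothetical helper):

```go
package main

import "fmt"

// effectiveGracePeriod models the semantics described above: a non-negative
// GracePeriodSeconds option overrides the pod's own
// terminationGracePeriodSeconds, while -1 means "use the pod's value".
func effectiveGracePeriod(optionSeconds int, podSeconds int64) int64 {
	if optionSeconds >= 0 {
		return int64(optionSeconds)
	}
	return podSeconds
}

func main() {
	// Router pod with a 60m (3600s) termination grace period.
	fmt.Println(effectiveGracePeriod(-1, 3600)) // prints 3600: pod's period respected
	fmt.Println(effectiveGracePeriod(20, 3600)) // prints 20: option overrides
}
```

Under this model, GracePeriodSeconds: -1 is the setting that preserves the router's 60m grace period; the short drain Timeout only bounds each attempt.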
Get rid of openshift/kubernetes-drain and use the upstream kubectl/pkg/drain pkg.
MAO has switched to it as well, and it's the right way forward since the old
drain pkgs in openshift and cluster-api are obsolete.
Closes #71 also
This is a pre-req to fix https://bugzilla.redhat.com/show_bug.cgi?id=1814241
Signed-off-by: Antonio Murdaca runcom@linux.com