-
Notifications
You must be signed in to change notification settings - Fork 38.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated Regional PD failover test to use node taints instead of instance group deletion #69700
Updated Regional PD failover test to use node taints instead of instance group deletion #69700
Conversation
test/e2e/storage/regional_pd.go
Outdated
framework.ExpectNoError(framework.WaitForReadyNodes(c, nodeCount, framework.RestartNodeReadyAgainTimeout), | ||
"Error waiting for nodes from the new instance group to become ready.") | ||
framework.Logf("removing previously added node taints") | ||
nodesInZone, err = c.CoreV1().Nodes().List(metav1.ListOptions{LabelSelector: selector.String()}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to call this again or can you reuse the nodes list from above? Any error here could skip the taint removal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep the list can be reused. Thanks for the catch!
test/e2e/storage/regional_pd.go
Outdated
} | ||
} | ||
node.Spec.Taints = newTaints | ||
_, err := c.CoreV1().Nodes().Update(&node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might need to retry the update a few times. Node currently does a heartbeat every few seconds by updating the node object.
test/e2e/storage/regional_pd.go
Outdated
@@ -44,6 +48,7 @@ import ( | |||
const ( | |||
pvDeletionTimeout = 3 * time.Minute | |||
statefulSetReadyTimeout = 3 * time.Minute | |||
taintKey = "zoneTaint" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make this key name more specific to make it easier to debug. Maybe use the test namespace name?
2908853
to
38c5866
Compare
Addressed comments |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, verult The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…nce group deletion
38c5866
to
8ee882a
Compare
Rebased and resolved an import conflict. PTAL |
/lgtm |
/kind failing-test |
/retest |
…-release-1.12 Automated cherry pick of #69700: Updated Regional PD failover test to use node taints instead
What this PR does / why we need it: The existing failover mechanism (deleting GCE managed instance groups) relies on the wrong NodeInstanceGroup parameter format from the test context for regional clusters, and there is no simple way to continue using the instance group technique.
This PR fixes the problem and also addresses a previous TODO left in the test by converting to tainting nodes instead.
Need to be cherry-picked to 1.12
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #69005
Release note:
/sig storage
/assign @msau42