New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API-initiated eviction: handle deleteOptions correctly #116554
API-initiated eviction: handle deleteOptions correctly #116554
Conversation
what PR regressed this? |
a7e51d6
to
4ca6333
Compare
} | ||
_, _, err := r.store.Delete(ctx, name, rest.ValidateAllObjectFunc, options) | ||
_, _, err = r.store.Delete(ctx, name, rest.ValidateAllObjectFunc, deleteOptionsCopy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one think that could be problematic is if somebody does update after our update, before we try to delete the pod. In that case we should do a rollback of the condition. Do we care about such a race? Should I implement the rollback? @liggitt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know... I thought we had a controller reconciling bogus conditions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a disruption controller which cleans up stale DisruptionTarget conditions:
func (dc *DisruptionController) syncStalePodDisruption(ctx context.Context, key string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I think, because it is an unlikely race, it should be enough to leave it to the controller
} | ||
_, _, err := r.store.Delete(ctx, name, rest.ValidateAllObjectFunc, options) | ||
_, _, err = r.store.Delete(ctx, name, rest.ValidateAllObjectFunc, deleteOptionsCopy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know... I thought we had a controller reconciling bogus conditions
@atiratree thank you for working on the fix. I think we should cherry-pick it to 1.26. |
4ca6333
to
2e5222c
Compare
+1 for the cherry-pick |
Thanks for the link. Is this a kubelet issue with kubemark? It seems like the test that failed is |
Filed #116696 for the race |
I think it's kubemark or the kubemark unit test |
} | ||
|
||
eviction := newV1Eviction(ns.Name, updatedPod.Name, deleteOption) | ||
err = clientSet.CoreV1().RESTClient().Post(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can use noRetriesRESTClient here as well and call the normal clientset Evict method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, thanks!
fbdd4e3
to
bda1607
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not an apiserver expert, but:
/lgtm
LGTM label has been added. Git tree hash: ccfc0a2e2ad90f7a66fcea47e11ddaa2ef8a40c7
|
/sig api-machinery |
when adding a DisruptionTarget condition into a pod that will be deleted - handle ResourceVersion and Preconditions correctly - handle DryRun option correctly Co-authored-by: Jordan Liggitt jordan@liggitt.net
bda1607
to
51c0e23
Compare
/lgtm |
LGTM label has been added. Git tree hash: 0a0e050fc81b12e21b5c9fe7efeb0f55dc36e63e
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: atiratree, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
once this passes tests, go ahead and open a cherry-pick to release-1.26 |
flaky test Test_Run_OneVolumeDetachFailNodeWithReadWriteOnce |
opened a cherry-pick #116750 |
…16554-upstream-release-1.26 Automated cherry pick of #116554: API-initiated eviction: handle deleteOptions correctly
What type of PR is this?
/kind bug
/kind regression
What this PR does / why we need it:
when adding a DisruptionTarget condition into a pod that will be deleted by an eviction
Which issue(s) this PR fixes:
Fixes #116552
Special notes for your reviewer:
more details in the issue
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: