-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cronjob status reconciliation when job template labels change #107997
Fix cronjob status reconciliation when job template labels change #107997
Conversation
Welcome @d-honeybadger! |
Hi @d-honeybadger. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/assign @soltysh |
/sig apps |
/ok-to-test This definitely requires a release note. |
For comparison, in the job controller we list all the pods for a similar reason: kubernetes/pkg/controller/job/job_controller.go Lines 558 to 560 in 3866cb9
So I don't see why we shouldn't do the same for the cronjob controller. I'll have to defer to @soltysh as I don't have the historical background. |
Ah yes, added. |
You should probably add a test too :) |
@janetkuo @kow3ns do you have any thoughts on #107997 (comment) ? |
The reason that we didn't do this for the original CronJob controller is that it doesn't use watches, see the ControllerRef design doc for more details. Given that CronJob v2 now uses watches, it makes sense to make the changes accordingly. |
Makes sense. @d-honeybadger could you add a unit test for the change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll dig through perf data and see how good/bad this is, alternatively we could add a separate indexer for jobs, if we needed, but that can be done separately from this PR.
But I definitely would like to see unit test added like @alculquicondor mentioned
/approve
/triage accepted
/priority important-longterm
/assign @alculquicondor
he has the final lgtm on it
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: d-honeybadger, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
e62e462
to
63f9f1f
Compare
@alculquicondor 👍 Added a test and a comment for good measure. |
Name: "foo-nonmatching-label", | ||
Labels: map[string]string{"key": "different-value"}, | ||
OwnerReferences: []metav1.OwnerReference{{Name: "fooer", Controller: &trueRef}}}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add one that has a different owner reference.
63f9f1f
to
fb094dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Thanks!
Thanks for fixing this! I'm curious if this bug will get backported to any older versions of k8s? Or just released in the 1.24 release? |
This qualifies for backporting as it's a regression. |
@alculquicondor not done this before so let me know if I got something wrong. But I've created 3 cherrypick PRs, one each for 1.21, 1.22, 1.23. I hope that's helpful :) |
…-of-#107997-upstream-release-1.22 Automated cherry pick of #107997: cronjob_controllerv2: do not filter jobs to be reconciled by
…-of-#107997-upstream-release-1.23 Automated cherry pick of #107997: cronjob_controllerv2: do not filter jobs to be reconciled by
…-of-#107997-upstream-release-1.21 Automated cherry pick of #107997: cronjob_controllerv2: do not filter jobs to be reconciled by
What type of PR is this?
/kind bug
/kind regression
What this PR does / why we need it:
In CronJobController V2, the controller only considers jobs with labels that match current cronjob's job template labels. This leads to the controller losing track of jobs owned by a given cronjob when those template labels change. For example, if job template labels were updated while the cronjob was in the
active
status, that cronjob will be stuck asactive
, since the controller will never find the completed job whose labels now don't match those of the job template.Which issue(s) this PR fixes:
Fixes #106496
Special notes for your reviewer:
I see that in #93370 filtering by label was added for performance. If anyone has ideas on how to keep that performance improvement while maintaining correct behavior, I'd be happy to close this and work on another solution.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: