Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cronjob controller page list err #77475

Merged

Conversation

@liucimin
Copy link
Contributor

commented May 6, 2019

What type of PR is this?

/kind bug

What this PR does / why we need it:
When the number of jobs exceeds 500, cronjob cannot schedule, bug of pager.List

Which issue(s) this PR fixes:

Fixes ##77465

Special notes for your reviewer:

Does this PR introduce a user-facing change?:
NONE

When the number of jobs exceeds 500, cronjob should schedule without error.
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 6, 2019

Hi @liucimin. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested review from feiskyer, MrHohn and kubernetes/dep-approvers May 6, 2019

@liucimin liucimin closed this May 6, 2019

@liucimin liucimin force-pushed the liucimin:fix_cronjob_controller_pageerr branch from 8c2d0ee to c516bb5 May 6, 2019

@k8s-ci-robot k8s-ci-robot added approved size/XS and removed size/L labels May 6, 2019

@liucimin liucimin reopened this May 6, 2019

@cblecker cblecker removed the request for review from kubernetes/dep-approvers May 6, 2019

@liucimin

This comment has been minimized.

Copy link
Contributor Author

commented May 6, 2019

/kind bug
/sig apps

@jingyih
Copy link
Contributor

left a comment

Thanks for fixing the bug! Added few comments.

err = meta.EachListItem(jlTmp, func(object runtime.Object) error {
jobTmp, ok := object.(*batchv1.Job)
if !ok {
return fmt.Errorf("Object must be a batchv1.Job.")

This comment has been minimized.

Copy link
@jingyih

jingyih May 6, 2019

Contributor

Maybe helpful to also print the actual type of the object if it is not batchv1.Job?

This comment has been minimized.

Copy link
@liucimin

liucimin May 7, 2019

Author Contributor

Thx for the comments.
I have print the actual type in the new version.

if !ok {
utilruntime.HandleError(fmt.Errorf("expected type *batchv1.JobList, got type %T", jlTmp))

js := make([]batchv1.Job, 0)

This comment has been minimized.

Copy link
@jingyih

jingyih May 6, 2019

Contributor

maybe use make([]batchv1.Job, 0, len(jlTmp) to avoid re-allocating and copying when append happens?

This comment has been minimized.

Copy link
@liucimin

liucimin May 7, 2019

Author Contributor

Because of the comment from jpbetz , i use the pager.New(...).EachListItem().
So i could not get the length of the jobs before EachListItem.

})

if err != nil {
utilruntime.HandleError(fmt.Errorf("Failed to extract job list from %v", jlTmp))

This comment has been minimized.

Copy link
@jingyih

jingyih May 6, 2019

Contributor

include err in error message.

This comment has been minimized.

Copy link
@liucimin

liucimin May 7, 2019

Author Contributor

Ok.

pkg/controller/cronjob/cronjob_controller.go Outdated Show resolved Hide resolved
@@ -109,19 +110,29 @@ func (jm *Controller) syncAll() {
jobListFunc := func(opts metav1.ListOptions) (runtime.Object, error) {
return jm.kubeClient.BatchV1().Jobs(metav1.NamespaceAll).List(opts)
}

jlTmp, err := pager.New(pager.SimplePageFunc(jobListFunc)).List(context.Background(), metav1.ListOptions{})

This comment has been minimized.

Copy link
@jpbetz

jpbetz May 6, 2019

Contributor

Just call pager.New(...).EachListItem() instead?

This comment has been minimized.

Copy link
@jpbetz

jpbetz May 6, 2019

Contributor

And no point building up a slice just to do for _, sj := range sjs { } further below (for the CronJobs). Just replace the for with the EachListItem() call.

This comment has been minimized.

Copy link
@liucimin

liucimin May 7, 2019

Author Contributor

Thx for your comment.
I use the EachListItem to avoid the for _, sj := range sjs { }.

@jpbetz

This comment has been minimized.

Copy link
Contributor

commented May 6, 2019

/ok-to-test

@liucimin liucimin force-pushed the liucimin:fix_cronjob_controller_pageerr branch from 619522b to f68b12e May 7, 2019

@jpbetz

This comment has been minimized.

Copy link
Contributor

commented May 7, 2019

/lgtm

@jingyih

This comment has been minimized.

Copy link
Contributor

commented May 7, 2019

lgtm. Thanks!

@MrHohn
MrHohn approved these changes May 7, 2019
Copy link
Member

left a comment

/approve

@jpbetz

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

/assign @soltysh @smarterclayton Could we get approval for this one? It would be good to try to fix it for 1.15.

@soltysh
Copy link
Contributor

left a comment

/approve
/priority important-longterm

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liucimin, MrHohn, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@smarterclayton
Copy link
Contributor

left a comment

Also please add a test that proves this works with >page size cronjobs

syncOne(&sj, jobsBySj[sj.UID], time.Now(), jm.jobControl, jm.sjControl, jm.recorder)
cleanupFinishedJobs(&sj, jobsBySj[sj.UID], jm.jobControl, jm.sjControl, jm.recorder)
if err != nil {
utilruntime.HandleError(fmt.Errorf("Failed to extract cronJobs list: %v", err))

This comment has been minimized.

Copy link
@smarterclayton

smarterclayton May 30, 2019

Contributor

This should be “CronJobs”

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

This is a bug and isn’t part of code freeze

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

We will backport this no matter what

@jpbetz

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

This is a bug and isn’t part of code freeze

Agreed, just getting the ball rolling. Thanks for the review.

@BenTheElder

This comment has been minimized.

Copy link
Member

commented May 31, 2019

^^^ podutils flake, the tests passed
/retest

@k8s-ci-robot k8s-ci-robot merged commit 6feea43 into kubernetes:master May 31, 2019

21 checks passed

cla/linuxfoundation liucimin authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-node-e2e-containerd Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
@tanoda

This comment has been minimized.

Copy link

commented Jun 12, 2019

I want to add a keyword for search: expected type *batchv1.JobList, got type *internalversion.List on cronjob_controller.go

@puja108

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

We are experiencing the issue (#77465) in production on 1.14, but it seems this is currently only included in the to be released 1.15. Any plans to cherrypick for 1.14?

@jpbetz

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

We are experiencing the issue (#77465) in production on 1.14, but it seems this is currently only included in the to be released 1.15. Any plans to cherrypick for 1.14?

Yes (see #77475 (comment)). I've opened #79178. Issue was introduced on 1.14 so we only need the one cherrypick.

k8s-ci-robot added a commit that referenced this pull request Sep 11, 2019
Merge pull request #79178 from jpbetz/automated-cherry-pick-of-#77475…
…-origin-release-1.14

Automated cherry pick of #77475: fix the metainternalversion.List change error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.