Add CatchUp concurrency policy to Cronjob #79995

Open
kolorful opened this issue Jul 10, 2019 · 6 comments

@kolorful
commented Jul 10, 2019

What would you like to be added:

  1. Add a new CatchUp option to the concurrency policy in the CronJob spec. When it is chosen and the CronJob controller sees a list of recentUnmetScheduleTimes, it schedules from the oldest instead of the latest.
  2. Only enforce the cap of 100 recentUnmetScheduleTimes when concurrencyPolicy is Allow.
  3. Allow the CronJob controller to pass start_time and end_time arguments to the Job it creates. (Optional; this could be done in a separate issue.)
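
To make items 1 and 3 concrete, here is a minimal Python sketch of the selection step. It is not the controller's actual code: the function names (`unmet_schedule_times`, `pick_next`, `job_window`) are hypothetical, a fixed interval stands in for a real cron schedule, and the window passed to the Job is assumed to be the interval that ends at the scheduled time.

```python
from datetime import datetime, timedelta


def unmet_schedule_times(last_scheduled, now, interval):
    """Return every schedule time after last_scheduled and up to now, oldest first.

    Assumption: a fixed interval replaces cron-expression parsing.
    """
    times = []
    t = last_scheduled + interval
    while t <= now:
        times.append(t)
        t += interval
    return times


def pick_next(unmet, policy):
    """Pick the schedule time to start next, or None if nothing is unmet.

    The proposed CatchUp policy takes the oldest unmet time;
    every other policy takes the latest, as described in this issue.
    """
    if not unmet:
        return None
    return unmet[0] if policy == "CatchUp" else unmet[-1]


def job_window(scheduled, interval):
    """Hypothetical (start_time, end_time) pair handed to the created Job."""
    return scheduled - interval, scheduled


HOUR = timedelta(hours=1)
last_scheduled = datetime(2019, 7, 10, 0, 0)
now = datetime(2019, 7, 10, 3, 30)

unmet = unmet_schedule_times(last_scheduled, now, HOUR)  # 01:00, 02:00, 03:00
latest = pick_next(unmet, "Forbid")    # latest unmet time, 03:00
oldest = pick_next(unmet, "CatchUp")   # oldest unmet time, 01:00
start_time, end_time = job_window(oldest, HOUR)
```

With CatchUp, the Job created for 01:00 would receive the window 00:00 to 01:00, so it knows exactly which hour of data to process even though it starts late.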

Why is this needed:

  1. Currently, if any run is missed, the CronJob controller skips it immediately, no matter which concurrencyPolicy is in use. This behavior does not suit CronJobs that need to know the exact time range of the data they are going to process. For example, I have a CronJob that runs hourly and only processes the logs from the previous hour. Say that, due to some issue, the job temporarily takes 3 hours to run. With the current CronJob setup, the runs scheduled in the meantime are skipped, and we have to find and backfill them manually.

  2. The current cap of 100 does not seem very useful, because the scenario it tries to prevent cannot occur in the current code: no matter how concurrencyPolicy is set or how many recentUnmetScheduleTimes there are, we only pick the latest one. I honestly think this number should be lower and configurable.

  3. This is going to be essential for CatchUp behavior (see examples in Airflow).
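
The hourly example in item 1 can be played out with a toy simulation. This is a rough sketch under stated assumptions, not a model of the real controller: only one Job runs at a time, the schedule is a fixed one-hour interval, and `simulate` and `run_time` are hypothetical names. The Job scheduled for 01:00 is assumed to take 3 hours and every other Job 30 minutes.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)


def simulate(first, horizon, run_time, catch_up):
    """Return the schedule times that actually get a Job, in start order.

    first/horizon bound the schedule; run_time(t) gives how long the Job
    for schedule time t runs; catch_up selects oldest-first picking.
    """
    started = []
    last = first - HOUR   # last schedule time already handled
    free_at = first       # moment the previous Job finishes
    while True:
        # Earliest moment at which no Job is running and a schedule time is due.
        now = max(free_at, last + HOUR)
        if now > horizon:
            break
        unmet = []
        t = last + HOUR
        while t <= now:
            unmet.append(t)
            t += HOUR
        chosen = unmet[0] if catch_up else unmet[-1]
        started.append(chosen)
        last = chosen     # latest-first picking drops the older unmet times here
        free_at = now + run_time(chosen)
    return started


def run_time(scheduled):
    # Assumed durations: the 01:00 Job is slow, the others are quick.
    return timedelta(hours=3) if scheduled.hour == 1 else timedelta(minutes=30)


first = datetime(2019, 7, 10, 0, 0)
horizon = datetime(2019, 7, 10, 8, 0)
all_times = [first + k * HOUR for k in range(9)]

current = simulate(first, horizon, run_time, catch_up=False)
proposed = simulate(first, horizon, run_time, catch_up=True)

skipped_current = [t for t in all_times if t not in current]
skipped_proposed = [t for t in all_times if t not in proposed]
```

In this model the latest-first behavior never creates Jobs for 02:00 and 03:00, so those two hours of logs need a manual backfill. Oldest-first picking runs every schedule time and is back on schedule by 06:00.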

One alternative we considered is adding a boolean catchUp to the spec, but that doesn't make sense with concurrencyPolicy: Replace, which is why we picked this route.

We can work on this if it is something people want, but we'd like to check first whether it meets the standard.

@kolorful

Author

commented Jul 10, 2019

/sig apps

@k8s-ci-robot added sig/apps and removed needs-sig labels Jul 10, 2019

@kolorful

Author

commented Jul 10, 2019

@kubernetes/sig-apps-feature-requests

@k8s-ci-robot

Contributor

commented Jul 10, 2019

@kolorful: Reiterating the mentions to trigger a notification:
@kubernetes/sig-apps-feature-requests

In response to this:

@kubernetes/sig-apps-feature-requests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kow3ns

Member

commented Jul 15, 2019

The SIG would be interested in pursuing this idea if you could offer a KEP in SIG Apps in the [enhancements repo](https://github.com/kubernetes/enhancements).

@kow3ns

Member

commented Jul 15, 2019

/assign @mortent

@alice-sawatzky


commented Jul 16, 2019

The cap of 100 missed jobs has caused some serious problems on our infrastructure; I would be hugely in favor of being able to reconfigure or disable this limit, or of simply having it removed for CronJobs with a ConcurrencyPolicy of Forbid.
