Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cron): minimum interval #13365

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Joibel
Copy link
Member

@Joibel Joibel commented Jul 19, 2024

A new field in CronWorkflows minimumInterval which specifies a minimum monotonic time between workflow creations.

Motivation

This has two main uses I can think of

  • allow CronWorkflows which on first glance will run once a day to do so even in a daylight saving observing timezone.
  • allow "run every interval" workflows which run rapidly on first injection into a cluster and as soon as possible again if interval has passed after downtime of the controller.

This is also possible a generically useful change now that we have schedules which allow for much more complex scheduling.

Modifications

If we attempt to schedule inside minimum interval do not do so.

Verification

New e2e test

Tested both of the next DST changes forward and backward against the CronWorkflows in the documentation.

A new field in CronWorkflows `minimumInterval` which specifies a
minimum monotonic time between workflow creations.

This has two main uses I can think of
* allow CronWorkflows which on first glance will run once a day to do
so even in a daylight saving observing timezone.
* allow "run every interval" workflows which run rapidly on first
injection into a cluster and as soon as possible again if interval has
passed after downtime of the controller.

This is also possible a generically useful change now that we have `schedules`
which allow for much more complex scheduling.

If we attempt to schedule inside minimum interval do not do so.

New e2e test

Signed-off-by: Alan Clucas <alan@clucas.org>
Copy link
Member

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usefulness, esp in the DST example, makes sense. However, I'm wondering if there is precedent for this in other cron implementations? Since I imagine this has been handled before

@agilgur5 agilgur5 changed the title feat: minimum interval for CronWorkflows feat(cron): minimum interval Jul 19, 2024
@agilgur5 agilgur5 added the area/spec Changes to the workflow specification. label Jul 19, 2024
@Joibel
Copy link
Member Author

Joibel commented Jul 19, 2024

I did do some fairly extensive research into alternative golang schedulers, and didn't find any. Quartz (not implemented for go) didn't seem to do it either.

We could adopt/fork a scheduler and add it. But

  • implementing it either requires persistence (which is what I leverage here) or very complex algorithms in an already hard to implement interface.
  • I don't think we should do this for maintenance reasons anyway.

I thought about adopting gronx as an alternative cron scheduler. The user would have a choice per workflow. It would be relatively clean to do. But doesn't solve either problem this does.

If kubernetes ever move Cronjobs from robfig/cron we'd be best placed if we were able to follow them.

@agilgur5
Copy link
Member

Maybe a different question then, how do people currently work around this issue in existing schedulers? (other than, well, not using a TZ or time affected by DST)

@Joibel
Copy link
Member Author

Joibel commented Jul 19, 2024

Apart from those, I have no evidence of anyone doing it. It's a hard thing to research outside of cron, but no solutions for cron based scheduling anyway.

Copy link
Member

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few docs/spec suggestions

| `timezone` | Machine timezone | [IANA Timezone](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) to run `Workflows`. Example: `America/Los_Angeles` |
| `suspend` | `false` | If `true` Workflow scheduling will not occur. Can be set from the CLI, GitOps, or directly |
| `concurrencyPolicy` | `Allow` | What to do if multiple `Workflows` are scheduled at the same time. `Allow`: allow all, `Replace`: remove all old before scheduling new, `Forbid`: do not allow any new while there are old |
| `minimumInterval` | None | The minimum interval between executions of this workflow. Examples: `90s`, `10m`, `2h` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use durations in a few places already, but I learned recently that apparently it's k8s convention to not expose them in fields since they are Go specific: https://github.com/kubernetes/community/blob/fb55d44/contributors/devel/sig-architecture/api-conventions.md#units

It also aligns with the startingDeadlineSeconds that's used for CronWorkflows already (and inherits from upstream k8s)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, TIL. Nice to know that.

Comment on lines +44 to +45
| Option Name | Default Value | Description |
|:----------------------------:|:----------------------:|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if we didn't change the formatting for every row. In #13292 I specifically addressed that and tried to use a formatting that wouldn't require changing every other row if only one description changed

Comment on lines +112 to +141
#### Skip forward

You can use `minimumInterval` to schedule once per day, even if the time you want is in a daylight saving skip forward period where it would otherwise be scheduled twice.

An example 02:30:00 schedule

```yaml
schedules:
- 30 2 * * *
- 0 3 * * *
minimumInterval: 60m
```

The 3:00 run of the schedule will not be scheduled every day of the year except on the day when the clock leaps forward over 2:30.
In that case the 3:00 run will run.

#### Skip backwards (duplication)

You can use `minimumInterval` to schedule once per day, even if the time you want is in a daylight saving skip backwards period where it would otherwise not be scheduled.

An example 01:30:00 schedule

```yaml
schedule: 30 1 * * *
minimumInterval: 120m
```

This will schedule at the first 01:30 on a skip backwards change.
The second will not run because of `minimumInterval`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### Skip forward
You can use `minimumInterval` to schedule once per day, even if the time you want is in a daylight saving skip forward period where it would otherwise be scheduled twice.
An example 02:30:00 schedule
```yaml
schedules:
- 30 2 * * *
- 0 3 * * *
minimumInterval: 60m
```
The 3:00 run of the schedule will not be scheduled every day of the year except on the day when the clock leaps forward over 2:30.
In that case the 3:00 run will run.
#### Skip backwards (duplication)
You can use `minimumInterval` to schedule once per day, even if the time you want is in a daylight saving skip backwards period where it would otherwise not be scheduled.
An example 01:30:00 schedule
```yaml
schedule: 30 1 * * *
minimumInterval: 120m
```
This will schedule at the first 01:30 on a skip backwards change.
The second will not run because of `minimumInterval`.
#### `minimumInterval`
You can set a minimum interval between runs.
For example, if you want to ensure you only run once a day during a DST leap forward:
```yaml
schedules:
- 30 2 * * *
- 0 3 * * *
minimumInterval: 60m
\```
The 3:00 schedule will never run except on the day the clock leaps forward over 2:30.
In that case, only the 3:00 schedule will run and the 2:30 will not.
Another example, if you want to ensure you only run once a day during a DST leap backward:
```yaml
schedule: 30 1 * * *
minimumInterval: 120m
\```
The first run will be at 01:30.
On the day the clock leaps backward, the second will not run because of `minimumInterval`.
  • I think we can simplify this a bit and combine the two sections as two examples. We can also reduce the verbal descriptions as the examples are a lot clearer than them (I found the initial descriptions to be a bit confusing and verbose. The descriptions afterward are good, though I simplified those as well)
  • Use consistent language with "runs" vs. "schedule", and "leap" vs "skip"
    • "leap" has a less technical terminology and is used often for DST (and infrequently for non-DST), so I think works a bit better as specific to DST

| `timezone` | Machine timezone | [IANA Timezone](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) to run `Workflows`. Example: `America/Los_Angeles` |
| `suspend` | `false` | If `true` Workflow scheduling will not occur. Can be set from the CLI, GitOps, or directly |
| `concurrencyPolicy` | `Allow` | What to do if multiple `Workflows` are scheduled at the same time. `Allow`: allow all, `Replace`: remove all old before scheduling new, `Forbid`: do not allow any new while there are old |
| `minimumInterval` | None | The minimum interval between executions of this workflow. Examples: `90s`, `10m`, `2h` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `minimumInterval` | None | The minimum interval between executions of this workflow. Examples: `90s`, `10m`, `2h` |
| `minimumInterval` | None | The minimum time between runs. Examples: `90s`, `10m`, `2h` |

simplify the description and consistently use "runs". Also use "time" (vs "interval") like the field description, which also indicates the units.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cron-workflows area/spec Changes to the workflow specification.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants