Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update schedule logic to properly calculate missed schedules #118724

Merged
merged 1 commit into from Jun 27, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
24 changes: 22 additions & 2 deletions pkg/controller/cronjob/utils.go
Expand Up @@ -72,7 +72,8 @@ func deleteFromActiveList(cj *batchv1.CronJob, uid types.UID) {
// mostRecentScheduleTime returns:
// - the last schedule time or CronJob's creation time,
// - the most recent time a Job should be created or nil, if that's after now,
// - number of missed schedules
// - very rough approximation of number of missed schedules (based on difference
// between two schedules),
// - error in an edge case where the schedule specification is grammatically correct,
// but logically doesn't make sense (31st day for months with only 30 days, for example).
func mostRecentScheduleTime(cj *batchv1.CronJob, now time.Time, schedule cron.Schedule, includeStartingDeadlineSeconds bool) (time.Time, *time.Time, int64, error) {
Expand Down Expand Up @@ -106,10 +107,29 @@ func mostRecentScheduleTime(cj *batchv1.CronJob, now time.Time, schedule cron.Sc
if timeBetweenTwoSchedules < 1 {
return earliestTime, nil, 0, fmt.Errorf("time difference between two schedules is less than 1 second")
}
// this logic used for calculating number of missed schedules does a rough
// approximation, by calculating a diff between two schedules (t1 and t2),
// and counting how many of these will fit in between last schedule and now
timeElapsed := int64(now.Sub(t1).Seconds())
numberOfMissedSchedules := (timeElapsed / timeBetweenTwoSchedules) + 1
mostRecentTime := time.Unix(t1.Unix()+((numberOfMissedSchedules-1)*timeBetweenTwoSchedules), 0).UTC()

var mostRecentTime time.Time
// to get the most recent time accurate for regular schedules and the ones
// specified with @every form, we first need to calculate the potential earliest
// time by multiplying the initial number of missed schedules by its interval,
// this is critical to ensure @every starts at the correct time, this explains
// the numberOfMissedSchedules-1, the additional -1 serves there to go back
// in time one more time unit, and let the cron library calculate a proper
// schedule, for case where the schedule is not consistent, for example
// something like 30 6-16/4 * * 1-5
potentialEarliest := t1.Add(time.Duration((numberOfMissedSchedules-1-1)*timeBetweenTwoSchedules) * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon second thought, this will not work in few corner scenarios.If the schedule is such that if the difference between now and the actual most recent time is greater than twice the timeBetweenTwoSchedules, then mostRecentTime would be returned as empty. An example test case which would fail would be

{
			name: "complex schedule",
			cj: &batchv1.CronJob{
				ObjectMeta: metav1.ObjectMeta{
					CreationTimestamp: metav1TopOfTheHour,
				},
				Spec: batchv1.CronJobSpec{
					Schedule: "30 10,11,12 * * 1-5",
				},
				Status: batchv1.CronJobStatus{
					LastScheduleTime: &metav1HalfPastTheHour,
				},
			},
			now:                    *deltaTimeAfterTopOfTheHour(30*time.Hour + 30*time.Minute),
			expectedRecentTime:     deltaTimeAfterTopOfTheHour(26*time.Hour + 30*time.Minute),
			expectedEarliestTime:   *deltaTimeAfterTopOfTheHour(30 * time.Minute),
			expectedNumberOfMisses: 2,
		},

Copy link
Member

@helayoty helayoty Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expect the number of misses to be five (5) here?

  • 2016-05-19 T 10:30 (expectedEarliestTime)
  • 2016-05-19 T 11:30
  • 2016-05-19 T 12:30
  • 2016-05-20 T 10:30
  • 2016-05-20 T 11:30
  • 2016-05-20 T 12:30 (expectedRecentTime)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@helayoty yes that's correct, the expectedNumberOfMisses should be 5 but since numberOfMissedSchedules is an approximate, i was concentrating more on the failure with respective expectedRecentTime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment below about missed schedules.

for t := schedule.Next(potentialEarliest); !t.After(now); t = schedule.Next(t) {
mostRecentTime = t
}

if mostRecentTime.IsZero() {
return earliestTime, nil, numberOfMissedSchedules, nil
}
return earliestTime, &mostRecentTime, numberOfMissedSchedules, nil
}

Expand Down
90 changes: 90 additions & 0 deletions pkg/controller/cronjob/utils_test.go
Expand Up @@ -290,7 +290,9 @@ func TestByJobStartTime(t *testing.T) {
func TestMostRecentScheduleTime(t *testing.T) {
metav1TopOfTheHour := metav1.NewTime(*topOfTheHour())
metav1HalfPastTheHour := metav1.NewTime(*deltaTimeAfterTopOfTheHour(30 * time.Minute))
metav1MinuteAfterTopOfTheHour := metav1.NewTime(*deltaTimeAfterTopOfTheHour(1 * time.Minute))
oneMinute := int64(60)
tenSeconds := int64(10)

tests := []struct {
name string
Expand Down Expand Up @@ -346,6 +348,94 @@ func TestMostRecentScheduleTime(t *testing.T) {
expectedEarliestTime: *deltaTimeAfterTopOfTheHour(10 * time.Second),
expectedNumberOfMisses: 5,
},
{
name: "complex schedule",
cj: &batchv1.CronJob{
ObjectMeta: metav1.ObjectMeta{
CreationTimestamp: metav1TopOfTheHour,
},
Spec: batchv1.CronJobSpec{
Schedule: "30 6-16/4 * * 1-5",
},
Status: batchv1.CronJobStatus{
LastScheduleTime: &metav1HalfPastTheHour,
},
},
now: *deltaTimeAfterTopOfTheHour(24*time.Hour + 31*time.Minute),
expectedRecentTime: deltaTimeAfterTopOfTheHour(24*time.Hour + 30*time.Minute),
expectedEarliestTime: *deltaTimeAfterTopOfTheHour(30 * time.Minute),
expectedNumberOfMisses: 2,
},
{
name: "another complex schedule",
cj: &batchv1.CronJob{
ObjectMeta: metav1.ObjectMeta{
CreationTimestamp: metav1TopOfTheHour,
},
Spec: batchv1.CronJobSpec{
Schedule: "30 10,11,12 * * 1-5",
},
Status: batchv1.CronJobStatus{
LastScheduleTime: &metav1HalfPastTheHour,
},
},
now: *deltaTimeAfterTopOfTheHour(30*time.Hour + 30*time.Minute),
expectedRecentTime: nil,
expectedEarliestTime: *deltaTimeAfterTopOfTheHour(30 * time.Minute),
expectedNumberOfMisses: 30,
},
{
name: "complex schedule with longer diff between executions",
cj: &batchv1.CronJob{
ObjectMeta: metav1.ObjectMeta{
CreationTimestamp: metav1TopOfTheHour,
},
Spec: batchv1.CronJobSpec{
Schedule: "30 6-16/4 * * 1-5",
},
Status: batchv1.CronJobStatus{
LastScheduleTime: &metav1HalfPastTheHour,
},
},
now: *deltaTimeAfterTopOfTheHour(96*time.Hour + 31*time.Minute),
expectedRecentTime: deltaTimeAfterTopOfTheHour(96*time.Hour + 30*time.Minute),
expectedEarliestTime: *deltaTimeAfterTopOfTheHour(30 * time.Minute),
expectedNumberOfMisses: 6,
},
{
name: "complex schedule with shorter diff between executions",
cj: &batchv1.CronJob{
ObjectMeta: metav1.ObjectMeta{
CreationTimestamp: metav1TopOfTheHour,
},
Spec: batchv1.CronJobSpec{
Schedule: "30 6-16/4 * * 1-5",
},
},
now: *deltaTimeAfterTopOfTheHour(24*time.Hour + 31*time.Minute),
expectedRecentTime: deltaTimeAfterTopOfTheHour(24*time.Hour + 30*time.Minute),
expectedEarliestTime: *topOfTheHour(),
expectedNumberOfMisses: 7,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expectedNumberOfMisses should be four(4) only; (10:30, 14:30, 6:30, and 10:30)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware, the error comes from calculating the diff between schedules, which will depend on when it's calculated, so I wouldn't take much care into that value, I'm even inclined to drop that number entirely, if it's confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed on slack, in followup we'll get rid of exposing that information from this method entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#118940 is the promised followup dropping that number

},
{
name: "@every schedule",
cj: &batchv1.CronJob{
ObjectMeta: metav1.ObjectMeta{
CreationTimestamp: metav1.NewTime(*deltaTimeAfterTopOfTheHour(-59 * time.Minute)),
},
Spec: batchv1.CronJobSpec{
Schedule: "@every 1h",
StartingDeadlineSeconds: &tenSeconds,
},
Status: batchv1.CronJobStatus{
LastScheduleTime: &metav1MinuteAfterTopOfTheHour,
},
},
now: *deltaTimeAfterTopOfTheHour(7 * 24 * time.Hour),
expectedRecentTime: deltaTimeAfterTopOfTheHour((6 * 24 * time.Hour) + 23*time.Hour + 1*time.Minute),
expectedEarliestTime: *deltaTimeAfterTopOfTheHour(1 * time.Minute),
expectedNumberOfMisses: 167,
},
{
name: "rogue cronjob",
cj: &batchv1.CronJob{
Expand Down