Race-condition when starting job schedule #604

Pharb · 2023-07-11T09:39:08Z

Problem

Jobs sometimes stop being executed according to their configured schedule.

Workaround

After a server restart, jobs run again as expected.

Cause

The suspected problem is in SchedulePing.checkActiveSchedule(). When startAllJobs() takes longer than the check interval, a new iteration of checkActiveSchedule() begins and runs startAllJobs() concurrently, because startedJobs has not yet been set to true. This causes stop() to be called on the JobScheduler, which sets the jobExecutor to stopped. That in turn prevents the jobExecutor from decrementing its running counter at the end of its execute function.

To solve this, either the stopped flag in the jobExecutor needs to be cleared when the JobScheduler starts, or the start function must be guaranteed to run only once. Either way, it would make sense to allow only one execution of checkActiveSchedule() at a time.
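
A minimal sketch of the "run start only once" option, not the project's actual code. It assumes startedJobs can be turned into an AtomicBoolean that is claimed before startAllJobs() runs. The class SchedulePingSketch, the startCalls counter, the simulated 50 ms delay, and the runOverlappingChecks helper are all hypothetical and exist only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for SchedulePing; only the guard logic is of interest.
class SchedulePingSketch {
    // Claimed atomically BEFORE the slow start, so an overlapping check sees it.
    private final AtomicBoolean startedJobs = new AtomicBoolean(false);
    // Counts how often startAllJobs() really ran (illustration only).
    private final AtomicInteger startCalls = new AtomicInteger();

    void checkActiveSchedule() {
        // compareAndSet succeeds for exactly one caller; all others skip the start.
        if (startedJobs.compareAndSet(false, true)) {
            startAllJobs();
        }
    }

    private void startAllJobs() {
        startCalls.incrementAndGet();
        try {
            Thread.sleep(50); // simulate a start that outlasts the check interval
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int startCalls() {
        return startCalls.get();
    }

    // Fires `checks` overlapping calls and returns how many starts happened.
    static int runOverlappingChecks(int checks) {
        SchedulePingSketch ping = new SchedulePingSketch();
        ExecutorService pool = Executors.newFixedThreadPool(checks);
        CountDownLatch go = new CountDownLatch(1);
        for (int i = 0; i < checks; i++) {
            pool.execute(() -> {
                try {
                    go.await(); // release all threads at the same moment
                    ping.checkActiveSchedule();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        go.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ping.startCalls();
    }
}
```

One trade-off of this sketch: because the flag is set before startAllJobs() finishes, a failed start would leave startedJobs at true, so a real fix would need to reset the flag on failure or serialize checkActiveSchedule() with a lock instead.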

Reproduce

See #603 for a test case demonstrating this race-condition.

Pharb mentioned this issue Jul 11, 2023

Prevent race-condition causing job schedule starting multiple times #603

Merged

NiklasEi closed this as completed in bc0e219 Jul 14, 2023
