
fix(scheduler) Make schedule changes take effect immediately #590

Merged: markstory merged 1 commit into main from feat-schedule-reset on Apr 15, 2026

Conversation

@markstory (Member) commented:

We had feedback a few months back that changes in schedules should take effect immediately, and these changes implement that with backwards compatibility for existing run state.

The scenario that came up: a task was moved from every 3 hours to every 10 minutes, and the developer was surprised when they had to wait 2+ hours to see the task run again.

Refs STREAM-676

@markstory markstory requested a review from a team as a code owner April 8, 2026 13:47
@linear-code (Bot) commented Apr 8, 2026:

Comment on lines +83 to +91
legacy_keys = [sk.rsplit(":", 1)[0] for sk in storage_keys]

new_values = self._redis.mget([self._make_key(sk) for sk in storage_keys])
legacy_values = self._redis.mget([self._make_key(lk) for lk in legacy_keys])

run_times: dict[str, datetime | None] = {}
for storage_key, new_val, legacy_val in zip(storage_keys, new_values, legacy_values):
raw = new_val if new_val is not None else legacy_val
run_times[storage_key] = datetime.fromisoformat(raw) if raw else None

Bug: The backwards-compatibility fallback for legacy run state can prevent a task from running immediately after a schedule change, delaying its execution until the new interval has passed.
Severity: MEDIUM

Suggested Fix

The backwards-compatibility logic should be adjusted to ignore the legacy last_run value if the schedule has changed. One approach is to have read_many not perform the fallback at all, forcing an immediate run on the new schedule. Alternatively, the calling function _load_last_run could compare the schedule ID from the new key with the schedule that produced the legacy data (if possible to determine) and ignore the legacy data if they differ.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: clients/python/src/taskbroker_client/scheduler/runner.py#L83-L91

Potential issue: The backwards-compatibility fallback in `read_many()` can prevent a
task from running immediately after a schedule change. If a task ran recently under an
old schedule, its last run time is stored in a legacy-formatted Redis key. When the code
is updated and the schedule is changed simultaneously, the new logic will read the last
run time from the legacy key. This causes the `is_due()` calculation to incorrectly
delay the task's next execution, respecting the old run time instead of triggering
immediately, which contradicts the feature's goal.


Quoting the read_many() docstring:

    Retrieve last run times in bulk.

    storage_keys are the new-format keys including the schedule_id suffix
    (e.g. "test:valid:300"). Falls back to the legacy key (derived by
A reviewer (Member) commented:

This is a contrived example, but:

storage_keys = [`test:val:300`, `test:val:3000`]
legacy_keys = [`test:val`, `test:val`]

new_values = [123, None] # No value for 3000
legacy_values = [321, 321] # We have a legacy value

run_times[`test:val:300`] = 123
run_times[`test:val:3000`] = 321

Is this scenario possible? And if so, does this result make sense?
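The contrived example can be reproduced directly from the fallback logic in the diff, using a plain dict as an assumed stand-in for the Redis mget calls:

```python
# In-memory stand-in for Redis: a value exists for the old 300s key and
# for the legacy key, but not for the new 3000s key.
store = {"test:val:300": "123", "test:val": "321"}

storage_keys = ["test:val:300", "test:val:3000"]
legacy_keys = [sk.rsplit(":", 1)[0] for sk in storage_keys]  # both "test:val"

new_values = [store.get(k) for k in storage_keys]    # ["123", None]
legacy_values = [store.get(k) for k in legacy_keys]  # ["321", "321"]

run_times = {}
for sk, new_val, legacy_val in zip(storage_keys, new_values, legacy_values):
    # Fallback: use the legacy value whenever the new key is empty.
    run_times[sk] = new_val if new_val is not None else legacy_val

print(run_times)  # {'test:val:300': '123', 'test:val:3000': '321'}
```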

markstory (Member Author) replied:

> Is this scenario possible? And if so, does this result make sense?

If I understand correctly, a task has both a legacy + new key (test:val and test:val:300) and then the schedule is changed from 300 -> 3000? In that scenario we shouldn't have both test:val:300 and test:val:3000 in storage_keys because test:val is the scheduled task name, and task names would be unique.

Another reviewer (Member) commented:

But isn't it possible to create such a scenario anyways? For example, if the scheduler command has...

scheduler.add(
    "simple-task", {"task": "test:val", "schedule": timedelta(seconds=300)}
)

...

scheduler.add(
    "simple-task", {"task": "test:val", "schedule": timedelta(seconds=3000)}
)

... wouldn't this recreate the example Evan proposed?

markstory (Member Author) replied:

Yeah that scenario would trigger the problem. Currently having two schedule entries with the same task and different schedules results in only one of the schedules being followed because the storage key is shared.

We currently prevent tasks from having two different schedules in sentry with a test

https://github.com/getsentry/sentry/blob/41423c6cbfd603d2c63a7589d76c4f0dca2e6014/tests/sentry/taskworker/test_config.py#L30-L40

So the problematic scenario has been prevented for now. As we onboard more applications, though, we should make this more resilient to operator error. Using the schedule name as the storage key would help with that. I can follow up on these changes with that improvement.
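The proposed follow-up, keying storage on the schedule entry name as well, might look roughly like this (a hypothetical sketch; `make_entry_key` is an illustrative name, not the actual API):

```python
from datetime import timedelta

def make_entry_key(entry_name: str, task_name: str, interval: timedelta) -> str:
    # Embedding the entry name makes the key unique per schedule entry, so
    # two entries sharing a task name no longer collide, and changing the
    # entry name, task name, or timing all reset the schedule state.
    return f"{entry_name}:{task_name}:{int(interval.total_seconds())}"

k1 = make_entry_key("simple-task", "test:val", timedelta(seconds=300))
k2 = make_entry_key("other-task", "test:val", timedelta(seconds=300))
print(k1)        # simple-task:test:val:300
print(k1 != k2)  # True: same task, distinct entries, no shared key
```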

@markstory markstory merged commit 7c53d6a into main Apr 15, 2026
33 of 34 checks passed
@markstory markstory deleted the feat-schedule-reset branch April 15, 2026 19:57
markstory added a commit that referenced this pull request Apr 16, 2026
In #590 I made schedule storage keys embed their schedule data, and
another issue was identified that a task with two schedules could have
an overlapping storage key.

These changes embed the schedule entry name into the storage key to
ensure that schedule timing is unique for each entry. Now changing the
schedule name, task name, or schedule timing will reset the schedule
timing.

Refs STREAM-676
markstory added a commit that referenced this pull request Apr 22, 2026
In #590 I made schedule storage keys embed their schedule data, and
another issue was identified that a task with two schedules could have
an overlapping storage key.

These changes embed the schedule entry name into the storage key to
ensure that schedule timing is unique for each entry. Now changing the
schedule name, task name, or schedule timing will reset the schedule
timing.

Refs STREAM-676

* Fix compiler warning from new rust
