fix(scheduler): reschedule job when worker channel is full or closed #105
Merged
pratyush618 merged 1 commit into master on May 2, 2026
`try_dispatch` was logging the channel-send failure and returning `Ok(true)` as if the job had been dispatched. The job had already been moved to `Running` and its execution-claim row written, so it sat in that state until the stale-job reaper timed it out — at which point it was reported to middleware and metrics as a *timeout failure*, the wrong outcome for a job that never executed. Distinguish the two failure modes (channel full vs closed) with separate warnings so operators can tell backpressure from shutdown, and route both through `rollback_claim_and_retry` to clear the claim and reset status to `Pending` with a 100ms delay. The next tick will dispatch normally once the worker pool drains or restarts. Two regression tests cover the closed-channel and full-channel paths.
Summary
P0-2 from the pre-release audit.
`try_dispatch` was silently dropping jobs when `try_send` to the worker pool channel failed. Because the job was already in `Running` state with an execution claim, the stale-job reaper would eventually time it out and report a timeout failure to middleware and metrics, which is the wrong outcome for a job that never ran.

What changed
`crates/taskito-core/src/scheduler/poller.rs`:
- Distinguish `TrySendError::Full` (worker pool is behind, i.e. backpressure) from `TrySendError::Closed` (worker pool shutting down) with separate log lines, so operators can tell the two states apart.
- Route both failures through the `rollback_claim_and_retry` helper, which clears the execution-claim row and resets status to `Pending` with a 100ms delay (`CHANNEL_BACKPRESSURE_RETRY_DELAY_MS`).
- Return `Ok(false)` from `try_dispatch` on dispatch failure so adaptive polling backs off, instead of returning `Ok(true)` as if the dispatch had succeeded.

The retry delay is short on purpose: 100ms is enough for the worker pool to drain a slot under steady-state backpressure, and the stale-job reaper window stays as the eventual safety net if rollback itself fails.
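The dispatch-failure path described above can be sketched roughly as follows. This is a minimal, self-contained approximation and not the code in `poller.rs`: `Store`, `JobRow`, `Status`, and the signatures of `try_dispatch` and `rollback_claim_and_retry` are hypothetical stand-ins. It also assumes the standard library's `std::sync::mpsc::sync_channel`, whose dropped-receiver variant is named `TrySendError::Disconnected` rather than the `TrySendError::Closed` named in this PR.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

// Hypothetical stand-ins for the real storage types, which this PR text does not show.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Pending,
    Running,
}

#[derive(Debug)]
struct JobRow {
    status: Status,
    has_claim: bool,   // stands in for the execution-claim row
    run_after_ms: u64, // earliest time the job may be dispatched again
}

// Constant name and value (100ms) are taken from the PR description.
const CHANNEL_BACKPRESSURE_RETRY_DELAY_MS: u64 = 100;

struct Store {
    jobs: HashMap<u64, JobRow>,
}

impl Store {
    /// Clear the claim and make the job eligible again after a short delay.
    fn rollback_claim_and_retry(&mut self, job_id: u64, now_ms: u64) {
        if let Some(job) = self.jobs.get_mut(&job_id) {
            job.has_claim = false;
            job.status = Status::Pending;
            job.run_after_ms = now_ms + CHANNEL_BACKPRESSURE_RETRY_DELAY_MS;
        }
    }
}

/// Returns Ok(true) only when the worker channel actually accepted the job.
fn try_dispatch(
    store: &mut Store,
    tx: &SyncSender<u64>,
    job_id: u64,
    now_ms: u64,
) -> Result<bool, String> {
    // Claim first, as the PR describes: status `Running` plus a claim row.
    let job = store.jobs.get_mut(&job_id).ok_or("unknown job")?;
    job.status = Status::Running;
    job.has_claim = true;

    match tx.try_send(job_id) {
        Ok(()) => Ok(true),
        Err(TrySendError::Full(_)) => {
            eprintln!("worker channel full (backpressure); rescheduling job {job_id}");
            store.rollback_claim_and_retry(job_id, now_ms);
            Ok(false)
        }
        Err(TrySendError::Disconnected(_)) => {
            eprintln!("worker channel closed (shutdown); rescheduling job {job_id}");
            store.rollback_claim_and_retry(job_id, now_ms);
            Ok(false)
        }
    }
}

fn main() {
    let mut store = Store { jobs: HashMap::new() };
    for id in [1, 2] {
        let row = JobRow { status: Status::Pending, has_claim: false, run_after_ms: 0 };
        store.jobs.insert(id, row);
    }

    // Capacity-1 channel: the first job takes the only slot, the second hits `Full`.
    let (tx, _rx) = sync_channel::<u64>(1);
    assert_eq!(try_dispatch(&mut store, &tx, 1, 0), Ok(true));
    assert_eq!(try_dispatch(&mut store, &tx, 2, 0), Ok(false));

    // Job 2 is back to `Pending` with no claim, so a later tick can pick it up.
    assert_eq!(store.jobs[&2].status, Status::Pending);
    assert!(!store.jobs[&2].has_claim);
    assert_eq!(store.jobs[&1].status, Status::Running);
}
```

Returning `Ok(false)` on both failure arms is what lets a caller treat the tick as "nothing dispatched", while the rollback keeps the job out of the reaper's timeout path.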
Tests
Two new regression tests in `crates/taskito-core/src/scheduler/mod.rs`:
- `test_try_dispatch_reschedules_on_closed_channel`: drops the receiver before tick, verifies job returns to `Pending` with no stale claim row.
- `test_try_dispatch_reschedules_on_full_channel`: pre-fills a capacity-1 channel with a sentinel job, verifies the same recovery on `TrySendError::Full`.

Test plan
- `cargo test --workspace`: all 80 tests pass + 2 new
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- `cargo check --workspace --features postgres` clean
- `cargo check --workspace --features redis` clean
- `uv run python -m pytest tests/python/`: 485 passed, 9 skipped
- `uv run ruff check py_src/ tests/` clean
- `uv run mypy py_src/taskito/` clean
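For readers who want to reproduce the two channel states that the regression tests rely on, the fixtures might look like the sketch below. The helper names `full_channel` and `closed_channel` are hypothetical, and the sketch assumes `std::sync::mpsc::sync_channel` instead of the channel type used in the crate, so the dropped-receiver case surfaces as `TrySendError::Disconnected` rather than `TrySendError::Closed`.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Capacity-1 channel whose only slot is occupied by a sentinel value.
/// The receiver is returned so the caller keeps it alive; dropping it
/// would turn the "full" state into "disconnected".
fn full_channel(sentinel: u64) -> (SyncSender<u64>, Receiver<u64>) {
    let (tx, rx) = sync_channel(1);
    tx.try_send(sentinel)
        .expect("an empty capacity-1 channel accepts one item");
    (tx, rx)
}

/// Channel whose receiver has already been dropped.
fn closed_channel() -> SyncSender<u64> {
    let (tx, rx) = sync_channel::<u64>(1);
    drop(rx);
    tx
}

fn main() {
    // Backpressure: the slot is taken and the receiver is still alive.
    let (tx, _rx) = full_channel(0);
    assert!(matches!(tx.try_send(42), Err(TrySendError::Full(42))));

    // Shutdown: nobody is left to receive.
    let tx = closed_channel();
    assert!(matches!(tx.try_send(42), Err(TrySendError::Disconnected(42))));
}
```

Both error variants hand the unsent value back to the caller, which is why a dispatcher can reschedule the job instead of losing it.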