refactor: Hoist idle-canister drop in RoundSchedule::start_iteration#10288
Open
alin-at-dfinity wants to merge 2 commits into
Open
refactor: Hoist idle-canister drop in RoundSchedule::start_iteration#10288alin-at-dfinity wants to merge 2 commits into
RoundSchedule::start_iteration#10288alin-at-dfinity wants to merge 2 commits into
Conversation
Move the "drop idle canisters with 0-100 AP from the subnet schedule" logic out of the `NextExecution::None` branch of the main per-canister loop and into a dedicated pre-loop at the top of `start_iteration`. Behavior is unchanged for canisters that are present in both `canister_states` and `subnet_schedule`: the same set of idle canisters with priorities in the 0-100 AP range get dropped. As a side benefit, the new pre-loop iterates over `subnet_schedule`, so it also cleans up orphan entries for canisters that no longer exist in `canister_states` (those are otherwise pruned by `garbage_collect_subnet_schedule()`, so this is a no-op in practice). Also clarify the doc comment for `IterationSchedule::partition_canisters_to_cores`. This is a small standalone refactor extracted from a larger `CanisterStates` integration, where the main per-canister loop will switch from iterating all canisters to iterating only hot canisters (at which point hoisting becomes a correctness requirement: cold canisters would otherwise no longer be visited by the main loop and their idle entries would not be dropped). Co-authored-by: Cursor <cursoragent@cursor.com>
In `RoundSchedule::start_iteration`, the two calls that construct `CanisterRoundState::new(canister, subnet_schedule.get_mut(*canister_id))` only need a `&CanisterPriority`, so the `get_mut` is purely there for its insertion side effect (it inserts a default entry if the canister is not yet in the schedule). This insertion is redundant: `RoundSchedule::finish_round` calls `subnet_schedule.get_mut(*canister_id)` for every entry in `self.scheduled_canisters` (which is a superset of the canisters that go through these two match arms), and that's the first time anything observes `subnet_schedule.iter()` / `len()`. No code between `start_iteration` and `finish_round` observes the contents of the schedule (the only intermediate reader, `abort_paused_executions_above_limit`, uses `SubnetSchedule::get`, which is tolerant of missing entries). Switch the two read-only sites to `get`. The `get_mut` immediately above (line 333) that decrements rate-limited canisters' AP by 100% stays as is, since it does an actual mutation. Co-authored-by: Cursor <cursoragent@cursor.com>
alin-at-dfinity
commented
May 22, 2026
| return None; | ||
| } | ||
| CanisterRoundState::new(canister, subnet_schedule.get_mut(*canister_id)) | ||
| CanisterRoundState::new(canister, subnet_schedule.get(canister_id)) |
Contributor
Author
There was a problem hiding this comment.
FYI: This replacement (and the similar one right below) result in a change in behavior: canisters no longer get a SubnetSchedule entry as soon as they are scheduled by start_iteration().
But this was accidentally left in (during development I had a version of CanisterRoundState::new() that actually mutated the provided CanisterPriority (don't remember why). And finish_round() will call get_mut() to create an entry for all scheduled canisters. And no one looks at the SubnetSchedule in-between.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Move the "drop idle canisters with 0-100 AP from the subnet schedule" logic out of the
NextExecution::Nonebranch of the main per-canister loop and into a dedicated pre-loop at the top ofstart_iteration. Behavior is unchanged: the same set of idle canisters with priorities in the 0-100 AP range get dropped.Also clarify the doc comment for
IterationSchedule::partition_canisters_to_cores.This is a small standalone refactor extracted from #10287, where the main per-canister loop will switch from iterating all canisters to iterating only hot canisters (at which point hoisting becomes a correctness requirement: cold canisters would otherwise no longer be visited by the main loop and their idle entries would not be dropped).