Skip to content

refactor: Hoist idle-canister drop in RoundSchedule::start_iteration#10288

Open
alin-at-dfinity wants to merge 2 commits into
masterfrom
alin/scheduler-hoist-idle-canister-drop
Open

refactor: Hoist idle-canister drop in RoundSchedule::start_iteration#10288
alin-at-dfinity wants to merge 2 commits into
masterfrom
alin/scheduler-hoist-idle-canister-drop

Conversation

@alin-at-dfinity
Copy link
Copy Markdown
Contributor

Move the "drop idle canisters with 0-100 AP from the subnet schedule" logic out of the NextExecution::None branch of the main per-canister loop and into a dedicated pre-loop at the top of start_iteration. Behavior is unchanged: the same set of idle canisters with priorities in the 0-100 AP range get dropped.

Also clarify the doc comment for
IterationSchedule::partition_canisters_to_cores.

This is a small standalone refactor extracted from #10287, where the main per-canister loop will switch from iterating all canisters to iterating only hot canisters (at which point hoisting becomes a correctness requirement: cold canisters would otherwise no longer be visited by the main loop and their idle entries would not be dropped).

Move the "drop idle canisters with 0-100 AP from the subnet schedule"
logic out of the `NextExecution::None` branch of the main per-canister
loop and into a dedicated pre-loop at the top of `start_iteration`.

Behavior is unchanged for canisters that are present in both
`canister_states` and `subnet_schedule`: the same set of idle canisters
with priorities in the 0-100 AP range get dropped.

As a side benefit, the new pre-loop iterates over `subnet_schedule`, so
it also cleans up orphan entries for canisters that no longer exist in
`canister_states` (those are otherwise pruned by
`garbage_collect_subnet_schedule()`, so this is a no-op in practice).

Also clarify the doc comment for
`IterationSchedule::partition_canisters_to_cores`.

This is a small standalone refactor extracted from a larger
`CanisterStates` integration, where the main per-canister loop will
switch from iterating all canisters to iterating only hot canisters
(at which point hoisting becomes a correctness requirement: cold
canisters would otherwise no longer be visited by the main loop and
their idle entries would not be dropped).

Co-authored-by: Cursor <cursoragent@cursor.com>
@alin-at-dfinity alin-at-dfinity requested a review from a team as a code owner May 22, 2026 11:06
In `RoundSchedule::start_iteration`, the two calls that construct
`CanisterRoundState::new(canister, subnet_schedule.get_mut(*canister_id))`
only need a `&CanisterPriority`, so the `get_mut` is purely there for
its insertion side effect (it inserts a default entry if the canister
is not yet in the schedule).

This insertion is redundant: `RoundSchedule::finish_round` calls
`subnet_schedule.get_mut(*canister_id)` for every entry in
`self.scheduled_canisters` (which is a superset of the canisters that
go through these two match arms), and that's the first time anything
observes `subnet_schedule.iter()` / `len()`. No code between
`start_iteration` and `finish_round` observes the contents of the
schedule (the only intermediate reader, `abort_paused_executions_above_limit`,
uses `SubnetSchedule::get`, which is tolerant of missing entries).

Switch the two read-only sites to `get`. The `get_mut` immediately above
(line 333) that decrements rate-limited canisters' AP by 100% stays as
is, since it does an actual mutation.

Co-authored-by: Cursor <cursoragent@cursor.com>
return None;
}
CanisterRoundState::new(canister, subnet_schedule.get_mut(*canister_id))
CanisterRoundState::new(canister, subnet_schedule.get(canister_id))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: This replacement (and the similar one right below) result in a change in behavior: canisters no longer get a SubnetSchedule entry as soon as they are scheduled by start_iteration().

But this was accidentally left in (during development I had a version of CanisterRoundState::new() that actually mutated the provided CanisterPriority (don't remember why). And finish_round() will call get_mut() to create an entry for all scheduled canisters. And no one looks at the SubnetSchedule in-between.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant