feat: [DSM-103] Actual priority queue for long executions #10013

Merged
alin-at-dfinity merged 2 commits into master from alin/DSM-103-long-execution-priority-queue on Apr 27, 2026

@alin-at-dfinity
Contributor

Instead of a binary prioritized / opportunistic flag, explicitly record and prioritize long executions based on the number of slices executed, AP, and the round in which the long execution started. This ensures that low-priority canisters are not starved (which can happen with bounded AP and just the right distribution across execution cores).

Also switch from persisting `SubnetSchedule` spread across individual canister states to persisting it as part of the subnet's `SystemMetadata`.
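
For readers unfamiliar with the idea, here is a minimal sketch of ordering long executions by a composite key rather than a binary flag. It is not the code from this PR: the `LongExecution` struct, its field names, and `schedule_order` are hypothetical, AP is assumed to mean accumulated priority, and the direction of each comparison (more slices first, then higher AP, then earlier start round) is an assumption made for illustration only.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// Hypothetical record of an in-progress long execution (names are illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
struct LongExecution {
    canister_id: u64,
    slices_executed: u64,
    accumulated_priority: i64,
    start_round: u64,
}

impl LongExecution {
    /// Composite priority key. Assumed ordering: more slices executed wins,
    /// then higher AP, then earlier start round; canister ID breaks ties
    /// so that the ordering is total and deterministic.
    fn key(&self) -> (u64, i64, Reverse<u64>, Reverse<u64>) {
        (
            self.slices_executed,
            self.accumulated_priority,
            Reverse(self.start_round),
            Reverse(self.canister_id),
        )
    }
}

impl Ord for LongExecution {
    fn cmp(&self, other: &Self) -> Ordering {
        self.key().cmp(&other.key())
    }
}

impl PartialOrd for LongExecution {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Returns canister IDs in the order they would be popped from the max-heap.
fn schedule_order(executions: Vec<LongExecution>) -> Vec<u64> {
    let mut heap: BinaryHeap<LongExecution> = executions.into_iter().collect();
    let mut order = Vec::with_capacity(heap.len());
    while let Some(execution) = heap.pop() {
        order.push(execution.canister_id);
    }
    order
}
```

Under these assumptions, a canister with a low AP that has already executed several slices is still popped ahead of fresher executions, which is one way a composite key can avoid the starvation described above.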

@alin-at-dfinity alin-at-dfinity requested a review from a team as a code owner April 24, 2026 14:40
@github-actions github-actions Bot added the feat label Apr 24, 2026
Comment thread rs/state_manager/src/checkpoint.rs
Comment thread rs/execution_environment/src/scheduler/round_schedule.rs
Comment thread rs/execution_environment/src/scheduler/round_schedule.rs Outdated
Co-authored-by: Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org>
@alin-at-dfinity alin-at-dfinity added this pull request to the merge queue Apr 27, 2026
Merged via the queue into master with commit 9882446 Apr 27, 2026
37 checks passed
@alin-at-dfinity alin-at-dfinity deleted the alin/DSM-103-long-execution-priority-queue branch April 27, 2026 09:53
basvandijk added a commit that referenced this pull request Apr 27, 2026
#10013)" (#10030)

This reverts commit 9882446 because it
breaks the `//rs/tests/consensus:subnet_splitting_test_colocate` test:
```
2026-04-27 12:20:33.510 INFO[uvms_logs_stream:StdOut] [uvm=colocated-test-driver] TEST_LOG: 2026-04-27 12:20:33.310 INFO[subnet_splitting_test:StdErr] thread 'main' (126) panicked at rs/tests/consensus/subnet_splitting_test.rs:173:33:
2026-04-27 12:20:33.510 INFO[uvms_logs_stream:StdOut] [uvm=colocated-test-driver] TEST_LOG: 2026-04-27 12:20:33.310 INFO[subnet_splitting_test:StdErr] Execution of step SplitOutDestinationState failed: Validation failed: State hash after split b55a0fa013f5d9aa08c1cfc70d683e7b1dbdda7d0789d47ff705b044c45000d1 doesn't match the expected state hash 0295d8206a0b9b88b95291d1d8d31beb4a837794036882b097e38c740a917892
```