@dragoljub-djuric
Contributor

@dragoljub-djuric dragoljub-djuric commented Oct 10, 2024

After a canister is executed, its Sandbox is unlikely to be evicted from the cache, because we keep Sandboxes using LRU logic. At the same time, the Scheduler decreases the canister's priority, so it is less likely to be executed again soon. As a result, we tend to keep exactly the Sandboxes we are least likely to need, which makes our caching of Sandbox processes close to the worst possible.

Solution:
Propagate Scheduler priorities to the place where we evict Sandbox processes, and evict the processes with the lowest priority first. This change should decrease the number of cache misses.
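
To illustrate the idea, here is a minimal sketch of priority-based selection. It is not the code from this PR: `Candidate`, `pick_evictions`, and the plain `i64` priority are hypothetical stand-ins, and breaking ties by least-recent use is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified eviction candidate (not the type used in the PR).
#[derive(Clone, Debug, PartialEq)]
struct Candidate {
    id: u64,
    last_used: Instant,
    /// Stand-in for the scheduler priority; lower means less likely to run soon.
    priority: i64,
}

/// Returns the candidates to evict so that at most `max_active` remain.
/// Lowest priority goes first; ties are broken by least recently used (assumption).
fn pick_evictions(mut candidates: Vec<Candidate>, max_active: usize) -> Vec<Candidate> {
    let evict_count = candidates.len().saturating_sub(max_active);
    candidates.sort_by_key(|c| (c.priority, c.last_used));
    candidates.truncate(evict_count);
    candidates
}

fn main() {
    let base = Instant::now();
    let candidates = vec![
        // Just executed: most recently used, but the scheduler lowered its priority.
        Candidate { id: 1, last_used: base + Duration::from_secs(30), priority: -10 },
        // Idle the longest, but high priority, so likely to be scheduled soon.
        Candidate { id: 2, last_used: base, priority: 50 },
        Candidate { id: 3, last_used: base + Duration::from_secs(20), priority: 5 },
    ];
    let evicted = pick_evictions(candidates, 2);
    let ids: Vec<u64> = evicted.iter().map(|c| c.id).collect();
    println!("evicted: {:?}", ids); // prints "evicted: [1]"
}
```

With pure LRU, canister 2 would be evicted here even though it is the one most likely to run next; ordering by priority evicts canister 1 instead.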

Link to the follow-up, which minimizes the number of reads of scheduler priorities.

Note:
The scheduler priorities used are from the previous round, because the snapshots are saved only at the end of a round, and apply_scheduling_strategy() runs before the canisters are executed in the round. This should not affect the results much.

It remains to be explored whether there is an easy way to move apply_scheduling_strategy() to after all canisters have been executed in the round. In that case, the priorities taken from the last snapshot would be exactly the priorities for the current round.

@github-actions github-actions bot added the feat label Oct 10, 2024
Contributor

@adambratschikaye adambratschikaye left a comment

Thanks! Just left some minor comments.

let mut evicted = candidates;

for candidate in candidates.into_iter() {
for candidate in remaining_candidates.into_iter() {
Contributor

I guess now we could be iterating through all the canisters in more cases (if we don't hit evict_at_most). I guess that's fine since the list is short and this only runs every 10 seconds or so - agree?

Contributor

I think we might need to change the 10s logic to something smaller, but I'm not super sure yet, let's keep this under discussion.

Comment on lines 77 to 81
use std::time::{Duration, Instant};

use ic_test_utilities_types::ids::canister_test_id;
use ic_types::AccumulatedPriority;

Contributor

Could we add some tests that actually depend on the new logic?

min_active_sandboxes: usize,
max_active_sandboxes: usize,
max_sandbox_idle_time: Duration,
state_reader: Arc<dyn StateReader<State = ReplicatedState>>,
Contributor

I think this argument could just be a reference instead of an owned Arc. Then you won't need to Arc::clone each time you call it.
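
A minimal sketch of the difference, using a hypothetical one-method `StateReader` stand-in and made-up function names rather than the real trait and functions from the codebase:

```rust
use std::sync::Arc;

/// Hypothetical, minimal stand-in for the real `StateReader` trait.
trait StateReader {
    fn latest_height(&self) -> u64;
}

struct FixedReader(u64);

impl StateReader for FixedReader {
    fn latest_height(&self) -> u64 {
        self.0
    }
}

/// Takes an owned `Arc`, so a caller that keeps its handle must clone it per call.
fn evict_owned(state_reader: Arc<dyn StateReader>) -> u64 {
    state_reader.latest_height()
}

/// Borrows the reader; a caller holding an `Arc` passes `&*arc` without cloning.
fn evict_borrowed(state_reader: &dyn StateReader) -> u64 {
    state_reader.latest_height()
}

fn main() {
    let reader: Arc<dyn StateReader> = Arc::new(FixedReader(7));
    let a = evict_owned(Arc::clone(&reader)); // bumps and then drops a refcount
    let b = evict_borrowed(&*reader); // no refcount traffic
    assert_eq!(a, b);
    assert_eq!(Arc::strong_count(&reader), 1);
}
```

The borrowed form only works if the callee does not need to store the reader beyond the call; if it does, taking the `Arc` is still the right choice.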


fn get_latest_state(&self) -> Labeled<Arc<ReplicatedState>>;


Contributor

accidental newline?

@dragoljub-djuric dragoljub-djuric changed the base branch from master to dimitris/scheduler-changes-sandbox-count October 15, 2024 13:46
@dragoljub-djuric dragoljub-djuric changed the base branch from dimitris/scheduler-changes-sandbox-count to master October 15, 2024 13:54
@dragoljub-djuric dragoljub-djuric changed the base branch from master to dimitris/scheduler-changes-sandbox-count October 15, 2024 13:54
@dragoljub-djuric dragoljub-djuric changed the base branch from dimitris/scheduler-changes-sandbox-count to master October 15, 2024 13:55
@dragoljub-djuric dragoljub-djuric changed the base branch from master to dimitris/scheduler-changes-sandbox-count October 15, 2024 13:56
@dragoljub-djuric dragoljub-djuric changed the base branch from dimitris/scheduler-changes-sandbox-count to master October 15, 2024 13:56
berestovskyy and others added 10 commits October 15, 2024 14:01
This change applies charges for each fully executed canister. The total
amount of points charged is evenly distributed across canisters, but it
is not included in the compute capacity used to calculate long/new
execution cores.
The idle canisters in front of the round schedule should be marked as
fully executed, as they were scheduled first in the round.

This helps to rotate the round schedule faster.
This change applies charges for each fully executed canister. The total
amount of points charged is evenly distributed across canisters, but it
is not included in the compute capacity used to calculate long/new
execution cores.
@dragoljub-djuric dragoljub-djuric changed the base branch from master to Add_new_dashboards_to_testnet October 30, 2024 14:16
Base automatically changed from Add_new_dashboards_to_testnet to master October 31, 2024 15:58
@berestovskyy berestovskyy changed the title from "feat: EXC-1754 Change the way evict_sandbox_processes works" to "feat: EXC-1754 Evict sandboxes based on their priorities" Oct 31, 2024
No functional changes, just moving functions in one place.
This change is functionally equivalent but necessary
for the upcoming priority-based eviction.
@dragoljub-djuric dragoljub-djuric changed the base branch from master to andriy/exc-1754-apply-priority-at-the-end November 1, 2024 10:03
@berestovskyy berestovskyy force-pushed the andriy/exc-1754-apply-priority-at-the-end branch from 9142bae to ab95547 Compare November 1, 2024 18:47
Base automatically changed from andriy/exc-1754-apply-priority-at-the-end to master November 6, 2024 18:25
@dragoljub-djuric dragoljub-djuric changed the base branch from master to andriy/exc-1787-scheduler-divergence-debug November 12, 2024 09:16
@dragoljub-djuric dragoljub-djuric changed the base branch from andriy/exc-1787-scheduler-divergence-debug to master November 12, 2024 21:01
@dragoljub-djuric dragoljub-djuric changed the base branch from master to andriy/exc-1787-scheduler-divergence-debug November 12, 2024 21:13
@dragoljub-djuric dragoljub-djuric changed the base branch from andriy/exc-1787-scheduler-divergence-debug to master November 12, 2024 21:16
5 participants