-
Notifications
You must be signed in to change notification settings - Fork 372
feat: EXC-1754 Evict sandboxes based on their priorities #1967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: EXC-1754 Evict sandboxes based on their priorities #1967
Conversation
adambratschikaye
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Just left some minor comments.
| let mut evicted = candidates; | ||
|
|
||
| for candidate in candidates.into_iter() { | ||
| for candidate in remaining_candidates.into_iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess now we could be iterating through all the canisters in more cases (if we don't hit evict_at_most). I guess that's fine since the list is short and this only runs every 10 seconds or so - agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might need to change the 10s logic to something smaller, but I'm not super sure yet, let's keep this under discussion.
| use std::time::{Duration, Instant}; | ||
|
|
||
| use ic_test_utilities_types::ids::canister_test_id; | ||
| use ic_types::AccumulatedPriority; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add some tests that actually depend on the new logic?
| min_active_sandboxes: usize, | ||
| max_active_sandboxes: usize, | ||
| max_sandbox_idle_time: Duration, | ||
| state_reader: Arc<dyn StateReader<State = ReplicatedState>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this argument could just be a reference instead of an owned Arc. Then you won't need to Arc::clone each time you call it.
|
|
||
| fn get_latest_state(&self) -> Labeled<Arc<ReplicatedState>>; | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accidental newline?
No functional changes, just moving functions in one place.
This change is functionally equivalent but necessary for the upcoming priority-based eviction.
9142bae to
ab95547
Compare
When the canister is executed, it is most likely its Sandbox will not be evicted from the cache, since we are keeping Sandboxes based using LRU logic. At the same time, the Scheduler decreases its priority so it is less likely that it will be executed. So our caching of Sandbox processes is almost the worst possible.
Solution:
Propagate Scheduler priorities for the place we evict Sandbox processes and do evictions based on the lowest priority. That change should decrease the number of cache misses.
Link to follow-up with minimized number of reads to scheduler priorities.
Note:
Furthermore: the scheduler priorities used are from the round before the current round, because the snapshots are saved only at the end of the round, and
apply_scheduling_strategy()is run before executing canisters in the round. But that should not influence results by a lot.It remains to explore if there is an easy way to move
apply_scheduling_strategy()after all canisters are executed in the round. In that case, priorities taken from the last snapshot will be exactly the priorities for the current round.