rustland: enable preemption #235
LGTM. Overloading cpu field to carry flags seems a bit unnecessarily complicated tho. Adding another field wouldn't make any practical difference, right?
Looks great to me. Simple and elegant! It would be worth considering the worst-case behavior: when (almost) all tasks are interactive, will they preempt each other with little actual progress? That would require some sort of rate limiting.
Reserve some bits of the `cpu` attribute of a task to store special dispatch flags.

Initially, let's introduce just RL_CPU_ANY to replace the special value NO_CPU, indicating that the task can be dispatched on any CPU, specifically the first CPU that becomes available.

This allows keeping the CPU value assigned by the builtin idle selection logic, which can potentially be used later for further optimizations.

Moreover, being able to specify dispatch flags gives more flexibility and allows mapping new scheduling features to such flags.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
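As a rough illustration of the encoding described in this commit, the sketch below packs dispatch flags into the upper bits of a 32-bit cpu value and unpacks them again. The bit position, the masks, the helper names, and the use of an unsigned type are all assumptions made for illustration; they are not the values used by scx_rustland_core.

```rust
// Hypothetical layout: the upper bits of the 32-bit cpu value carry dispatch
// flags, the lower bits carry the CPU id chosen by the idle selection logic.
const RL_CPU_ANY: u32 = 1 << 20; // assumed bit position
const RL_FLAGS_MASK: u32 = 0xfff0_0000; // assumed flag area
const RL_CPU_MASK: u32 = !RL_FLAGS_MASK; // remaining bits hold the CPU id

/// Combine a CPU id and dispatch flags into a single value.
fn encode_cpu(cpu: u32, flags: u32) -> u32 {
    (cpu & RL_CPU_MASK) | (flags & RL_FLAGS_MASK)
}

/// Split an encoded value back into (cpu, flags).
fn decode_cpu(encoded: u32) -> (u32, u32) {
    (encoded & RL_CPU_MASK, encoded & RL_FLAGS_MASK)
}
```

With this layout, a task marked RL_CPU_ANY still carries the CPU id suggested by the idle selection logic, which is the property the commit message highlights.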
Introduce the new dispatch flag RL_PREEMPT_CPU that can be used to dispatch tasks that can preempt others.

Tasks with this flag set will be dispatched by the BPF part using SCX_ENQ_PREEMPT, so they can potentially preempt any other task running on the target CPU.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
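The flag translation described in this commit happens in the BPF part, which is written in C. The Rust sketch below only mirrors the logic for illustration: the numeric values of both constants and the function name are assumptions, so check the sched_ext headers and scx_rustland_core for the real definitions.

```rust
// Assumed values for illustration only.
const RL_PREEMPT_CPU: u64 = 1 << 1;
const SCX_ENQ_PREEMPT: u64 = 1 << 32;

/// Translate user-space dispatch flags into the enqueue flags that the
/// dispatch path would pass along (hypothetical helper).
fn enq_flags_for(dispatch_flags: u64) -> u64 {
    let mut enq_flags = 0u64;
    if dispatch_flags & RL_PREEMPT_CPU != 0 {
        // The task is allowed to preempt whatever runs on the target CPU.
        enq_flags |= SCX_ENQ_PREEMPT;
    }
    enq_flags
}
```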
Use the new scx_rustland_core dispatch flag RL_PREEMPT_CPU to allow interactive tasks to preempt other tasks with scx_rustland.

If the built-in idle selection logic is enforced (option `-i`), the scheduler prioritizes keeping tasks on the target CPU designated by this logic. With preemption enabled, these tasks have a higher likelihood of reusing their cached working set, potentially improving performance.

Alternatively, when tasks are dispatched to the first available CPU (default behavior), interactive tasks benefit from running more promptly by kicking out other tasks before their assigned time slice expires.

This potentially allows increasing the default time slice in the future, improving the overall throughput of the system while still maintaining a good level of responsiveness, because interactive tasks can now run almost immediately, independently of the remaining time slice of the other tasks contending for the CPUs.

= Results =

Measuring the performance of the usual benchmark "playing a video game while running a parallel kernel build in background" seems to give around a 2-10% boost in fps with preemption enabled, depending on the particular video game.

Results were obtained running a `make -j32` kernel build on an AMD Ryzen 7 5800X 8-core with 16GB RAM, while testing video games such as Baldur's Gate 3 (with a solid +10% fps), Counter Strike 2 (around +5%) and Team Fortress 2 (+2% boost).

Moreover, some WebGL applications (such as https://webglsamples.org/aquarium/aquarium.html) seem to benefit even more with preemption enabled, providing up to a +15% fps boost.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
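The dispatch policy described in this commit could be sketched roughly as follows. The `Task` type, the `builtin_idle` parameter (standing in for option `-i`), and the flag values are hypothetical; how scx_rustland actually classifies a task as interactive is not shown here.

```rust
const RL_CPU_ANY: u64 = 1 << 0; // assumed value
const RL_PREEMPT_CPU: u64 = 1 << 1; // assumed value

/// Hypothetical per-task state as seen by the user-space scheduler.
struct Task {
    interactive: bool,
}

/// Choose the dispatch flags for a task.
fn dispatch_flags(task: &Task, builtin_idle: bool) -> u64 {
    let mut flags = 0u64;
    if !builtin_idle {
        // Default behavior: run on the first CPU that becomes available.
        flags |= RL_CPU_ANY;
    }
    if task.interactive {
        // Interactive tasks may kick out the task currently running.
        flags |= RL_PREEMPT_CPU;
    }
    flags
}
```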
Provide a run-time option to disable task preemption.

This option can be used to improve the throughput of the CPU-intensive tasks while still providing a good level of responsiveness in the system.

By default preemption is enabled, to provide a higher level of responsiveness to the interactive tasks.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
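A minimal sketch of how such an opt-out switch might gate preemption, using only the standard library. The flag name `--no-preempt`, the `Opts` struct, and both helpers are hypothetical and are not taken from the scx_rustland command line.

```rust
/// Hypothetical scheduler options; preemption stays on unless disabled.
struct Opts {
    no_preempt: bool,
}

/// Parse options from an argument list (the flag name is an assumption).
fn parse_opts<I: IntoIterator<Item = String>>(args: I) -> Opts {
    let no_preempt = args.into_iter().any(|a| a == "--no-preempt");
    Opts { no_preempt }
}

/// Interactive tasks may preempt others only while preemption is enabled.
fn may_preempt(opts: &Opts, interactive: bool) -> bool {
    interactive && !opts.no_preempt
}
```

Keeping the default as "enabled" matches the commit message: responsiveness is favored unless the user explicitly asks for throughput.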
I agree. Moreover, adding another u64 increases the struct from 24 bytes to 32 bytes, which might actually be better in terms of cacheline usage / performance:
I'll change this and update the PR. Thanks!
Right, rate-limiting the RL_PREEMPT_CPU dispatches seems the easiest way to mitigate potential storms of unnecessary preempt events, and it's probably good enough in practice. That's the first improvement I'm planning to make, and I'll try to re-use some of the logic from
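One possible shape for the rate limiting discussed in this thread is a fixed-window counter that caps preempting dispatches per second. This is a sketch under assumptions: the type, its fields, and the fixed-window policy are hypothetical, and it is not the logic used by any existing scheduler.

```rust
const NSEC_PER_SEC: u64 = 1_000_000_000;

/// Fixed-window limiter: at most `max_per_sec` preempting dispatches are
/// allowed within each one-second window.
struct PreemptLimiter {
    max_per_sec: u32,
    window_start_ns: u64,
    count: u32,
}

impl PreemptLimiter {
    fn new(max_per_sec: u32) -> Self {
        Self { max_per_sec, window_start_ns: 0, count: 0 }
    }

    /// Returns true if a preempting dispatch is allowed at time `now_ns`.
    fn allow(&mut self, now_ns: u64) -> bool {
        // Start a new window once a full second has elapsed.
        if now_ns.saturating_sub(self.window_start_ns) >= NSEC_PER_SEC {
            self.window_start_ns = now_ns;
            self.count = 0;
        }
        if self.count < self.max_per_sec {
            self.count += 1;
            true
        } else {
            false
        }
    }
}
```

When `allow` returns false, the scheduler would dispatch the interactive task without the preemption flag, so it still runs but waits for the current time slice to end.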
Do not encode dispatch flags in the cpu field, but simply use a separate "flags" field.

This makes the code much simpler and it increases the size of dispatched_task_ctx from 24 to 32 bytes, which is probably better in terms of cacheline allocation / performance.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
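The size change mentioned in this commit can be checked with a small layout sketch. Only the 24 and 32 byte sizes and the new "flags" field come from the commit message; the other field names and types below are assumptions about what dispatched_task_ctx contains.

```rust
use std::mem::size_of;

/// Assumed layout before the change: flags folded into `cpu`.
#[repr(C)]
struct DispatchedTaskCtxOld {
    pid: i32,
    cpu: i32,
    cpumask_cnt: u64, // assumed field
    slice_ns: u64,    // assumed field
}

/// Assumed layout after the change: a dedicated 64-bit `flags` field.
#[repr(C)]
struct DispatchedTaskCtxNew {
    pid: i32,
    cpu: i32,
    flags: u64,
    cpumask_cnt: u64, // assumed field
    slice_ns: u64,    // assumed field
}

/// Sizes in bytes of the (old, new) layouts.
fn ctx_sizes() -> (usize, usize) {
    (size_of::<DispatchedTaskCtxOld>(), size_of::<DispatchedTaskCtxNew>())
}
```

Assuming 64-byte cache lines, 32-byte entries packed in an aligned array never straddle a line boundary, whereas some 24-byte entries do, which is one plausible reading of the cacheline remark.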
Overview

Provide preemption capability in `scx_rustland_core` and use this feature in `scx_rustland` to improve the responsiveness of interactive tasks.

Design

For now the design is pretty simple:

- `scx_rustland_core` provides a new dispatch flag `RL_PREEMPT_CPU` that is mapped to `SCX_ENQ_PREEMPT`
- `scx_rustland` uses this flag when dispatching tasks that are classified as "interactive", so that they can preempt other tasks before their assigned time slice expires

This implies that interactive tasks can also preempt each other, potentially causing an excessive number of preemption events if a significant number of tasks are classified as "interactive". This can be improved in the future by limiting the number of preemption events per second, similar to what `scx_lavd` is doing.

Results
Measuring the performance of the usual benchmark "playing a video game while running a parallel kernel build in background" seems to give around a 2-10% boost in fps with preemption enabled, depending on the particular video game.

Results were obtained running a `make -j32` kernel build on an AMD Ryzen 7 5800X 8-core with 16GB RAM, while testing video games such as Baldur's Gate 3 (with a solid +10% fps), Counter Strike 2 (around +5%) and Team Fortress 2 (+2% boost).

Moreover, some WebGL applications (such as https://webglsamples.org/aquarium/aquarium.html) seem to benefit even more with preemption enabled, providing up to a +15% fps boost.