
scx_rustland: reduce scheduler overhead #56

Merged
merged 2 commits into sched-ext:main from scx-rustland-reduce-scheduler-overhead on Dec 29, 2023

Conversation


@arighi arighi commented Dec 29, 2023

  • always consider the CPU where the scheduler is running as idle
  • fix a bug that was causing the user-space scheduler to spin and improve the logic to activate the user-space scheduler, preventing unnecessary activations

These changes make scx_rustland more reliable and significantly reduce the scheduler overhead.


@htejun htejun left a comment


The commit messages are titled scx_userland. Can you please update them?

@@ -489,7 +505,7 @@ void BPF_STRUCT_OPS(rustland_update_idle, s32 cpu, bool idle)
 	 * Moreover, kick the CPU to make it immediately ready to accept
 	 * dispatched tasks.
 	 */
-	if (__sync_fetch_and_add(&nr_queued, 0)) {
+	if (nr_queued || nr_scheduled) {
htejun (Contributor) commented on the diff:
It may be worth explaining how this is interlocked with userland component and a situation where userland component is not invoked even while there are tasks to process can't happen. nr_queued is basically used as a boolean here indicating "something happened since userland looked at it, so better call them at least once". And then while the userland is running it updates its internal state through nr_scheduled. As long as update_idle is called after each userland run, this should be safe, right?

@arighi arighi (Collaborator, Author) commented Dec 29, 2023

@htejun comment updated adding more details. Let me know if it seems clear enough. Thanks!

Edit: ...and changed scx_userland -> scx_rustland in the commit messages, thanks for noticing!

@arighi arighi force-pushed the scx-rustland-reduce-scheduler-overhead branch from 249b557 to e79b5fa on December 29, 2023 20:09
Considering the CPU where the user-space scheduler is running as busy
doesn't provide any benefit, as the scheduler consistently dispatches
tasks equal to the number of idle CPUs and then yields (therefore its
own CPU should be considered idle).

This also reduces the overall user-space scheduler CPU utilization, especially when the system is mostly idle, without introducing any measurable performance regression.

Measuring the average CPU utilization of a (mostly) idle system over a
time period of 60 sec:

 - without this patch: 5.41% avg cpu util
 - with this patch:   2.26% avg cpu util

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
We want to activate the user-space scheduler only when there are pending
tasks that require scheduling actions.

To do so, we keep track of queued tasks via nr_queued, which is
incremented in .enqueue() when a task is sent to the user-space
scheduler and decremented in .dispatch() when a task is dispatched.

However, an imbalance can occur if the same pid is sent to the
scheduler multiple times, because the scheduler stores all tasks by
their unique pid.

When this happens, nr_queued is never decremented back to 0, causing
the user-space scheduler to spin constantly, even when there is no work
to do.

To prevent this, split nr_queued into nr_queued and nr_scheduled. The
former is updated by the BPF component every time a task is sent to the
scheduler, and it is up to the user-space scheduler to reset the
counter when the queue is fully drained. The latter is maintained by
the user-space scheduler and represents the number of tasks still being
processed by the scheduler and waiting to be dispatched.

The sum nr_queued + nr_scheduled is called nr_waiting, and we can rely
on this metric to determine whether the user-space scheduler has
pending work to do.

This change makes scx_rustland more reliable and significantly reduces
the CPU usage of the user-space scheduler by eliminating many
unnecessary activations.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
@arighi arighi force-pushed the scx-rustland-reduce-scheduler-overhead branch from e79b5fa to e90bc92 on December 29, 2023 20:15
@htejun htejun merged commit 474a149 into sched-ext:main Dec 29, 2023
1 check passed
2 participants