scx_rustland: clarify and improve BPF / userspace interlocking #48
Conversation
BPF doesn't have a full memory model yet, and while strict atomicity might not be necessary in this context, it is advisable to make the interlocking model clearer. To achieve this, provide the following primitives to operate on usersched_needed:

static void set_usersched_needed(void)
static bool test_and_clear_usersched_needed(void)

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
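A minimal sketch of what such primitives could look like (hypothetical, not the exact scx_rustland code), using the GCC/Clang `__sync` atomic builtins, which are also available to BPF programs:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: a flag signaling that the user-space scheduler
 * needs to run, touched only through these two atomic helpers so the
 * interlocking between the BPF part and userspace stays explicit. */
static unsigned long usersched_needed;

static void set_usersched_needed(void)
{
	/* atomically set the flag */
	__sync_fetch_and_or(&usersched_needed, 1);
}

static bool test_and_clear_usersched_needed(void)
{
	/* atomically read and clear the flag in a single step */
	return __sync_fetch_and_and(&usersched_needed, 0) == 1;
}
```

Funneling all accesses through two named helpers makes the intended semantics (set on one side, test-and-clear on the other) obvious at every call site, even before BPF gains a full memory model.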
Provide the following primitives to get and set CPU ownership in the BPF part. This improves code readability, and these primitives can serve as a baseline for the BPF part to implement better CPU idle tracking in the future. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
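As a rough illustration (accessor names and storage layout are hypothetical, not the actual scx_rustland implementation), CPU ownership could be tracked with a pair of accessors over a per-CPU table; in the BPF part this table would live in a BPF map rather than a plain array:

```c
#include <assert.h>

#define MAX_CPUS 1024

/* Hypothetical sketch: pid that currently owns each CPU, 0 = idle. */
static int cpu_owner[MAX_CPUS];

/* Return the pid owning @cpu, or 0 if the CPU is free. */
static int get_cpu_owner(int cpu)
{
	return cpu_owner[cpu];
}

/* Record that @pid now owns @cpu (pass 0 to mark it free). */
static void set_cpu_owner(int cpu, int pid)
{
	cpu_owner[cpu] = pid;
}
```

With ownership centralized behind getters and setters, a future idle-tracking change only has to touch these two functions instead of every site that inspects CPU state.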
Trigger the user-space scheduler only upon a task's CPU release event (avoiding its activation during each enqueue event) and only if there are tasks waiting to be processed by the user-space scheduler. This should save unnecessary calls to the user-space scheduler, reducing its overall overhead. Moreover, rename nr_enqueues to nr_queued and store the number of tasks currently queued to the user-space scheduler (that are waiting to be dispatched). Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
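The activation policy described above can be sketched roughly as follows (helper name is hypothetical, not the actual code): the wakeup condition is evaluated only in the CPU-release path, and only when `nr_queued` is non-zero:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of the activation policy described above. */
static unsigned long nr_queued;   /* tasks waiting in the user-space queue */
static bool usersched_needed;     /* user-space scheduler wakeup flag */

/* Called from the CPU-release path (.stopping), not from .enqueue():
 * the user-space scheduler is activated only when there is actually
 * work queued for it to process. */
static void maybe_activate_usersched(void)
{
	if (nr_queued > 0)
		usersched_needed = true;
}
```

Since enqueue events are far more frequent than CPU-release events, gating activation this way cuts the number of round trips to the user-space scheduler.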
Adding extra overhead to any kthread can potentially slow down the entire system, so make sure this never happens by dispatching all kthreads directly on the same local CPU (not just the per-CPU kthreads), bypassing the user-space scheduler. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
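A simplified sketch of the dispatch decision (hypothetical; the real code operates on `struct task_struct` and the sched_ext dispatch API): every kthread takes the direct local-CPU path instead of going through the user-space scheduler:

```c
#include <assert.h>
#include <stdbool.h>

#define PF_KTHREAD 0x00200000	/* kernel thread flag, as defined by the kernel */

/* Hypothetical, simplified stand-in for struct task_struct. */
struct task {
	unsigned int flags;
};

static bool is_kthread(const struct task *p)
{
	return p->flags & PF_KTHREAD;
}

/* Return true when the task should bypass the user-space scheduler
 * and be dispatched directly on the local CPU. */
static bool dispatch_directly(const struct task *p)
{
	return is_kthread(p);
}
```

The design rationale: a kthread stuck waiting on the user-space scheduler can stall kernel work that the whole system depends on, so it is safer to never route kthreads through the slower path at all.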
For the `update_idle` bubbles, I wonder whether what you need is `scx_bpf_kick_cpu()` on self. There's nothing triggering immediate dispatch otherwise.

Another thing which might be interesting is that as the BPF and usersched parts get more tightly interlocked, it may make sense to have per-CPU usersched threads. Otherwise, some CPUs may go idle while waiting for the usersched thread. If there's a dedicated usersched thread per CPU, every CPU can jump into the usersched thread when it runs out of things to do, and then usersched can figure out synchronization (using raw spinlocks) and so on. While conceptually neat in that it really just moves per-CPU scheduling logic to userspace, this might be too much complexity for the purpose of this scheduler, so this is more dumping of the train of thought than anything.
```diff
 	 */
-	if (is_kthread(p) && p->nr_cpus_allowed == 1)
+	if (is_kthread(p))
```
Not an immediate blocker but this might lead to not-so-great behaviors for things like unbound workqueue workers that can consume a lot of CPU cycles (e.g. encryption workers). They actually need scheduling.
> Not an immediate blocker but this might lead to not-so-great behaviors for things like unbound workqueue workers that can consume a lot of CPU cycles (e.g. encryption workers). They actually need scheduling.
Good point! Yes, in that case they definitely need scheduling; I'll revert that in the next set of changes.

I'll do some experiments with
A set of interlocking improvements for scx_rustland:

- move the user-space scheduler activation from `.enqueue()` to `.stopping()` (and activate only when needed) to reduce unnecessary user-space activations

I have also explored the possibility of moving the user-space scheduler activation from `.stopping()` to `.update_idle()`, but the scheduler performs quite poorly with this change: it seems that CPUs are not fully used even in the presence of CPU-intensive workloads, as if the user-space scheduler isn't feeding the dispatcher fast enough. My guess is that activating the user-space scheduler in `.update_idle()` is a bit too late and we get bubbles in the whole scheduling pipeline, compared to doing the activation in `.stopping()`. But this is something that I'd like to explore more in the next set of improvements for scx_rustland.