scx: Exit a scheduler for unhandled hotplug events #177

Byte-Lab · 2024-04-09T16:08:44Z

A scheduler may implement ops.cpu_online() and ops.cpu_offline() for CPU
onlining and offlining events respectively. If a scheduler does not
implement these callbacks, that indicates it does not support hotplug.
Given that a scheduler that doesn't support hotplug is essentially
guaranteed to perform incorrectly if a hotplug event occurs, let's
update ext.c to do the sane thing, and exit the scheduler automatically.
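
As a rough illustration of the rule described above, here is a small user-space C model of the decision. It is a sketch and not the ext.c code: `struct sched_ops_model`, `enum hotplug_result`, `handle_hotplug_model()` and `noop_hotplug()` are hypothetical names invented for this example; only the callback names cpu_online and cpu_offline come from the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's ops table; only the two
 * hotplug callbacks matter for this sketch. */
struct sched_ops_model {
	void (*cpu_online)(int cpu);
	void (*cpu_offline)(int cpu);
};

/* Hypothetical outcome of a hotplug event in this model. */
enum hotplug_result {
	HOTPLUG_HANDLED,	/* the scheduler's callback was invoked */
	HOTPLUG_EXIT_SCHED,	/* no callback: the scheduler is exited */
};

/* Pick the callback matching the event. If the scheduler did not
 * provide it, report that the scheduler should be exited. */
static enum hotplug_result
handle_hotplug_model(const struct sched_ops_model *ops, int cpu, bool online)
{
	void (*cb)(int) = online ? ops->cpu_online : ops->cpu_offline;

	if (!cb)
		return HOTPLUG_EXIT_SCHED;

	cb(cpu);
	return HOTPLUG_HANDLED;
}

/* A callback that does nothing still counts as "implemented". */
static void noop_hotplug(int cpu)
{
	(void)cpu;
}
```

In this model, an ops table with both pointers set to NULL yields HOTPLUG_EXIT_SCHED for any event, while one that points both at `noop_hotplug()` yields HOTPLUG_HANDLED.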

Given that there may be other events in the future that could cause the
scheduler to initiate an exit, we create a new enum scx_exit_code type
that reserves the top bit of the exit_code field in struct
scx_exit_info, and defines SCX_ECODE_RESTART. As an alternative, we
could just return something like -EAGAIN to signify that user space can
try restarting the scheduler.
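
To make the reserved-bit idea concrete, the sketch below shows how a restart flag could share an exit code with other bits. The 64-bit width, the choice of bit 63, and the names `MODEL_ECODE_RESTART`, `model_ecode_wants_restart()` and `model_ecode_payload()` are assumptions made for illustration; the actual layout and value of SCX_ECODE_RESTART are whatever the patch defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: the exit code is modeled as 64 bits wide and the reserved
 * "top bit" as bit 63. The real width and value come from the patch. */
#define MODEL_ECODE_RESTART	(UINT64_C(1) << 63)

/* Hypothetical helper: true if the reserved restart bit is set. */
static bool model_ecode_wants_restart(uint64_t ecode)
{
	return (ecode & MODEL_ECODE_RESTART) != 0;
}

/* Hypothetical helper: the exit code with the reserved bit masked off. */
static uint64_t model_ecode_payload(uint64_t ecode)
{
	return ecode & ~MODEL_ECODE_RESTART;
}
```

A user-space loader built on this model could reload the scheduler whenever `model_ecode_wants_restart()` returns true and otherwise treat the exit as final.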

Note that this isn't a 100% foolproof, race-free hotplug detection
mechanism. For some schedulers, if a CPU is hotplugged after inspecting
the host topology but before attaching the scheduler, we could run into
problems. We'll need something like a hotplug generation counter to
accommodate this. We can take care of it in a separate follow-up patch set.
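
One way to picture the generation counter idea is the single-threaded C model below. Everything in it is hypothetical and is not part of this patch set: the counter `model_hotplug_seq` and the functions `model_hotplug_event()`, `model_read_hotplug_seq()` and `model_attach_ok()` exist only to show the snapshot-and-compare pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter, incremented once per hotplug event. */
static uint64_t model_hotplug_seq;

/* Called by the model whenever a CPU goes online or offline. */
static void model_hotplug_event(void)
{
	model_hotplug_seq++;
}

/* Snapshot the counter before inspecting the host topology. */
static uint64_t model_read_hotplug_seq(void)
{
	return model_hotplug_seq;
}

/* Attaching is safe only if no hotplug event has occurred since the
 * snapshot; otherwise the topology the scheduler saw may be stale. */
static bool model_attach_ok(uint64_t snapshot)
{
	return snapshot == model_hotplug_seq;
}
```

A loader following this pattern would take the snapshot, read the topology, and retry from the start if `model_attach_ok()` fails at attach time. A real implementation would also need proper synchronization, which this model ignores.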

We currently provide scx_ops_error() as a way for ext.c to cause a scheduler to be evicted due to erroneous behavior (for example, due to returning an invalid CPU from ops.select_cpu()). Now that we have a method for exiting gracefully with an exit code from BPF programs, we can similarly provide an scx_ops_exit() macro that allows ext.c to exit and pipe an exit code up to user space. This patch adds that macro. A subsequent patch will use it to exit and plumb up hotplug events.

Signed-off-by: David Vernet <void@manifault.com>

A scheduler may implement ops.cpu_online() and ops.cpu_offline() for CPU onlining and offlining events respectively. If a scheduler does _not_ implement these callbacks, that indicates it does not support hotplug. Given that a scheduler that doesn't support hotplug is essentially guaranteed to perform incorrectly if a hotplug event occurs, let's update ext.c to do the sane thing, and exit the scheduler automatically.

Given that there may be other events in the future that could cause the scheduler to initiate an exit, we create a new enum scx_exit_code type that reserves the top bit of the exit_code field in struct scx_exit_info, and defines SCX_ECODE_RESTART. As an alternative, we could just return something like -EAGAIN to signify that user space can try restarting the scheduler.

Signed-off-by: David Vernet <void@manifault.com>

Now that we have bits reserved for system exit code reasons and actions, as well as bits available for use by user space, let's add some ease-of-use macros to user_exit_info.h. A subsequent patch will add selftests that use these macros.

Signed-off-by: David Vernet <void@manifault.com>

We've recently added some logic related to hotplug:

- If a hotplug event occurs and a scheduler hasn't implemented a callback for it, we automatically exit the scheduler with specific, built-in exit codes
- With scx_bpf_exit(), a scheduler can choose to manually exit the scheduler in a hotplug event, or do something else. In any case, the scheduler should _not_ be automatically exited by the kernel

Let's add selftests to validate these conditions.

Signed-off-by: David Vernet <void@manifault.com>

arighi · 2024-04-10T06:51:52Z

I'm not sure if relying purely on the implementation of ops.cpu_online() / ops.cpu_offline() is ideal. For instance, scx_simple works completely fine with cpu hotplugging, but it doesn't implement these methods. I'm wondering if having a dedicated flag would give more flexibility, something like `SCX_OPS_CPU_HOTPLUG_EXIT` or similar? In this way it's up to the sched developer to determine if the scheduler supports cpu hotplugging correctly or not, wdyt?

Byte-Lab · 2024-04-10T15:02:53Z

> I'm not sure if relying purely on the implementation of ops.cpu_online() / ops.cpu_offline() is ideal. For instance, scx_simple works completely fine with cpu hotplugging, but it doesn't implement these methods. I'm wondering if having a dedicated flag would give more flexibility, something like `SCX_OPS_CPU_HOTPLUG_EXIT` or similar? In this way it's up to the sched developer to determine if the scheduler supports cpu hotplugging correctly or not, wdyt?

That would indeed be more convenient for scx_simple. On the other hand, scx_simple does have the flexibility of supporting hotplug by implementing empty versions of ops.cpu_online() and ops.cpu_offline(), albeit with a bit more boilerplate than just specifying an ops flag. It's arguably a bit more confusing / leaky (in terms of the API) to have behavior depend on the implementation of callbacks, but there is a precedent for that. See e.g. ops.update_idle() and ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE.

We'd also have to decide what the behavior should be for a scheduler that implements ops.cpu_online() / ops.cpu_offline() but also specifies that flag. I'm guessing we would want to reject loading the scheduler? At that point, the API is arguably no more leaky than just implementing the restart behavior if the callbacks aren't specified.

I'd expect that the cases where a scheduler cares about hotplug are when it has some view of the host topology, and generally speaking I'd expect that to be the norm for most production schedulers. I think the question is whether we want to do what we think is "the right thing" in terms of scheduler correctness for the common case (meaning we assume a scheduler that doesn't implement the hotplug callbacks is probably not going to do the right thing when a hotplug event occurs), or to instead have a standalone flag that completely controls the behavior. Given that I think it's probably atypical for a scheduler to be completely agnostic to hotplug changes, my inclination is to err on the side of always exiting the scheduler if the callbacks aren't defined.

Wdyt?

arighi · 2024-04-10T15:43:58Z

Hm... good point about implementing empty cpu_online / cpu_offline callbacks to keep the scheduler active. Despite the additional boilerplate code, it's probably the right thing to do, and it doesn't add extra complexity to the code. And, as you pointed out, there's also the update_idle() precedent, which makes this approach more valid. I also agree that the default behavior should be "exit on a cpu hotplug event, if the scheduler ignores cpu hotplugging events".

That said, I agree with this approach, thanks for clarifying it.

Byte-Lab requested a review from htejun April 9, 2024 16:08

Byte-Lab force-pushed the hotplug_restart branch from eb03eb3 to ef91247 Compare April 10, 2024 04:26

Byte-Lab added 3 commits April 9, 2024 23:27

Byte-Lab force-pushed the hotplug_restart branch from ef91247 to e77fa31 Compare April 10, 2024 04:31

htejun approved these changes Apr 10, 2024

htejun merged commit 37b3f83 into sched_ext Apr 10, 2024
1 check passed

htejun deleted the hotplug_restart branch April 10, 2024 17:55
