Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rusty: Account for disabled but offline CPUs #191

Merged
merged 7 commits into from
Mar 14, 2024
Merged

Conversation

Byte-Lab
Copy link
Contributor

As described in https://bugzilla.kernel.org/show_bug.cgi?id=218109,
#147 and sched-ext/sched_ext#69, AMD chips can
sometimes report fully disabled CPUs as offline, which causes us to
count them when looking at /sys/devices/system/cpu/possible.

Additionally, systems can have holes in their active CPU maps. For
example, a system with CPUs 0, 1, 2, 3 possible, may have only 0 and 2
active. To address this, we need to do a few things:

  1. Update topology.rs to be clear that it's returning the number of
    possible CPUs in the system. Also update Topology to only record
    online CPUs when creating its span and iterating over sysfs when
    creating domains. It was previously trying to record when a CPU was
    online, but this was actually broken as the topology directory isn't
    present in sysfs when the CPU is offline.

  2. Schedulers should not be relying on nr_possible_cpus for anything
    other than interacting with per-CPU data (e.g. for stats extraction),
    or e.g. verifying maximum sizes of statically sized arrays in BPF. It
    should not be used for e.g. performing load calculations, etc. With
    that said, we'll also need to update schedulers to not rely on the
    nr_possible_cpus figure being exported by the topology crate. We do
    that for rusty in this patch, but don't fix any of the others other
    than updating how they call topology.rs.

  3. Account for the fact that LLC IDs may be non-contiguous. For example,
    if there is a single core in an LLC, then if we assign LLC IDs to
    domains, then the domain IDs won't be contiguous. This doesn't fit
    our current model which is used by e.g. infeasible_weights.rs. We'll
    update some of the code in rusty to accomodate this, but we'll need
    to do more.

  4. Update schedulers to properly reset themselves in the event of a
    hotplug event. We'll take care of that in a follow-on change.

We're iterating from min..max cpu in cpus_online(), but that's not
inclusive of the max CPU. Let's also include that so we don't think that
last CPU is offline.

Signed-off-by: David Vernet <void@manifault.com>
Offline CPUs don't have a /sys/devices/system/cpu/cpuN/topology
directory, so let's just skip them if they're not online. Schedulers are
expected to detect hotplug, and handle gracefully restarting.

Signed-off-by: David Vernet <void@manifault.com>
We implement functions or(), and(), and xor() for cpumasks, but we
should also implement the bitwise ops for those operations in case
people prefer that syntax.

Signed-off-by: David Vernet <void@manifault.com>
We were looking at the domain cpumask length, instead of its weight.
Correct the oversight.

Signed-off-by: David Vernet <void@manifault.com>
If a CPU is offline, it could cause an LLC to go offline, which could
cause us to have non-contiguous domain IDs. Right now, a few places in
code assume contiguous domain IDs, such as in the infeasible weights
crate. Let's update domain.rs and load_balaance.rs to do the right
thing. We'll fix the others later.

Signed-off-by: David Vernet <void@manifault.com>
As described in https://bugzilla.kernel.org/show_bug.cgi?id=218109,
#147 and
sched-ext/sched_ext#69, AMD chips can
sometimes report fully disabled CPUs as offline, which causes us to
count them when looking at /sys/devices/system/cpu/possible.

Additionally, systems can have holes in their active CPU maps. For
example, a system with CPUs 0, 1, 2, 3 possible, may have only 0 and 2
active. To address this, we need to do a few things:

1. Update topology.rs to be clear that it's returning the number of
   _possible_ CPUs in the system. Also update Topology to only record
   online CPUs when creating its span and iterating over sysfs when
   creating domains. It was previously trying to record when a CPU was
   online, but this was actually broken as the topology directory isn't
   present in sysfs when the CPU is offline.

2. Schedulers should not be relying on nr_possible_cpus for anything
   other than interacting with per-CPU data (e.g. for stats extraction),
   or e.g. verifying maximum sizes of statically sized arrays in BPF. It
   should _not_ be used for e.g. performing load calculations, etc. With
   that said, we'll also need to update schedulers to not rely on the
   nr_possible_cpus figure being exported by the topology crate. We do
   that for rusty in this patch, but don't fix any of the others other
   than updating how they call topology.rs.

3. Account for the fact that LLC IDs may be non-contiguous. For example,
   if there is a single core in an LLC, then if we assign LLC IDs to
   domains, then the domain IDs won't be contiguous. This doesn't fit
   our current model which is used by e.g. infeasible_weights.rs. We'll
   update some of the code in rusty to accomodate this, but we'll need
   to do more.

4. Update schedulers to properly reset themselves in the event of a
   hotplug event. We'll take care of that in a follow-on change.

Signed-off-by: David Vernet <void@manifault.com>
There were a few issues, e.g. us still mentioning the infeasible weights
problem, and arguments using underscores despite clap rendering them
with dashes. Let's fix them up.

Signed-off-by: David Vernet <void@manifault.com>
Copy link
Collaborator

@arighi arighi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for updating also rustland and rlfifo!

@Byte-Lab Byte-Lab merged commit 6ecb6ae into main Mar 14, 2024
1 check passed
@Byte-Lab Byte-Lab deleted the fix_possible_cpus branch March 14, 2024 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants