scx-rustland: SMT improvements #94

Merged 2 commits on Jan 17, 2024

Commits on Jan 17, 2024

  1. scx_rustland: avoid calling scx_bpf_kick_cpu() from update_idle()

    Prior to commit 676bd88 ("bpf_rustland: do not dispatch the scheduler to
    the global DSQ"), the user-space scheduler was dispatched using
    SCX_DSQ_GLOBAL and we needed to explicitly kick idle CPUs from
    update_idle() to ensure that at least one CPU was available to run the
    user-space scheduler.
    
    Now that we are using SCX_DSQ_LOCAL_ON|cpu to dispatch the user-space
    scheduler, the target CPU is implicitly kicked. Therefore, the call to
    scx_bpf_kick_cpu() within .update_idle() becomes redundant and we can
    get rid of it.
    
    Fixes: 676bd88 ("bpf_rustland: do not dispatch the scheduler to the global DSQ")
    Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
    arighi committed Jan 17, 2024
    f0c3332
  2. scx_rustland: improve SMT performance

    The user-space scheduler dispatches tasks in batches, with the batch
    size matching the number of idle CPUs.
    
    Commit 791bdbe ("scx_rustland: introduce SMT support") changed the order
    of idle CPUs, prioritizing dispatching tasks on the least busy cores
    (those with the most idle CPUs) before moving on to busier cores (those
    with the least idle CPUs).
    
    While this approach works well for a small number of tasks, it can lead
    to uneven performance as the number of tasks increases and all cores are
    saturated. Such uneven performance can be attributed to SMT interactions
    causing potential short lags and erratic system performance. In some
    cases, disabling SMT entirely results in better system responsiveness.
    
    To address this issue, instruct the scheduler to implicitly disable SMT
    and consistently dispatch tasks only on the first (or last) CPU of each
    core. This approach distributes tasks evenly among the available cores,
    avoiding SMT interference and matching non-SMT performance, even when a
    large number of tasks are running.
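
As a rough illustration of the "one CPU per core" selection described above, the following is a minimal sketch, not the code from this commit. It assumes a hypothetical topology table `cpu_to_core` (index = CPU id, value = core id) and a hypothetical list of idle CPU ids; the function names are made up for this example.

```rust
use std::collections::BTreeMap;

/// Return the first (lowest-numbered) CPU of each core.
/// `cpu_to_core` is a hypothetical topology table: index = CPU id,
/// value = id of the core that CPU belongs to.
fn primary_cpus(cpu_to_core: &[usize]) -> Vec<usize> {
    let mut first: BTreeMap<usize, usize> = BTreeMap::new();
    for (cpu, &core) in cpu_to_core.iter().enumerate() {
        // Keep only the first CPU seen for each core.
        first.entry(core).or_insert(cpu);
    }
    // BTreeMap iterates in ascending core id order.
    first.into_values().collect()
}

/// Idle CPUs that are eligible as dispatch targets: only the primary
/// CPU of each core is considered, so sibling CPUs never receive tasks
/// directly from the scheduler.
fn dispatch_targets(cpu_to_core: &[usize], idle: &[usize]) -> Vec<usize> {
    primary_cpus(cpu_to_core)
        .into_iter()
        .filter(|cpu| idle.contains(cpu))
        .collect()
}
```

For example, with sibling pairs (0,4), (1,5), (2,6), (3,7) and CPUs 1, 2, 4, 5 and 7 idle, `dispatch_targets` returns CPUs 1 and 2; the idle siblings 4, 5 and 7 are not used as targets.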
    
    Additionally, the unused sibling CPUs within each core can be used as
    "spare" CPUs for the BPF dispatcher. This is particularly beneficial for
    tasks that cannot be dispatched on the target CPU selected by the
    scheduler, due to cpumask restrictions or congestion conditions.
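
The fallback to "spare" sibling CPUs could look roughly like the sketch below. The real logic lives in the BPF dispatcher and is not shown in this commit message, so everything here is an assumption for illustration: the `pick_cpu` name, the `allowed` and `busy` tables (indexed by CPU id), and the policy of taking the first usable spare.

```rust
/// Pick a CPU for a task: prefer the scheduler's `target`, otherwise
/// fall back to the first usable spare (sibling) CPU.
/// `allowed[cpu]` models the task's cpumask and `busy[cpu]` models
/// congestion; both are hypothetical inputs for this sketch.
fn pick_cpu(target: usize, allowed: &[bool], busy: &[bool], spares: &[usize]) -> Option<usize> {
    // A CPU is usable if the task may run there and it is not busy.
    // Out-of-range CPU ids are treated as unusable.
    let usable = |cpu: usize| {
        allowed.get(cpu).copied().unwrap_or(false) && !busy.get(cpu).copied().unwrap_or(true)
    };

    if usable(target) {
        return Some(target);
    }
    // The target is excluded by the cpumask or congested: try the spares.
    spares.iter().copied().find(|&cpu| usable(cpu))
}
```

If neither the target nor any spare is usable, the sketch returns `None` and leaves the decision to the caller.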
    
    This new approach therefore improves system responsiveness on SMT
    systems, while also improving scheduler stability.
    
    Preliminary results on an AMD Ryzen 7 5800X 8-core system (SMT
    enabled): in my usual benchmark, which measures the fps of a video game
    (Counter-Strike 2) while a parallel kernel build overloads the system,
    this change shows an improvement of approximately 2x (from 8-10fps to
    15-25fps, compared with 1-2fps with EEVDF).
    
    Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
    arighi committed Jan 17, 2024
    be1cb87