scx_rustland: reduce overhead by caching host topology #238
Merged
Conversation
htejun approved these changes on Apr 23, 2024
Byte-Lab approved these changes on Apr 24, 2024
LGTM, but we should probably also make the Cpumask implementation more efficient. I'll work on that.
In case you want to give it a try, this might help: #241
arighi force-pushed the rustland-reduce-topology-overhead branch from 854fdb4 to 538232c on April 24, 2024 14:19
Introduce a TopologyMap object, represented as an array of arrays, where each inner array corresponds to a core containing its associated CPU IDs. This object can be used as a cache to facilitate efficient iteration over the entire host's topology.

Example usage:

    let topo = Topology::new()?;
    let topo_map = TopologyMap::new(topo)?;

    for (core_id, core) in topo_map.iter().enumerate() {
        for cpu in core {
            println!("core={} cpu={}", core_id, cpu);
        }
    }

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
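For reference, a minimal sketch of what the "array of arrays" layout described above might look like. The field name and iterator signature here are assumptions for illustration, not the actual scx_utils implementation:

    /// Hypothetical sketch: one inner Vec per core, holding that core's CPU IDs.
    pub struct TopologyMap {
        map: Vec<Vec<usize>>,
    }

    impl TopologyMap {
        /// Iterate over cores; each item is the list of CPU IDs of one core.
        pub fn iter(&self) -> std::slice::Iter<'_, Vec<usize>> {
            self.map.iter()
        }
    }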
Looking at perf top, it seems that the scheduler can spend a significant amount of time iterating over the CPU topology/cpumask information, especially when the system is running a large number of tasks:

    2.57% scx_rustland [.] <scx_utils::cpumask::CpumaskIntoIterator as core::iter::traits::iterator::Iterator>::next

Considering that scx_rustland doesn't support CPU hotplugging yet (it requires a full restart to properly handle CPU hotplug events), we can completely avoid this overhead by caching a TopologyMap object once, when the scheduler starts, instead of constantly re-evaluating the CPU topology information.

This reduces the scheduler overhead by ~5% CPU utilization under heavy load conditions (from ~65% to ~60%, according to top).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
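A minimal sketch of the caching pattern this commit describes: build the TopologyMap once at startup and reuse it on every scheduling pass. The Scheduler struct, init(), dispatch(), and the anyhow error type are hypothetical names for illustration; only Topology::new(), TopologyMap::new(), and iter() from the example above are assumed:

    // Sketch only: assumes Topology and TopologyMap are exported from scx_utils
    // as in the example usage above.
    use scx_utils::{Topology, TopologyMap};

    struct Scheduler {
        // Cached once at startup; never re-read on the hot path.
        topo_map: TopologyMap,
    }

    impl Scheduler {
        fn init() -> anyhow::Result<Self> {
            let topo = Topology::new()?;
            let topo_map = TopologyMap::new(topo)?;
            Ok(Self { topo_map })
        }

        // Hot path: iterate the cached cores/CPUs mapping instead of
        // re-evaluating the topology or walking cpumasks each time.
        fn dispatch(&self) {
            for (core_id, cpus) in self.topo_map.iter().enumerate() {
                for cpu in cpus {
                    // select/dispatch tasks for `cpu` on `core_id` here
                    let _ = (core_id, cpu);
                }
            }
        }
    }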
arighi force-pushed the rustland-reduce-topology-overhead branch from 538232c to 5302ff1 on April 24, 2024 15:10
(meh... pushed the wrong branch, should be good now)
Looking at perf top, it seems that the scheduler can spend a significant amount of time in topology::Cache::span(), especially when the system is running a large number of tasks:

    5.46% scx_rustland [.] scx_utils::topology::Cache::span

Considering that scx_rustland doesn't support CPU hotplugging yet (it requires a full restart to properly handle CPU hotplug events), we can completely avoid this overhead by caching the cores/CPUs mapping at the beginning, when the scheduler starts, instead of constantly re-evaluating the CPU topology information.

This reduces the scheduler overhead by ~10% CPU utilization under heavy load conditions (from ~68% to ~60%, according to top).