
topology: Don't allocate on calls to span() #241

Merged: 1 commit merged into main from cpumask_efficient on Apr 24, 2024
Conversation

@Byte-Lab (Contributor) commented Apr 24, 2024

We're currently cloning cpumasks returned by calls to {Core, Cache, Node, Topology}::span(). If a caller needs to clone it, they can. Let's not penalize the callers that just want to query the underlying cpumask.
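Concretely, the change is roughly this shape (a sketch with a stand-in Cpumask; the real definitions live in the topology crate):

```rust
// Sketch only: a stand-in for the crate's Cpumask type.
#[derive(Clone)]
pub struct Cpumask {
    bits: u64,
}

pub struct Core {
    span: Cpumask,
}

impl Core {
    // Before: every call handed back a clone of the mask.
    //
    //     pub fn span(&self) -> Cpumask {
    //         self.span.clone()
    //     }
    //
    // After: hand back a reference. Callers that need an owned copy
    // can clone it explicitly; everyone else just borrows.
    pub fn span(&self) -> &Cpumask {
        &self.span
    }
}
```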

We might be able to further optimize -- the trait implementations for bitwise operations on Cpumasks don't take references on the rhs, so we're doing extra clones where we shouldn't need to. I also wasn't able to figure out how to iterate over the bits set in the underlying cpumask without using .into_iter(), which requires cloning as well.
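For reference, here's roughly what those two improvements could look like. This is a hypothetical sketch using a toy fixed-width mask, not the crate's actual Cpumask:

```rust
use std::ops::BitAnd;

// Toy fixed-width mask standing in for the real Cpumask.
#[derive(Clone)]
pub struct Cpumask {
    bits: u64,
}

// Implementing the operator on references means neither operand is
// consumed, so no clone is needed just to combine two masks.
impl BitAnd for &Cpumask {
    type Output = Cpumask;

    fn bitand(self, rhs: &Cpumask) -> Cpumask {
        Cpumask { bits: self.bits & rhs.bits }
    }
}

impl Cpumask {
    // A borrowing iterator over the set bit positions, so callers can
    // walk the mask without .into_iter() (and the clone it implies).
    pub fn iter_set(&self) -> impl Iterator<Item = usize> + '_ {
        (0..64usize).filter(move |&i| self.bits & (1u64 << i) != 0)
    }
}
```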

Signed-off-by: David Vernet <void@manifault.com>
@arighi (Collaborator) left a comment

Much better: with this change applied, the overhead of re-creating the iterator every single time drops from 5.5% to 2.5% in scx_rustland.

It's still more efficient to cache the topology map in a Vec<> (as in #238), considering that the topology doesn't change (CPU hotplugging is not supported yet). But I was wondering if I should implement that "caching" in the topology crate, like providing a new TopologyMap object or similar, that could potentially be used by other schedulers as well, for more efficient iteration across all the cores and CPUs...
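Roughly, such a TopologyMap could look like this (a hypothetical sketch; the names are illustrative, not a settled API):

```rust
// Hypothetical sketch: flatten the topology once at startup, since it
// can't change at runtime (no CPU hotplug support yet).
pub struct TopologyMap {
    // cores[core_id] holds the CPU ids belonging to that core.
    cores: Vec<Vec<usize>>,
}

impl TopologyMap {
    // Built once at init; construction from the real Topology type is
    // omitted here since it depends on the crate's API.
    pub fn new(cores: Vec<Vec<usize>>) -> Self {
        TopologyMap { cores }
    }

    // Cheap hot-path iteration over every core and its CPUs, with no
    // cpumask access at all.
    pub fn iter_cores(&self) -> impl Iterator<Item = (usize, &[usize])> + '_ {
        self.cores
            .iter()
            .enumerate()
            .map(|(id, cpus)| (id, cpus.as_slice()))
    }
}
```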

For now this one LGTM, thanks!

@Byte-Lab (Contributor, Author) commented

> Much better: with this change applied, the overhead of re-creating the iterator every single time drops from 5.5% to 2.5% in scx_rustland.
>
> It's still more efficient to cache the topology map in a Vec<> (as in #238), considering that the topology doesn't change (CPU hotplugging is not supported yet). But I was wondering if I should implement that "caching" in the topology crate, like providing a new TopologyMap object or similar, that could potentially be used by other schedulers as well, for more efficient iteration across all the cores and CPUs...
>
> For now this one LGTM, thanks!

Sure, sounds like it'd be useful!

@Byte-Lab Byte-Lab merged commit a8daf37 into main on Apr 24, 2024
1 check passed
@Byte-Lab Byte-Lab deleted the cpumask_efficient branch April 24, 2024 14:21