
topology: Don't allocate on calls to span() #241

Merged: 1 commit merged into main from cpumask_efficient on Apr 24, 2024
Conversation

@Byte-Lab (Contributor) commented Apr 24, 2024

We're currently cloning cpumasks returned by calls to {Core, Cache, Node, Topology}::span(). If a caller needs to clone it, they can. Let's not penalize the callers that just want to query the underlying cpumask.
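Concretely, the change is roughly this shape (a sketch with a stand-in Cpumask; the real definitions live in the topology crate):

```rust
// Sketch only: a stand-in for the crate's Cpumask type.
#[derive(Clone)]
pub struct Cpumask {
    bits: u64,
}

pub struct Core {
    span: Cpumask,
}

impl Core {
    // Before: every call handed back a clone of the mask.
    //
    //     pub fn span(&self) -> Cpumask {
    //         self.span.clone()
    //     }
    //
    // After: hand back a reference. Callers that need an owned copy
    // can clone it explicitly; everyone else just borrows.
    pub fn span(&self) -> &Cpumask {
        &self.span
    }
}
```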

We might be able to further optimize -- the trait implementations for bitwise operations on Cpumasks don't take references on the rhs, so we're doing extra clones where we shouldn't need to. I also wasn't able to figure out how to iterate over the bits set in the underlying cpumask without using .into_iter(), which requires cloning as well.
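For reference, here's roughly what those two improvements could look like. This is a hypothetical sketch using a toy fixed-width mask, not the crate's actual Cpumask:

```rust
use std::ops::BitAnd;

// Toy fixed-width mask standing in for the real Cpumask.
#[derive(Clone)]
pub struct Cpumask {
    bits: u64,
}

// Implementing the operator on references means neither operand is
// consumed, so no clone is needed just to combine two masks.
impl BitAnd for &Cpumask {
    type Output = Cpumask;

    fn bitand(self, rhs: &Cpumask) -> Cpumask {
        Cpumask { bits: self.bits & rhs.bits }
    }
}

impl Cpumask {
    // A borrowing iterator over the set bit positions, so callers can
    // walk the mask without .into_iter() (and the clone it implies).
    pub fn iter_set(&self) -> impl Iterator<Item = usize> + '_ {
        (0..64usize).filter(move |&i| self.bits & (1u64 << i) != 0)
    }
}
```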

Signed-off-by: David Vernet <void@manifault.com>
@arighi (Collaborator) left a comment

Much better: with this change applied, the overhead of re-creating the iterator every single time drops from 5.5% to 2.5% in scx_rustland.

It's still more efficient to cache the topology map in a Vec<> (as in #238), considering that the topology doesn't change (CPU hotplugging is not supported yet). But I was wondering if I should implement that "caching" in the topology crate, like providing a new TopologyMap object or similar, that could potentially be used by other schedulers as well, for more efficient iteration across all the cores and CPUs...
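Roughly, such a TopologyMap could look like this (a hypothetical sketch; the names are illustrative, not a settled API):

```rust
// Hypothetical sketch: flatten the topology once at startup, since it
// can't change at runtime (no CPU hotplug support yet).
pub struct TopologyMap {
    // cores[core_id] holds the CPU ids belonging to that core.
    cores: Vec<Vec<usize>>,
}

impl TopologyMap {
    // Built once at init; construction from the real Topology type is
    // omitted here since it depends on the crate's API.
    pub fn new(cores: Vec<Vec<usize>>) -> Self {
        TopologyMap { cores }
    }

    // Cheap hot-path iteration over every core and its CPUs, with no
    // cpumask access at all.
    pub fn iter_cores(&self) -> impl Iterator<Item = (usize, &[usize])> + '_ {
        self.cores
            .iter()
            .enumerate()
            .map(|(id, cpus)| (id, cpus.as_slice()))
    }
}
```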

For now this one LGTM, thanks!

@Byte-Lab (Contributor, Author) commented

> Much better: with this change applied, the overhead of re-creating the iterator every single time drops from 5.5% to 2.5% in scx_rustland.
>
> It's still more efficient to cache the topology map in a Vec<> (as in #238), considering that the topology doesn't change (CPU hotplugging is not supported yet). But I was wondering if I should implement that "caching" in the topology crate, like providing a new TopologyMap object or similar, that could potentially be used by other schedulers as well, for more efficient iteration across all the cores and CPUs...
>
> For now this one LGTM, thanks!

Sure, sounds like it'd be useful!

@Byte-Lab Byte-Lab merged commit a8daf37 into main on Apr 24, 2024
1 check passed
@Byte-Lab Byte-Lab deleted the cpumask_efficient branch April 24, 2024 14:21