Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Use collect to build mesh attributes #5255

Closed
wants to merge 1 commit into from

Conversation

kornelski
Copy link
Contributor

Small optimization. .collect() from arrays generates very nice code without reallocations: https://rust.godbolt.org/z/6E6c595bq

Takes advantage of TrustedLen
@IceSentry IceSentry added A-Rendering Drawing game state to the screen C-Code-Quality A section of code that is hard to understand or change labels Jul 9, 2022
Copy link
Contributor

@IceSentry IceSentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if the positions and normals used a complete variable name. This code is not obvious for beginners and using single variable names is not helping.

Also, I'm not convinced that it actually matters that much since this shouldn't be in a hot path anyway, but less code is always nice so 🤷‍♂️

@nicopap
Copy link
Contributor

nicopap commented Jul 9, 2022

@kornelski Note that there is also no reallocation in the for loop when the vectors are pre-allocated with with_capacity (I see only one of the three are pre-allocated. Also, in theory, the collects mean 3 loops instead of one. I'm surprised to see the three collects generate half as much code as the single for loop. But that doesn't mean it is faster. I need to look more carefully.

I personally do not agree with IceSentry. Using short variable names inside the map closure is better, since it prevents rustfmt from expanding the iterator chain on four different lines, which produces clutter, which hampers readability. You can easily see what it stands for by looking (on the same line) what variable the iterator gets collected into. The only important info is which tuple element it is taken from, and it's clearly visible. I think it would be much worse to use full names.

let positions: Vec<_> = vertices.iter().map(|(p, _, _)| *p).collect();
let normals: Vec<_> = vertices.iter().map(|(_, n, _)| *n).collect();
let uvs: Vec<_> = vertices.iter().map(|(_, _, uv)| *uv).collect();
// vs
let positions: Vec<_> = vertices
    .iter()
    .map(|(position, _, _)| *position)
    .collect();
let normals: Vec<_> = vertices
    .iter()
    .map(|(_, normal, _)| *normal)
    .collect();
let uvs: Vec<_> = vertices.iter().map(|(_, _, uv)| *uv).collect();

An alternative would be

let positions: Vec<_> = vertices.iter().map(|v| v.0).collect();
let normals: Vec<_> = vertices.iter().map(|v| v.1).collect();
let uvs: Vec<_> = vertices.iter().map(|v| v.2).collect();

But personal preferences 🤷

@afonsolage
Copy link
Contributor

Sorry if I don't get it, but If the current code also doesn't reallocate, like stated by nicopap, what is the point of the PR?

If it is readability, the current one if pretty clear, but if we wanna a short version, I think the alternative suggested also by nicopap is better:

let positions: Vec<_> = vertices.iter().map(|v| v.0).collect();
let normals: Vec<_> = vertices.iter().map(|v| v.1).collect();
let uvs: Vec<_> = vertices.iter().map(|v| v.2).collect();

@nicopap
Copy link
Contributor

nicopap commented Jul 9, 2022

@afonsolage two of the three bits of code updated in this PR actually weren't pre-allocating, so this fixes that. On top of that, the snippet linked by kornelski shows a significant drop in machine instructions with the PR change (which means faster code, and probably better inlining and constant propagation). Even then, it's not that big of a deal, we can imagine a situation where the mesh generation code is very hot, but that's not likely in most games.

@kornelski
Copy link
Contributor Author

Even when the vector has sufficient capacity, every call to push still includes a check whether the capacity is sufficient, and includes code for performing reallocation of the vector if the capacity wasn't sufficient. The optimizer isn't smart enough to figure out that the capacity will suffice, so treats it as a runtime variable with all the code needed to dynamically grow the vector.

OTOH iteration + collect has special-case optimization via private TrustedLen, which allows collect to allocate exactly once, and then fill the allocation without checking. Absence of checks allows optimizer to generate much nicer code.

Yeah, rustfmt is not smart enough and gets in the way of writing readable code. I'll leave that fight up to you.

Copy link
Contributor

@nicopap nicopap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The linked godbolt snippet demonstrates the compiler optimizes away the for loop with this change(!!!) this is definitively a perf gain, even if it's pretty niche, it's nice to have.

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for the godbolt analysis!

@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jul 9, 2022
@alice-i-cecile
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Jul 11, 2022
Small optimization. `.collect()` from arrays generates very nice code without reallocations: https://rust.godbolt.org/z/6E6c595bq

Co-authored-by: Kornel <kornel@geekhood.net>
@bors bors bot changed the title Use collect to build mesh attributes [Merged by Bors] - Use collect to build mesh attributes Jul 11, 2022
@bors bors bot closed this Jul 11, 2022
inodentry pushed a commit to IyesGames/bevy that referenced this pull request Aug 8, 2022
Small optimization. `.collect()` from arrays generates very nice code without reallocations: https://rust.godbolt.org/z/6E6c595bq

Co-authored-by: Kornel <kornel@geekhood.net>
james7132 pushed a commit to james7132/bevy that referenced this pull request Oct 28, 2022
Small optimization. `.collect()` from arrays generates very nice code without reallocations: https://rust.godbolt.org/z/6E6c595bq

Co-authored-by: Kornel <kornel@geekhood.net>
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023
Small optimization. `.collect()` from arrays generates very nice code without reallocations: https://rust.godbolt.org/z/6E6c595bq

Co-authored-by: Kornel <kornel@geekhood.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Code-Quality A section of code that is hard to understand or change S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants