Skip to content

Conversation

samueltardieu
Copy link
Member

This might positively affect performances of the should_implement_trait lint.

r? blyxyas

changelog: none

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Oct 1, 2025
@blyxyas
Copy link
Member

blyxyas commented Oct 2, 2025

Profile results: +0.04% performance difference on tokio-1.38.1, the lint itself grew 11k -> 31.6k

@y21
Copy link
Member

y21 commented Oct 2, 2025

Could also try if let Some(method_config) = TRAIT_METHODS_DEFS.iter().find(|s| s.method_name == impl_item.ident.name) { and see what that's like.
Very similar to the original code but LLVM turns that into an unrolled binary search so there's still very little actual branching (instead of a linear scan like before).

Then again, 0.04% seems so small that it may as well just be noise, right? Don't know much effort would be worth putting into optimizing this then

@blyxyas
Copy link
Member

blyxyas commented Oct 2, 2025

Then again, 0.04% seems so small that it may as well just be noise, right? Don't know much effort would be worth putting into optimizing this then

I profiled again anyhow-1.0.86 this time. Same result, the lint is very cheap (which is great! It's a very good example of how to write a performant lint), so just the act of initializing the lock is enough to triple its icount

@samueltardieu
Copy link
Member Author

Could also try if let Some(method_config) = TRAIT_METHODS_DEFS.iter().find(|s| s.method_name == impl_item.ident.name) { and see what that's like. Very similar to the original code but LLVM turns that into an unrolled binary search so there's still very little actual branching (instead of a linear scan like before).

The other thing I wanted to try after this is a binary search indeed (an explicit one), but this requires the array to be sorted, and there is no const-stable sort function as far as I can see (to do that at compile time). How would LLVM not to a linear scan in a non-sorted array?

@samueltardieu
Copy link
Member Author

Then again, 0.04% seems so small that it may as well just be noise, right? Don't know much effort would be worth putting into optimizing this then

I profiled again anyhow-1.0.86 this time. Same result, the lint is very cheap (which is great! It's a very good example of how to write a performant lint), so just the act of initializing the lock is enough to triple its icount

If we don't get even better performances, I'll just keep the cleanups and restore the linear scan we had if it is that cheap.

@y21
Copy link
Member

y21 commented Oct 3, 2025

How would LLVM not to a linear scan in a non-sorted array?

It unrolls the loop so that it's just a sequence of cmps and jmps (getting rid of the array entirely), then it can freely reorder the compare instructions around to do a binary search.

Example: https://godbolt.org/z/Kdfvx6zKc (heavily simplified but I've verified locally that LLVM does the same on clippy with cargo asm), note how it inlines the symbol u32 comparisons out of the array and if you follow the jumps for any particular number to search it's always at most 3-5 comparisons

@samueltardieu samueltardieu force-pushed the features/hash-map-lookup branch from a1182d2 to cd37148 Compare October 3, 2025 14:10
@samueltardieu
Copy link
Member Author

Let's check whether it gives better performances or not. In any case, it should not be worst and is cleaner.

@samueltardieu samueltardieu force-pushed the features/hash-map-lookup branch 2 times, most recently from 803f4e3 to e832c0c Compare October 3, 2025 14:26
@samueltardieu samueltardieu force-pushed the features/hash-map-lookup branch from e832c0c to 87d2891 Compare October 3, 2025 14:30
@samueltardieu samueltardieu changed the title Lookup record through a hash map instead of a linear scan Cleanup should_implement_trait lint code Oct 3, 2025
Copy link
Member

@blyxyas blyxyas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! ❤️

View changes since this review

@blyxyas
Copy link
Member

blyxyas commented Oct 6, 2025

It still gives some worst performance (7.5k -> 11k), but it's in the noise range and it could be changed at any moment by compiler optimizations or the user having another program open.

@blyxyas blyxyas added this pull request to the merge queue Oct 6, 2025
Merged via the queue into rust-lang:master with commit aa0b914 Oct 6, 2025
13 checks passed
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Oct 6, 2025
@samueltardieu samueltardieu deleted the features/hash-map-lookup branch October 6, 2025 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants