-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better codegen / perf form Math.Max and Math.Min #33851
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for doing this and increasing the test coverage.
It might be worth looking at a similar optimization for MaxMagnitude
and MinMagnitude
which are functionally similar but which don't propagate NaN
CC. @pgovind |
I will have a look over the weekend / next week. |
Any performance numbers? How much of the improvement is due to AggressiveInlining vs. the other changes? Also, what is the code side growth due to AggressiveInlining? |
Various perf numbers listed starting here: #33057 (comment), with additional analysis and graphs shown below that in the additional conversation. It is a 2-3x perf increase from aggressive inlining alone. The code reorder increases perf an additional 10% or so from the instruction reordering (depending on inputs). Assuming nothing gets dropped from inlining (both inputs are non-constant) the bloat isn't trivial: https://gist.github.com/tannergooding/d81a6cd7530ec1cdc27e08530922f02a, it would be ~67 bytes for just aggressive inlining and ~66 bytes for the branch reorder. The SIMD version is almost half that (~35 bytes), but doesn't perform as well on Intel CPUs (we get less than a 2x perf increase). |
@jkotas, do you have any concerns with the change based on the numbers and code size growth? |
Math.Max/Min is not used that often. I can live with this. |
Closing and reopening to retrigger tests. |
Thanks for the improvement @gfoidl! |
Based on #33057 (comment) I picked the variant
MinReorder
as it has best overall performance on average.With #33476 there will be "another boost" for this method.
Added more testcases for
Max
/Min
.Fixes #33057
/cc: @tannergooding