-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Have CSE correctly track float/simd and mask registers for their own enregistration #117118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3352e17
to
6707408
Compare
ee5c6bb
to
8cde536
Compare
CC. @dotnet/jit-contrib. This should be ready for review. There's a minor (+0.02% for most, but up to +0.04% on x64 Windows) throughput hit, but this gets us more into a correct shape for CSE considerations and removes some historical quirks where we were eating the integer CSE budget due to floating-point and mask types being in the method. So I think it's worth taking and then doing some follow-up targeted fixes (likely around not CSE'ing certain containable constants) where possible. |
@dotnet/samsung Could you please take a look? These changes may be related to riscv64. |
RISC-V Release-CLR-QEMU: 9086 / 9116 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst RISC-V Release-CLR-VF2: 9087 / 9117 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst RISC-V Release-FX-QEMU: 335773 / 336836 (99.68%)
report.xml, report.md, failures.xml, testclr_details.tar.zst RISC-V Release-FX-VF2: 499778 / 501520 (99.65%)
report.xml, report.md, failures.xml, testclr_details.tar.zst Build information and commandsGIT: |
@jakobbotsch I requested your review since you touch CSE most frequently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable to me. Pinged @AndyAyersMS on a question related to the CSE ML experiment.
bed117e is being scheduled for building and testingGIT: |
This updates CSE to correctly track float/simd and mask registers for their own enregistration.
Previously the logic attempted to skip the handling of these types or even misclassified simd/mask under the integer enregistration limits. This could lead to weird diffs and register selection in some cases due to suboptimal CSE considerations.
As is typical with these changes, it's "difficult" to assess whether its an improvement or not just by looking at the diffs, because the CSE decisions impact other optimizations, register selection, etc. The diffs are primarily impacting methods that use floating-point or SIMD and generally look positive overall.
In general, we're ending up with fewer things marked as
CSE moderate
and instead more being marked asCSE aggressive
instead. This is because we're no longer eating up the CSE budget by classifying everything as integer. This leads to many cases with more total CSEs:This in turn shifts some temp numbers and impacts everything else in the method. Most of which look to be positive changes, such as:
Which is because we no longer reload some constants and do other computations unnecessarily:
There's some tuning to be done, however, because there's some places where things like zero constants are being CSE'd where it'd be better to have
morph
mark against that so we can still optimize accordingly (which should be a simple fix):There are many similar cases on x64 and with some similar fine tuning that could be done.