Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: use blend rather then repair for profile inconsistencies #85171

Merged
merged 1 commit into from
Apr 22, 2023

Conversation

AndyAyersMS
Copy link
Member

If we have a partial profile then the current count reconstruction will adjust the exit likelihood of some loop exit when it hits a capped loop. But for multiple exit loops we might wish to see some profile flow out of all exits, not just one.

In ludcmp we choose to send all the profile weights down an early return path, leaving the bulk of the method with zero counts.

Instead of trying increasingly elaborate repair schemes, we will now use blend mode for these sorts of problems; this gives a more balanced count redistribution.

I also updated blend to use the same logic as repair if a block has zero weights, since presumably whatever likelihood was assigned there during reconstruction is not well supported.

Fixes the ludcmp regression with PGO over no PGO, noted in #84264 (comment)

If we have a partial profile then the current count reconstruction will
adjust the exit likelihood of some loop exit when it hits a capped loop.
But for multiple exit loops we might wish to see some profile flow out
of all exits, not just one.

In `ludcmp` we choose to send all the profile weights down an early return
path, leaving the bulk of the method with zero counts.

Instead of trying increasingly elaborate repair schemes, we will now use
blend mode for these sorts of problems; this gives a more balanced count
redistribution.

I also updated blend to use the same logic as repair if a block has zero
weights, since presumably whatever likelihood was assigned there during
reconstruction is not well supported.

Fixes the `ludcmp` regression with PGO over no PGO, noted in
dotnet#84264 (comment)
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 21, 2023
@ghost ghost assigned AndyAyersMS Apr 21, 2023
@ghost
Copy link

ghost commented Apr 21, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

If we have a partial profile then the current count reconstruction will adjust the exit likelihood of some loop exit when it hits a capped loop. But for multiple exit loops we might wish to see some profile flow out of all exits, not just one.

In ludcmp we choose to send all the profile weights down an early return path, leaving the bulk of the method with zero counts.

Instead of trying increasingly elaborate repair schemes, we will now use blend mode for these sorts of problems; this gives a more balanced count redistribution.

I also updated blend to use the same logic as repair if a block has zero weights, since presumably whatever likelihood was assigned there during reconstruction is not well supported.

Fixes the ludcmp regression with PGO over no PGO, noted in #84264 (comment)

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr pgo, runtime-coreclr libraries-pgo

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@AndyAyersMS
Copy link
Member Author

Diffs are fairly surgical: ludcmp and one related method, and a couple others here and there.

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume a few size regressions are expected

@AndyAyersMS
Copy link
Member Author

I assume a few size regressions are expected

Yes. It is mainly just this one method where the old repair process left most of the method cold and so we didn't clone loops like we need to for good perf.

Top method regressions (bytes):
        1784 (124.93 % of base) : 37381.dasm - LUDecomp:ludcmp(double[][],int,int[],byref):int

@AndyAyersMS
Copy link
Member Author

Failure is known issue,

@AndyAyersMS AndyAyersMS merged commit c119e4f into dotnet:main Apr 22, 2023
167 of 170 checks passed
@ghost ghost locked as resolved and limited conversation to collaborators May 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants