JIT: weight loop body cost by block frequency in cloning heuristic#129022
Open
AndyAyersMS wants to merge 1 commit into
Open
JIT: weight loop body cost by block frequency in cloning heuristic#129022AndyAyersMS wants to merge 1 commit into
AndyAyersMS wants to merge 1 commit into
Conversation
Encourage cloning for cases where the loop is large but has a small number of hot blocks. The cost estimate in optCloningHeuristic now scales each block's node count by min(1.0, blockWeight / headerWeight), so cold blocks in the loop body don't over-charge the per-call ratio and reject otherwise profitable clones (e.g. ImmutableList<T>.Node:Contains, where the hoistable GDV check is small but the body also contains rarely-taken recursive call setups). Recovers ~7% on System.Collections.ContainsTrue<Int32>.ImmutableList while preserving most of the original heuristic's clone-count reduction. Fixes dotnet#128996. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Member
Author
|
@jakobbotsch PTAL A relatively minor tweak on the just-added profitability heuristic to address some regressions. |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates the CoreCLR JIT loop cloning heuristic to use a frequency-weighted loop body cost when computing the per-call benefit ratio, so cold blocks in the loop body contribute less to the cloning “cost” decision.
Changes:
- Replaces the cloning heuristic’s denominator from “duplicated body nodes” to a weighted body cost computed as
sum(blockNodeCount * min(1, blockWeight/headerWeight)). - Updates heuristic comments and expands the rejection
JITDUMPto include benefit and weighted cost.
Comment on lines
+3172
to
3176
| // Cost: tree nodes in the loop body, with each block's contribution scaled | ||
| // by min(1.0, blockWeight / headerWeight) so cold blocks don't over-charge | ||
| // the per-call ratio. The static body size is already capped by | ||
| // JitCloneLoopsSizeLimit in the caller. | ||
| // |
Member
Author
There was a problem hiding this comment.
Yes it will have a TP impact ... hopefully small. If not, we can cache the results of the initial size scan perhaps and just reuse that here.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Encourage cloning for cases where the loop is large but has a small number of hot blocks.
The cost estimate in optCloningHeuristic now scales each block's node count by min(1.0, blockWeight / headerWeight), so cold blocks in the loop body don't over-charge the per-call ratio and reject otherwise profitable clones (e.g. ImmutableList.Node:Contains, where the hoistable GDV check is small but the body also contains rarely-taken recursive call setups).
Recovers ~7% on System.Collections.ContainsTrue.ImmutableList while preserving most of the original heuristic's clone-count reduction.
Fixes #128996.