JIT: Avoid rotating/splitting loop bodies during 3-opt layout #113101

amanasifkhalid · 2025-03-03T23:25:02Z

Part of #107749. My recent comparison of 3-opt to an external TSP optimizer (comment) revealed the former's tendency to leave loop bodies broken up if hot jump compaction decided to create fallthrough along one of the loop's exit paths. The initial layout should probably keep loop bodies compact on principle, and rely on 3-opt's profitability heuristics to decide when to break them up. As a motivating example, consider the following layout from benchmarks.run_pgo:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight     IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1     1106 [000..001)-> BB42(0.0607),BB37(0.939)  ( cond )                     i LIR IBC
BB37 [0081]  1       BB01                  0.94  1039 [000..???)-> BB02(1)                 (always)                     LIR IBC internal
BB10 [0015]  1       BB09                  7.05  7796 [000..001)-> BB12(1)                 (always)                     i LIR IBC bwd
BB12 [0017]  2       BB08,BB10            11.46 12670 [000..001)-> BB02(0.939),BB42(0.0607)  ( cond )                     i LIR IBC bwd bwd-src
BB02 [0012]  2       BB12,BB37            11.70 12940 [000..001)-> BB05(0.678),BB03(0.322) ( cond )                     i LIR IBC bwd bwd-target
BB05 [0022]  1       BB02                  7.93  8770 [000..001)-> BB06(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB06 [0038]  1       BB05                  7.93  8770 [000..001)-> BB07(1)                 (always)                     i LIR IBC bwd
BB07 [0023]  2       BB04,BB06            11.70 12940 [000..001)-> BB09(0.623),BB08(0.377) ( cond )                     i LIR IBC bwd
BB09 [0014]  1       BB07                  7.29  8066 [000..001)-> BB11(0.0335),BB10(0.967)  ( cond )                     i LIR IBC bwd
BB11 [0016]  1       BB09                  0.24   270 [000..001)-> BB14(1)                 (always)                     i LIR IBC
BB16 [0002]  1       BB14                  0.24   270 [01B..021)-> BB18(1)                 (always)                     i LIR IBC
BB18 [0004]  2       BB16,BB17             0.70   774 [029..02F)-> BB25(0.00775),BB19(0.992)   ( cond )                     i LIR IBC bwd
BB19 [0005]  1       BB18                  0.69   768 [02F..030)-> BB22(0.678),BB20(0.322) ( cond )                     i LIR IBC bwd bwd-src
BB22 [0042]  1       BB19                  0.47   520 [02F..030)-> BB23(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB23 [0058]  1       BB22                  0.47   520 [02F..030)-> BB24(1)                 (always)                     i LIR IBC bwd
BB24 [0043]  2       BB21,BB23             0.69   768 [02F..044)-> BB17(0.656),BB25(0.344) ( cond )                     i LIR IBC bwd bwd-src
BB17 [0003]  1       BB24                  0.46   504 [021..029)-> BB18(1)                 (always)                     i LIR IBC bwd bwd-target
BB20 [0041]  1       BB19                  0.22   248 [02F..030)-> BB21(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB21 [0050]  1       BB20                  0.22   248 [02F..030)-> BB24(1)                 (always)                     i LIR IBC bwd
BB25 [0006]  2       BB18,BB24             0.24   270 [044..04A)-> BB27(1)                 (always)                     i LIR IBC
BB27 [0008]  2       BB25,BB26             0.57   634 [052..05A)-> BB38(0),BB28(1)         ( cond )                     i LIR IBC bwd
BB28 [0009]  1       BB27                  0.57   634 [05A..05B)-> BB31(0.678),BB29(0.322) ( cond )                     i LIR IBC bwd bwd-src
BB31 [0062]  1       BB28                  0.39   430 [05A..05B)-> BB33(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB33 [0078]  1       BB31                  0.39   430 [05A..05B)-> BB34(1)                 (always)                     i LIR IBC bwd
BB34 [0063]  2       BB30,BB33             0.57   634 [05A..06F)-> BB26(0.574),BB38(0.426) ( cond )                     i LIR IBC bwd bwd-src
BB26 [0007]  1       BB34                  0.33   364 [04A..052)-> BB27(1)                 (always)                     i LIR IBC bwd bwd-target
BB29 [0061]  1       BB28                  0.18   204 [05A..05B)-> BB30(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB30 [0070]  1       BB29                  0.18   204 [05A..05B)-> BB34(1)                 (always)                     i LIR IBC bwd
BB42 [0086]  2       BB01,BB12             0.76   836 [000..001)-> BB14(1)                 (always)                     i LIR IBC
BB14 [0019]  2       BB11,BB42             1.00  1106 [000..012)-> BB16(0.244),BB15(0.756) ( cond )                     i LIR IBC
BB15 [0001]  1       BB14                  0.76   836 [012..01B)-> BB38(1)                 (always)                     i LIR IBC
BB38 [0082]  3       BB15,BB27,BB34        1.00  1106 [06F..070)                           (return)                     i LIR IBC
BB03 [0021]  1       BB02                  3.77  4170 [000..001)-> BB04(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB04 [0030]  1       BB03                  3.77  4170 [000..001)-> BB07(1)                 (always)                     i LIR IBC bwd
BB08 [0013]  1       BB07                  4.41  4874 [000..001)-> BB12(1)                 (always)                     i LIR IBC bwd
BB41 [0085]  6       BB03,BB05,BB20,BB22,BB29,BB31   0        0 [05A..05B)                           (throw )                     i LIR IBC rare gcsafe bwd
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Notice how placing BB03, BB04, and BB08 at the end of the method separates them from the rest of the hot loop and creates a bimodal distribution of code density. With this PR, the loop body stays contiguous:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight     IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1     1106 [000..001)-> BB42(0.0607),BB37(0.939)  ( cond )                     i LIR IBC
BB37 [0081]  1       BB01                  0.94  1039 [000..???)-> BB02(1)                 (always)                     LIR IBC internal
BB05 [0022]  1       BB02                  7.93  8770 [000..001)-> BB06(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB06 [0038]  1       BB05                  7.93  8770 [000..001)-> BB07(1)                 (always)                     i LIR IBC bwd
BB07 [0023]  2       BB04,BB06            11.70 12940 [000..001)-> BB09(0.623),BB08(0.377) ( cond )                     i LIR IBC bwd
BB09 [0014]  1       BB07                  7.29  8066 [000..001)-> BB11(0.0335),BB10(0.967)  ( cond )                     i LIR IBC bwd
BB10 [0015]  1       BB09                  7.05  7796 [000..001)-> BB12(1)                 (always)                     i LIR IBC bwd
BB12 [0017]  2       BB08,BB10            11.46 12670 [000..001)-> BB02(0.939),BB42(0.0607)  ( cond )                     i LIR IBC bwd bwd-src
BB02 [0012]  2       BB12,BB37            11.70 12940 [000..001)-> BB05(0.678),BB03(0.322) ( cond )                     i LIR IBC bwd bwd-target
BB03 [0021]  1       BB02                  3.77  4170 [000..001)-> BB04(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB04 [0030]  1       BB03                  3.77  4170 [000..001)-> BB07(1)                 (always)                     i LIR IBC bwd
BB08 [0013]  1       BB07                  4.41  4874 [000..001)-> BB12(1)                 (always)                     i LIR IBC bwd
BB11 [0016]  1       BB09                  0.24   270 [000..001)-> BB14(1)                 (always)                     i LIR IBC
BB42 [0086]  2       BB01,BB12             0.76   836 [000..001)-> BB14(1)                 (always)                     i LIR IBC
BB14 [0019]  2       BB11,BB42             1.00  1106 [000..012)-> BB16(0.244),BB15(0.756) ( cond )                     i LIR IBC
BB15 [0001]  1       BB14                  0.76   836 [012..01B)-> BB38(1)                 (always)                     i LIR IBC
BB38 [0082]  3       BB15,BB27,BB34        1.00  1106 [06F..070)                           (return)                     i LIR IBC
BB16 [0002]  1       BB14                  0.24   270 [01B..021)-> BB18(1)                 (always)                     i LIR IBC
BB18 [0004]  2       BB16,BB17             0.70   774 [029..02F)-> BB25(0.00775),BB19(0.992)   ( cond )                     i LIR IBC bwd
BB19 [0005]  1       BB18                  0.69   768 [02F..030)-> BB22(0.678),BB20(0.322) ( cond )                     i LIR IBC bwd bwd-src
BB22 [0042]  1       BB19                  0.47   520 [02F..030)-> BB23(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB23 [0058]  1       BB22                  0.47   520 [02F..030)-> BB24(1)                 (always)                     i LIR IBC bwd
BB24 [0043]  2       BB21,BB23             0.69   768 [02F..044)-> BB17(0.656),BB25(0.344) ( cond )                     i LIR IBC bwd bwd-src
BB17 [0003]  1       BB24                  0.46   504 [021..029)-> BB18(1)                 (always)                     i LIR IBC bwd bwd-target
BB20 [0041]  1       BB19                  0.22   248 [02F..030)-> BB21(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB21 [0050]  1       BB20                  0.22   248 [02F..030)-> BB24(1)                 (always)                     i LIR IBC bwd
BB25 [0006]  2       BB18,BB24             0.24   270 [044..04A)-> BB27(1)                 (always)                     i LIR IBC
BB27 [0008]  2       BB25,BB26             0.57   634 [052..05A)-> BB38(0),BB28(1)         ( cond )                     i LIR IBC bwd
BB28 [0009]  1       BB27                  0.57   634 [05A..05B)-> BB31(0.678),BB29(0.322) ( cond )                     i LIR IBC bwd bwd-src
BB31 [0062]  1       BB28                  0.39   430 [05A..05B)-> BB33(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB33 [0078]  1       BB31                  0.39   430 [05A..05B)-> BB34(1)                 (always)                     i LIR IBC bwd
BB34 [0063]  2       BB30,BB33             0.57   634 [05A..06F)-> BB26(0.574),BB38(0.426) ( cond )                     i LIR IBC bwd bwd-src
BB26 [0007]  1       BB34                  0.33   364 [04A..052)-> BB27(1)                 (always)                     i LIR IBC bwd bwd-target
BB29 [0061]  1       BB28                  0.18   204 [05A..05B)-> BB30(1),BB41(0)         ( cond )                     i LIR IBC bwd
BB30 [0070]  1       BB29                  0.18   204 [05A..05B)-> BB34(1)                 (always)                     i LIR IBC bwd
BB41 [0085]  6       BB03,BB05,BB20,BB22,BB29,BB31   0        0 [05A..05B)                           (throw )                     i LIR IBC rare gcsafe bwd
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
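
To make the difference between the two dumps above easier to check, here is a minimal sketch that counts how many contiguous runs the hot loop's blocks form in each layout. The block orders are copied from the dumps; the loop membership set is my own reading of the jump targets (the cycle through BB02 and BB12), and `count_runs` is a hypothetical helper, not something in the JIT.

```python
# Block orders copied from the two dumps above (BBnum column, top to bottom).
BEFORE = [1, 37, 10, 12, 2, 5, 6, 7, 9, 11, 16, 18, 19, 22, 23, 24, 17, 20,
          21, 25, 27, 28, 31, 33, 34, 26, 29, 30, 42, 14, 15, 38, 3, 4, 8, 41]
AFTER = [1, 37, 5, 6, 7, 9, 10, 12, 2, 3, 4, 8, 11, 42, 14, 15, 38, 16, 18,
         19, 22, 23, 24, 17, 20, 21, 25, 27, 28, 31, 33, 34, 26, 29, 30, 41]

# Assumption: blocks of the hot loop headed by BB02, inferred from the
# jump targets in the dump rather than from the JIT's loop table.
LOOP = {2, 3, 4, 5, 6, 7, 8, 9, 10, 12}


def count_runs(layout, loop_blocks):
    """Count maximal contiguous runs of loop blocks in a layout."""
    runs = 0
    inside = False
    for block in layout:
        if block in loop_blocks:
            if not inside:
                runs += 1  # entering a new run of loop blocks
            inside = True
        else:
            inside = False
    return runs


if __name__ == "__main__":
    print("before:", count_runs(BEFORE, LOOP))  # 2: loop body is split
    print("after:", count_runs(AFTER, LOOP))    # 1: loop body is compact
```

A single run means the loop body is compact; two runs in the first layout correspond to BB03, BB04, and BB08 being stranded after the return block.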

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

dotnet-policy-service · 2025-03-03T23:25:35Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

amanasifkhalid · 2025-03-06T01:58:39Z

While triaging layout regressions in #113108, I noticed some instances (comment) where 3-opt decides to create fallthrough along loop backedges, ultimately rotating the loop shape such that it is top-tested. It makes sense that our cost model would incentivize this behavior, since there frequently is a large difference in weight between loop iteration edges and loop entry/exit edges, but this is a clear case where our cost model is too simple to reflect real performance characteristics. Restricting 3-opt from creating fallthrough along backedges out of test blocks helps keep loops bottom-tested. Here's a motivating example from aspnet without this and the above changes:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0021]  1                             1      1 [???..???)-> BB34(0.00595),BB29(0.994)   ( cond )                     i LIR IBC internal
BB34 [0035]  2       BB14,BB01             1.00   1 [06E..073)-> BB16(1)                 (always)                     i LIR IBC
BB19 [0014]  2       BB18,BB35            98.01 117 [08C..0A5)-> BB21(0.48),BB20(0.52)   ( cond )                     i LIR IBC idxlen bwd
BB20 [0015]  1       BB19                 50.97  61 [0A5..0B6)-> BB21(1)                 (always)                     i LIR IBC idxlen bwd
BB21 [0016]  2       BB19,BB20            98.01 117 [0B6..0C2)-> BB35(0.9),BB45(0.1)     ( cond )                     i LIR IBC bwd
BB35 [0036]  2       BB21,BB43            98.01 117 [078..085)-> BB19(0.48),BB18(0.52)   ( cond )                     i LIR IBC bwd
BB18 [0013]  1       BB35                 50.97  61 [085..08C)-> BB19(1)                 (always)                     i LIR IBC bwd
BB45 [0046]  2       BB21,BB41            10.00  12 [0C2..0CD)-> BB16(0.9),BB33(0.1)     ( cond )                     i LIR IBC bwd
BB16 [0011]  2       BB34,BB45            10.00  12 [073..078)-> BB36(0.01),BB43(0.99)   ( cond )                     i LIR IBC bwd bwd-target
BB43 [0044]  1       BB16                  9.90  12 [???..???)-> BB36(0.01),BB35(0.99)   ( cond )                     LIR IBC internal idxlen
BB33 [0034]  1       BB45                  1.00   1 [0CD..0E4)                           (return)                     i LIR IBC
BB38 [0039]  1       BB36                  1.03   1 [085..08C)-> BB39(1)                 (always)                     i LIR IBC bwd
BB39 [0040]  2       BB36,BB38             1.99   2 [08C..0A5)-> BB41(0.48),BB40(0.52)   ( cond )                     i LIR IBC idxlen bwd
BB40 [0041]  1       BB39                  1.03   1 [0A5..0B6)-> BB41(1)                 (always)                     i LIR IBC idxlen bwd
BB41 [0042]  2       BB39,BB40             1.99   2 [0B6..0C2)-> BB36(0.9),BB45(0.1)     ( cond )                     i LIR IBC bwd
BB36 [0037]  3       BB16,BB41,BB43        1.99   2 [078..085)-> BB39(0.48),BB38(0.52)   ( cond )                     i LIR IBC bwd
BB10 [0005]  1       BB29                 57.25  68 [041..048)-> BB11(1)                 (always)                     i LIR IBC bwd
BB11 [0006]  2       BB10,BB29           167.07 200 [048..055)-> BB13(0.404),BB12(0.596) ( cond )                     i LIR IBC idxlen bwd
BB12 [0007]  1       BB11                 99.57 119 [055..05D)-> BB13(1)                 (always)                     i LIR IBC bwd
BB13 [0008]  2       BB11,BB12           167.07 200 [05D..068)-> BB14(1)                 (always)                     i LIR IBC idxlen bwd
BB14 [0009]  1       BB13                167.07 200 [068..06E)-> BB29(0.994),BB34(0.00595)   ( cond )                     i LIR IBC bwd bwd-src osr-entry
BB29 [0030]  2       BB14,BB01           167.07 200 [035..041)-> BB11(0.657),BB10(0.343) ( cond )                     i LIR IBC bwd
BB46 [0047]  0                             0      0 [???..???)                           (throw )                     i LIR IBC rare keep internal
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

And with these changes:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0021]  1                             1      1 [???..???)-> BB34(0.00595),BB29(0.994)   ( cond )                     i LIR IBC internal
BB29 [0030]  2       BB14,BB01           167.07 200 [035..041)-> BB11(0.657),BB10(0.343) ( cond )                     i LIR IBC bwd
BB10 [0005]  1       BB29                 57.25  68 [041..048)-> BB11(1)                 (always)                     i LIR IBC bwd
BB11 [0006]  2       BB10,BB29           167.07 200 [048..055)-> BB13(0.404),BB12(0.596) ( cond )                     i LIR IBC idxlen bwd
BB12 [0007]  1       BB11                 99.57 119 [055..05D)-> BB13(1)                 (always)                     i LIR IBC bwd
BB13 [0008]  2       BB11,BB12           167.07 200 [05D..068)-> BB14(1)                 (always)                     i LIR IBC idxlen bwd
BB14 [0009]  1       BB13                167.07 200 [068..06E)-> BB29(0.994),BB34(0.00595)   ( cond )                     i LIR IBC bwd bwd-src osr-entry
BB34 [0035]  2       BB14,BB01             1.00   1 [06E..073)-> BB16(1)                 (always)                     i LIR IBC
BB16 [0011]  2       BB34,BB45            10.00  12 [073..078)-> BB36(0.01),BB43(0.99)   ( cond )                     i LIR IBC bwd bwd-target
BB43 [0044]  1       BB16                  9.90  12 [???..???)-> BB36(0.01),BB35(0.99)   ( cond )                     LIR IBC internal idxlen
BB35 [0036]  2       BB21,BB43            98.01 117 [078..085)-> BB19(0.48),BB18(0.52)   ( cond )                     i LIR IBC bwd
BB18 [0013]  1       BB35                 50.97  61 [085..08C)-> BB19(1)                 (always)                     i LIR IBC bwd
BB19 [0014]  2       BB18,BB35            98.01 117 [08C..0A5)-> BB21(0.48),BB20(0.52)   ( cond )                     i LIR IBC idxlen bwd
BB20 [0015]  1       BB19                 50.97  61 [0A5..0B6)-> BB21(1)                 (always)                     i LIR IBC idxlen bwd
BB21 [0016]  2       BB19,BB20            98.01 117 [0B6..0C2)-> BB35(0.9),BB45(0.1)     ( cond )                     i LIR IBC bwd
BB45 [0046]  2       BB21,BB41            10.00  12 [0C2..0CD)-> BB16(0.9),BB33(0.1)     ( cond )                     i LIR IBC bwd
BB33 [0034]  1       BB45                  1.00   1 [0CD..0E4)                           (return)                     i LIR IBC
BB36 [0037]  3       BB16,BB41,BB43        1.99   2 [078..085)-> BB39(0.48),BB38(0.52)   ( cond )                     i LIR IBC bwd
BB38 [0039]  1       BB36                  1.03   1 [085..08C)-> BB39(1)                 (always)                     i LIR IBC bwd
BB39 [0040]  2       BB36,BB38             1.99   2 [08C..0A5)-> BB41(0.48),BB40(0.52)   ( cond )                     i LIR IBC idxlen bwd
BB40 [0041]  1       BB39                  1.03   1 [0A5..0B6)-> BB41(1)                 (always)                     i LIR IBC idxlen bwd
BB41 [0042]  2       BB39,BB40             1.99   2 [0B6..0C2)-> BB36(0.9),BB45(0.1)     ( cond )                     i LIR IBC bwd
BB46 [0047]  0                             0      0 [???..???)                           (throw )                     i LIR IBC rare keep internal
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
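
As a rough illustration of why a weight-based cost model favors the rotated shape, the sketch below scores both orderings of the hottest loop by total fallthrough edge weight. It assumes a simplified model in which an edge's weight is the source block's weight times the branch likelihood from the dumps, and a layout is better the more edge weight falls through; the function names are hypothetical, the layouts are restricted to this loop plus its entry and exit blocks, and the real 3-opt cost model may differ in detail.

```python
# Block weights and branch likelihoods copied from the aspnet dumps above.
BLOCK_WEIGHT = {
    "BB01": 1.0, "BB34": 1.0, "BB29": 167.07, "BB10": 57.25,
    "BB11": 167.07, "BB12": 99.57, "BB13": 167.07, "BB14": 167.07,
}
LIKELIHOOD = {
    ("BB01", "BB34"): 0.00595, ("BB01", "BB29"): 0.994,
    ("BB29", "BB11"): 0.657, ("BB29", "BB10"): 0.343,
    ("BB10", "BB11"): 1.0,
    ("BB11", "BB13"): 0.404, ("BB11", "BB12"): 0.596,
    ("BB12", "BB13"): 1.0,
    ("BB13", "BB14"): 1.0,
    ("BB14", "BB29"): 0.994, ("BB14", "BB34"): 0.00595,
}


def edge_weight(src, dst):
    """Assumed edge weight: source block weight times branch likelihood."""
    return BLOCK_WEIGHT[src] * LIKELIHOOD.get((src, dst), 0.0)


def fallthrough_weight(layout):
    """Sum the weight of edges between lexically adjacent blocks."""
    return sum(edge_weight(a, b) for a, b in zip(layout, layout[1:]))


# Loop order from the first dump: the backedge BB14 -> BB29 falls through.
ROTATED = ["BB01", "BB34", "BB10", "BB11", "BB12", "BB13", "BB14", "BB29"]
# Loop order from the second dump: BB14 ends the loop and falls through to the exit.
BOTTOM_TESTED = ["BB01", "BB29", "BB10", "BB11", "BB12", "BB13", "BB14", "BB34"]

if __name__ == "__main__":
    print("rotated:", round(fallthrough_weight(ROTATED), 2))
    print("bottom-tested:", round(fallthrough_weight(BOTTOM_TESTED), 2))
```

Under these assumptions the rotated order scores higher, because the backedge out of BB14 carries far more weight than the entry and exit edges combined. That matches the observation above that the model rewards this shape even where it may not reflect real performance.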

amanasifkhalid · 2025-03-06T16:49:17Z

On second thought, I don't think inhibiting 3-opt from converging to a locally-minimal solution is the right approach (comment). I'm going to try something more surgical, and separate from 3-opt.

Don't break up loop bodies during hot jump compaction

71efce2

Copilot bot review requested due to automatic review settings March 3, 2025 23:25

Copilot AI reviewed Mar 3, 2025

dotnet-issue-labeler bot added the area-CodeGen-coreclr label Mar 3, 2025

dotnet-policy-service bot assigned amanasifkhalid Mar 3, 2025

amanasifkhalid mentioned this pull request Mar 5, 2025

Benchmark Regressions from Profile Maintenance and Block Layout Changes #113108

Open

amanasifkhalid changed the title ~~JIT: Don't break up loop bodies during hot jump compaction~~ JIT: Avoid rotating/splitting loop bodies during 3-opt layout Mar 6, 2025

Don't create fallthrough along backedges from test blocks

4402f21
