Replace a load with cheaper mov instruction when possible #83458

Merged (1 commit) on Mar 21, 2023


SwapnilGaikwad

Fixes #35141

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) on Mar 15, 2023
ghost added the community-contribution label (indicates that the PR has been added by a community member) on Mar 15, 2023
@SwapnilGaikwad

SuperPMI asmdiffs show that the patch replaces the following number of loads with movs in each collection:

asm.benchmarks.run.linux.arm64.checked        : 112
asm.coreclr_tests.run.linux.arm64.checked     : 94
asm.libraries.crossgen2.linux.arm64.checked   : 62
asm.libraries.pmi.linux.arm64.checked         : 87
asm.libraries_tests.pmi.linux.arm64.checked   : 76
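
The replacement counted above can be pictured as a peephole over adjacent instructions. The following is a minimal Python sketch, not the JIT's actual C++ implementation: `Instr`, `reg_num` and `peephole` are hypothetical names, and the rule is inferred from the example diffs in this comment (a second `ldr` from the same address becomes a `mov` from the first load's destination, or is dropped when both loads target the same register).

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    op: str          # "ldr", "mov", ...
    dst: str         # destination register, e.g. "x1" or "w0"
    src: str         # base register for ldr, source register for mov
    offset: int = 0  # immediate offset (only meaningful for ldr)


def reg_num(reg: str) -> str:
    """Strip the w/x width prefix so that w1 and x1 compare as the same register."""
    return reg[1:] if reg[0] in "wx" else reg


def peephole(code: List[Instr]) -> List[Instr]:
    out: List[Instr] = []
    for ins in code:
        prev = out[-1] if out else None
        same_load = (
            prev is not None
            and prev.op == "ldr" and ins.op == "ldr"
            and prev.src == ins.src and prev.offset == ins.offset
            and prev.dst[0] == ins.dst[0]              # same width (w vs x)
            and reg_num(prev.dst) != reg_num(prev.src)  # first load kept the base intact
        )
        if not same_load:
            out.append(ins)
        elif ins.dst != prev.dst:
            # Same value already sits in prev.dst: copy it instead of reloading.
            out.append(Instr("mov", ins.dst, prev.dst))
        # else: identical load into the same register, drop it entirely.
    return out


before = [Instr("ldr", "x0", "fp", 0x18), Instr("ldr", "x1", "fp", 0x18)]
print(peephole(before))
```

The base-register check matters: in `ldr x1, [x1]` followed by another `ldr x1, [x1]`, the first load changes the address, so the second load is not redundant and the sketch leaves it alone. Treating `fp` as a plain name rather than an alias of x29 is a simplification.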

Summary from SuperPMI diff:
Diffs are based on 1,456,321 contexts (399,935 MinOpts, 1,056,386 FullOpts).

Overall (-64 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| benchmarks.run.linux.arm64.checked.mch | 20,597,120 | -12 |
| libraries_tests.pmi.linux.arm64.checked.mch | 163,340,880 | -16 |
| libraries.crossgen2.linux.arm64.checked.mch | 41,317,512 | -12 |
| libraries.pmi.linux.arm64.checked.mch | 66,011,160 | -4 |
| coreclr_tests.run.linux.arm64.checked.mch | 514,286,872 | -20 |

MinOpts (+0 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| libraries_tests.pmi.linux.arm64.checked.mch | 5,695,280 | +0 |
| coreclr_tests.run.linux.arm64.checked.mch | 359,923,456 | +0 |

FullOpts (-64 bytes)

| Collection | Base size (bytes) | Diff size (bytes) |
|---|---|---|
| benchmarks.run.linux.arm64.checked.mch | 19,363,068 | -12 |
| libraries_tests.pmi.linux.arm64.checked.mch | 157,645,600 | -16 |
| libraries.crossgen2.linux.arm64.checked.mch | 41,315,876 | -12 |
| libraries.pmi.linux.arm64.checked.mch | 64,473,844 | -4 |
| coreclr_tests.run.linux.arm64.checked.mch | 154,363,416 | -20 |
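
As a quick sanity check on the size figures above, the per-collection deltas sum to the reported overall figure. Because every A64 instruction is 4 bytes wide, -64 bytes corresponds to a net reduction of 16 instructions, assuming no change in alignment padding. A load rewritten as a `mov` keeps the code size unchanged, so the size win can only come from duplicate loads that are dropped outright. A minimal Python sketch, with illustrative variable names:

```python
# Diff sizes in bytes per collection, copied from the "Overall" figures.
overall_diff = {
    "benchmarks.run": -12,
    "libraries_tests.pmi": -16,
    "libraries.crossgen2": -12,
    "libraries.pmi": -4,
    "coreclr_tests.run": -20,
}
A64_INSTRUCTION_BYTES = 4  # fixed-width A64 encoding

total = sum(overall_diff.values())
net_instructions_removed = -total // A64_INSTRUCTION_BYTES
print(total, net_instructions_removed)  # -64 16

# For the two collections listed under MinOpts, the MinOpts and FullOpts
# base sizes add up to the overall base size.
libraries_tests_base = 5_695_280 + 157_645_600
coreclr_tests_base = 359_923_456 + 154_363_416
print(libraries_tests_base, coreclr_tests_base)  # 163340880 514286872
```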
Example diffs
benchmarks.run.linux.arm64.checked.mch
-12 (-0.49%) : 17542.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
@@ -238,9 +238,8 @@ G_M35333_IG02:        ; bbWeight=1, gcVars=00000000000000000000200000820000 {V00
             ; GC ptr vars +{V11}
             cbz     x0, G_M35333_IG03
             ldr     w2, [x0, #0x08]
-            ldr     w2, [x0, #0x08]
             cbnz    w2, G_M35333_IG04
-						;; size=16 bbWeight=1 PerfScore 8.00
+						;; size=12 bbWeight=1 PerfScore 5.00
 G_M35333_IG03:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             ; gcrRegs -[x0]
             movz    x0, #0xD1FFAB1E
@@ -262,9 +261,8 @@ G_M35333_IG03:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, by
 G_M35333_IG04:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref, isz
             cbz     x19, G_M35333_IG05
             ldr     w2, [x19, #0x08]
-            ldr     w2, [x19, #0x08]
             cbnz    w2, G_M35333_IG06
-						;; size=16 bbWeight=1 PerfScore 8.00
+						;; size=12 bbWeight=1 PerfScore 5.00
 G_M35333_IG05:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -287,9 +285,8 @@ G_M35333_IG06:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, by
             ; gcrRegs +[x1]
             cbz     x1, G_M35333_IG07
             ldr     w2, [x1, #0x08]
-            ldr     w2, [x1, #0x08]
             cbnz    w2, G_M35333_IG08
-						;; size=20 bbWeight=1 PerfScore 10.00
+						;; size=16 bbWeight=1 PerfScore 7.00
 G_M35333_IG07:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             ; gcrRegs -[x1]
             movz    x0, #0xD1FFAB1E
@@ -1150,7 +1147,7 @@ G_M35333_IG62:        ; bbWeight=0, funclet epilog, nogc, extend
             ret     lr
 						;; size=24 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 2436, prolog size 68, PerfScore 3069.60, instruction count 609, allocated bytes for code 2436 (MethodHash=a8cb75fa) for method System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
+; Total bytes of code 2424, prolog size 68, PerfScore 3059.40, instruction count 606, allocated bytes for code 2424 (MethodHash=a8cb75fa) for method System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
 ; ============================================================
 
 Unwind Info:
@@ -1161,7 +1158,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 461 (0x001cd) Actual length = 1844 (0x000734)
+  Function Length   : 458 (0x001ca) Actual length = 1832 (0x000728)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 14082.dasm - MessagePack.Internal.ObjectSerializationInfo:CreateOrNull(System.Type,bool,bool,bool):MessagePack.Internal.ObjectSerializationInfo
@@ -4050,7 +4050,7 @@ G_M29960_IG150:        ; bbWeight=4, gcrefRegs=14980000 {x19 x20 x23 x26 x28}, b
             str     wzr, [fp, #0xD1FFAB1E]	// [V38 loc34]
             ldr     x0, [fp, #0xD1FFAB1E]	// [V12 loc8]
             ; gcrRegs +[x0]
-            ldr     x2, [fp, #0xD1FFAB1E]	// [V12 loc8]
+            mov     x2, x0
             ; gcrRegs +[x2]
             ldr     x2, [x2]
             ; gcrRegs -[x2]
@@ -4067,7 +4067,7 @@ G_M29960_IG150:        ; bbWeight=4, gcrefRegs=14980000 {x19 x20 x23 x26 x28}, b
             ; gcrRegs -[x0]
             cmp     w0, #0
             ble     G_M29960_IG171
-						;; size=52 bbWeight=4 PerfScore 86.00
+						;; size=52 bbWeight=4 PerfScore 80.00
 G_M29960_IG151:        ; bbWeight=16, gcVars=0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000100000 {V10 V39}, gcrefRegs=14980004 {x2 x19 x20 x23 x26 x28}, byrefRegs=0000 {}, gcvars, byref, isz
             add     x0, x2, #16
             ; byrRegs +[x0]
@@ -5624,7 +5624,7 @@ G_M29960_IG204:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ret     lr
 						;; size=28 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 12136, prolog size 100, PerfScore 12153.60, instruction count 3034, allocated bytes for code 12136 (MethodHash=186e8af7) for method MessagePack.Internal.ObjectSerializationInfo:CreateOrNull(System.Type,bool,bool,bool):MessagePack.Internal.ObjectSerializationInfo
+; Total bytes of code 12136, prolog size 100, PerfScore 12147.60, instruction count 3034, allocated bytes for code 12136 (MethodHash=186e8af7) for method MessagePack.Internal.ObjectSerializationInfo:CreateOrNull(System.Type,bool,bool,bool):MessagePack.Internal.ObjectSerializationInfo
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 28674.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.Lexer:QuickScanSyntaxToken():Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.SyntaxToken:this
@@ -246,7 +246,7 @@ G_M28731_IG14:        ; bbWeight=0.50, gcrefRegs=1080000 {x19 x24}, byrefRegs=00
             ldr     x27, [x24, #0x10]
             ; gcrRegs +[x27]
             ldr     w26, [x24, #0x30]
-            ldr     w0, [x24, #0x30]
+            mov     w0, w26
             sub     w24, w21, w0
             ; gcrRegs -[x24]
             ldr     x21, [x19, #0x58]
@@ -292,7 +292,7 @@ G_M28731_IG14:        ; bbWeight=0.50, gcrefRegs=1080000 {x19 x24}, byrefRegs=00
             blr     x6
             ; gcrRegs -[x0-x1 x5 x27-x28]
             ; gcr arg pop 0
-						;; size=140 bbWeight=0.50 PerfScore 26.00
+						;; size=140 bbWeight=0.50 PerfScore 24.75
 G_M28731_IG15:        ; bbWeight=0.50, gcrefRegs=2000000 {x25}, byrefRegs=0000 {}, byref
             mov     x0, x25
             ; gcrRegs +[x0]
@@ -338,7 +338,7 @@ G_M28731_IG19:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
             brk_unix #0
 						;; size=8 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 776, prolog size 32, PerfScore 347.97, instruction count 194, allocated bytes for code 776 (MethodHash=8e5a8fc4) for method Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.Lexer:QuickScanSyntaxToken():Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.SyntaxToken:this
+; Total bytes of code 776, prolog size 32, PerfScore 346.72, instruction count 194, allocated bytes for code 776 (MethodHash=8e5a8fc4) for method Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.Lexer:QuickScanSyntaxToken():Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.SyntaxToken:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 29948.dasm - Microsoft.CodeAnalysis.CSharp.Binder:BindNamespaceOrTypeOrAliasSymbol(Microsoft.CodeAnalysis.CSharp.Syntax.ExpressionSyntax,Microsoft.CodeAnalysis.DiagnosticBag,Roslyn.Utilities.ConsList`1[Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol],bool):Microsoft.CodeAnalysis.CSharp.Binder+NamespaceOrTypeOrAliasSymbolWithAnnotations:this
@@ -1972,7 +1972,7 @@ G_M4799_IG72:        ; bbWeight=0.50, gcrefRegs=1980000 {x19 x20 x23 x24}, byref
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0xD1FFAB1E]	// [V322 tmp293]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0xD1FFAB1E]	// [V322 tmp293]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -1981,7 +1981,7 @@ G_M4799_IG72:        ; bbWeight=0.50, gcrefRegs=1980000 {x19 x20 x23 x24}, byref
             blr     x1
             mov     x15, x0
             ; gcrRegs +[x15]
-						;; size=28 bbWeight=0.50 PerfScore 7.25
+						;; size=28 bbWeight=0.50 PerfScore 6.50
 G_M4799_IG73:        ; bbWeight=0.50, gcrefRegs=1988000 {x15 x19 x20 x23 x24}, byrefRegs=200000 {x21}, byref
             ; gcrRegs -[x0]
             add     x14, x20, #16
@@ -3130,7 +3130,7 @@ G_M4799_IG114:        ; bbWeight=0.50, gcrefRegs=7580000 {x19 x20 x22 x24 x25 x2
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0xD1FFAB1E]	// [V329 tmp300]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0xD1FFAB1E]	// [V329 tmp300]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -3139,7 +3139,7 @@ G_M4799_IG114:        ; bbWeight=0.50, gcrefRegs=7580000 {x19 x20 x22 x24 x25 x2
             blr     x1
             mov     x15, x0
             ; gcrRegs +[x15]
-						;; size=28 bbWeight=0.50 PerfScore 7.25
+						;; size=28 bbWeight=0.50 PerfScore 6.50
 G_M4799_IG115:        ; bbWeight=0.50, gcrefRegs=7588000 {x15 x19 x20 x22 x24 x25 x26}, byrefRegs=200000 {x21}, byref
             ; gcrRegs -[x0]
             add     x14, x26, #16
@@ -3381,7 +3381,7 @@ RWD00  	dd	G_M4799_IG62 - G_M4799_IG02
        	dd	G_M4799_IG96 - G_M4799_IG02
 
 
-; Total bytes of code 7556, prolog size 80, PerfScore 1764.73, instruction count 1889, allocated bytes for code 7556 (MethodHash=44ebed40) for method Microsoft.CodeAnalysis.CSharp.Binder:BindNamespaceOrTypeOrAliasSymbol(Microsoft.CodeAnalysis.CSharp.Syntax.ExpressionSyntax,Microsoft.CodeAnalysis.DiagnosticBag,Roslyn.Utilities.ConsList`1[Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol],bool):Microsoft.CodeAnalysis.CSharp.Binder+NamespaceOrTypeOrAliasSymbolWithAnnotations:this
+; Total bytes of code 7556, prolog size 80, PerfScore 1763.23, instruction count 1889, allocated bytes for code 7556 (MethodHash=44ebed40) for method Microsoft.CodeAnalysis.CSharp.Binder:BindNamespaceOrTypeOrAliasSymbol(Microsoft.CodeAnalysis.CSharp.Syntax.ExpressionSyntax,Microsoft.CodeAnalysis.DiagnosticBag,Roslyn.Utilities.ConsList`1[Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol],bool):Microsoft.CodeAnalysis.CSharp.Binder+NamespaceOrTypeOrAliasSymbolWithAnnotations:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 10239.dasm - ProtoBuf.Meta.MetaType:ApplyDefaultBehaviourImpl(int):this
@@ -747,10 +747,10 @@ G_M17167_IG26:        ; bbWeight=2, gcVars=0000000000000000000000000001000000000
             cbz     w0, G_M17167_IG30
             ldr     x4, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x4]
-            ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x0, x4
             ; gcrRegs +[x0]
             cbz     x0, G_M17167_IG29
-						;; size=52 bbWeight=2 PerfScore 33.00
+						;; size=52 bbWeight=2 PerfScore 30.00
 G_M17167_IG27:        ; bbWeight=1, gcVars=000000000000000000000000000100000000000000000000000080000000001A {V00 V13 V14 V20 V22}, gcrefRegs=3300010 {x4 x20 x21 x24 x25}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x0]
             ; GC ptr vars -{V01 V04 V26 V38 V144}
@@ -905,10 +905,10 @@ G_M17167_IG33:        ; bbWeight=2, gcrefRegs=3300000 {x20 x21 x24 x25}, byrefRe
 G_M17167_IG34:        ; bbWeight=2, gcrefRegs=3300001 {x0 x20 x21 x24 x25}, byrefRegs=0000 {}, byref, isz
             ldr     x2, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x2]
-            ldr     x1, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x1, x2
             ; gcrRegs +[x1]
             cbz     x1, G_M17167_IG36
-						;; size=12 bbWeight=2 PerfScore 10.00
+						;; size=12 bbWeight=2 PerfScore 7.00
 G_M17167_IG35:        ; bbWeight=1, gcrefRegs=3300005 {x0 x2 x20 x21 x24 x25}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x1]
             ldr     x1, [fp, #0xD1FFAB1E]	// [V21 loc19]
@@ -1151,10 +1151,10 @@ G_M17167_IG53:        ; bbWeight=2, gcrefRegs=3300000 {x20 x21 x24 x25}, byrefRe
             cbz     w0, G_M17167_IG56
             ldr     x1, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x1]
-            ldr     x3, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x3, x1
             ; gcrRegs +[x3]
             cbz     x3, G_M17167_IG55
-						;; size=52 bbWeight=2 PerfScore 33.00
+						;; size=52 bbWeight=2 PerfScore 30.00
 G_M17167_IG54:        ; bbWeight=1, gcVars=000000000000000000000000000100000000000000000000000000000000001A {V00 V14 V20 V22}, gcrefRegs=3300002 {x1 x20 x21 x24 x25}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x3]
             ; GC ptr vars -{V13}
@@ -1189,10 +1189,10 @@ G_M17167_IG56:        ; bbWeight=2, gcVars=0000000000000000000000000001000000000
             cbz     w0, G_M17167_IG59
             ldr     x2, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x2]
-            ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x0, x2
             ; gcrRegs +[x0]
             cbz     x0, G_M17167_IG58
-						;; size=52 bbWeight=2 PerfScore 33.00
+						;; size=52 bbWeight=2 PerfScore 30.00
 G_M17167_IG57:        ; bbWeight=1, gcVars=000000000000000000000000000000000000000000000000000080000000001A {V00 V13 V20 V22}, gcrefRegs=3300004 {x2 x20 x21 x24 x25}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x0]
             ; GC ptr vars -{V14}
@@ -1845,10 +1845,10 @@ G_M17167_IG99:        ; bbWeight=2, gcVars=0000000000000000000000000001000000000
             cbz     w0, G_M17167_IG103
             ldr     x1, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x1]
-            ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x0, x1
             ; gcrRegs +[x0]
             cbz     x0, G_M17167_IG102
-						;; size=52 bbWeight=2 PerfScore 33.00
+						;; size=52 bbWeight=2 PerfScore 30.00
 G_M17167_IG100:        ; bbWeight=1, gcrefRegs=3300002 {x1 x20 x21 x24 x25}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
@@ -1907,10 +1907,10 @@ G_M17167_IG103:        ; bbWeight=2, gcrefRegs=3300000 {x20 x21 x24 x25}, byrefR
             cbz     w0, G_M17167_IG140
             ldr     x1, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x1]
-            ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x0, x1
             ; gcrRegs +[x0]
             cbz     x0, G_M17167_IG106
-						;; size=52 bbWeight=2 PerfScore 33.00
+						;; size=52 bbWeight=2 PerfScore 30.00
 G_M17167_IG104:        ; bbWeight=1, gcrefRegs=3300002 {x1 x20 x21 x24 x25}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0xD1FFAB1E]	// [V21 loc19]
@@ -2005,10 +2005,10 @@ G_M17167_IG109:        ; bbWeight=2, gcrefRegs=3300008 {x3 x20 x21 x24 x25}, byr
             cbz     w0, G_M17167_IG139
             ldr     x4, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x4]
-            ldr     x2, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x2, x4
             ; gcrRegs +[x2]
             cbz     x2, G_M17167_IG111
-						;; size=60 bbWeight=2 PerfScore 39.00
+						;; size=60 bbWeight=2 PerfScore 36.00
 G_M17167_IG110:        ; bbWeight=1, gcVars=000000000000000000000000000100000000000000000000000000000000001A {V00 V14 V20 V22}, gcrefRegs=3300010 {x4 x20 x21 x24 x25}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x2]
             ; GC ptr vars -{V13}
@@ -2083,10 +2083,10 @@ G_M17167_IG114:        ; bbWeight=2, gcrefRegs=3300008 {x3 x20 x21 x24 x25}, byr
             cbz     w0, G_M17167_IG138
             ldr     x4, [fp, #0xD1FFAB1E]	// [V21 loc19]
             ; gcrRegs +[x4]
-            ldr     x2, [fp, #0xD1FFAB1E]	// [V21 loc19]
+            mov     x2, x4
             ; gcrRegs +[x2]
             cbz     x2, G_M17167_IG116
-						;; size=60 bbWeight=2 PerfScore 39.00
+						;; size=60 bbWeight=2 PerfScore 36.00
 G_M17167_IG115:        ; bbWeight=1, gcVars=000000000000000000000000000100000000000000000000000000000000001A {V00 V14 V20 V22}, gcrefRegs=3300010 {x4 x20 x21 x24 x25}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x2]
             ; GC ptr vars -{V13}
@@ -3574,7 +3574,7 @@ G_M17167_IG183:        ; bbWeight=0, gcVars=000000000000000000000000000000000000
             brk_unix #0
 						;; size=160 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 8140, prolog size 80, PerfScore 5526.50, instruction count 2035, allocated bytes for code 8140 (MethodHash=70dabcf0) for method ProtoBuf.Meta.MetaType:ApplyDefaultBehaviourImpl(int):this
+; Total bytes of code 8140, prolog size 80, PerfScore 5502.50, instruction count 2035, allocated bytes for code 8140 (MethodHash=70dabcf0) for method ProtoBuf.Meta.MetaType:ApplyDefaultBehaviourImpl(int):this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 30080.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:NoteFieldDefinitions():this
@@ -205,7 +205,7 @@ G_M17784_IG09:        ; bbWeight=4, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=00
 G_M17784_IG10:        ; bbWeight=2, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=0000 {}, byref, isz
             ldr     x0, [fp, #0x40]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x40]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -217,7 +217,7 @@ G_M17784_IG10:        ; bbWeight=2, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=00
             cbnz    w0, G_M17784_IG11
             ldr     x0, [fp, #0x40]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x40]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -229,7 +229,7 @@ G_M17784_IG10:        ; bbWeight=2, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=00
             cbnz    w0, G_M17784_IG11
             ldr     x0, [fp, #0x40]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x40]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -286,7 +286,7 @@ G_M17784_IG10:        ; bbWeight=2, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=00
             blr     x8
             ; gcrRegs -[x0-x2 x25]
             ; gcr arg pop 0
-						;; size=228 bbWeight=2 PerfScore 223.00
+						;; size=228 bbWeight=2 PerfScore 214.00
 G_M17784_IG11:        ; bbWeight=4, gcrefRegs=580000 {x19 x20 x22}, byrefRegs=0000 {}, byref, isz
             add     w24, w24, #1
             cmp     w23, w24
@@ -462,7 +462,7 @@ G_M17784_IG26:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             ret     lr
 						;; size=24 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 956, prolog size 52, PerfScore 617.60, instruction count 239, allocated bytes for code 956 (MethodHash=6908ba87) for method Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:NoteFieldDefinitions():this
+; Total bytes of code 956, prolog size 52, PerfScore 608.60, instruction count 239, allocated bytes for code 956 (MethodHash=6908ba87) for method Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:NoteFieldDefinitions():this
 ; ============================================================
 
 Unwind Info:
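
The PerfScore changes in the benchmarks listings above are consistent with a simple model: each block saves its bbWeight times the cost of the instructions removed or cheapened, and the method total additionally drops by a small per-byte amount when the code shrinks. The constants in this Python sketch are inferred from the numbers in the listings and are assumptions, not values taken from the JIT source. `block_delta` is a hypothetical helper.

```python
import math

# Costs inferred from the listings above (assumptions, not JIT constants).
LDR_STACK = 2.0           # ldr from [fp, #imm]
LDR_OTHER = 3.0           # ldr through another base register
MOV = 0.5                 # register-to-register mov
SIZE_COST_PER_BYTE = 0.1  # contribution of code size to the method total


def block_delta(bb_weight, removed=(), replaced=()):
    """PerfScore saved in one block.

    removed:  costs of loads that were dropped entirely
    replaced: costs of loads that were rewritten as a mov
    """
    saved = sum(removed) + sum(cost - MOV for cost in replaced)
    return bb_weight * saved


# ReadXdgDirectory IG02: one duplicate load dropped, 8.00 -> 5.00
print(block_delta(1, removed=[LDR_OTHER]))
# CreateOrNull IG150: one stack load became a mov at bbWeight=4, 86.00 -> 80.00
print(block_delta(4, replaced=[LDR_STACK]))
# QuickScanSyntaxToken IG14: one field load became a mov at bbWeight=0.50, 26.00 -> 24.75
print(block_delta(0.5, replaced=[LDR_OTHER]))
# NoteFieldDefinitions IG10: three stack loads became movs at bbWeight=2, 223.00 -> 214.00
print(block_delta(2, replaced=[LDR_STACK] * 3))

# ReadXdgDirectory method total: three dropped loads plus 12 fewer bytes of code.
method_delta = 3 * block_delta(1, removed=[LDR_OTHER]) + 12 * SIZE_COST_PER_BYTE
print(math.isclose(method_delta, 3069.60 - 3059.40, abs_tol=1e-6))
```

Under these assumed costs, a dropped load saves more than a load rewritten as a `mov`, which is why the methods with a negative byte delta also show the largest per-site PerfScore gains.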
libraries_tests.pmi.linux.arm64.checked.mch
-12 (-0.80%) : 274355.dasm - Humanizer.Localisation.NumberToWords.SpanishNumberToWordsConverter:ConvertToOrdinal(int,int,int):System.String:this
@@ -255,7 +255,6 @@ G_M30388_IG11:        ; bbWeight=0.50, gcrefRegs=400000 {x22}, byrefRegs=0000 {}
             mov     x24, x23
             ; gcrRegs +[x24]
             ldr     w1, [fp, #0x10]
-            ldr     w1, [fp, #0x10]	// [V05 loc1]
             movz    w11, #0xD1FFAB1E
             movk    w11, #0xD1FFAB1E LSL #16
             smull   x1, w11, w1
@@ -289,7 +288,7 @@ G_M30388_IG11:        ; bbWeight=0.50, gcrefRegs=400000 {x22}, byrefRegs=0000 {}
             mov     x2, x0
             ; gcrRegs +[x2]
             ldr     w1, [fp, #0x10]	// [V05 loc1]
-            ldr     w0, [fp, #0x10]	// [V05 loc1]
+            mov     w0, w1
             ; gcrRegs -[x0]
             movz    w3, #0xD1FFAB1E
             movk    w3, #0xD1FFAB1E LSL #16
@@ -302,7 +301,7 @@ G_M30388_IG11:        ; bbWeight=0.50, gcrefRegs=400000 {x22}, byrefRegs=0000 {}
             str     w1, [fp, #0x10]	// [V05 loc1]
             mov     x24, x2
             ; gcrRegs +[x24]
-						;; size=220 bbWeight=0.50 PerfScore 25.50
+						;; size=216 bbWeight=0.50 PerfScore 23.75
 G_M30388_IG12:        ; bbWeight=0.50, gcrefRegs=1C00000 {x22 x23 x24}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x2]
             mov     x2, x24
@@ -346,7 +345,6 @@ G_M30388_IG15:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             mov     x24, x23
             ; gcrRegs +[x24]
             ldr     w1, [fp, #0x10]	// [V05 loc1]
-            ldr     w1, [fp, #0x10]	// [V05 loc1]
             movz    w11, #0xD1FFAB1E
             movk    w11, #0xD1FFAB1E LSL #16
             smull   x1, w11, w1
@@ -379,7 +377,7 @@ G_M30388_IG15:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             mov     x2, x0
             ; gcrRegs +[x2]
             ldr     w1, [fp, #0x10]	// [V05 loc1]
-            ldr     w0, [fp, #0x10]	// [V05 loc1]
+            mov     w0, w1
             ; gcrRegs -[x0]
             movz    w3, #0xD1FFAB1E
             movk    w3, #0xD1FFAB1E LSL #16
@@ -392,7 +390,7 @@ G_M30388_IG15:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             str     w1, [fp, #0x10]	// [V05 loc1]
             mov     x24, x2
             ; gcrRegs +[x24]
-						;; size=176 bbWeight=0.50 PerfScore 22.50
+						;; size=172 bbWeight=0.50 PerfScore 20.75
 G_M30388_IG16:        ; bbWeight=0.50, gcrefRegs=1C00000 {x22 x23 x24}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x2]
             mov     x2, x24
@@ -436,7 +434,6 @@ G_M30388_IG19:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             mov     x2, x23
             ; gcrRegs +[x2]
             ldr     w1, [fp, #0x10]	// [V05 loc1]
-            ldr     w1, [fp, #0x10]	// [V05 loc1]
             movz    w11, #0xD1FFAB1E
             movk    w11, #0xD1FFAB1E LSL #16
             smull   x1, w11, w1
@@ -472,7 +469,7 @@ G_M30388_IG19:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             mov     x2, x0
             ; gcrRegs +[x2]
             ldr     w1, [fp, #0x10]	// [V05 loc1]
-            ldr     w0, [fp, #0x10]	// [V05 loc1]
+            mov     w0, w1
             ; gcrRegs -[x0]
             movz    w3, #0xD1FFAB1E
             movk    w3, #0xD1FFAB1E LSL #16
@@ -483,7 +480,7 @@ G_M30388_IG19:        ; bbWeight=0.50, gcrefRegs=C00000 {x22 x23}, byrefRegs=000
             mov     w3, #10
             msub    w1, w0, w3, w1
             str     w1, [fp, #0x10]	// [V05 loc1]
-						;; size=172 bbWeight=0.50 PerfScore 22.25
+						;; size=168 bbWeight=0.50 PerfScore 20.50
 G_M30388_IG20:        ; bbWeight=0.50, gcrefRegs=400004 {x2 x22}, byrefRegs=0000 {}, byref, isz
             ldr     w1, [x22, #0x14]
             add     w1, w1, #1
@@ -593,7 +590,7 @@ G_M30388_IG30:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             brk_unix #0
 						;; size=24 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 1504, prolog size 24, PerfScore 347.93, instruction count 376, allocated bytes for code 1504 (MethodHash=e313894b) for method Humanizer.Localisation.NumberToWords.SpanishNumberToWordsConverter:ConvertToOrdinal(int,int,int):System.String:this
+; Total bytes of code 1492, prolog size 24, PerfScore 341.48, instruction count 373, allocated bytes for code 1492 (MethodHash=e313894b) for method Humanizer.Localisation.NumberToWords.SpanishNumberToWordsConverter:ConvertToOrdinal(int,int,int):System.String:this
 ; ============================================================
 
 Unwind Info:
@@ -604,7 +601,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 376 (0x00178) Actual length = 1504 (0x0005e0)
+  Function Length   : 373 (0x00175) Actual length = 1492 (0x0005d4)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-0.19%) : 245583.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Empty(bool):this
@@ -747,7 +747,6 @@ G_M58786_IG30:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=100001 {x0 x
             beq     G_M58786_IG31
             ldr     x0, [fp, #0x28]	// [V02 loc0+0x08]
             ldr     w0, [fp, #0x30]	// [V02 loc0+0x10]
-            ldr     w0, [fp, #0x30]	// [V02 loc0+0x10]
             ldr     w1, [fp, #0x78]	// [V02 loc0+0x58]
             cmp     w0, w1
             blt     G_M58786_IG31
@@ -757,7 +756,7 @@ G_M58786_IG30:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=100001 {x0 x
             movk    x1, #0xD1FFAB1E LSL #32
             ldr     x1, [x1]
             blr     x1
-						;; size=88 bbWeight=0.50 PerfScore 12.25
+						;; size=84 bbWeight=0.50 PerfScore 11.25
 G_M58786_IG31:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             mov     w22, #1
             b       G_M58786_IG34
@@ -1004,7 +1003,7 @@ G_M58786_IG45:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             brk_unix #0
 						;; size=8 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 2132, prolog size 48, PerfScore 541.20, instruction count 533, allocated bytes for code 2132 (MethodHash=6c091a5d) for method System.Memory.Tests.SequenceReader.IsNext:IsNext_Empty(bool):this
+; Total bytes of code 2128, prolog size 48, PerfScore 539.80, instruction count 532, allocated bytes for code 2128 (MethodHash=6c091a5d) for method System.Memory.Tests.SequenceReader.IsNext:IsNext_Empty(bool):this
 ; ============================================================
 
 Unwind Info:
@@ -1015,7 +1014,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 533 (0x00215) Actual length = 2132 (0x000854)
+  Function Length   : 532 (0x00214) Actual length = 2128 (0x000850)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-0.07%) : 245584.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Span():this
@@ -1515,7 +1515,6 @@ G_M41749_IG28:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=80001 {x0 x1
             cbz     w0, G_M41749_IG31
             ldr     x0, [fp, #0x30]	// [V02 loc1+0x08]
             ldr     w0, [fp, #0x38]	// [V02 loc1+0x10]
-            ldr     w0, [fp, #0x38]	// [V02 loc1+0x10]
             ldr     w1, [fp, #0x80]	// [V02 loc1+0x58]
             cmp     w0, w1
             blt     G_M41749_IG29
@@ -1525,7 +1524,7 @@ G_M41749_IG28:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=80001 {x0 x1
             movk    x1, #0xD1FFAB1E LSL #32
             ldr     x1, [x1]
             blr     x1
-						;; size=80 bbWeight=0.50 PerfScore 11.50
+						;; size=76 bbWeight=0.50 PerfScore 10.50
 G_M41749_IG29:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             mov     w21, #1
             b       G_M41749_IG32
@@ -2822,7 +2821,7 @@ G_M41749_IG99:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             brk_unix #0
 						;; size=8 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 5372, prolog size 56, PerfScore 1464.20, instruction count 1343, allocated bytes for code 5372 (MethodHash=3c225cea) for method System.Memory.Tests.SequenceReader.IsNext:IsNext_Span():this
+; Total bytes of code 5368, prolog size 56, PerfScore 1462.80, instruction count 1342, allocated bytes for code 5368 (MethodHash=3c225cea) for method System.Memory.Tests.SequenceReader.IsNext:IsNext_Span():this
 ; ============================================================
 
 Unwind Info:
@@ -2833,7 +2832,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 1343 (0x0053f) Actual length = 5372 (0x0014fc)
+  Function Length   : 1342 (0x0053e) Actual length = 5368 (0x0014f8)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 324288.dasm - Autofac.Core.Registration.ComponentRegistryBuilder:GetRegistered():System.EventHandler`1[Autofac.Core.ComponentRegisteredEventArgs]:this
@@ -38,10 +38,10 @@ G_M24249_IG02:        ; bbWeight=1, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, byre
 G_M24249_IG03:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
             ldr     x0, [fp, #0x18]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x18]
+            mov     x1, x0
             ; gcrRegs +[x1]
             cbz     x1, G_M24249_IG05
-						;; size=12 bbWeight=0.50 PerfScore 2.50
+						;; size=12 bbWeight=0.50 PerfScore 1.75
 G_M24249_IG04:        ; bbWeight=0.25, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x1]
             ldr     x1, [fp, #0x18]
@@ -79,7 +79,7 @@ G_M24249_IG08:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
             brk_unix #0
 						;; size=32 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 148, prolog size 12, PerfScore 35.55, instruction count 37, allocated bytes for code 148 (MethodHash=dda8a146) for method Autofac.Core.Registration.ComponentRegistryBuilder:GetRegistered():System.EventHandler`1[Autofac.Core.ComponentRegisteredEventArgs]:this
+; Total bytes of code 148, prolog size 12, PerfScore 34.80, instruction count 37, allocated bytes for code 148 (MethodHash=dda8a146) for method Autofac.Core.Registration.ComponentRegistryBuilder:GetRegistered():System.EventHandler`1[Autofac.Core.ComponentRegisteredEventArgs]:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 373440.dasm - LibraryImportGenerator.IntegrationTests.NativeExportsNE+LPTStr:Reverse_Return(System.String):System.String
@@ -123,9 +123,9 @@ G_M18255_IG11:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
 G_M18255_IG12:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref, isz
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0x60]	// [V02 loc1]
-            ldr     x1, [fp, #0x60]	// [V02 loc1]
+            mov     x1, x0
             cbz     x1, G_M18255_IG14
-						;; size=12 bbWeight=1 PerfScore 5.00
+						;; size=12 bbWeight=1 PerfScore 3.50
 G_M18255_IG13:        ; bbWeight=0.50, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             bl      <unknown method>
             ; gcr arg pop 0
@@ -178,7 +178,7 @@ G_M18255_IG19:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
 						;; size=32 bbWeight=0 PerfScore 0.00
 G_M18255_IG20:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {}, byrefRegs=0000 {}, gcvars, byref, isz
             ldr     x0, [fp, #0x60]	// [V02 loc1]
-            ldr     x1, [fp, #0x60]	// [V02 loc1]
+            mov     x1, x0
             cbz     x1, G_M18255_IG21
             bl      <unknown method>
             ; gcr arg pop 0
@@ -193,7 +193,7 @@ G_M18255_IG21:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             ret     lr
 						;; size=28 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 428, prolog size 44, PerfScore 108.54, instruction count 107, allocated bytes for code 428 (MethodHash=1e33b8b0) for method LibraryImportGenerator.IntegrationTests.NativeExportsNE+LPTStr:Reverse_Return(System.String):System.String
+; Total bytes of code 428, prolog size 44, PerfScore 107.04, instruction count 107, allocated bytes for code 428 (MethodHash=1e33b8b0) for method LibraryImportGenerator.IntegrationTests.NativeExportsNE+LPTStr:Reverse_Return(System.String):System.String
 ; ============================================================
 
 Unwind Info:
+4 (+0.16%) : 311915.dasm - System.IO.Tests.StreamConformanceTests+<WriteAsync>d__48:MoveNext():this
@@ -202,11 +202,12 @@ G_M4893_IG04:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref
 						;; size=36 bbWeight=1 PerfScore 10.50
 G_M4893_IG05:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref, isz
             ldr     w19, [x0, #0x18]
-            ldp     w1, w2, [x0, #0x18]
+            mov     w1, w19
+            ldr     w2, [x0, #0x1C]
             add     w1, w1, w2
             cmp     w1, w19
             ble     G_M4893_IG43
-						;; size=20 bbWeight=1 PerfScore 8.00
+						;; size=24 bbWeight=1 PerfScore 8.50
 G_M4893_IG06:        ; bbWeight=8, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref, isz
             ldr     x2, [x0]
             ; gcrRegs +[x2]
@@ -685,7 +686,7 @@ G_M4893_IG31:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref
             ; byrRegs +[x0]
             ldr     x19, [x0]
             ; gcrRegs +[x19]
-            ldr     x6, [x0]
+            mov     x6, x19
             ; gcrRegs +[x6]
             mov     x0, x6
             ; gcrRegs +[x0]
@@ -713,7 +714,7 @@ G_M4893_IG31:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref
             ; gcrRegs -[x0-x1 x19]
             ; gcr arg pop 0
             b       G_M4893_IG43
-						;; size=72 bbWeight=1 PerfScore 32.00
+						;; size=72 bbWeight=1 PerfScore 29.50
 G_M4893_IG32:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0001 {x0}, byref, isz
             ; byrRegs -[x20] +[x0]
             movz    x1, #0xD1FFAB1E
@@ -1176,7 +1177,7 @@ RWD12  	dd	G_M4893_IG05 - G_M4893_IG02
        	dd	G_M4893_IG32 - G_M4893_IG02
 
 
-; Total bytes of code 2516, prolog size 60, PerfScore 1072.76, instruction count 629, allocated bytes for code 2516 (MethodHash=e9e3ece2) for method System.IO.Tests.StreamConformanceTests+<WriteAsync>d__48:MoveNext():this
+; Total bytes of code 2520, prolog size 60, PerfScore 1071.16, instruction count 630, allocated bytes for code 2520 (MethodHash=e9e3ece2) for method System.IO.Tests.StreamConformanceTests+<WriteAsync>d__48:MoveNext():this
 ; ============================================================
 
 Unwind Info:
@@ -1187,7 +1188,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 605 (0x0025d) Actual length = 2420 (0x000974)
+  Function Length   : 606 (0x0025e) Actual length = 2424 (0x000978)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
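For readers skimming these diffs, the pattern they show can be modelled as a toy peephole pass over disassembly text. This is only an illustrative sketch, not the JIT's implementation: the function name `fold_redundant_loads` and the text-based matching are assumptions made for the example, and the real change presumably operates on the emitter's instruction records rather than on strings. It covers three cases visible above: a second `ldr` from the same address into a different register becomes a `mov`, a second `ldr` into the same register is dropped, and nothing is folded when the first load overwrote the base register of the address.

```python
import re

# Toy model: matches plain "ldr <x|w>N, [addr]" with no pre/post-index writeback.
# Trailing comments such as "// [V02 loc1]" are tolerated.
_LDR = re.compile(r"^ldr\s+([xw])(\d+),\s*(\[[^\]]+\])(?=\s|$)")


def fold_redundant_loads(instrs):
    """Fold an ldr that immediately repeats the previous ldr's address."""
    out = []
    for ins in instrs:
        cur = _LDR.match(ins.strip())
        prev = _LDR.match(out[-1].strip()) if out else None
        if cur and prev:
            size, num, addr = cur.groups()
            psize, pnum, paddr = prev.groups()
            # Register numbers used inside the address; "\b" keeps "#0x18"
            # from being misread as register x18.
            base_regs = re.findall(r"\b[xw](\d+)\b", addr)
            same_slot = size == psize and addr == paddr
            base_clobbered = pnum in base_regs
            if same_slot and not base_clobbered:
                if num != pnum:
                    # Different destination: copy from the register just loaded.
                    out.append(f"mov     {size}{num}, {psize}{pnum}")
                # Same destination: the second load is dropped entirely.
                continue
        out.append(ins)
    return out
```

Only adjacent instructions are considered, which matches the shape of every diff in this comment. The `+4` case in `StreamConformanceTests` shows the flip side: once the first half of a would-be `ldp` has been turned into a `mov`, the remaining load is emitted as a separate `ldr`.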
libraries.crossgen2.linux.arm64.checked.mch
-4 (-0.77%) : 47758.dasm - System.IO.Path:GetFullPathInternal(System.String):System.String
@@ -71,10 +71,9 @@ G_M7994_IG02:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byr
             cbz     x19, G_M7994_IG04
 						;; size=4 bbWeight=1 PerfScore 1.00
 G_M7994_IG03:        ; bbWeight=0.50, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref, isz
-            ldr     w1, [x19, #0x08]
             ldr     w1, [x19, #0x08]
             cbnz    w1, G_M7994_IG05
-						;; size=12 bbWeight=0.50 PerfScore 3.50
+						;; size=8 bbWeight=0.50 PerfScore 2.00
 G_M7994_IG04:        ; bbWeight=0.50, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             adrp    x1, [HIGH RELOC #0xD1FFAB1E]      // const ptr
             add     x1, x1, [LOW RELOC #0xD1FFAB1E]
@@ -265,7 +264,7 @@ G_M7994_IG17:        ; bbWeight=1, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, byref
             ret     lr
 						;; size=16 bbWeight=1 PerfScore 5.00
 
-; Total bytes of code 520, prolog size 16, PerfScore 160.75, instruction count 130, allocated bytes for code 520 (MethodHash=fc5ee0c5) for method System.IO.Path:GetFullPathInternal(System.String):System.String
+; Total bytes of code 516, prolog size 16, PerfScore 158.85, instruction count 129, allocated bytes for code 516 (MethodHash=fc5ee0c5) for method System.IO.Path:GetFullPathInternal(System.String):System.String
 ; ============================================================
 
 Unwind Info:
@@ -276,7 +275,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 130 (0x00082) Actual length = 520 (0x000208)
+  Function Length   : 129 (0x00081) Actual length = 516 (0x000204)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-0.20%) : 51116.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
@@ -144,9 +144,8 @@ G_M35333_IG03:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, by
 G_M35333_IG04:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref, isz
             cbz     x19, G_M35333_IG05
             ldr     w11, [x19, #0x08]
-            ldr     w11, [x19, #0x08]
             cbnz    w11, G_M35333_IG06
-						;; size=16 bbWeight=1 PerfScore 8.00
+						;; size=12 bbWeight=1 PerfScore 5.00
 G_M35333_IG05:        ; bbWeight=1, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, byref
             adrp    x11, [HIGH RELOC #0xD1FFAB1E]      // const ptr
             add     x11, x11, [LOW RELOC #0xD1FFAB1E]
@@ -888,7 +887,7 @@ G_M35333_IG59:        ; bbWeight=0, funclet epilog, nogc, extend
             ret     lr
 						;; size=20 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 1976, prolog size 60, PerfScore 2322.10, instruction count 494, allocated bytes for code 1976 (MethodHash=a8cb75fa) for method System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
+; Total bytes of code 1972, prolog size 60, PerfScore 2318.70, instruction count 493, allocated bytes for code 1972 (MethodHash=a8cb75fa) for method System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
 ; ============================================================
 
 Unwind Info:
@@ -899,7 +898,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 346 (0x0015a) Actual length = 1384 (0x000568)
+  Function Length   : 345 (0x00159) Actual length = 1380 (0x000564)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-0.08%) : 63176.dasm - System.Xml.Serialization.XmlSerializationWriterILGen:GenerateMembersElement(System.Xml.Serialization.XmlMembersMapping):System.String:this
@@ -2011,7 +2011,6 @@ G_M40744_IG41:        ; bbWeight=2, gcVars=00000000000000000100000000000000 {V28
             cbz     x27, G_M40744_IG42
             ldr     x0, [x27, #0x08]
             ; gcrRegs +[x0]
-            ldr     x0, [x27, #0x08]
             cbz     x0, G_M40744_IG42
             ldr     x0, [x19, #0x70]
             adrp    x11, [HIGH RELOC #0xD1FFAB1E]      // function address
@@ -2021,7 +2020,7 @@ G_M40744_IG41:        ; bbWeight=2, gcVars=00000000000000000100000000000000 {V28
             blr     x1
             ; gcrRegs -[x0 x27]
             ; gcr arg pop 0
-						;; size=160 bbWeight=2 PerfScore 108.00
+						;; size=156 bbWeight=2 PerfScore 102.00
 G_M40744_IG42:        ; bbWeight=2, gcrefRegs=3280000 {x19 x21 x24 x25}, byrefRegs=0000 {}, byref
             ldr     x0, [x19, #0x70]
             ; gcrRegs +[x0]
@@ -2087,7 +2086,7 @@ G_M40744_IG48:        ; bbWeight=0, gcVars=00000000000000000000000000000000 {},
             brk_unix #0
 						;; size=20 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 5004, prolog size 56, PerfScore 5114.15, instruction count 1251, allocated bytes for code 5004 (MethodHash=33d560d7) for method System.Xml.Serialization.XmlSerializationWriterILGen:GenerateMembersElement(System.Xml.Serialization.XmlMembersMapping):System.String:this
+; Total bytes of code 5000, prolog size 56, PerfScore 5107.75, instruction count 1250, allocated bytes for code 5000 (MethodHash=33d560d7) for method System.Xml.Serialization.XmlSerializationWriterILGen:GenerateMembersElement(System.Xml.Serialization.XmlMembersMapping):System.String:this
 ; ============================================================
 
 Unwind Info:
@@ -2098,7 +2097,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 1251 (0x004e3) Actual length = 5004 (0x00138c)
+  Function Length   : 1250 (0x004e2) Actual length = 5000 (0x001388)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 123392.dasm - System.Net.Http.Headers.ContentRangeHeaderValue:GetContentRangeLength(System.String,int,byref):int
@@ -122,7 +122,7 @@ G_M36127_IG03:        ; bbWeight=0.50, gcrefRegs=80000 {x19}, byrefRegs=200000 {
             add     w1, w1, #1
             str     w1, [fp, #0x38]	// [V05 loc2]
             ldr     w25, [fp, #0x38]	// [V05 loc2]
-            ldr     w1, [fp, #0x38]	// [V05 loc2]
+            mov     w1, w25
             mov     x0, x19
             ; gcrRegs +[x0]
             adrp    x11, [HIGH RELOC #0xD1FFAB1E]      // function address
@@ -166,7 +166,7 @@ G_M36127_IG03:        ; bbWeight=0.50, gcrefRegs=80000 {x19}, byrefRegs=200000 {
             cbz     w0, G_M36127_IG05
             ldr     w0, [fp, #0x38]	// [V05 loc2]
             sub     w0, w0, w20
-						;; size=384 bbWeight=0.50 PerfScore 53.75
+						;; size=384 bbWeight=0.50 PerfScore 53.00
 G_M36127_IG04:        ; bbWeight=0.50, epilog, nogc, extend
             ldr     x25, [sp, #0x78]
             ldp     x23, x24, [sp, #0x68]
@@ -196,7 +196,7 @@ G_M36127_IG07:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
             brk_unix #0
 						;; size=20 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 532, prolog size 28, PerfScore 129.70, instruction count 133, allocated bytes for code 532 (MethodHash=7ab672e0) for method System.Net.Http.Headers.ContentRangeHeaderValue:GetContentRangeLength(System.String,int,byref):int
+; Total bytes of code 532, prolog size 28, PerfScore 128.95, instruction count 133, allocated bytes for code 532 (MethodHash=7ab672e0) for method System.Net.Http.Headers.ContentRangeHeaderValue:GetContentRangeLength(System.String,int,byref):int
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 150016.dasm - Microsoft.CodeAnalysis.CodeGen.ILBuilder+LocalScopeManager:GetAllScopesWithLocals():System.Collections.Immutable.ImmutableArray`1[Microsoft.Cci.LocalScope]:this
@@ -121,7 +121,7 @@ G_M176_IG03:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             ; byrRegs +[x0]
             ldr     x1, [x0]
             ; gcrRegs +[x1]
-            ldr     x0, [x0]
+            mov     x0, x1
             ; gcrRegs +[x0]
             ; byrRegs -[x0]
             ldr     x11, [x20, #0x08]
@@ -140,7 +140,7 @@ G_M176_IG03:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             ldr     x2, [x11]
             blr     x2
             ; gcrRegs -[x0]
-						;; size=148 bbWeight=0.50 PerfScore 28.00
+						;; size=148 bbWeight=0.50 PerfScore 26.75
 G_M176_IG04:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             adrp    x11, [HIGH RELOC #0xD1FFAB1E]      // function address
             add     x11, x11, [LOW RELOC #0xD1FFAB1E]
@@ -172,7 +172,7 @@ G_M176_IG05:        ; bbWeight=1, epilog, nogc, extend
             ret     lr
 						;; size=12 bbWeight=1 PerfScore 3.00
 
-; Total bytes of code 376, prolog size 28, PerfScore 145.10, instruction count 94, allocated bytes for code 376 (MethodHash=5bebff4f) for method Microsoft.CodeAnalysis.CodeGen.ILBuilder+LocalScopeManager:GetAllScopesWithLocals():System.Collections.Immutable.ImmutableArray`1[Microsoft.Cci.LocalScope]:this
+; Total bytes of code 376, prolog size 28, PerfScore 143.85, instruction count 94, allocated bytes for code 376 (MethodHash=5bebff4f) for method Microsoft.CodeAnalysis.CodeGen.ILBuilder+LocalScopeManager:GetAllScopesWithLocals():System.Collections.Immutable.ImmutableArray`1[Microsoft.Cci.LocalScope]:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 157888.dasm - System.Text.BinHexEncoding:GetMaxByteCount(int):int:this
@@ -33,10 +33,10 @@ G_M45031_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             ldr     w0, [fp, #0x1C]
             tbnz    w0, #0, G_M45031_IG05
             ldr     w0, [fp, #0x1C]
-            ldr     w11, [fp, #0x1C]
+            mov     w11, w0
             add     w0, w11, w0,  LSR #31
             asr     w0, w0, #1
-						;; size=36 bbWeight=1 PerfScore 12.50
+						;; size=36 bbWeight=1 PerfScore 11.00
 G_M45031_IG03:        ; bbWeight=1, epilog, nogc, extend
             ldp     x19, x20, [sp, #0x20]
             ldp     fp, lr, [sp], #0x30
@@ -129,7 +129,7 @@ G_M45031_IG05:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             brk_unix #0
 						;; size=132 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 296, prolog size 12, PerfScore 48.60, instruction count 74, allocated bytes for code 296 (MethodHash=a2365018) for method System.Text.BinHexEncoding:GetMaxByteCount(int):int:this
+; Total bytes of code 296, prolog size 12, PerfScore 47.10, instruction count 74, allocated bytes for code 296 (MethodHash=a2365018) for method System.Text.BinHexEncoding:GetMaxByteCount(int):int:this
 ; ============================================================
 
 Unwind Info:
libraries.pmi.linux.arm64.checked.mch
-4 (-1.37%) : 82027.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
@@ -48,10 +48,9 @@ G_M10356_IG02:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {
             cbz     x19, G_M10356_IG04
 						;; size=4 bbWeight=1 PerfScore 1.00
 G_M10356_IG03:        ; bbWeight=0.46, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref, isz
-            ldr     w0, [x19, #0x08]
             ldr     w0, [x19, #0x08]
             cbnz    w0, G_M10356_IG05
-						;; size=12 bbWeight=0.46 PerfScore 3.22
+						;; size=8 bbWeight=0.46 PerfScore 1.84
 G_M10356_IG04:        ; bbWeight=0.50, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref
             movz    x1, #8
             movk    x1, #0xD1FFAB1E LSL #16
@@ -148,7 +147,7 @@ G_M10356_IG11:        ; bbWeight=0.50, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, b
             ret     lr
 						;; size=12 bbWeight=0.50 PerfScore 1.50
 
-; Total bytes of code 292, prolog size 20, PerfScore 73.92, instruction count 73, allocated bytes for code 292 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
+; Total bytes of code 288, prolog size 20, PerfScore 72.14, instruction count 72, allocated bytes for code 288 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
 ; ============================================================
 
 Unwind Info:
@@ -159,7 +158,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 73 (0x00049) Actual length = 292 (0x000124)
+  Function Length   : 72 (0x00048) Actual length = 288 (0x000120)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 16641.dasm - System.Data.Common.DbCommand:ExecuteScalarAsync(System.Threading.CancellationToken):System.Threading.Tasks.Task`1[System.Object]:this
@@ -343,10 +343,10 @@ G_M42079_IG15:        ; bbWeight=0, gcVars=0000000000020200 {V03 V59}, gcrefRegs
 G_M42079_IG16:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
             ldr     x19, [fp, #0x18]	// [V59 tmp54]
             ; gcrRegs +[x19]
-            ldr     x0, [fp, #0x18]	// [V59 tmp54]
+            mov     x0, x19
             ; gcrRegs +[x0]
             cbz     x0, G_M42079_IG22
-						;; size=12 bbWeight=1 PerfScore 5.00
+						;; size=12 bbWeight=1 PerfScore 3.50
 G_M42079_IG17:        ; bbWeight=0.19, gcVars=0000000000020000 {V03}, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, gcvars, byref, isz
             ; gcrRegs -[x0]
             ; GC ptr vars -{V59}
@@ -514,7 +514,7 @@ G_M42079_IG31:        ; bbWeight=0, gcVars=0000000000000200 {V59}, gcrefRegs=000
 G_M42079_IG32:        ; bbWeight=0, gcVars=0000000000000200 {V59}, gcrefRegs=0000 {}, byrefRegs=0000 {}, gcvars, byref, isz
             ldr     x19, [fp, #0x18]	// [V59 tmp54]
             ; gcrRegs +[x19]
-            ldr     x0, [fp, #0x18]	// [V59 tmp54]
+            mov     x0, x19
             ; gcrRegs +[x0]
             cbz     x0, G_M42079_IG33
             ldr     x0, [x19, #0x08]
@@ -574,7 +574,7 @@ G_M42079_IG33:        ; bbWeight=0, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             ret     lr
 						;; size=20 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 1152, prolog size 40, PerfScore 300.56, instruction count 288, allocated bytes for code 1152 (MethodHash=134a5ba0) for method System.Data.Common.DbCommand:ExecuteScalarAsync(System.Threading.CancellationToken):System.Threading.Tasks.Task`1[System.Object]:this
+; Total bytes of code 1152, prolog size 40, PerfScore 299.06, instruction count 288, allocated bytes for code 1152 (MethodHash=134a5ba0) for method System.Data.Common.DbCommand:ExecuteScalarAsync(System.Threading.CancellationToken):System.Threading.Tasks.Task`1[System.Object]:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 20865.dasm - Microsoft.CodeAnalysis.VisualBasic.Symbols.DeclarationTreeBuilder:VisitDelegateStatement(Microsoft.CodeAnalysis.VisualBasic.Syntax.DelegateStatementSyntax):Microsoft.CodeAnalysis.VisualBasic.Symbols.SingleNamespaceOrTypeDeclaration:this
@@ -258,7 +258,7 @@ G_M26984_IG11:        ; bbWeight=0.50, gcrefRegs=2500000 {x20 x22 x25}, byrefReg
             ldr     w19, [fp, #0x4C]
             ldr     x0, [fp, #0x40]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x40]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -270,7 +270,7 @@ G_M26984_IG11:        ; bbWeight=0.50, gcrefRegs=2500000 {x20 x22 x25}, byrefReg
             str     xzr, [fp, #0x18]	// [V35 tmp30]
             ldr     x0, [fp, #0x40]	// [V41 tmp36]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x40]	// [V41 tmp36]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -288,7 +288,7 @@ G_M26984_IG11:        ; bbWeight=0.50, gcrefRegs=2500000 {x20 x22 x25}, byrefReg
             blr     x3
             ldr     w19, [fp, #0x18]	// [V50 tmp45]
             ldr     w26, [fp, #0x1C]	// [V51 tmp46]
-						;; size=100 bbWeight=0.50 PerfScore 21.25
+						;; size=100 bbWeight=0.50 PerfScore 19.75
 G_M26984_IG12:        ; bbWeight=1, gcrefRegs=2500000 {x20 x22 x25}, byrefRegs=0000 {}, byref
             mov     x0, x20
             ; gcrRegs +[x0]
@@ -397,7 +397,7 @@ G_M26984_IG14:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0002 {
             brk_unix #0
 						;; size=28 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 860, prolog size 40, PerfScore 273.75, instruction count 215, allocated bytes for code 860 (MethodHash=72499697) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.DeclarationTreeBuilder:VisitDelegateStatement(Microsoft.CodeAnalysis.VisualBasic.Syntax.DelegateStatementSyntax):Microsoft.CodeAnalysis.VisualBasic.Symbols.SingleNamespaceOrTypeDeclaration:this
+; Total bytes of code 860, prolog size 40, PerfScore 272.25, instruction count 215, allocated bytes for code 860 (MethodHash=72499697) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.DeclarationTreeBuilder:VisitDelegateStatement(Microsoft.CodeAnalysis.VisualBasic.Syntax.DelegateStatementSyntax):Microsoft.CodeAnalysis.VisualBasic.Symbols.SingleNamespaceOrTypeDeclaration:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 217152.dasm - System.Net.Http.Headers.MediaTypeWithQualityHeaderValue:TryParse(System.String,byref):bool
@@ -57,10 +57,10 @@ G_M51950_IG02:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=80000 {x19
 G_M51950_IG03:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=80000 {x19}, byref, isz
             ldr     x15, [fp, #0x10]	// [V03 loc1]
             ; gcrRegs +[x15]
-            ldr     x14, [fp, #0x10]	// [V03 loc1]
+            mov     x14, x15
             ; gcrRegs +[x14]
             cbz     x14, G_M51950_IG05
-						;; size=12 bbWeight=0.50 PerfScore 2.50
+						;; size=12 bbWeight=0.50 PerfScore 1.75
 G_M51950_IG04:        ; bbWeight=0.25, gcrefRegs=8000 {x15}, byrefRegs=80000 {x19}, byref, isz
             ; gcrRegs -[x14]
             ldr     x14, [fp, #0x10]	// [V03 loc1]
@@ -106,7 +106,7 @@ G_M51950_IG09:        ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
             brk_unix #0
 						;; size=28 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 220, prolog size 16, PerfScore 56.25, instruction count 55, allocated bytes for code 220 (MethodHash=dc143511) for method System.Net.Http.Headers.MediaTypeWithQualityHeaderValue:TryParse(System.String,byref):bool
+; Total bytes of code 220, prolog size 16, PerfScore 55.50, instruction count 55, allocated bytes for code 220 (MethodHash=dc143511) for method System.Net.Http.Headers.MediaTypeWithQualityHeaderValue:TryParse(System.String,byref):bool
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 241984.dasm - System.Composition.Hosting.Core.LifetimeContext:TryGetExport(System.Composition.Hosting.Core.CompositionContract,byref):bool:this
@@ -53,7 +53,7 @@ G_M24388_IG05:        ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=800
             ; gcrRegs +[x19]
             ldr     x0, [fp, #0x18]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x18]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -77,14 +77,14 @@ G_M24388_IG05:        ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=800
             ; gcrRegs -[x0 x15]
             ; byrRegs -[x14 x20]
             mov     w0, #1
-						;; size=68 bbWeight=0.50 PerfScore 11.50
+						;; size=68 bbWeight=0.50 PerfScore 10.75
 G_M24388_IG06:        ; bbWeight=0.50, epilog, nogc, extend
             ldp     x19, x20, [sp, #0x20]
             ldp     fp, lr, [sp], #0x30
             ret     lr
 						;; size=12 bbWeight=0.50 PerfScore 1.50
 
-; Total bytes of code 160, prolog size 16, PerfScore 48.75, instruction count 40, allocated bytes for code 160 (MethodHash=af7ea0bb) for method System.Composition.Hosting.Core.LifetimeContext:TryGetExport(System.Composition.Hosting.Core.CompositionContract,byref):bool:this
+; Total bytes of code 160, prolog size 16, PerfScore 48.00, instruction count 40, allocated bytes for code 160 (MethodHash=af7ea0bb) for method System.Composition.Hosting.Core.LifetimeContext:TryGetExport(System.Composition.Hosting.Core.CompositionContract,byref):bool:this
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 252352.dasm - System.IO.IsolatedStorage.IsolatedStorageFileStream:.ctor(System.String,int):this
@@ -80,7 +80,7 @@ G_M25423_IG05:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {
             ; gcrRegs -[x0]
             ldr     x0, [fp, #0x10]
             ; gcrRegs +[x0]
-            ldr     x1, [fp, #0x10]
+            mov     x1, x0
             ; gcrRegs +[x1]
             ldr     x1, [x1]
             ; gcrRegs -[x1]
@@ -195,7 +195,7 @@ G_M25423_IG05:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {
             bl      CORINFO_HELP_ASSIGN_REF
             ; gcrRegs -[x15 x19]
             ; byrRegs -[x14]
-						;; size=340 bbWeight=1 PerfScore 99.50
+						;; size=340 bbWeight=1 PerfScore 98.00
 G_M25423_IG06:        ; bbWeight=1, epilog, nogc, extend
             ldr     x23, [sp, #0x48]
             ldp     x21, x22, [sp, #0x38]
@@ -204,7 +204,7 @@ G_M25423_IG06:        ; bbWeight=1, epilog, nogc, extend
             ret     lr
 						;; size=20 bbWeight=1 PerfScore 6.00
 
-; Total bytes of code 428, prolog size 32, PerfScore 160.05, instruction count 107, allocated bytes for code 428 (MethodHash=bd7c9cb0) for method System.IO.IsolatedStorage.IsolatedStorageFileStream:.ctor(System.String,int):this
+; Total bytes of code 428, prolog size 32, PerfScore 158.55, instruction count 107, allocated bytes for code 428 (MethodHash=bd7c9cb0) for method System.IO.IsolatedStorage.IsolatedStorageFileStream:.ctor(System.String,int):this
 ; ============================================================
 
 Unwind Info:
coreclr_tests.run.linux.arm64.checked.mch
-4 (-1.45%) : 554178.dasm - JitTest_val_ctor_newobj_il.TestStruct:Main():int
@@ -37,7 +37,6 @@ G_M35170_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ; gcr arg pop 0
 						;; size=24 bbWeight=1 PerfScore 3.50
 G_M35170_IG03:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
-            ldr     x0, [x19, #0x38]
             ldr     x0, [x19, #0x38]
             cmp     x0, #100
             bge     G_M35170_IG04
@@ -50,7 +49,7 @@ G_M35170_IG03:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             ldr     x2, [x2]
             blr     x2
             ; gcr arg pop 0
-						;; size=48 bbWeight=1 PerfScore 15.00
+						;; size=44 bbWeight=1 PerfScore 12.00
 G_M35170_IG04:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
             ldr     x0, [x19, #0x38]
             cmp     x0, #105
@@ -115,7 +114,7 @@ G_M35170_IG10:        ; bbWeight=0, funclet epilog, nogc, extend
             ret     lr
 						;; size=12 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 276, prolog size 20, PerfScore 78.60, instruction count 69, allocated bytes for code 276 (MethodHash=7069769d) for method JitTest_val_ctor_newobj_il.TestStruct:Main():int
+; Total bytes of code 272, prolog size 20, PerfScore 75.20, instruction count 68, allocated bytes for code 272 (MethodHash=7069769d) for method JitTest_val_ctor_newobj_il.TestStruct:Main():int
 ; ============================================================
 
 Unwind Info:
@@ -126,7 +125,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 51 (0x00033) Actual length = 204 (0x0000cc)
+  Function Length   : 50 (0x00032) Actual length = 200 (0x0000c8)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-1.37%) : 561619.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
@@ -48,10 +48,9 @@ G_M10356_IG02:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {
             cbz     x19, G_M10356_IG04
 						;; size=4 bbWeight=1 PerfScore 1.00
 G_M10356_IG03:        ; bbWeight=0.46, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref, isz
-            ldr     w0, [x19, #0x08]
             ldr     w0, [x19, #0x08]
             cbnz    w0, G_M10356_IG05
-						;; size=12 bbWeight=0.46 PerfScore 3.22
+						;; size=8 bbWeight=0.46 PerfScore 1.84
 G_M10356_IG04:        ; bbWeight=0.50, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref
             movz    x1, #0xD1FFAB1E
             movk    x1, #0xD1FFAB1E LSL #16
@@ -148,7 +147,7 @@ G_M10356_IG11:        ; bbWeight=0.50, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, b
             ret     lr
 						;; size=12 bbWeight=0.50 PerfScore 1.50
 
-; Total bytes of code 292, prolog size 20, PerfScore 73.92, instruction count 73, allocated bytes for code 292 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
+; Total bytes of code 288, prolog size 20, PerfScore 72.14, instruction count 72, allocated bytes for code 288 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
 ; ============================================================
 
 Unwind Info:
@@ -159,7 +158,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 73 (0x00049) Actual length = 292 (0x000124)
+  Function Length   : 72 (0x00048) Actual length = 288 (0x000120)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-4 (-1.37%) : 370595.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
@@ -49,10 +49,9 @@ G_M10356_IG02:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {
             cbz     x19, G_M10356_IG04
 						;; size=4 bbWeight=1 PerfScore 1.00
 G_M10356_IG03:        ; bbWeight=0.46, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref, isz
-            ldr     w0, [x19, #0x08]
             ldr     w0, [x19, #0x08]
             cbnz    w0, G_M10356_IG05
-						;; size=12 bbWeight=0.46 PerfScore 3.22
+						;; size=8 bbWeight=0.46 PerfScore 1.84
 G_M10356_IG04:        ; bbWeight=0.50, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref
             movz    x1, #0xD1FFAB1E
             movk    x1, #0xD1FFAB1E LSL #16
@@ -149,7 +148,7 @@ G_M10356_IG11:        ; bbWeight=0.50, gcrefRegs=0001 {x0}, byrefRegs=0000 {}, b
             ret     lr
 						;; size=12 bbWeight=0.50 PerfScore 1.50
 
-; Total bytes of code 292, prolog size 20, PerfScore 73.92, instruction count 73, allocated bytes for code 292 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
+; Total bytes of code 288, prolog size 20, PerfScore 72.14, instruction count 72, allocated bytes for code 288 (MethodHash=ac08d78b) for method ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
 ; ============================================================
 
 Unwind Info:
@@ -160,7 +159,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 73 (0x00049) Actual length = 292 (0x000124)
+  Function Length   : 72 (0x00048) Actual length = 288 (0x000120)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 535616.dasm - (dynamicClass):ABIStress_PInvoker750(int,ABIStress.I128_1,ABIStress.S15U,System.Int128,ABIStress.S12U,ABIStress.S13U,float,System.Int128,ABIStress.S9U,ABIStress.S8U,ABIStress.S4U,System.Int128,double,ABIStress.S3U,ABIStress.S17U,double,ABIStress.S10U,double,ABIStress.S3U,ABIStress.S8U,double):int[]
@@ -594,7 +594,7 @@ G_M9188_IG07:        ; bbWeight=1, isz, extend
             ldr     x6, [fp, #0xD1FFAB1E]	// [V08 arg8]
             mov     x3, x6
             ldr     x4, [fp, #0xD1FFAB1E]	// [V08 arg8+0x08]
-            ldr     x7, [fp, #0xD1FFAB1E]	// [V08 arg8+0x08]
+            mov     x7, x4
             ldr     w0, [fp, #0xD1FFAB1E]	// [V18 arg18]
             ldr     d0, [fp, #0xD1FFAB1E]	// [V17 arg17]
             movz    w1, #0xD1FFAB1E
@@ -607,7 +607,7 @@ G_M9188_IG07:        ; bbWeight=1, isz, extend
             ; byrRegs +[x8]
             str     x8, [x21, #0x10]
             strb    wzr, [x21, #0x0C]
-						;; size=68 bbWeight=1 PerfScore 22.00
+						;; size=68 bbWeight=1 PerfScore 20.50
 G_M9188_IG08:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ; byrRegs -[x8 x20]
             movz    x8, #0xD1FFAB1E
@@ -654,7 +654,7 @@ RWD00  	dd	463AD000h		;     11956
 RWD04  	dd	46F6C600h		;     31587
 
 
-; Total bytes of code 1992, prolog size 32, PerfScore 629.20, instruction count 498, allocated bytes for code 1992 (MethodHash=6120dc1b) for method (dynamicClass):ABIStress_PInvoker750(int,ABIStress.I128_1,ABIStress.S15U,System.Int128,ABIStress.S12U,ABIStress.S13U,float,System.Int128,ABIStress.S9U,ABIStress.S8U,ABIStress.S4U,System.Int128,double,ABIStress.S3U,ABIStress.S17U,double,ABIStress.S10U,double,ABIStress.S3U,ABIStress.S8U,double):int[]
+; Total bytes of code 1992, prolog size 32, PerfScore 627.70, instruction count 498, allocated bytes for code 1992 (MethodHash=6120dc1b) for method (dynamicClass):ABIStress_PInvoker750(int,ABIStress.I128_1,ABIStress.S15U,System.Int128,ABIStress.S12U,ABIStress.S13U,float,System.Int128,ABIStress.S9U,ABIStress.S8U,ABIStress.S4U,System.Int128,double,ABIStress.S3U,ABIStress.S17U,double,ABIStress.S10U,double,ABIStress.S3U,ABIStress.S8U,double):int[]
 ; ============================================================
 
 Unwind Info:
+0 (0.00%) : 561088.dasm - AssemblyDependencyResolverTests.TestBase:RunSingleTest(System.Action,System.String):this
@@ -79,7 +79,7 @@ G_M48672_IG01:        ; bbWeight=1, gcVars=0000000000000000 {}, gcrefRegs=0000 {
 G_M48672_IG02:        ; bbWeight=1, gcVars=0000000000000081 {V00 V02}, gcrefRegs=80000 {x19}, byrefRegs=0000 {}, gcvars, byref, isz
             ldr     x20, [fp, #0x18]
             ; gcrRegs +[x20]
-            ldr     x0, [fp, #0x18]
+            mov     x0, x20
             ; gcrRegs +[x0]
             cbnz    x0, G_M48672_IG03
             mov     x0, x19
@@ -99,7 +99,7 @@ G_M48672_IG02:        ; bbWeight=1, gcVars=0000000000000081 {V00 V02}, gcrefRegs
             ; gcr arg pop 0
             mov     x20, x0
             ; gcrRegs +[x20]
-						;; size=60 bbWeight=1 PerfScore 24.50
+						;; size=60 bbWeight=1 PerfScore 23.00
 G_M48672_IG03:        ; bbWeight=1, gcrefRegs=180000 {x19 x20}, byrefRegs=0000 {}, byref
             ; gcrRegs -[x0]
             str     x20, [fp, #0x18]	// [V02 arg2]
@@ -470,7 +470,7 @@ G_M48672_IG25:        ; bbWeight=0, funclet epilog, nogc, extend
             ret     lr
 						;; size=20 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 968, prolog size 40, PerfScore 251.34, instruction count 242, allocated bytes for code 968 (MethodHash=c3a841df) for method AssemblyDependencyResolverTests.TestBase:RunSingleTest(System.Action,System.String):this
+; Total bytes of code 968, prolog size 40, PerfScore 249.84, instruction count 242, allocated bytes for code 968 (MethodHash=c3a841df) for method AssemblyDependencyResolverTests.TestBase:RunSingleTest(System.Action,System.String):this
 ; ============================================================
 
 Unwind Info:
+8 (+0.03%) : 543212.dasm - CseTest.Test_Main:Main():int
@@ -4232,8 +4232,9 @@ G_M22188_IG242:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             blr     x2
             str     w0, [x20, #0x28]
             ldr     w0, [x20, #0x08]
-            ldp     w1, w2, [x20, #0x08]
-            ldp     w3, w4, [x20, #0x10]
+            mov     w1, w0
+            ldp     w2, w3, [x20, #0x0C]
+            ldr     w4, [x20, #0x14]
             madd    w2, w3, w4, w2
             ldp     w3, w4, [x20, #0x14]
             msub    w2, w3, w4, w2
@@ -4250,7 +4251,7 @@ G_M22188_IG242:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             movk    w0, #0xD1FFAB1E LSL #16
             cmp     w21, w0
             beq     G_M22188_IG244
-						;; size=108 bbWeight=1 PerfScore 45.00
+						;; size=112 bbWeight=1 PerfScore 45.50
 G_M22188_IG243:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -5478,8 +5479,9 @@ G_M22188_IG321:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG322:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x08]
-            ldp     w1, w2, [x20, #0x08]
-            ldp     w3, w4, [x20, #0x10]
+            mov     w1, w0
+            ldp     w2, w3, [x20, #0x0C]
+            ldr     w4, [x20, #0x14]
             madd    w2, w3, w4, w2
             ldp     w3, w4, [x20, #0x14]
             msub    w2, w3, w4, w2
@@ -5496,7 +5498,7 @@ G_M22188_IG322:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             movk    w0, #0xD1FFAB1E LSL #16
             cmp     w21, w0
             beq     G_M22188_IG324
-						;; size=76 bbWeight=1 PerfScore 37.50
+						;; size=80 bbWeight=1 PerfScore 38.00
 G_M22188_IG323:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8374,7 +8376,7 @@ G_M22188_IG497:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG498:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w0, w0, w1
             ldr     w1, [x20, #0x20]
             add     w0, w1, w0
@@ -8384,7 +8386,7 @@ G_M22188_IG498:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             msub    w21, w1, w2, w0
             cmn     w21, #20
             beq     G_M22188_IG500
-						;; size=44 bbWeight=1 PerfScore 21.50
+						;; size=44 bbWeight=1 PerfScore 19.00
 G_M22188_IG499:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8408,13 +8410,13 @@ G_M22188_IG499:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG500:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w0, w0, w1
             ldr     w1, [x20, #0x20]
             add     w21, w1, w0
             cmp     w21, #73
             beq     G_M22188_IG502
-						;; size=28 bbWeight=1 PerfScore 13.00
+						;; size=28 bbWeight=1 PerfScore 10.50
 G_M22188_IG501:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8439,11 +8441,11 @@ G_M22188_IG501:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 G_M22188_IG502:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x20]
             ldr     w1, [x20, #0x10]
-            ldr     w2, [x20, #0x10]
+            mov     w2, w1
             madd    w21, w1, w2, w0
             cmp     w21, #73
             beq     G_M22188_IG504
-						;; size=24 bbWeight=1 PerfScore 12.50
+						;; size=24 bbWeight=1 PerfScore 10.00
 G_M22188_IG503:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8467,11 +8469,11 @@ G_M22188_IG503:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG504:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w21, w0, w1
             cmp     w21, #4
             beq     G_M22188_IG506
-						;; size=20 bbWeight=1 PerfScore 9.50
+						;; size=20 bbWeight=1 PerfScore 7.00
 G_M22188_IG505:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8495,11 +8497,11 @@ G_M22188_IG505:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG506:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w21, w0, w1
             cmp     w21, #4
             beq     G_M22188_IG508
-						;; size=20 bbWeight=1 PerfScore 9.50
+						;; size=20 bbWeight=1 PerfScore 7.00
 G_M22188_IG507:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8523,11 +8525,11 @@ G_M22188_IG507:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG508:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w21, w0, w1
             cmp     w21, #4
             beq     G_M22188_IG510
-						;; size=20 bbWeight=1 PerfScore 9.50
+						;; size=20 bbWeight=1 PerfScore 7.00
 G_M22188_IG509:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8551,11 +8553,11 @@ G_M22188_IG509:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG510:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w21, w0, w1
             cmp     w21, #4
             beq     G_M22188_IG512
-						;; size=20 bbWeight=1 PerfScore 9.50
+						;; size=20 bbWeight=1 PerfScore 7.00
 G_M22188_IG511:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8580,11 +8582,11 @@ G_M22188_IG511:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 G_M22188_IG512:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x20]
             ldr     w1, [x20, #0x10]
-            ldr     w2, [x20, #0x10]
+            mov     w2, w1
             madd    w21, w1, w2, w0
             cmp     w21, #73
             beq     G_M22188_IG514
-						;; size=24 bbWeight=1 PerfScore 12.50
+						;; size=24 bbWeight=1 PerfScore 10.00
 G_M22188_IG513:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8608,13 +8610,13 @@ G_M22188_IG513:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG514:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w0, w0, w1
             ldr     w1, [x20, #0x20]
             add     w21, w1, w0
             cmp     w21, #73
             beq     G_M22188_IG516
-						;; size=28 bbWeight=1 PerfScore 13.00
+						;; size=28 bbWeight=1 PerfScore 10.50
 G_M22188_IG515:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8870,13 +8872,13 @@ G_M22188_IG531:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG532:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w0, w0, w1
             ldr     w1, [x20, #0x20]
             add     w21, w1, w0
             cmp     w21, #73
             beq     G_M22188_IG534
-						;; size=28 bbWeight=1 PerfScore 13.00
+						;; size=28 bbWeight=1 PerfScore 10.50
 G_M22188_IG533:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -8900,7 +8902,7 @@ G_M22188_IG533:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {
 						;; size=60 bbWeight=0.50 PerfScore 5.75
 G_M22188_IG534:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref, isz
             ldr     w0, [x20, #0x10]
-            ldr     w1, [x20, #0x10]
+            mov     w1, w0
             mul     w0, w0, w1
             ldr     w1, [x20, #0x20]
             add     w0, w1, w0
@@ -8910,7 +8912,7 @@ G_M22188_IG534:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             msub    w21, w1, w2, w0
             cmp     w21, #0xD1FFAB1E
             beq     G_M22188_IG536
-						;; size=44 bbWeight=1 PerfScore 21.50
+						;; size=44 bbWeight=1 PerfScore 19.00
 G_M22188_IG535:        ; bbWeight=0.50, gcrefRegs=100000 {x20}, byrefRegs=0000 {}, byref
             movz    x0, #0xD1FFAB1E
             movk    x0, #0xD1FFAB1E LSL #16
@@ -9063,7 +9065,7 @@ G_M22188_IG538:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             msub    w2, w3, w4, w2
             madd    w0, w2, w1, w0
             ldr     w1, [x20, #0x10]
-            ldr     w2, [x20, #0x10]
+            mov     w2, w1
             mul     w1, w1, w2
             ldr     w2, [x20, #0x20]
             add     w1, w2, w1
@@ -9076,7 +9078,7 @@ G_M22188_IG538:        ; bbWeight=1, gcrefRegs=100000 {x20}, byrefRegs=0000 {},
             movk    w0, #0xD1FFAB1E LSL #16
             cmp     w21, w0
             beq     G_M22188_IG540
-						;; size=244 bbWeight=1 PerfScore 124.50
+						;; size=244 bbWeight=1 PerfScore 122.00
 G_M22188_IG539:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ; gcrRegs -[x20]
             movz    x0, #0xD1FFAB1E
@@ -9115,7 +9117,7 @@ G_M22188_IG541:        ; bbWeight=1, epilog, nogc, extend
             ret     lr
 						;; size=16 bbWeight=1 PerfScore 5.00
 
-; Total bytes of code 26468, prolog size 16, PerfScore 8659.05, instruction count 6617, allocated bytes for code 26468 (MethodHash=5b1ca953) for method CseTest.Test_Main:Main():int
+; Total bytes of code 26476, prolog size 16, PerfScore 8630.85, instruction count 6619, allocated bytes for code 26476 (MethodHash=5b1ca953) for method CseTest.Test_Main:Main():int
 ; ============================================================
 
 Unwind Info:
@@ -9126,7 +9128,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 6617 (0x019d9) Actual length = 26468 (0x006764)
+  Function Length   : 6619 (0x019db) Actual length = 26476 (0x00676c)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
Details

Improvements/regressions per collection

| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (bytes) | Regressions (bytes) |
|---|---|---|---|---|---|---|
| benchmarks.run.linux.arm64.checked.mch | 133 | 1 | 0 | 132 | -12 | +0 |
| libraries_tests.pmi.linux.arm64.checked.mch | 1,562 | 3 | 1 | 1,558 | -20 | +4 |
| libraries.crossgen2.linux.arm64.checked.mch | 269 | 3 | 0 | 266 | -12 | +0 |
| libraries.pmi.linux.arm64.checked.mch | 885 | 1 | 0 | 884 | -4 | +0 |
| coreclr_tests.run.linux.arm64.checked.mch | 2,136 | 5 | 1 | 2,130 | -28 | +8 |
| | 4,985 | 13 | 2 | 4,970 | -76 | +12 |

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| benchmarks.run.linux.arm64.checked.mch | 44,429 | 8,020 | 36,409 | 0 (0.00%) | 0 (0.00%) |
| libraries_tests.pmi.linux.arm64.checked.mch | 376,001 | 8,240 | 367,761 | 0 (0.00%) | 0 (0.00%) |
| libraries.crossgen2.linux.arm64.checked.mch | 174,939 | 15 | 174,924 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.linux.arm64.checked.mch | 260,048 | 4,779 | 255,269 | 0 (0.00%) | 0 (0.00%) |
| coreclr_tests.run.linux.arm64.checked.mch | 600,904 | 378,881 | 222,023 | 0 (0.00%) | 0 (0.00%) |
| | 1,456,321 | 399,935 | 1,056,386 | 0 (0.00%) | 0 (0.00%) |

jit-analyze output

benchmarks.run.linux.arm64.checked.mch

To reproduce these diffs on Windows arm64:

superpmi.py asmdiffs -target_os linux -target_arch arm64 -arch arm64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 20597120 (overridden on cmd)
Total bytes of diff: 20597108 (overridden on cmd)
Total bytes of delta: -12 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
         -12 : 17542.dasm (-0.49 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 57 unchanged.

Top method improvements (bytes):
         -12 (-0.49 % of base) : 17542.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String

Top method improvements (percentages):
         -12 (-0.49 % of base) : 17542.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String


libraries_tests.pmi.linux.arm64.checked.mch

To reproduce these diffs on Windows arm64:

superpmi.py asmdiffs -target_os linux -target_arch arm64 -arch arm64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 163340880 (overridden on cmd)
Total bytes of diff: 163340864 (overridden on cmd)
Total bytes of delta: -16 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file regressions (bytes):
           4 : 311915.dasm (0.16 % of base)

Top file improvements (bytes):
         -12 : 274355.dasm (-0.80 % of base)
          -4 : 245584.dasm (-0.07 % of base)
          -4 : 245583.dasm (-0.19 % of base)

4 total files with Code Size differences (3 improved, 1 regressed), 55 unchanged.

Top method regressions (bytes):
           4 (0.16 % of base) : 311915.dasm - System.IO.Tests.StreamConformanceTests+<WriteAsync>d__48:MoveNext():this

Top method improvements (bytes):
         -12 (-0.80 % of base) : 274355.dasm - Humanizer.Localisation.NumberToWords.SpanishNumberToWordsConverter:ConvertToOrdinal(int,int,int):System.String:this
          -4 (-0.19 % of base) : 245583.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Empty(bool):this
          -4 (-0.07 % of base) : 245584.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Span():this

Top method regressions (percentages):
           4 (0.16 % of base) : 311915.dasm - System.IO.Tests.StreamConformanceTests+<WriteAsync>d__48:MoveNext():this

Top method improvements (percentages):
         -12 (-0.80 % of base) : 274355.dasm - Humanizer.Localisation.NumberToWords.SpanishNumberToWordsConverter:ConvertToOrdinal(int,int,int):System.String:this
          -4 (-0.19 % of base) : 245583.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Empty(bool):this
          -4 (-0.07 % of base) : 245584.dasm - System.Memory.Tests.SequenceReader.IsNext:IsNext_Span():this


libraries.crossgen2.linux.arm64.checked.mch

To reproduce these diffs on Windows arm64:

superpmi.py asmdiffs -target_os linux -target_arch arm64 -arch arm64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 41317512 (overridden on cmd)
Total bytes of diff: 41317500 (overridden on cmd)
Total bytes of delta: -12 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
          -4 : 47758.dasm (-0.77 % of base)
          -4 : 51116.dasm (-0.20 % of base)
          -4 : 63176.dasm (-0.08 % of base)

3 total files with Code Size differences (3 improved, 0 regressed), 54 unchanged.

Top method improvements (bytes):
          -4 (-0.20 % of base) : 51116.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
          -4 (-0.77 % of base) : 47758.dasm - System.IO.Path:GetFullPathInternal(System.String):System.String
          -4 (-0.08 % of base) : 63176.dasm - System.Xml.Serialization.XmlSerializationWriterILGen:GenerateMembersElement(System.Xml.Serialization.XmlMembersMapping):System.String:this

Top method improvements (percentages):
          -4 (-0.77 % of base) : 47758.dasm - System.IO.Path:GetFullPathInternal(System.String):System.String
          -4 (-0.20 % of base) : 51116.dasm - System.Environment:ReadXdgDirectory(System.String,System.String,System.String):System.String
          -4 (-0.08 % of base) : 63176.dasm - System.Xml.Serialization.XmlSerializationWriterILGen:GenerateMembersElement(System.Xml.Serialization.XmlMembersMapping):System.String:this


libraries.pmi.linux.arm64.checked.mch

To reproduce these diffs on Windows arm64:

superpmi.py asmdiffs -target_os linux -target_arch arm64 -arch arm64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 66011160 (overridden on cmd)
Total bytes of diff: 66011156 (overridden on cmd)
Total bytes of delta: -4 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
          -4 : 82027.dasm (-1.37 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 59 unchanged.

Top method improvements (bytes):
          -4 (-1.37 % of base) : 82027.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc

Top method improvements (percentages):
          -4 (-1.37 % of base) : 82027.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc


coreclr_tests.run.linux.arm64.checked.mch

To reproduce these diffs on Windows arm64:

superpmi.py asmdiffs -target_os linux -target_arch arm64 -arch arm64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 514286872 (overridden on cmd)
Total bytes of diff: 514286852 (overridden on cmd)
Total bytes of delta: -20 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file regressions (bytes):
           8 : 543212.dasm (0.03 % of base)

Top file improvements (bytes):
         -12 : 485719.dasm (-1.24 % of base)
          -4 : 561619.dasm (-1.37 % of base)
          -4 : 558675.dasm (-0.67 % of base)
          -4 : 554178.dasm (-1.45 % of base)
          -4 : 370595.dasm (-1.37 % of base)

6 total files with Code Size differences (5 improved, 1 regressed), 54 unchanged.

Top method regressions (bytes):
           8 (0.03 % of base) : 543212.dasm - CseTest.Test_Main:Main():int

Top method improvements (bytes):
         -12 (-1.24 % of base) : 485719.dasm - ComWrappersTests.Common.ComWrappersHelper:Init[System.__Canon](byref,System.Object,bool,System.Runtime.InteropServices.ComWrappers,ulong)
          -4 (-1.37 % of base) : 561619.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
          -4 (-1.37 % of base) : 370595.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
          -4 (-1.45 % of base) : 554178.dasm - JitTest_val_ctor_newobj_il.TestStruct:Main():int
          -4 (-0.67 % of base) : 558675.dasm - Microsoft.Build.Internal.Utilities:GenerateToolsVersionToUse(System.String,System.String,Microsoft.Build.Internal.Utilities+GetToolset,System.String,byref):System.String

Top method regressions (percentages):
           8 (0.03 % of base) : 543212.dasm - CseTest.Test_Main:Main():int

Top method improvements (percentages):
          -4 (-1.45 % of base) : 554178.dasm - JitTest_val_ctor_newobj_il.TestStruct:Main():int
          -4 (-1.37 % of base) : 561619.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
          -4 (-1.37 % of base) : 370595.dasm - ILCompiler.CecilCompatibleTypeParser:GetType(Internal.TypeSystem.ModuleDesc,System.String):Internal.TypeSystem.TypeDesc
         -12 (-1.24 % of base) : 485719.dasm - ComWrappersTests.Common.ComWrappersHelper:Init[System.__Canon](byref,System.Object,bool,System.Runtime.InteropServices.ComWrappers,ulong)
          -4 (-0.67 % of base) : 558675.dasm - Microsoft.Build.Internal.Utilities:GenerateToolsVersionToUse(System.String,System.String,Microsoft.Build.Internal.Utilities+GetToolset,System.String,byref):System.String


The summary above shows a regression in CseTest.Test_Main:Main():int. The cause is that the code had five loads in the following order:

ldr     w0, [x20, #0x08]
ldr     w1, [x20, #0x08]
ldr     w2, [x20, #0x0C]
ldr     w3, [x20, #0x10]
ldr     w4, [x20, #0x14]

Previously, the last four loads were combined into two ldp instructions:

ldr     w0, [x20, #0x08]
ldp     w1, w2, [x20, #0x08]
ldp     w3, w4, [x20, #0x10]

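Because the first two loads read the same location, the current patch changes the second load to a mov. Consequently, the remaining three loads can no longer all be paired: two of them are combined into one ldp and the last one stays a single ldr, so the sequence grows by one instruction (4 bytes). It generated: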

ldr     w0, [x20, #0x08]
mov     w1, w0
ldp     w2, w3, [x20, #0x0C]
ldr     w4, [x20, #0x14]
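
To make the interaction easier to see, here is a minimal, self-contained sketch of a one-instruction peephole window. It is a toy model, not the JIT's emitter: `ToyEmitter`, `Ins`, `emitLdr`, `emitFiveLoads`, and `toText` are hypothetical names, and the pairing rule is deliberately simplified to adjacent ascending 4-byte offsets on the same base register.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Toy model only: every name here is illustrative, none is taken from the JIT.
enum class Kind { Ldr, Ldp, Mov };

struct Ins
{
    Kind kind;
    int  rt;   // first destination register (wN)
    int  rt2;  // second destination (Ldp) or source (Mov); unused for Ldr
    int  base; // base register (xN); unused for Mov
    int  imm;  // byte offset; unused for Mov
};

struct ToyEmitter
{
    bool             ldrToMov; // enable the "second ldr becomes mov" rewrite
    std::vector<Ins> code;

    explicit ToyEmitter(bool enableLdrToMov) : ldrToMov(enableLdrToMov) {}

    // Emit "ldr w<rt>, [x<base>, #imm]", looking only at the last emitted instruction.
    void emitLdr(int rt, int base, int imm)
    {
        assert(imm % 4 == 0); // this model only handles 4-byte-aligned offsets

        if (!code.empty() && code.back().kind == Kind::Ldr)
        {
            const Ins prev = code.back();
            // The previous load must use the same base and must not have overwritten it.
            const bool sameBase = (prev.base == base) && (prev.rt != base);

            if (ldrToMov && sameBase && (prev.imm == imm))
            {
                // Same address as the previous load: copy the register instead.
                // A repeated load into the same register needs no instruction at all.
                if (rt != prev.rt)
                {
                    code.push_back(Ins{Kind::Mov, rt, prev.rt, 0, 0});
                }
                return;
            }
            if (sameBase && (prev.imm + 4 == imm) && (prev.rt != rt))
            {
                // Adjacent word: fold the previous ldr and this one into one ldp.
                code.back() = Ins{Kind::Ldp, prev.rt, rt, base, prev.imm};
                return;
            }
        }
        code.push_back(Ins{Kind::Ldr, rt, 0, base, imm});
    }
};

// Feed the five loads from the regression through the toy emitter.
std::vector<Ins> emitFiveLoads(bool enableLdrToMov)
{
    ToyEmitter e(enableLdrToMov);
    e.emitLdr(0, 20, 0x08);
    e.emitLdr(1, 20, 0x08);
    e.emitLdr(2, 20, 0x0C);
    e.emitLdr(3, 20, 0x10);
    e.emitLdr(4, 20, 0x14);
    return e.code;
}

// Render one toy instruction as text.
std::string toText(const Ins& i)
{
    char buf[64] = "";
    const unsigned off = static_cast<unsigned>(i.imm);
    switch (i.kind)
    {
        case Kind::Ldr:
            std::snprintf(buf, sizeof buf, "ldr w%d, [x%d, #0x%02X]", i.rt, i.base, off);
            break;
        case Kind::Ldp:
            std::snprintf(buf, sizeof buf, "ldp w%d, w%d, [x%d, #0x%02X]", i.rt, i.rt2, i.base, off);
            break;
        case Kind::Mov:
            std::snprintf(buf, sizeof buf, "mov w%d, w%d", i.rt, i.rt2);
            break;
    }
    return buf;
}
```

In this model, `emitFiveLoads(false)` ends with three instructions (ldr, ldp, ldp) and `emitFiveLoads(true)` ends with four (ldr, mov, ldp, ldr). The mov breaks the chain because the window only ever sees the last emitted instruction, which is the same effect described in the listings above.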

@ghost

ghost commented Mar 15, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #35141

Author: SwapnilGaikwad
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@a74nh
Contributor

a74nh commented Mar 16, 2023

@kunalspathak

@a74nh
Contributor

a74nh commented Mar 17, 2023

Probably @BruceForstall too

@@ -164,6 +168,13 @@ inline bool OptimizeLdrStr(instruction ins,
return true;
}

// If we have a second LDR instruction from the same source, then try to replace it with a MOV.
if (IsOptimizableLdrToMov(ins, reg1, reg2, imm, size, fmt))
Contributor

I'm wondering if this creates a GC hole. Should this be gated by a !localVar check or, even better, use an emitIns_Mov variant that handles varx/offs?

Member

I don't see GC liveness changing in the diffs, so it should probably be fine.

image

Here, in the first example, x1 is reported as holding a GC value both before and after the change. The same is true in the other places.

I will still run gcstress now.

Member

Also the gcstress pipelines are clean.

@kunalspathak kunalspathak self-requested a review March 20, 2023 12:52
@kunalspathak
Member

/azp run runtime-coreclr runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines

No pipelines are associated with this pull request.

@SwapnilGaikwad
Contributor Author

/azp run runtime-coreclr jitstress

@azure-pipelines

Commenter does not have sufficient privileges for PR 83458 in repo dotnet/runtime

@SwapnilGaikwad
Contributor Author

/azp run runtime-coreclr runtime-coreclr gcstress0x3-gcstress0xc

It seems it didn't kick off any jobs.

@kunalspathak
Member

/azp run runtime-coreclr runtime-coreclr gcstress0x3-gcstress0xc

It seems it didn't kick off any jobs.

Kicked off internally - https://dev.azure.com/dnceng-public/public/_build/results?buildId=211187&view=results

Member

@kunalspathak kunalspathak left a comment

LGTM. Thanks for your contribution!

@kunalspathak kunalspathak merged commit c1acb35 into dotnet:main Mar 21, 2023
@SwapnilGaikwad SwapnilGaikwad deleted the github-ldMov branch March 21, 2023 21:42
Member

@BruceForstall BruceForstall left a comment

I noticed a potential bug or missing optimization.

// instruction into a cheaper "mov" instruction.
//
// Examples: ldr w1, [x20, #0x10]
// ldr w2, [x20, #0x10] => mov w1, w2
Member

Shouldn't this be mov w2, w1?

Contributor Author

True, I'll fix this.
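
For reference, a minimal sketch of the rewrite this thread is about, assuming a simplified instruction record. `Insn`, `Op` and `tryLdrToMov` are hypothetical names for illustration only and are not the JIT's `instrDesc` or `IsOptimizableLdrToMov`. It shows why the replacement for the second load is `mov w2, w1`: the new destination receives the value the first load already put in its destination.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for an emitted instruction (not the JIT's instrDesc).
enum class Op { Ldr, Mov };

struct Insn
{
    Op      op;
    int     dst;    // destination register number
    int     src;    // base register for Ldr, source register for Mov
    int64_t offset; // immediate offset (meaningful for Ldr only)
    int     size;   // operand size in bytes
};

// If `cur` reloads the exact location that `prev` just loaded, write the
// equivalent MOV to `*out` and return true. Otherwise return false.
inline bool tryLdrToMov(const Insn& prev, const Insn& cur, Insn* out)
{
    if (prev.op != Op::Ldr || cur.op != Op::Ldr)
        return false;

    // Same base register, same offset, same width.
    if (prev.src != cur.src || prev.offset != cur.offset || prev.size != cur.size)
        return false;

    // If the first load overwrote its own base register, the second
    // address is no longer the same location.
    if (prev.dst == prev.src)
        return false;

    // ldr w1, [x20, #0x10]
    // ldr w2, [x20, #0x10]   =>   mov w2, w1
    *out = Insn{Op::Mov, cur.dst, prev.dst, 0, cur.size};
    return true;
}
```

Calling it with the two loads from the comment above yields a `Mov` whose destination is register 2 and whose source is register 1. The real check has more conditions than this sketch covers.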

}

regNumber prevReg1 = emitLastIns->idReg1();
regNumber prevReg2 = emitLastIns->idReg2();
Member

prevReg2 here is encoded. Doesn't it need prevReg2 = encodingZRtoSP(prevReg2)?

(Did you see any diffs with a base register of SP?)

Contributor Author

That check has been pulled up into OptimizeLdrStr().

Member

The check you refer to, reg2 = encodingZRtoSP(reg2);, only applies to the reg2 argument. The prevReg2 here is still in the encoded form, meaning if reg2 == SP (after decoding) it will never match prevReg2 == ZR (before decoding).

Member

@BruceForstall - Does that mean that we would have lost some opportunities because of not encoding it to SP?

Member

Actually, now I am confused - why do we have to do the encodingZRtoSP(reg2) for this optimization?

Member

> @BruceForstall - Does that mean that we would have lost some opportunities because of not encoding it to SP?

That's my thinking. However, when I add prevReg2 = encodingZRtoSP(prevReg2), there are no diffs. If I add assert(prevReg2 != REG_ZR) there are hits, so obviously there are cases where we are incorrectly comparing against ZR instead of SP. There might not be any opportunities due to the way we generate code.

> Actually, now I am confused - why do we have to do the encodingZRtoSP(reg2) for this optimization?

The problem is that we store the encoded register value in the instrDesc -- and arm64 encodings sometimes encode SP as ZR in the instruction encoding bits.

It's possibly unfortunate that we (long ago) decided to store the encoded form into idReg1()/idReg2()/etc. An alternative would be to store the "actual" register in the instrDesc and do the SP->ZR encoding when we're building the instruction bits. This would eliminate the need to "un-encode", which we do in the emitDispIns code path (to display/output the textual instruction). And it would simplify thinking about peephole optimizations (which were not considered when this was designed).
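
To make the mismatch concrete, here is a small sketch under stated assumptions: the register-field value 31 stands for both ZR and SP in the encoding, and `REG_SP` is 64 as quoted in this thread. `decodeZRtoSP`, `sameBaseMixed` and `sameBaseDecoded` are hypothetical helpers modeled on the `encodingZRtoSP()` call discussed above, not the JIT's actual code.

```cpp
#include <cassert>

// Assumed values for this sketch only.
constexpr int REG_ZR = 31; // field value shared by ZR and SP in the encoding
constexpr int REG_SP = 64; // out-of-band number for the "actual" SP register

// Hypothetical helper: map the encoded ZR value back to SP.
constexpr int decodeZRtoSP(int reg)
{
    return (reg == REG_ZR) ? REG_SP : reg;
}

// Mixed comparison: the current operand is decoded, the stored one is not.
// An SP-based pair can never match here (64 is compared against 31).
constexpr bool sameBaseMixed(int curRegEncoded, int prevRegEncoded)
{
    return decodeZRtoSP(curRegEncoded) == prevRegEncoded;
}

// Consistent comparison: both sides are decoded before comparing.
constexpr bool sameBaseDecoded(int curRegEncoded, int prevRegEncoded)
{
    return decodeZRtoSP(curRegEncoded) == decodeZRtoSP(prevRegEncoded);
}
```

With both operands stored as 31, `sameBaseMixed` returns false while `sameBaseDecoded` returns true. For any other base register the two agree, which would be consistent with seeing assert hits but no diffs.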

Member

> An alternative would be to store the "actual" register in the instrDesc

Unfortunately, REG_SP is 64 and REGNUM_BITS is 6, so we don't have enough bits currently to store REG_SP in idReg1/idReg2/etc. Bumping REGNUM_BITS to 7 just to allow this might have negative effects on memory usage or other instrDesc effects.
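
The arithmetic behind that can be sketched as follows. The two constants are the values quoted above; `RegField`, `fitsInBits` and `storeAndReadBack` are hypothetical names for illustration and do not reflect the real `instrDesc` layout.

```cpp
#include <cassert>

constexpr unsigned REGNUM_BITS = 6;  // value quoted in the comment above
constexpr unsigned REG_SP      = 64; // value quoted in the comment above

// True if `value` can be represented in an unsigned field of `bits` bits.
constexpr bool fitsInBits(unsigned value, unsigned bits)
{
    return value < (1u << bits);
}

// Hypothetical 6-bit register slot, standing in for idReg1/idReg2.
struct RegField
{
    unsigned reg : REGNUM_BITS;
};

// Store a register number in the 6-bit slot and read it back.
// An unsigned bit-field keeps only the low REGNUM_BITS bits.
inline unsigned storeAndReadBack(unsigned reg)
{
    RegField f{};
    f.reg = reg;
    return f.reg;
}
```

A 6-bit field holds 0 through 63, so 64 does not fit and reads back as 0 in this sketch, while a 7-bit field would hold it.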

Contributor Author

Thanks for the explanation @BruceForstall, I'll add a new PR addressing the comments 👍

Contributor Author

Created #83886

Comment on lines +16421 to +16422
// Either register 1 or previous register 1 is not a general register
// or the zero register, so we cannot optimise.
Member

This comment is wrong (or confusing) since isGeneralRegister doesn't include ZR. (So we don't optimize if one is ZR, which makes sense)

Contributor Author

Sure, I'll change this. Should I coalesce these changes with another PR?

Member

If you have an additional PR coming, it's fine to add the comment changes to that PR.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 23, 2023
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Development

Successfully merging this pull request may close these issues.

ARM64: Optimize redundant memory loads with mov
4 participants