Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize casts from constant byrefs #99130

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

MichalPetryka
Copy link
Contributor

@MichalPetryka MichalPetryka commented Feb 29, 2024

Noticed this was causing some missing folding with static fields in #99011.

Sample codegen:
Before:

G_M54480_IG01:  ;; offset=0x0000
       50                   push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
G_M54480_IG02:  ;; offset=0x0001
       48BAD611FD29F87F0000 mov      rdx, 0x7FF829FD11D6      ; static handle
       41B802000000         mov      r8d, 2      ; global ptr
       4883C2FE             add      rdx, -2
       41C1E003             shl      r8d, 3
       4183E01F             and      r8d, 31
       B8FF000000           mov      eax, 255
       C4E239F7C0           shlx     eax, eax, r8d
       448BD0               mov      r10d, eax
       41F7D2               not      r10d
       0FB6C1               movzx    rax, cl
       C4E239F7C8           shlx     ecx, eax, r8d
       33C0                 xor      eax, eax
                            align    [0 bytes for IG03]
						;; size=54 bbWeight=1 PerfScore 3.75
G_M54480_IG03:  ;; offset=0x0037
       448BC8               mov      r9d, eax
       4523CA               and      r9d, r10d
       440BC9               or       r9d, ecx
       89442404             mov      dword ptr [rsp+0x04], eax
       F0                   lock     
       440FB10A             cmpxchg  dword ptr [rdx], r9d
       448B4C2404           mov      r9d, dword ptr [rsp+0x04]
       448BD8               mov      r11d, eax
       453BCB               cmp      r9d, r11d
       750D                 jne      SHORT G_M54480_IG06
						;; size=31 bbWeight=8 PerfScore 178.00
G_M54480_IG04:  ;; offset=0x0056
       C4E23BF7C0           shrx     eax, eax, r8d
       0FB6C0               movzx    rax, al
						;; size=8 bbWeight=1 PerfScore 0.75
G_M54480_IG05:  ;; offset=0x005E
       4883C408             add      rsp, 8
       C3                   ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M54480_IG06:  ;; offset=0x0063
       418BC3               mov      eax, r11d
       EBCF                 jmp      SHORT G_M54480_IG03
						;; size=5 bbWeight=4 PerfScore 9.00

After:

G_M54480_IG01:  ;; offset=0x0000
       50                   push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
G_M54480_IG02:  ;; offset=0x0001
       48BAD611502BF87F0000 mov      rdx, 0x7FF82B5011D6      ; static handle
       4883C2FE             add      rdx, -2
       0FB6C9               movzx    rcx, cl
       C1E110               shl      ecx, 16
       33C0                 xor      eax, eax
                            align    [0 bytes for IG03]
						;; size=22 bbWeight=1 PerfScore 1.50
G_M54480_IG03:  ;; offset=0x0017
       448BC0               mov      r8d, eax
       4181E0FFFF00FF       and      r8d, -0xFF0001
       440BC1               or       r8d, ecx
       89442404             mov      dword ptr [rsp+0x04], eax
       F0                   lock     
       440FB102             cmpxchg  dword ptr [rdx], r8d
       448B442404           mov      r8d, dword ptr [rsp+0x04]
       448BD0               mov      r10d, eax
       453BC2               cmp      r8d, r10d
       750B                 jne      SHORT G_M54480_IG06
						;; size=35 bbWeight=8 PerfScore 178.00
G_M54480_IG04:  ;; offset=0x003A
       C1E810               shr      eax, 16
       0FB6C0               movzx    rax, al
						;; size=6 bbWeight=1 PerfScore 0.75
G_M54480_IG05:  ;; offset=0x0040
       4883C408             add      rsp, 8
       C3                   ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M54480_IG06:  ;; offset=0x0045
       418BC2               mov      eax, r10d
       EBCD                 jmp      SHORT G_M54480_IG03
						;; size=5 bbWeight=4 PerfScore 9.00

With a static field happening to be aligned:

G_M54480_IG01:  ;; offset=0x0000
       50                   push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
G_M54480_IG02:  ;; offset=0x0001
       48BAF01142A0FF7F0000 mov      rdx, 0x7FFFA04211F0      ; static handle
       0FB6C9               movzx    rcx, cl
       33C0                 xor      eax, eax
                            align    [0 bytes for IG03]
						;; size=15 bbWeight=1 PerfScore 0.75
G_M54480_IG03:  ;; offset=0x0010
       448BC0               mov      r8d, eax
       4181E000FFFFFF       and      r8d, -256
       440BC1               or       r8d, ecx
       89442404             mov      dword ptr [rsp+0x04], eax
       F0                   lock     
       440FB102             cmpxchg  dword ptr [rdx], r8d
       448B442404           mov      r8d, dword ptr [rsp+0x04]
       448BD0               mov      r10d, eax
       453BC2               cmp      r8d, r10d
       7508                 jne      SHORT G_M54480_IG06
						;; size=35 bbWeight=8 PerfScore 178.00
G_M54480_IG04:  ;; offset=0x0033
       0FB6C0               movzx    rax, al
						;; size=3 bbWeight=1 PerfScore 0.25
G_M54480_IG05:  ;; offset=0x0036
       4883C408             add      rsp, 8
       C3                   ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M54480_IG06:  ;; offset=0x003B
       418BC2               mov      eax, r10d
       EBD0                 jmp      SHORT G_M54480_IG03
						;; size=5 bbWeight=4 PerfScore 9.00

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 29, 2024
@ghost
Copy link

ghost commented Feb 29, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Noticed this was causing some missing folding with static fields in #99011.

Author: MichalPetryka
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@MichalPetryka
Copy link
Contributor Author

@MihuBot

if (oper->IsCnsIntOrI())
{
oper->gtType = TYP_I_IMPL;
return fgMorphTree(gtNewCastNode(TYP_I_IMPL, oper, false, dstType));
Copy link
Member

@EgorBo EgorBo Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might create e.g. a nongc icon handle typed as TYP_I_IMPL. I think we have a couple of asserts expecting those to be always TYP_REF. What exactly this change improves?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The cast here seems to strip handle-ness.
  2. Would restricting it to TYP_BYREF make it okay?
  3. Added codegen diff to the PR description.

Copy link
Member

@EgorBo EgorBo Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The cast here seems to strip handle-ness.

Which doesn't sound right e.g. - does it strip it for relocable (AOT) handles as well?

  1. Would restricting it to TYP_BYREF make it okay?
  2. Added codegen diff to the PR description.

Diffs are small and are regressions so does it make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which doesn't sound right e.g. - does it strip it for relocable (AOT) handles as well?

Made it retype the handles instead.

Diffs are small and are regressions so does it make sense?

The only diff currently is from a GC hole that's fixed in #99138, I'd guess this won't have any diffs in the BCL or tests outside of #99011 since those don't generally use AsPointer on statics or RVA statics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice that it helped you to find a gc hole! No diff in BCL means it's basically untested, it's probably fine now but may fail in future if we introduce byref icon handle? I think we should do changes like these only if they provide clear benefits and risks are understood

MichalPetryka added a commit to MichalPetryka/runtime that referenced this pull request Mar 1, 2024
stephentoub added a commit that referenced this pull request Mar 7, 2024
* Fix GC hole in Guid

Found in #99130.

* Add missing in

---------

Co-authored-by: Stephen Toub <stoub@microsoft.com>
@MichalPetryka
Copy link
Contributor Author

@MihuBot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants