Add SIMD to LowerCallMemcmp #84530
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak

Issue Details

Add SIMD to unroll length

bool Test(Span<byte> s) => s.SequenceEqual(
    "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND"u8);

Old codegen:

; Method Prog:Test(System.Span`1[ubyte]):bool:this
G_M52730_IG01:
4883EC28 sub rsp, 40
G_M52730_IG02:
49B8882A908BDA010000 mov r8, 0x1DA8B902A88
488B0A mov rcx, bword ptr [rdx]
8B5208 mov edx, dword ptr [rdx+08H]
4C89442420 mov bword ptr [rsp+20H], r8
83FA3E cmp edx, 62
7513 jne SHORT G_M52730_IG04
G_M52730_IG03:
41B83E000000 mov r8d, 62
488B542420 mov rdx, bword ptr [rsp+20H]
FF1591FE1600 call [System.SpanHelpers:SequenceEqual(byref,byref,ulong):bool]
EB02 jmp SHORT G_M52730_IG05
G_M52730_IG04:
33C0 xor eax, eax
G_M52730_IG05:
4883C428 add rsp, 40
C3 ret
; Total bytes of code: 56

New codegen:

; Method Prog:Test(System.Span`1[ubyte]):bool:this
G_M52730_IG01:
C5F877 vzeroupper
G_M52730_IG02:
48B8882A7D01B3020000 mov rax, 0x2B3017D2A88
488B0A mov rcx, bword ptr [rdx]
8B5208 mov edx, dword ptr [rdx+08H]
4883FA3E cmp rdx, 62
752B jne SHORT G_M52730_IG04
G_M52730_IG03:
C5FC1001 vmovups ymm0, ymmword ptr[rcx]
C5FC1008 vmovups ymm1, ymmword ptr[rax]
C5FC10511E vmovups ymm2, ymmword ptr[rcx+1EH]
C5FC10581E vmovups ymm3, ymmword ptr[rax+1EH]
C5FDEFC1 vpxor ymm0, ymm0, ymm1
C5EDEFCB vpxor ymm1, ymm2, ymm3
C5FDEBC1 vpor ymm0, ymm0, ymm1
C4E27D17C0 vptest ymm0, ymm0
0F94C0 sete al
0FB6C0 movzx rax, al
EB02 jmp SHORT G_M52730_IG05
G_M52730_IG04:
33C0 xor eax, eax
G_M52730_IG05:
C5F877 vzeroupper
C3 ret
; Total bytes of code: 74
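
The new codegen compares 62 bytes with two 32-byte loads per side, the second at offset 1EH (62 - 32 = 30), so the two loads overlap by two bytes. The following is a portable scalar sketch of that pattern, not the JIT's implementation: `Block32`, `load32` and `equal_32_to_64` are hypothetical names, and four 64-bit lanes stand in for one ymm register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one 32-byte (ymm-sized) vector register.
struct Block32 {
    std::array<std::uint64_t, 4> lanes;
};

// Unaligned 32-byte load, playing the role of vmovups.
static Block32 load32(const unsigned char* p) {
    Block32 b;
    std::memcpy(b.lanes.data(), p, sizeof(b.lanes));
    return b;
}

// Compares len bytes, 32 <= len <= 64, using two possibly overlapping
// 32-byte loads per side: one at offset 0 and one at offset len - 32.
static bool equal_32_to_64(const unsigned char* a, const unsigned char* b,
                           std::size_t len) {
    assert(len >= 32 && len <= 64);
    const std::size_t tail = len - 32; // 30 (1EH) when len == 62
    const Block32 a0 = load32(a), b0 = load32(b);
    const Block32 a1 = load32(a + tail), b1 = load32(b + tail);

    std::uint64_t diff = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // (a0 ^ b0) | (a1 ^ b1): any set bit marks a mismatch
        // (the vpxor, vpxor, vpor sequence in the listing above).
        diff |= (a0.lanes[i] ^ b0.lanes[i]) | (a1.lanes[i] ^ b1.lanes[i]);
    }
    return diff == 0; // the vptest + sete step
}
```

Because the second load is anchored to the end of the buffer, one pair of loads covers every length from the vector width up to twice that width, with no scalar tail loop.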
GenTree* rXor = newBinaryOp(comp, GT_XOR, actualLoadType, l2Indir, r2Indir);
GenTree* resultOr = newBinaryOp(comp, GT_OR, actualLoadType, lXor, rXor);
Can you log an issue tracking us fixing this to opportunistically use vpternlog for AVX-512 hardware?
Good idea, done: #84534
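
For context on the vpternlog suggestion: the instruction evaluates any three-input bitwise function selected by an 8-bit immediate, so `a | (b ^ c)` fits in one instruction. The sketch below derives that immediate with the usual truth-table constants and checks it against a bitwise reference model. It assumes the fold would apply to the second XOR plus the OR; it does not describe what #84534 actually implements, and `TernA`, `TernB`, `TernC`, `OrXorImm` and `ternlog` are hypothetical names.

```cpp
#include <cstdint>

// Truth-table patterns for the three vpternlog inputs. The immediate's bit
// index is (a << 2) | (b << 1) | c, which gives these constants.
constexpr std::uint8_t TernA = 0xF0;
constexpr std::uint8_t TernB = 0xCC;
constexpr std::uint8_t TernC = 0xAA;

// Immediate selecting f(a, b, c) = a | (b ^ c): 0xF0 | (0xCC ^ 0xAA) = 0xF6.
constexpr std::uint8_t OrXorImm = TernA | (TernB ^ TernC);
static_assert(OrXorImm == 0xF6, "unexpected immediate");

// Bit-by-bit reference model of vpternlog on a single 64-bit lane.
constexpr std::uint64_t ternlog(std::uint64_t a, std::uint64_t b,
                                std::uint64_t c, std::uint8_t imm) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 64; ++bit) {
        // Build the 3-bit truth-table index from the inputs at this position.
        const unsigned idx = static_cast<unsigned>((((a >> bit) & 1) << 2) |
                                                   (((b >> bit) & 1) << 1) |
                                                   ((c >> bit) & 1));
        result |= static_cast<std::uint64_t>((imm >> idx) & 1) << bit;
    }
    return result;
}
```

Under that assumption, `a` would be the first XOR result and `b`, `c` the second pair of loads, replacing one vpxor and the vpor with a single instruction.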
#84536 is the SPMI replay failure
PTAL @jakobbotsch since you reviewed the previous impl of
Add SIMD to unroll length [16..64] (can be enabled for [64..128] with AVX-512), [16..32] on arm64.