Closed
Labels
area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Description
Vector128<byte>.Zero is not expanded inline by the JIT, and neither is Vector128<byte>.Count or Vector128:AsByte(Vector128`1):Vector128`1.
For example, ASCIIUtility.WidenAsciiToUtf16_Sse2 reads Vector128<byte>.Zero:
runtime/src/libraries/System.Private.CoreLib/src/System/Text/ASCIIUtility.cs
Lines 1633 to 1637 in fd181c0
// Then perform an unaligned write of the first part of the input buffer.
Vector128<byte> zeroVector = Vector128<byte>.Zero;
utf16FirstHalfVector = Sse2.UnpackLow(asciiVector, zeroVector);
This ends up reserving stack space, making a call, and reading the result back from the stack, all just to zero an xmm register:
G_M55642_IG05:
lea rcx, [rsp+20H]
call [Vector128`1:get_Zero():Vector128`1]
movaps xmm0, xmmword ptr [rsp+20H]
movaps xmm1, xmm6
punpcklbw xmm1, xmm0
movdqu xmmword ptr [rsi], xmm1
mov rax, rsi
shr rax, 1
and rax, 7
mov edx, 8
sub rdx, rax
mov rax, rdx
sub rbx, 16
This is quite inefficient: zeroing an xmm register normally takes a single instruction such as xorps xmm0, xmm0.
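For anyone who wants to look at the codegen in isolation, here is a minimal sketch of the same pattern, assuming an x64 machine with SSE2 and a project built with unsafe code enabled. WidenSketch and WidenLow8 are hypothetical names made up for this example and are not the code in ASCIIUtility.cs; only the Zero plus Sse2.UnpackLow combination is taken from the snippet above.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static class WidenSketch
{
    // Hypothetical helper: widens the low 8 ASCII bytes at src to 8 UTF-16 chars at dst.
    static unsafe void WidenLow8(byte* src, char* dst)
    {
        Vector128<byte> asciiVector = Sse2.LoadVector128(src);

        // The property read under discussion. Ideally this becomes a single
        // register-zeroing instruction rather than a call to get_Zero.
        Vector128<byte> zeroVector = Vector128<byte>.Zero;

        // Interleave each ASCII byte with a zero byte (punpcklbw), which on
        // little-endian x86 yields the UTF-16 code units.
        Vector128<byte> utf16FirstHalfVector = Sse2.UnpackLow(asciiVector, zeroVector);

        // Unaligned 16-byte store (movdqu).
        Sse2.Store((byte*)dst, utf16FirstHalfVector);
    }

    static unsafe void Main()
    {
        if (!Sse2.IsSupported)
        {
            Console.WriteLine("SSE2 is not supported on this machine.");
            return;
        }

        byte* src = stackalloc byte[16];
        char* dst = stackalloc char[8];
        for (int i = 0; i < 16; i++)
        {
            src[i] = (byte)('A' + i);
        }

        WidenLow8(src, dst);
        Console.WriteLine(new string(dst, 0, 8));
    }
}
```

Dumping the JIT disassembly for WidenLow8 should show whether the Zero read is still emitted as a call on a given runtime build.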
/cc @tannergooding @GrabYourPitchforks
category:cq
theme:intrinsics
skill-level:expert
cost:medium