Suboptimal code patterns when using Unsafe methods + intrinsics

While micro-optimizing a particular piece of code, I found suboptimal codegen that stems from the signatures in the `Unsafe` class design and that could be fixed by the JIT.

Say that I need to read data from two different memory locations at the same offset, where `offset` is an `int`:

```csharp
matches = Sse2.MoveMask(Sse2.CompareEqual(LoadVector128(ref first, (IntPtr)offset), LoadVector128(ref second, (IntPtr)offset)));
```

In the generated code you can see that the same sign extension of the offset is performed twice:

```asm
movsxd      r8,eax                          ; sign-extend the offset
vmovupd     xmm0,xmmword ptr [rcx+r8]  
movsxd      r8,eax                          ; same sign extension again
vmovupd     xmm1,xmmword ptr [rdx+r8]  
vpcmpeqb    xmm0,xmm0,xmm1  
vpmovmskb   r8d,xmm0  
```

This has been solved (somehow) for AVX2, but that path introduces another strange behavior:

```csharp
matches = Avx2.MoveMask(Avx2.CompareEqual(LoadVector256(ref first, (IntPtr)offset), LoadVector256(ref second, (IntPtr)offset)));
```

As you can see, not only do we copy with sign extension, but we also copy the result into `r9`. While at the architectural level that is a simple register rename (cheaper than the second sign extension), we are still issuing an extra instruction.

```asm
movsxd      r8,eax                          ; sign-extend the offset
mov         r9,r8                           ; extra copy of the extended offset
vmovupd     ymm0,ymmword ptr [rcx+r9]  
vmovupd     ymm1,ymmword ptr [rdx+r8]  
vpcmpeqb    ymm0,ymm0,ymm1  
vpmovmskb   r8d,ymm0  
```

What I don't understand is why, given that `eax` has been set in the same code (it is not coming from anywhere else), the JIT decides to use an extra `mov` instead of emitting:

```asm
vmovupd     ymm0,ymmword ptr [rcx+eax]  
vmovupd     ymm1,ymmword ptr [rdx+eax]  
vpcmpeqb    ymm0,ymm0,ymm1  
vpmovmskb   r8d,ymm0  
```

Furthermore, this could also be optimized to:

```asm
vmovupd     ymm0,ymmword ptr [rcx+eax]  
vpcmpeqb    ymm0,ymm0,ymmword ptr [rdx+eax]  
vpmovmskb   r8d,ymm0  
```

I am running today's nightly build, `3.0.0-preview4-27506-5`.
Any idea how I can achieve the latter code?
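
For reference, here is a self-contained sketch of the SSE2 pattern with the `int` to `IntPtr` conversion hoisted into a local, so that the source only asks for the widening once. The `OffsetHoisting` class, the `Matches` method, and the body of `LoadVector128` are assumptions made for illustration (the helper used above is not shown in this issue), and I have not confirmed that this changes what the JIT emits.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class OffsetHoisting
{
    // Hypothetical stand-in for the LoadVector128 helper used above:
    // an unaligned 16-byte read at `start + offset`.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private static Vector128<byte> LoadVector128(ref byte start, IntPtr offset)
        => Unsafe.ReadUnaligned<Vector128<byte>>(ref Unsafe.AddByteOffset(ref start, offset));

    // Compares 16 bytes of `first` and `second` at the same byte offset and
    // returns one mask bit per equal byte. Callers must check Sse2.IsSupported.
    public static int Matches(ref byte first, ref byte second, int offset)
    {
        // Convert once and reuse the widened value for both loads.
        IntPtr byteOffset = (IntPtr)offset;

        return Sse2.MoveMask(Sse2.CompareEqual(
            LoadVector128(ref first, byteOffset),
            LoadVector128(ref second, byteOffset)));
    }
}
```

With two buffers that are equal over the compared 16 bytes, `Matches` returns `0xFFFF`; each differing byte clears the corresponding bit.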

category:cq
theme:hardware-intrinsics
skill-level:expert
cost:medium

Suboptimal code patterns when using Unsafe methods + intrinsics #12201
