Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jit double zeros for async int method returning ValueTask<T> #38070

Closed
benaadams opened this issue Jun 18, 2020 · 2 comments · Fixed by #38314
Closed

Jit double zeros for async int method returning ValueTask<T> #38070

benaadams opened this issue Jun 18, 2020 · 2 comments · Fixed by #38314
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@benaadams
Copy link
Member

benaadams commented Jun 18, 2020

Follow up to #2325

An optimization was added in #36918 "Optimization to remove redundant zero initializations."; however it doesn't kick in for the initiation methods of async methods returning large ValueTask<T>; which still double zero.

Given

using System.Buffers;
using System.Threading.Tasks;

public class DoubleZero
{
    public async ValueTask<ReadResult> SingleMethod()
    {
        return default;

    }

    public readonly struct ReadResult
    {
        internal readonly ReadOnlySequence<byte> _resultBuffer;
        internal readonly ResultFlags _resultFlags;
    }

    public enum ResultFlags : byte { }
}

It produces the following asm which double zeros:

; Assembly listing for method DoubleZero:SingleMethod():System.Threading.Tasks.ValueTask`1[ReadResult]:this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;* V00 this         [V00    ] (  0,  0   )     ref  ->  zero-ref    this class-hnd
;  V01 RetBuf       [V01,T00] (  4,  4   )   byref  ->  rsi        
;  V02 loc0         [V02    ] (  4,  4   )  struct (48) [rsp+0x20]   do-not-enreg[XSFB] must-init addr-exposed ld-addr-op
;  V03 OutArgs      [V03    ] (  1,  1   )  lclBlk (32) [rsp+0x00]   "OutgoingArgSpace"
;* V04 tmp1         [V04    ] (  0,  0   )  struct (40) zero-ref    do-not-enreg[SB] ld-addr-op "Inline ldloca(s) first use temp"

G_M9604_IG01:
       push     rsi
       sub      rsp, 80
       vzeroupper 
       vxorps   xmm4, xmm4                  ; zeros
       vmovdqa  xmmword ptr [rsp+20H], xmm4 ;   |
       vmovdqa  xmmword ptr [rsp+30H], xmm4 ;   |
       vmovdqa  xmmword ptr [rsp+40H], xmm4 ;   v
       mov      rsi, rdx
						
G_M9604_IG02:
       xor      ecx, ecx                    ; re-zeros
       vxorps   xmm0, xmm0                  ;    |
       vmovdqu  xmmword ptr [rsp+28H], xmm0 ;    |
       vmovdqu  xmmword ptr [rsp+38H], xmm0 ;    |
       mov      qword ptr [rsp+48H], rcx    ;    V
       mov      dword ptr [rsp+20H], -1
       lea      rcx, [rsp+20H]
       call     System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start(byref)

Preferably it would be (skipping the second zeroing)

G_M9604_IG01:
       push     rsi
       sub      rsp, 80
       vzeroupper 
       vxorps   xmm4, xmm4                  ; zeros
       vmovdqa  xmmword ptr [rsp+20H], xmm4 ;   |
       vmovdqa  xmmword ptr [rsp+30H], xmm4 ;   |
       vmovdqa  xmmword ptr [rsp+40H], xmm4 ;   v
       mov      rsi, rdx
						
G_M9604_IG02:
       mov      dword ptr [rsp+20H], -1
       lea      rcx, [rsp+20H]
       call     System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start(byref)

Aside I've added an optimization to Roslyn to move storing parameters to the async statemachine later in the method, after the second zeroing, rather than before which may have blocked the optimization dotnet/roslyn#45262; however this issue occurs even when parameters are not passed as seen above.

/cc @erozenfeld

@benaadams benaadams added the tenet-performance Performance related issue label Jun 18, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Jun 18, 2020
@benaadams
Copy link
Member Author

CIL for above

.method public hidebysig 
	instance valuetype [System.Runtime]System.Threading.Tasks.ValueTask`1<valuetype DoubleZero/ReadResult> SingleMethod () cil managed 
{
	.custom instance void [System.Runtime]System.Runtime.CompilerServices.AsyncStateMachineAttribute::.ctor(class [System.Runtime]System.Type) = (
		01 00 1d 44 6f 75 62 6c 65 5a 65 72 6f 2b 3c 53
		69 6e 67 6c 65 4d 65 74 68 6f 64 3e 64 5f 5f 30
		00 00
	)
	// Method begins at RVA 0x2050
	// Code size 47 (0x2f)
	.maxstack 2
	.locals init (
		[0] valuetype DoubleZero/'<SingleMethod>d__0'
	)

	IL_0000: ldloca.s 0
	IL_0002: call valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<!0> valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult>::Create()
	IL_0007: stfld valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult> DoubleZero/'<SingleMethod>d__0'::'<>t__builder'
	IL_000c: ldloca.s 0
	IL_000e: ldc.i4.m1
	IL_000f: stfld int32 DoubleZero/'<SingleMethod>d__0'::'<>1__state'
	IL_0014: ldloca.s 0
	IL_0016: ldflda valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult> DoubleZero/'<SingleMethod>d__0'::'<>t__builder'
	IL_001b: ldloca.s 0
	IL_001d: call instance void valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult>::Start<valuetype DoubleZero/'<SingleMethod>d__0'>(!!0&)
	IL_0022: ldloca.s 0
	IL_0024: ldflda valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult> DoubleZero/'<SingleMethod>d__0'::'<>t__builder'
	IL_0029: call instance valuetype [System.Runtime]System.Threading.Tasks.ValueTask`1<!0> valuetype [System.Runtime]System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1<valuetype DoubleZero/ReadResult>::get_Task()
	IL_002e: ret
} // end of method DoubleZero::SingleMethod

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jun 18, 2020
@JulieLeeMSFT JulieLeeMSFT added this to the 5.0.0 milestone Jun 18, 2020
@erozenfeld
Copy link
Member

@benaadams Thank you for following up on this and for the repro. I'll take a look this week.

erozenfeld added a commit to erozenfeld/runtime that referenced this issue Jun 24, 2020
erozenfeld added a commit that referenced this issue Jun 24, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
4 participants