Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT properly optimizes structs with a single field if the field type is int but not if it is double #4323

Closed
mikedn opened this issue Jun 20, 2015 · 8 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Milestone

Comments

@mikedn
Copy link
Contributor

mikedn commented Jun 20, 2015

Sample C# code:

struct Number {
    private double value;
    public static implicit operator Number(double value) {
        return new Number { value = value };
    }
    public static implicit operator double (Number number) {
        return number.value;
    }
    public static Number operator +(Number x, Number y) {
        return x.value + y.value;
    }
}
class Program {
    static int Main() {
        Number x = 4, y = 2;
        return (int)(x + y);
    }
}

A portion of the generated code:

vmovsd      xmm0,qword ptr [rsp+20h]
vaddsd      xmm0,xmm0,mmword ptr [rsp+18h]
xor         eax,eax
mov         qword ptr [rsp+10h],rax
mov         qword ptr [rsp+10h],rax
vmovsd      qword ptr [rsp+10h],xmm0
mov         rax,qword ptr [rsp+10h]
mov         qword ptr [rsp+38h],rax
mov         rax,qword ptr [rsp+38h]
mov         qword ptr [rsp+8],rax
vmovsd      xmm0,qword ptr [rsp+8]
vcvttsd2si  eax,xmm0

If int is used intstead of double or if double is used directly instead of Number then the generated code is simply mov eax, 6.

category:cq
theme:structs
skill-level:expert
cost:medium
impact:medium

@mikedn
Copy link
Contributor Author

mikedn commented Jul 8, 2015

This is indirectly caused by the following code from fgMorphStructs:

#ifdef _TARGET_AMD64_
// on AMD don't promote structs with a single float field
// Promoting it would just cause us to shuffle it back and forth between int and float regs.
// On ARM this would be an HFA and passed/returned in float regs.
if (structPromotionInfo.fieldCnt==1
    && varTypeIsFloating(structPromotionInfo.fields[0].fldType))
{
    JITDUMP("Not promoting promotable struct local V%02u: #fields = %d because it is a struct with single float field.\n",
        lclNum, structPromotionInfo.fieldCnt);
    continue;
}
#endif

I don't know what that comment is trying to say about shuffling the value between int and float regs but adding a second field to the Number struct changes the generated code to the expected mov eax, 6.

It's interesting that if the restriction is removed an assert fires in lsra.cpp, that probably means that constant propagation and folding still fails to work even if the struct is promoted.

@mikedn
Copy link
Contributor Author

mikedn commented Jul 9, 2015

The int-float shuffling is probably a reference to the fact that the inliner insists on using the ABI rules of returning Number via rax and reinterprets the double as long. Probably that derails all future optimizations.

mikedn referenced this issue in AndyAyersMS/coreclr Oct 25, 2015
@Rohansi
Copy link

Rohansi commented Jul 19, 2018

IL_0000: ldc.r8 10
IL_0009: call Value Value::op_Implicit(System.Double)
IL_000e: stloc.0
vmovsd  xmm0,qword ptr [00007ffc`411935c0]
vmovsd  qword ptr [rsp+18h],xmm0
mov     rax,qword ptr [rsp+18h]
mov     qword ptr [rsp+30h],rax

😞

@mikedn
Copy link
Contributor Author

mikedn commented Jan 31, 2019

Hmm, this looks like a case for BITCAST, that would likely work better than the current LCL_FLD based approach to handling such structs returned in integer registers.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the Future milestone Jan 30, 2020
@CarolEidt CarolEidt modified the milestones: Future, 6.0.0 Oct 14, 2020
@CarolEidt
Copy link
Contributor

This was originally issue 1161 in dotnet/coreclr, and there is a test, JIT\Regression\JitBlue\GitHub_1161, that captures this (to enable improvements or regressions to show up when we run jit diffs).
Progress has been made toward addressing this issue by preserving struct types through the front-end of the JIT. The remaining work involves support the cases that require a register file move.

@JulieLeeMSFT JulieLeeMSFT added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Mar 23, 2021
@JulieLeeMSFT JulieLeeMSFT removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jun 7, 2021
@sandreenko sandreenko modified the milestones: 6.0.0, 7.0.0 Jul 9, 2021
@AndyAyersMS
Copy link
Member

I don't think we will fix this in 7.0, so moving to future

@AndyAyersMS AndyAyersMS modified the milestones: 7.0.0, Future Apr 28, 2022
@AndyAyersMS
Copy link
Member

@sandreenko should we unassign you?

@sandreenko sandreenko removed their assignment Apr 28, 2022
@jakobbotsch
Copy link
Member

Codegen today is the same for both versions:

; Assembly listing for method Program:Main():int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 5 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;* V00 loc0         [V00    ] (  0,  0   )  struct ( 8) zero-ref    single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;* V02 tmp1         [V02    ] (  0,  0   )  struct ( 8) zero-ref    "impAppendStmt"
;* V03 tmp2         [V03    ] (  0,  0   )  struct ( 8) zero-ref    "struct address for call/obj"
;* V04 tmp3         [V04,T00] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] ld-addr-op "Inline ldloca(s) first use temp"
;* V05 tmp4         [V05,T01] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] ld-addr-op "Inline ldloca(s) first use temp"
;* V06 tmp5         [V06    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] "Inlining Arg"
;* V07 tmp6         [V07    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] "Inlining Arg"
;* V08 tmp7         [V08,T02] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] ld-addr-op "Inline ldloca(s) first use temp"
;* V09 tmp8         [V09    ] (  0,  0   )  double  ->  zero-ref    "Inlining Arg"
;* V10 tmp9         [V10    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] "Inlining Arg"
;* V11 cse0         [V11,T03] (  0,  0   )  double  ->  zero-ref    "CSE - aggressive"
;* V12 cse1         [V12,T04] (  0,  0   )  double  ->  zero-ref    "CSE - aggressive"
;
; Lcl frame size = 0

G_M24375_IG01:
						;; size=0 bbWeight=1    PerfScore 0.00
G_M24375_IG02:
       mov      eax, 6
						;; size=5 bbWeight=1    PerfScore 0.25
G_M24375_IG03:
       ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 6, prolog size 0, PerfScore 1.85, instruction count 2, allocated bytes for code 6 (MethodHash=d9b8a0c8) for method Program:Main():int
; ============================================================

Looks fixed.

@ghost ghost locked as resolved and limited conversation to collaborators Jan 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

8 participants