Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Jit] Overzealous w/ do-not-enreg in presence of EH #9514

Open
benaadams opened this issue Jan 6, 2018 · 11 comments
Open

[Jit] Overzealous w/ do-not-enreg in presence of EH #9514

benaadams opened this issue Jan 6, 2018 · 11 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Milestone

Comments

@benaadams
Copy link
Member

Given the following code

using System;
using System.Runtime.ExceptionServices;
using System.Threading;

class Program
{
    int _a;
    int _b;

    int Func1(ContextCallback callback, object state)
    {
        int i = _a;

        ExceptionDispatchInfo edi = null;
        try
        {
            callback.Invoke(state);
        }
        catch (Exception ex)
        {
            edi = ExceptionDispatchInfo.Capture(ex);
        }

        edi?.Throw();

        int ii = _b + 1;
        return i + ii;
    }

    int Func2(ContextCallback callback, object state)
    {
        int i = _a;

        ExceptionDispatchInfo edi = null;
        try
        {
            callback.Invoke(state);
        }
        catch (Exception ex)
        {
            edi = ExceptionDispatchInfo.Capture(ex);
        }

        edi?.Throw();

        int ii = _b + i;
        return ii;
    }

    static void Main(string[] args)
    {
        var p = new Program();

        Console.WriteLine($"{p.Func1((o) => { }, null)} {p.Func2((o) => { }, null)}");
    }
}

In both Func1 and Func2 as the variable i's lifetime crosses exception handling it is marked as do-not-enreg

In neither Func1 nor Func2 does the variable ii's lifetime cross exception handling.

In Func1 ii's lifetime crosses i's lifetime so they cannot share a register; so ii is not marked as do-not-enreg 👍

In Func2 ii's lifetime does not cross i's lifetime so they can share a register; which means ii inherits is do-not-enreg 👎

Seen in "Reduce Execution Context Save+Restore" dotnet/coreclr#15629

category:cq
theme:eh
skill-level:expert
cost:medium

@benaadams
Copy link
Member Author

benaadams commented Jan 6, 2018

The two functions differ by the presence of

;  V05 loc2         [V05,T06] (  2,  2   )     int  ->  rax   

Even though in Program:Func2 ii could also be a register and not stack

 ; Assembly listing for method Program:Func1(ref,ref):int:this
 ; Emitting BLENDED_CODE for X64 CPU with AVX
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; Final local variable assignments
 ;
 ;  V00 this         [V00,T00] (  4,  4   )     ref  ->  [rbp+0x10]   do-not-enreg[H] this class-hnd
 ;  V01 arg1         [V01,T01] (  3,  3   )     ref  ->  rdx         class-hnd
 ;  V02 arg2         [V02,T02] (  3,  3   )     ref  ->   r8         class-hnd
 ;  V03 loc0         [V03,T05] (  2,  2   )     int  ->  [rbp-0x04]   do-not-enreg[H]
 ;  V04 loc1         [V04,T04] (  4,  2   )     ref  ->  [rbp-0x10]   do-not-enreg[H] class-hnd
+;  V05 loc2         [V05,T06] (  2,  2   )     int  ->  rax        
 ;  V06 tmp0         [V06,T07] (  2,  0   )     ref  ->  rcx         class-hnd
 ;  V07 OutArgs      [V07    ] (  1,  1   )  lclBlk (32) [rsp+0x00]  
 ;  V08 PSPSym       [V08    ] (  1,  1   )    long  ->  [rbp-0x20]   do-not-enreg[X] addr-exposed
 ;  V09 rat0         [V09,T03] (  2,  4   )     ref  ->  rax        
 ;
 ; Lcl frame size = 64
 ; Assembly listing for method Program:Func2(ref,ref):int:this
 ; Emitting BLENDED_CODE for X64 CPU with AVX
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; Final local variable assignments
 ;
 ;  V00 this         [V00,T00] (  4,  4   )     ref  ->  [rbp+0x10]   do-not-enreg[H] this class-hnd
 ;  V01 arg1         [V01,T01] (  3,  3   )     ref  ->  rdx         class-hnd
 ;  V02 arg2         [V02,T02] (  3,  3   )     ref  ->   r8         class-hnd
 ;  V03 loc0         [V03,T05] (  2,  2   )     int  ->  [rbp-0x04]   do-not-enreg[H]
 ;  V04 loc1         [V04,T04] (  4,  2   )     ref  ->  [rbp-0x10]   do-not-enreg[H] class-hnd
-;  V05 loc2         [V05,T06] (  2,  2   )     int  ->  rax        
 ;  V05 tmp0         [V05,T06] (  2,  0   )     ref  ->  rcx         class-hnd
 ;  V06 OutArgs      [V06    ] (  1,  1   )  lclBlk (32) [rsp+0x00]  
 ;  V07 PSPSym       [V07    ] (  1,  1   )    long  ->  [rbp-0x20]   do-not-enreg[X] addr-exposed
 ;  V08 rat0         [V08,T03] (  2,  4   )     ref  ->  rax        
 ;
 ; Lcl frame size = 64

@benaadams
Copy link
Member Author

benaadams commented Jan 6, 2018

Not sure the exact terminology but is the analysis correct @mikedn ?

@mikedn
Copy link
Contributor

mikedn commented Jan 6, 2018

Looks like a variant of #7775? @CarolEidt

@benaadams
Copy link
Member Author

benaadams commented Jan 6, 2018

Interesting, ExecutionContext.Run would be a good candidate for examining with "Exception Handling Write Through" as its an extremely exercised function and the stack work it does is pretty heavy due to do-not-enreg[H] dotnet/coreclr#15629 (comment)

@AndyAyersMS
Copy link
Member

In Func1, the jit does enregister ii, it is in RAX as you note:

;  V05 loc2         [V05,T06] (  2,  2   )     int  ->  rax

       488B4510             mov      rax, gword ptr [rbp+10H]    // load 'this'
       8B400C               mov      eax, dword ptr [rax+12]     // ii = _b
       FFC0                 inc      eax                         // ii = _b + 1
       0345FC               add      eax, dword ptr [rbp-04H]    // return = ii + i 

Note this is "live across EH" and hence on the stack, so any field fetch requires two loads.

In Func2 CSC has optimized away ii so the jit only sees references to i. I can't find a way to disable CSC's optimizer without also disabling the jit optimizer.

It could be that fixing the jit to handle never-written locals in the presence of EH would be a nice incremental improvement and would be less complex than full write-through. In particular this is almost never written and forcing it to the stack causes the quite a bit of unexpected extra load traffic.

@benaadams
Copy link
Member Author

Could it only consider candidate merges if they have the same do-not-enreg/enreg state; or is that discovered after?

@AndyAyersMS
Copy link
Member

Not quite sure what "it" you have in mind here.

@benaadams
Copy link
Member Author

benaadams commented Jan 7, 2018

Not sure which stage; but at the start of the dump it starts with loc1 and loc2 at some point loc2 is removed because it can just use loc1's spot.

Which would be fine; if they could both either be registers or both could not be registers.

However, since one can be a register and the other cannot, currently, that should block them being merged? (unless there aren't enough registers already)

@AndyAyersMS
Copy link
Member

The IL I see for Func2 does not have a loc2; ii lives entirely on the eval stack and is never saved as a local. So the "merging" as such is done by the C# compiler.

.method private hidebysig instance int32 
        Func2(class [mscorlib]System.Threading.ContextCallback callback,
              object state) cil managed
{
  // Code size       44 (0x2c)
  .maxstack  2
  .locals init ([0] int32 i,
           [1] class [mscorlib]System.Runtime.ExceptionServices.ExceptionDispatchInfo edi)
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Program::_a
  IL_0006:  stloc.0
  IL_0007:  ldnull
 ... snip ...
  IL_0023:  ldarg.0
  IL_0024:  ldfld      int32 Program::_b
  IL_0029:  ldloc.0
  IL_002a:  add
  IL_002b:  ret
} // end of method Program::Func2

@benaadams
Copy link
Member Author

benaadams commented Jan 7, 2018

Ah, you are correct! Sorry, I was completely misreading the output 😦

@benaadams
Copy link
Member Author

@AndyAyersMS thanks for pointing that out! A solution seems to be to introduce extra variables before and after the EH and to use those; then use a set to only cross EH and to reload the second set of variables from dotnet/coreclr#15629 (comment)

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

5 participants