Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT incorrectly reorders expression containing a CSE, resulting in exception thrown in release #18770

Closed
jakobbotsch opened this Issue Jul 3, 2018 · 10 comments

Comments

Projects
None yet
4 participants
@jakobbotsch
Copy link
Collaborator

jakobbotsch commented Jul 3, 2018

The following example throws NullReferenceException in release but not in debug:

// Generated by Fuzzlyn on 2018-07-02 17:44:35
// Seed: 7883145904562184282
// Reduced from 39.4 KiB to 0.4 KiB
// Debug: Runs successfully
// Release: Throws 'System.NullReferenceException'
class C0
{
    public sbyte F0;
    public ushort F7;
    public uint F8;
}

public class Program
{
    static int s_0;
    static C0 s_1 = new C0();

    public static void Main()
    {
        s_0 = 0; // ensure no runtime checks in M0
        M0();
    }

    static void M0()
    {
        bool vr0 = (s_1.F7 < s_1.F0) ^ (s_1.F8 != s_1.F8);
        if (vr0)
        {
            s_1.F7 = s_1.F7;
        }
    }
}

Some more information can be found in #18232 (comment).

@RussKeldorph RussKeldorph added this to the 3.0 milestone Jul 3, 2018

@briansull briansull changed the title RyuJIT performs incorrect CSE resulting in exception thrown in release RyuJIT incorrectly reorders expression containing a CSE, resulting in exception thrown in release Jul 3, 2018

@jakobbotsch

This comment has been minimized.

Copy link
Collaborator Author

jakobbotsch commented Jul 7, 2018

Here is another one which looks to be related to the ordering of side effects:

// Generated by Fuzzlyn on 2018-06-25 13:41:42
// Seed: 16212821363581638952
// Reduced from 46.7 KiB to 0.6 KiB
// Debug: Outputs 0
// Release: Outputs -1
class C0
{
    public long F0;
    public C0(long f0)
    {
        F0 = f0;
    }
}

public class Program
{
    static char s_1;
    static ulong s_2 = 1;
    static int s_6;
    static C0 s_7;
    static C0 s_9 = new C0(-1L);
    public static void Main()
    {
        s_6 = 0;
        M1();
        System.Console.WriteLine(s_7.GetType());
    }

    static void M1()
    {
        long vr1 = ((0 & (M2() + s_9.F0)) | s_1) / (long)s_2;
        s_7 = s_9;
    }

    static int M2()
    {
        s_9 = new C0(0);
        return s_6;
    }
}

The load of s_9 in M1 is before the call of M2, even though it does not appear before M2 in the program:

; Assembly listing for method Program:M1()
; Emitting BLENDED_CODE for X64 CPU with AVX
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 OutArgs      [V00    ] (  1,  1   )  lclBlk (32) [rsp+0x00]
;  V01 cse0         [V01,T00] (  8,  8   )    long  ->  rax
;  V02 cse1         [V02,T01] (  5,  5   )     ref  ->  rsi
;
; Lcl frame size = 32

G_M23187_IG01:
       56                   push     rsi
       4883EC20             sub      rsp, 32

G_M23187_IG02:
       48B87029AF1ABC010000 mov      rax, 0x1BC1AAF2970
       488B30               mov      rsi, gword ptr [rax]                      ; rsi = s_9
       E8F1FAFFFF           call     Program:M2():int                          ; changes s_9
       0FB705CE29EEFF       movzx    rax, word  ptr [reloc classVar[0xb8db5290]]
       8BC0                 mov      eax, eax
       4899                 cdq
       48F73DB729EEFF       idiv     rdx:rax, qword ptr [reloc classVar[0xb8db52a8]]
       48B96829AF1ABC010000 mov      rcx, 0x1BC1AAF2968
       488BD6               mov      rdx, rsi                                  ; old value of s_9
       E8B5A86D5F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       90                   nop

G_M23187_IG03:
       4883C420             add      rsp, 32
       5E                   pop      rsi
       C3                   ret

The tree for the first statement is:

***** BB01, stmt 1
     ( 55, 40)              [000018] ------------              *  STMT      void  (IL   ???...  ???)
N016 (  0,  0)              [000016] ------------              |  /--*  NOP       void   $380
N017 ( 55, 40)              [000017] --CXG-------              \--*  COMMA     void   $380
N014 (  3,  4)              [000014] ----G-------                 |  /--*  CLS_VAR   long   Hnd=0xb8dc52a8 Fseq[s_2] <l:$147, c:$281>
N015 ( 55, 40)              [000015] --CXG-------                 \--*  DIV       long   <l:$149, c:$148>
N012 (  5,  7) CSE #01 (def)[000012] ----G----U--                    |  /--*  CAST      long <- ulong <- uint <l:$146, c:$145>
N011 (  4,  5)              [000011] ----G-------                    |  |  \--*  CLS_VAR   ushort Hnd=0xb8dc5290 Fseq[s_1] <l:$341, c:$340>
N013 ( 32, 33) CSE #01 (use)[000013] --CXG-------                    \--*  OR        long   <l:$146, c:$145>
N009 (  1,  1)              [000002] ------------                       |  /--*  CNS_INT   long   0 $201
N010 ( 26, 25)              [000010] --CXG-------                       \--*  AND       long   $201
N007 (  8, 15)              [000008] ---XG-------                          |  /--*  IND       long   <l:$142, c:$280>
N005 (  1,  1)              [000026] ------------                          |  |  |  /--*  CNS_INT   long   8 field offset Fseq[F0] $200
N006 (  6, 13)              [000027] ----G--N----                          |  |  \--*  ADD       byref  <l:$241, c:$240>
N004 (  5, 12) CSE #02 (def)[000007] x---G-------                          |  |     \--*  IND       ref    <l:$1c0, c:$101>
N003 (  3, 10)              [000028] ------------                          |  |        \--*  CNS_INT(h) long   0x18e9d5d2970 static Fseq[s_9] $180
N008 ( 24, 23)              [000009] --CXG-------                          \--*  ADD       long   <l:$144, c:$143>
N002 ( 15,  7)              [000006] --CXG-------                             \--*  CAST      long <- int $140
N001 ( 14,  5)              [000003] --CXG-------                                \--*  CALL      int    Program.M2 $c0

But it gets messed up after the first CSE:

optValnumCSE morphed tree:
N014 (  0,  0)              [000016] ------------              /--*  NOP       void   $380
N015 ( 48, 32)              [000017] -ACXG-------              *  COMMA     void   $380
N012 (  3,  4)              [000014] ----G-------              |  /--*  CLS_VAR   long   Hnd=0xb8dc52a8 Fseq[s_2] <l:$147, c:$281>
N013 ( 48, 32)              [000015] -ACXG-------              \--*  DIV       long   <l:$149, c:$148>
N010 (  1,  1)              [000036] ------------                 |  /--*  LCL_VAR   long   V01 cse0          <l:$146, c:$145>
N011 ( 25, 25)              [000038] -ACXG-------                 \--*  COMMA     long   <l:$146, c:$145>
N005 (  5,  7)              [000012] ----G----U--                    |        /--*  CAST      long <- ulong <- uint <l:$146, c:$145>
N004 (  4,  5)              [000011] ----G-------                    |        |  \--*  CLS_VAR   ushort Hnd=0xb8dc5290 Fseq[s_1] <l:$341, c:$340>
N007 (  5,  7)              [000032] -A--G---R---                    |     /--*  ASG       long   $VN.Void
N006 (  1,  1)              [000031] D------N----                    |     |  \--*  LCL_VAR   long   V01 cse0          <l:$146, c:$145>
N008 ( 19, 12)              [000035] -ACXG-------                    |  /--*  COMMA     void   $VN.Void
N003 ( 14,  5)              [000003] --CXG-------                    |  |  \--*  CALL      int    Program.M2 $c0
N009 ( 24, 24)              [000037] -ACXG-------                    \--*  COMMA     void   $VN.Void
N002 (  5, 12) CSE #02 (def)[000007] x---G-------                       \--*  IND       ref    <l:$1c0, c:$101>            ; before M2 call
N001 (  3, 10)              [000028] ------------                          \--*  CNS_INT(h) long   0x18e9d5d2970 static Fseq[s_9] $180

Which then breaks the second CSE as well. If s_9 is assigned null in its initializer this throws a NRE instead.

Like @mikedn pointed out in #18232, there is also a parent node that gets the same VN as one of its children (CSE #01 use and def) in this case. So I assume this is the same problem.

cc @mikedn @briansull

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Jul 17, 2018

This code seems suspect:

coreclr/src/jit/optcse.cpp

Lines 2202 to 2210 in f69989e

// Now we need to unmark any nested CSE's uses that are found in 'exp'
// As well we extract any nested CSE defs that are found in 'exp' and
// these are appended to the sideEffList
// Afterwards the set of nodes in the 'sideEffectList' are preserved and
// all other nodes are removed and have their ref counts decremented
//
exp->gtCSEnum = NO_CSE; // clear the gtCSEnum field
bool result = m_pCompiler->optValnumCSE_UnmarkCSEs(exp, &sideEffList);

So side effects are extracted first and then CSE defs are extracted and appended to the side effect list. Hmm, how is the original order preserved?

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Jul 21, 2018

@briansull Is there a test for VSO 566984? I don't see any in #16367 and #16413.

It looks to me that we can avoid all this complication by extracting side effects/CSE defs and unmarking CSEs at the same time, during a single tree traversal. Something like mikedn@57fac83

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Jul 24, 2018

@briansull Thanks for the repro! It looks like it still works well with my changes so I'll probably put together a PR to fix this issue one of these days.

@briansull

This comment has been minimized.

Copy link
Contributor

briansull commented Jul 24, 2018

I am currently working on adding the missing VNPWithExc information for nodes that can throw exceptions, such as GT_DIV and GT_IND. My work should address #8648

@jakobbotsch

This comment has been minimized.

Copy link
Collaborator Author

jakobbotsch commented Aug 20, 2018

Not sure if this is related in some way:

// Debug: Outputs 0
// Release: Outputs 1
public class Program
{
    public static void Main()
    {
        long vr4 = 0;
        long vr5 = vr4 ^ System.Threading.Interlocked.Exchange(ref vr4, 1);
        System.Console.WriteLine(vr5);
    }
}

The code generated is:

G_M5092_IG01:
       4883EC28             sub      rsp, 40
       33C0                 xor      rax, rax
       4889442420           mov      qword ptr [rsp+20H], rax

G_M5092_IG02:
       33C9                 xor      rcx, rcx
       48894C2420           mov      qword ptr [rsp+20H], rcx
       488D4C2420           lea      rcx, bword ptr [rsp+20H]
       B801000000           mov      eax, 1
       488701               xchg     qword ptr [rcx], rax
       488BC8               mov      rcx, rax
       48334C2420           xor      rcx, qword ptr [rsp+20H]
       E82CFCFFFF           call     System.Console:WriteLine(long)
       90                   nop

G_M5092_IG03:
       4883C428             add      rsp, 40
       C3                   ret

I can try to take a closer look tomorrow.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

Not related to CSE. It looks like in this case the second operand of xor was incorrectly marked as reg optional which basically caused the load from [rsp+20h] to move past xchg. If I disable that reg optional code the following code is generated:

       33C9                 xor      rcx, rcx
       48894C2420           mov      qword ptr [rsp+20H], rcx
       488B4C2420           mov      rcx, qword ptr [rsp+20H]
       488D442420           lea      rax, bword ptr [rsp+20H]
       BA01000000           mov      edx, 1
       488710               xchg     qword ptr [rax], rdx
       4833CA               xor      rcx, rdx
       E804FCFFFF           call     System.Console:WriteLine(long)

Seems to be the same as #12398.
cc @CarolEidt

@jakobbotsch

This comment has been minimized.

Copy link
Collaborator Author

jakobbotsch commented Aug 21, 2018

Thank you for the analysis! It seems to be able to cause problems in some less 'exotic' ways as well:

// Generated by Fuzzlyn v1.1 on 2018-08-21 04:34:52
// Seed: 4306035957799762988
// Reduced from 25.6 KiB to 0.3 KiB in 00:00:21
// Debug: Outputs 0
// Release: Outputs 1
public class Program
{
    static long s_1;
    static int s_3;
    public static void Main()
    {
        int vr16 = s_3;
        int vr19 = System.Threading.Interlocked.Exchange(ref s_3, 1);
        s_1 = vr16;
        System.Console.WriteLine(s_1);
    }
}
@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

This looks like a different issue, a quick look at the dump shows that the load from s_3 moved past exchange quite early, possibly as early as import time.

Importing BB01 (PC=000) of 'Program:Main()'
    [ 0]   0 (0x000) ldsfld 04000002
    [ 1]   5 (0x005) ldsflda 04000002
    [ 2]  10 (0x00a) ldc.i4.1 1
    [ 3]  11 (0x00b) call 0A000006
In Compiler::impImportCall: opcode is call, kind=0, callRetType is int, structSize is 0

    [ 2]  16 (0x010) pop

               [000007] ------------              *  STMT      void  (IL 0x000...  ???)
               [000005] ------------              |  /--*  NOP       void  
               [000006] -A--G-------              \--*  COMMA     void  
               [000003] ------------                 |  /--*  CNS_INT   int    1
               [000004] -A--G-------                 \--*  XCHG      int   
               [000002] ------------                    \--*  CNS_INT(h) long   0x7ffb4ffb4598 static Fseq[s_3]

    [ 1]  17 (0x011) conv.i8
    [ 1]  18 (0x012) stsfld 04000001

               [000011] ------------              *  STMT      void  (IL   ???...  ???)
               [000008] ----G-------              |  /--*  CAST      long <- int
               [000001] ----G-------              |  |  \--*  FIELD     int    s_3
               [000010] -A--G-------              \--*  ASG       long  
               [000009] ----G--N----                 \--*  FIELD     long   s_1
@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 29, 2018

@RussKeldorph You can assign to me, fix in progress in #19125

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.