Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT: Incorrect ordering around Interlocked.Exchange and Interlocked.CompareExchange #19583

Closed
jakobbotsch opened this Issue Aug 21, 2018 · 13 comments

Comments

Projects
None yet
6 participants
@jakobbotsch
Copy link
Collaborator

jakobbotsch commented Aug 21, 2018

Example:

// Debug: Outputs 0
// Release: Outputs 1
public class Program
{
    static long s_1;
    static int s_3;
    public static void Main()
    {
        int vr16 = s_3;
        int vr19 = System.Threading.Interlocked.Exchange(ref s_3, 1);
        s_1 = vr16;
        System.Console.WriteLine(s_1);
    }
}

It also reproduces if Exchange(ref s_3, 1) is replaced by CompareExchange(ref s_3, 1, s_3).

Quoting @mikedn:

a quick look at the dump shows that the load from s_3 moved past exchange quite early, possibly as early as import time.

Importing BB01 (PC=000) of 'Program:Main()'
    [ 0]   0 (0x000) ldsfld 04000002
    [ 1]   5 (0x005) ldsflda 04000002
    [ 2]  10 (0x00a) ldc.i4.1 1
    [ 3]  11 (0x00b) call 0A000006
In Compiler::impImportCall: opcode is call, kind=0, callRetType is int, structSize is 0

    [ 2]  16 (0x010) pop

               [000007] ------------              *  STMT      void  (IL 0x000...  ???)
               [000005] ------------              |  /--*  NOP       void  
               [000006] -A--G-------              \--*  COMMA     void  
               [000003] ------------                 |  /--*  CNS_INT   int    1
               [000004] -A--G-------                 \--*  XCHG      int   
               [000002] ------------                    \--*  CNS_INT(h) long   0x7ffb4ffb4598 static Fseq[s_3]

    [ 1]  17 (0x011) conv.i8
    [ 1]  18 (0x012) stsfld 04000001

               [000011] ------------              *  STMT      void  (IL   ???...  ???)
               [000008] ----G-------              |  /--*  CAST      long <- int
               [000001] ----G-------              |  |  \--*  FIELD     int    s_3
               [000010] -A--G-------              \--*  ASG       long  
               [000009] ----G--N----                 \--*  FIELD     long   s_1
@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

Yup, importer bug. Compiler::impAppendStmt does not see that Exchange modifies memory, it's neither a call nor an actual assignment. This means that the tree generated by the first ldsfld does not get spilled and ends up after XCHG.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

I thought that this may caused by vr19 not being used but it happens with the following example as well:

int vr16 = s_3;
int vr19 = System.Threading.Interlocked.Exchange(ref s_3, 1);
s_1 = vr16;
System.Console.WriteLine(s_1);
return vr19;

The way impAppendStmt works might mean that we need to add GTF_CALL to these intrinsics, that's how it seems to have been designed. Hrm.

@CarolEidt

This comment has been minimized.

Copy link
Member

CarolEidt commented Aug 21, 2018

The way impAppendStmt works might mean that we need to add GTF_CALL to these intrinsics, that's how it seems to have been designed. Hrm.

That definitely doesn't sound ideal. The fundamental issue is that it is assuming that writes can only happen as assignments or calls. I wonder if we have similar issues with the hardware intrinsics that perform stores.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

I wonder if we have similar issues with the hardware intrinsics that perform stores.

I was going to look at that, it seems highly likely.

So yeah, side effects and ordering - the other worst feature of JIT's IR. This and various other issues would very likely not exist in LIR or some variation of it.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 21, 2018

Tried hw intrinsics, sort of. I couldn't make a similar repro due to various restrictions (e.g. can't take the address of a vector). And in the example I looked at the problem doesn't occur for a simple reason - the hw intrinsic store node has GTF_EXCEPT. The same is true for Exchange, if instance fields are used then the code works correctly.

I suppose it may still be possible to reproduce this issue with hw intrinsics but it's going to be rather difficult to produce a hw intrinsic store without GTF_EXCEPT. In fact, I wouldn't be surprised to find out that they always have GTF_EXCEPT, even if the store address is obviously non-null.

@CarolEidt

This comment has been minimized.

Copy link
Member

CarolEidt commented Aug 21, 2018

@mikedn - thanks for looking into this!

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 22, 2018

thanks for looking into this!

No problem. Hopefully if I take care of some of the bugs the team can take care of more important things :)

Looks like it can happen with hw intrinsics as well, even if they have GTF_EXCEPT:

        vec v = default;
        Vector128<int> o = Sse2.SetAllVector128(1);
        int vr16 = v.y;
        Sse2.Store(&v.x, o);
        System.Console.WriteLine(vr16);

generates:

       B901000000           mov      ecx, 1
       C4E1796EC1           vmovd    xmm0, xrcx
       C4E17970C000         vpshufd  xmm0, xmm0, 0
       488D4C2428           lea      rcx, bword ptr [rsp+28H]
       C4E17A7F01           vmovdqu  xmmword ptr [rcx], xmm0
       8B4C242C             mov      ecx, dword ptr [rsp+2CH]  ; v.y load moved past store
       E8CBEEFFFF           call     System.Console:WriteLine(int)

And unlike the original example this happens even in minopts because these intrinsics are always expanded.

More later, heading to work.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Aug 22, 2018

Should be an easy fix, I think. impAppendStmt should check for GTF_ASG in addition to GT_ASG.

            if (expr->gtOper == GT_ASG)
            {
                GenTree* lhs = expr->gtGetOp1();
                // If we are assigning to a global ref, we have to spill global refs on stack.
                // TODO-1stClassStructs: Previously, spillGlobEffects was set to true for
                // GT_INITBLK and GT_COPYBLK, but this is overly conservative, and should be
                // revisited. (Note that it was NOT set to true for GT_COPYOBJ.)
                if (!expr->OperIsBlkOp())
                {
                    // If we are assigning to a global ref, we have to spill global refs on stack
                    if ((lhs->gtFlags & GTF_GLOB_REF) != 0)
                    {
                        spillGlobEffects = true;
                    }

So yes, spill if assigning to a global ref. And if expr is not GT_ASG but it has GTF_ASG then it means that it's some special tree that the importer doesn't quite understand so it should be spilled as well.

@erozenfeld

This comment has been minimized.

Copy link
Member

erozenfeld commented Sep 5, 2018

Two more instances of this pointed out by @jakobbotsch in #12398:

// Debug: Outputs 1
// Release: Outputs 0
public class Program
{
    static int s_32;
    static int s_46 = 1;
    public static void Main()
    {
        s_32 = 0;
        M5();
        System.Console.WriteLine(s_46);
    }

    static void M5()
    {
        s_46 *= (System.Threading.Interlocked.Exchange(ref s_46, 0) | s_32--);
    }
}
// Debug: Outputs 0
// Release: Outputs 1
public class Program
{
    static int s_3;
    static int s_11;
    public static void Main()
    {
        M9(s_3, System.Threading.Interlocked.Exchange(ref s_3, 1), s_11++);
    }

    static void M9(int arg2, int arg3, int arg4)
    {
        System.Console.WriteLine(arg2);
    }
}
@erozenfeld

This comment has been minimized.

Copy link
Member

erozenfeld commented Sep 5, 2018

@mikedn The fix you proposed won't be sufficient for the first example I added. In that case impAppendStmt is called on

               [000011] ------------              *  STMT      void  (IL 0x000...  ???)
               [000003] ------------              |     /--*  CNS_INT   int    0
               [000004] -A--G-------              |  /--*  XCHG      int   
               [000002] ------------              |  |  \--*  CNS_INT(h) long   0x7ffcdebe4658 static Fseq[s2]
               [000010] -A--G-------              \--*  ASG       int   
               [000009] D------N----                 \--*  LCL_VAR   int    V02 tmp2 

It looks like that in addition to your proposed fix we need to set spillGlobEffects to true if we are processing a GT_ASG whose RHS has GTF_ASG set.

@erozenfeld

This comment has been minimized.

Copy link
Member

erozenfeld commented Sep 5, 2018

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Sep 13, 2018

It looks like that in addition to your proposed fix we need to set spillGlobEffects to true if we are processing a GT_ASG whose RHS has GTF_ASG set.

Yep.

And maybe we should also check for GTF_ASG on the LHS. In the non-struct case we're probably covered by the existing GTF_GLOB_REF check but in the struct case, well, I'm not sure what will happen if such a case ever arises.

@mikedn

This comment has been minimized.

Copy link
Contributor

mikedn commented Sep 13, 2018

Fix in #19950

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.