New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARM/Linux] ARM-softfp delegate code generation issue #5275

Closed
papaslavik opened this Issue May 27, 2016 · 16 comments

Comments

Projects
None yet
4 participants
@papaslavik
Contributor

papaslavik commented May 27, 2016

Update:

the issue in fact occurs in delegate code generation, and this small example shows the problem:

using System;
using Point = System.Numerics.Vector2;
namespace ClassLibrary
{
    public class test
    {
        public delegate void PointPrinter(Point a);
        static void cmp(Point a)
        {
            Console.WriteLine("cmp a.X = " + a.X.ToString("R") + "  a.Y = " + a.Y.ToString("R"));
        }
       static int Main(string[] args)
        {
            Point pi;
            pi.X = 1;
            pi.Y = -1;
            PointPrinter ccmp = cmp;
            ccmp(pi);
            return 0;
        }
   }
}

What follows below is my initial description of the issue.

Testing JIT.SIMD.CircleInConvex_r.CircleInConvex_r in arm-softfp we discovered an issue with generics when the floating point arguments were passed.

There's a simplified code which will demostrate the issue:

using System;
using System.Collections.Generic;
using Point = System.Numerics.Vector2;
namespace ClassLibrary
{
    public class test
    {
        static int cmp(Point a, Point b)
        {
            Console.WriteLine("cmp a.X = " + a.X.ToString("R") + "  a.Y = " + b.Y.ToString("R") + " b.X = " + b.X.ToString("R") + "  b.Y = " + b.Y.ToString("R"));
            if (a.X < b.X || a.X == b.X && a.Y < b.Y)
                return 1;
            else
                return 0;
        }
        private class CmpTest<T> {
          public static int DoCmpI(Comparison<T> ccmp, List<T> a, int i) {
            return ccmp(a[i-1], a[i]);
          }
        }

       static int Main(string[] args)
        {
            List<Point> points = new List<Point>();
            for (int i = 0; i < 100; ++i)
            {
                Point pi;
                pi.X = i;
                pi.Y = -i;
                points.Add(pi);
                if (i>=1) {
                  CmpTest<Point>.DoCmpI(cmp, points, i);
                }
            }
            return 0;
        }
   }
}
@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 27, 2016

Contributor

For generics we pass an additional parameter in r0, which makes the last soft-fp argument go to the stack.
But the code generated for the cmp function doesn't know anything about how it's used in generics.

000068 ED93 4A00 vldr s8, [r3]
00006C EDD3 4A01 vldr s9, [r3+4]
000070 EDCD 4A00 vstr s9, [sp]  // [V18 OutArgs]
000074 EE14 3A10 vmov.f2i r3, s8
000078 EE18 1A10 vmov.f2i r1, s16
00007C EE18 2A90 vmov.f2i r2, s17
000080 4630 mov r0, r6
000082 68C4 ldr r4, [r0+12]
000084 6840 ldr r0, [r0+4]
000086 47A0 blx r4

And so the r0 gets spoiled for the arm-softfp abi.

if we try to do the same with the integer arguments for ARM hard then debugging the calls we'll see the trampoline code looking like the following:

0x75b0b060 ldr.w r12, r0, #16
0x75b0b064 mov r0, r1
0x75b0b066 mov r1, r2
0x75b0b068 mov r2, r3
0x75b0b06a ldr.w r3, [sp]
0x75b0b06e bx r12 

could someone give me a hint of where to look for the source of the generated trampoline?
@pgavlin @JohnChen0

Contributor

papaslavik commented May 27, 2016

For generics we pass an additional parameter in r0, which makes the last soft-fp argument go to the stack.
But the code generated for the cmp function doesn't know anything about how it's used in generics.

000068 ED93 4A00 vldr s8, [r3]
00006C EDD3 4A01 vldr s9, [r3+4]
000070 EDCD 4A00 vstr s9, [sp]  // [V18 OutArgs]
000074 EE14 3A10 vmov.f2i r3, s8
000078 EE18 1A10 vmov.f2i r1, s16
00007C EE18 2A90 vmov.f2i r2, s17
000080 4630 mov r0, r6
000082 68C4 ldr r4, [r0+12]
000084 6840 ldr r0, [r0+4]
000086 47A0 blx r4

And so the r0 gets spoiled for the arm-softfp abi.

if we try to do the same with the integer arguments for ARM hard then debugging the calls we'll see the trampoline code looking like the following:

0x75b0b060 ldr.w r12, r0, #16
0x75b0b064 mov r0, r1
0x75b0b066 mov r1, r2
0x75b0b068 mov r2, r3
0x75b0b06a ldr.w r3, [sp]
0x75b0b06e bx r12 

could someone give me a hint of where to look for the source of the generated trampoline?
@pgavlin @JohnChen0

@papaslavik

This comment has been minimized.

Show comment
Hide comment
Contributor

papaslavik commented May 30, 2016

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

If I replace the occurrence of Point in the above example with a new structure PointInt which has the members X and Y integers, then in the code generation process I see the call to
System.Comparison`1[PointInt][ClassLibrary.test+PointInt].Invoke

Contributor

papaslavik commented May 30, 2016

If I replace the occurrence of Point in the above example with a new structure PointInt which has the members X and Y integers, then in the code generation process I see the call to
System.Comparison`1[PointInt][ClassLibrary.test+PointInt].Invoke

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

The whole call tree looks like this, so it is the call with the 2 structs and additional this argument in r0:

N016 ( 35, 29) [000036] --CXG-------- >>>         *  call      int    System.Comparison`1[PointInt][ClassLibrary.test+PointInt].Invoke $214
N008 (  9,  7) [000043] ---XG-------- arg2 in r3  +--*  obj       struct (last use) $445
N007 (  3,  3) [000042] L------------             |  \--*  addr      byref  $482
N006 (  3,  2) [000041] -------N-----             |     \--*  lclVar    struct(U) V06 tmp2         
                                                  |     \--*    int    V06.X (offs=0x00) -> V25 tmp21         r2
                                                  |     \--*    int    V06.Y (offs=0x04) -> V26 tmp22         r4 $444
N011 (  9,  7) [000046] ---XG-------- arg1 in r1  +--*  obj       struct (last use) $447
N010 (  3,  3) [000045] L------------             |  \--*  addr      byref  $483
N009 (  3,  2) [000034] -------N-----             |     \--*  lclVar    struct(U) V05 tmp1         
                                                  |     \--*    int    V05.X (offs=0x00) -> V23 tmp19         r8
                                                  |     \--*    int    V05.Y (offs=0x04) -> V24 tmp20         r9 $446
N012 (  1,  1) [000023] ------------- this in r0  \--*  regVar    ref    V04 tmp0          r7 (last use) REG r0 RV $2c1
Contributor

papaslavik commented May 30, 2016

The whole call tree looks like this, so it is the call with the 2 structs and additional this argument in r0:

N016 ( 35, 29) [000036] --CXG-------- >>>         *  call      int    System.Comparison`1[PointInt][ClassLibrary.test+PointInt].Invoke $214
N008 (  9,  7) [000043] ---XG-------- arg2 in r3  +--*  obj       struct (last use) $445
N007 (  3,  3) [000042] L------------             |  \--*  addr      byref  $482
N006 (  3,  2) [000041] -------N-----             |     \--*  lclVar    struct(U) V06 tmp2         
                                                  |     \--*    int    V06.X (offs=0x00) -> V25 tmp21         r2
                                                  |     \--*    int    V06.Y (offs=0x04) -> V26 tmp22         r4 $444
N011 (  9,  7) [000046] ---XG-------- arg1 in r1  +--*  obj       struct (last use) $447
N010 (  3,  3) [000045] L------------             |  \--*  addr      byref  $483
N009 (  3,  2) [000034] -------N-----             |     \--*  lclVar    struct(U) V05 tmp1         
                                                  |     \--*    int    V05.X (offs=0x00) -> V23 tmp19         r8
                                                  |     \--*    int    V05.Y (offs=0x04) -> V24 tmp20         r9 $446
N012 (  1,  1) [000023] ------------- this in r0  \--*  regVar    ref    V04 tmp0          r7 (last use) REG r0 RV $2c1

@papaslavik papaslavik changed the title from [ARM/Linux] ARM-softfp generic issue to [ARM/Linux] ARM-softfp delegate code generation issue May 30, 2016

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

The issue is in fact related to the delegate calls: see the changed text of the description of the issue.

Contributor

papaslavik commented May 30, 2016

The issue is in fact related to the delegate calls: see the changed text of the description of the issue.

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

The delegate call for the last example looks like this:

===================================================================== *Tree 1
               [000025] --C-G-------- >>>         *  call      void   PointPrinter.Invoke
               [000023] ------------- this in r0  +--*  lclVar    ref    V02 loc1         
               [000027] ----G-------- arg1        \--*  obj       struct
               [000026] L------------                \--*  addr      byref 
               [000024] -------------                   \--*  lclVar    struct(P) V01 loc0         
                                                        \--*    float  V01.X (offs=0x00) -> V05 tmp2         
                                                        \--*    float  V01.Y (offs=0x04) -> V06 tmp3         
$2 = void

Contributor

papaslavik commented May 30, 2016

The delegate call for the last example looks like this:

===================================================================== *Tree 1
               [000025] --C-G-------- >>>         *  call      void   PointPrinter.Invoke
               [000023] ------------- this in r0  +--*  lclVar    ref    V02 loc1         
               [000027] ----G-------- arg1        \--*  obj       struct
               [000026] L------------                \--*  addr      byref 
               [000024] -------------                   \--*  lclVar    struct(P) V01 loc0         
                                                        \--*    float  V01.X (offs=0x00) -> V05 tmp2         
                                                        \--*    float  V01.Y (offs=0x04) -> V06 tmp3         
$2 = void

@janvorli

This comment has been minimized.

Show comment
Hide comment
@janvorli

janvorli May 30, 2016

Member

@papaslavik - the delegate thunk is generated by the COMDelegate::SetupShuffleThunk, which uses GenerateShuffleArray to determine what shuffles are necessary.

Member

janvorli commented May 30, 2016

@papaslavik - the delegate thunk is generated by the COMDelegate::SetupShuffleThunk, which uses GenerateShuffleArray to determine what shuffles are necessary.

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

at code execution the trampoline for the delegate example with Point based on floats looks like the following:

0x75972060 ldr.w  r12, [r0, #16]                                                                                                                                                            
0x75972064      bx     r12    
Contributor

papaslavik commented May 30, 2016

at code execution the trampoline for the delegate example with Point based on floats looks like the following:

0x75972060 ldr.w  r12, [r0, #16]                                                                                                                                                            
0x75972064      bx     r12    
@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 30, 2016

Contributor

@janvorli thanks a lot! will look in that direction.

Contributor

papaslavik commented May 30, 2016

@janvorli thanks a lot! will look in that direction.

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 31, 2016

Contributor

So, in src/vm/comdelegate.cpp we work with the calling conventions, in particular we have the code working for ARM with the following comments:

// Find the argument location mapping for both source and destination signature. A single argument can
// occupy a floating point register (in which case we don't need to do anything, they're not shuffled)
// or some combination of general registers and the stack.

And that is done before we do the fgMorphCall in which the convention is in fact implemented, so there's no information about ARM soft fp abi at that point of compilation.

Presumably, we need to fix ShuffleIterator::HasNextOfs() function for ARM softfp

Contributor

papaslavik commented May 31, 2016

So, in src/vm/comdelegate.cpp we work with the calling conventions, in particular we have the code working for ARM with the following comments:

// Find the argument location mapping for both source and destination signature. A single argument can
// occupy a floating point register (in which case we don't need to do anything, they're not shuffled)
// or some combination of general registers and the stack.

And that is done before we do the fgMorphCall in which the convention is in fact implemented, so there's no information about ARM soft fp abi at that point of compilation.

Presumably, we need to fix ShuffleIterator::HasNextOfs() function for ARM softfp

@janvorli

This comment has been minimized.

Show comment
Hide comment
@janvorli

janvorli May 31, 2016

Member

@papaslavik - what is the difference between the soft and hard abi in argument passing? Does the softfp mean that floating point arguments are passed in general purpose registers?
If that's the case, then the ArgIteratorTemplate<ARGITERATOR_BASE>::GetNextOffset() in the callingconvention.h below the
https://github.com/dotnet/coreclr/blob/master/src/vm/callingconvention.h#L1064
needs to be modified for the softfp.
This code is responsible for determining which argument goes into which register - it returns an integer offset into the transition block where the parameter is stored.

Member

janvorli commented May 31, 2016

@papaslavik - what is the difference between the soft and hard abi in argument passing? Does the softfp mean that floating point arguments are passed in general purpose registers?
If that's the case, then the ArgIteratorTemplate<ARGITERATOR_BASE>::GetNextOffset() in the callingconvention.h below the
https://github.com/dotnet/coreclr/blob/master/src/vm/callingconvention.h#L1064
needs to be modified for the softfp.
This code is responsible for determining which argument goes into which register - it returns an integer offset into the transition block where the parameter is stored.

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 31, 2016

Contributor

@janvorli yes, the float arguments are passed on integer registers.
Thanks for the advice.

Contributor

papaslavik commented May 31, 2016

@janvorli yes, the float arguments are passed on integer registers.
Thanks for the advice.

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik May 31, 2016

Contributor

In fact that is very similar to varargs passing, so I will stick to that in the code you pointed at:
if (fFloatingPoint && !this->IsVarArg())

Contributor

papaslavik commented May 31, 2016

In fact that is very similar to varargs passing, so I will stick to that in the code you pointed at:
if (fFloatingPoint && !this->IsVarArg())

@papaslavik

This comment has been minimized.

Show comment
Hide comment
@papaslavik

papaslavik Jun 1, 2016

Contributor

@janvorli are there simplified tests for the delegates in the framework, or perhaps, where we could add our simple test to check that the conventions work?

Contributor

papaslavik commented Jun 1, 2016

@janvorli are there simplified tests for the delegates in the framework, or perhaps, where we could add our simple test to check that the conventions work?

papaslavik pushed a commit to papaslavik/coreclr that referenced this issue Jun 1, 2016

@janvorli

This comment has been minimized.

Show comment
Hide comment
@janvorli

janvorli Jun 2, 2016

Member

@papaslavik There are delegate tests in tests/src/CoreMangLib/cti/system/delegate and tests/src/CoreMangLib/cti/system/multicastdelegate. I don't see tests related to passing floats or doubles, so it seems it would make sense to add one in there.

Member

janvorli commented Jun 2, 2016

@papaslavik There are delegate tests in tests/src/CoreMangLib/cti/system/delegate and tests/src/CoreMangLib/cti/system/multicastdelegate. I don't see tests related to passing floats or doubles, so it seems it would make sense to add one in there.

@RussKeldorph RussKeldorph added this to the Future milestone Jun 2, 2016

@mkborg

This comment has been minimized.

Show comment
Hide comment
Contributor

mkborg commented Jun 3, 2016

mkborg pushed a commit to papaslavik/coreclr that referenced this issue Jun 6, 2016

papaslavik added a commit to papaslavik/coreclr that referenced this issue Jun 6, 2016

papaslavik added a commit to papaslavik/coreclr that referenced this issue Jun 6, 2016

Delegate ABI Test
Detects issue #5275

mkborg pushed a commit to papaslavik/coreclr that referenced this issue Jun 6, 2016

Delegate ABI Test
Detects issue #5275

papaslavik added a commit to papaslavik/coreclr that referenced this issue Jun 7, 2016

Delegate ABI Test
Detects issue #5275

mkborg pushed a commit to papaslavik/coreclr that referenced this issue Jun 7, 2016

Delegate ABI Test
Detects issue #5275

jkotas added a commit that referenced this issue Jun 7, 2016

@papaslavik papaslavik closed this Jun 8, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment