Implement struct marshalling via IL Stubs instead of via FieldMarshalers #26340

Open
wants to merge 94 commits into
base: master
from

@jkoritzinsky (Member) commented Aug 23, 2019

Currently, our system for marshalling fields of structures between managed and native code is completely separate from our system for marshalling parameters and return values, even though the two systems could share most of their code. This PR unifies them by removing the field marshalers in favor of IL stubs and a new NativeFieldDescriptor concept, which is 9 bytes smaller than the old FieldMarshalers.

Perf numbers:

CoreCLR.dll's size is reduced by ~30kB on Windows x64.

I wrote some microbenchmarks to benchmark marshalling with various types of structs:

Struct Definitions
using System.Runtime.InteropServices;

public class Common
{
    public const int NumArrElements = 2;
}

// Struct definitions
[StructLayout(LayoutKind.Sequential)]
public struct InnerSequential
{
    public int f1;
    public float f2;
    public string f3;
}

// Struct containing one field of array type
[StructLayout(LayoutKind.Sequential)]
public struct InnerArraySequential
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = Common.NumArrElements)]
    public InnerSequential[] arr;
}


public struct HFA
{
    public float f1;
    public float f2;
    public float f3;
    public float f4;
}

[StructLayout(LayoutKind.Sequential)]
public unsafe struct FixedBufferClassificationTest
{
    public fixed int arr[3];
    public NonBlittableFloat f;
}

// A non-blittable wrapper for a float value.
// Used to force a type with a float field to be non-blittable
// and take a different code path.
[StructLayout(LayoutKind.Sequential)]
public struct NonBlittableFloat
{
    public NonBlittableFloat(float f)
    {
        arr = new []{f};
    }

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
    private float[] arr;
    
    public float F => arr[0];
}

[StructLayout(LayoutKind.Sequential)]
public struct S8
{
    public string name;
    public bool gender;
    [MarshalAs(UnmanagedType.Error)]
    public int i32;
    [MarshalAs(UnmanagedType.Error)]
    public uint ui32;
    [MarshalAs(UnmanagedType.U2)]
    public ushort jobNum;
    [MarshalAs(UnmanagedType.I1)]
    public sbyte mySByte;
}
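
For readers who want to poke at struct field marshalling without building a native test library, here is a minimal sketch that round-trips InnerSequential through unmanaged memory using the public Marshal APIs. It is not the microbenchmark behind the numbers in this PR, and whether it exercises exactly the same runtime code path as a P/Invoke by-value call is an assumption. The StructRoundTrip class and its RoundTrip method are hypothetical names introduced only for illustration; InnerSequential is copied from the definitions above so the block is self-contained.

```csharp
using System;
using System.Runtime.InteropServices;

// Copied from the struct definitions above so this block compiles on its own.
[StructLayout(LayoutKind.Sequential)]
public struct InnerSequential
{
    public int f1;
    public float f2;
    public string f3;
}

// Hypothetical helper, not part of the PR.
public static class StructRoundTrip
{
    // Marshals the struct to native memory and back again.
    public static InnerSequential RoundTrip(InnerSequential value)
    {
        // Allocate enough unmanaged memory for the native layout of the struct.
        IntPtr native = Marshal.AllocHGlobal(Marshal.SizeOf<InnerSequential>());
        try
        {
            // Managed -> native: the string field is marshalled to a native string.
            Marshal.StructureToPtr(value, native, fDeleteOld: false);
            try
            {
                // Native -> managed: builds a fresh managed copy.
                return Marshal.PtrToStructure<InnerSequential>(native);
            }
            finally
            {
                // Free the native string allocated for f3.
                Marshal.DestroyStructure<InnerSequential>(native);
            }
        }
        finally
        {
            Marshal.FreeHGlobal(native);
        }
    }
}
```

Calling StructRoundTrip.RoundTrip(new InnerSequential { f1 = 1, f2 = 2.5f, f3 = "abc" }) should hand back a struct with the same three field values, which makes it a quick sanity check when comparing two local runtime builds.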

These are the perf numbers I got on Windows x64. (CoreRun Master is a local Release build of CoreCLR at commit 402af7b.)

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18970
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview9-014004
  [Host]     : .NET Core 3.0.0-preview9-19423-09 (CoreCLR 4.700.19.42102, CoreFX 4.700.19.42104), 64bit RyuJIT
  Job-EFUBVD : .NET Core ? (CoreCLR 5.0.19.42001, CoreFX 4.700.19.35605), 64bit RyuJIT
  Job-WIAUZG : .NET Core ? (CoreCLR 5.0.19.45401, CoreFX 5.0.19.42613), 64bit RyuJIT

| Method | Toolchain | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| S8ByValue | CoreRun Master | 13,962.77 ns | 212.9081 ns | 199.1543 ns | 1.00 | 0.00 | - | - | - | - |
| S8ByValue | CoreRun Struct IL Stubs | 12,518.12 ns | 151.9199 ns | 142.1060 ns | 0.90 | 0.02 | - | - | - | - |
| InnerArraySequentialByValue | CoreRun Master | 27,112.46 ns | 372.1051 ns | 348.0673 ns | 1.00 | 0.00 | - | - | - | - |
| InnerArraySequentialByValue | CoreRun Struct IL Stubs | 25,386.27 ns | 503.3039 ns | 446.1658 ns | 0.94 | 0.02 | - | - | - | - |
| FixedBufferClassificationTestByValue | CoreRun Master | 144.83 ns | 2.4056 ns | 2.2502 ns | 1.00 | 0.00 | - | - | - | - |
| FixedBufferClassificationTestByValue | CoreRun Struct IL Stubs | 77.08 ns | 0.7635 ns | 0.6768 ns | 0.53 | 0.01 | - | - | - | - |

The allocation in InnerArraySequentialByValue is a System.RuntimeMethodInfoStub. The JIT_GetRuntimeMethodStub helper allocates it to implement the ldtoken instruction, which the InnerArraySequential struct IL stub uses to load the token for the nested InnerSequential struct IL stub onto the stack.

In addition to normal CoreCLR testing, I've also run the WinForms integration test suite with this local build of CoreCLR to validate that it doesn't break upstream. I've also run the struct marshalling tests with GCStress modes 3 and C.

jkoritzinsky added 30 commits Jun 21, 2019
… field marshalling logic. Still need to handle WinRT struct field logic correctly.
…arshalers consistent with old FieldMarshaler error reporting.
Fix marshalling of LayoutClass fields in structs.

@jkoritzinsky (Member, Author) commented Aug 27, 2019

/azp run coreclr-ci

@azure-pipelines commented Aug 27, 2019

Pull request contains merge conflicts.

@jkoritzinsky (Member, Author) commented Aug 28, 2019

I've updated the perf benchmarks with the current results. I've removed the allocation from each iteration by caching it in the MethodDesc's LoaderAllocator (which has the same maximum lifetime as the allocated object).

@jkoritzinsky (Member, Author) commented Sep 5, 2019

Perf numbers updated for commit 034db68, where I've changed the ldtoken StructStub to ldftn StructStub and removed the RuntimeMethodInfoStub caching.

@jkoritzinsky jkoritzinsky marked this pull request as ready for review Sep 11, 2019
@jkoritzinsky jkoritzinsky requested a review from AaronRobinsonMSFT Sep 11, 2019
…t-marshalling-ilstubs

@jkoritzinsky (Member, Author) commented Oct 7, 2019

@AaronRobinsonMSFT can you make a review pass when you have a chance?

Member

left a comment

I am halfway through, currently at dllimport.h.

int numChars = strManaged.Length;
if (numChars >= length)
{
    numChars = length - 1;

@AaronRobinsonMSFT (Member) commented Oct 9, 2019

Is there a doc reference for the logic where we apply a guaranteed null terminator?

@jkoritzinsky (Member, Author) commented Oct 11, 2019

I don't believe we have a doc reference for the auto-null-terminator logic.
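
Since there is no doc reference, the truncation rule in the snippet under review can be sketched as a standalone helper. This is only an illustration of the behaviour being discussed: the FixedBufferCopy class and the CopyWithTerminator method are hypothetical names, and the use of a Span<char> destination is an assumption made to keep the example self-contained rather than what the stub code does. The variable names numChars, strManaged, and length mirror the reviewed lines.

```csharp
using System;

// Hypothetical helper, not code from the PR.
public static class FixedBufferCopy
{
    // Copies strManaged into a fixed-size destination buffer, truncating if
    // needed so that the last slot used is always a null terminator.
    public static void CopyWithTerminator(string strManaged, Span<char> destination)
    {
        int length = destination.Length;
        if (length == 0)
        {
            // No room even for the terminator; nothing to write.
            return;
        }

        int numChars = strManaged.Length;
        if (numChars >= length)
        {
            // Reserve one slot for the terminator, as in the reviewed snippet.
            numChars = length - 1;
        }

        strManaged.AsSpan(0, numChars).CopyTo(destination);
        destination[numChars] = '\0';
    }
}
```

With a 4-element buffer and the input "hello", this keeps the first three characters and writes the terminator in the fourth slot; a string that already fits is copied whole and terminated right after its last character.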
