Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.NET 5.0+ Marshal.PtrToStructure is slower 8x than netcore3.1 #45100

Closed
sunliusi opened this issue Nov 23, 2020 · 13 comments · Fixed by #81941
Closed

.NET 5.0+ Marshal.PtrToStructure is slower 8x than netcore3.1 #45100

sunliusi opened this issue Nov 23, 2020 · 13 comments · Fixed by #81941
Labels
Milestone

Comments

@sunliusi
Copy link

sunliusi commented Nov 23, 2020

Here is my test code:

public class PtrBenchmarkTest
    {
        private readonly byte[] buff = new byte[Marshal.SizeOf<StructObject>()];

        public PtrBenchmarkTest()
        {
            var obj = new StructObject();
            obj.Symbol = "rb2101";
            obj.Exchange = 12;
            obj.Px = 54.90F;
            unsafe
            {
                fixed (byte* bp = buff)
                {
                    Marshal.StructureToPtr(obj, (IntPtr)bp, false);
                }
            }
        }

        [Benchmark]
        public StructObject PtrToStructure()
        {
            unsafe
            {
                fixed (byte* bp = buff)
                {
                    return (StructObject)Marshal.PtrToStructure((IntPtr)bp, typeof(StructObject));
                }
            }
        }
    }

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public class StructObject
    {
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 16)]
        public string Symbol;

        [MarshalAs(UnmanagedType.I8)]
        public Int64 Exchange;

        [MarshalAs(UnmanagedType.R4)]
        public float Px;
    };

netcore3.1 Benchmark test result:

Method Mean Error StdDev Median
PtrToStructure 522.44 ns 29.418 ns 84.878 ns 503.07 ns

net5.0 Benchmark test result:

Method Mean Error StdDev Median
PtrToStructure 4,098.30 ns 132.015 ns 372.350 ns 4,004.22 ns

os: windows 10

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Nov 23, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@GrabYourPitchforks
Copy link
Member

Note to whoever looks into this - the PR #32520 will impact PtrToStructure performance. If you're running benchmarks between 3.x / 5.0 / 6.0 master, double-check whether that PR has been merged, otherwise it could mask any regressions introduced in 5.0.

@jkotas
Copy link
Member

jkotas commented Nov 23, 2020

@jkoritzinsky

@jkoritzinsky
Copy link
Member

This regression was caused by dotnet/coreclr#26340. That PR improved performance in the P/Invoke and Reverse P/Invoke cases, but we didn't realize until recently that it regressed performance for the Marshal APIs.

Generally for best performance in interop scenarios, we recommend using blittable types when possible.

In any case, I'll investigate and see if there's a way we can get this performance loss back in .NET 6.

@NN---
Copy link
Contributor

NN--- commented Nov 23, 2020

@sunliusi Unexpectedly in C# it is type safer to use pointers , also it makes faster code since it doesn't involve copying which Marshal methods do.

@sunliusi
Copy link
Author

sunliusi commented Nov 26, 2020

It is more efficient to use Pointers directly.

here is my code:

[Benchmark] public StructObject ReadIntPtrUtf8() { var packet = new StructObject(); unsafe { fixed (byte* bp = buff) { packet.Symbol = Encoding.UTF8.GetString(bp, 16); Unsafe.Copy(ref packet.Exchange, bp + 16); Unsafe.Copy(ref packet.Px, bp + 24); } } return packet; }

netcore3.1 Benchmark test result:

Method Mean Error StdDev Median
ReadIntPtrUtf8 63.93 ns 2.083 ns 6.011 ns 62.61 ns
PtrToStructure 575.86 ns 25.006 ns 69.292 ns 555.69 ns

net5.0 Benchmark test result:

Method Mean Error StdDev Median
ReadIntPtrUtf8 54.64 ns 2.280 ns 6.541 ns 53.60 ns
PtrToStructure 3,483.66 ns 69.699 ns 169.657 ns 3,424.06 ns

@AaronRobinsonMSFT AaronRobinsonMSFT removed the untriaged New issue has not been triaged by the area owner label Nov 30, 2020
@AaronRobinsonMSFT AaronRobinsonMSFT added this to the 6.0.0 milestone Nov 30, 2020
@AaronRobinsonMSFT
Copy link
Member

I think given we have a best-practice and there was a reason for other more common scenarios to be improved we can defer this for now.

@AaronRobinsonMSFT AaronRobinsonMSFT modified the milestones: 6.0.0, 7.0.0 Jun 22, 2021
@deeprobin
Copy link
Contributor

I suspect it is because we are creating a new instance via the Activator.

/// <summary>
/// Creates a new instance of "structuretype" and marshals data from a
/// native memory block to it.
/// </summary>
[RequiresDynamicCode("Marshalling code for the object might not be available")]
public static object? PtrToStructure(IntPtr ptr,
[DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors | DynamicallyAccessedMemberTypes.NonPublicConstructors)]
Type structureType!!)
{
if (ptr == IntPtr.Zero)
{
return null;
}
if (structureType.IsGenericType)
{
throw new ArgumentException(SR.Argument_NeedNonGenericType, nameof(structureType));
}
if (structureType is not RuntimeType)
{
throw new ArgumentException(SR.Argument_MustBeRuntimeType, nameof(structureType));
}
object structure = Activator.CreateInstance(structureType, nonPublic: true)!;
PtrToStructureHelper(ptr, structure, allowValueClasses: true);
return structure;
}

Can you provide a benchmark which affects the generic overload PtrToStructure<T>.

@AaronRobinsonMSFT AaronRobinsonMSFT modified the milestones: 7.0.0, Future Mar 21, 2022
@AaronRobinsonMSFT AaronRobinsonMSFT changed the title dotnet 5.0 Marshal.PtrToStructure is slower 8x than netcore3.1 .NET 5.0+ Marshal.PtrToStructure is slower 8x than netcore3.1 Mar 21, 2022
@gchernis
Copy link

gchernis commented Feb 3, 2023

The performance issue is real. We have relied on the performant mapping of bytes to structs with MarshalAs/SizeConst for quite some time, with just under 100 hits throughout the code base. Please find the time to resurrect the speed.

@d79ima
Copy link

d79ima commented Feb 3, 2023

The performance issue is real. We have relied on the performant mapping of bytes to structs with MarshalAs/SizeConst for quite some time, with just under 100 hits throughout the code base. Please find the time to resurrect the speed.

I would second that

@AaronRobinsonMSFT
Copy link
Member

The performance issue is real. We have relied on the performant mapping of bytes to structs with MarshalAs/SizeConst for quite some time, with just under 100 hits throughout the code base. Please find the time to resurrect the speed.

For the best performance avoiding this API is recommended. As noted in #45100 (comment), it is far more efficient to fallback to pointers if needed or ideally switch over to LibraryImport with custom struct marshalling - which can provide the most ideal performance.

This isn't to say the issue isn't real, it very much is a regression. The push back is based on the benefits that were gained in the other scenarios as noted in #45100 (comment).

Understanding why teams cannot move to blittable types or LibraryImport will help us to provide guidance but at this point effort to gain back the regression for this API will be minimal.

@Symbai
Copy link

Symbai commented Feb 4, 2023

ideally switch over to LibraryImport with custom struct marshalling - which can provide the most ideal performance.

It would be superior if we could use VS to convert existing structs into custom struct marshalling ones. Especially as we can currently use VS to convert DllImport to LibraryImport. This would makes it a LOT easier for developers to gain the most performance without knowing all the underlying details. Can this be done? Where can I add a request for it?

@AaronRobinsonMSFT
Copy link
Member

Can this be done? Where can I add a request for it?

It can. @jkoritzinsky has been working on a source generator for producing custom struct marshallers. Work is being done for a COM source generator right now. After that, having support for a source generator solution is possible, especially with user request. Feel free to file a new issue in this repo for that feature.

/cc @agocke @dotnet/interop-contrib

jkotas added a commit to jkotas/runtime that referenced this issue Feb 10, 2023
This change adds a simple cache for the marshalling stubs. This cache improves performance of the test mentioned in the issue by 20+x.

Fixes dotnet#45100
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 10, 2023
jkotas added a commit that referenced this issue Feb 10, 2023
…81941)

* Fix major performance regression in legacy struct marshalling APIs

This change adds a simple cache for the marshalling stubs. This cache improves performance of the test mentioned in the issue by 20+x.

Fixes #45100

Co-authored-by: Aaron Robinson <arobins@microsoft.com>
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 10, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.