Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: RuntimeHelpers.Box to create a box around a dynamically-typed byref #97341

Closed
jkoritzinsky opened this issue Jan 22, 2024 · 32 comments · Fixed by #100561
Closed
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@jkoritzinsky
Copy link
Member

jkoritzinsky commented Jan 22, 2024

Background and motivation

In interop scenarios where we have unbounded generic APIs that can take blittable or non-blittable types, we can end up in a scenario where we have a generic type T that has a corresponding type TAbi. However, there is no way to represent TAbi as a generic argument, so we have a mechanism to retrieve it as a System.Type instance. This allows us to still reason about the type.

If we want to do anything with the type though (like copy values around, go from a pointer into a buffer to a managed representation of it), we end up needing to use the APIs on System.Runtime.InteropServices.Marshal that take a System.Type, which are slow and not AOT-compatible.

We hit this scenario in CsWinRT when we are trying to marshal from WinRT a T[] where T is non-blittable. Today CsWinRT uses the APIs on Marshal even though the type will always be blittable as there's no alternative APIs.

API Proposal

namespace System.Runtime.CompilerServices;

public static class RuntimeHelpers
{
+    public object? Box(ref byte target, RuntimeTypeHandle type);
}

API Usage

int length = ...;
IntPtr data = ...;
if (data == IntPtr.Zero)
{
    return null;
}
var array = new T[length];
var data = (byte*)data.ToPointer();
var abi_element_size = Marshal.SizeOf(AbiType); // #97344 API would go here to get rid of the remaining usage of Marshal.SizeOf
for (int i = 0; i < abi.length; i++)
{
    array[i] = Marshaler<T>.FromAbi(RuntimeHelpers.Box(ref *data, AbiType.TypeHandle));
    data += abi_element_size;
}
return array;

Alternative Designs

Mentioned below in the collapsable section.

Risks

Minimal risk.

Previous Proposal

API Proposal

namespace System;

public ref struct TypedReference
{
+    public TypedReference(ref byte target, RuntimeTypeHandle type);
}

API Usage

int length = ...;
IntPtr data = ...;
if (data == IntPtr.Zero)
{
    return null;
}
var array = new T[length];
var data = (byte*)data.ToPointer();
var abi_element_size = Marshal.SizeOf(AbiType); // #97344 API would go here to get rid of the remaining usage of Marshal.SizeOf
for (int i = 0; i < abi.length; i++)
{
    var abi_element_ref = new TypedReference(ref *data, AbiType.TypeHandle);
    array[i] = Marshaler<T>.FromAbi(abi_element_ref.ToObject());
    data += abi_element_size;
}
return array;

Alternative Designs

We could provide another API on RuntimeHelpers to read and box an unmanaged value-type from a buffer, or even provide a higher level API over a ReadOnlySpan<byte>, like a MemoryMarshal.Read(ref ReadOnlySpan<byte>, System.Type) method.

We could make this method a static method with an Unsafe suffix to further describe its unsafeness.

Risks

System.TypedReference is a rarely used type. There may be issues in the runtime implementations around supporting it.

@jkoritzinsky jkoritzinsky added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime labels Jan 22, 2024
@ghost
Copy link

ghost commented Jan 22, 2024

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

In interop scenarios where we have unbounded generic APIs that can take blittable or non-blittable types, we can end up in a scenario where we have a generic type T that has a corresponding type TAbi. However, there is no way to represent TAbi as a generic argument, so we have a mechanism to retrieve it as a System.Type instance. This allows us to still reason about the type.

If we want to do any thing with the type though (like copy values around, go from a pointer into a buffer to a managed representation of it), we end up needing to use the APIs on System.Runtime.InteropServices.Marshal that take a System.Type, which are slow and not AOT-compatible.

We hit this scenario in CsWinRT when we are trying to marshal from WinRT a T[] where T is non-blittable. Today CsWinRT uses the APIs on Marshal even though the type will always be blittable as there's no alternative APIs.

.NET already has a mechanism to represent a byref with a dynamically-determined type: System.TypedReference. However, there's no way to construct one currently from a byref and a type. This API proposal provides a constructor to create such a TypedReference.

API Proposal

namespace System;

public ref struct TypedReference
{
+    public TypedReference(ref byte target, RuntimeTypeHandle type);
}

API Usage

int length = ...;
IntPtr data = ...;
if (data == IntPtr.Zero)
{
    return null;
}
var array = new T[length];
var data = (byte*)data.ToPointer();
var abi_element_size = Marshal.SizeOf(AbiType); // TODO: another API proposal to remove this usage of Marshal.SizeOf
for (int i = 0; i < abi.length; i++)
{
    var abi_element_ref = new TypedReference(ref *data, AbiType.TypeHandle);
    array[i] = Marshaler<T>.FromAbi(abi_element_ref.ToObject());
    data += abi_element_size;
}
return array;

Alternative Designs

We could provide another API on RuntimeHelpers to read and box an unmanaged value-type from a buffer, or even provide a higher level API over a ReadOnlySpan<byte>, like a MemoryMarshal.Read(ref ReadOnlySpan<byte>, System.Type) method.

We could make this method a static method with an Unsafe suffix to further describe its unsafeness.

Risks

System.TypedReference is a rarely used type. There may be issues in the runtime implementations around supporting it.

Author: jkoritzinsky
Assignees: -
Labels:

api-suggestion, area-System.Runtime

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 22, 2024
@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

We could make this method a static method with an Unsafe suffix to further describe its unsafeness.

We typically deal with this problem by adding the method somewhere under System.Runtime.InteropServices or System.Runtime.CompilerServices namespace (for example, MemoryMarshal.CreateReadOnlySpan).

Should we introduce unsafe getter for the byref as well while we are on it?

@Sergio0694
Copy link
Contributor

"Should we introduce unsafe getter for the byref as well while we are on it?"

Would that be the same as __refvalue? As in, would the idea just be that it'd be documented (but no new functionality)?

"We typically deal with this problem by adding the method somewhere under System.Runtime.InteropServices or System.Runtime.CompilerServices namespace (for example, MemoryMarshal.CreateReadOnlySpan)."

Something like this?

namespace System.Runtime.CompilerServices;

public static class RuntimeHelpers
{
    public static TypedReference CreateTypedReferenceUnsafe(ref byte target, RuntimeTypeHandle type);
    public static ref byte GetValueRefUnsafe(TypedReference reference);
}

@Sergio0694
Copy link
Contributor

Sergio0694 commented Jan 23, 2024

Random idea: would it perhaps make sense to add a TypedReferenceMarshal class instead?
To kinda group the methods and do what we already did for eg. CollectionsMarshal, ImmutableCollectionsMarshal?

Ie.

namespace System.Runtime.InteropServices;

public static class TypedReferenceMarshal
{
    public static TypedReference CreateTypedReference(ref byte target, RuntimeTypeHandle type);
    public static ref byte GetValueRef(TypedReference reference);
}

@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

"Should we introduce unsafe getter for the byref as well while we are on it?"

Would that be the same as __refvalue? As in, would the idea just be that it'd be documented (but no new functionality)?

It would not be the same. __refvalue is safe. It takes the target type statically and validates it. We have #26186 about adding safe ref accessors on TypedRefence. We may want to look at them together with this one to ensure consistency.

public static TypedReference CreateTypedReferenceUnsafe(ref byte target, RuntimeTypeHandle type);
public static ref byte GetValueRefUnsafe(TypedReference reference);

We do not use Unsafe suffixes in unsafe namespaces.

would it perhaps make sense to add a TypedReferenceMarshal class instead?

This is a question for API review discussion. When there is a choice, we typically avoid introducing a new type with just a few members and add the methods to the existing type instead. For example, MemoryMarshal would be a fine place for these two methods. There are two overloads of MemoryMarshal.GetArrayDataReference and these methods are very similar.

@MichalStrehovsky
Copy link
Member

Unfortunately this API will require Roslyn work first - we cannot have APIs that return TypedReference due to the type's special snowflake status within the compiler (any method that returns TypedReference runs into CS1599).

@Sergio0694
Copy link
Contributor

Sergio0694 commented Jan 23, 2024

Well that's quite unfortunate and seems like it'd make this feature much less cheap than expected 🥲
Could we just sidestep the whole issue and just add an API for what we actually need to do? That is:

namespace System.Runtime.CompilerServices;

public static class RuntimeHelpers
{
    public static object? CreateObjectUnsafe(ref byte target, RuntimeTypeHandle type);
}

Like, the whole reason we needed a TypedReference was only so we could then use TypedReference.ToObject on it, but especially if that complicates things and needs language support too, perhaps we could just not surface it in our API shape at all?

@AaronRobinsonMSFT
Copy link
Member

Could we just sidestep the whole issue and just add an API for what we actually need to do?

Is the expectation here that target is an existing block of memory that contains a previously allocated managed object? The example above makes it seem like one has a blob of raw bytes and wants it massaged into a managed object (class or value type), is that the case?

@Sergio0694
Copy link
Contributor

"The example above makes it seem like one has a blob of raw bytes and wants it massaged into a managed object (class or value type), is that the case?"

Yup that's correct! The pointer points to some memory (managed or native, could be either) holding the data for a given ABI-type-for-T value. We need to read that and create a boxed object of type ABI-type-for-T, which we can then pass to that Marshaler<T>.FromAbi method, which will convert that to the right T value. Those FromAbi will usually just do something like (ABI-type-for-T)box and then create a T instance from it with the right (generated) marshalling between the two,

So eg. in the example Jeremy posted, we'd use the API like this:

for (int i = 0; i < abi.length; i++)
{
    array[i] = Marshaler<T>.FromAbi(RuntimeHelpers.CreateObjectUnsafe(ref *data, AbiType.TypeHandle);
    data += abi_element_size;
}

@AaronRobinsonMSFT
Copy link
Member

olding the data for a given ABI-type-for-T value.

Alright. Based on this statement then, my next assumption would be the memory itself could either be unmanaged (on the native heap) or managed (on the GC heap, pinned of course), but contained no managed references. For example, the following would be an ABI type:

// Unmanaged according to C# language rules
struct ABIType
{
    public int* Data;
    public int Len;
}

Where as the following would not:

// Managed according to C# language rules
struct NotABIType
{
    public int[] Data;
}

@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

I would expect RuntimeHelpers.CreateObjectUnsafe to behave exactly same as the IL box instruction, except that the type is specified as an argument instead of statically. It has very similar shape as #97344. #97344 is about providing API that behaves exactly same as the IL sizeof instruction, except that the type is specified as an argument instead of statically.

@Sergio0694
Copy link
Contributor

"my next assumption would be the memory itself could either be unmanaged (on the native heap) or managed (on the GC heap, pinned of course), but contained no managed references."

Yup! All the ABI types are either handcrafted in CsWinRT (eg. for Type, TimeSpan, DateTimeOffset, etc.) or produced by the CsWinRT generator, and they'll always be blittable. We just need to create an object of this ABI type given a pointer to its raw data, in an AOT-safe way.

@Sergio0694
Copy link
Contributor

"It has very similar shape as #97344."

Exactly. Ultimatley we'd like to end up with something like this in CsWinRT, being fully AOT-safe:

var array = new T[length];
var data = (byte*)data.ToPointer();
var abi_element_size = RuntimeHelpers.SizeOf(AbiType.TypeHandle); // 97344
for (int i = 0; i < abi.length; i++)
{
    array[i] = Marshaler<T>.FromAbi(RuntimeHelpers.CreateObjectUnsafe(ref *data, AbiType.TypeHandle); // 97341
    data += abi_element_size;
}

@Sergio0694
Copy link
Contributor

I've opened microsoft/CsWinRT#1463 to enable IsAotCompatible in CsWinRT and fix all AOT warnings, and for now I've just had to always throw NotSupportedException in all of these array marshalling methods for non blittable types. The idea would be that with both this and #97344, we'd be able to make those paths functional again in .NET 9.

@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

It should be called Box since that's what it is doing.

@Sergio0694
Copy link
Contributor

Makes sense. And I assume the unsafe is already implied like you said.

So, updated API proposal:

namespace System.Runtime.CompilerServices;

public static class RuntimeHelpers
{
    public static object? Box(ref byte target, RuntimeTypeHandle type);
}

@AaronRobinsonMSFT
Copy link
Member

ref byte target

I would also like to see if we can add a max length. It seems we should be able to compute that is most cases, no?

@Sergio0694
Copy link
Contributor

I would assume once we also got #97344, callers would have to ensure they do have that size available at the memory area pointed at by target? 🤔

Also: should we make that ref readonly byte?

@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

I would also like to see if we can add a max length. It seems we should be able to compute that is most cases, no?

In most case, the callers would just to call RuntimeHelpers.SizeOf proposed in the other issue to make the API argument validation happy. I do not think that max length argument makes sense.

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Jan 23, 2024

I would also like to see if we can add a max length. It seems we should be able to compute that is most cases, no?

In most case, the callers would just to call RuntimeHelpers.SizeOf proposed in the other issue to make the API argument validation happy. I do not think that max length argument makes sense.

That is a different problem. I am looking at this from the interop perspective. In the proposed use case the input is coming from a native context via a raw pointer. My assumption here is that pointer, when passed to native code, may also come with a length or perhaps some other byte size component. My desire here would be to pass that along not necessarily as a "this is the correct size" but instead as a "the max length to consider is X". This is similar to some native APIs where a max length to consider is provided with a pointer. It doesn't mean the end will be at that length, but it does provide the notion of "if you read past this, you've gone too far".

@Sergio0694
Copy link
Contributor

This kinda makes me wonder: if the implementation of the method is going to have to validate the number of bytes actually being read against that additional "max number of bytes" parameter, would it make sense to have this method also return some out int bytesRead value, which would also mean (at least for this scenario) we wouldn't even need #97344 anymore? 🤔

@AaronRobinsonMSFT
Copy link
Member

would it make sense to have this method also return some out int bytesRead value

That seems reasonable to me and aligns with the desired validation from both the caller and callee I would appreciate.

The managed object size is an community request though so I don't think this API can replaced that.

@Sergio0694
Copy link
Contributor

"The managed object size is an community request though"

Just to clarify, are you referring to some other proposal other than #97344? Because if not, Jeremy opened that one as part of the same scenario in CsWinRT, so if eg. we made the API from this proposal also return the number of bytes read, we would have no need for the other one anymore.

@jkotas
Copy link
Member

jkotas commented Jan 23, 2024

That is a different problem. I am looking at this from the interop perspective.

This API is not interop specific. I am looking at this API as a primitive building block that you can use to build interpreters. The CsWinRT use case is replacing a static generic code specialization by dynamic code specialization. It is effectively manually written universal shared generic code - it is slower, but one instance of the code can handle all specializations.

@AaronRobinsonMSFT
Copy link
Member

Just to clarify, are you referring to some other proposal other than #97344?

The notion of managed object size has been around for a long time. See issue #24200. If I recall, there was a COM interface in .NET Framework that I think gave you details on instance size. I would consider #97344 as a special case of the idea.

I am looking at this API as a primitive building block that you can use to build interpreters.

You and your interpreter examples :) Yeah, I get that. Okay, I accept the proposed shape is a better fit in that case. I retract the size argument suggestion.

@jkoritzinsky
Copy link
Member Author

My general concern with using object? here is that it makes doing the opposite operation, "Given an object that is boxed around some dynamic type T, write it to a ref", not as simple as "implement TypedReference.SetObjectReference.

However, I talked with Aaron and he allayed my concerns. We could implement this with a new "UnboxInto" API like the new Box API or just wait until we need the API and revisit it.

I'll change the API proposal to the proposed Box API.

@jkoritzinsky jkoritzinsky changed the title [API Proposal]: Create a System.TypedReference from a ref and a type handle [API Proposal]: RuntimeHelpers.Box to create a box around a dynamically-typed byref Jan 23, 2024
@lambdageek
Copy link
Member

If the method is going to take a pointer and a length, should the API then be:

namespace System.Runtime.CompilerServices;

public static class RuntimeHelpers
{
+    public static object? Box(ReadOnlySpan<byte> target, RuntimeTypeHandle type);
}

@AaronRobinsonMSFT
Copy link
Member

If the method is going to take a pointer and a length, should the API then be:

Yep, but I wanted to discuss the length first prior to going down the span abstraction. I think the length argument has been settled so it isn't needed here.

@lambdageek
Copy link
Member

Yep, but I wanted to discuss the length first prior to going down the span abstraction. I think the length argument has been settled so it isn't needed here.

I'm unclear on why the length argument is settled - if anything the interpreter use case is another reason to have extra validation. The API is bringing together two unrelated arguments: a chunk of memory and a type descriptor and essentially doing a memcpy of a number of bytes (determined by the type) from the chunk of memory. There's no reason to believe the chunk of memory is big enough.

@jkotas
Copy link
Member

jkotas commented Jan 24, 2024

This is very unsafe low-level API. Validating the length is not going to make this API safe, but it is going to make it slower and more cumbersome to use.

For example, consider how something like TypedReference.ToObject would be implemented using this API. TypedReference has a reference to the memory block and the type. The memory block is assumed to be valid for the given type. Validating the length would not provide any benefit in this case.

If you want to validate the memory size for your case, you can do that by building a helper method that checks the length using API proposed in #97344 and then calls this one.

@tannergooding
Copy link
Member

Is this ready to be marked api-ready-for-review? It looks like it, but I just want to confirm based on the discussion above.

@jkotas jkotas added api-ready-for-review API is ready for review, it is NOT ready for implementation api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Feb 1, 2024
@AaronRobinsonMSFT AaronRobinsonMSFT added this to the 9.0.0 milestone Feb 1, 2024
@AaronRobinsonMSFT AaronRobinsonMSFT removed the untriaged New issue has not been triaged by the area owner label Feb 1, 2024
@bartonjs
Copy link
Member

bartonjs commented Apr 2, 2024

Video

  • Looks good as proposed
namespace System.Runtime.CompilerServices;

public static partial class RuntimeHelpers
{
     public object? Box(ref byte target, RuntimeTypeHandle type);
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 2, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants