Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat zero-length array fields as variable length, inline arrays #387

Closed
AArnott opened this issue Sep 2, 2021 · 15 comments · Fixed by #1130
Closed

Treat zero-length array fields as variable length, inline arrays #387

AArnott opened this issue Sep 2, 2021 · 15 comments · Fixed by #1130
Assignees
Labels
enhancement New feature or request metadata gem A feature of the metadata that cswin32 does not yet utilitze

Comments

@AArnott
Copy link
Member

AArnott commented Sep 2, 2021

See microsoft/win32metadata#571 and microsoft/win32metadata#266 as examples.

1-length arrays apparently appear to represent variable length as well: microsoft/win32metadata#912

@AArnott AArnott added the enhancement New feature or request label Sep 2, 2021
@AArnott AArnott added the metadata gem A feature of the metadata that cswin32 does not yet utilitze label May 17, 2022
@Slion
Copy link

Slion commented Jun 18, 2022

As discussed in #589 this is also an issue for SP_DEVICE_INTERFACE_DETAIL_DATA_W.

@AArnott
Copy link
Member Author

AArnott commented Oct 27, 2022

Fix #195 together with this one.

@AArnott
Copy link
Member Author

AArnott commented Aug 11, 2023

We have FlexibleArrayAttribute in the metadata that now identifies these fields.

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

We have a couple options for APIs that we could generate to make using variable-length arrays inside structs easier. All of them involve allocating native memory or pinning managed memory, as necessary because these structs have arbitrary length and must cross the interop boundary.

Option 1: minimal assistance

Pros

  • Alloc and Free methods assist with correctly sizing the buffer behind the list.
  • An UnsafeCreateSpan method makes interacting with the list easier.
  • UnsafeCreateSpan applies even when receiving the struct instead of constructing it.

Cons

  • Calling code must use a finally block to ensure memory is released.
  • Calling code must remember the size of the array used in Alloc to pass into the UnsafeCreateSpan method.
  • Usage requires unsafe blocks. Though this isn't a real big deal since inevitably unsafe code will be required to share the struct across interop.

Sample usage

[Fact]
public unsafe void FlexibleArraySizing()
{
    const int count = 3;
    PAGESET* pPageSet = PAGESET.Alloc(count);
    try
    {
        pageSet->cPageRange = count;
        Span<__PAGERANGE_INLINE> pageRange = pPageSet->rgPages_UnsafeCreateSpan(count);
        for (int i = 0; i < count; i++)
        {
            pageRange[i].iFrom = i * 2;
            pageRange[i].cPage = (i * 2) + 1;
        }
    }
    finally
    {
        PAGESET.Free(pPageSet);
    }
}

Option 2: Employ the C# using pattern for memory management

This option adds a nested struct called Helper that remembers the count passed in during construction.

Pros

  • The AsSpan property on the helper already knows the length of the struct as it was provided when allocated.
  • A using statement usually eliminates the need for a finally block to release native memory.
  • The caller needn't use unsafe blocks while interacting with the struct. A pointer returning property exposes T* for use in interop APIs.

Cons

  • Received structs would have to be passed to another Helper factory to set the pointer and length before AsSpan could be used.
  • The original struct can only be accessed via the AsRef property on the helper struct.

Sample usage

[Fact]
public void FlexibleArraySizing_Helper()
{
    const int count = 3;
    using PAGESET.Helper pageSet = PAGESET.Helper.Create(count);
    pageSet.AsRef.cPageRange = count;
    Span<__PAGERANGE_INLINE> pageRange = pageSet.AsSpan;
    for (int i = 0; i < count; i++)
    {
        pageRange[i].iFrom = i * 2;
        pageRange[i].cPage = (i * 2) + 1;
    }

    unsafe
    {
        PAGESET* p = pageSet.AsPtr;
    }
}

You can see these options and a bit more, plus the code behind the proposed APIs here.

@riverar @tannergooding thoughts?

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

😆 ... or I could go with the simple and elegant solution @tannergooding uses here.

That makes access easy, though it does nothing for the allocation of the memory, which I'm still interested in making easier.

@riverar
Copy link

riverar commented Jan 24, 2024

Looking at BITMAPINFO

public struct BITMAPINFO
{
  public BITMAPINFOHEADER bmiHeader;

  [FlexibleArray]
  public RGBQUAD[] bmiColors;
}

I don't have a lot of CsWin32 context in my brain at the moment, so forgive my silly question: Can CsWin32 generator expose bmiColors as a Span<RGBQUAD> here?

BITMAPINFO bitmapInfo = new();
bitmapInfo.biColors = new[] { 0x11223344, 0x44332211 };
PInvoke.CreateDIBSection(..., bitmapInfo, ...);

BITMAPINFO bitmapInfo = new();
bitmapInfo.biColors = new RGBQUAD[512];
PInvoke.GetDIBits(..., ref bitmapInfo, ...);

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

Thanks for looking, @riverar. No, in C# structs cannot have Span<T> fields unless the struct itself is a ref struct, which is severely limiting in terms of where the program is allowed to use and keep the struct -- it must always be on the stack, which pretty much prohibits the use case of variable sized structs.

And your new RGBQUAD[512] syntax would allocate an array on the heap, separate from the struct. This wouldn't meet the memory requirement that the struct act as a header to the array, meaning the array must immediately follow the other struct fields in memory.

What I like about @tannergooding's approach in terrafx is that the struct field almost feels like an array or span. Its indexer "just works" and you can get a span through calling a method directly on that field. IMO far more intuitive than the options I suggested earlier.

But I want to keep the Alloc/Free methods. Or maybe just a method to help calculate the size required for a given length of array, since it's not quite trivial, yet important to get right.

@riverar
Copy link

riverar commented Jan 24, 2024

Oh right, Span is stack-only, derp.

I think his approach provides the most natural C# developer UX. If we can build on top of that, I think that's ideal.

If I had to type out any of the other options above, I'd be very angry. Especially since it's all in the name of avoiding unnecessary allocations at all costs, a CsWin32 principle that I (respectfully) never agreed with and that developers may not even care about.

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

Your concern about CsWin32's principle to "avoiding unnecessary allocations at all costs" seems more extreme than the actual principle I'm following. After all, CsWin32 defaults to generating SafeHandle overloads, and allocations are incurred by that.

But really what I am curious about is what bearing that has on this particular thread. Variable length structs are fundamentally tricky in C#. I can't think of a way to significantly simplify the requirement on the user by allowing the generated code to make allocations. If you can, I'm interested in considering that option.

I've consolidated and revised the samples above to one that follows the TerraFX pattern below. I've added a SizeOf(uint) static method to the struct so while the user does have to allocate memory for the struct explicitly, at least the user doesn't have to do the math to figure out how much they need.
There are several ways to allocate this memory, BTW. They could allocate a byte[] on the GC heap and pin it, or use a variety of native heap allocating methods (one of these being demonstrated below). But all of them requires releasing the memory (or pin) explicitly when done with it. And since CsWin32 has no idea how long a given struct will need to be in memory, I can't think of how CsWin32 can make this any easier. These are just fundamentally hard structures to work with, I think.

[Fact]
public unsafe void FlexibleArraySizing()
{
    const int count = 3;
    PAGESET* pPageSet = (PAGESET*)Marshal.AllocHGlobal(PAGESET.SizeOf(count));
    try
    {
        pPageSet->rgPages[0].nFromPage = 0;

        Span<PAGERANGE> pageRange = pPageSet->rgPages.AsSpan(count);
        for (int i = 0; i < count; i++)
        {
            pageRange[i].nFromPage = i * 2;
            pageRange[i].nToPage = (i * 2) + 1;
        }
    }
    finally
    {
        Marshal.FreeHGlobal((IntPtr)pPageSet);
    }
}

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

BTW regarding the SizeOf method, I'm wondering whether I should be using sizeof (which requires the unsafe keyword to appear on the method declaration) or Marshal.SizeOf<T>. I expect the latter to be safer, but would sizeof be 'safe' to use as well?

internal static unsafe int SizeOf(int count)
{
    int v = sizeof(PAGESET);
    if (count > 1)
    {
        checked
        {
            v += (count - 1) * sizeof(PAGERANGE);
        }
    }
    else if (count < 0)
    {
        throw new ArgumentOutOfRangeException(nameof(count));
    }

    return v;
}

vs

internal static int SizeOf(int count)
{
    int v = Marshal.SizeOf<PAGESET>();
    if (count > 1)
    {
        checked
        {
            v += (count - 1) * Marshal.SizeOf<PAGERANGE>();
        }
    }
    else if (count < 0)
    {
        throw new ArgumentOutOfRangeException(nameof(count));
    }

    return v;
}

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

@tannergooding since I have to emit code that works on .NET Framework as well, my AsSpan method can't be just like yours. What do you think of this?

        [UnscopedRef]
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        internal Span<T> AsSpan(int length)
        {
#if NET5_0_OR_GREATER
            return MemoryMarshal.CreateSpan(ref this.e0, length);
#else
            unsafe
            {
                fixed (void* p = &this.e0)
                {
                    return new Span<T>(p, length);
                }
            }
#endif
        }

@tannergooding
Copy link
Member

tannergooding commented Jan 24, 2024

What do you think of this?

This unfortunately isn't safe/doesn't work because Span<T> on .NET Framework is "slow span". Really this just means it doesn't use interior pointers and so new Span(pointer, length) is untracked, where-as it and MemoryMarshal.CreateSpan(ref x, length) is tracked on .NET Core (the latter only being available on .NET Core as well, notably)

So if you had class MyClass { public BITMAPINFO x; } and then did x.AsSpan(...) using that API, then MyClass could be collected on .NET Framework, while it wouldn't be collected on .NET Core.

For .NET Framework, really the only safe thing is to expose an indexer which uses Unsafe.Add to index the correct element efficiently. You can't safely expose an AsSpan() helper at all there.

I'm wondering whether I should be using sizeof (which requires the unsafe keyword to appear on the method declaration) or Marshal.SizeOf. I expect the latter to be safer, but would sizeof be 'safe' to use as well?

For a blittable type, they should be reporting identical sizes. However, Marshal.SizeOf<T>() is much slower and not optimized, while sizeof(T) and Unsafe.SizeOf<T>() are treated as JIT constants, so they are always preferred for blittable types.

For a non-blittable type, they aren't equivalent. sizeof(T) and Unsafe.SizeOf<T>() represent the size of the object in managed memory, Marshal.SizeOf<T>() represents the size after marshalling has occurred.

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

Thanks, @tannergooding. The sizeof stuff makes sense. I consider (and enforce) all such structs to be unmanaged because marshaling won't be happening as all this has to go through native pointers anyway.

Regarding the use of Span, I understand and agree in general. However in this case, your syntax of BITMAPINFO as a field on a class is itself incorrect. Because the struct has variable size, it can never be stored directly in a field or local variable -- at least not while more than 1 element is stored (and that only by a fluke). It must always be stored as a BITMAPINFO*, and thus always be pinned. As such, even on .NET Core, Span is only safe so long as the pointer remains valid. I suspect the same is true on the .NET Framework. And if so, I should be able to expose this span on either runtime.

True?

If not, I can scope out the AsSpan method to only appear on .NET Core easily enough.

@tannergooding
Copy link
Member

tannergooding commented Jan 24, 2024

all such structs to be unmanaged

Note that unmanaged != blittable. There is some minor nuance in how bool and char are handled. You have to use [assembly: DisableRuntimeMarshalling] to make unmanaged == blittable, but that is .NET Core only.
-- On .NET Framework and on .NET Core without DisableRuntimeMarshalling, bool is never blittable and always has a marshalling stub which normalizes it to 0 or 1. You can at least make the size correct by explicitly setting it to marshal as U1, however. char defaults to ASCII (well Ansi technically) on Windows and likewise needs to be explicitly told to marshal as UTF-16 (2-bytes)

thus always be pinned

This isn't strictly a guarantee. For example, a dev could opt to do this as byte[] tmp = new byte[sizeof(BITMAPINFO) + sizeof(RGBQUAD) * (length - 1)] and then ref BITMAPINFO info = ref Unsafe.As<byte, BITMAPINFO>(ref tmp[0]).

That is notably unlikely and you might decide that it isn't worth supporting or that its fine to tell devs on .NET Framework "don't do that". But it is something to at least consider from a general correctness perspective.

This also becomes relevant when deciding how to handle other types of InlineArrays which aren't variable sized. For example on .NET 8+ you can now use [InlineArray] and the compiler just gives you implicit conversions to spans, efficient indexers, etc. But on .NET Framework or .NET 7 and below you have to expose your own indexers and such, so its a similar position where AsSpan is safe and desirable on .NET Core, but not feasible on .NET Framework and where you can't trivially argue that it must be passed around by pointer/reference.

@AArnott
Copy link
Member Author

AArnott commented Jan 24, 2024

Thanks again.
CsWin32 already has fixed-length inline array support, and as you say, it only offers the span properties on .NET.

CsWin32 doesn't leverage .NET 8 and [InlineArray] yet (#1086), and when it does, it'll have to continue to support the downlevel runtimes as it does today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request metadata gem A feature of the metadata that cswin32 does not yet utilitze
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants