COM interfaces should be interfaces, and structs should be marshable #26

AArnott · 2021-01-17T21:31:43Z

In consideration of projects that are maintained by developers that prefer to avoid pointers and the unsafe keyword in C#, we might offer a setting to generate code that does not use pointers. In particular, code generation would:

Generate only the "friendly overloads" as the direct extern methods. Instead of getting pointer-based extern methods with pointer-free friendly overloads, we would generate just the extern methods with the friendly signatures.
Use delegates instead of function pointers.
Use marshalled structs instead of blittable structs.
Use interfaces for COM instead of structs.

Note that mixing the two varieties where marshalled structs and pointer-based extern methods coexist is a non-goal as it would be very complex to maintain and perhaps generate confusing code as well.

The text was updated successfully, but these errors were encountered:

wjk · 2021-02-03T02:16:06Z

Does this mean that CsWinRT would use ComImportAttribute with CsWinRT code if this option is implemented and enabled? I don’t mind pointers or unsafe code if it means that the output assembly is NativeAOT-compatible, which ComImportAttribute certainly isn’t.

AArnott · 2021-02-05T15:48:19Z

How does the ComImportAttribute affect AOT compilation?
But no, I don't think we would add that attribute as the interfaces would not have been imported from a TLB. Although I think the attribute is necessary to make the CLR realize type equivalence between interfaces.

wjk · 2021-02-05T17:06:14Z

Simple: ComImportAttribute requires support in the AOT runtime that is not currently implemented, and is far too complex for me to try to implement. If I try to use such a type in a NativeAOT’d project today, I will get a PNSE whenever I call a method from an imported COM interface.

Edit: AFAIK other than ComImportAttribute NativeAOT supports all other interop marshalling scenarios. Use of P/Invoke (incl. Marshal.GetDelegateForFunctionPointer) instead of raw pointers is just fine; we would simply have to manage the VTables of COM interface pointers ourselves, rather than relying on the runtime to do it for us.

weltkante · 2021-02-16T19:23:00Z

Coming from #86 I'd suggest making the non-struct projection the default and not an opt-in, so people who don't have in-depth knowledge of COM don't immediately break the whole environment by violating COM rules when talking with external components. Struct based COM objects have a lot of disadvantages and are only useful if you know exactly what you're doing. For general one-off interop its usually not worth the effort, C++ has smart pointers and ATL/WRL for implementing COM objects for a reason, because its really hard to get everything right if you have to do everything yourself. Exposing the raw COM ABI as the "default experience" to C# developers looking for an interop nuget package feels like a flawed design.

Disadvantages of struct based COM objects:

manual reference management is very error prone (you need to know a lot of rules)
you need to litter your code with try-finally blocks to properly release references
cannot easily be stored anywhere outside the stack, as the GC of its container will cause a reference leak
if you implement a Finalizer for the container you need in-depth knowledge on how to properly handle apartments

tannergooding · 2021-02-22T16:10:43Z

My two cents is that this is a bad idea, in general.

There are certainly places where not using pointers is possible and where using "safe" alternatives like ref, out, or in can provide the same functionality.

However, interop code is itself fundamentally unsafe. The correctness of it is dependent on the bindings themselves being correct and safe and in many cases they cannot because they use concepts that cannot be represented in C# or understood by the JIT/VM.

In many cases trying to avoid unsafe can be more unsafe than just using unsafe directly.

Some examples....

In many cases, APIs take a LPCWSTR (a string) but they support certain inputs that are themselves not string but are instead atoms (this is the case across most of User32).

In several cases, APIs take or have variable-length arrays which means they fundamentally cannot ever be handled by value because the type system does not contain the full metadata required to do so (this is true even in C/C++). BITMAPINFO being discussed in #117 is one such case. Likewise there are a handful of cases where the struct itself is variable-lengthed.

C# does not have good scoping rules for ref with regards to struct fields (improving this is tracked by https://github.com/dotnet/csharplang/blob/master/proposals/low-level-struct-improvements.md) and so returning a ref to avoid a pointer may leak memory that is no longer valid.

Likewise, there are many APIs that take ref, out, or in where null is valid because the parameter is optional. The only way you can represent these today is via Unsafe.NullRef<T> which is just as unsafe and which may lead to unexpected breaks or NullReferenceExceptions due to C# not itself understanding that a ref could be null.

AArnott · 2021-02-22T21:16:42Z

@tannergooding: I started where you're at, which is why the first codegen I wrote uses all blittable structs and only relied on .NET marshaling for method signatures (and not on COM "interface" methods, which are implemented via C# 9 function pointers and thus cannot be marshaled).

But we're getting a lot of feedback about this. Folks want to implement COM interfaces and this simply cannot be done if we declare it as a struct. But once we use interfaces, structs that refer to them become managed types and thus non-blittable. Then methods that take those structs cannot use pointers to them any more and are forced into using more of the .NET marshalling layer. So it is very much a cascade.

I agree that there are APIs that simply cannot be correctly represented in the marshaling world. And the two worlds are incompatible due to the cascading effect. So my idea here is that if the switch is set to generate the .NET marshaled world, some APIs will simply be off-limits -- we will error out rather than generate them.

Thoughts?

tannergooding · 2021-02-22T21:50:58Z

Folks want to implement COM interfaces and this simply cannot be done if we declare it as a struct.

I think this is part of what the new low level APIs for RCW and CCW is meant to support: dotnet/runtime#1845.
But, it might be good to check in with @AaronRobinsonMSFT to confirm since I haven't tried to wrap my own head around the feature yet 😄

some APIs will simply be off-limits -- we will error out rather than generate them.

Thoughts?

I think that's probably reasonable overall.

That being said, one of my concerns is that there are many APIs (especially in COM) where com->Method(null) is different from var x = new SomeStruct(); com->Method(&x);. That is, there are many cases where a single method is used for both validation/querying and for actual construction/getting.

One example of this is D3D12CreateDevice where the fourth parameter is considered optional and if it is null then its a check for if adapter supports the given MinimumFeatureLevel. This actually impacts allocations, workload that the function does, and what needs to be cleaned up/disposed on return.

There are also functions where you basically do a query to determine number of elements by passing a valid pointer to count but null for outputs. You would then allocate space at least as big as the returned count, and then call the function again passing in the count and the allocated space.

So it's very easy to get into a scenario where you define these "safe" COM interfaces but where one function basically prevents the overall usage of them or where you are now dependent on Unsafe.NullRef for key functionality.

With regular P/Invokes, you can at least define multiple helpers that resolve to the same internal P/Invoke. However, with COM any method you define in the interface is considered part of the VTBL and thus can hinder things.

You can somewhat use extension methods, but that often means you have some overload that either takes pointers or IntPtr (the latter being "bad" because it hides the number of indirections and increases bug risk, IMO).

AArnott · 2021-02-22T22:00:47Z

With regular P/Invokes, you can at least define multiple helpers that resolve to the same internal P/Invoke.

Yes, that's what we do today. I was hoping to get the best of both worlds at once by using pointers for the extern methods and then doing my own marshaling/interop in a friendly overload. But that simply doesn't cut it for COM interfaces.

However, with COM any method you define in the interface is considered part of the VTBL and thus can hinder things.

I plan to keep generating friendly overloads on COM interfaces via extension methods, assuming the original overloads aren't friendly enough.

weltkante · 2021-02-22T22:18:00Z

However, interop code is itself fundamentally unsafe. The correctness of it is dependent on the bindings themselves being correct and safe and in many cases they cannot because they use concepts that cannot be represented in C# or understood by the JIT/VM.

Its one thing having to write unsafe code, and a completely different thing having to write the whole COM infrastructure like apartment marshaling logic manually because .NET is free threaded while COM is not. People will either forget finalizers (leaking COM objects) or forget apartments (violating COM rules, destabilzing the environment).

C# is used by people not exposed to low level COM, C++ teams already recognized the code quality problems of manual reference counting and are using smart pointer classes, reintroducing manual reference counting as the default experience into .NET is simply a design mistake and not appropriate.

(Maybe this library isn't intended to be the default go-to for interop, but looking at the goals of the Windows metadata project and how prominently this projection is featured there, it definitely will get a lot of visibility to people who "just" want to do interop. This happens a lot for desktop applications, including COM interop..)

[...] And the two worlds are incompatible due to the cascading effect [...]
Thoughts?

I don't exactly see why these are supposed to be incompatible, COM references are just pointers with lots of rules attached and I'd expect to be able to cast them between representations (via helper methods, not necessarily a C# language cast). For classic COM interop this is already possible via the Marshal helpers.

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT which, the last time I looked, intended to build the new generation of the native COM interop layer outside the dotnet runtime. So I'd definitely prefer if CsWin32 picks up CsWinRT infrastructure rather than legacy .NET runtime COM marshaling.

If theres any reason to still have struct based vtables for performance sensitive scenarios I believe they should be opt-in on a per-interface level and people can do the appropriate "casts" at the appropriate stage (early once vs. late before a call) like they already can do now in classic COM marshaling (IntPtr vs. interface) when the runtime treatment of COM interop is not sufficient.

one of my concerns is that there are many APIs (especially in COM) where com->Method(null) is different from var x = new SomeStruct(); com->Method(&x);.

This definitely happens very often, and its always been awkward in C# projections. I believe the correct projection is to a one-element array of the interface type (?) which isn't very user friendly so most people prefer writing their ComImport incorrectly to make it easier to use when they think they won't be needing to support null-pointer-to-interface scenarios.

However I don't believe using structs for "literal" projections outweights the complete loss of infrastructure and the resulting destabilization and complexity increase.

tannergooding · 2021-02-22T22:54:06Z

I agree that ref counting can be painful. In my own libraries I largely just have a small ComPtr<T> like wrapper that I can call dispose on in the right places so that I get the perf/allocation benefits and easier lifetime management.

It definitely still isn't perfect and we don't have an allocation free way to make it such today.
Having some analyzer to help ensure you call Dispose on ComPtr<T> is something I've looked into, but haven't actually implemented yet.

tannergooding · 2021-02-22T22:59:29Z

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT

This would be the new lowlevel CCW/RCW APIs I mentioned above: dotnet/runtime#1845
@AaronRobinsonMSFT is the expert here and is probably the best person to comment on intended usecases, etc.

I talked to him on teams a few minutes back and he basically summed it up as being for 2 scenarios:

GC interaction for WinRT
Efficient object identity

and basically that if there is no desire to support WinRT is basically offers a more efficient lifetime mechanism.

AaronRobinsonMSFT · 2021-02-22T23:01:58Z

One more thing to consider. The prototype for our AOT friendly DllImport is designed to be extendable and we hope can be used to fill out all the marshalling needed for COM method argument/return dispatch. See https://github.com/dotnet/runtimelab/tree/feature/DllImportGenerator/DllImportGenerator.

AArnott · 2021-02-22T23:33:27Z

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT which, the last time I looked, intended to build the new generation of the native COM interop layer outside the dotnet runtime. So I'd definitely prefer if CsWin32 picks up CsWinRT infrastructure rather than legacy .NET runtime COM marshaling.

I haven't looked into how CsWinRT works, but as it interacts with WinRT whereas CsWin32 interacts with Win32, I don't expect it follows that we can do everything the same way.

I also do not intend to emit code that requires .NET 6 when no one is on it yet. Folks who target net35 want to use this generator, and while I'm on the fence on how much to support that, I certainly can't give up supporting net472.

AaronRobinsonMSFT · 2021-02-22T23:38:32Z

but as it interacts with WinRT whereas CsWin32 interacts with Win32, I don't expect it follows that we can do everything the same way.

All of the COM could be done in the same way CsWinRT supports WinRT. CsWinRT even has a limited/naive/slow .netstandard2.0 version.

I also do not intend to emit code that requires .NET 6 when no one is on it yet

Agreed. The ComWrappers support is for .NET 5, but your point is well taken. The best way to support COM is in .NET 5+ and if one needs to support pre-.NET 5 I don't see the current approach in here all that bad given that limitation.

AaronRobinsonMSFT · 2021-02-22T23:41:28Z

@AArnott There might actually be another way to improve the interoperability for COM in this model - ICustomMarshaler. It will make marshalling slower but may enable the desired user semantics.

AArnott · 2021-02-22T23:46:07Z

CsWinRT even has a limited/naive/slow .netstandard2.0 version.

Is it slower than ordinary .NET /COM interop? If so, what would the user gain by switching to a slower WinRT system?
I also wonder if it is WinRT based, whether it works on Win7. What we do now presumably works on Win7.

AaronRobinsonMSFT · 2021-02-22T23:55:01Z

Is it slower than ordinary .NET /COM interop? If so, what would the user gain by switching to a slower WinRT system?

AOT friendly is the key here. It is all known managed code. So it is slower than ComWrappers or the built-in system, but can AOT'd where as the built-in system is never going to support that. Also, this is only slower for pre-.NET5 scenarios for COM. For WinRT this is the only option going forward in .NET.

AArnott · 2021-03-16T01:59:35Z

Crossing off one of the 4 goals and renaming this issue to scope to just the COM problem.

As discussed above, the first "goal" in the issue description may not be a good goal. Someone can open another issue to track that if our friendly overloads are not enough. But remember that we're just getting started with friendly overloads--they're going to get friendlier.

weltkante · 2021-03-16T17:52:00Z

@AArnott last nightly build seems from 5 days ago (3/10/2021, 0.1.410-beta), am I missing something? would love to test this update to see if its usable for me now

AArnott · 2021-03-16T19:09:14Z

The nightly build hit a network snag. It's now available as 0.1.413-beta on the nightly feed.

This was referenced Jan 21, 2021

New Source Generator based approach to getting all the Win32 methods and types you need dotnet/pinvoke#565

Open

Source Generator Generates Structs Instead of Interfaces #52

Closed

AArnott mentioned this issue Feb 3, 2021

Provide a way for user code to implement COM interfaces #79

Closed

AArnott mentioned this issue Feb 16, 2021

Make COM interfaces easier to use #86

Closed

This was referenced Feb 16, 2021

Idiomatic bool marshalling #63

Closed

Generate generic method overloads where out void* appears for one parameter #103

Closed

Avoid using Span<T> on TFMs older than net45 #117

Closed

AArnott mentioned this issue Mar 2, 2021

COM interfaces returning structs use wrong calling convention #167

Open

AArnott self-assigned this Mar 9, 2021

AArnott mentioned this issue Mar 16, 2021

Generate COM interfaces, delegates, and managed structures #194

Merged

AArnott changed the title ~~Offer setting to generate pointer-free code~~ COM interfaces should be *interfaces*, and structs should be marshable Mar 16, 2021

AArnott closed this as completed in #194 Mar 16, 2021

smourier mentioned this issue Jul 12, 2021

Please provide bindings for DirectX. microsoft/WindowsAppSDK#14

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

COM interfaces should be interfaces, and structs should be marshable #26

COM interfaces should be interfaces, and structs should be marshable #26

AArnott commented Jan 17, 2021 •

edited

Loading

wjk commented Feb 3, 2021

AArnott commented Feb 5, 2021 •

edited

Loading

wjk commented Feb 5, 2021 •

edited

Loading

weltkante commented Feb 16, 2021

tannergooding commented Feb 22, 2021

AArnott commented Feb 22, 2021

tannergooding commented Feb 22, 2021

AArnott commented Feb 22, 2021

weltkante commented Feb 22, 2021 •

edited

Loading

tannergooding commented Feb 22, 2021

tannergooding commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021 •

edited

Loading

AArnott commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AArnott commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AArnott commented Mar 16, 2021

weltkante commented Mar 16, 2021

AArnott commented Mar 16, 2021

COM interfaces should be *interfaces*, and structs should be marshable #26

COM interfaces should be *interfaces*, and structs should be marshable #26

Comments

AArnott commented Jan 17, 2021 • edited Loading

wjk commented Feb 3, 2021

AArnott commented Feb 5, 2021 • edited Loading

wjk commented Feb 5, 2021 • edited Loading

weltkante commented Feb 16, 2021

tannergooding commented Feb 22, 2021

AArnott commented Feb 22, 2021

tannergooding commented Feb 22, 2021

AArnott commented Feb 22, 2021

weltkante commented Feb 22, 2021 • edited Loading

tannergooding commented Feb 22, 2021

tannergooding commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021 • edited Loading

AArnott commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AArnott commented Feb 22, 2021

AaronRobinsonMSFT commented Feb 22, 2021

AArnott commented Mar 16, 2021

weltkante commented Mar 16, 2021

AArnott commented Mar 16, 2021

COM interfaces should be interfaces, and structs should be marshable #26

COM interfaces should be interfaces, and structs should be marshable #26

AArnott commented Jan 17, 2021 •

edited

Loading

AArnott commented Feb 5, 2021 •

edited

Loading

wjk commented Feb 5, 2021 •

edited

Loading

weltkante commented Feb 22, 2021 •

edited

Loading

AaronRobinsonMSFT commented Feb 22, 2021 •

edited

Loading