Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COM interfaces should be *interfaces*, and structs should be marshable #26

Closed
AArnott opened this issue Jan 17, 2021 · 20 comments · Fixed by #194
Closed

COM interfaces should be *interfaces*, and structs should be marshable #26

AArnott opened this issue Jan 17, 2021 · 20 comments · Fixed by #194
Assignees

Comments

@AArnott
Copy link
Member

AArnott commented Jan 17, 2021

In consideration of projects that are maintained by developers that prefer to avoid pointers and the unsafe keyword in C#, we might offer a setting to generate code that does not use pointers. In particular, code generation would:

  1. Generate only the "friendly overloads" as the direct extern methods. Instead of getting pointer-based extern methods with pointer-free friendly overloads, we would generate just the extern methods with the friendly signatures.
  2. Use delegates instead of function pointers.
  3. Use marshalled structs instead of blittable structs.
  4. Use interfaces for COM instead of structs.

Note that mixing the two varieties where marshalled structs and pointer-based extern methods coexist is a non-goal as it would be very complex to maintain and perhaps generate confusing code as well.

@wjk
Copy link

wjk commented Feb 3, 2021

Does this mean that CsWinRT would use ComImportAttribute with CsWinRT code if this option is implemented and enabled? I don’t mind pointers or unsafe code if it means that the output assembly is NativeAOT-compatible, which ComImportAttribute certainly isn’t.

@AArnott
Copy link
Member Author

AArnott commented Feb 5, 2021

How does the ComImportAttribute affect AOT compilation?
But no, I don't think we would add that attribute as the interfaces would not have been imported from a TLB. Although I think the attribute is necessary to make the CLR realize type equivalence between interfaces.

@wjk
Copy link

wjk commented Feb 5, 2021

Simple: ComImportAttribute requires support in the AOT runtime that is not currently implemented, and is far too complex for me to try to implement. If I try to use such a type in a NativeAOT’d project today, I will get a PNSE whenever I call a method from an imported COM interface.

Edit: AFAIK other than ComImportAttribute NativeAOT supports all other interop marshalling scenarios. Use of P/Invoke (incl. Marshal.GetDelegateForFunctionPointer) instead of raw pointers is just fine; we would simply have to manage the VTables of COM interface pointers ourselves, rather than relying on the runtime to do it for us.

@weltkante
Copy link

Coming from #86 I'd suggest making the non-struct projection the default and not an opt-in, so people who don't have in-depth knowledge of COM don't immediately break the whole environment by violating COM rules when talking with external components. Struct based COM objects have a lot of disadvantages and are only useful if you know exactly what you're doing. For general one-off interop its usually not worth the effort, C++ has smart pointers and ATL/WRL for implementing COM objects for a reason, because its really hard to get everything right if you have to do everything yourself. Exposing the raw COM ABI as the "default experience" to C# developers looking for an interop nuget package feels like a flawed design.

Disadvantages of struct based COM objects:

  • manual reference management is very error prone (you need to know a lot of rules)
  • you need to litter your code with try-finally blocks to properly release references
  • cannot easily be stored anywhere outside the stack, as the GC of its container will cause a reference leak
  • if you implement a Finalizer for the container you need in-depth knowledge on how to properly handle apartments

@tannergooding
Copy link
Member

My two cents is that this is a bad idea, in general.

There are certainly places where not using pointers is possible and where using "safe" alternatives like ref, out, or in can provide the same functionality.

However, interop code is itself fundamentally unsafe. The correctness of it is dependent on the bindings themselves being correct and safe and in many cases they cannot because they use concepts that cannot be represented in C# or understood by the JIT/VM.

In many cases trying to avoid unsafe can be more unsafe than just using unsafe directly.


Some examples....

In many cases, APIs take a LPCWSTR (a string) but they support certain inputs that are themselves not string but are instead atoms (this is the case across most of User32).

In several cases, APIs take or have variable-length arrays which means they fundamentally cannot ever be handled by value because the type system does not contain the full metadata required to do so (this is true even in C/C++). BITMAPINFO being discussed in #117 is one such case. Likewise there are a handful of cases where the struct itself is variable-lengthed.

C# does not have good scoping rules for ref with regards to struct fields (improving this is tracked by https://github.com/dotnet/csharplang/blob/master/proposals/low-level-struct-improvements.md) and so returning a ref to avoid a pointer may leak memory that is no longer valid.

Likewise, there are many APIs that take ref, out, or in where null is valid because the parameter is optional. The only way you can represent these today is via Unsafe.NullRef<T> which is just as unsafe and which may lead to unexpected breaks or NullReferenceExceptions due to C# not itself understanding that a ref could be null.

@AArnott
Copy link
Member Author

AArnott commented Feb 22, 2021

@tannergooding: I started where you're at, which is why the first codegen I wrote uses all blittable structs and only relied on .NET marshaling for method signatures (and not on COM "interface" methods, which are implemented via C# 9 function pointers and thus cannot be marshaled).

But we're getting a lot of feedback about this. Folks want to implement COM interfaces and this simply cannot be done if we declare it as a struct. But once we use interfaces, structs that refer to them become managed types and thus non-blittable. Then methods that take those structs cannot use pointers to them any more and are forced into using more of the .NET marshalling layer. So it is very much a cascade.

I agree that there are APIs that simply cannot be correctly represented in the marshaling world. And the two worlds are incompatible due to the cascading effect. So my idea here is that if the switch is set to generate the .NET marshaled world, some APIs will simply be off-limits -- we will error out rather than generate them.

Thoughts?

@tannergooding
Copy link
Member

Folks want to implement COM interfaces and this simply cannot be done if we declare it as a struct.

I think this is part of what the new low level APIs for RCW and CCW is meant to support: dotnet/runtime#1845.
But, it might be good to check in with @AaronRobinsonMSFT to confirm since I haven't tried to wrap my own head around the feature yet 😄

some APIs will simply be off-limits -- we will error out rather than generate them.

Thoughts?

I think that's probably reasonable overall.

That being said, one of my concerns is that there are many APIs (especially in COM) where com->Method(null) is different from var x = new SomeStruct(); com->Method(&x);. That is, there are many cases where a single method is used for both validation/querying and for actual construction/getting.

One example of this is D3D12CreateDevice where the fourth parameter is considered optional and if it is null then its a check for if adapter supports the given MinimumFeatureLevel. This actually impacts allocations, workload that the function does, and what needs to be cleaned up/disposed on return.

There are also functions where you basically do a query to determine number of elements by passing a valid pointer to count but null for outputs. You would then allocate space at least as big as the returned count, and then call the function again passing in the count and the allocated space.


So it's very easy to get into a scenario where you define these "safe" COM interfaces but where one function basically prevents the overall usage of them or where you are now dependent on Unsafe.NullRef for key functionality.

With regular P/Invokes, you can at least define multiple helpers that resolve to the same internal P/Invoke. However, with COM any method you define in the interface is considered part of the VTBL and thus can hinder things.

You can somewhat use extension methods, but that often means you have some overload that either takes pointers or IntPtr (the latter being "bad" because it hides the number of indirections and increases bug risk, IMO).

@AArnott
Copy link
Member Author

AArnott commented Feb 22, 2021

With regular P/Invokes, you can at least define multiple helpers that resolve to the same internal P/Invoke.

Yes, that's what we do today. I was hoping to get the best of both worlds at once by using pointers for the extern methods and then doing my own marshaling/interop in a friendly overload. But that simply doesn't cut it for COM interfaces.

However, with COM any method you define in the interface is considered part of the VTBL and thus can hinder things.

I plan to keep generating friendly overloads on COM interfaces via extension methods, assuming the original overloads aren't friendly enough.

@weltkante
Copy link

weltkante commented Feb 22, 2021

However, interop code is itself fundamentally unsafe. The correctness of it is dependent on the bindings themselves being correct and safe and in many cases they cannot because they use concepts that cannot be represented in C# or understood by the JIT/VM.

Its one thing having to write unsafe code, and a completely different thing having to write the whole COM infrastructure like apartment marshaling logic manually because .NET is free threaded while COM is not. People will either forget finalizers (leaking COM objects) or forget apartments (violating COM rules, destabilzing the environment).

C# is used by people not exposed to low level COM, C++ teams already recognized the code quality problems of manual reference counting and are using smart pointer classes, reintroducing manual reference counting as the default experience into .NET is simply a design mistake and not appropriate.

(Maybe this library isn't intended to be the default go-to for interop, but looking at the goals of the Windows metadata project and how prominently this projection is featured there, it definitely will get a lot of visibility to people who "just" want to do interop. This happens a lot for desktop applications, including COM interop..)

[...] And the two worlds are incompatible due to the cascading effect [...]
Thoughts?

I don't exactly see why these are supposed to be incompatible, COM references are just pointers with lots of rules attached and I'd expect to be able to cast them between representations (via helper methods, not necessarily a C# language cast). For classic COM interop this is already possible via the Marshal helpers.

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT which, the last time I looked, intended to build the new generation of the native COM interop layer outside the dotnet runtime. So I'd definitely prefer if CsWin32 picks up CsWinRT infrastructure rather than legacy .NET runtime COM marshaling.

If theres any reason to still have struct based vtables for performance sensitive scenarios I believe they should be opt-in on a per-interface level and people can do the appropriate "casts" at the appropriate stage (early once vs. late before a call) like they already can do now in classic COM marshaling (IntPtr vs. interface) when the runtime treatment of COM interop is not sufficient.

one of my concerns is that there are many APIs (especially in COM) where com->Method(null) is different from var x = new SomeStruct(); com->Method(&x);.

This definitely happens very often, and its always been awkward in C# projections. I believe the correct projection is to a one-element array of the interface type (?) which isn't very user friendly so most people prefer writing their ComImport incorrectly to make it easier to use when they think they won't be needing to support null-pointer-to-interface scenarios.

However I don't believe using structs for "literal" projections outweights the complete loss of infrastructure and the resulting destabilization and complexity increase.

@tannergooding
Copy link
Member

I agree that ref counting can be painful. In my own libraries I largely just have a small ComPtr<T> like wrapper that I can call dispose on in the right places so that I get the perf/allocation benefits and easier lifetime management.

It definitely still isn't perfect and we don't have an allocation free way to make it such today.
Having some analyzer to help ensure you call Dispose on ComPtr<T> is something I've looked into, but haven't actually implemented yet.

@tannergooding
Copy link
Member

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT

This would be the new lowlevel CCW/RCW APIs I mentioned above: dotnet/runtime#1845
@AaronRobinsonMSFT is the expert here and is probably the best person to comment on intended usecases, etc.

I talked to him on teams a few minutes back and he basically summed it up as being for 2 scenarios:

  1. GC interaction for WinRT
  2. Efficient object identity

and basically that if there is no desire to support WinRT is basically offers a more efficient lifetime mechanism.

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Feb 22, 2021

One more thing to consider. The prototype for our AOT friendly DllImport is designed to be extendable and we hope can be used to fill out all the marshalling needed for COM method argument/return dispatch. See https://github.com/dotnet/runtimelab/tree/feature/DllImportGenerator/DllImportGenerator.

@AArnott
Copy link
Member Author

AArnott commented Feb 22, 2021

Personally I'd have expected this interop layer to coordinate with the work from CsWinRT which, the last time I looked, intended to build the new generation of the native COM interop layer outside the dotnet runtime. So I'd definitely prefer if CsWin32 picks up CsWinRT infrastructure rather than legacy .NET runtime COM marshaling.

I haven't looked into how CsWinRT works, but as it interacts with WinRT whereas CsWin32 interacts with Win32, I don't expect it follows that we can do everything the same way.

I also do not intend to emit code that requires .NET 6 when no one is on it yet. Folks who target net35 want to use this generator, and while I'm on the fence on how much to support that, I certainly can't give up supporting net472.

@AaronRobinsonMSFT
Copy link
Member

but as it interacts with WinRT whereas CsWin32 interacts with Win32, I don't expect it follows that we can do everything the same way.

All of the COM could be done in the same way CsWinRT supports WinRT. CsWinRT even has a limited/naive/slow .netstandard2.0 version.

I also do not intend to emit code that requires .NET 6 when no one is on it yet

Agreed. The ComWrappers support is for .NET 5, but your point is well taken. The best way to support COM is in .NET 5+ and if one needs to support pre-.NET 5 I don't see the current approach in here all that bad given that limitation.

@AaronRobinsonMSFT
Copy link
Member

@AArnott There might actually be another way to improve the interoperability for COM in this model - ICustomMarshaler. It will make marshalling slower but may enable the desired user semantics.

@AArnott
Copy link
Member Author

AArnott commented Feb 22, 2021

CsWinRT even has a limited/naive/slow .netstandard2.0 version.

Is it slower than ordinary .NET /COM interop? If so, what would the user gain by switching to a slower WinRT system?
I also wonder if it is WinRT based, whether it works on Win7. What we do now presumably works on Win7.

@AaronRobinsonMSFT
Copy link
Member

Is it slower than ordinary .NET /COM interop? If so, what would the user gain by switching to a slower WinRT system?

AOT friendly is the key here. It is all known managed code. So it is slower than ComWrappers or the built-in system, but can AOT'd where as the built-in system is never going to support that. Also, this is only slower for pre-.NET5 scenarios for COM. For WinRT this is the only option going forward in .NET.

@AArnott AArnott self-assigned this Mar 9, 2021
@AArnott AArnott changed the title Offer setting to generate pointer-free code COM interfaces should be *interfaces*, and structs should be marshable Mar 16, 2021
@AArnott
Copy link
Member Author

AArnott commented Mar 16, 2021

Crossing off one of the 4 goals and renaming this issue to scope to just the COM problem.

As discussed above, the first "goal" in the issue description may not be a good goal. Someone can open another issue to track that if our friendly overloads are not enough. But remember that we're just getting started with friendly overloads--they're going to get friendlier.

@weltkante
Copy link

@AArnott last nightly build seems from 5 days ago (3/10/2021, 0.1.410-beta), am I missing something? would love to test this update to see if its usable for me now

@AArnott
Copy link
Member Author

AArnott commented Mar 16, 2021

The nightly build hit a network snag. It's now available as 0.1.413-beta on the nightly feed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants