Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COM marshalling E2E #2618

Closed
MichalStrehovsky opened this issue Feb 16, 2022 · 15 comments
Closed

COM marshalling E2E #2618

MichalStrehovsky opened this issue Feb 16, 2022 · 15 comments
Assignees

Comments

@MichalStrehovsky
Copy link
Member

I ran into this on NativeAOT, but there's a gap with illink as well:

Repro

// dotnet new console
// Put <PublishTrimmed>true</PublishTrimmed> to the csproj.
// Put the following in Program.cs:
MessageBoxW(null, default, default, default);

[System.Runtime.InteropServices.DllImport("user32")]
static extern void MessageBoxW(IFoo o, IntPtr b, IntPtr c, int d);

interface IFoo { }

Observed behaviors

dotnet run

This prints

C:\rr\Program.cs(1,1): warning IL2050: P/invoke method 'MessageBoxW(IFoo, System.IntPtr, System.IntPtr, int)' declares a parameter with COM marshalling. Correctness of COM interop cannot be guaranteed after trimming. Interfaces and interface members might be removed. [C:\rr\rr.csproj]

And then pops the dialog box.

dotnet publish

Prints the message from both the analyzer and illink. Then it crashes with:

Unhandled exception. System.TypeLoadException: Could not load type 'System.StubHelpers.InterfaceMarshaler' from assembly 'System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e'.
   at Program.<<Main>$>g__MessageBoxW|0_0(IFoo o, IntPtr b, IntPtr c, Int32 d)
   at Program.<Main>$(String[] args) in C:\rr\Program.cs:line 1

Issues

  1. The warning from the analyzer seems unnecessary if System.Runtime.InteropServices.BuiltInComInterop.IsSupported=false because we don't actually expect a difference in behavior.
  2. The warning from illink seems unnecessary for the same reason
  3. Should something have prevented the dotnet run scenario from finishing within the runtime? It would avoid the hard failure in dotnet publish.

Cc @LakshanF @vitek-karas

@vitek-karas
Copy link
Member

This is an interesting case - thanks for bringing it up @MichalStrehovsky. I think we should see why it doesn't hit the runtime block for number 3 above.

I'm not 100%, but I think I agree with 1 and 2 as well. It's a special case, because we have special warning/code for this in the linker/analyzer. Most other feature switches work such that if the feature is off, there won't be any warning, and if it's on, that will bring in code which has RUC on it, and thus produce a warning. The special case for COM should ideally behave the same - especially since there are other COM entry points which do follow the typical feature switch/RUC pattern, so the current experience is inconsistent.

That said - the difference in this case is that the runtime doesn't consistently fail. So if we can fix that and make the runtime consistently fail -> we should disable the warnings. But if we can't make the runtime consistently fail, I think it's better to keep the warnings.

Just curious: You chose MessageBoxW because the repro needs an existing export, right? It doesn't matter that the declaration doesn't match the native ABI, for this repro - I think.

@MichalStrehovsky
Copy link
Member Author

So if we can fix that and make the runtime consistently fail -> we should disable the warnings. But if we can't make the runtime consistently fail, I think it's better to keep the warnings.

Agreed - my shallow expectation is that this should have failed without trimming too, but maybe I'm missing something.

Just curious: You chose MessageBoxW because the repro needs an existing export, right? It doesn't matter that the declaration doesn't match the native ABI, for this repro - I think.

Yes, that was just me being lazy to come up with a proper p/invoke. A null object reference marshals into a null pointer and MessageBoxW doesn't mind how we came up with the null, it just expects a pointer - and null is a valid thing to pass there.

Marshalling null interfaces is a real world scenario though because e.g. empty WinForms app goes through a similar codepath (dotnet/corert#8128 was all COM support we had to implement to make empty WinForms app work with NativeAOT back in the day).

@vitek-karas
Copy link
Member

@LakshanF could you please take a look why this doesn't trigger the runtime failure?

@LakshanF
Copy link
Member

/cc @AaronRobinsonMSFT

I see 2 separate issues here;

  1. COM behavior with the example above.
  2. What the warning behavior should be when a .NET feature is switched off and the developer uses APIs from that feature

For 1), the trimming behavior is the same as "dotnet run" if the developer turns on the feature (by setting BuiltInComInteropSupport to true). There is an added constraint with trimming in that, there can be unpredictable behavior not seeing otherwise. Hence, our best effort to warn developers around problems we can predict when trimming. But in general, we strongly discourage trimming and use of build-in-COM by turning off that feature by default.

For 2), I can see us being consistent with other feature switches as @vitek-karas advocates above. Perhaps a mechanism for us to convey to the developer that they are using a feature that is not supported in runtime. In the specific case above, the failure likely happen since the COM types not getting loaded when the feature is switched off causes downstream failures before better error handling like ones here comes into effect.

@vitek-karas
Copy link
Member

Personally I don't mind that the trimmed app is crashing - we provided enough warnings in that case.
The problem is the non-trimmed app working - this one should throw at least in theory. How come it works?

@LakshanF
Copy link
Member

LakshanF commented Feb 18, 2022

I don't think we block all COM paths in managed and native code if IsBuiltInComSupported is set to false. Dotnet run doesn't trim the app when PublishTrimmed is set (and coremanglib would have COM types). My assumption is that this particular interop call is not in a dangerous path but have to admit that I'm not a COM expert and haven't checked this scenario in detail.

@vitek-karas
Copy link
Member

I know that we don't block all COM paths, but I thought it was a "best effort" kind of thing. In which case it would be interesting if we could block this case...
Basically we need to figure out, if the warnings we generate match the actual COM needs - so if there's a way using a simple interface as a parameter type in PInvoke to trigger some problematic COM behavior, then we should keep the warning. And then as a secondary thing, we can try to figure out if we can block this in the runtime.

@LakshanF
Copy link
Member

Given that the non-trimmed behavior (no PublishTrimmed in the project file) is as above, it seems better to modify the warning behavior

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Feb 18, 2022

It is definitely possible to address this, but it isn't clear how best to do it without littering the COM enable check all over. The problem is lookup for marshallers of type MARSHAL_TYPE_INTERFACE. An attempt is made in MarshalInfo::MarshalerRequiresCOM() to determine this but that is likely more restrictive than we want.

I personally don't think this is worth fixing. I agree with the sentiment it is a best effort and as we dig deeper into the marshalling logic this is just going to get more complicate with more complex conditional checks. I accept this can be confusing but I think the multitude of warnings should be sufficient to say "Hey! You are doing something that will break here". Either we believe warnings are useful and users should respect them or we make sure the runtime fails in all cases and relax our warnings, requiring both seems a bit like busy work.

@vitek-karas
Copy link
Member

Sounds good to me. If it's too complex, that would also mean it's complex to explain to the users, which is not ideal. In that case I favor the clear warning, which basically says "It's probably wrong". I think COM is sort of special in this case because of its complexity and spread through the runtime. Other features should be much clearer to cut off, and thus should provide a more consistent experience.

Let me turn this argument around: If we try to disable the warnings... when would that be OK? My understanding is that such condition would be really complicated. So again - not worth it.

@AaronRobinsonMSFT
Copy link
Member

Other features should be much clearer to cut off, and thus should provide a more consistent experience.

Agree.

Let me turn this argument around: If we try to disable the warnings... when would that be OK?

I don't think so. This is because of the contract of System.Runtime.InteropServices.BuiltInComInterop.IsSupported=false. This was a "best effort" and may hit issues. I am just reiterating your statement "I think COM is sort of special in this case because of its complexity and spread through the runtime.". For most/all other features I agree with @MichalStrehovsky reasoning about dropping the warnings if they exist.

@MichalStrehovsky
Copy link
Member Author

I've hit this in a test for ComWrappers: https://github.com/dotnet/runtime/blob/06228c10873ecec58c38cda58d7eb06f6b55e41b/src/tests/nativeaot/SmokeTests/ComWrappers/ComWrappers.csproj#L7-L8

The warning seems incorrect because the test is actually already following our guidelines for trimming safety - they register a COM wrapper, but they still get a warning.

If we keep generating the warning, people will have to suppress them in addition to enabling ComWrappers. That's an annoyance.

If we don't generate the warning, we should ideally have the same behavior before/after trimming. If things work before trimming but fail after, it's again one of the things that go against the spirit of safe trimming.

Can we add a chokepoint at the spot where we try to look for System.StubHelpers.InterfaceMarshaler?

@AaronRobinsonMSFT
Copy link
Member

The warning seems incorrect because the test is actually already following our guidelines for trimming safety - they register a COM wrapper, but they still get a warning.

The reason this is still failing is ComWrappers doesn't support marshalling via DllImport. This means it will fall back to built-in COM interop and not use the registered ComWrappers instance. The registered ComWrappers is only supported in some Marshal APIs (for example, Marshal.GetIUnknownForObject()). This decision was made because the LibraryImport source gen is meant as the longer term NativeAOT solution as opposed to DllImportAttribute, which is not AOT friendly.

@LakshanF
Copy link
Member

I think the current behavior in compiler warning is a good enough response to a challenging situation.

As mentioned above, built-in-COM support is spread all over the code base. It is challenging to turn off the feature completely since there is also an inconsistent approach between managed (dynamic) and native (static) guards against the feature. We have the following 3 situations now:

  1. Built-in-COM is ON (Default on Windows): COM works as expected
  2. Built-in-COM is OFF via a feature switch & App not trimmed: Compiler warnings kick in at build-time and we do a best effort to warn of dangerous COM path usage at runtime
  3. Built-in-COM is OFF via a feature switch & App (including runtime libraries) is trimmed: Compiler warnings kick in at build-time, we do a best effort to warn of dangerous COM path usage at runtime, and we trim code including critical COM types from runtime libraries

There are inconsistencies between 2) and 3) above and the biggest concern is a dangerous COM slipping though in trimmed mode due to Discrepancy in Windows built-in COM support between native and managed sides. There have been some surgical fixes in native code that seems to mostly address this (e.g. see the fix for TypeBuilder.CreateType doesn't work on Windows in a trimmed app due to BuiltInComInterop being disabled)

In this particular issue, there is a non-dangerous COM API not failing in 2) during runtime. Although not very consistent, the compiler time warning and runtime failure in trimmed mode seem both a reinforcement of the feature not present ("No means No") during runtime and some hints during compiler time that the developer is using a turned off feature.

@MichalStrehovsky
Copy link
Member Author

The reason this is still failing is ComWrappers doesn't support marshalling via DllImport

Ah, I see - I thought COM being disabled would route things through ComWrappers, but the warning makes sense if that's not the case. If there's a warning, we can have a behavioral difference.

It feels like it would nice if disabled COM rerouted things to ComWrappers because it gives a chance to turn trimming unfriendly code into trimming friendly code, but it's one of those things where one is trying to make code they don't own trimming friendly. If we offer such last-ditch workaround mechanism, library developers will take it as an excuse to not make their libraries trimming friendly. So maybe we shouldn't have it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants