Crossgen2 support for static virtual method resolution (take 2) #87438

trylek · 2023-06-12T23:28:02Z

I have revived my change from two years back, rebased it against current main and updated it based mostly on Michal's PR feedback. It seems to work in my local testing so I'm publishing the PR and starting lab testing. I originally hoped to revive my old PR #54063 but GitHub has apparently already locked it so I have to open a new one.

Thanks

Tomas

/cc @dotnet/crossgen-contrib, @dotnet/jit-contrib

trylek · 2023-06-13T00:01:27Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-06-13T00:01:40Z

Azure Pipelines successfully started running 1 pipeline(s).

EgorBo · 2023-06-13T00:07:33Z

Does it make sense to also run crossgen outerloop jobs?

trylek · 2023-06-13T00:20:03Z

Thanks Egor for the suggestion. I'll run them once the primary runs finish cleanly, to reduce lab cost in case some problem pops up. In general, I have recently found out that Crossgen2 outerloop runs are currently busted due to several problematic IL tests, so I don't expect a green run, but a canary run will certainly be useful.

MichalStrehovsky

I'm not convinced any changes to MetadataVirtualMethodAlgorithm.cs are needed.

src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs

trylek · 2023-06-15T21:46:38Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-06-15T21:46:49Z

Azure Pipelines successfully started running 1 pipeline(s).

trylek · 2023-06-15T21:47:44Z

/azp run runtime-coreclr crossgen2 outerloop

azure-pipelines · 2023-06-15T21:47:55Z

Azure Pipelines successfully started running 1 pipeline(s).

trylek · 2023-06-15T21:55:59Z

@MichalStrehovsky - based on your suggestion I have reverted all changes to MetadataVirtualMethodAlgorithm.cs and I found out that I just need to fix one more detail around module token resolution. @davidwrighton - please make sure to take a detailed look at the very first diff around line 280 in CorInfoImpl.ReadyToRun.cs where I had to modify the method ComputeActualOwningType you originally implemented.

Without this extra change, Crossgen2 crashes in several SVM tests you originally authored, including Loader\classloader\StaticVirtualMethods\GenericContext\GenericContextTestDefaultImp\GenericContextTestDefaultImp.ilproj. There we're resolving NormalMethod, and ComputeActualOwningType receives NonGenericClass as the methodTargetOwner parameter and IFaceGenericDefaultImp<string> (implemented by the class) as instantiatedOwningType.

davidwrighton · 2023-06-21T19:05:36Z

src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs

+                else if (isStaticVirtual)
+                {
+                    pResult->thisTransform = CORINFO_THIS_TRANSFORM.CORINFO_NO_THIS_TRANSFORM;
+                }


Is it actually possible to reach this code? My reading indicates that either the directMethod will be resolved to a non-null value above, or we will throw a RequiresRuntimeJitException.

trylek · 2023-07-20

I have reformulated this bit and I believe it's now reachable.

davidwrighton · 2023-06-21T19:10:48Z

src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs

+                    directMethod = constrainedType.ResolveVariantInterfaceMethodToStaticVirtualMethodOnType(originalMethod);
+                    if (directMethod == null)
+                    {
+                        throw new RequiresRuntimeJitException(originalMethod.ToString());


Could you add a test to the version resiliency tests in src/tests/readytorun/tests that covers the following case: if the static interface and the implementing type are not in the current R2R module, and the implementation changes from a default implementation to an exact implementation, the generated R2R code referring to it remains correct?

trylek · 2023-07-20

Added in the latest commit, please take a look to double-check if it matches your intent.
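For readers following along, here is a minimal sketch of the scenario being asked about. All names (IGreeter, GreeterV1, GreeterV2, Caller) are hypothetical and are not taken from the actual test. The two "versions" of the implementing type are modeled as two separate types in one file; a real version resiliency test would instead swap the assembly that contains the implementing type while keeping the precompiled caller unchanged.

```csharp
using System;

// Hypothetical interface declaring a static virtual method with a default body.
interface IGreeter
{
    static virtual string Greet() => "default";
}

// Models "version 1" of the implementing type: it relies on the default implementation.
struct GreeterV1 : IGreeter { }

// Models "version 2" of the implementing type: it now supplies an exact implementation.
struct GreeterV2 : IGreeter
{
    public static string Greet() => "exact";
}

static class Caller
{
    // Constrained call: T.Greet() has to be resolved against the concrete T.
    public static string Call<T>() where T : IGreeter => T.Greet();
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Caller.Call<GreeterV1>()); // prints "default"
        Console.WriteLine(Caller.Call<GreeterV2>()); // prints "exact"
    }
}
```

The point of the requested test, as far as the comment states it, is that precompiled code for a caller like Caller.Call must not bake in the "default" target if the implementing type lives in another module and may later gain its own implementation.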

MichalStrehovsky · 2023-06-26T23:49:28Z

src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs

@@ -995,6 +995,13 @@ private static MethodDesc TryResolveVirtualStaticMethodOnThisType(MetadataType c
                if (methodImpl.Decl == interfaceMethodDefinition)
                {
                    MethodDesc resolvedMethodImpl = methodImpl.Body;
+                    if (resolvedMethodImpl.OwningType != constrainedType.GetTypeDefinition())


Note that the managed type system does a lot less validation than the CoreCLR type system in general - there are many other things that could be wrong with the type and prevent a load on CoreCLR that we'll not detect here. The way this is approached elsewhere is that we emit enough fixups to force this load on the CoreCLR side. I guess adding a check fixes some test, but it's not systematic.

An example of the problem would be:

Call<Foo>();

static void Call<T>() where T : IFoo => T.Something();

interface IFoo
{
    static virtual void Something() => Console.WriteLine("Hello");
}

struct Foo : IFoo
{
    extern void BreakThings();
}

Here CoreCLR will not allow Foo to load because it has an extern method with no body (just a random example of things CoreCLR looks at). If we generate code in crossgen2 that bypasses the need to load Foo (which I believe we'd do here, but I didn't actually check - I'm certain it can be tweaked to achieve that), we'd "fix" this even though it's not supposed to work.

trylek · 2023-06-29

Hmm, can you please explain to me why the problem you described is specific to static virtual methods? I mean, let's say Crossgen2 allows JIT to inline a call to a static method - we're also not making sure on the Crossgen2 side that the callee sits in a loadable class that doesn't have any extern methods defined and whatnot. You may be right that it's problematic in certain corner cases and we should improve the type validation, but I just don't see why this should be an SVM-specific problem; if a change in this logic turns out to be necessary, I tend to assume it should apply to all methods, not just static virtuals.

MichalStrehovsky · 2023-06-30

I don't think it's specific to static virtual methods. It is a general problem. I think the throw/check added here is just enough to fix a failing test, but it doesn't even fix this problem for bad MethodImpls, let alone for the many other cases when CoreCLR would refuse to load the type. E.g. if there's a broken MethodImpl for some other method but it's after we already found a match, we'll not notice and will not throw. CoreCLR will.
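The "broken MethodImpl after the match" point can be illustrated with a toy model. This is only a sketch: ToyMethodImpl, ToyResolver and the IsValid flag are hypothetical stand-ins and are not part of the managed type system or of CoreCLR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Toy stand-in for a MethodImpl record (hypothetical; not the real type system API).
record ToyMethodImpl(string Decl, string Body, bool IsValid);

static class ToyResolver
{
    // Early-exit lookup: returns at the first matching record,
    // so records that come after the match are never inspected.
    public static string? Resolve(IReadOnlyList<ToyMethodImpl> impls, string decl)
    {
        foreach (ToyMethodImpl impl in impls)
        {
            if (impl.Decl != decl)
                continue;
            if (!impl.IsValid)
                throw new InvalidOperationException($"Invalid MethodImpl for {decl}");
            return impl.Body;
        }
        return null;
    }

    // Models a full type load in this toy: every record has to be valid.
    public static bool TypeLoads(IReadOnlyList<ToyMethodImpl> impls)
    {
        foreach (ToyMethodImpl impl in impls)
        {
            if (!impl.IsValid)
                return false;
        }
        return true;
    }
}

static class Program
{
    static void Main()
    {
        var impls = new List<ToyMethodImpl>
        {
            new ToyMethodImpl("IFoo.Something", "Foo.Something", IsValid: true),
            // Broken record, but it sits after the one the lookup matches.
            new ToyMethodImpl("IFoo.Other", "Unrelated.Other", IsValid: false),
        };

        Console.WriteLine(ToyResolver.Resolve(impls, "IFoo.Something")); // prints "Foo.Something"
        Console.WriteLine(ToyResolver.TypeLoads(impls));                 // prints "False"
    }
}
```

In this model the per-lookup check succeeds even though a whole-type validation would reject the type, which is the gap described above.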

The preferred solution for this has always been to generate a fixup to force CoreCLR to load this type before we're allowed to run the code.

If a fixup is not possible, we're basically saying we'll need a CoreCLR-like type load emulator that will validate this whenever we touch a type. If that's the case, we should disable the test on ActiveIssue and budget for a CoreCLR-load emulator in .NET 9.

Adding partial checks in places that our tests happened to run into doesn't seem sustainable to me. E.g. we iterate MethodImpls in all of these places. Should each of these validate the MethodImpl is valid? And what about all the other scenarios where CoreCLR will refuse a load that we just happen not to have test coverage with static virtuals?

runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs

Lines 287 to 301 in cdd7566

foreach (MethodImplRecord record in foundMethodImpls)
{
    MethodDesc recordDecl = record.Decl;

    if (interfaceDecl != recordDecl.OwningType.IsInterface)
        continue;

    if (!interfaceDecl)
        recordDecl = FindSlotDefiningMethodForVirtualMethod(recordDecl);

    if (recordDecl == decl)
    {
        return FindSlotDefiningMethodForVirtualMethod(record.Body);
    }
}

runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs

Lines 492 to 515 in cdd7566

foreach (MethodImplRecord methodImplRecord in currentType.VirtualMethodImplsForType)
{
    MethodDesc declSlot = FindSlotDefiningMethodForVirtualMethod(methodImplRecord.Decl);
    MethodDesc implSlot = FindSlotDefiningMethodForVirtualMethod(methodImplRecord.Body);

    if (unificationGroup.IsInGroup(declSlot) && !unificationGroup.IsInGroupOrIsDefiningSlot(implSlot))
    {
        unificationGroup.RemoveFromGroup(declSlot);

        separatedMethods ??= new MethodDescHashtable();
        separatedMethods.AddOrGetExisting(declSlot);

        if (unificationGroup.RequiresSlotUnification(declSlot) || implSlot.RequiresSlotUnification())
        {
            if (implSlot.Signature.EqualsWithCovariantReturnType(unificationGroup.DefiningMethod.Signature))
            {
                unificationGroup.AddMethodRequiringSlotUnification(declSlot);
                unificationGroup.AddMethodRequiringSlotUnification(implSlot);
                unificationGroup.SetDefiningMethod(implSlot);
            }
        }

        continue;
    }

runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs

Lines 794 to 816 in cdd7566

MethodImplRecord[] possibleImpls = runtimeInterface.FindMethodsImplWithMatchingDeclName(interfaceMethod.Name);
if (possibleImpls != null)
{
    foreach (MethodImplRecord implRecord in possibleImpls)
    {
        if (implRecord.Decl == interfaceMethodDefinition)
        {
            // This interface provides a default implementation.
            // Is it also most specific?
            if (mostSpecificInterface == null || Array.IndexOf(runtimeInterface.RuntimeInterfaces, mostSpecificInterface) != -1)
            {
                mostSpecificInterface = runtimeInterface;
                impl = implRecord.Body;
                diamondCase = false;
            }
            else if (Array.IndexOf(mostSpecificInterface.RuntimeInterfaces, runtimeInterface) == -1)
            {
                diamondCase = true;
            }

            break;
        }
    }

davidwrighton · 2023-06-30

Agreed. I've written the bare bones of a CoreCLR-based type load emulator as part of supporting the skip type validation feature of R2R, but it's incomplete and does not cover all of the failure paths (it handles most of the checks that skip type validation skips). I see that we ALSO have a different validator that is used in NativeAOT/Crossgen2 in other cases in the jit interface. We really ought to unify the two and finish them up in the future.

MichalStrehovsky · 2023-07-03

> I see that we ALSO have a different validator that is used in NativeAOT/Crossgen2 in other cases in the jit interface

That validator was originally written for NativeAOT purposes - we have a more relaxed approach to invalid inputs there and the validator basically just does enough validation to prevent a compiler crash later (type system exceptions are catchable/handlable only when hit while generating method code and fatal in most other places).

Crossgen2 needs to be more strict because "fixing" a problem initially, but resurfacing it again when the method is rejitted into a higher tier is worse than the NativeAOT behavior (that just doesn't throw the exception it was supposed to throw and leaves things at that).

jkotas · 2023-07-03

Both R2R and tiered JIT assume that the IL is valid. Invalid IL can have behavior differences between R2R and JIT or between Tier0 JIT and Tier1 JIT. The AOT compiler validator does not need to be perfect. Its main purpose is to catch missing dependencies that are very common in the code out there. We have tacked some invalid metadata handling onto it too, but that is mostly just to make some of our own tests for invalid patterns happy.

MichalStrehovsky · 2023-07-03

If we view it as "just a thing to make a test pass", I'm fine with adding a check here.

However, adding a new ExceptionStringId requires more work than just a new enum member. The NativeAOT compiler will potentially catch this exception at compile time and rethrow it at runtime, generating an exception message using CoreLib's localization.

Also, the exception type and exception message should match CoreCLR's exception (that's the pattern we've been following).

trylek · 2023-07-17

While I generally agree it's a good idea to emit a fixup to make sure the target type of an SVM gets loaded, I'm worried that this particular case is somewhat more tricky. If Crossgen2 silently ignores "explicit override forwarders", in the sense of allowing the SVM implementation to be redirected to a completely different type, its load check may happily pass, as it would be completely oblivious of the original "intermediate type" that spawned the forward. Otherwise we would probably need to encode fixups for all three involved types - the original interface type where the SVM is defined, the targeting constrained type and the actual resolution type - and that seems somewhat wasteful to me, potentially cancelling out the SVM prejit perf savings due to more work that needs to happen at runtime.

trylek · 2023-08-08T15:33:35Z

@mangod9 - As discussed offline, I traditionally hit a snag with Crank, which now seems to have trouble using locally built apps. I'll follow up with Sébastien; for now I have dusted off our good old friend Jellyfin, and I think I am able to see some runtime JIT reduction with this change. For default R2R publishing, 453 methods get jitted at runtime before the app crashes; in composite publishing this drops to 341 runtime-jitted methods, and further to 252 methods when using composite built with this Crossgen2 change. While that's just an anecdotal observation, I think it shows the potential for some startup perf improvement. I have rebased this change against the latest main, and I plan to retest it and merge it in today around noon unless anyone objects.

mangod9 · 2023-08-08T15:36:45Z

> for default R2R publishing 453 methods get jitted at runtime before the app crashes;

Just confirming: are the app crashes what is being fixed here, or is that unrelated? Nice that you are observing less JITing, which should be good to merge (I assume you are observing more methods emitted in the R2R image as well?). Thx!

trylek · 2023-08-08T17:29:13Z

@mangod9 - of course not; it crashes in all build modes. I just didn't bother to fix all of it, as we never had a fully functional version supporting .NET 8. In the meantime I have also tried "dotnet new webapi" in composite mode without and with the MIBC data. Without MIBC data, the "normal" version jits 200 methods at runtime (including shutdown); with this change it jits 179 methods. With the StandardOptimizationData.mibc, the normal version jits 75 methods at runtime and the version with this PR jits 53 methods. As you can see, the diffs are similar (about 20 methods), so that's presumably thanks to Crossgen2 being newly able to compile methods calling SVMs.
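The "about 20 methods" figure follows directly from the counts quoted above. A small sketch of the arithmetic (JitCounts and Saved are hypothetical helper names used only for illustration):

```csharp
using System;

static class JitCounts
{
    // Number of methods no longer jitted at runtime: baseline count minus the count with the PR.
    public static int Saved(int baseline, int withChange) => baseline - withChange;

    static void Main()
    {
        Console.WriteLine(Saved(200, 179)); // without MIBC data: prints 21
        Console.WriteLine(Saved(75, 53));   // with StandardOptimizationData.mibc: prints 22
    }
}
```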

trylek · 2023-08-08T17:47:34Z

For the methods emitted into the R2R file, I don't have an exact diff, as the R2RDump diff mode has been broken for a while, but I see that for "dotnet new webapi" the full composite PE executable with this change is slightly larger (100.4 MB vs. 98.9 MB, or 1.4 MB delta) and contains more generics (the instance method entrypoint section size is 4.8 MB vs. 4.74 MB, i.e. about 60 KB larger). Interestingly enough, the number of non-generic methods is the same, which is probably expected, as I believe that currently the biggest user of SVMs in the framework is generic maths.

xtqqczze · 2023-08-09T12:10:59Z

> For the methods emitted into the R2R file, I don't have an exact diff as the R2RDump diff mode has been broken for a while

Seeing an increase in code size for some generic maths, see #84421 (comment).

EgorBo · 2023-08-14T06:39:26Z

It improved startup time in TE benchmarks, e.g.:

trylek added the area-crossgen2-coreclr label Jun 12, 2023

trylek added this to the 8.0.0 milestone Jun 12, 2023

trylek requested review from EgorBo and davidwrighton June 12, 2023 23:28

trylek requested a review from MichalStrehovsky as a code owner June 12, 2023 23:28

ghost assigned trylek Jun 12, 2023

build-analysis bot mentioned this pull request Jun 13, 2023

Tracking issue for CI build timeouts #76454

Closed

MichalStrehovsky reviewed Jun 13, 2023

runfoapp bot mentioned this pull request Jun 13, 2023

Infra improvements for Helix #68176

Closed

trylek force-pushed the Crossgen2SVMResolution branch 2 times, most recently from f870aa8 to 9ff9691 Compare June 15, 2023 21:44

This was referenced Jun 16, 2023

Outerloop crossgen 2 failures: The file is not a ReadyToRun image #87299

Closed

R2R known test failure #87716

Closed

trylek closed this Jun 19, 2023

trylek reopened this Jun 19, 2023

davidwrighton reviewed Jun 21, 2023

src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs (outdated, resolved)

davidwrighton reviewed Jun 21, 2023

src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs (outdated, resolved)

trylek force-pushed the Crossgen2SVMResolution branch from 9ff9691 to a84ada5 Compare June 23, 2023 11:01

MichalStrehovsky reviewed Jun 26, 2023

trylek added 15 commits August 8, 2023 15:41

Fix two Crossgen2 codegen bugs found via SVM testing

af42831

1. Delegate cctor helper JIT interface method was missing support for type constraint.
2. Methods wrapped in instantiation / unboxing stubs shouldn't claim they have a generic dictionary slot in their GC refmap.

Thanks
Tomas

Fix bugs, address PR feedback

a6785a4

Remove instrumentation; fix instantiation stub flag in getCallInfo

bc60009

Revert change to Argiterator that turned out to be incorrect

c75a1b0

Subtle fixes for SVM handling in ceeInfoGetCallInfo

c1d0603

Fix delegate ctor and module token construction for SVMs

0916bfc

Remove instrumentation and more fixes for special SVM cases

168908f

Fix assertion failure seen in readytorun/tests/generic tests

ad44935

Rebase against main; remove unused resource

5bad02e

Address Michal's PR feedback towards MethodImpl body lookup

5ddc43c

Fix typo and merge error

f5f530c

Add versioning resiliency test as suggested by DavidWr

56754ea

Remove superfluous blank lines

8d642c6

Fix composite issues by introducing a new token kind

26dd5cf

Address David Wrighton's PR feedback

1374c32

* Added stricter check to delegate constructor
* Disabled MethodBodyOnUnrelatedType test in Crossgen2 mode
* Reverted change to virtual method resolver

Thanks
Tomas

trylek force-pushed the Crossgen2SVMResolution branch from f1fb1fe to 1374c32 Compare August 8, 2023 15:28

trylek merged commit cb08364 into dotnet:main Aug 8, 2023
117 checks passed

trylek deleted the Crossgen2SVMResolution branch August 8, 2023 18:43

trylek mentioned this pull request Aug 8, 2023

Casting via generic math doesn't always inline in R2R #84421

Closed

cincuranet mentioned this pull request Aug 15, 2023

[Perf] Windows/x64: 43 Improvements on 8/9/2023 9:46:11 AM dotnet/perf-autofiling-issues#20581

Closed

ghost locked as resolved and limited conversation to collaborators Sep 13, 2023
