Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the performance of the type loader through various tweaks #85743

Merged
merged 29 commits into from Jun 13, 2023

Conversation

davidwrighton
Copy link
Member

@davidwrighton davidwrighton commented May 4, 2023

  1. Disable type validation by default in ReadyToRun images
    • This is potentially a behavior changing modification, and so maybe we shouldn't do this; however it produces some nice wins
    • I've built a type validation layer into the compiler to validate types against (most of) the checks that are removed. As part of building this, I've found a few small bugs in the typesystem. This validation layer isn't quite complete, and will be optional, although it will be enabled by default in crossgen2.
  2. Add new metadata to R2R images for finding generic parameter and nested class information
    • These lookups are extremely common during type loading.
  3. Store the index of the MethodDesc within the MethodDescChunk as well as the chunk index.
    • Used to improve the performance of finding the temporary entry point, which is a surprisingly common operation during type load. Its converted from an O(N^2) algorithm to O(1)
  4. Remove an old ngen'ism (CanShareVTableChunk) Since we don't have ngen images, we can ALWAYS share vtable chunks

Overall, this set of changes produces the following performance impact

Powershell startup

Initial time to launch powershell 100 times1 - 22869ms
With all fixes enabled - 19966ms
Without enabling skipping type validation - 21240ms

Additionally, while this change does increase the size of the r2r compiled files, the total size of the dlls associated with powershell goes from 242,357,432 bytes to 242,564,792 bytes, which is a fairly minimal increase.

Footnotes

  1. These powershell startup numbers are done by taking an official build of powershell targetting an older version of .NET, replacing the version of .NET with a locally built one, and then running crossgen2 on all binaries of powershell. This isn't likely how Powershell actually produces its builds, but should be representative of the sort of performance wins that can be had with these changes.

- Technically, this is a breaking change, so I've provided a means for disabling the type validation skip
- The model is that the C# compile won't get these details wrong, so disable the checks when run through crossgen2. The idea is that we'll get these things checked during normal, non-R2R usage of the app, and publish won't check these details.
- it was only needed to support NGen
Comment on lines 95 to 97
MethodIsGenericMap = 121, // Added in V9.0
EnclosingTypeMap = 122, // Added in V9.0
TypeGenericInfoMap = 123, // Added in V9.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MethodIsGenericMap = 121, // Added in V9.0
EnclosingTypeMap = 122, // Added in V9.0
TypeGenericInfoMap = 123, // Added in V9.0
MethodIsGenericMap = 121, // Added in V9.0
EnclosingTypeMap = 122, // Added in V9.0
TypeGenericInfoMap = 123, // Added in V9.0

src/coreclr/inc/readytorun.h Show resolved Hide resolved
src/coreclr/vm/readytoruninfo.h Outdated Show resolved Hide resolved
src/coreclr/vm/readytoruninfo.h Show resolved Hide resolved
src/coreclr/vm/readytoruninfo.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/readytoruninfo.cpp Show resolved Hide resolved
@davidwrighton
Copy link
Member Author

/azp run runtime-coreclr crossgen2

Fix command line parameter to be more reasonable, and allow logging on command
Fix the rest of issues noted in crossgen2 testing
@dotnet dotnet deleted a comment from azure-pipelines bot Jun 1, 2023
@dotnet dotnet deleted a comment from azure-pipelines bot Jun 1, 2023
@dotnet dotnet deleted a comment from azure-pipelines bot Jun 1, 2023
@davidwrighton davidwrighton marked this pull request as ready for review June 2, 2023 00:11
Comment on lines 69 to 71
flags |= TypeFlags.HasGenericVarianceComputed;
flags |= TypeFlags.HasFinalizerComputed;
flags |= TypeFlags.AttributeCacheComputed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a bug if someone asks for these? Signature variables don't have "a finalizer" but also if someone is asking for that, maybe it's a bug. I've ran into the assert not setting this flag produces a 100% of times this was a bug in my code that I was happy to find so easily.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this work operates almost entirely over uninstantiated types and methods, and has a few spots where it actually needs to use these things. However, it might be the case that I can only set a minimal number of flags. I enabled them all as it seemed like the right thing to do, but I believe the only 1 I really needed to work was the attribute cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the stack that leads to this being needed? This really feels like something where the only right answer is to unask the question.

Comment on lines 32 to 35
internal void SetAssociatedTypeOrMethod(TypeSystemEntity associatedTypeOrMethod)
{
_associatedTypeOrMethod = associatedTypeOrMethod;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this in a constructor? This is already a RuntimeXXXDesc so I wouldn't mind if the constructor just understands a ReadOnlySpan of Runtime.GenericVariance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor requested and implemented.

Comment on lines 86 to 94
TypeSystemEntity tse = Volatile.Read(ref _associatedTypeOrMethod);
if (tse == null)
{
GenericParameter parameter = _module.MetadataReader.GetGenericParameter(_handle);

tse = (TypeSystemEntity)_module.GetObject(parameter.Parent);
Volatile.Write(ref _associatedTypeOrMethod, tse);
}
return tse;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need the Volatile here.

If this is perf critical, we can use the same pattern as BaseType:

public override DefType BaseType
{
get
{
if (_baseType == this)
return InitializeBaseType();
return _baseType;
}
}

If it's not perf critical maybe we don't even need to cache it.

@@ -76,6 +76,8 @@ private InstantiatedGenericParameter(GenericParameterDesc genericParam)

public override GenericConstraints Constraints => _genericParam.Constraints;

public override TypeSystemEntity AssociatedTypeOrMethod => _genericParam.AssociatedTypeOrMethod;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about the ILVerify codebase, but doesn't this need InstantiateSignature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? I don't know that codebase either, but it doesn't seem unreasonable.

Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VM side looks good to me.

@davidwrighton
Copy link
Member Author

@trylek Could you take a look at the Crossgen2 side of these changes?

@davidwrighton davidwrighton requested a review from trylek June 9, 2023 23:08
builder.AddSymbol(this);

// This map is only valid for assemblies with <= 0xFFFE types defined within
Debug.Assert(_metadata.TypeDefinitions.Count <= 0xFFFE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - this constant appears multiple times in the code, could we make it a symbolic constant? A sufficiently expressive name like MaxTypeCountSupportingEnclosingTypeMap could also make some of the code comments obsolete / self-explanatory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

{
if (factory.OptimizationFlags.TypeValidation == TypeValidationRule.AutomaticWithLogging)
{
foreach (string reason in _shouldAddSkipTypeValidationFlag.Result.reasons)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing part of the intended logic here - according to the naming I would expect _shouldAddSkipTypeValidationFlag to represent some form of vetting whether we can drop the type validation and demonstrate the rationale if we do; why do we dump the reasons in the else block instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment. These are logging reasons why we aren't able to skip validation. Logging reasons we could skip would be really boring. (It would just be a list of types that are considered ok, which isn't all that useful for anybody)

return count;
}

HRESULT ReadyToRun_TypeGenericInfoMap::GetGenericArgumentCountNoThrow(mdTypeDef input, uint32_t *pCount, IMDInternalImport* pImport) const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - there seems to be tons of duplication between the pairs of methods with / without the NoThrow prefix. Could we somehow consolidate them by making one call the other and just slightly tweak the outputs?

return 0;
}

uint8_t chunk = ((uint8_t*)&MethodCount)[((rid - 1) / 8) + 4];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - if you want to be consistent with your implementation of the other tables (and possibly also make the code easier to understand), you might want to consider replacing 4 with sizeof(uint32_t) to make it obvious it represents the initial count entry in the table.

Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks David for making all these optimizations!

@davidwrighton davidwrighton merged commit 8042fac into dotnet:main Jun 13, 2023
100 of 110 checks passed
@MichalStrehovsky MichalStrehovsky mentioned this pull request Jun 14, 2023
@EgorBo EgorBo mentioned this pull request Jun 19, 2023
@dotnet dotnet locked as resolved and limited conversation to collaborators Jul 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants