-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Libraries PGO tests failing with System.BadImageFormatException #56655
Comments
cc @EgorBo in case you get a chance to look before I do... |
OOF this week without a laptop, sorry |
Via
Not sure yet what in that method causes trouble. |
Note this method is on the stack when the test fails, so debugging the failure seems prudent:
|
Failure is not tied to PGO data directly -- just the inlining that PGO engenders. That is if I use log replay to do the exact inlines that PGO does, but erase the PGO data, the failure persists. Trimming down the set of inlines, the key inline seems to be the last one done here:
This method is large (> 500 bytes of IL) but gets large benefit boosts:
I may try to trim down the set of inlines further; suspect maybe just this inline is enough. |
Ok, think I've found the bug.
The code above initializes a runtime/src/coreclr/System.Private.CoreLib/src/System/Reflection/MdImport.cs Lines 183 to 195 in 57bfe47
Note this looks at one part of the struct or another. So far so good. But when we inline the caller that initializes the struct at a call site within a loop, the struct is not zeroed before each call, and the qcall does not clear out the part of the struct state it's not interested in. So we can end up looking at stale data from a previous call, which happens to be a token value for a property that does not make sense, and this causes the property props code to throw a bad format exception. This is some fallout from skipping zero-init in SPC; any qcall that initializes a struct must now initialize the entire thing (or enough of it to avoid a problem like this). |
This change fixes the issue: diff --git a/src/coreclr/vm/managedmdimport.cpp b/src/coreclr/vm/managedmdimport.cpp
index 6bea537f1c8..ec2aef0cc03 100644
--- a/src/coreclr/vm/managedmdimport.cpp
+++ b/src/coreclr/vm/managedmdimport.cpp
@@ -163,6 +163,7 @@ static int * EnsureResultSize(MetadataEnumResult * pResult, ULONG length)
else
{
ZeroMemory(pResult->smallResult, sizeof(pResult->smallResult));
+ pResult->largeResult = NULL;
p = pResult->smallResult;
} |
That is interesting and something the interop team should be aware of. Thanks for the analysis. @jkoritzinsky @elinor-fung Might be worth auditing some of our QCalls since we tend to do a lot of initialization/reading for the older inertop paths. |
…Enum `MetadataEnumResult` has a fixed inline buffer for returning small results and a pointer to allow it to return larger ones. The indexer for this checks the pointer and if non-null assumes that's the current set of values. But if a `MetadataEnumResult` is re-used within a loop, values written to it by `MetaDataImport::Enum` may bleed from one loop iteration to the next if the iterations first get a large result and then a small one. One case where this could happen was in libraries PGO tests, where PGO data encouraged the jit to inline `MemberInfoCache<T>.PopulateProperties(Filter,...)` into `MemberInfoCache<T>.PopulateProperties(Filter)`. Note this also is a conseqeunce of skipping zero init locals; without that the struct would have been zeroed each loop iteration. Fixes dotnet#56655.
…Enum (#56756) `MetadataEnumResult` has a fixed inline buffer for returning small results and a pointer to allow it to return larger ones. The indexer for this checks the pointer and if non-null assumes that's the current set of values. But if a `MetadataEnumResult` is re-used within a loop, values written to it by `MetaDataImport::Enum` may bleed from one loop iteration to the next if the iterations first get a large result and then a small one. One case where this could happen was in libraries PGO tests, where PGO data encouraged the jit to inline `MemberInfoCache<T>.PopulateProperties(Filter,...)` into `MemberInfoCache<T>.PopulateProperties(Filter)`. Note this also is a conseqeunce of skipping zero init locals; without that the struct would have been zeroed each loop iteration. Fixes #56655.
Seems to mainly pop up in the DCS tests -- example.
Haven't tried to repro yet.
The text was updated successfully, but these errors were encountered: