Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAot + Objective-C Marshal] memory corruption after thin locks were introduced #80031

Closed
AustinWise opened this issue Dec 29, 2022 · 7 comments

Comments

@AustinWise
Copy link
Contributor

Specifically while running the test in src/tests/Interop/ObjectiveC/ObjectiveCMarshalAPI, compiled using NativeAOT.

At this point I'm still investigating the cause of the corruption. Using git bisect I determined the test started failing when thin locks, #79519, was merged. Another possibility is I made a mistake while implementing #78280.

Crash details

The crashing stack trace:

* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
  * frame #0: 0x0000000100004114 ObjectiveCMarshalAPI`__VirtualCall_S_P_CoreLib_System_Random__Next
    frame #1: 0x000000010024ba24 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_RuntimeHelpers__GetNewHashCode + 52
    frame #2: 0x00000001001e4afc ObjectiveCMarshalAPI`S_P_CoreLib_System_Threading_ObjectHeader__AssignHashCode + 44
    frame #3: 0x00000001001e4aa4 ObjectiveCMarshalAPI`S_P_CoreLib_System_Threading_ObjectHeader__GetHashCode + 324
    frame #4: 0x000000010024ba68 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_RuntimeHelpers__GetHashCode + 24
    frame #5: 0x00000001003d6038 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2_Container<System___Canon__System___Canon>__FindEntry + 56
    frame #6: 0x00000001003d5fa8 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2_Container<System___Canon__System___Canon>__TryGetValueWorker + 72
    frame #7: 0x00000001003cf350 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2<System___Canon__System___Canon>__TryGetValue + 96
    frame #8: 0x0000000100250574 ObjectiveCMarshalAPI`ObjectiveCMarshalTryGetTaggedMemory + 84
    frame #9: 0x00000001000b81f0 ObjectiveCMarshalAPI`(anonymous namespace)::TryGetTaggedMemory(obj=0x00000001100271b8, tagged=0x000000016fdfe568) at interoplibinterface_objc.cpp:86:23
    frame #10: 0x00000001000b82b0 ObjectiveCMarshalAPI`ObjCMarshalNative::OnEnteredFinalizerQueue(object=0x00000001100271b8) at interoplibinterface_objc.cpp:167:10
    frame #11: 0x000000010001933c ObjectiveCMarshalAPI`GCToEEInterface::EagerFinalized(obj=0x00000001100271b8) at gcrhenv.cpp:1177:9
    frame #12: 0x0000000100078fc8 ObjectiveCMarshalAPI`WKS::CFinalize::ScanForFinalization(this=0x0000600002908000, pfn=(ObjectiveCMarshalAPI`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:45945), gen=2, mark_only_p=NO, hp=0x0000000000000000)(Object**, ScanContext*, unsigned int), int, int, WKS::gc_heap*) at gc.cpp:48499:25
    frame #13: 0x000000010005ee40 ObjectiveCMarshalAPI`WKS::gc_heap::mark_phase(condemned_gen_number=2, mark_only_p=NO) at gc.cpp:26933:21
    frame #14: 0x0000000100059be0 ObjectiveCMarshalAPI`WKS::gc_heap::gc1() at gc.cpp:21221:13
    frame #15: 0x000000010006b9cc ObjectiveCMarshalAPI`WKS::gc_heap::garbage_collect(n=2) at gc.cpp:23011:9
    frame #16: 0x000000010004dcf0 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollectGeneration(this=0x0000600000004230, gen=2, reason=reason_induced) at gc.cpp:47464:9
    frame #17: 0x000000010009f934 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollectTry(this=0x0000600000004230, generation=2, low_memory_p=NO, mode=2) at gc.cpp:46709:12
    frame #18: 0x000000010009f774 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollect(this=0x0000600000004230, generation=2, low_memory_p=false, mode=2) at gc.cpp:46639:30
    frame #19: 0x0000000100014778 ObjectiveCMarshalAPI`::RhpCollect(uGeneration=4294967295, uMode=2) at GCHelpers.cpp:39:35
    frame #20: 0x0000000100242348 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_InternalCalls__RhpCollect + 72
    frame #21: 0x00000001002422f0 ObjectiveCMarshalAPI`RhCollect + 32
    frame #22: 0x0000000100112938 ObjectiveCMarshalAPI`S_P_CoreLib_System_GC__Collect_0 + 24
    frame #23: 0x00000001002dec70 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI_ObjectiveCMarshalAPI_Program__Validate_ReferenceTracking_Scenario + 672
    frame #24: 0x00000001002df3d0 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI_ObjectiveCMarshalAPI_Program__Main + 48
    frame #25: 0x00000001003e140c ObjectiveCMarshalAPI`ObjectiveCMarshalAPI__Module___MainMethodWrapper + 12
    frame #26: 0x00000001003e1480 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI__Module___StartupCodeMain + 96
    frame #27: 0x000000010000c738 ObjectiveCMarshalAPI`main(argc=1, argv=0x000000016fdff6f8) at main.cpp:216:18
    frame #28: 0x00000001aef43e50 dyld`start + 2544

Within the virtual call stub at the top of the stack, a register that should be pointing at a vtable is pointing one byte past the vtable. Specifically it is pointing at:

ObjectiveCMarshalAPI`vtable for S_P_CoreLib_System_Random_ThreadSafeRandom + 1

Other thoughts

This hash code related code path is being called while the program is paused in the garbage collector. It can potentially allocate memory or take locks, which is not allowed according to the rules of restricted callouts. I'll file a separate issue for that.

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Dec 29, 2022
@AustinWise
Copy link
Contributor Author

See also #80032. Both proposed solutions cause the test to stop crashing, but I don't know if they fix the root cause.

@ghost
Copy link

ghost commented Dec 29, 2022

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Specifically while running the test in src/tests/Interop/ObjectiveC/ObjectiveCMarshalAPI, compiled using NativeAOT.

At this point I'm still investigating the cause of the corruption. Using git bisect I determined the test started failing when thin locks, #79519, was merged. Another possibility is I made a mistake while implementing #78280.

Crash details

The crashing stack trace:

* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
  * frame #0: 0x0000000100004114 ObjectiveCMarshalAPI`__VirtualCall_S_P_CoreLib_System_Random__Next
    frame #1: 0x000000010024ba24 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_RuntimeHelpers__GetNewHashCode + 52
    frame #2: 0x00000001001e4afc ObjectiveCMarshalAPI`S_P_CoreLib_System_Threading_ObjectHeader__AssignHashCode + 44
    frame #3: 0x00000001001e4aa4 ObjectiveCMarshalAPI`S_P_CoreLib_System_Threading_ObjectHeader__GetHashCode + 324
    frame #4: 0x000000010024ba68 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_RuntimeHelpers__GetHashCode + 24
    frame #5: 0x00000001003d6038 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2_Container<System___Canon__System___Canon>__FindEntry + 56
    frame #6: 0x00000001003d5fa8 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2_Container<System___Canon__System___Canon>__TryGetValueWorker + 72
    frame #7: 0x00000001003cf350 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_CompilerServices_ConditionalWeakTable_2<System___Canon__System___Canon>__TryGetValue + 96
    frame #8: 0x0000000100250574 ObjectiveCMarshalAPI`ObjectiveCMarshalTryGetTaggedMemory + 84
    frame #9: 0x00000001000b81f0 ObjectiveCMarshalAPI`(anonymous namespace)::TryGetTaggedMemory(obj=0x00000001100271b8, tagged=0x000000016fdfe568) at interoplibinterface_objc.cpp:86:23
    frame #10: 0x00000001000b82b0 ObjectiveCMarshalAPI`ObjCMarshalNative::OnEnteredFinalizerQueue(object=0x00000001100271b8) at interoplibinterface_objc.cpp:167:10
    frame #11: 0x000000010001933c ObjectiveCMarshalAPI`GCToEEInterface::EagerFinalized(obj=0x00000001100271b8) at gcrhenv.cpp:1177:9
    frame #12: 0x0000000100078fc8 ObjectiveCMarshalAPI`WKS::CFinalize::ScanForFinalization(this=0x0000600002908000, pfn=(ObjectiveCMarshalAPI`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:45945), gen=2, mark_only_p=NO, hp=0x0000000000000000)(Object**, ScanContext*, unsigned int), int, int, WKS::gc_heap*) at gc.cpp:48499:25
    frame #13: 0x000000010005ee40 ObjectiveCMarshalAPI`WKS::gc_heap::mark_phase(condemned_gen_number=2, mark_only_p=NO) at gc.cpp:26933:21
    frame #14: 0x0000000100059be0 ObjectiveCMarshalAPI`WKS::gc_heap::gc1() at gc.cpp:21221:13
    frame #15: 0x000000010006b9cc ObjectiveCMarshalAPI`WKS::gc_heap::garbage_collect(n=2) at gc.cpp:23011:9
    frame #16: 0x000000010004dcf0 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollectGeneration(this=0x0000600000004230, gen=2, reason=reason_induced) at gc.cpp:47464:9
    frame #17: 0x000000010009f934 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollectTry(this=0x0000600000004230, generation=2, low_memory_p=NO, mode=2) at gc.cpp:46709:12
    frame #18: 0x000000010009f774 ObjectiveCMarshalAPI`WKS::GCHeap::GarbageCollect(this=0x0000600000004230, generation=2, low_memory_p=false, mode=2) at gc.cpp:46639:30
    frame #19: 0x0000000100014778 ObjectiveCMarshalAPI`::RhpCollect(uGeneration=4294967295, uMode=2) at GCHelpers.cpp:39:35
    frame #20: 0x0000000100242348 ObjectiveCMarshalAPI`S_P_CoreLib_System_Runtime_InternalCalls__RhpCollect + 72
    frame #21: 0x00000001002422f0 ObjectiveCMarshalAPI`RhCollect + 32
    frame #22: 0x0000000100112938 ObjectiveCMarshalAPI`S_P_CoreLib_System_GC__Collect_0 + 24
    frame #23: 0x00000001002dec70 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI_ObjectiveCMarshalAPI_Program__Validate_ReferenceTracking_Scenario + 672
    frame #24: 0x00000001002df3d0 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI_ObjectiveCMarshalAPI_Program__Main + 48
    frame #25: 0x00000001003e140c ObjectiveCMarshalAPI`ObjectiveCMarshalAPI__Module___MainMethodWrapper + 12
    frame #26: 0x00000001003e1480 ObjectiveCMarshalAPI`ObjectiveCMarshalAPI__Module___StartupCodeMain + 96
    frame #27: 0x000000010000c738 ObjectiveCMarshalAPI`main(argc=1, argv=0x000000016fdff6f8) at main.cpp:216:18
    frame #28: 0x00000001aef43e50 dyld`start + 2544

Within the virtual call stub at the top of the stack, a register that should be pointing at a vtable is pointing one byte past the vtable. Specifically it is pointing at:

ObjectiveCMarshalAPI`vtable for S_P_CoreLib_System_Random_ThreadSafeRandom + 1

Other thoughts

This hash code related code path is being called while the program is paused in the garbage collector. It can potentially allocate memory or take locks, which is not allowed according to the rules of restricted callouts. I'll file a separate issue for that.

Author: AustinWise
Assignees: -
Labels:

untriaged, area-NativeAOT-coreclr

Milestone: -

@jkotas
Copy link
Member

jkotas commented Dec 29, 2022

cc @VSadov

@jkotas
Copy link
Member

jkotas commented Dec 29, 2022

Within the virtual call stub at the top of the stack, a register that should be pointing at a vtable is pointing one byte past the vtable.

The code is violating this rule:

// * For the AfterMarkPhase callout special attention must be paid to avoid any action that reads the MethodTable*
// from an object header (e.g. casting). At this point the GC may have mark bits set in the pointer.

@AustinWise
Copy link
Contributor Author

Ah, thanks for pointing out that rule. The reason the Thin Locks PR caused the test to start failing is it introduced a virtual call in RuntimeHelpers.GetNewHashCode.

I'll close this issue and leave #80032 open to track fixing the problem.

@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Dec 29, 2022
@VSadov
Copy link
Member

VSadov commented Dec 29, 2022

Also there is a possible allocation, which could try to trigger GC, while doing GC.
And there is possible taking a lock, which could be held by a non-GC thread, which would be suspended at this point.

@dotnet dotnet locked as resolved and limited conversation to collaborators Jan 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants