Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-deterministic codegen in S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl #93843

Closed
MichalStrehovsky opened this issue Oct 23, 2023 · 4 comments · Fixed by #94082

Comments

@MichalStrehovsky
Copy link
Member

In #93600 I'm adding a test to ensure the output of the native aot compiler is deterministic, but it found an issue in codegen of GetPropertyImpl:

diff --git a/baseline.txt b/compare.txt
index faad015..81a66b5 100644
--- a/baseline.txt
+++ b/compare.txt
@@ -79,7 +79,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000ADE75: 4C 8B 4A 08        mov         r9,qword ptr [rdx+8]
   00000000000ADE79: 48 85 DB           test        rbx,rbx
-  00000000000ADE7C: 0F 84 85 02 00 00  je          00000000000AE107
+  00000000000ADE7C: 0F 84 78 02 00 00  je          00000000000AE0FA
   00000000000ADE82: 48 89 5C 24 20     mov         qword ptr [rsp+20h],rbx
   00000000000ADE87: 89 6C 24 28        mov         dword ptr [rsp+28h],ebp
   00000000000ADE8B: 33 D2              xor         edx,edx
@@ -215,7 +215,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
   00000000000AE047: 48 8B C3           mov         rax,rbx
   00000000000AE04A: E9 91 00 00 00     jmp         00000000000AE0E0
   00000000000AE04F: 4D 85 FF           test        r15,r15
-  00000000000AE052: 0F 84 BC 00 00 00  je          00000000000AE114
+  00000000000AE052: 0F 84 AF 00 00 00  je          00000000000AE107
   00000000000AE058: F7 C5 00 00 01 00  test        ebp,10000h
   00000000000AE05E: 74 3E              je          00000000000AE09E
   00000000000AE060: 48 8D 4C 24 60     lea         rcx,[rsp+60h]
@@ -271,23 +271,16 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000AE101: E8 6A EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
   00000000000AE106: CC                 int         3
-  00000000000AE107: 48 8D 0D 00 00 00  lea         rcx,[__Str_name]
-                    00
-  00000000000AE10E: E8 5D EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
-  00000000000AE113: CC                 int         3
-  00000000000AE114: 48 8B CB           mov         rcx,rbx
-  00000000000AE117: E8 04 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
-  00000000000AE11C: 48 8B C8           mov         rcx,rax
-  00000000000AE11F: E8 00 00 00 00     call        RhpThrowEx
-  00000000000AE124: CC                 int         3
-  00000000000AE125: 90                 nop
-  00000000000AE126: 90                 nop
-  00000000000AE127: 90                 nop
-  00000000000AE128: 90                 nop
-  00000000000AE129: 90                 nop
-  00000000000AE12A: 90                 nop
-  00000000000AE12B: 90                 nop
-  00000000000AE12C: 90                 nop
-  00000000000AE12D: 90                 nop
-  00000000000AE12E: 90                 nop
-  00000000000AE12F: 90                 nop
+  00000000000AE107: 48 8B CB           mov         rcx,rbx
+  00000000000AE10A: E8 11 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
+  00000000000AE10F: 48 8B C8           mov         rcx,rax
+  00000000000AE112: E8 00 00 00 00     call        RhpThrowEx
+  00000000000AE117: CC                 int         3
+  00000000000AE118: 90                 nop
+  00000000000AE119: 90                 nop
+  00000000000AE11A: 90                 nop
+  00000000000AE11B: 90                 nop
+  00000000000AE11C: 90                 nop
+  00000000000AE11D: 90                 nop
+  00000000000AE11E: 90                 nop
+  00000000000AE11F: 90                 nop

In one run, two ArgumentNullException.Throw("name") calls got coalesced into a single throwing code. In the other run, there are two identical blocks to throw called from two separate places.

@MichalStrehovsky MichalStrehovsky added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 23, 2023
@ghost
Copy link

ghost commented Oct 23, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

In #93600 I'm adding a test to ensure the output of the native aot compiler is deterministic, but it found an issue in codegen of GetPropertyImpl:

diff --git a/baseline.txt b/compare.txt
index faad015..81a66b5 100644
--- a/baseline.txt
+++ b/compare.txt
@@ -79,7 +79,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000ADE75: 4C 8B 4A 08        mov         r9,qword ptr [rdx+8]
   00000000000ADE79: 48 85 DB           test        rbx,rbx
-  00000000000ADE7C: 0F 84 85 02 00 00  je          00000000000AE107
+  00000000000ADE7C: 0F 84 78 02 00 00  je          00000000000AE0FA
   00000000000ADE82: 48 89 5C 24 20     mov         qword ptr [rsp+20h],rbx
   00000000000ADE87: 89 6C 24 28        mov         dword ptr [rsp+28h],ebp
   00000000000ADE8B: 33 D2              xor         edx,edx
@@ -215,7 +215,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
   00000000000AE047: 48 8B C3           mov         rax,rbx
   00000000000AE04A: E9 91 00 00 00     jmp         00000000000AE0E0
   00000000000AE04F: 4D 85 FF           test        r15,r15
-  00000000000AE052: 0F 84 BC 00 00 00  je          00000000000AE114
+  00000000000AE052: 0F 84 AF 00 00 00  je          00000000000AE107
   00000000000AE058: F7 C5 00 00 01 00  test        ebp,10000h
   00000000000AE05E: 74 3E              je          00000000000AE09E
   00000000000AE060: 48 8D 4C 24 60     lea         rcx,[rsp+60h]
@@ -271,23 +271,16 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000AE101: E8 6A EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
   00000000000AE106: CC                 int         3
-  00000000000AE107: 48 8D 0D 00 00 00  lea         rcx,[__Str_name]
-                    00
-  00000000000AE10E: E8 5D EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
-  00000000000AE113: CC                 int         3
-  00000000000AE114: 48 8B CB           mov         rcx,rbx
-  00000000000AE117: E8 04 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
-  00000000000AE11C: 48 8B C8           mov         rcx,rax
-  00000000000AE11F: E8 00 00 00 00     call        RhpThrowEx
-  00000000000AE124: CC                 int         3
-  00000000000AE125: 90                 nop
-  00000000000AE126: 90                 nop
-  00000000000AE127: 90                 nop
-  00000000000AE128: 90                 nop
-  00000000000AE129: 90                 nop
-  00000000000AE12A: 90                 nop
-  00000000000AE12B: 90                 nop
-  00000000000AE12C: 90                 nop
-  00000000000AE12D: 90                 nop
-  00000000000AE12E: 90                 nop
-  00000000000AE12F: 90                 nop
+  00000000000AE107: 48 8B CB           mov         rcx,rbx
+  00000000000AE10A: E8 11 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
+  00000000000AE10F: 48 8B C8           mov         rcx,rax
+  00000000000AE112: E8 00 00 00 00     call        RhpThrowEx
+  00000000000AE117: CC                 int         3
+  00000000000AE118: 90                 nop
+  00000000000AE119: 90                 nop
+  00000000000AE11A: 90                 nop
+  00000000000AE11B: 90                 nop
+  00000000000AE11C: 90                 nop
+  00000000000AE11D: 90                 nop
+  00000000000AE11E: 90                 nop
+  00000000000AE11F: 90                 nop

In one run, two ArgumentNullException.Throw("name") calls got coalesced into a single throwing code. In the other run, there are two identical blocks to throw called from two separate places.

Author: MichalStrehovsky
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 23, 2023
@SingleAccretion SingleAccretion self-assigned this Oct 23, 2023
@SingleAccretion SingleAccretion added this to the 9.0.0 milestone Oct 23, 2023
@SingleAccretion SingleAccretion removed the untriaged New issue has not been triaged by the area owner label Oct 23, 2023
@SingleAccretion
Copy link
Contributor

SingleAccretion commented Oct 23, 2023

After some time spent on reproducing this, I think I have a good handle on the root cause.

The difference is that we sometimes don't deduplicate these helpers:

Scanning the 2 candidates

*** Does not return call
               [000256] --C-G------                         *  CALL      void   System.ArgumentNullException:Throw(System.String)
               [000255] ----------- arg0                    \--*  CNS_STR   ref   <string constant>
    in BB29 is unique, marking it as canonical

*** Does not return call
               [000236] --C-G------                         *  CALL      void   System.ArgumentNullException:Throw(System.String)
               [000235] ----------- arg0                    \--*  CNS_STR   ref   <string constant>
    in BB24 is unique, marking it as canonical

This is because we may not consider [000255] and [000235] equal. GenTreeStrCon equality compares gtScpHnd, a CORINFO_MODULE_HANDLE, which is actually a MethodIL handle, which is not deterministic in CG2/NAOT compilations.

Side note: the IL cache logic could benefit from a stress mode where flushing would be performed more frequently.

Not sure yet what the good fix is; there is a lot of dependency on the "token + scope handle" pair in the Jit-EE interface for string literals, it would be nice not to have to rewrite it.

@MichalStrehovsky
Copy link
Member Author

@SingleAccretion thank you for looking at this and root causing it! Sorry, I was sick for the past few days so I didn't look at github.

I think we can make this so that RyuJIT doesn't see two CORINFO_MODULE_HANDLE for a single method. I'll make a PR.

@MichalStrehovsky MichalStrehovsky added area-NativeAOT-coreclr and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Oct 27, 2023
@ghost
Copy link

ghost commented Oct 27, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

In #93600 I'm adding a test to ensure the output of the native aot compiler is deterministic, but it found an issue in codegen of GetPropertyImpl:

diff --git a/baseline.txt b/compare.txt
index faad015..81a66b5 100644
--- a/baseline.txt
+++ b/compare.txt
@@ -79,7 +79,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000ADE75: 4C 8B 4A 08        mov         r9,qword ptr [rdx+8]
   00000000000ADE79: 48 85 DB           test        rbx,rbx
-  00000000000ADE7C: 0F 84 85 02 00 00  je          00000000000AE107
+  00000000000ADE7C: 0F 84 78 02 00 00  je          00000000000AE0FA
   00000000000ADE82: 48 89 5C 24 20     mov         qword ptr [rsp+20h],rbx
   00000000000ADE87: 89 6C 24 28        mov         dword ptr [rsp+28h],ebp
   00000000000ADE8B: 33 D2              xor         edx,edx
@@ -215,7 +215,7 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
   00000000000AE047: 48 8B C3           mov         rax,rbx
   00000000000AE04A: E9 91 00 00 00     jmp         00000000000AE0E0
   00000000000AE04F: 4D 85 FF           test        r15,r15
-  00000000000AE052: 0F 84 BC 00 00 00  je          00000000000AE114
+  00000000000AE052: 0F 84 AF 00 00 00  je          00000000000AE107
   00000000000AE058: F7 C5 00 00 01 00  test        ebp,10000h
   00000000000AE05E: 74 3E              je          00000000000AE09E
   00000000000AE060: 48 8D 4C 24 60     lea         rcx,[rsp+60h]
@@ -271,23 +271,16 @@ S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__GetPropertyImpl
                     00
   00000000000AE101: E8 6A EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
   00000000000AE106: CC                 int         3
-  00000000000AE107: 48 8D 0D 00 00 00  lea         rcx,[__Str_name]
-                    00
-  00000000000AE10E: E8 5D EC F7 FF     call        S_P_CoreLib_System_ArgumentNullException__Throw
-  00000000000AE113: CC                 int         3
-  00000000000AE114: 48 8B CB           mov         rcx,rbx
-  00000000000AE117: E8 04 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
-  00000000000AE11C: 48 8B C8           mov         rcx,rax
-  00000000000AE11F: E8 00 00 00 00     call        RhpThrowEx
-  00000000000AE124: CC                 int         3
-  00000000000AE125: 90                 nop
-  00000000000AE126: 90                 nop
-  00000000000AE127: 90                 nop
-  00000000000AE128: 90                 nop
-  00000000000AE129: 90                 nop
-  00000000000AE12A: 90                 nop
-  00000000000AE12B: 90                 nop
-  00000000000AE12C: 90                 nop
-  00000000000AE12D: 90                 nop
-  00000000000AE12E: 90                 nop
-  00000000000AE12F: 90                 nop
+  00000000000AE107: 48 8B CB           mov         rcx,rbx
+  00000000000AE10A: E8 11 15 F9 FF     call        S_P_CoreLib_System_ThrowHelper__GetAmbiguousMatchException
+  00000000000AE10F: 48 8B C8           mov         rcx,rax
+  00000000000AE112: E8 00 00 00 00     call        RhpThrowEx
+  00000000000AE117: CC                 int         3
+  00000000000AE118: 90                 nop
+  00000000000AE119: 90                 nop
+  00000000000AE11A: 90                 nop
+  00000000000AE11B: 90                 nop
+  00000000000AE11C: 90                 nop
+  00000000000AE11D: 90                 nop
+  00000000000AE11E: 90                 nop
+  00000000000AE11F: 90                 nop

In one run, two ArgumentNullException.Throw("name") calls got coalesced into a single throwing code. In the other run, there are two identical blocks to throw called from two separate places.

Author: MichalStrehovsky
Assignees: SingleAccretion
Labels:

area-NativeAOT-coreclr

Milestone: 9.0.0

MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this issue Oct 27, 2023
RyuJIT depends on never seeing two different `CORINFO_MODULE_STRUCT` for the same thing.

Fixes dotnet#93843.
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Oct 27, 2023
MichalStrehovsky added a commit that referenced this issue Oct 30, 2023
RyuJIT depends on never seeing two different `CORINFO_MODULE_STRUCT` for the same thing.

Fixes #93843.
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 30, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Nov 30, 2023
github-actions bot pushed a commit that referenced this issue Dec 1, 2023
RyuJIT depends on never seeing two different `CORINFO_MODULE_STRUCT` for the same thing.

Fixes #93843.
jeffschwMSFT pushed a commit that referenced this issue Jan 2, 2024
RyuJIT depends on never seeing two different `CORINFO_MODULE_STRUCT` for the same thing.

Fixes #93843.

Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants