JIT: enhance escape analysis to understand local struct fields #113141
base: main
Conversation
Implement a very simplistic "field sensitive" analysis for `Span<T>` where we model the span as simply its byref field. If the span is only consumed locally, and the array does not otherwise escape, then the array does not escape. This is a subset of dotnet#112543 that does not try and reason interprocedurally. Contributes to dotnet#104936 / dotnet#108913
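The idea can be sketched outside the JIT as a tiny connection-graph model (hypothetical Python, not JIT code; all names are illustrative): locals are nodes, assignments are edges, and a `Span<T>` is modeled as just its byref field, so a span that only flows to non-escaping locals does not make the underlying array escape.

```python
# Hypothetical sketch of the field-sensitive escape analysis described
# above. Locals are graph nodes; assignments add flow edges; a span is
# conflated with its byref field, so the array it wraps escapes only if
# the span itself flows somewhere escaping.

class EscapeGraph:
    def __init__(self):
        self.edges = {}       # src local -> set of dst locals it flows into
        self.escaped = set()  # locals known to escape (returned, heap-stored, ...)

    def assign(self, dst, src):
        """Model 'dst = src': src's referent becomes reachable from dst."""
        self.edges.setdefault(src, set()).add(dst)

    def mark_escape(self, lcl):
        self.escaped.add(lcl)

    def escapes(self, lcl):
        """lcl escapes if it (transitively) flows into an escaping local."""
        seen, work = set(), [lcl]
        while work:
            n = work.pop()
            if n in seen:
                continue
            seen.add(n)
            if n in self.escaped:
                return True
            work.extend(self.edges.get(n, ()))
        return False

g = EscapeGraph()
g.assign("span", "arr")       # span = new int[N]: array flows into the span's byref field
assert not g.escapes("arr")   # only local uses: array can be stack allocated
g.assign("field", "span")     # span stored to an escaping location
g.mark_escape("field")
assert g.escapes("arr")       # now the array escapes too
```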
PR Overview
This PR adds new tests to the JIT for enhanced escape analysis concerning Span capture. It includes the introduction of a ref struct (SpanKeeper) for capturing spans and several new test cases and helper methods that exercise various span capture patterns.
- Introduces a new ref struct (SpanKeeper) to model captured spans.
- Adds new test methods (e.g., SpanCaptureArray1, SpanCaptureArrayT, SpanEscapeArrayOutParam2) for Span capture scenarios.
- Includes supporting helper methods (e.g., Use(Span), TrashStack) to drive the tests.
Reviewed Changes
File | Description
---|---
src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs | Added tests and helper methods to verify proper handling of Span captures in varying contexts
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (1)
src/tests/JIT/opt/ObjectStackAllocation/ObjectStackAllocationTests.cs:519
- The output parameter 'b' in SpanEscapeArrayOutParam2Helper is used before it is explicitly assigned. Consider initializing 'b' (e.g., using 'b = default;') before setting its fields to ensure proper assignment.
b.span = x;
@EgorBo @jakobbotsch PTAL. Note that for full-blown field-sensitive handling of structs we would need to build connection graph edges when fields are loaded; since we know loads from spans can't cause escapes, we bypass that for now.
src/coreclr/jit/objectalloc.cpp
newLayout = m_compiler->typGetBlkLayout(layout->GetSize());
JITDUMP("Changing layout of span V%02u to block\n", lclNum);
lclVarDsc->ChangeLayout(newLayout);
I think for this we want a layout with proper padding information, otherwise the promotion will be suboptimal compared to the original layout.
Ah, ok... maybe an "erase GC" method on a layout?
Is it even necessary for us to change the type here? Seems like the original byref type should be ok to now point to a stack-allocated thing.
Ah, ok... maybe an "erase GC" method on a layout?
I think you can do something like:
ClassLayoutBuilder builder(lcl->GetLayout()->GetSize());
builder.CopyInfoFrom(0, lcl->GetLayout(), /* copyPadding */ true);
builder.SetGCPtrType(OFFSETOF__CORINFO_Span__reference / TARGET_POINTER_SIZE, TYP_I_IMPL);
Although I suppose if we generalized this PR to arbitrary structs we would indeed want something that only copies the padding information, so maybe introducing another parameter to CopyInfoFrom to ignore the GC information would be the way to go.
Maybe I misunderstand this, but can the handling of this PR be generalized to treat any struct with references in it as one big reference that causes unification to happen between everything that reads/writes the struct?
Indeed, looks like we need to do some modelling of the loads. I think we can just conflate the struct with its fields. Let me see how this works out.
We conflate a struct with its fields, since we know the struct itself can't escape. We rely on accesses not being able to walk from one local to another. We try to be properly conservative around byrefs.
// Allow a source GC_REF to match a custom GC_BYREF
//
if ((layout2->GetGCPtrType(i) == TYP_REF) && (layout1->GetGCPtrType(i) == TYP_BYREF) &&
    layout1->IsCustomLayout())
{
    continue;
}
Changing the definition of layout compatibility seems quite questionable to me. Why is it necessary?
GCStruct a;
a.obj = ...; // doesn't escape
GCStruct b;
b.obj = ...; // escapes

// Assignment is legal, but layouts are not compatible w/o the above.
// a.obj must be BYREF as it may refer to stack or heap.
a = b;
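The asymmetry argued for here can be modeled in a few lines (hypothetical sketch, not the JIT's `ClassLayout` code): a destination BYREF slot may accept a source REF, since a heap reference is a valid byref, but a REF slot must never receive a BYREF, which might point at the stack.

```python
# Hypothetical model of the one-sided GC slot compatibility described
# above. Slot kinds mirror the JIT's TYP_REF / TYP_BYREF / non-GC slots.

REF, BYREF, NONGC = "ref", "byref", "nongc"

def slot_assignable(dest_slot, src_slot):
    if dest_slot == src_slot:
        return True
    # one-sided: a ref value stored into a byref slot is fine;
    # the reverse could leak a stack pointer into a ref slot.
    return dest_slot == BYREF and src_slot == REF

def layout_assignable(dest, src):
    """Per-slot, non-symmetric assignment compatibility."""
    return len(dest) == len(src) and all(
        slot_assignable(d, s) for d, s in zip(dest, src))

# a = b from the example: a.obj was retyped to BYREF, b.obj is still REF
assert layout_assignable([BYREF, NONGC], [REF, NONGC])
# the reverse direction (b = a) must be rejected
assert not layout_assignable([REF, NONGC], [BYREF, NONGC])
```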
I was tempted to retain the class handle in the new layout, but didn't.
Note that allowing `b = a` would be incompatible with the results of escape analysis, which is why this check is one-sided (a byref destination can accept a ref source, but not vice versa).
So are you saying that for the

object aObj = ...; // doesn't escape
object bObj = ...; // escapes
aObj = bObj;

case, today we will produce a STORE_LCL_VAR that stores a TYP_REF into a TYP_BYREF?
Yes, we will retype `aObj` appearances to be byrefs, as it will be "possibly stack pointing", and `aObj` itself in the local table. This has been there since forever.
Ok, it makes it somewhat more palatable. But it seems like we need to separate the notions of "user facing" layout compatibility, as used by e.g. `Unsafe.BitCast`, and "implementation-specific assignment compatibility", which is non-symmetric and where we rely on internal coreclr implementation details. As you probably know, there could be a potential future where byrefs pointing at the method table would be illegal... I guess in that case we would need a third kind of GC pointer. Or we would have to switch object stack allocation to only kick in when it can reason about and change all accesses of the object.
I'm open to suggestions on how to express this better. Perhaps the presence of a custom layout for the dest might be enough?
We could get here with custom layouts on both sides:
GCStruct a;
a.obj = ...; // doesn't escape, may point at heap [byref]
GCStruct b;
b.obj = ...; // doesn't escape, never points at heap [nongc]
a = b;
I would just suggest renaming this to `ClassLayout::AssignmentCompatible(ClassLayout* dest, ClassLayout* src)` and allowing the byref <- ref assignments generally. And then switch `Unsafe.BitCast` to a version that keeps the stricter behavior.
Should probably be a separate PR...
(Maybe `BitCast` doesn't even need any changes; I was under the assumption that it threw an exception on incompatible layouts, but that does not seem to be the case.)
            {
                gcType = TYPE_GC_BYREF;
            }
            SetGCPtr(startSlot + slot, gcType);
        }
    }
}
This code feels very object stack allocation specific. Put it in objectalloc.cpp?
// Get a layout like an existing layout, with all gc refs removed
ClassLayout* typGetNonGCLayout(ClassLayout* existingLayout);
// Get a layout like an existing layout, with all gc refs changed to byrefs
ClassLayout* typGetByrefLayout(ClassLayout* existingLayout);
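Conceptually, these two helpers derive a new layout from an existing one by rewriting each GC slot; a rough model (hypothetical sketch, not the JIT implementation) might look like:

```python
# Hypothetical model of typGetNonGCLayout / typGetByrefLayout: a layout
# is a list of slot kinds, and each derived layout rewrites the GC slots.

REF, BYREF, NONGC = "ref", "byref", "nongc"

def non_gc_layout(layout):
    # all gc refs removed: every slot becomes plain data
    return [NONGC for _ in layout]

def byref_layout(layout):
    # all gc refs (ref or byref) become byrefs; data slots are unchanged
    return [BYREF if slot in (REF, BYREF) else slot for slot in layout]

# e.g. a Span<T>-like layout: {byref _reference, int _length}
span_layout = [BYREF, NONGC]
obj_layout  = [REF, NONGC]

assert byref_layout(obj_layout) == [BYREF, NONGC]
assert non_gc_layout(span_layout) == [NONGC, NONGC]
```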
This feels like something that only stack allocation is ever going to use, so it would make more sense to me to keep it in objectalloc.h/cpp.
if (prefix != nullptr)
{
    char* newName      = nullptr;
    char* newShortName = nullptr;

    if (layoutName != nullptr)
    {
        size_t len = strlen(prefix) + strlen(layoutName) + 1;
        newName    = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len);
        sprintf_s(newName, len, "%s%s", prefix, layoutName);
    }

    if (layoutShortName != nullptr)
    {
        size_t len   = strlen(prefix) + strlen(layoutShortName) + 1;
        newShortName = m_compiler->getAllocator(CMK_DebugOnly).allocate<char>(len);
        sprintf_s(newShortName, len, "%s%s", prefix, layoutShortName);
    }

    SetName(newName, newShortName);
}
else
{
    SetName(layoutName, layoutShortName);
}
This can use printfAlloc.
Wonder if this is what I was seeing locally: somehow a managed type makes it to a UDIV.
`Span<T>` capture: we are trying to retype after stack allocation and don't expect to see a cast.
I see a regression where a previously stack-allocated array now becomes heap-allocated (coreclr_tests); is it expected?

@@ -26,10 +26,10 @@
; V15 tmp13 [V15,T14] ( 2, 2 ) byref -> rbx single-def "constrained call requires dereference for 'this' right before call"
;* V16 tmp14 [V16 ] ( 0, 0 ) byref -> zero-ref "constrained call requires dereference for 'this' right before call"
; V17 tmp15 [V17,T15] ( 2, 2 ) byref -> rbx single-def "constrained call requires dereference for 'this' right before call"
-;* V18 tmp16 [V18 ] ( 0, 0 ) long -> zero-ref class-hnd exact "NewArr temp" <MyInt[]>
-; V19 tmp17 [V19,T13] ( 4, 2 ) long -> rsi class-hnd exact single-def "Inline stloc first use temp" <MyInt[]>
+;* V18 tmp16 [V18 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "NewArr temp" <MyInt[]>
+; V19 tmp17 [V19,T13] ( 4, 2 ) ref -> rsi class-hnd exact single-def "Inline stloc first use temp" <MyInt[]>
; V20 tmp18 [V20,T16] ( 2, 2 ) ref -> rax class-hnd single-def "Strict ordering of exceptions for Array store" <MyInt>
-;* V21 tmp19 [V21 ] ( 0, 0 ) long -> zero-ref "constrained call requires dereference for 'this' right before call"
+;* V21 tmp19 [V21 ] ( 0, 0 ) byref -> zero-ref "constrained call requires dereference for 'this' right before call"
;* V22 tmp20 [V22 ] ( 0, 0 ) byref -> zero-ref "constrained call requires dereference for 'this' right before call"
; V23 tmp21 [V23,T17] ( 2, 2 ) byref -> rbx single-def "constrained call requires dereference for 'this' right before call"
;* V24 tmp22 [V24 ] ( 0, 0 ) byref -> zero-ref "constrained call requires dereference for 'this' right before call"
@@ -38,27 +38,22 @@
; V27 tmp25 [V27,T19] ( 2, 2 ) byref -> rbx single-def "constrained call requires dereference for 'this' right before call"
;* V28 tmp26 [V28 ] ( 0, 0 ) byref -> zero-ref "constrained call requires dereference for 'this' right before call"
; V29 tmp27 [V29,T20] ( 2, 2 ) byref -> rbx single-def "constrained call requires dereference for 'this' right before call"
-; V30 tmp28 [V30 ] ( 3, 1.50) struct (24) [rsp+0x20] do-not-enreg[XS] must-init addr-exposed "stack allocated array temp" <MyInt[]>
-; V31 tmp29 [V31,T01] ( 5, 5 ) int -> rax "Single return block return value"
-; V32 tmp30 [V32,T21] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
-; V33 tmp31 [V33,T09] ( 3, 3 ) ref -> rcx single-def "arr expr"
-; V34 tmp32 [V34,T22] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
-; V35 tmp33 [V35,T10] ( 3, 3 ) ref -> rcx single-def "arr expr"
-; V36 tmp34 [V36,T23] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
-; V37 tmp35 [V37,T11] ( 3, 3 ) ref -> rbx single-def "arr expr"
-; V38 tmp36 [V38,T12] ( 3, 3 ) ref -> rbx single-def "arr expr"
+; V30 tmp28 [V30,T01] ( 5, 5 ) int -> rax "Single return block return value"
+; V31 tmp29 [V31,T21] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
+; V32 tmp30 [V32,T09] ( 3, 3 ) ref -> rcx single-def "arr expr"
+; V33 tmp31 [V33,T22] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
+; V34 tmp32 [V34,T10] ( 3, 3 ) ref -> rcx single-def "arr expr"
+; V35 tmp33 [V35,T23] ( 2, 2 ) ref -> rcx single-def "argument with side effect"
+; V36 tmp34 [V36,T11] ( 3, 3 ) ref -> rbx single-def "arr expr"
+; V37 tmp35 [V37,T12] ( 3, 3 ) ref -> rbx single-def "arr expr"
;
-; Lcl frame size = 56
+; Lcl frame size = 40
G_M56432_IG01: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, nogc <-- Prolog IG
push rsi
push rbx
- sub rsp, 56
- vxorps xmm4, xmm4, xmm4
- vmovdqa xmmword ptr [rsp+0x20], xmm4
- xor eax, eax
- mov qword ptr [rsp+0x30], rax
- ;; size=23 bbWeight=1 PerfScore 5.83
+ sub rsp, 40
+ ;; size=6 bbWeight=1 PerfScore 2.25
G_M56432_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
mov rcx, 0xD1FFAB1E ; <unknown class>
call CORINFO_HELP_NEWSFAST
@@ -119,21 +114,28 @@ G_M56432_IG03: ; bbWeight=0.50, gcrefRegs=0008 {rbx}, byrefRegs=0000 {},
test eax, eax
jne G_M56432_IG11
mov rcx, 0xD1FFAB1E ; MyInt[]
- mov qword ptr [rsp+0x20], rcx
- lea rcx, [rsp+0x20]
- mov dword ptr [rcx+0x08], 1
- lea rsi, [rsp+0x20]
- mov rcx, 0xD1FFAB1E ; <unknown method>
- call [System.Activator:CreateInstance[System.__Canon]():System.__Canon]
+ mov edx, 1
+ call CORINFO_HELP_NEWARR_1_OBJ
; gcrRegs +[rax]
; gcr arg pop 0
- mov gword ptr [rsi+0x10], rax
+ mov rsi, rax
+ ; gcrRegs +[rsi]
+ mov rcx, 0xD1FFAB1E ; <unknown method>
+ call [System.Activator:CreateInstance[System.__Canon]():System.__Canon]
+ ; gcr arg pop 0
+ lea rcx, bword ptr [rsi+0x10]
+ ; byrRegs +[rcx]
+ mov rdx, rax
+ ; gcrRegs +[rdx]
+ call CORINFO_HELP_ASSIGN_REF
+ ; gcrRegs -[rax rdx]
+ ; byrRegs -[rcx]
mov rcx, gword ptr [rsi+0x10]
; gcrRegs +[rcx]
mov r11, 0xD1FFAB1E ; code for <unknown method>
mov edx, 100
call [r11]<unknown method>
- ; gcrRegs -[rax rcx]
+ ; gcrRegs -[rcx]
; gcr arg pop 0
mov rcx, gword ptr [rbx+0x10]
; gcrRegs +[rcx]
@@ -141,7 +143,7 @@ G_M56432_IG03: ; bbWeight=0.50, gcrefRegs=0008 {rbx}, byrefRegs=0000 {},
; gcrRegs +[r8]
xor edx, edx
call CORINFO_HELP_ARRADDR_ST
- ; gcrRegs -[rcx r8]
+ ; gcrRegs -[rcx rsi r8]
; gcr arg pop 0
mov rcx, gword ptr [rbx+0x10]
; gcrRegs +[rcx]
@@ -187,7 +189,7 @@ G_M56432_IG03: ; bbWeight=0.50, gcrefRegs=0008 {rbx}, byrefRegs=0000 {},
; gcr arg pop 0
mov rcx, rbx
; gcrRegs +[rcx]
- ;; size=283 bbWeight=0.50 PerfScore 30.25
+ ;; size=282 bbWeight=0.50 PerfScore 29.88
G_M56432_IG04: ; bbWeight=0.50, isz, extend
mov rdx, rsi
; gcrRegs +[rdx]
@@ -337,7 +339,7 @@ G_M56432_IG07: ; bbWeight=0.50, gcrefRegs=0008 {rbx}, byrefRegs=0000 {},
mov eax, 4
;; size=95 bbWeight=0.50 PerfScore 10.50
G_M56432_IG08: ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, epilog, nogc
- add rsp, 56
+ add rsp, 40
pop rbx
pop rsi
ret
@@ -379,7 +381,7 @@ G_M56432_IG09: ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=000
mov eax, 3
;; size=95 bbWeight=0.50 PerfScore 10.50
G_M56432_IG10: ; bbWeight=0.50, epilog, nogc, extend
- add rsp, 56
+ add rsp, 40
pop rbx
pop rsi
ret
@@ -418,7 +420,7 @@ G_M56432_IG11: ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=000
mov eax, 2
;; size=81 bbWeight=0.50 PerfScore 7.50
G_M56432_IG12: ; bbWeight=0.50, epilog, nogc, extend
- add rsp, 56
+ add rsp, 40
pop rbx
pop rsi
ret
@@ -457,7 +459,7 @@ G_M56432_IG13: ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=000
mov eax, 1
;; size=81 bbWeight=0.50 PerfScore 7.50
G_M56432_IG14: ; bbWeight=0.50, epilog, nogc, extend
- add rsp, 56
+ add rsp, 40
pop rbx
pop rsi
ret
@@ -468,7 +470,7 @@ G_M56432_IG15: ; bbWeight=0, gcVars=0000000000000000 {}, gcrefRegs=0000 {
int3
;; size=6 bbWeight=0 PerfScore 0.00
-; Total bytes of code 1098, prolog size 23, PerfScore 126.83, instruction count 214, allocated bytes for code 1098 (MethodHash=7754238f) for method test:TestEntryPoint():int (FullOpts)
+; Total bytes of code 1080, prolog size 6, PerfScore 122.88, instruction count 211, allocated bytes for code 1080 (MethodHash=7754238f) for method test:TestEntryPoint():int (FullOpts)
No. Have you looked into it? If not, I'll get to it in a day or two.
Implement a very simplistic "field sensitive" analysis for gc structs where we model each struct as simply its gc field(s).

That is, given a gc struct `G` with GC field `f`, we model a store to `G.f` as an assignment to `G`, and a load of `G.f` as a read from `G`. Since we know `G` itself cannot escape, "escaping" of `G` means `G.f` can escape.

Note this conflates the fields of `G`: either they all escape or none of them do. We could make this more granular, but it's not clear that doing so is worthwhile, and it requires more up-front work to determine how to track each field's status.

If the struct or struct fields are only consumed locally, we may be able to prove the gc referents don't escape.

This is a subset/elaboration of #112543 that does not try to reason interprocedurally.

Contributes to #104936 / #108913

Fixes #113236 (once the right inlines happen)
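The conflation described above can be sketched as a small model (hypothetical Python, not JIT code): every access to a field of `G` is treated as an access to `G` itself, so a load of any field conservatively sees everything ever stored into the struct.

```python
# Hypothetical sketch of conflating a gc struct with its fields: stores
# to G.f are modeled as assignments to G, loads of G.f as reads from G,
# so either all of G's gc referents "escape" together or none do.

class ConflatedStruct:
    def __init__(self, name):
        self.name = name
        self.sources = set()  # every value stored into any field of G

    def store_field(self, field, value):
        # G.f = value is modeled as an assignment to G
        self.sources.add(value)

    def load_field(self, field):
        # value = G.f is modeled as a read from G: any stored value
        # may be what flows out, regardless of which field is read
        return set(self.sources)

g = ConflatedStruct("G")
g.store_field("f", "arr1")
g.store_field("h", "arr2")
# reading any field conservatively returns everything stored into G
assert g.load_field("f") == {"arr1", "arr2"}
assert g.load_field("h") == g.load_field("f")
```

This is the field-insensitive trade-off the PR description mentions: granular per-field tracking would shrink these sets, at the cost of more up-front bookkeeping.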