Import `KEEPALIVE(BOX)` as `KEEPALIVE(ADDR(TEMP))` #54412

SingleAccretion · 2021-06-18T12:59:39Z

This is a specialized optimization for a targeted use case, where the intent is to keep the fields of the struct alive across a native call in generated code. Since the generated code does not have access to private fields, it cannot do it on its own without boxing and thus incurring an allocation (and a copy).

The transform itself is quite simple: we import multiple KEEPALIVEs, one for each field, via indirections, and insert a copy if necessary. It is identical to the user writing GC.KeepAlive(struct.ObjectField) themselves (well - not quite - I do not generate GT_FIELDs, but they do eventually fold to the same LCL_FLDs in morph and result in the same codegen).
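To illustrate the shape of the field-by-field transform described above, here is a minimal, self-contained sketch in C++. It is a toy model, not the JIT's code: `ToyLayout`, `ImportKeepAlivePerField`, the textual IR strings, and the 8-byte slot size are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a struct layout: one flag per pointer-sized slot.
struct ToyLayout
{
    std::vector<bool> gcSlots; // true when the slot holds an object reference

    std::size_t SlotCount() const { return gcSlots.size(); }
    bool IsGCPtr(std::size_t slot) const { return gcSlots[slot]; }
};

// Emits one textual KEEPALIVE "statement" per object-reference slot of `temp`.
// The default slot size of 8 bytes is an assumption (a 64-bit target).
std::vector<std::string> ImportKeepAlivePerField(const ToyLayout&   layout,
                                                 const std::string& temp,
                                                 std::size_t        slotSize = 8)
{
    std::vector<std::string> stmts;
    for (std::size_t slot = 0; slot < layout.SlotCount(); slot++)
    {
        if (layout.IsGCPtr(slot))
        {
            // The field access is expressed as an indirection off the temp's address.
            stmts.push_back("KEEPALIVE(IND(ADD(ADDR(" + temp + "), " +
                            std::to_string(slot * slotSize) + ")))");
        }
    }
    return stmts;
}
```

In this model, non-GC slots produce no statement at all, so a struct without object fields yields an empty sequence.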

I included a small test to validate the basic functionality, but would appreciate feedback on more interesting scenarios (and ways to make the test itself more robust).

jkoritzinsky · 2021-06-18T16:32:58Z

For more context, this is for a mechanism to implement KeepAlive semantics for user-defined types with custom marshalling (like structs with delegate fields) in the DllImportGenerator project without allocating.

SingleAccretion · 2021-06-19T09:37:18Z

Running the diffs, there were some hits (to my surprise, I have to say). The test hit has been fixed in one of the commits.

Another case is BindSasl, where a struct is being kept alive, apparently so that freeing it in the finally block is OK...

src/tests/JIT/Methodical/Boxing/boxunbox/KeepAliveBoxOpt.cs

jkoritzinsky · 2021-06-22T18:23:07Z

cc: @dotnet/jit-contrib

JulieLeeMSFT · 2021-07-12T13:22:50Z

@sandreenko @dotnet/jit-contrib PTAL community PR.

sandreenko

> For more context, this is for a mechanism to implement KeepAlive semantics for user-defined types with custom marshalling (like structs with delegate fields) in the DllImportGenerator project without allocating.

was there a discord discussion or something about it?

sandreenko · 2021-07-23T22:33:53Z

src/tests/JIT/Methodical/Boxing/boxunbox/KeepAliveBoxOpt.cs

+    public byte AnotherByteField;
+
+    [FieldOffset(24)]
+    public SimpleStructWithExplicitLayout StructWithRef;


Nit: you can make an intersection and declare two object fields with the same offset if you want.

sandreenko · 2021-07-23T23:08:35Z

src/coreclr/jit/importer.cpp

+                {
+                    if (layout->IsGCPtr(slot))
+                    {
+                        // This is a very verbose way of saying gtNewLclFldNode.


Why was importation chosen for this transformation? Won't it work better after implicit byref resolution somewhere in morph? Are these IND(ADD(ADDR(LCL_VAR), OFFSET)) always transformed into GT_LCL_FLD?

> Why was importation chosen for this transformation?

The primary reason is that that's where the original code was. I can think of a few reasons that make it beneficial to do in import:

Adding statements is "natural", and KEEPALIVEs require new statements (except if one uses COMMAs I suppose, but I'd think it is better not to use them if we can).

Boxes are "fresh", i.e. there is little chance that some subsequent transform changed their shape somehow.

Implicit byrefs and normal locals are "the same" (no need to write code that handles both - though perhaps that code is trivial enough for this not to matter).

However, I am in no way as experienced as you in this area. What would be the advantage of doing it late(r)?

> Are these IND(ADD(ADDR(LCL_VAR), OFFSET)) always transformed into GT_LCL_FLD?

For locals, yes. I expect most of these structs will be implicit byrefs though, for them it turns into IND(ADD(LCL_VAR byref, offset)).

> Implicit byrefs and normal locals are "the same" (no need to write code that handles both - though perhaps that code is trivial enough for this not to matter).

For implicit byrefs, do we need to do anything? Aren't they reported as alive from the parent method? For example:

    struct S // more than 16 bytes, so implicit byref
    {
        long a, b, c;
    }

    void Foo()
    {
        S s = new S();
        Do(s); // <- implicit byref, s is on the stack and reported live for this line
    }

    void Do(S s) // <- s is implicit byref so we don't care about its liveness because the caller is reporting it.
    {
        CallUnmanagedApiWithSuppressGCTransition(s); // no need to report s as live in Do, it will be reported in Foo.
    }

> For implicit byrefs, do we need to do anything?

Sorry for not being explicit. I agree there are liveness concerns here, it is just convenient that the implicit byrefs haven't been expanded yet, so the copy elision code (the if (OperIsLocal()) { reuse the existing local } branch) "just works".

On that note, this copy elision is actually buggy: the check must be if (OperIs(GT_LCL_VAR)).
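The copy-elision rule discussed here can be sketched as a small decision function. This is a hypothetical model rather than the importer's code: `ToyOper`, `ToyNode`, `KeepAliveTarget`, and `PickKeepAliveTarget` are names invented for illustration, and the only behavior modeled is "reuse the source only when it is a whole local variable, otherwise copy into a fresh temp".

```cpp
// Hypothetical node kinds, loosely mirroring GT_LCL_VAR / GT_LCL_FLD / everything else.
enum class ToyOper
{
    LclVar, // a whole local variable
    LclFld, // a field (sub-range) of a local
    Other   // any other expression
};

struct ToyNode
{
    ToyOper  oper;
    unsigned lclNum; // only meaningful for LclVar / LclFld
};

struct KeepAliveTarget
{
    unsigned lclNum; // local whose address will be kept alive
    bool     copied; // true if a fresh temp (and a copy into it) was needed
};

// Reuses the box source only when it is a whole local; a local field would
// otherwise keep alive more than the fields it actually encloses.
KeepAliveTarget PickKeepAliveTarget(const ToyNode& boxSrc, unsigned& nextTempNum)
{
    if (boxSrc.oper == ToyOper::LclVar)
    {
        return {boxSrc.lclNum, false};
    }
    return {nextTempNum++, true};
}
```

A broader "is any kind of local" check would also accept `LclFld` in this model and return the parent local's number, which is the case the stricter check rules out.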

sandreenko · 2021-07-23T23:11:11Z

src/coreclr/jit/importer.cpp

+                        GenTree* field       = gtNewIndir(TYP_REF, fieldAddr);
+                        GenTree* keepalive   = gtNewKeepAliveNode(field);
+
+                        Statement* stmt = gtNewStmt(keepalive, impCurStmtOffs);


I expect such structs to always be on the stack (because they have GC fields and are used as a whole, so no independent promotion), so we don't have precise field liveness for them; we zero-init them and always report them as alive, don't we?

So my theory for such a test:

    S s = struct with GC fields;
    nativeCallThatTakesStructAndDoesSomethingWithGCField(s);
    GC.KeepAlive(s);

it will currently work even if you delete GC.KeepAlive(s), just because the struct S will be allocated on the stack and will be reported as live for the whole method. The only exception, in theory, could be a struct with a single object reference field:

    struct S { Object o; }

In this case the JIT can replace the struct with its field and do precise liveness for it.
Am I correct that it will work even without KeepAlive?

However, the previous example is just a current limitation.

What I don't understand is why we can't do this:

    if (layout->HasGCPtr())
    {
        GenTree* boxTemp    = gtNewLclvNode(boxTempNum, boxSrc->TypeGet());
        GenTree* boxSrcAddr = gtNewOperNode(GT_ADDR, TYP_BYREF, boxTemp);
        GenTree* keepalive  = gtNewKeepAliveNode(boxSrcAddr);
        addStmt.
    }

and keep one GT_KEEPALIVE for the struct, not for each GC field?

> Am I correct that it will work even without KeepAlive?

Some ad-hoc testing on my end says yes: larger structs survive, while the "one object" one dies.

> What I don't understand is why we can't do this:

Put simply I didn't know it could be done that way (I was also trying to emulate how the manual KeepAlive on each field would be imported, since that's known to work, presumably, and the only testing we'll ever have for this optimization is the little test I wrote here).

I suppose one possible concern here is from the following scenario.

Say the source for the box is an implicit byref. Let's also presume that the method we are compiling is a reverse PInvoke one, so the location of the actual shadow copy will be in an unmanaged frame. Let's also presume that the user code is keeping the object fields of the struct in question alive via some other means (and they're pinned, obviously).

Now, in the method, the user code does this:

    // Edit: UnmanagedCallersOnly only allows blittable types, but let's assume `MyStruct`
    // consists of `void*`s and we `Unsafe.As` the reference to it as one to a struct with `object`s
    void CallerIsUnmanaged(MyStruct myStructWithRefFields)
    {
        TurnOffGlobalKeepAliveForMyStructFields(); // Clears static fields or whatever. Keeps the fields pinned.
        CallUnmanagedApiWithSuppressGCTransition(&myStructWithRefFields); // No transition so no spilling.
        GC.KeepAlive(myStructWithRefFields);
    }

Do we risk not keeping the fields alive?

This is a completely made-up scenario from me, and I do not actually know if it would work or not (even with the current version of the code). I will test it though.

So after thinking about this and some testing, I've come to the conclusion that the concern that fields in structs in foreign frames won't be "kept alive" is "real", but also pre-existing and not something that much can (or should) be done about.

Structs in the frame of the method being compiled will be kept alive by virtue of being address-exposed if we take the address of the whole struct, while implicit byrefs will be kept alive by their owner method.

In terms of CQ, the current approach of field-by-field importation produces sequences of dummy loads into a register like mov rax, [fld1], mov rax, [fld2], etc. Address-exposing the struct will avoid that, but will obviously impede other possible optimizations, though presumably that will matter little. It does seem a little weird to generate a dummy use of a byref just so that the struct is marked address-exposed, thus indirectly making it live for the whole method, but perhaps that's by design.

I feel like I am a little behind, how do we get address exposed set for the struct? Do we set address exposed when using your current approach with fields?

> I feel like I am a little behind, how do we get address exposed set for the struct? Do we set address exposed when using your current approach with fields?

So if we use the address of the whole struct, it is as if we passed it to a call as an argument - that is why we'll have it address-exposed.

> Do we set address exposed when using your current approach with fields?

It appears so...
LocalAddressVisitor sets it, and only later does morph fold it back to a LCL_FLD. I initially got confused between the two :(. Given this fact, I see no advantage to keeping the field approach and will use the whole struct as you suggested.
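The observation that both shapes end up address-exposing the struct can be sketched with a toy model. Everything here is an illustrative assumption: `ToyLocal`, `TakeAddress`, `KeepAlivePerField`, and `KeepAliveWholeStruct` are invented names, and "forming `ADDR(local)` marks the local exposed" is a deliberate simplification of what the thread attributes to LocalAddressVisitor.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical local-variable descriptor.
struct ToyLocal
{
    std::string name;
    bool        addressExposed = false;
};

// Simplified rule: forming ADDR(local) marks the local as address-exposed.
std::string TakeAddress(ToyLocal& lcl)
{
    lcl.addressExposed = true;
    return "ADDR(" + lcl.name + ")";
}

// Field-by-field shape: one KEEPALIVE per object-reference offset.
std::vector<std::string> KeepAlivePerField(ToyLocal& lcl, const std::vector<std::size_t>& gcOffsets)
{
    std::vector<std::string> stmts;
    for (std::size_t offset : gcOffsets)
    {
        stmts.push_back("KEEPALIVE(IND(ADD(" + TakeAddress(lcl) + ", " + std::to_string(offset) + ")))");
    }
    return stmts;
}

// Whole-struct shape: a single KEEPALIVE of the struct's address.
std::vector<std::string> KeepAliveWholeStruct(ToyLocal& lcl)
{
    return {"KEEPALIVE(" + TakeAddress(lcl) + ")"};
}
```

In this model both functions leave the local address-exposed, so the per-field version only adds statements without buying any extra precision.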

> LocalAddressVisitor sets it, and only later does morph fold it back to a LCL_FLD. I initially got confused between the two :(. Given this fact, I see no advantage to keeping the field approach and will use the whole struct as you suggested.

It is not important when we fold them, because if the whole struct is marked as address exposed we can't do anything with the fields even if they are all in LCL_FLD form.

> So if we use the address of the whole struct, it is as if we passed it to a call as an argument - that is why we'll have it address-exposed.

what if the struct is passed by value?

> It is not important when we fold them, because if the whole struct is marked as address exposed we can't do anything with the fields even if they are all in LCL_FLD form.

Yes I understand. I just thought that it was the address visitor that folded them (and didn't set the address exposed flag in doing so), for some reason.

> what if the struct is passed by value?

You mean to KEEPALIVE? That would require codegen support, wouldn't it?

no, if the struct is 8-byte size or 16-byte on x64 unix/arm and it is passed to unmanaged call by value, is it marked as address exposed?

No, as you say, it is not. I think I misunderstood your original question...

Running diffs for win-x64, there was exactly one result, in this test. GC.KeepAlive here is used as a generic "make sure that there is a use for the variable beyond an assignment" function. The new optimization eliminates this use as the struct in question has no GC fields. Fix the test by using an actual generic "use me" function. Will be broken by advanced interprocedural analysis that is apparently not as far away as one would expect, but will do for now.

SingleAccretion · 2021-07-26T19:07:40Z

> was there a discord discussion or something about it?

Heh, yes, a small one :). @jkoritzinsky asked on the C# discord (#lowlevel channel) if the Jit performed box elision for KeepAlive, and I decided to see how hard it would be to make it do that.

jkoritzinsky · 2021-07-26T19:12:29Z

Specifically, I was looking at mechanisms for ensuring that the DllImportGenerator could provide support for keeping delegate fields alive in structs. Due to limited type information in ref assemblies, the generator cannot expect marshaled types to be accurately described. To ensure that all cases where a KeepAlive is needed are covered, the best option is to emit a KeepAlive in all cases. However, the cost of the box makes that not an option. @SingleAccretion was kind enough to take a stab at eliding the box, which would make KeepAlive usable in my scenario.

SingleAccretion · 2021-07-30T19:24:08Z

Since this is a bit of an old branch, I've gone ahead and rebased so as to be able to build with my current runtime fork.

jkoritzinsky

This implementation looks good to me based on my knowledge of the JIT.

AndyAyersMS

Thanks, this looks good.

Nice to see another customer for gtTryRemoveBoxUpstreamEffects.

ghost · 2021-09-02T22:42:18Z

Hello @jkoritzinsky!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (`@msftbot`) and give me an instruction to get started! Learn more here.

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 18, 2021

SingleAccretion force-pushed the Keepalive-Box-Opt branch from 1d83882 to e05bf6c June 18, 2021 13:32

SingleAccretion marked this pull request as ready for review June 19, 2021 07:04

jkoritzinsky reviewed Jun 19, 2021

src/tests/JIT/Methodical/Boxing/boxunbox/KeepAliveBoxOpt.cs (resolved)

JulieLeeMSFT requested a review from sandreenko July 12, 2021 13:21

JulieLeeMSFT assigned SingleAccretion Jul 12, 2021

JulieLeeMSFT added this to the 6.0.0 milestone Jul 12, 2021

terrajobst added the community-contribution label (indicates that the PR has been added by a community member) Jul 19, 2021

sandreenko reviewed Jul 23, 2021

sandreenko modified the milestones: 6.0.0, 7.0.0 Jul 23, 2021

SingleAccretion added 10 commits July 26, 2021 20:51

Prototype removing boxes from GC.KeepAlive

b416a3f

Instead, import KEEPALIVE(BOX) as a sequence of KEEPALIVE for object fields (flattened).

Clean things up

0ed6cc6

Move the importation to its own method. Add debug output. Rename a few things.

Add a test

a8bfba4

Basic verification of the new optimization.

Fix formatting

7fd3959

Disable the test on Mono

432a194

It relies on a precise GC.

Add a class field scenario to the test

116dca1

Use advanced DEBUG features for even nicer dumps

d67cc7c

Add two simple EH scenarios to the test

26cee1a

Fix the test

c51b07e

It is possible for a GC to occur just after the KeepAlive calls and for CheckSuccess to wrongly conclude the objects weren't kept alive.

SingleAccretion added 2 commits July 30, 2021 12:59

Only elide copies for GT_LCL_VARs

f843ec9

If the source is a GT_LCL_FLD, then we would only want to keep alive the fields it encloses (or, since TYP_STRUCT local fields are currently not supported, nothing).

Use the address of the whole struct for KEEPALIVE

9cfb36d

The field-by-field approach results in the struct getting address-exposed anyway, so there is no good reason to bloat the IR. Just have the address escape directly.

SingleAccretion force-pushed the Keepalive-Box-Opt branch from 5a1ec76 to 9cfb36d July 30, 2021 19:23

SingleAccretion changed the title ~~Import KEEPALIVE(BOX) as a sequence of KEEPALIVE(FIELD)~~ Import KEEPALIVE(BOX) as a sequence of KEEPALIVE(ADDR(TEMP)) Aug 4, 2021

SingleAccretion changed the title ~~Import KEEPALIVE(BOX) as a sequence of KEEPALIVE(ADDR(TEMP))~~ Import KEEPALIVE(BOX) as KEEPALIVE(ADDR(TEMP)) Aug 4, 2021

karelz mentioned this pull request Aug 18, 2021

[WinHttpHandler] Test failed: ReadAsStreamAsync_StreamCanReadIsFalseAfterDispose #57626

Closed

JulieLeeMSFT requested a review from jkoritzinsky August 31, 2021 15:14

jkoritzinsky approved these changes Aug 31, 2021

AndyAyersMS approved these changes Aug 31, 2021

Merge branch 'main' into Keepalive-Box-Opt

2ca811c

jkoritzinsky added the auto-merge label Sep 2, 2021

ghost merged commit 4173838 into dotnet:main Sep 3, 2021

SingleAccretion deleted the Keepalive-Box-Opt branch September 7, 2021 20:50

dotnet locked as resolved and limited conversation to collaborators Oct 7, 2021

This pull request was closed.
