Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

[No Merge] CopyBlk copyprop #24915

Closed
wants to merge 1 commit into from

Conversation

benaadams
Copy link
Member

@benaadams benaadams commented Jun 2, 2019

Propagate through struct copies backwards, picking up:

x1 (def) = x0 (last use)
x2 (def) = x1 (last use)

Converting to

x2 (def) = x0 (last use)

Next, propagate forwards through struct copies for usage, picking up:

x2 (def) = x0 (last use)
use x2 (last use)

Converting to

use x0 (last use)

Example change to <ReadBlockAsyncInternal>d__20:MoveNext():this

image

While it resolves almost all the copies in awaiting ValueTasks (and all the copies in the sample in https://github.com/dotnet/coreclr/issues/18542#issuecomment-497850725) in async/await one remains (as above) 😢

And

Sample change from Pipelines <ReadAsyncInternal>d__23:MoveNext():this

image

Resolves: https://github.com/dotnet/coreclr/issues/18542

Is a bit messy, but seeking feedback (assuming tests pass) /cc @CarolEidt @AndyAyersMS @mikedn @stephentoub

@benaadams benaadams force-pushed the copyprop-structs branch 2 times, most recently from 2a298dc to 1432b15 Compare June 3, 2019 02:39
@benaadams
Copy link
Member Author

Hmm... not sure what PHI_ARGs are; but going to assume this is a bad transform


***** BB158 (before)
N007 (  6,  5) [008761] -A------R---              *  ASG       struct (copy)
N006 (  3,  2) [008759] D------N----              +--*  LCL_VAR   struct V425 tmp405      d:3
N005 (  2,  2) [008760] ------------              \--*  PHI       struct
N001 (  0,  0) [008811] ------------                 +--*  PHI_ARG   struct V425 tmp405      u:2
N002 (  0,  0) [008809] ------------                 \--*  PHI_ARG   struct V425 tmp405      u:1 <l:$880, c:$47f>

N003 (  7,  5) [006686] -A------R---              *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [006684] D------N----              +--*  LCL_VAR   struct V421 tmp401      d:1
N001 (  3,  2) [006685] ------------              \--*  LCL_VAR   struct V425 tmp405      u:3 (last use) $6c8

Copy propagated to:

***** BB158 (after)
N007 (  6,  5) [008761] -A------R---              *  ASG       struct (copy)
N006 (???,???) [008936] D------N----              +--*  LCL_VAR   struct V421 tmp401      
N005 (  2,  2) [008760] ------------              \--*  PHI       struct
N001 (  0,  0) [008811] ------------                 +--*  PHI_ARG   struct V425 tmp405      u:2
N002 (  0,  0) [008809] ------------                 \--*  PHI_ARG   struct V425 tmp405      u:1 <l:$880, c:$47f>

N001 (  7,  5) [006686] ------------              *  NOP       void  

@mikedn
Copy link

mikedn commented Jun 3, 2019

Is there any evidence that this is somehow critical for application performance? I don't see something like this merged in 3.0...

@benaadams
Copy link
Member Author

benaadams commented Jun 3, 2019

Is there any evidence that this is somehow critical for application performance?

Async aside (which is one of my reasons for looking into it); its much more common now with the renaissance of struct types.

For example

private static bool DoJson()
{
    bool result = true;
    using (JsonDocument doc = JsonDocument.Parse("[true,false]"))
    {
        JsonElement parsedObject = doc.RootElement;
        bool first = parsedObject[0].GetBoolean();
        bool second = parsedObject[1].GetBoolean();
        result &= true == first;
        result &= false == second;
    }

    return result;
}

Calls into JsonDocument:Parse(struct,struct,byref,byref) which reserves 2888 bytes of stack and zeros out 708 bytes and V520 vals; with these shuffle copies being a fairly significant portion of its code gen:

; Assembly listing for method JsonDocument:Parse(struct,struct,byref,byref)
...
       4881EC480B0000       sub      rsp, 0xB48 ; 2888
       C5F877               vzeroupper 
       488BF1               mov      rsi, rcx
       488D7C2420           lea      rdi, [rsp+20H]
       B9C4020000           mov      ecx, 708
       33C0                 xor      rax, rax
       F3AB                 rep stosd 
...
G_M55736_IG13:
       C5FA6F8424000B0000   vmovdqu  xmm0, qword ptr [rsp+B00H]
       C5FA7F8424100B0000   vmovdqu  qword ptr [rsp+B10H], xmm0

G_M55736_IG14:
       C5FA6F8424100B0000   vmovdqu  xmm0, qword ptr [rsp+B10H]
       C5FA7F8424200B0000   vmovdqu  qword ptr [rsp+B20H], xmm0

... x254 more vmovdqus

; Total bytes of code 7892, prolog size 42 for method JsonDocument:Parse(struct,struct,byref,byref)

@AndyAyersMS
Copy link
Member

In SSA every use has one definition (the "single" in "Sparse Single Assignment"). If in classic dataflow multiple definitions can reach a use, SSA requires inserting a "pseudo-definition" (PHI) to merge all the uses into a single value. A PHI_ARG is an input to a PHI.

The jit is going to expect the local var for the destination and all sources of a PHI to match. So you should avoid modifying them.

cc @dotnet/jit-contrib

@benaadams benaadams force-pushed the copyprop-structs branch 3 times, most recently from 8366b93 to f100c7d Compare June 5, 2019 01:03
@benaadams
Copy link
Member Author

 ; Assembly listing for method JsonDocument:Parse(struct,struct,byref,byref)

-; Lcl frame size = 2888
+; Lcl frame size = 2056

 G_M55736_IG01:
       ...
-      sub      rsp, 0xB48
+      sub      rsp, 0x808
       vzeroupper 
       mov      rsi, rcx
       lea      rdi, [rsp+20H]
-      mov      ecx, 708
+      mov      ecx, 500
       xor      rax, rax
       rep stosd 

-; Total bytes of code 7892, prolog size 42 for method JsonDocument:Parse(struct,struct,byref,byref)
+; Total bytes of code 6947, prolog size 42 for method JsonDocument:Parse(struct,struct,byref,byref)

@benaadams
Copy link
Member Author

benaadams commented Jun 5, 2019

No regressions

Total bytes of diff: -901 (-0.02% of base)
    diff is an improvement.

Top file improvements by size (bytes):
        -901 : System.Private.CoreLib.dasm (-0.02% of base)

1 total files with size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements by size (bytes):
        -144 (-6.60% of base) : System.Private.CoreLib.dasm - <ReadLineAsyncInternal>d__61:MoveNext():this
        -144 (-4.07% of base) : System.Private.CoreLib.dasm - <ReadAsyncInternal>d__66:MoveNext():this
         -90 (-5.01% of base) : System.Private.CoreLib.dasm - <ReadBufferAsync>d__69:MoveNext():this
         -80 (-4.86% of base) : System.Private.CoreLib.dasm - <CopyToAsyncInternal>d__30:MoveNext():this
         -64 (-1.63% of base) : System.Private.CoreLib.dasm - <AsyncModeCopyToAsync>d__136:MoveNext():this
         -58 (-3.62% of base) : System.Private.CoreLib.dasm - <FlushAsyncInternal>d__82:MoveNext():this
         -48 (-5.59% of base) : System.Private.CoreLib.dasm - ValueTask`1:ConfigureAwait(bool):struct:this (7 methods)
         -45 (-4.04% of base) : System.Private.CoreLib.dasm - <ReadToEndAsync>d__14:MoveNext():this
         -45 (-4.86% of base) : System.Private.CoreLib.dasm - <ReadBlockAsyncInternal>d__20:MoveNext():this
         -39 (-4.57% of base) : System.Private.CoreLib.dasm - <ReadToEndAsyncInternal>d__63:MoveNext():this
         -36 (-4.05% of base) : System.Private.CoreLib.dasm - Vector:ConditionalSelect(struct,struct,struct):struct (8 methods)
         -24 (-4.98% of base) : System.Private.CoreLib.dasm - NativeRuntimeEventSource:.ctor():this (2 methods)
         -24 (-4.98% of base) : System.Private.CoreLib.dasm - FrameworkEventSource:.ctor():this (2 methods)
         -12 (-20.00% of base) : System.Private.CoreLib.dasm - Decimal:System.IConvertible.ToSingle(ref):float:this
         -12 (-21.05% of base) : System.Private.CoreLib.dasm - Decimal:System.IConvertible.ToDouble(ref):double:this
         -12 (-4.98% of base) : System.Private.CoreLib.dasm - ArrayPoolEventSource:.ctor():this
         -12 (-4.98% of base) : System.Private.CoreLib.dasm - TplEventSource:.ctor():this
         -12 (-10.08% of base) : System.Private.CoreLib.dasm - DecimalTypeInfo:WriteData(ref,struct):this

@benaadams
Copy link
Member Author

benaadams commented Jun 5, 2019

; Assembly listing for method <CopyToAsyncInternal>d__30:MoveNext():this

- ; Lcl frame size = 272
+ ; Lcl frame size = 192

...

-      sub      rsp, 272
+      sub      rsp, 192
-      lea      rbp, [rsp+130H]
+      lea      rbp, [rsp+E0H]
       mov      rsi, rcx
-      lea      rdi, [rbp-100H]
+      lea      rdi, [rbp-B0H]
-      mov      ecx, 54
+      mov      ecx, 34
       xor      rax, rax
       rep stosd 
       mov      rcx, rsi
-      mov      qword ptr [rbp-110H], rsp
+      mov      qword ptr [rbp-C0H], rsp

...

-G_M4501_IG08:
-       movdqu   xmm0, qword ptr [rbp-98H]
-       movdqu   qword ptr [rbp-A8H], xmm0
-
-G_M4501_IG09:
-       movdqu   xmm0, qword ptr [rbp-A8H]
-       movdqu   qword ptr [rbp-58H], xmm0
-
-G_M4501_IG10:
-       movdqu   xmm0, qword ptr [rbp-58H]
-       movdqu   qword ptr [rbp-B8H], xmm0
-
-G_M4501_IG11:
-       movdqu   xmm0, qword ptr [rbp-B8H]
-       movdqu   qword ptr [rbp-38H], xmm0

+G_M4501_IG08:
+       movdqu   xmm0, qword ptr [rbp-78H]
+       movdqu   qword ptr [rbp-38H], xmm0

...

-G_M4501_IG28:
-       movdqu   xmm0, qword ptr [rbp-C8H]
-       movdqu   qword ptr [rbp-88H], xmm0
-
-G_M4501_IG29:
-       movdqu   xmm0, qword ptr [rbp-88H]
-       movdqu   qword ptr [rbp-D8H], xmm0
-
-G_M4501_IG30:
-       movdqu   xmm0, qword ptr [rbp-D8H]
-       movdqu   qword ptr [rbp-68H], xmm0

+G_M4501_IG25:
+       movdqu   xmm0, qword ptr [rbp-88H]
+       movdqu   qword ptr [rbp-58H], xmm0


-; Total bytes of code 1519, prolog size 54 for method <CopyToAsyncInternal>d__30:MoveNext():this
+; Total bytes of code 1439, prolog size 54 for method <CopyToAsyncInternal>d__30:MoveNext():this

@benaadams benaadams marked this pull request as ready for review June 5, 2019 04:51
@benaadams
Copy link
Member Author

For System.Text.Json, no regressions

Total bytes of diff: -1493 (-0.44% of base)
    diff is an improvement.

Top file improvements by size (bytes):
       -1493 : System.Text.Json.dasm (-0.44% of base)

1 total files with size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements by size (bytes):
        -945 (-11.97% of base) : System.Text.Json.dasm - JsonDocument:Parse(struct,struct,byref,byref)
        -306 (-1.95% of base) : System.Text.Json.dasm - <ReadAsync>d__15`1:MoveNext():this (7 methods)
        -140 (-10.62% of base) : System.Text.Json.dasm - <ParseAsyncCore>d__53:MoveNext():this
         -64 (-2.57% of base) : System.Text.Json.dasm - <WriteAsyncCore>d__49:MoveNext():this
         -38 (-3.01% of base) : System.Text.Json.dasm - <FlushAsync>d__37:MoveNext():this

/cc @ahsonkhan @JeremyKuhne

@benaadams
Copy link
Member Author

For System.Threading.Channels, no regressions

Total bytes of diff: -882 (-0.48% of base)
    diff is an improvement.

Top file improvements by size (bytes):
        -882 : System.Threading.Channels.dasm (-0.48% of base)

1 total files with size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements by size (bytes):
        -318 (-4.85% of base) : System.Threading.Channels.dasm - <ReadAllAsync>d__5:MoveNext():this (7 methods)
        -294 (-4.60% of base) : System.Threading.Channels.dasm - <<ReadAsync>g__ReadAsyncCore|4_0>d:MoveNext():this (7 methods)
        -270 (-5.02% of base) : System.Threading.Channels.dasm - <WriteAsyncCore>d__4:MoveNext():this (7 methods)

/cc @stephentoub

@benaadams
Copy link
Member Author

For System.IO.Pipelines and Microsoft.AspNetCore.Server.Kestrel, no regressions

Total bytes of diff: -3674 (-0.34% of base)
    diff is an improvement.

Top file improvements by size (bytes):
       -2562 : Microsoft.AspNetCore.Server.Kestrel.Core.dasm (-0.42% of base)
        -873 : System.IO.Pipelines.dasm (-1.88% of base)
        -170 : Microsoft.AspNetCore.Http.dasm (-0.15% of base)
         -48 : Microsoft.AspNetCore.WebUtilities.dasm (-0.07% of base)
         -21 : Microsoft.AspNetCore.Http.Abstractions.dasm (-0.03% of base)

5 total files with size differences (5 improved, 0 regressed), 6 unchanged.

Top method improvements by size (bytes):
       -1176 (-2.29% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ProcessRequestsAsync>d__58`1:MoveNext():this (7 methods)
        -336 (-1.34% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ProcessRequests>d__202`1:MoveNext():this (7 methods)
        -224 (-8.02% of base) : System.IO.Pipelines.dasm - <CopyToAsyncCore>d__11:MoveNext():this
        -224 (-8.90% of base) : System.IO.Pipelines.dasm - <ReadAsyncInternal>d__23:MoveNext():this
        -168 (-4.45% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ProcessDataWrites>d__32:MoveNext():this
        -141 (-10.06% of base) : System.IO.Pipelines.dasm - <CopyFromAsync>d__11:MoveNext():this
        -123 (-4.90% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <WriteAsyncAwaited>d__254:MoveNext():this
         -96 (-4.73% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ReadAsync>d__10:MoveNext():this
         -82 (-4.00% of base) : System.IO.Pipelines.dasm - <ReadAsync>d__27:MoveNext():this
         -78 (-2.90% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <PumpAsync>d__16:MoveNext():this
         -75 (-10.95% of base) : System.IO.Pipelines.dasm - <<GetFlushResultAsTask>g__AwaitTask|24_0>d:MoveNext():this
         -50 (-2.51% of base) : Microsoft.AspNetCore.Http.dasm - <ReadAsync>d__27:MoveNext():this
         -48 (-2.11% of base) : Microsoft.AspNetCore.Http.dasm - <ReadAsyncInternal>d__32:MoveNext():this
         -48 (-1.80% of base) : Microsoft.AspNetCore.Http.dasm - <CopyToAsyncInternal>d__34:MoveNext():this
         -48 (-3.29% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ReadAsync>d__12:MoveNext():this
         -48 (-2.81% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <OnConsumeAsyncAwaited>d__4:MoveNext():this
         -48 (-3.10% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <TryReadPrefaceAsync>d__60:MoveNext():this
         -48 (-3.27% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ReadAsync>d__11:MoveNext():this
         -48 (-1.14% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <WriteOutputAsync>d__17:MoveNext():this
         -48 (-2.21% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ReadAsyncInternal>d__24:MoveNext():this
         -48 (-3.16% of base) : Microsoft.AspNetCore.WebUtilities.dasm - <ReadFormAsync>d__26:MoveNext():this
         -46 (-2.88% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - <ReadInputAsync>d__18:MoveNext():this
         -46 (-9.77% of base) : System.IO.Pipelines.dasm - Pipe:GetReadResult(byref):this
         -44 (-2.09% of base) : System.IO.Pipelines.dasm - <FlushAsyncInternal>d__32:MoveNext():this
         -37 (-8.20% of base) : Microsoft.AspNetCore.Server.Kestrel.Core.dasm - HttpProtocol:ProduceContinue():this

37 total methods with size differences (37 improved, 0 regressed), 5212 unchanged.

/cc @davidfowl @pakrym @halter73

@benaadams
Copy link
Member Author

Sample change from Pipelines <ReadAsyncInternal>d__23:MoveNext():this

image

@davidfowl
Copy link
Member

/azp run coreclr-ci

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 24915 in repo dotnet/coreclr

@davidfowl
Copy link
Member

Feeling like a second class citizen on coreclr 😄.

cc @jkotas @AndyAyersMS @stephentoub can you kick the build

@stephentoub
Copy link
Member

/azp run coreclr-ci

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@benaadams
Copy link
Member Author

Hmm... didn't restart the failed one coreclr-ci (Test Pri0 Linux x64_checked)

@stephentoub
Copy link
Member

didn't restart the failed one

I just kicked it manually in AzDo.

@@ -468,6 +849,10 @@ void Compiler::optVnCopyProp()

optBlockCopyProp(block, &curSsaName);

optCopyPropFoldCopyBlks(block);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do not use curSsaName why do you need to put these in VN copy prop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it out to a simple block loop?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I'll have to take a close look at the problem you're trying to solve and figure out what might be a good solution, if any. My initial impression was that there are a bunch of LCL_FLDs involved and those are difficult to deal with in the current state of the JIT.

The only thing that is trivial is the example I already mentioned, the one with x2 = x0.f0;. But fixing that shouldn't require all this complication, it might be something that local assertion propagation could handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be something that local assertion propagation could handle

Right -- seems like the cases we can easily handle (say an entire LCL_FLD source or dest for an unexposed local) would fit in pretty well in assertion prop.

Copy link
Member Author

@benaadams benaadams Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy for someone to try a different approach (or do this better); mostly am interested in the superfluous copies and the additional stack space they require going away,

// Propagate backwards, picking up:
//
// x1 (def) = x0 (last use)
// x2 (def) = x1 (last use)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why x0 and x1 need to be last-use.

x1 = x0
x2 = x1

is

x1 = x0
x2 = x0

when x1 is a local. Or at least a non volatile memory location.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree; but I'm not very familiar with the Jit code so thought I'd start with a conservative change where I just removed variables that didn't serve much function (purely pass-through); then follow up where the copy is aliasing as I assumed it would be more involved?

op = op->AsIndir()->Addr()->gtGetOp1();
}

if ((op->OperGet() == GT_LCL_VAR || op->OperGet() == GT_LCL_FLD))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything in this code that would make GT_LCL_FLD a valid optimization target, except in one particular case:

x0 = x1;
x2 = x0.f0;

that can usually be transformed into

x0 = x1;
x2 = x1.f0;

with some precautions in case x0 and x1 do not have the same type. I don't see what precautions have you taken to ensure that this and only this case is handled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for the optCopyPropThroughCopyBlk version rather than optCopyPropFoldCopyBlks?

Which precautions would I need?

It currently checks the are the exact same size; and that both are last use so converts

(def) x0 = x1;    (last use)
      x2 = x0.f0; (last use)

to

      x2 = x1.f0; (last use)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which precautions would I need?

For example, the following transform would be obviously incorrect (unless f0 happens to be the only field):

x0.f0 = x1.f0;
x2 = x0;

to

x0.f0 = x1.f0;
x2 = x1;

Copy link
Member Author

@benaadams benaadams Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is one of the LCL_FLDs it changes; removing V446 in its entirely

CopyBlk based forward copy assertion for forward copy assertion for [006950] V446 @000003B1 by [006911] V444 @00000003.

***** (before)
N003 (  7,  5) [006957] -A------R---  *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [006955] D------N----  +--*  LCL_VAR   struct V446 tmp426      d:1
N001 (  3,  2) [006911] -------N----  \--*  LCL_VAR   struct V444 tmp424      u:1 (last use) $6c8

N003 (  3,  4) [006966] -A--G---R---  *  ASG       byref  $3b1
N002 (  1,  1) [006965] D------N----  +--*  LCL_VAR   byref  V447 tmp427      d:1 $3b1
N001 (  3,  4) [006950] ------------  \--*  LCL_FLD   byref  V446 tmp426      u:1[+0] Fseq[_pointer, _value] (last use) $3b1

Copy propagated to:

***** (after)
N001 (  7,  5) [006957] ------------  *  NOP       void  

     (  3,  4) [006967] ------------  *  STMT      void  (IL   ???...  ???)
N003 (  3,  3) [006966] -A--G---R---  \--*  ASG       byref  $3b1
N002 (  1,  1) [006965] D------N----     +--*  LCL_VAR   byref  V447 tmp427      d:1 $3b1
N001 (  3,  2) [008934] -------N----     \--*  LCL_VAR   byref  V444 tmp424       (last use)

Which is why I'm using creation and death and checking all the sizes are the same; if its created in one statement for a copy and then dies in the next, it removes it and uses the original

Copy link
Member Author

@benaadams benaadams Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For <ReadAsyncInternal>d__66:MoveNext to does the following conversions

CopyBlk based forward copy assertion for reverse copy assertion for [002494] V112 @00000003 by [002488] V112 @00000003.

***** (before)
N003 (  7,  5) [002491] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [002490] D---G--N----   +--*  LCL_VAR   struct(AX) V16 loc15        
N001 (  3,  2) [002488] ------------   \--*  LCL_VAR   struct V112 tmp93       u:2 (last use) $456

N007 ( 13, 15) [002503] -A--G---R---   *  ASG       struct (copy) $VN.Void
N006 (  6,  7) [002502] n-----------   +--*  BLK(16)   struct
N005 (  3,  5) [002501] ------------   |  \--*  ADDR      byref  $8bd
N004 (  3,  4) [002494] D------N----   |     \--*  LCL_FLD   struct V112 tmp93       d:2[+0] Fseq[_value]
N003 (  6,  7) [002499] n---G-------   \--*  IND       struct $5c5
N002 (  3,  5) [002495] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [002498] -------N----         \--*  LCL_FLD   struct V09 loc8         u:3[+0] Fseq[_value] (last use) $5c5

Copy propagated to:

***** (after)
N001 (  7,  5) [002491] ------------   *  NOP       void  

     ( 13, 15) [002504] ------------   *  STMT      void  (IL 0x2CE...  ???)
N005 ( 10, 10) [002503] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005873] D---G--N----      +--*  LCL_VAR   struct(AX) V16 loc15        
N003 (  6,  7) [002499] n---G-------      \--*  IND       struct $5c5
N002 (  3,  5) [002495] ----G-------         \--*  ADDR      byref  $18f
N001 (  3,  4) [002498] -------N----            \--*  LCL_FLD   struct V09 loc8         u:3[+0] Fseq[_value] (last use) $5c5

CopyBlk based forward copy assertion for reverse copy assertion for [002426] V09 @00000003 by [002498] V09 @00000003.

***** (before)
N005 ( 10, 10) [002503] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005873] D---G--N----   +--*  LCL_VAR   struct(AX) V16 loc15        
N003 (  6,  7) [002499] n---G-------   \--*  IND       struct $5c5
N002 (  3,  5) [002495] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [002498] -------N----         \--*  LCL_FLD   struct V09 loc8         u:3[+0] Fseq[_value] (last use) $5c5

N003 (  7,  5) [002427] -A------R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [002426] D------N----   +--*  LCL_VAR   struct V09 loc8         d:3
N001 (  3,  2) [002424] ------------   \--*  LCL_VAR   struct V108 tmp89       u:2 (last use) $588

Copy propagated to:

***** (after)
N001 ( 10, 10) [002503] ------------   *  NOP       void  

     (  7,  5) [000268] ------------   *  STMT      void  (IL   ???...  ???)
N003 (  7,  5) [002427] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005874] D---G--N----      +--*  LCL_VAR   struct(AX) V16 loc15        
N001 (  3,  2) [002424] ------------      \--*  LCL_VAR   struct V108 tmp89       u:2 (last use) $588

CopyBlk based forward copy assertion for reverse copy assertion for [002465] V108 @00000003 by [002424] V108 @00000003.

***** (before)
N003 (  7,  5) [002427] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005874] D---G--N----   +--*  LCL_VAR   struct(AX) V16 loc15        
N001 (  3,  2) [002424] ------------   \--*  LCL_VAR   struct V108 tmp89       u:2 (last use) $588

N005 ( 10, 10) [002472] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  6,  7) [002471] n-----------   +--*  BLK(16)   struct
N003 (  3,  5) [002470] ------------   |  \--*  ADDR      byref  $8bc
N002 (  3,  4) [002465] D------N----   |     \--*  LCL_FLD   struct V108 tmp89       d:2[+0] Fseq[_value]
N001 (  3,  2) [002467] ----G--N----   \--*  LCL_VAR   struct(AX) V107 tmp88        $455

Copy propagated to:

***** (after)
N001 (  7,  5) [002427] ------------   *  NOP       void  

     ( 10, 10) [002473] ------------   *  STMT      void  (IL 0x2C4...  ???)
N003 (  7,  5) [002472] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005875] D---G--N----      +--*  LCL_VAR   struct(AX) V16 loc15        
N001 (  3,  2) [002467] ----G--N----      \--*  LCL_VAR   struct(AX) V107 tmp88        $455

CopyBlk based forward copy assertion for reverse copy assertion for [001471] V64 @00000003 by [001465] V64 @00000003.

***** (before)
N003 (  7,  5) [001468] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [001467] D---G--N----   +--*  LCL_VAR   struct(AX) V13 loc12        
N001 (  3,  2) [001465] ------------   \--*  LCL_VAR   struct V64 tmp45        u:2 (last use) $45c

N007 ( 13, 15) [001480] -A--G---R---   *  ASG       struct (copy) $VN.Void
N006 (  6,  7) [001479] n-----------   +--*  BLK(16)   struct
N005 (  3,  5) [001478] ------------   |  \--*  ADDR      byref  $c4d
N004 (  3,  4) [001471] D------N----   |     \--*  LCL_FLD   struct V64 tmp45        d:2[+0] Fseq[_value]
N003 (  6,  7) [001476] n---G-------   \--*  IND       struct $5c6
N002 (  3,  5) [001472] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [001475] -------N----         \--*  LCL_FLD   struct V09 loc8         u:4[+0] Fseq[_value] (last use) $5c6

Copy propagated to:

***** (after)
N001 (  7,  5) [001468] ------------   *  NOP       void  

     ( 13, 15) [001481] ------------   *  STMT      void  (IL 0x18C...  ???)
N005 ( 10, 10) [001480] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005876] D---G--N----      +--*  LCL_VAR   struct(AX) V13 loc12        
N003 (  6,  7) [001476] n---G-------      \--*  IND       struct $5c6
N002 (  3,  5) [001472] ----G-------         \--*  ADDR      byref  $18f
N001 (  3,  4) [001475] -------N----            \--*  LCL_FLD   struct V09 loc8         u:4[+0] Fseq[_value] (last use) $5c6

CopyBlk based forward copy assertion for reverse copy assertion for [001403] V09 @00000003 by [001475] V09 @00000003.

***** (before)
N005 ( 10, 10) [001480] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005876] D---G--N----   +--*  LCL_VAR   struct(AX) V13 loc12        
N003 (  6,  7) [001476] n---G-------   \--*  IND       struct $5c6
N002 (  3,  5) [001472] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [001475] -------N----         \--*  LCL_FLD   struct V09 loc8         u:4[+0] Fseq[_value] (last use) $5c6

N003 (  7,  5) [001404] -A------R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [001403] D------N----   +--*  LCL_VAR   struct V09 loc8         d:4
N001 (  3,  2) [001401] ------------   \--*  LCL_VAR   struct V60 tmp41        u:2 (last use) $58c

Copy propagated to:

***** (after)
N001 ( 10, 10) [001480] ------------   *  NOP       void  

     (  7,  5) [000562] ------------   *  STMT      void  (IL   ???...  ???)
N003 (  7,  5) [001404] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005877] D---G--N----      +--*  LCL_VAR   struct(AX) V13 loc12        
N001 (  3,  2) [001401] ------------      \--*  LCL_VAR   struct V60 tmp41        u:2 (last use) $58c

CopyBlk based forward copy assertion for reverse copy assertion for [001442] V60 @00000003 by [001401] V60 @00000003.

***** (before)
N003 (  7,  5) [001404] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005877] D---G--N----   +--*  LCL_VAR   struct(AX) V13 loc12        
N001 (  3,  2) [001401] ------------   \--*  LCL_VAR   struct V60 tmp41        u:2 (last use) $58c

N005 ( 10, 10) [001449] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  6,  7) [001448] n-----------   +--*  BLK(16)   struct
N003 (  3,  5) [001447] ------------   |  \--*  ADDR      byref  $c4c
N002 (  3,  4) [001442] D------N----   |     \--*  LCL_FLD   struct V60 tmp41        d:2[+0] Fseq[_value]
N001 (  3,  2) [001444] ----G--N----   \--*  LCL_VAR   struct(AX) V59 tmp40         $45b

Copy propagated to:

***** (after)
N001 (  7,  5) [001404] ------------   *  NOP       void  

     ( 10, 10) [001450] ------------   *  STMT      void  (IL 0x182...  ???)
N003 (  7,  5) [001449] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005878] D---G--N----      +--*  LCL_VAR   struct(AX) V13 loc12        
N001 (  3,  2) [001444] ----G--N----      \--*  LCL_VAR   struct(AX) V59 tmp40         $45b

CopyBlk based forward copy assertion for reverse copy assertion for [001056] V45 @00000003 by [001050] V45 @00000003.

***** (before)
N003 (  7,  5) [001053] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [001052] D---G--N----   +--*  LCL_VAR   struct(AX) V07 loc6         
N001 (  3,  2) [001050] ------------   \--*  LCL_VAR   struct V45 tmp26        u:2 (last use) $443

N007 ( 13, 15) [001065] -A--G---R---   *  ASG       struct (copy) $VN.Void
N006 (  6,  7) [001064] n-----------   +--*  BLK(16)   struct
N005 (  3,  5) [001063] ------------   |  \--*  ADDR      byref  $190
N004 (  3,  4) [001056] D------N----   |     \--*  LCL_FLD   struct V45 tmp26        d:2[+0] Fseq[_value]
N003 (  6,  7) [001061] n---G-------   \--*  IND       struct $5c0
N002 (  3,  5) [001057] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [001060] -------N----         \--*  LCL_FLD   struct V09 loc8         u:2[+0] Fseq[_value] (last use) $5c0

Copy propagated to:

***** (after)
N001 (  7,  5) [001053] ------------   *  NOP       void  

     ( 13, 15) [001066] ------------   *  STMT      void  (IL 0x048...  ???)
N005 ( 10, 10) [001065] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005879] D---G--N----      +--*  LCL_VAR   struct(AX) V07 loc6         
N003 (  6,  7) [001061] n---G-------      \--*  IND       struct $5c0
N002 (  3,  5) [001057] ----G-------         \--*  ADDR      byref  $18f
N001 (  3,  4) [001060] -------N----            \--*  LCL_FLD   struct V09 loc8         u:2[+0] Fseq[_value] (last use) $5c0

CopyBlk based forward copy assertion for reverse copy assertion for [000988] V09 @00000003 by [001060] V09 @00000003.

***** (before)
N005 ( 10, 10) [001065] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  3,  2) [005879] D---G--N----   +--*  LCL_VAR   struct(AX) V07 loc6         
N003 (  6,  7) [001061] n---G-------   \--*  IND       struct $5c0
N002 (  3,  5) [001057] ----G-------      \--*  ADDR      byref  $18f
N001 (  3,  4) [001060] -------N----         \--*  LCL_FLD   struct V09 loc8         u:2[+0] Fseq[_value] (last use) $5c0

N003 (  7,  5) [000989] -A------R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [000988] D------N----   +--*  LCL_VAR   struct V09 loc8         d:2
N001 (  3,  2) [000986] ------------   \--*  LCL_VAR   struct V41 tmp22        u:2 (last use) $581

Copy propagated to:

***** (after)
N001 ( 10, 10) [001065] ------------   *  NOP       void  

     (  7,  5) [000790] ------------   *  STMT      void  (IL   ???...  ???)
N003 (  7,  5) [000989] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005880] D---G--N----      +--*  LCL_VAR   struct(AX) V07 loc6         
N001 (  3,  2) [000986] ------------      \--*  LCL_VAR   struct V41 tmp22        u:2 (last use) $581

CopyBlk based forward copy assertion for reverse copy assertion for [001027] V41 @00000003 by [000986] V41 @00000003.

***** (before)
N003 (  7,  5) [000989] -A--G---R---   *  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005880] D---G--N----   +--*  LCL_VAR   struct(AX) V07 loc6         
N001 (  3,  2) [000986] ------------   \--*  LCL_VAR   struct V41 tmp22        u:2 (last use) $581

N005 ( 10, 10) [001034] -A--G---R---   *  ASG       struct (copy) $VN.Void
N004 (  6,  7) [001033] n-----------   +--*  BLK(16)   struct
N003 (  3,  5) [001032] ------------   |  \--*  ADDR      byref  $18e
N002 (  3,  4) [001027] D------N----   |     \--*  LCL_FLD   struct V41 tmp22        d:2[+0] Fseq[_value]
N001 (  3,  2) [001029] ----G--N----   \--*  LCL_VAR   struct(AX) V40 tmp21         $442

Copy propagated to:

***** (after)
N001 (  7,  5) [000989] ------------   *  NOP       void  

     ( 10, 10) [001035] ------------   *  STMT      void  (IL 0x03E...  ???)
N003 (  7,  5) [001034] -A--G---R---   \--*  ASG       struct (copy) $VN.Void
N002 (  3,  2) [005881] D---G--N----      +--*  LCL_VAR   struct(AX) V07 loc6         
N001 (  3,  2) [001029] ----G--N----      \--*  LCL_VAR   struct(AX) V40 tmp21         $442

@mikedn
Copy link

mikedn commented Jun 6, 2019

Get Unhandled exception. System.IndexOutOfRangeException: Index was outside the bounds of the array. in Regex when using --pmi on the changes

Stupid me, I forgot to add the "base" LCL_FLD offset to the new LCL_FLDs.

@AndyAyersMS
Copy link
Member

Similar complications over in assertion prop -- the assertion needs to record the field seq and offset, and make sure all this gets updated.

@mikedn
Copy link

mikedn commented Jun 6, 2019

Similar complications over in assertion prop -- the assertion needs to record the field seq and offset, and make sure all this gets updated.

I suppose you mean that optCreateAssertion needs to parse field seq/offset to figure out the case x.f == y.f implies x == y when f is the only field. Or are you trying to build something more advanced that works even when f is not the only field?

@benaadams
Copy link
Member Author

I forgot to add the "base" LCL_FLD offset to the new LCL_FLDs.

Your update fixes it nicely master...mikedn:copy-block-lcl-fld

pmi diffs

Total bytes of diff: -36430 (-0.09% of base)
    diff is an improvement.

Top file regressions by size (bytes):
         970 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.02% of base)
         103 : System.Diagnostics.FileVersionInfo.dasm (2.48% of base)
          97 : System.Security.AccessControl.dasm (0.11% of base)
          81 : System.Reflection.DispatchProxy.dasm (0.32% of base)
          24 : System.Net.WebClient.dasm (0.05% of base)
           7 : Microsoft.Extensions.DependencyModel.dasm (0.02% of base)

Top file improvements by size (bytes):
       -7460 : System.Private.Xml.dasm (-0.21% of base)
       -6026 : System.Private.CoreLib.dasm (-0.13% of base)
       -1965 : Microsoft.CodeAnalysis.dasm (-0.11% of base)
       -1963 : Microsoft.DotNet.Cli.Utils.dasm (-2.47% of base)
       -1957 : Newtonsoft.Json.dasm (-0.22% of base)
       -1728 : System.Security.Cryptography.Algorithms.dasm (-0.60% of base)
       -1714 : System.Net.Http.dasm (-0.29% of base)
       -1404 : Microsoft.CodeAnalysis.CSharp.dasm (-0.03% of base)
       -1344 : System.Reflection.Metadata.dasm (-0.30% of base)
       -1071 : System.Data.Common.dasm (-0.07% of base)
        -853 : System.Security.Cryptography.Cng.dasm (-0.56% of base)
        -679 : System.IO.FileSystem.dasm (-0.60% of base)
        -632 : NuGet.Protocol.Core.v3.dasm (-0.24% of base)
        -612 : System.Linq.Parallel.dasm (-0.04% of base)
        -584 : System.Threading.Tasks.Dataflow.dasm (-0.07% of base)
        -570 : System.Collections.Immutable.dasm (-0.05% of base)
        -540 : System.Security.Cryptography.X509Certificates.dasm (-0.32% of base)
        -471 : CommandLine.dasm (-0.10% of base)
        -459 : System.Private.DataContractSerialization.dasm (-0.06% of base)
        -403 : System.Collections.dasm (-0.07% of base)
        -331 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.01% of base)
        -330 : System.Private.Xml.Linq.dasm (-0.21% of base)
        -329 : System.Text.RegularExpressions.dasm (-0.11% of base)
        -253 : System.Net.Security.dasm (-0.15% of base)
        -243 : NuGet.Packaging.dasm (-0.14% of base)

103 total files with size differences (97 improved, 6 regressed), 26 unchanged.

@AndyAyersMS
Copy link
Member

Was thinking of recording field seq and offset in the assertion... this lays the groundwork for general copy prop ability through unaliased DNER struct fields but I was thinking we'd only leverage it for struct copy prop early on (I think local prop already has this restriction, so it doesn't arbitrarily extend lifetimes).

In my #18542 repro case the middle copy is not yet generating a copy assertion as it ends up as a more complex tree:

[000107] -A--G+------              *  ASG       struct (copy)
[000106] n----+------              +--*  BLK(16)   struct
[000105] -----+------              |  \--*  ADDR      byref 
[000098] D----+-N----              |     \--*  LCL_FLD   struct V05 tmp2       [+0] Fseq[_value]
[000103] n---G+------              \--*  IND       struct
[000099] ----G+------                 \--*  ADDR      byref 
[000102] -----+-N----                    \--*  LCL_FLD   struct V01 loc1       [+0] Fseq[_value]

This breaks the assertion copy chain logic.

We could work harder on canonicalizing struct copy trees or generalize assertion gen and prop to handle these cases. Seems like the former is the way to go.

@AndyAyersMS
Copy link
Member

Seems like fgMorphBlockOperand could also allow BLK(ADDR(LCL_FLD)) --> LCL_FLD? Let's see.

@benaadams
Copy link
Member Author

An non-async pattern that also exhibits these shuffles is the following (from the Span tests)

private static int DoSpan()
{
    bool result = true;
    for (int length = 2; length < 32; length++)
    {
        char[] a = new char[length];
        for (int i = 0; i < length; i++)
        {
            a[i] = 'a';
        }
        a[length - 1] = ' ';
        string s1 = new string(a);

        ReadOnlySpan<char> ros = s1.AsSpan();
        result &= ros.Slice(0, length - 1).SequenceEqual(ros.Trim());
        result &= ros.SequenceEqual(ros.TrimStart());
        result &= ros.Slice(0, length - 1).SequenceEqual(ros.TrimEnd());

        Span<char> span = new Span<char>(a);
        result &= span.Slice(0, length - 1).SequenceEqual(span.Trim());
        result &= span.SequenceEqual(span.TrimStart());
        result &= span.Slice(0, length - 1).SequenceEqual(span.TrimEnd());

        Memory<char> mem = new Memory<char>(a);
        result &= mem.Slice(0, length - 1).Span.SequenceEqual(mem.Trim().Span);
        result &= mem.Span.SequenceEqual(mem.TrimStart().Span);
        result &= mem.Slice(0, length - 1).Span.SequenceEqual(mem.TrimEnd().Span);

        ReadOnlyMemory<char> rom = new ReadOnlyMemory<char>(a);
        result &= rom.Slice(0, length - 1).Span.SequenceEqual(rom.Trim().Span);
        result &= rom.Span.SequenceEqual(rom.TrimStart().Span);
        result &= rom.Slice(0, length - 1).Span.SequenceEqual(rom.TrimEnd().Span);
    }

    return result ? 1 : 0;
}

Though this PR didn't pick some of them up as they had intervening statements

G_M40146_IG60:
       vmovdqu  xmm0, qword ptr [rsp+840H]
       vmovdqu  qword ptr [rsp+AE8H], xmm0 ; <---

G_M40146_IG61:
       vmovdqu  xmm0, qword ptr [rsp+B08H] ; blocker
       vmovdqu  qword ptr [rsp+828H], xmm0 ;

G_M40146_IG62:
       vmovdqu  xmm0, qword ptr [rsp+AE8H] ; <---
       vmovdqu  qword ptr [rsp+818H], xmm0

@AndyAyersMS
Copy link
Member

intervening statements

Local copy prop can get past this (at least within a block)

As for my changes... rationalize is not quite ready to handle LCL_FLD top level structs (expects them to live under BLK/OBJ). Probably not too hard to fix. Work in progress here: AndyAyersMS@21aa811

But I think we can see getting assertion based copy prop to handle these copy cases will require a number of enabling changes. And raises questions about how far we could or should take this.

Perhaps it is time to step back and reconcile this effort with the work planned for first class structs. We have some good examples now of places where we organically see struct copies and copy chains.

cc @CarolEidt.

@benaadams
Copy link
Member Author

Other one is the new System.Text.Json apis which for

private static void DoJson()
{
    using (JsonDocument doc = JsonDocument.Parse("[true,false]"))
    {  }
}

Which for JsonDocument:Parse(struct,struct,byref,byref) with tiering and R2R off produces x120 of these copy pairs:

vmovdqu  xmm0, qword ptr [rsp+...H]
vmovdqu  qword ptr [rsp+...H], xmm0

Not sure how many are static copies vs mutable data however

@mikedn
Copy link

mikedn commented Jun 7, 2019

Seems like fgMorphBlockOperand could also allow BLK(ADDR(LCL_FLD)) --> LCL_FLD? Let's see.

That should be possible on the RHS of a GT_ASG but it's not going to work currently on the LHS because LCL_FLD does not store struct type info (size, class handle, GC layout). #21705 is intended to make adding such struct type info easier. Once we have struct typed LCL_FLDs we should be able to get rid of the IsLocalAddrExpr hassle (and then get rid of GT_ASG).

@CarolEidt
Copy link

Sorry to be late to this party - what a great conversation!

Interestingly, I've been looking at copyprop in the context of #24912 (see my comment here: https://github.com/dotnet/coreclr/issues/24912#issuecomment-499901356), and it clearly needs some improvement. It seems to me that it should be able to handle these cases without requiring two passes, though I haven't done enough analysis/experimentation to be totally confident of that. Extending it to handle fields seems to be valuable and logical.

I'm eager for #21705 as I think that will really improve our ability to better normalize struct handling with that of primitive value types, and I think that's where we should start after 3.0.

I doubt that it's reasonable to consider any of this before 3.0 is branched (@AndyAyersMS would you agree?) but one thing that would be useful would be to capture the test cases so that we don't lose track of them - perhaps we could add a tests/src/JIT/Directed/copyprop directory.

@AndyAyersMS
Copy link
Member

I doubt that it's reasonable to consider any of this before 3.0 is branched (@AndyAyersMS would you agree?)

Yes, I agree. Hopefully that day is not too far off. In the meantime we can still experiment....

@AndyAyersMS
Copy link
Member

Forgot I had already worked up code to have assertion prop handle struct indirs... let me dust that off and add LCL_FLD support to it.

@mikedn
Copy link

mikedn commented Jun 7, 2019

Forgot I had already worked up code to have assertion prop handle struct indirs... let me dust that off and add LCL_FLD support to it.

It looks like I'm not the only one sitting on mountain of branches and stashes containing various experiments 😄

@AndyAyersMS
Copy link
Member

It looks like I'm not the only one sitting on mountain of branches and stashes containing various experiments

True enough -- the main CoreCLR repo I work in has 342 branches, most of them mine. I try to push the more interesting ones up to my public fork. It only has 156 branches. Probably a good number are branches that got merged that I never got around to deleting.

I really need some better way to keep track of them all.

@ReubenBond
Copy link
Member

I really need some better way to keep track of them all.

Some git UIs (eg GitExtensions, which is written in C# 😉) support a tree view if you structure your branch names with /s, eg: feature/membership-versioning

Example with GitExtensions

image

@AndyAyersMS
Copy link
Member

Integrated that and get a bit further. Now we're more flexible about consuming struct equality assertions during local prop, but still not flexible enough about producing them.

Part of the challenge here is the variety of tree shapes we see during morph. We really should try to produce canonical forms earlier.

@mikedn
Copy link

mikedn commented Jun 8, 2019

Part of the challenge here is the variety of tree shapes we see during morph. We really should try to produce canonical forms earlier.

Well, I have a stash for that too, it produces LCL_FLDs in LocalAddressVisitor, I'm gonna clean it up and push it to a branch. Still, it doesn't handle struct typed LCL_FLDs so perhaps it's probably not that useful until we get support for such LCL_FLDs.

@mikedn
Copy link

mikedn commented Jun 11, 2019

FWIW this commit generates LCL_FLDs early, in LocalAddressVisitor: mikedn@40c2ba1

Of course, for structs it keeps the OBJ node (but still transforms the address tree into ADDR(LCL_FLD)). Interestingly, this generates a few diffs. Looks like the current morph code that generates LCL_FLDs leaves GTF_GLOB_REF set and removing it results in some reordering.

I'm going to look into support struct typed LCL_FLDs on the RHS of ASG. That may be a good first step for more general struct typed LCL_FLDs.

@benaadams benaadams changed the title CopyBlk copyprop [No Merge] CopyBlk copyprop Jun 13, 2019
@mikedn
Copy link

mikedn commented Jun 26, 2019

While working on enabling struct typed LCL_FLDs I noticed that copies involving struct promotion aren't the only cases where we end up using INDs to access local variables, this also happens when copying register sized structs. The generated IR is:

     ( 13, 15) [000049] ------------              *  STMT      void  (IL 0x021...  ???)
N007 ( 13, 15) [000048] -A--G---R---              \--*  ASG       int   
N006 (  6,  7) [000047] *---G--N----                 +--*  IND       int   
N005 (  3,  5) [000046] ------------                 |  \--*  ADDR      byref 
N004 (  3,  4) [000040] U------N----                 |     \--*  LCL_FLD   struct V02 loc1         [+8] Fseq[_value2]
N003 (  6,  7) [000344] n-----------                 \--*  IND       int   
N002 (  3,  5) [000343] ------------                    \--*  ADDR      byref 
N001 (  3,  4) [000042] ------------                       \--*  LCL_FLD   struct V03 loc2         [+8] Fseq[_value2]

and the generated asm is:

       488D442470           lea      rax, bword ptr [rsp+70H]
       8B00                 mov      eax, dword ptr [rax]
       488D942480000000     lea      rdx, bword ptr [rsp+80H]
       8902                 mov      dword ptr [rdx], eax

The never ending LEA saga. And this happens in a case where you'd think that things should work just fine (register sized structs). Oh well, not sure how this should be handled. Probably NotAField LCL_FLD would cause other issues. Though it should probably work fine post-rationalization.

@mikedn
Copy link

mikedn commented Jul 16, 2019

@CarolEidt How do you feel about a serious refactoring of fgMorphCopyBlock and related functions? I've been trying to make some progress on adding struct typed LCL_FLD support but all this code is extremely convoluted and confusing - early declaration/unsuitable reuse of variable, gotos, redundant/impossible logical conditions, dubious comments, nothing's missing. Right now I'm trying to clean it up just for my own understanding but perhaps that work shouldn't go to waste?

@mikedn
Copy link

mikedn commented Aug 5, 2019

FWIW I gathered various changes I'm working on and included them in one my WIP PRs so they go through testing:

  • fgMorphCopyBlock improvements that allows it to generate LCL_FLDs rather than indirs that leave variables address exposed
  • fgMorphCopyBlock improvements that allow folding of a struct field offset into promoted field accesses
  • early generation of LCL_FLDs in LocalAddressVisitor
  • struct typed LCL_FLDs on the RHS of ASG

So far tests passed. Next thing will be the fun stuff - struct typed LCL_FLDs on the LHS of ASG.

@CarolEidt
Copy link

@CarolEidt How do you feel about a serious refactoring of fgMorphCopyBlock and related functions? I've been trying to make some progress on adding struct typed LCL_FLD support but all this code is extremely convoluted and confusing - early declaration/unsuitable reuse of variable, gotos, redundant/impossible logical conditions, dubious comments, nothing's missing. Right now I'm trying to clean it up just for my own understanding but perhaps that work shouldn't go to waste?

I would be very happy to see a good refactoring of the struct-related fgMorph methods. In evolving the handling of structs, I've often opted for minimal refactoring to get the next increment of either IR change or code improvement, with the result that there is often redundant and/or overly-pessimistic code remaining.

@mikedn
Copy link

mikedn commented Aug 5, 2019

I would be very happy to see a good refactoring of the struct-related fgMorph methods. In evolving the handling of structs, I've often opted for minimal refactoring to get the next increment of either IR change or code improvement, with the result that there is often redundant and/or overly-pessimistic code remaining.

OK, the monster PR (#21711) includes that too. In fact it includes almost all the struct related work I have so far. Trying to get something working end to end before starting to split it into more manageable PRs.

@maryamariyan
Copy link
Member

Thank you for your contribution. As announced in dotnet/coreclr#27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

  1. In your coreclr repository clone, create patch by running git format-patch origin
  2. In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>

@BruceForstall BruceForstall added the post-consolidation PRs which will be hand ported to dotnet/runtime label Nov 7, 2019
@maryamariyan
Copy link
Member

Thank you for your contribution. As announced in #27549 the dotnet/runtime repository will be used going forward for changes to this code base. Closing this PR as no more changes will be accepted into master for this repository. If you’d like to continue working on this change please move it to dotnet/runtime.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen post-consolidation PRs which will be hand ported to dotnet/runtime
Projects
None yet
10 participants