Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple call store operands from local ret buf optimization #68469

Merged
merged 3 commits into from
Apr 26, 2022

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Apr 24, 2022

  • Move GenTreeCall::GetLclRetBufArgNode ->
    Compiler::gtCallGetDefinedRetBufLclAddr and change it to not try to
    look into stores
  • Assert that the node we are returning actually defines a local that
    has lvHiddenBufferStructArg set
  • Assert that call morphing does not break our recognition of the defined
    local when optimizing
  • Update Compiler::DefinesLocalAddr to check for GT_LCL_FLD_ADDR

* Move GenTreeCall::GetLclRetBufArgNode ->
  Compiler::gtCallGetDefinedRetBufLclAddr and change it to not try to
  look into stores
* Assert that the node we are returning actually defines a local that
  has lvHiddenBufferStructArg set
* Assert that call morphing does break our recognition of the defined
  local when optimizing
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 24, 2022
@ghost ghost assigned jakobbotsch Apr 24, 2022
@ghost
Copy link

ghost commented Apr 24, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Move GenTreeCall::GetLclRetBufArgNode ->
    Compiler::gtCallGetDefinedRetBufLclAddr and change it to not try to
    look into stores
  • Assert that the node we are returning actually defines a local that
    has lvHiddenBufferStructArg set
  • Assert that call morphing does break our recognition of the defined
    local when optimizing
  • Update Compiler::DefinesLocalAddr to check for GT_LCL_FLD_ADDR
Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

Copy link
Contributor

@SingleAccretion SingleAccretion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing this!

I would say it would probably be prudent to run some stress over this to avoid surprises with struct field addresses being GLOB_REF-ed.

Comment on lines 17906 to 17910
// This may be called very late to check validity of LIR.
if (node->IsCopyOrReload())
{
node = node->AsCopyOrReload()->gtGetOp1();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is gtSkipReloadOrCopy.

@@ -7453,7 +7457,7 @@ GenTree* Compiler::fgMorphTailCallViaHelpers(GenTreeCall* call, CORINFO_TAILCALL
JITDUMP("Removing retbuf");

call->gtArgs.Remove(call->gtArgs.GetRetBufferArg());
call->gtCallMoreFlags &= ~GTF_CALL_M_RETBUFFARG;
call->gtCallMoreFlags &= ~(GTF_CALL_M_RETBUFFARG | GTF_CALL_M_RETBUFFARG_LCLOPT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general the rule is that if you remove GTF_CALL_M_RETBUFFARG_LCLOPT, you need to address-expose the underlying local (since it makes it invisible to liveness), I suppose here we can elide that because the logical definition occurs outside the method (since it is a tailcall)?

Perhaps worth a comment (or not - maybe it is self-evident).

Copy link
Member Author

@jakobbotsch jakobbotsch Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, things have already gone very wrong if we think we have a tailcall with a retbuf that can be optimized with this optimization.
I will just revert this change, we will hit the assert inside of gtCallGetDefinedRetBufLclAddr if the flag is set and we don't have a retbuf.

@@ -16180,7 +16180,7 @@ bool GenTree::DefinesLocal(Compiler* comp, GenTreeLclVarCommon** pLclVarTree, bo
// Returns true if this GenTree defines a result which is based on the address of a local.
bool GenTree::DefinesLocalAddr(Compiler* comp, unsigned width, GenTreeLclVarCommon** pLclVarTree, bool* pIsEntire)
{
if (OperGet() == GT_ADDR || OperGet() == GT_LCL_VAR_ADDR)
if (OperIs(GT_ADDR, GT_LCL_VAR_ADDR, GT_LCL_FLD_ADDR))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this local morph comment is now outdated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why DefinesLocalAddr was not recognizing the pattern in the first place? Since it supports communicating "entirety" it seems pretty uncontroversial.

Copy link
Contributor

@SingleAccretion SingleAccretion Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why DefinesLocalAddr was not recognizing the pattern in the first place?

Not really I suppose. If I were to guess, it is the result of LCL_FLD being a somewhat late addition, and the original version of the function only recognizing ADDR(LCL_VAR).

I do believe there are no correctness dependencies on it not recognizing LCL_FLD_ADDR.

(Side note: the contract of DefinesLocalAddr is that it must recognize a superset of addresses LocalAddressVisitor recognizes, absence of FLD_ADDR seemingly violates this, but not actually, because it does recognize ADDR(LCL_FLD), and in LIR we rely on use-def flags set by the front-end)

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Apr 25, 2022

This is currently waiting for #68484.

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a question in sideeffects.cpp.

@@ -17864,6 +17864,57 @@ bool Compiler::gtIsStaticGCBaseHelperCall(GenTree* tree)
return false;
}

//------------------------------------------------------------------------
// gtCallGetDefinedRetBufLclAddr:
// Get the tree corresponding to the address of the retbuf taht this call defines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Get the tree corresponding to the address of the retbuf taht this call defines.
// Get the tree corresponding to the address of the retbuf that this call defines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me include it as part of #68460 unless there are other changes.

if (retBufArgNode != nullptr)
{
// If a copy/reload is inserted by LSRA, retrieve the returnBuffer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this no longer needed? I don't recall which test, but I added it much later when trying to debug a test failure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved it into gtCallGetDefinedRetBufLclAddr instead, since it also needs to skip it for the assertion there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't notice that. Thanks!

@ghost ghost added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Apr 26, 2022
@jakobbotsch
Copy link
Member Author

Some of the helix jobs on outerloop/jitstress windows-arm64 machines timed out. Not sure why as the job lists all seem to have "State": "Finished", but in previous runs they sometimes take a very long time too and there are some timeouts previously too, so will assume it's unrelated to this change.

OSX x64 failure is #67816.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jakobbotsch jakobbotsch merged commit ae9f1ca into dotnet:main Apr 26, 2022
@jakobbotsch jakobbotsch deleted the retbuf-def-asserts branch April 26, 2022 16:04
@ghost ghost locked as resolved and limited conversation to collaborators May 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants