Decouple call store operands from local ret buf optimization #68469
* Move GenTreeCall::GetLclRetBufArgNode -> Compiler::gtCallGetDefinedRetBufLclAddr and change it to not try to look into stores
* Assert that the node we are returning actually defines a local that has lvHiddenBufferStructArg set
* Assert that call morphing does not break our recognition of the defined local when optimizing
Tagging subscribers to this area: @JulieLeeMSFT
Thank you for addressing this!
I would say it would probably be prudent to run some stress over this to avoid surprises with struct field addresses being GLOB_REF-ed.
src/coreclr/jit/gentree.cpp
Outdated
// This may be called very late to check validity of LIR.
if (node->IsCopyOrReload())
{
    node = node->AsCopyOrReload()->gtGetOp1();
}
Nit: this is gtSkipReloadOrCopy.
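For readers unfamiliar with the pattern in the nit above, here is a minimal sketch of "skip one copy/reload wrapper before inspecting the node". It is a toy model under stated assumptions: `Oper`, `Node`, and `SkipReloadOrCopy` are hypothetical names invented for illustration and are not the JIT's `GenTree` types or helpers.

```cpp
// Toy model only: these types are hypothetical and are not the JIT's GenTree.
enum class Oper { Copy, Reload, LclVarAddr, Other };

struct Node
{
    Oper  oper;
    Node* op1; // wrapped operand for Copy/Reload, otherwise nullptr
};

// Return the node underneath a single Copy/Reload wrapper, or the node itself.
Node* SkipReloadOrCopy(Node* node)
{
    if (node->oper == Oper::Copy || node->oper == Oper::Reload)
    {
        return node->op1;
    }
    return node;
}
```

Callers that only care about the underlying address node can then call the helper instead of repeating the `if` at every use site.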
src/coreclr/jit/morph.cpp
Outdated
@@ -7453,7 +7457,7 @@ GenTree* Compiler::fgMorphTailCallViaHelpers(GenTreeCall* call, CORINFO_TAILCALL
 JITDUMP("Removing retbuf");

 call->gtArgs.Remove(call->gtArgs.GetRetBufferArg());
-call->gtCallMoreFlags &= ~GTF_CALL_M_RETBUFFARG;
+call->gtCallMoreFlags &= ~(GTF_CALL_M_RETBUFFARG | GTF_CALL_M_RETBUFFARG_LCLOPT);
In general the rule is that if you remove GTF_CALL_M_RETBUFFARG_LCLOPT, you need to address-expose the underlying local (since removing the flag makes the definition invisible to liveness). I suppose here we can elide that because the logical definition occurs outside the method (since it is a tailcall)?
Perhaps worth a comment (or not - maybe it is self-evident).
Right, things have already gone very wrong if we think we have a tailcall with a retbuf that can be optimized with this optimization.
I will just revert this change; we will hit the assert inside of gtCallGetDefinedRetBufLclAddr if the flag is set and we don't have a retbuf.
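The general rule stated in this thread (clearing the "defines a local" flag should go together with address-exposing that local) can be sketched as follows. This is a toy model, not the JIT's code: `CallFlags`, `Local`, `Call`, and `ClearRetBufLclOpt` are hypothetical names assumed purely for illustration.

```cpp
// Toy model only: hypothetical flags and types, not the JIT's data structures.
enum CallFlags : unsigned
{
    CALL_RETBUF        = 1u << 0, // call has a return buffer argument
    CALL_RETBUF_LCLOPT = 1u << 1, // return buffer is a local the call is known to define
};

struct Local
{
    bool addressExposed = false;
};

struct Call
{
    unsigned flags       = 0;
    Local*   retBufLocal = nullptr;
};

// Drop the "defines a local" marking. Once the flag is gone, the definition is
// no longer visible through the call, so the local is conservatively marked
// address-exposed.
void ClearRetBufLclOpt(Call& call)
{
    if ((call.flags & CALL_RETBUF_LCLOPT) == 0)
    {
        return; // nothing to undo
    }

    call.flags &= ~static_cast<unsigned>(CALL_RETBUF_LCLOPT);

    if (call.retBufLocal != nullptr)
    {
        call.retBufLocal->addressExposed = true;
    }
}
```

Bundling both steps in one helper is just one way to keep the flag and the exposure state from drifting apart; the tailcall case discussed here is the exception where the flag is never expected to be set in the first place.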
@@ -16180,7 +16180,7 @@ bool GenTree::DefinesLocal(Compiler* comp, GenTreeLclVarCommon** pLclVarTree, bo
 // Returns true if this GenTree defines a result which is based on the address of a local.
 bool GenTree::DefinesLocalAddr(Compiler* comp, unsigned width, GenTreeLclVarCommon** pLclVarTree, bool* pIsEntire)
 {
-    if (OperGet() == GT_ADDR || OperGet() == GT_LCL_VAR_ADDR)
+    if (OperIs(GT_ADDR, GT_LCL_VAR_ADDR, GT_LCL_FLD_ADDR))
Nit: this local morph comment is now outdated.
Do you know why DefinesLocalAddr was not recognizing the pattern in the first place? Since it supports communicating "entirety" it seems pretty uncontroversial.
> Do you know why DefinesLocalAddr was not recognizing the pattern in the first place?
Not really, I suppose. If I were to guess, it is the result of LCL_FLD being a somewhat late addition, and the original version of the function only recognizing ADDR(LCL_VAR).
I do believe there are no correctness dependencies on it not recognizing LCL_FLD_ADDR.
(Side note: the contract of DefinesLocalAddr is that it must recognize a superset of addresses LocalAddressVisitor recognizes. The absence of FLD_ADDR seemingly violates this, but not actually, because it does recognize ADDR(LCL_FLD), and in LIR we rely on use-def flags set by the front-end.)
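To make the shapes named in this thread concrete, here is a minimal sketch of a predicate that recognizes all four address-of-local forms (ADDR(LCL_VAR), ADDR(LCL_FLD), LCL_VAR_ADDR, LCL_FLD_ADDR) and reports whether the address covers the whole local. It is a simplified toy: `Oper`, `Node`, and `IsLocalAddr` are hypothetical, and unlike the real `DefinesLocalAddr` it ignores the `width` parameter and simply treats any field address as partial.

```cpp
// Toy model only: hypothetical opers and node layout, not the JIT's GenTree.
enum class Oper { Addr, LclVar, LclFld, LclVarAddr, LclFldAddr, Other };

struct Node
{
    Oper        oper;
    const Node* op1; // operand of Addr, otherwise nullptr
};

// True if the node has one of the four address-of-local shapes.
// *isEntire is written only on success: true for a whole-local address,
// false for a field address (a simplification; no width check is done).
bool IsLocalAddr(const Node& node, bool* isEntire)
{
    switch (node.oper)
    {
        case Oper::LclVarAddr:
            *isEntire = true;
            return true;

        case Oper::LclFldAddr:
            *isEntire = false;
            return true;

        case Oper::Addr:
            if (node.op1 == nullptr)
            {
                return false;
            }
            if (node.op1->oper == Oper::LclVar)
            {
                *isEntire = true;
                return true;
            }
            if (node.op1->oper == Oper::LclFld)
            {
                *isEntire = false;
                return true;
            }
            return false;

        default:
            return false;
    }
}
```

Because the "entire" result is communicated separately, accepting the field forms does not weaken callers that only care about whole-local definitions.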
/azp run runtime-coreclr jitstress, runtime-coreclr outerloop
Azure Pipelines successfully started running 2 pipeline(s).
Looks good, just a question in sideeffects.cpp.
@@ -17864,6 +17864,57 @@ bool Compiler::gtIsStaticGCBaseHelperCall(GenTree* tree)
     return false;
 }

+//------------------------------------------------------------------------
+// gtCallGetDefinedRetBufLclAddr:
+// Get the tree corresponding to the address of the retbuf taht this call defines.
-// Get the tree corresponding to the address of the retbuf taht this call defines.
+// Get the tree corresponding to the address of the retbuf that this call defines.
Let me include it as part of #68460 unless there are other changes.
if (retBufArgNode != nullptr)
{
    // If a copy/reload is inserted by LSRA, retrieve the returnBuffer
Why is this no longer needed? I don't recall which test, but I added it much later when trying to debug a test failure.
I have moved it into gtCallGetDefinedRetBufLclAddr instead, since it also needs to skip it for the assertion there.
Ah, didn't notice that. Thanks!
Some of the helix jobs on outerloop/jitstress windows-arm64 machines timed out. Not sure why as the job lists all seem to have OSX x64 failure is #67816.
LGTM
Compiler::DefinesLocalAddr to check for GT_LCL_FLD_ADDR