New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SESE Loops: transform loops to have single exit block in non-cloning case. #19385
Conversation
@swift-ci please test tensorflow linux |
@swift-ci please test tensorflow macos |
@@ -298,8 +302,9 @@ static SILValue createTFIntegerConst(GraphFunctionDeviceInfo &deviceInfo, | |||
class SingleExitLoopTransformer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR descriptions gives a nice overview of how the code works. Would you like to also capture that as a comment block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the pr description, should br new_latch(p0, p1
be br new_latch(a0, a1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is br new_latch(p0, p1..)
. However there was a typo in the description. It should have been as follows: For an exit edge br exit_i(x, y) -> exit_i(a2, a3)
, change the source to br new_latch(p0, p1, x, y, p4, ..., pn)
. Essentially, we are passing x
and y
at the corresponding positions and use the state captured at the loop header for the other exit arguments. Intuitively, p_i
captures the value for exit argument a_i
at the new header.
(I also incorporated some of the text from the description into the comments.)
@@ -375,7 +389,7 @@ class SingleExitLoopTransformer { | |||
/// Equivalence classes induced by argument passing. | |||
llvm::EquivalenceClasses<SILValue> equivalentValues; | |||
/// exit blocks before the loop is transformed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description says:
Let exitBlocks be the set of exit blocks for the loop after the above transformation.
Is it before or after?
i believe it should before. (if it's "after", there would only be a single exit block.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is after the transformation in this PR. We can't have single exit block yet as we need to implement cloning.
@@ -323,6 +328,14 @@ class SingleExitLoopTransformer { | |||
llvm::DenseMap<SILValue, SILValue> | |||
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const; | |||
|
|||
/// Transform the loop by moving and cloning nodes (as needed) so | |||
/// that the nearest common post dominator of the current exiting blocks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current exiting blocks -> the current exit blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -323,6 +328,14 @@ class SingleExitLoopTransformer { | |||
llvm::DenseMap<SILValue, SILValue> | |||
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const; | |||
|
|||
/// Transform the loop by moving and cloning nodes (as needed) so | |||
/// that the nearest common post dominator of the current exiting blocks | |||
/// is a single exit block for the loop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is -> becomes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// top-level loop is already correct. | ||
} | ||
loop->addBasicBlockToLoop(outsideBlock, LI->getBase()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we do a sanity check that loop->getExitBlocks() should return a single block here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, as we will need to implement cloning of blocks as mentioned in the PR description.
canonicalizeAllLoops(&DI, &LI); | ||
if (canonicalizeAllLoops(&DI, &LI)) { | ||
// Recalculate PDI if canonicalization made any changes. | ||
PDI.recalculate(*F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already have PDI.recalculate(*F);
when if (loopChanged)
. why do we need this additional call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to take care of changes introduced by canonicalizeAllLoops, which happens before our transformations. There is no support to incrementall update PDI unlike DI and LI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we are not using PDI
before this call. should we remove PDI(F)
from the init list in c'tor, so that the call here becomes its initialization? That would make the code more readable for me, but you can decide.
also consider moving the canonicalizeAllLoops()
call to the c'tor (so that PDI
initialization code is done as part of initialization).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also noticed processAcyclicRegionExcludingEnd() may not need to be public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave the PDI as is for now. I plan to move the SingleExitTransformer class to a separate file as it is getting larger. I will do some of the refactoring then.
I made the non-interface methods private.
From a quick read of the code (not a deep algorithmic understanding) the patch LGTM! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
canonicalizeAllLoops(&DI, &LI); | ||
if (canonicalizeAllLoops(&DI, &LI)) { | ||
// Recalculate PDI if canonicalization made any changes. | ||
PDI.recalculate(*F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to take care of changes introduced by canonicalizeAllLoops, which happens before our transformations. There is no support to incrementall update PDI unlike DI and LI.
@@ -298,8 +302,9 @@ static SILValue createTFIntegerConst(GraphFunctionDeviceInfo &deviceInfo, | |||
class SingleExitLoopTransformer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is br new_latch(p0, p1..)
. However there was a typo in the description. It should have been as follows: For an exit edge br exit_i(x, y) -> exit_i(a2, a3)
, change the source to br new_latch(p0, p1, x, y, p4, ..., pn)
. Essentially, we are passing x
and y
at the corresponding positions and use the state captured at the loop header for the other exit arguments. Intuitively, p_i
captures the value for exit argument a_i
at the new header.
(I also incorporated some of the text from the description into the comments.)
// top-level loop is already correct. | ||
} | ||
loop->addBasicBlockToLoop(outsideBlock, LI->getBase()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, as we will need to implement cloning of blocks as mentioned in the PR description.
@@ -323,6 +328,14 @@ class SingleExitLoopTransformer { | |||
llvm::DenseMap<SILValue, SILValue> | |||
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const; | |||
|
|||
/// Transform the loop by moving and cloning nodes (as needed) so | |||
/// that the nearest common post dominator of the current exiting blocks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -323,6 +328,14 @@ class SingleExitLoopTransformer { | |||
llvm::DenseMap<SILValue, SILValue> | |||
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const; | |||
|
|||
/// Transform the loop by moving and cloning nodes (as needed) so | |||
/// that the nearest common post dominator of the current exiting blocks | |||
/// is a single exit block for the loop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -375,7 +389,7 @@ class SingleExitLoopTransformer { | |||
/// Equivalence classes induced by argument passing. | |||
llvm::EquivalenceClasses<SILValue> equivalentValues; | |||
/// exit blocks before the loop is transformed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is after the transformation in this PR. We can't have single exit block yet as we need to implement cloning.
I will go ahead and merge the PR. |
I could not break the PR further down. Hopefully the algorithm outline presented here helps.
Algorithm Outline
(Note that exiting blocks are inside the loop and exit blocks are outside the loop. Therefore, an exit edge is from an exiting block to an exit block.)
This PR takes care of transforming the loop such that the common post dominator of all exit blocks is the new exit block for the loop. This is one of the several transformations needed to eliminate undef from SESE loops. Consider the following snippet:
Recall that the blocks within
if
do not belong to the while loop in the SIL IR. The transformation implemented in this PR has the effect of moving the blocks back into the loop.Step 1. Transform Loop. (See
ensureSingleExitBlock
)nearestCommonPD
be the nearest common post-dominator block of all the exit blocks.nearestCommonPD
inside the loop. (It is not always possible to move all reachable blocks without cloning some of the them. See notes below.)exitBlocks
be the set of exit blocks for the loop after the above transformation.Step 2. Connect exiting blocks to a new latch and unify loop arguments. (See
patchEdges
)(Only changes related to exit arguments are shown here.)
exitArgs
be the union of the arguments of allexitBlocks
.exitArg
inexitArgs
, add an appropriate argument to the new header block. Letp0, p1, ..pn
be the new arguments corresponding to exit argumentsa0, a1, a2, ..., an
.br exit_i(x, y) -> exit_i(a2, a3)
, change the source tobr new_latch(p0, p1, x, y, p4, ..., pn)
Notes.
If a block is not dominated by the header of the loop, we cannot move it into the loop in Step 1. For such cases, we will need to clone the block before moving it into the loop. I will send out a separate PR for that purpose.