Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SESE Loops: transform loops to have single exit block in non-cloning case. #19385

Merged
merged 6 commits into from Sep 20, 2018

Conversation

bgogul
Copy link
Collaborator

@bgogul bgogul commented Sep 19, 2018

I could not break the PR further down. Hopefully the algorithm outline presented here helps.

Algorithm Outline

(Note that exiting blocks are inside the loop and exit blocks are outside the loop. Therefore, an exit edge is from an exiting block to an exit block.)

This PR takes care of transforming the loop such that the common post dominator of all exit blocks is the new exit block for the loop. This is one of the several transformations needed to eliminate undef from SESE loops. Consider the following snippet:

while(...) {
   if (...) {
     ... 
     break;
   }
}

Recall that the blocks within if do not belong to the while loop in the SIL IR. The transformation implemented in this PR has the effect of moving the blocks back into the loop.

Step 1. Transform Loop. (See ensureSingleExitBlock)

  • Let nearestCommonPD be the nearest common post-dominator block of all the exit blocks.
  • Move all the blocks reachable from exiting blocks to the nearestCommonPD inside the loop. (It is not always possible to move all reachable blocks without cloning some of the them. See notes below.)
  • Let exitBlocks be the set of exit blocks for the loop after the above transformation.

Step 2. Connect exiting blocks to a new latch and unify loop arguments. (See patchEdges)

(Only changes related to exit arguments are shown here.)

  • Let exitArgs be the union of the arguments of all exitBlocks.
  • for each exitArg in exitArgs, add an appropriate argument to the new header block. Let p0, p1, ..pn be the new arguments corresponding to exit arguments a0, a1, a2, ..., an.
  • For a exit edge br exit_i(x, y) -> exit_i(a2, a3), change the source to br new_latch(p0, p1, x, y, p4, ..., pn)

Notes.
If a block is not dominated by the header of the loop, we cannot move it into the loop in Step 1. For such cases, we will need to clone the block before moving it into the loop. I will send out a separate PR for that purpose.

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 19, 2018

@swift-ci please test tensorflow linux

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 19, 2018

@swift-ci please test tensorflow macos

@@ -298,8 +302,9 @@ static SILValue createTFIntegerConst(GraphFunctionDeviceInfo &deviceInfo,
class SingleExitLoopTransformer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR descriptions gives a nice overview of how the code works. Would you like to also capture that as a comment block?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the pr description, should br new_latch(p0, p1 be br new_latch(a0, a1?

Copy link
Collaborator Author

@bgogul bgogul Sep 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is br new_latch(p0, p1..). However there was a typo in the description. It should have been as follows: For an exit edge br exit_i(x, y) -> exit_i(a2, a3), change the source to br new_latch(p0, p1, x, y, p4, ..., pn). Essentially, we are passing x and y at the corresponding positions and use the state captured at the loop header for the other exit arguments. Intuitively, p_i captures the value for exit argument a_i at the new header.

(I also incorporated some of the text from the description into the comments.)

@@ -375,7 +389,7 @@ class SingleExitLoopTransformer {
/// Equivalence classes induced by argument passing.
llvm::EquivalenceClasses<SILValue> equivalentValues;
/// exit blocks before the loop is transformed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says:

Let exitBlocks be the set of exit blocks for the loop after the above transformation.

Is it before or after?

i believe it should before. (if it's "after", there would only be a single exit block.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is after the transformation in this PR. We can't have single exit block yet as we need to implement cloning.

@@ -323,6 +328,14 @@ class SingleExitLoopTransformer {
llvm::DenseMap<SILValue, SILValue>
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const;

/// Transform the loop by moving and cloning nodes (as needed) so
/// that the nearest common post dominator of the current exiting blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the current exiting blocks -> the current exit blocks?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -323,6 +328,14 @@ class SingleExitLoopTransformer {
llvm::DenseMap<SILValue, SILValue>
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const;

/// Transform the loop by moving and cloning nodes (as needed) so
/// that the nearest common post dominator of the current exiting blocks
/// is a single exit block for the loop.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is -> becomes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// top-level loop is already correct.
}
loop->addBasicBlockToLoop(outsideBlock, LI->getBase());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do a sanity check that loop->getExitBlocks() should return a single block here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, as we will need to implement cloning of blocks as mentioned in the PR description.

canonicalizeAllLoops(&DI, &LI);
if (canonicalizeAllLoops(&DI, &LI)) {
// Recalculate PDI if canonicalization made any changes.
PDI.recalculate(*F);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have PDI.recalculate(*F); when if (loopChanged). why do we need this additional call?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to take care of changes introduced by canonicalizeAllLoops, which happens before our transformations. There is no support to incrementall update PDI unlike DI and LI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we are not using PDI before this call. should we remove PDI(F) from the init list in c'tor, so that the call here becomes its initialization? That would make the code more readable for me, but you can decide.

also consider moving the canonicalizeAllLoops() call to the c'tor (so that PDI initialization code is done as part of initialization).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i also noticed processAcyclicRegionExcludingEnd() may not need to be public.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave the PDI as is for now. I plan to move the SingleExitTransformer class to a separate file as it is getting larger. I will do some of the refactoring then.

I made the non-interface methods private.

@lattner
Copy link
Collaborator

lattner commented Sep 19, 2018

From a quick read of the code (not a deep algorithmic understanding) the patch LGTM!

Copy link
Collaborator Author

@bgogul bgogul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

canonicalizeAllLoops(&DI, &LI);
if (canonicalizeAllLoops(&DI, &LI)) {
// Recalculate PDI if canonicalization made any changes.
PDI.recalculate(*F);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to take care of changes introduced by canonicalizeAllLoops, which happens before our transformations. There is no support to incrementall update PDI unlike DI and LI.

@@ -298,8 +302,9 @@ static SILValue createTFIntegerConst(GraphFunctionDeviceInfo &deviceInfo,
class SingleExitLoopTransformer {
Copy link
Collaborator Author

@bgogul bgogul Sep 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is br new_latch(p0, p1..). However there was a typo in the description. It should have been as follows: For an exit edge br exit_i(x, y) -> exit_i(a2, a3), change the source to br new_latch(p0, p1, x, y, p4, ..., pn). Essentially, we are passing x and y at the corresponding positions and use the state captured at the loop header for the other exit arguments. Intuitively, p_i captures the value for exit argument a_i at the new header.

(I also incorporated some of the text from the description into the comments.)

// top-level loop is already correct.
}
loop->addBasicBlockToLoop(outsideBlock, LI->getBase());
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, as we will need to implement cloning of blocks as mentioned in the PR description.

@@ -323,6 +328,14 @@ class SingleExitLoopTransformer {
llvm::DenseMap<SILValue, SILValue>
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const;

/// Transform the loop by moving and cloning nodes (as needed) so
/// that the nearest common post dominator of the current exiting blocks
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -323,6 +328,14 @@ class SingleExitLoopTransformer {
llvm::DenseMap<SILValue, SILValue>
getPreheaderSubstMap(const SmallPtrSetImpl<SILValue> &values) const;

/// Transform the loop by moving and cloning nodes (as needed) so
/// that the nearest common post dominator of the current exiting blocks
/// is a single exit block for the loop.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -375,7 +389,7 @@ class SingleExitLoopTransformer {
/// Equivalence classes induced by argument passing.
llvm::EquivalenceClasses<SILValue> equivalentValues;
/// exit blocks before the loop is transformed.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is after the transformation in this PR. We can't have single exit block yet as we need to implement cloning.

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 20, 2018

I will go ahead and merge the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants