Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate undefs wherever possible by finding def-free paths through loop. #19193

Merged
merged 12 commits into from Sep 13, 2018

Conversation

bgogul
Copy link
Collaborator

@bgogul bgogul commented Sep 7, 2018

Eliminate undefs during canonicalization by finding an def-free path from the escaping value to a value defined in the node dominated that dominates the preheader.

This is a first in the series of PRs to completely eliminate undefs during SESE loop canoncialization.

Partially resolves SR-7765.

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 7, 2018

@swift-ci please test tensorflow linux

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 7, 2018

@swift-ci please test tensorflow macos

@bgogul bgogul requested review from mhong and lattner September 7, 2018 17:43
@mhong
Copy link
Contributor

mhong commented Sep 8, 2018

Eliminate undefs during canonicalization by finding an def-free path from the escaping value to a value defined in the node dominated that dominates the preheader.

I can't quite parse this sentence. Can you please clarify, and perhaps give an example (based on the unit test) on what that path could be? The code checks on "domination", but here we use the concept "def-free" -- can you elaborate on the relationship?

Copy link
Contributor

@mhong mhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is one of the steps, I'm approving it to unblock incremental progress. We can continue the discussion on the side.


using namespace swift;
using namespace tf;

static llvm::cl::opt<bool> TFEnsureSingleLoopExit(
"tf-ensure-single-loop-exit", llvm::cl::init(true),
llvm::cl::desc("Transform loops to have a single exit from header."));
static llvm::cl::opt<bool> TFNoUndefsInSESE(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the intended use of this flag?

are there possible bugs. if yes it'd be good to try addressing them before flipping to true.

Note if the bug causes wrong results, user may not be aware of the root cause and thus "turning off the flag" in that case would not be a useful workaround.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly for debugging. I just needed a way to flip to the old behavior without recompiling the binary. (Will update the comment accordingly.) I don't expect the flag to be in the code base for long.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG

@@ -361,8 +369,10 @@ class SingleExitLoopTransformer {
// the new latchBlock as appropriate.
SmallVector<std::pair<const SILBasicBlock *, const SILBasicBlock *>, 8>
edgesToFix;
/// Identify the set of values that escape the loop.
llvm::SmallPtrSet<SILValue, 8> escapingValues;
/// Identify the set of values that escape the loop. The keys represent the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keys -> key?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -386,13 +396,17 @@ void SingleExitLoopTransformer::initialize() {
}
}

computeEscapingValues();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be cleaner to:

  1. make computeEscapingValues() return a dense map, and assign that to escapingValueSubstMap. this makes it clear what the (side)effects of computeEscapingValues() are. even better, mark computeEscapingValues() static if that's possible.

  2. can we even make escapingValueSubstMap a const member, and set it in the c'tor initializer list?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 is a good idea. Instead of static, I made this a const function.
2 is not possible. In subsequent PRs, I will do CFG transformations before computing the escaping values.

if (!TFNoUndefsInSESE) return;

// Find a def-free path from the escaping value to the preheader for each
// escaping value. If no such path is found, set the escaping value at
Copy link
Contributor

@mhong mhong Sep 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in what cases can we not find such a path? an example (or maybe unit test) could help.

it's useful to be able to characterize the cases that we can remove (or cannot remove) undefs, because then we can assess if the gained benefits outweigh the added complexity. also that'll give us a mental model of looking at user code, and predicting if undefs will be there.

I will add a unit test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately, we should never have undefs when I finish all the PRs.

do..while is a simple case, where we cannot find a def-free path:
{
var count:Int32 = 0
var sum:Int32 = 0
repeat {
sum += count
count += 1
} while (count < 100)
return sum
}
In this case, the only way is to clone the body of the loop as follows:
{
var count:Int32 = 0
var sum:Int32 = 0
sum += count
count += 1
stayInLoop = (count < 100) ? true : false
while (stayInLoop) {
sum += count
count += 1
stayInLoop = (count < 100) ? true : false
}
return sum
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test case that is similar to the one outlined here. In this case, we will still produce undefs, but will eliminate it eventually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

kv.second = current;
break;
}
} else if (auto *arg = dyn_cast<SILArgument>(current)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these the only two cases relevant here (are we guaranteed to never see an undef at this point)? if yes it'd be good to assert on the second case (or assert there is no other case).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, these are the cases that will lead to a def-free path. I think this part will become clearer if I rewrite this in terms of computing equivalence classes induced by argument passing. It will be more efficient and cleaner. Let me rewrite this code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote teh code to use equivalance classes which conveys the intention better and is more efficient.

@@ -470,7 +524,8 @@ SingleExitLoopTransformer::createNewHeader() {
}
header->dropAllArguments();
// Add phi arguments in the new header corresponding to the escaping values.
for (const auto &escapingValue : escapingValues) {
for (const auto &kv : escapingValueSubstMap) {
const SILValue& escapingValue = kv.first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe SILValue is a lightweight wrapper, and we need not take a reference. (making a copy is fine)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for (const SILValue &escapingValue : escapingValues) {
newArgs.push_back(getUndef(escapingValue->getType()));
for (const auto &kv : escapingValueSubstMap) {
newArgs.push_back(kv.second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a note for future consideration: if we really think that some undefs cannot be eliminated (may be useful to discuss that first), we may want to log the undefs and/or add stats counter on them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there should not be any undefs in the final version.

SmallVector<SILValue, 8> incomingValues;
arg->getIncomingValues(incomingValues);
for (const SILValue &incomingValue : incomingValues) {
if (visited.count(incomingValue) > 0) continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just write if (visited.count(incomingValue))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually write (visited.count(...) > 0) as it reduces some cognitive load when reading code. I don't have a strong preference though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. No strong preference either.

break;
}
// This is not usable. Add incoming values to worklist.
SmallVector<SILValue, 8> incomingValues;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code here suggests if any incoming value is in a block that dominates preheader, we can use it.

but actually don't we need all, not any?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me rewrite this code using equivalence classes and the idea will be come clear.

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 11, 2018

I have changed the implementation to use equivalence classes. PTAL.

Copy link
Contributor

@mhong mhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Equiv class based code does make it more readable. Great!

}

llvm::DenseMap<SILValue, SILValue>
SingleExitLoopTransformer::computeEscapingValues() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider passing llvm::DenseMap<SILValue, SILValue> as an output (pointer) param into the func.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am conflicted here. :) I generally prefer the functional style as it makes it easy to read code, but llvm and swift seems to mostly have inout param style. Is that the preferred style? If so, I will change it.

This should be as efficient as the compiler will optimize away copies with Return Value Optimization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference. I don't quite like the current closure syntax "&result" though, but it's not a big issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave it as is then. (Even if I pass an argument, I will need to add the inout parameter to the capture?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you pass the input by pointer/address into computeEscapingValues(), you can then pass that pointer to the closure by value. but it's not a big deal.

const SILValue &escapingValue = kv.first;
// Find an equivalent value that dominates the header.
for (auto equivalentValue :
make_range(equivalentValues.findLeader(escapingValue),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that we have an equiv class [v1, v2], and v1 is chosen as the leader, but v2 dominates preheader, not v1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that can happen. The leader for a class depends upon a bunch of factors:

  • order in which unification constraints are processed
  • the implementation of the union-find

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it would seem to lead to suboptimal transformation -- if we choose v2 in the above example, the undef can be removed. Do we want to iterate over the members (as in your old code), rather than only checking on the "leader" picked by the underlying llvm infra? (pls feel free to scope it out of this PR.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are actually iterating over all the elements of the equivalence class. For this example, we will chose v2 as we iterate over {v1, v2}. See the make_range(...) call.

(It seems to me that the code is not readable. Any suggestions to improve it?)

Copy link
Contributor

@mhong mhong Sep 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. So we are iterating between equivalentValues.findLeader(escapingValue) and equivalentValues.member_end(). say the container has a list of elems [v1, v2, v3], and findLeader() returns v2, does that then skip processing v1?

I wonder if we should call member_begin() instead of findLeader().

In any case, it might help to add a comment right before make_range() to clarify the intention.

Copy link
Collaborator Author

@bgogul bgogul Sep 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the source of confusion now. findLeader() returns an member iterator for the equivalence class.

Unfortunately, calling member_begin() on non-leaders returns null.

So, we are iterating over all the elements of the equivalence class. We won't be skipping v1 in your example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got on the same page with Gogul through offline discussion.

@@ -591,7 +641,8 @@ SingleExitLoopTransformer::patchEdges(SILBasicBlock *newHeader,
getUserSourceLocation(src->getTerminator()->getDebugLocation()));
// Find an appropriate value to use for each escaping value.
unsigned argIndex = oldHeaderNumArgs;
for (const SILValue &escapingValue : escapingValues) {
for (const auto &kv : escapingValueSubstMap) {
const SILValue &escapingValue = kv.first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const SILValue &escapingValue -> SILValue escapingValue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 12, 2018

@swift-ci please test TensorFlow linux

@bgogul
Copy link
Collaborator Author

bgogul commented Sep 13, 2018

Thanks for the review, Ming. I am merging this now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants