add a wrapper around Mir with an exit node, dominators returns error when nodes are unreachable #34556

Closed
wants to merge 10 commits into from

@scottcarr
Contributor

scottcarr commented Jun 29, 2016

We want the Mir CFG to have an exit node to calculate post dominators. The MirWithExit type allows us to add the exit node on demand.

When nodes are unreachable from the start node, dominators are undefined, so dominators now returns an error in that case.
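
As a rough illustration of the two changes described above (not the PR's actual code), the sketch below uses a toy graph type. `Cfg`, `WithExit`, `UnreachableNode`, and `check_all_reachable` are hypothetical names invented for this example; only the name `MirWithExit` and the idea of returning an error for unreachable nodes come from the description.

```rust
/// A toy control-flow graph: `succs[n]` lists the successors of node `n`.
/// (Hypothetical stand-in for the Mir CFG, not the compiler's type.)
pub struct Cfg {
    pub succs: Vec<Vec<usize>>,
    pub start: usize,
}

/// Hypothetical wrapper in the spirit of `MirWithExit`: it borrows a graph
/// and exposes one extra node, the exit, that every terminal node flows into.
pub struct WithExit<'a> {
    pub inner: &'a Cfg,
}

impl<'a> WithExit<'a> {
    /// The exit node gets the first index past the wrapped graph's nodes.
    pub fn exit(&self) -> usize {
        self.inner.succs.len()
    }

    pub fn num_nodes(&self) -> usize {
        self.inner.succs.len() + 1
    }

    /// Nodes without successors in the wrapped graph gain an edge to the exit.
    pub fn successors(&self, n: usize) -> Vec<usize> {
        if n == self.exit() {
            Vec::new()
        } else if self.inner.succs[n].is_empty() {
            vec![self.exit()]
        } else {
            self.inner.succs[n].clone()
        }
    }
}

/// Error naming a node that cannot be reached from the start node.
#[derive(Debug, PartialEq)]
pub struct UnreachableNode(pub usize);

/// Returns `Err` with the lowest-numbered unreachable node, if there is one.
/// A dominator computation could run this check first and bail out early.
pub fn check_all_reachable(g: &WithExit) -> Result<(), UnreachableNode> {
    let mut seen = vec![false; g.num_nodes()];
    let mut stack = vec![g.inner.start];
    seen[g.inner.start] = true;
    while let Some(n) = stack.pop() {
        for s in g.successors(n) {
            if !seen[s] {
                seen[s] = true;
                stack.push(s);
            }
        }
    }
    match seen.iter().position(|&reached| !reached) {
        Some(n) => Err(UnreachableNode(n)),
        None => Ok(()),
    }
}
```

The wrapper borrows the graph rather than copying it, which is one way to read "add the exit node on demand"; the real type may differ.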

@rust-highfive

Collaborator

rust-highfive commented Jun 29, 2016

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Aatch (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@scottcarr

Contributor Author

scottcarr commented Jun 29, 2016

@rust-highfive rust-highfive assigned nikomatsakis and unassigned Aatch Jun 29, 2016

@arielb1

Contributor

arielb1 commented Jun 29, 2016

What is the motivation for postdoms here? LLVM seems to use them only for its region analysis.

BTW, this does not handle loops without an exit.

@scottcarr

Contributor Author

scottcarr commented Jun 29, 2016

> What is the motivation for postdoms here? LLVM seems to use them only for its region analysis.

The "move up propagation" optimization I'm working on (which eliminates a temporary) only fires if a particular use post-dominates the temporary's definition.

> BTW, this does not handle loops without an exit.

The way I was thinking to handle it is:

If the Mir CFG has loops without an exit, then there is no "the exit node" and the post-dominators are undefined. When we calculate the dominators of the transposed MirWithExit CFG (which are the post-dominators of the original Mir CFG), that graph will have unreachable nodes and dominators returns an error in that case.
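
A minimal sketch of that idea, assuming a graph given as successor lists that already contains its exit node. It uses a simple iterative set-intersection dominator computation, which is not necessarily the algorithm the compiler uses; `Unreachable`, `transpose`, `dominators`, and `post_dominators` are names made up for this example.

```rust
use std::collections::BTreeSet;

/// Error naming a node that cannot be reached from the root.
#[derive(Debug, PartialEq)]
pub struct Unreachable(pub usize);

/// Reverses every edge of the graph.
pub fn transpose(succs: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut t: Vec<Vec<usize>> = vec![Vec::new(); succs.len()];
    for (u, out) in succs.iter().enumerate() {
        for &v in out {
            t[v].push(u);
        }
    }
    t
}

/// For each node, the set of nodes that dominate it (including itself).
/// Returns `Err` if some node is unreachable from `root`.
pub fn dominators(
    succs: &[Vec<usize>],
    root: usize,
) -> Result<Vec<BTreeSet<usize>>, Unreachable> {
    let n = succs.len();

    // Dominators are undefined for unreachable nodes, so check first.
    let mut seen = vec![false; n];
    seen[root] = true;
    let mut stack = vec![root];
    while let Some(u) = stack.pop() {
        for &v in &succs[u] {
            if !seen[v] {
                seen[v] = true;
                stack.push(v);
            }
        }
    }
    if let Some(u) = seen.iter().position(|&reached| !reached) {
        return Err(Unreachable(u));
    }

    // dom(root) = {root}; dom(u) = {u} plus the intersection over all preds.
    let preds = transpose(succs);
    let all: BTreeSet<usize> = (0..n).collect();
    let mut dom = vec![all.clone(); n];
    dom[root] = std::iter::once(root).collect();
    let mut changed = true;
    while changed {
        changed = false;
        for u in 0..n {
            if u == root {
                continue;
            }
            let mut new = all.clone();
            for &p in &preds[u] {
                new = new.intersection(&dom[p]).cloned().collect();
            }
            new.insert(u);
            if new != dom[u] {
                dom[u] = new;
                changed = true;
            }
        }
    }
    Ok(dom)
}

/// Post-dominators are the dominators of the transposed graph, rooted at the exit.
pub fn post_dominators(
    succs: &[Vec<usize>],
    exit: usize,
) -> Result<Vec<BTreeSet<usize>>, Unreachable> {
    dominators(&transpose(succs), exit)
}
```

With a graph such as `0 -> {1, 3}`, `1 -> 2`, `2 -> 1` and exit `3`, nodes 1 and 2 form a loop with no way out, so they are unreachable in the transposed graph and `post_dominators` returns an error instead of a result.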

scottcarr added some commits Jun 29, 2016

@arielb1

Contributor

arielb1 commented Jun 30, 2016

Use post-dominates definition? That means it won't work if there are panics in the middle, right?

@scottcarr

Contributor Author

scottcarr commented Jun 30, 2016

> Use post-dominates definition?

I'm not 100% sure what you're asking. For some variable v, I want to know if some particular use (ex: x = ... v ..) post-dominates some definition of v (ex: v = ...).

> That means it won't work if there are panics in the middle, right?

Dominators::dominators doesn't panic when it encounters unreachable nodes; it returns a Result. Callers should check the result if the graph might have unreachable nodes. Mir's CFG shouldn't have unreachable nodes, AFAIK, but I can change Mir::dominators to return the Result if needed.

@arielb1

Contributor

arielb1 commented Jun 30, 2016

@scottcarr

The problem is that if you consider panic edges, your analysis will be reduced to being basically local: every call has a panic edge, which means that nothing post-dominates anything.

@scottcarr

Contributor Author

scottcarr commented Jun 30, 2016

@arielb1

Let me make sure I understand what you mean. If we have:

bb0: {
  tmp0 = 5;
  tmp1 = 42;
  tmp2 = foo(tmp1) -> [return: bb1, unwind: bb2] 
}

bb1: {
  tmp3 = tmp0;
  ...
}

bb2: {
  resume;
}

You are suggesting we should optimize to:

bb0: {
  tmp3 = 5; // tmp0 optimized out
  tmp1 = 42;
  tmp2 = foo(tmp1) -> [return: bb1, unwind: bb2] 
}

bb1: {
  // tmp3 = tmp0 optimized out
  ...
}

bb2: {
  resume;
}

... because "tmp3 = tmp0" is on all paths from "tmp0 = 5" to some "exit" that do not end in "resume;"?
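
To make the question concrete, here is a small sketch that checks post-dominance directly by reachability: `b` post-dominates `a` if the exit cannot be reached from `a` once `b` is removed. The numbering (bb0 = 0, bb1 = 1, bb2 = 2, exit = 3), the helper names, and the assumption that bb1 eventually returns (its body is elided above) are all made up for illustration.

```rust
/// Returns true if every path from `a` to `exit` passes through `b`.
/// If `exit` is not reachable from `a` at all, this is vacuously true.
pub fn post_dominates(succs: &[Vec<usize>], b: usize, a: usize, exit: usize) -> bool {
    if a == b {
        return true;
    }
    let mut seen = vec![false; succs.len()];
    let mut stack = vec![a];
    seen[a] = true;
    while let Some(u) = stack.pop() {
        if u == exit {
            // Found a path from `a` to the exit that avoids `b`.
            return false;
        }
        for &v in &succs[u] {
            // Never step into `b`: we only want paths that avoid it.
            if v != b && !seen[v] {
                seen[v] = true;
                stack.push(v);
            }
        }
    }
    true
}

/// The three-block example above, plus a synthetic exit node 3.
/// `with_unwind` controls whether the unwind edge bb0 -> bb2 is kept.
pub fn example_cfg(with_unwind: bool) -> Vec<Vec<usize>> {
    let bb0 = if with_unwind { vec![1, 2] } else { vec![1] };
    vec![
        bb0,     // bb0: call foo, return to bb1, unwind to bb2
        vec![3], // bb1: assumed to return eventually
        vec![3], // bb2: resume
        vec![],  // synthetic exit
    ]
}
```

With the unwind edge kept, `post_dominates(&example_cfg(true), 1, 0, 3)` is false, because bb0 can reach the exit through bb2. With it dropped, the same query is true, which is the reading the question above asks about.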

@scottcarr

Contributor Author

scottcarr commented Jun 30, 2016

FWIW, move up optimization does fire a non-zero number of times when building the compiler. But it may be that all the statement pairs it optimizes are pretty local to each other.

#34585

@nikomatsakis

Contributor

nikomatsakis commented Jul 1, 2016

So I chatted a bit with @scottcarr on IRC. I don't think that panic edges are actually particularly relevant. I think that what it comes down to is that if you are going to move the write B so that it occurs at the point A:

B0: {
   TMP = ... // Point A
   ...
}

Bn: {
    X = TMP // Point B
    ...
}

then basically anything reachable from A without passing through B must not be able to observe the fact that X has changed before it was supposed to have changed. So, if you trace paths from A and you encounter a RETURN or UNWIND terminator (whatever we call those now), then you could conclude that because the local variable X is being popped, you can safely perform the optimization. But if you encounter some node that may observe the value of X (which might include calls, depending on whether the address of X has been taken and what kind of conservative rules we are using) then you can't safely move it backward.
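
One way to picture this rule is a graph walk from A that stops at B and fails as soon as it meets something that may observe X. The sketch below works at whole-node granularity with a hand-assigned `Effect` per node, which is a simplification assumed for illustration; `Effect`, `Node`, `can_move_write_up`, and `example` are hypothetical names, not compiler APIs.

```rust
/// What a node may do with respect to the variable X (hand-assigned here).
#[derive(Clone, Copy, PartialEq)]
pub enum Effect {
    None,
    MayObserveX,
}

pub struct Node {
    pub effect: Effect,
    pub succs: Vec<usize>,
}

/// Returns true if nothing reachable from `a` without passing through `b`
/// may observe X. Return/unwind nodes simply have no successors, so paths
/// that end there are accepted without any special casing.
pub fn can_move_write_up(nodes: &[Node], a: usize, b: usize) -> bool {
    let mut seen = vec![false; nodes.len()];
    let mut stack: Vec<usize> = nodes[a].succs.clone();
    while let Some(u) = stack.pop() {
        if u == b || seen[u] {
            // Stop at B: whatever follows B sees X written either way.
            continue;
        }
        seen[u] = true;
        if nodes[u].effect == Effect::MayObserveX {
            return false;
        }
        stack.extend(nodes[u].succs.iter().cloned());
    }
    true
}

/// A small example: A -> call -> {unwind, B}, and B -> a read of X.
pub fn example(call_may_observe_x: bool) -> Vec<Node> {
    let call_effect = if call_may_observe_x {
        Effect::MayObserveX
    } else {
        Effect::None
    };
    vec![
        Node { effect: Effect::None, succs: vec![1] },        // 0: point A
        Node { effect: call_effect, succs: vec![2, 3] },      // 1: a call
        Node { effect: Effect::None, succs: vec![] },         // 2: unwind
        Node { effect: Effect::None, succs: vec![4] },        // 3: point B
        Node { effect: Effect::MayObserveX, succs: vec![] },  // 4: reads X after B
    ]
}
```

Here `can_move_write_up(&example(false), 0, 3)` is true even though an unwind path bypasses B, while `can_move_write_up(&example(true), 0, 3)` is false because the call might observe X early.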

In other words, I think @arielb1 is right that it might be better not to consider post-doms, but I think the focus on panics etc. isn't that important.

@nikomatsakis

Contributor

nikomatsakis commented Jul 1, 2016

(To be clear, I didn't read all the comments on this PR in depth.)

@arielb1

Contributor

arielb1 commented Jul 1, 2016

I think we got this discussion totally wrong anyway. Here's my model of the optimization:

The optimization is to transform

S1:
    tmp = SRC
    ...
S2:
    DEST = tmp

to

S1:
    DEST = SRC

I think this is best split into 4 steps. I don't think we ever want to do the steps separately, but this clarifies the analysis rules.

Step 0: original

S1:
    tmp = SRC
    ...
S2:
    DEST = tmp

Step 1: add additional dead write

S1:
    tmp = SRC
    DEST = tmp
    ...
S2:
    DEST = tmp

This requires that DEST is dead at S1, and that it can be evaluated there (e.g. it is not a dereference of a pointer that is uninitialized there). Nothing else matters.

Step 2: common subexpression introduction

S1:
    tmp = SRC
    DEST = tmp
    ...
S2:
    tmp2 = DEST
    DEST = tmp2

This requires that the newly-added read links with the write of tmp - basically a "memdep" analysis.

This also requires that the address of DEST does not change, which is non-trivial because DEST can be the dereference of a pointer.

Step 3: remove write-of-read

S1:
    tmp = SRC
    DEST = tmp
    ...
S2:
    NOP

After step 2, S2 is obviously a NOP and can be removed.

Step 4: remove tmp

S1:
    DEST = SRC
    ...
S2:
    NOP

This is purely a local operation, but may require some sophistication if e.g. SRC is a function call. I think we are always justified doing it, but we should make sure it is OK.
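
The combined effect of the steps can be sketched on a toy straight-line block. This is a deliberately conservative illustration, not the proposed MIR pass: it assumes variables are plain names (no pointer dereferences), that `tmp` is not used outside the block, and that `SRC` is a single name or literal. `Stmt`, `assign`, `mentions`, and `move_up` are invented for this example.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Stmt {
    /// `dest = src`, where `src` is a variable name or an opaque literal.
    Assign { dest: String, src: String },
    Nop,
}

pub fn assign(dest: &str, src: &str) -> Stmt {
    Stmt::Assign { dest: dest.to_string(), src: src.to_string() }
}

/// True if the statement reads or writes `var`.
fn mentions(s: &Stmt, var: &str) -> bool {
    match s {
        Stmt::Assign { dest, src } => dest.as_str() == var || src.as_str() == var,
        Stmt::Nop => false,
    }
}

/// Tries to rewrite `block[i]: tmp = SRC` and `block[j]: DEST = tmp`
/// into `block[i]: DEST = SRC` and `block[j]: NOP`.
/// Returns false and leaves the block untouched if a check fails.
pub fn move_up(block: &mut [Stmt], i: usize, j: usize) -> bool {
    if i >= j || j >= block.len() {
        return false;
    }
    let (tmp, src) = match &block[i] {
        Stmt::Assign { dest, src } => (dest.clone(), src.clone()),
        Stmt::Nop => return false,
    };
    let dest = match &block[j] {
        Stmt::Assign { dest, src } if *src == tmp => dest.clone(),
        _ => return false,
    };
    if dest == tmp || src == tmp {
        return false;
    }
    // Step 4 needs `tmp` to have no other uses or definitions in the block.
    let tmp_elsewhere = block
        .iter()
        .enumerate()
        .any(|(k, s)| k != i && k != j && mentions(s, &tmp));
    // Steps 1 and 2, conservatively: DEST is neither read nor written
    // between S1 and S2, so the early write is dead and still visible at S2.
    let dest_between = block[i + 1..j].iter().any(|s| mentions(s, &dest));
    if tmp_elsewhere || dest_between {
        return false;
    }
    block[i] = Stmt::Assign { dest, src };
    block[j] = Stmt::Nop;
    true
}
```

For a block `tmp = SRC; other = 1; DEST = tmp`, `move_up(&mut block, 0, 2)` succeeds and leaves `DEST = SRC; other = 1; NOP`. If the middle statement were `y = DEST`, the old value of `DEST` would still be live at S1 and the call would refuse to rewrite.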

@scottcarr

Contributor Author

scottcarr commented Jul 6, 2016

Since we're not planning to use post-dominators for move-up-propagation, should we close this PR and move the discussion to #34693?

@nikomatsakis

Contributor

nikomatsakis commented Jul 7, 2016

@scottcarr I think we should.
