Skip to content
This repository has been archived by the owner on Jun 26, 2020. It is now read-only.

Added last instruction information to loop analysis data #133

Closed

Conversation

denismerigoux
Copy link
Contributor

Loop analysis now splits Ebbs they belong to two or more loops and that the second loop does not extend to the end of the Ebb.

ControlFlowGraph and DominatorTree are recomputed incrementally on the fly. A new numbering system is implemented to allow that in DominatorTree.

Adequate testing and increased verifying for dominator tree integrity.

// Dominator tree node. We keep one of these per EBB.
#[derive(Clone, Default)]
struct DomNode {
// Number of this node in a reverse post-order traversal of the CFG, starting from 1.
// This number is monotonic in the reverse postorder but not contiguous, since we leave
// holes for later loczalized modifications of the dominator tree.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*localized

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -96,12 +134,15 @@ impl LoopAnalysis {

impl LoopAnalysis {
/// Detects the loops in a function. Needs the control flow graph and the dominator tree.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to mention in a comment here why function, cfg, and domtree need mut.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

cur.goto_inst(split_inst);
cur.next_inst()
.expect("you cannot split at the last instruction")
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is the existing idiom used elsewhere for getting the next instruction, but it feels verbose. Would it make sense to have something like layout.insts[inst].next.expand() in a helper function of Layout instead of requiring a Cursor for this task?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There already is a Layout::next_ebb() method. We should have a next_inst too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I add the Layout::next_inst() in this PR ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's fine.

// We renumber the current Ebb
self.nodes[current_ebb].rpo_number = current_ebb_rpo_number + 1;
if current_ebb_postorder_index == 0 {
//We have finished here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space after //.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Here we perform a modification that's not related to
// the loop discovery algorithm but rather to the way we
// store the loop information. Indeed, for each Ebb we record
// the loop its part of and the last inst in this loop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "it's" rather than "its". And "." at the end of the sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// is fine;
// - either only part of the ebb is part of the parent loop and
// in that case we can't store the information of where does
// the parent loop stops.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "either" should only appear once. One way to fix this would be to replace the second "either" with "or".

Also, remove the word "does" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

loop_id,
last_inst: _,
} => {
// So here the ebb we're visiting is already tagged as being part of a loop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More comment nits: period after this sentence. And if you reflow this whole paragraph it would be tidier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -7,10 +7,16 @@ use packed_option::PackedOption;

use std::cmp::Ordering;

// RPO number are not first assigned in a contiguous way but as multiples of STRIDE, to leave
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "numbers" rather than "number"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

ebb_rpo_number + 1
} else {
// We have to renumber
let return_value = ebb_rpo_number + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both arms of this 'if' appear to be returning ebb_rpo_number + 1 which could be simplified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@stoklund stoklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the discover_loop_blocks function is getting too long and deeply nested. It is time to simplify it. Two techniques for that:

  • Factor out subroutines into their own functions.
  • Use early break/continue/return when possible to reduce nesting.

There is something hinky about the computation of the last loop instruction. When performing the backwards DFS, you may encounter multiple edges from the same EBB, and the last loop instruction should be the last of the predecessor branches. I don't see any "max" like computation anywhere.

Loop { loop_id: Loop, last_inst: Inst },
}

impl PrimaryEntityData for EbbLoopData {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be necessary. You're making a secondary map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that is what this trait is for. Understood.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a great solution. I'm thinking about splitting EntityMap into two separate types instead.

///
/// If `ebb` belongs to `lp`, it returns `None` if the whole `ebb` belongs to `lp` or
/// `Some(inst)` where `inst` is the last instruction of `ebb` to belong to `lp`.
pub fn last_loop_instruction(&self, ebb: Ebb, lp: Loop) -> Result<Option<Inst>, ()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all of an EBB's instructions belong to the loop, it seems that this function will return None or the EBB terminator somewhat randomly.

Wouldn't it be more consistent if this never returned the terminator? Only return some instruction when the EBB is actually split between two loops?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I'll do this.

#[derive(Clone,Debug)]
enum EbbLoopData {
NoLoop(),
Loop { loop_id: Loop, last_inst: Inst },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is better represented as two packed options. Both because the enum makes the data structure 50% larger, and because you naturally have three possibilities:

  • None, none. No loop
  • Loop, None. Whole EBB belongs to a single loop.
  • Loop, last. EBB belongs to two loops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for two PackedOptions. Also you're right about the lack of max computation, there's currently something wrong.

.iter()
.zip(domtree.cfg_postorder().iter())
.enumerate() {
if true_ebb != test_ebb {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this won't detect if one PO is a prefix if the other. The zip iterator stops when either input ends.

Note that slices implement Eq, so you can just compare them with ==.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can use == but the drawback is that is doesn't return any interesting location when the two slices are different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. It's up to you, you're probably the one who gets to see these verifier errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can have the best of both worlds by doing the zip and testing for equal length beforehand.

// We verify rpo_cmp on pairs of adjacent ebbs in the postorder
for (&prev_ebb, &next_ebb) in self.domtree.cfg_postorder().iter().adjacent_pairs() {
match domtree.rpo_cmp(prev_ebb, next_ebb, &self.func.layout) {
Ordering::Greater => (),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a match is necessary here: if rpo_cmp(...) != Ordering::Greater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

///
/// `old_ebb` is the `Ebb` before splitting, and `new_ebb` is the `Ebb` which now contains
/// the second half of `old_ebb`.
pub fn recompute_split_ebb(&mut self, old_ebb: Ebb, new_ebb: Ebb, split_jump_inst: Inst) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this API is a bit tricky, I think it would be a good idea to add a couple debug_assert!s here: split_jump_inst is the terminator of old_ebb, and it is a jump to new_ebb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But actually in order to do the debug_assert() I would need to add the dfg and layout as arguments to the function. Is it worth it since they're not going to be used in release mode?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not worth it. The verifier should catch any errors too.

// We use the RPO comparison on the postorder list so we invert the operands of the
// comparison
.binary_search_by(|probe| self.rpo_cmp_ebb(old_ebb,*probe,))
.expect("the old ebb is not declared to the dominator tree");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it is unreachable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ebb is unreachable then it stays unreachable after splitting, and new_ebb unreachable too. I'll deal with this case.

idom: Some(split_jump_inst).into(),
};
// TODO: insert in constant time?
self.postorder.insert(0, new_ebb);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically we should, but I don't think we need to worry about this for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

fn insert_after_rpo(&mut self, ebb: Ebb, ebb_postorder_index: usize) -> u32 {
let ebb_rpo_number = self.nodes[ebb].rpo_number;
if ebb_postorder_index == 0 {
ebb_rpo_number + STRIDE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use an early return here to reduce nesting of the remaining function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

let prev_postorder_ebb_rpo_number = self.nodes[prev_postorder_ebb].rpo_number;
if prev_postorder_ebb_rpo_number > ebb_rpo_number + 1 {
// There is a gap, we can use it
ebb_rpo_number + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, just return instead of nesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@stoklund
Copy link
Contributor

stoklund commented Aug 3, 2017

Would it make sense to break the dominator tree change out into it's own PR? It seems pretty independent from the loop analysis change, and it is closer to be able to land.

@denismerigoux
Copy link
Contributor Author

I can do that.

/// be part of at most two loops: an inner loop which can end in the middle of the `Ebb` and
/// an outer loop that has to contain the full `Ebb`. All `Ebb`s who don't match this criteria
/// are split until they do so. That is why `compute` mutates the function, control flow
/// graph and moniator tree.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: moniator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@denismerigoux
Copy link
Contributor Author

Dominator tree-related changes are now in #135.

@denismerigoux
Copy link
Contributor Author

By the way how do you compare the position of two instructions inside an Ebb? I haven't seen any method in Layout for that.

@stoklund
Copy link
Contributor

stoklund commented Aug 3, 2017

Look for ProgramOrder which is implemented by Layout.

@denismerigoux denismerigoux force-pushed the efficient_licm branch 4 times, most recently from 6330b21 to 7d565cf Compare August 4, 2017 04:12
Loop analysis now performs ebb splitting in case of multi-loop ebbs
More checks for dominator tree integrity
More testing for loop analysis ebb splitting
@sunfishcode
Copy link
Member

Closing; we'll now track this at #393.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants