Added last instruction information to loop analysis data #133

denismerigoux · 2017-08-03T00:02:34Z

Loop analysis now splits Ebbs when they belong to two or more loops and the second loop does not extend to the end of the Ebb.

ControlFlowGraph and DominatorTree are recomputed incrementally on the fly. A new numbering system is implemented to allow that in DominatorTree.

Added tests and stronger verification of dominator tree integrity.

bjorn3 · 2017-08-03T08:28:09Z

lib/cretonne/src/dominator_tree.rs

 // Dominator tree node. We keep one of these per EBB.
 #[derive(Clone, Default)]
 struct DomNode {
    // Number of this node in a reverse post-order traversal of the CFG, starting from 1.
+    // This number is monotonic in the reverse postorder but not contiguous, since we leave
+    // holes for later loczalized modifications of the dominator tree.


sunfishcode · 2017-08-03T10:29:17Z

lib/cretonne/src/loop_analysis.rs

@@ -96,12 +134,15 @@ impl LoopAnalysis {

 impl LoopAnalysis {
    /// Detects the loops in a function. Needs the control flow graph and the dominator tree.


It would be good to mention in a comment here why function, cfg, and domtree need mut.

sunfishcode · 2017-08-03T10:57:04Z

lib/cretonne/src/loop_analysis.rs

+            cur.goto_inst(split_inst);
+            cur.next_inst()
+                .expect("you cannot split at the last instruction")
+        };


I know this is the existing idiom used elsewhere for getting the next instruction, but it feels verbose. Would it make sense to have something like layout.insts[inst].next.expand() in a helper function of Layout instead of requiring a Cursor for this task?

There already is a Layout::next_ebb() method. We should have a next_inst too.

Should I add Layout::next_inst() in this PR?

Sure, that's fine.
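
A minimal sketch of the kind of helper discussed above, assuming a toy layout that stores each instruction's successor in a side table. The `Inst`, `InstNode`, and `Layout` types here are simplified stand-ins for illustration, not Cretonne's real definitions, and `from_sequence` is a hypothetical constructor.

```rust
/// Stand-in for an instruction reference (hypothetical, not Cretonne's `Inst`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Inst(usize);

/// Per-instruction layout links; `next` is `None` for the last instruction of its EBB.
#[derive(Clone, Copy)]
struct InstNode {
    next: Option<Inst>,
}

/// Toy layout: one node per instruction, indexed by the instruction number.
struct Layout {
    insts: Vec<InstNode>,
}

impl Layout {
    /// Hypothetical constructor: `count` instructions forming a single EBB, in order.
    fn from_sequence(count: usize) -> Self {
        let insts = (0..count)
            .map(|i| InstNode {
                next: if i + 1 < count { Some(Inst(i + 1)) } else { None },
            })
            .collect();
        Layout { insts }
    }

    /// Return the instruction following `inst` in its EBB, without needing a cursor.
    fn next_inst(&self, inst: Inst) -> Option<Inst> {
        self.insts[inst.0].next
    }
}
```

With a helper like this, the caller can write `layout.next_inst(split_inst).expect(...)` instead of positioning a cursor first.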

sunfishcode · 2017-08-03T11:01:56Z

lib/cretonne/src/dominator_tree.rs

+                    // We renumber the current Ebb
+                    self.nodes[current_ebb].rpo_number = current_ebb_rpo_number + 1;
+                    if current_ebb_postorder_index == 0 {
+                        //We have finished here


nit: space after //.

sunfishcode · 2017-08-03T11:05:01Z

lib/cretonne/src/loop_analysis.rs

+                                    // Here we perform a modification that's not related to
+                                    // the loop discovery algorithm but rather to the way we
+                                    // store the loop information. Indeed, for each Ebb we record
+                                    // the loop its part of and the last inst in this loop


nit: "it's" rather than "its". And "." at the end of the sentence.

sunfishcode · 2017-08-03T11:16:20Z

lib/cretonne/src/loop_analysis.rs

+                                    // is fine;
+                                    // - either only part of the ebb is part of the parent loop and
+                                    // in that case we can't store the information of where does
+                                    // the parent loop stops.


The word "either" should only appear once. One way to fix this would be to replace the second "either" with "or".

Also, remove the word "does" here.

sunfishcode · 2017-08-03T11:22:58Z

lib/cretonne/src/loop_analysis.rs

+                        loop_id,
+                        last_inst: _,
+                    } => {
+                        // So here the ebb we're visiting is already tagged as being part of a loop


More comment nits: period after this sentence. And if you reflow this whole paragraph it would be tidier.

sunfishcode · 2017-08-03T11:32:27Z

lib/cretonne/src/dominator_tree.rs

@@ -7,10 +7,16 @@ use packed_option::PackedOption;

 use std::cmp::Ordering;

+// RPO number are not first assigned in a contiguous way but as multiples of STRIDE, to leave


nit: "numbers" rather than "number"

sunfishcode · 2017-08-03T11:44:48Z

lib/cretonne/src/dominator_tree.rs

+                ebb_rpo_number + 1
+            } else {
+                // We have to renumber
+                let return_value = ebb_rpo_number + 1;


Both arms of this 'if' appear to be returning ebb_rpo_number + 1 which could be simplified.

stoklund

I think the discover_loop_blocks function is getting too long and deeply nested. It is time to simplify it. Two techniques for that:

Factor out subroutines into their own functions.
Use early break/continue/return when possible to reduce nesting.

There is something hinky about the computation of the last loop instruction. When performing the backwards DFS, you may encounter multiple edges from the same EBB, and the last loop instruction should be the last of the predecessor branches. I don't see any "max" like computation anywhere.
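
To illustrate the "max"-like computation mentioned above, here is a small sketch that picks the latest of several branch instructions in one EBB. It assumes instruction positions are available in a lookup table; the `Inst` type, the `position` map, and the function name are all hypothetical and not part of the PR.

```rust
use std::collections::HashMap;

/// Stand-in for an instruction reference (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Inst(u32);

/// Return the branch that comes last in program order, or `None` if there are none.
/// `position` maps each instruction to its index within the EBB (an assumption of
/// this sketch; every branch must have an entry).
fn last_loop_inst(branches: &[Inst], position: &HashMap<Inst, u32>) -> Option<Inst> {
    branches.iter().copied().max_by_key(|inst| position[inst])
}
```

The point is only that every back edge leaving the same EBB has to be considered before the last loop instruction is recorded, rather than keeping whichever edge was visited first.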

stoklund · 2017-08-03T14:20:24Z

lib/cretonne/src/loop_analysis.rs

+    Loop { loop_id: Loop, last_inst: Inst },
+}
+
+impl PrimaryEntityData for EbbLoopData {}


This should not be necessary. You're making a secondary map.

Oh that is what this trait is for. Understood.

Yeah, it's not a great solution. I'm thinking about splitting EntityMap into two separate types instead.

stoklund · 2017-08-03T14:25:39Z

lib/cretonne/src/loop_analysis.rs

+    ///
+    /// If `ebb` belongs to `lp`, it returns `None` if the whole `ebb` belongs to `lp` or
+    /// `Some(inst)` where `inst` is the last instruction of `ebb` to belong to `lp`.
+    pub fn last_loop_instruction(&self, ebb: Ebb, lp: Loop) -> Result<Option<Inst>, ()> {


If all of an EBB's instructions belong to the loop, it seems that this function will return None or the EBB terminator somewhat randomly.

Wouldn't it be more consistent if this never returned the terminator? Only return some instruction when the EBB is actually split between two loops?

You're right, I'll do this.

stoklund · 2017-08-03T14:35:38Z

lib/cretonne/src/loop_analysis.rs

+#[derive(Clone,Debug)]
+enum EbbLoopData {
+    NoLoop(),
+    Loop { loop_id: Loop, last_inst: Inst },


I think this is better represented as two packed options. Both because the enum makes the data structure 50% larger, and because you naturally have three possibilities:

None, none. No loop

Loop, None. Whole EBB belongs to a single loop.

Loop, last. EBB belongs to two loops.

OK for two PackedOptions. Also you're right about the lack of max computation, there's currently something wrong.
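
A rough sketch of the two-packed-options layout suggested above. `PackedU32` is a simplified stand-in for `PackedOption` that reserves `u32::MAX` as the "none" value; the `Membership` enum and the `membership` method are hypothetical names used only to show the three cases.

```rust
/// Stand-in for a packed option: a `u32` where `u32::MAX` means "none" (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedU32(u32);

impl PackedU32 {
    const NONE: PackedU32 = PackedU32(u32::MAX);

    fn some(value: u32) -> Self {
        debug_assert!(value != u32::MAX, "u32::MAX is reserved for NONE");
        PackedU32(value)
    }

    fn expand(self) -> Option<u32> {
        if self.0 == u32::MAX { None } else { Some(self.0) }
    }
}

/// Per-EBB loop data stored as two packed options, with no enum discriminant.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EbbLoopData {
    loop_id: PackedU32,
    last_inst: PackedU32,
}

/// The three cases listed in the review, decoded for readability.
#[derive(Debug, PartialEq, Eq)]
enum Membership {
    NoLoop,
    WholeEbb { loop_id: u32 },
    Split { loop_id: u32, last_inst: u32 },
}

impl EbbLoopData {
    fn membership(self) -> Membership {
        match (self.loop_id.expand(), self.last_inst.expand()) {
            (None, _) => Membership::NoLoop,
            (Some(loop_id), None) => Membership::WholeEbb { loop_id },
            (Some(loop_id), Some(last_inst)) => Membership::Split { loop_id, last_inst },
        }
    }
}
```

In this sketch the struct is two `u32` fields, so it occupies 8 bytes, whereas an enum carrying the same two fields also needs room for a discriminant.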

stoklund · 2017-08-03T16:34:39Z

lib/cretonne/src/verifier/mod.rs

+                .iter()
+                .zip(domtree.cfg_postorder().iter())
+                .enumerate() {
+            if true_ebb != test_ebb {


Unfortunately, this won't detect if one PO is a prefix of the other. The zip iterator stops when either input ends.

Note that slices implement Eq, so you can just compare them with ==.

I can use == but the drawback is that it doesn't return any interesting location when the two slices are different.

That's true. It's up to you, you're probably the one who gets to see these verifier errors.

I can have the best of both worlds by doing the zip and testing for equal length beforehand.
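
A minimal sketch of that approach: check the lengths first so the prefix case is caught, then zip to report the first differing position. EBBs are represented as plain `u32` values here, and the function name and error strings are hypothetical, not the verifier's real output.

```rust
/// Compare two post-orders. Returns `Ok(())` when they are identical, otherwise a
/// message describing the first difference.
fn compare_postorders(expected: &[u32], actual: &[u32]) -> Result<(), String> {
    // `zip` stops at the shorter input, so a prefix would otherwise go unnoticed.
    if expected.len() != actual.len() {
        return Err(format!(
            "length mismatch: expected {}, got {}",
            expected.len(),
            actual.len()
        ));
    }
    for (index, (e, a)) in expected.iter().zip(actual.iter()).enumerate() {
        if e != a {
            return Err(format!(
                "mismatch at index {}: expected {}, got {}",
                index, e, a
            ));
        }
    }
    Ok(())
}
```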

stoklund · 2017-08-03T16:36:10Z

lib/cretonne/src/verifier/mod.rs

+        // We verify rpo_cmp on pairs of adjacent ebbs in the postorder
+        for (&prev_ebb, &next_ebb) in self.domtree.cfg_postorder().iter().adjacent_pairs() {
+            match domtree.rpo_cmp(prev_ebb, next_ebb, &self.func.layout) {
+                Ordering::Greater => (),


I don't think a match is necessary here: if rpo_cmp(...) != Ordering::Greater.
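
As an illustration of the simpler check, the sketch below walks adjacent pairs and tests the comparison result directly. It assumes blocks are identified by their RPO numbers; `rpo_cmp` here is a toy stand-in taking two numbers, not the real method, and `windows(2)` replaces the `adjacent_pairs()` adaptor from the diff.

```rust
use std::cmp::Ordering;

/// Toy stand-in for `rpo_cmp`: compares two blocks by RPO number (assumption).
fn rpo_cmp(a: u32, b: u32) -> Ordering {
    a.cmp(&b)
}

/// Given the RPO numbers of blocks listed in post-order, return the index of the
/// first adjacent pair that is not strictly decreasing, or `None` if all are fine.
fn first_rpo_violation(postorder_rpo_numbers: &[u32]) -> Option<usize> {
    postorder_rpo_numbers
        .windows(2)
        .position(|pair| rpo_cmp(pair[0], pair[1]) != Ordering::Greater)
}
```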

stoklund · 2017-08-03T16:38:32Z

lib/cretonne/src/dominator_tree.rs

+    ///
+    /// `old_ebb` is the `Ebb` before splitting, and `new_ebb` is the `Ebb` which now contains
+    /// the second half of `old_ebb`.
+    pub fn recompute_split_ebb(&mut self, old_ebb: Ebb, new_ebb: Ebb, split_jump_inst: Inst) {


Since this API is a bit tricky, I think it would be a good idea to add a couple debug_assert!s here: split_jump_inst is the terminator of old_ebb, and it is a jump to new_ebb.

But actually in order to do the debug_assert() I would need to add the dfg and layout as arguments to the function. Is it worth it since they're not going to be used in release mode?

No, not worth it. The verifier should catch any errors too.

stoklund · 2017-08-03T16:39:19Z

lib/cretonne/src/dominator_tree.rs

+            // We use the RPO comparison on the postorder list so we invert the operands of the
+            // comparison
+            .binary_search_by(|probe| self.rpo_cmp_ebb(old_ebb,*probe,))
+            .expect("the old ebb is not declared to the dominator tree");


What if it is unreachable?

If ebb is unreachable then it stays unreachable after splitting, and new_ebb is unreachable too. I'll deal with this case.

stoklund · 2017-08-03T16:41:12Z

lib/cretonne/src/dominator_tree.rs

+            idom: Some(split_jump_inst).into(),
+        };
+        // TODO: insert in constant time?
+        self.postorder.insert(0, new_ebb);


Technically we should, but I don't think we need to worry about this for now.

stoklund · 2017-08-03T16:44:09Z

lib/cretonne/src/dominator_tree.rs

+    fn insert_after_rpo(&mut self, ebb: Ebb, ebb_postorder_index: usize) -> u32 {
+        let ebb_rpo_number = self.nodes[ebb].rpo_number;
+        if ebb_postorder_index == 0 {
+            ebb_rpo_number + STRIDE


Use an early return here to reduce nesting of the remaining function.

stoklund · 2017-08-03T16:45:53Z

lib/cretonne/src/dominator_tree.rs

+            let prev_postorder_ebb_rpo_number = self.nodes[prev_postorder_ebb].rpo_number;
+            if prev_postorder_ebb_rpo_number > ebb_rpo_number + 1 {
+                // There is a gap, we can use it
+                ebb_rpo_number + 1


Again, just return instead of nesting.
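
A sketch of how the function might read with early returns, using a toy model in which `rpo[i]` is the RPO number of the block at post-order index `i` (so numbers strictly decrease as `i` grows). The slice-based signature, the value of `STRIDE`, and the renumbering loop are assumptions made for illustration; the PR's real code operates on the dominator tree's node map.

```rust
/// Gap left between consecutive RPO numbers on first assignment (value assumed).
const STRIDE: u32 = 4;

/// Return an RPO number that sorts immediately after the block at `index`,
/// renumbering later blocks only when there is no gap left.
fn insert_after_rpo(rpo: &mut [u32], index: usize) -> u32 {
    let number = rpo[index];

    // Post-order index 0 is the last block in RPO: nothing follows it.
    if index == 0 {
        return number + STRIDE;
    }

    // There is a gap before the next block in RPO, so use it.
    if rpo[index - 1] > number + 1 {
        return number + 1;
    }

    // No gap: push the following blocks up by one until a gap absorbs the shift.
    let mut required = number + 2;
    for slot in rpo[..index].iter_mut().rev() {
        if *slot >= required {
            break;
        }
        *slot = required;
        required += 1;
    }
    number + 1
}
```

Written this way, the gap case and the renumbering case both end in `number + 1` without an `if`/`else` that repeats the expression in each arm.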

stoklund · 2017-08-03T16:54:01Z

Would it make sense to break the dominator tree change out into its own PR? It seems pretty independent from the loop analysis change, and it is closer to being able to land.

denismerigoux · 2017-08-03T17:09:39Z

I can do that.

sunfishcode · 2017-08-03T17:39:48Z

lib/cretonne/src/loop_analysis.rs

+    /// be part of at most two loops: an inner loop which can end in the middle of the `Ebb` and
+    /// an outer loop that has to contain the full `Ebb`. All `Ebb`s who don't match this criteria
+    /// are split until they do so. That is why `compute` mutates the function, control flow
+    /// graph and moniator tree.


typo: moniator

denismerigoux · 2017-08-03T18:15:38Z

Dominator tree-related changes are now in #135.

denismerigoux · 2017-08-03T18:23:19Z

By the way how do you compare the position of two instructions inside an Ebb? I haven't seen any method in Layout for that.

stoklund · 2017-08-03T18:24:45Z

Look for ProgramOrder which is implemented by Layout.
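
For readers unfamiliar with the idea, here is a simplified sketch of comparing two instructions by program order. The trait and layout below are hypothetical stand-ins, assuming each instruction carries a sequence number; they do not reproduce the real `ProgramOrder` trait's signature.

```rust
use std::cmp::Ordering;

/// Stand-in for an instruction reference (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Inst(usize);

/// Simplified stand-in for a program-order comparison trait.
trait ProgramOrderSketch {
    /// Compare the positions of two instructions in the same function.
    fn cmp_insts(&self, a: Inst, b: Inst) -> Ordering;
}

/// Toy layout: `seq[i]` is the sequence number of instruction `i` (assumption).
struct ToyLayout {
    seq: Vec<u32>,
}

impl ProgramOrderSketch for ToyLayout {
    fn cmp_insts(&self, a: Inst, b: Inst) -> Ordering {
        self.seq[a.0].cmp(&self.seq[b.0])
    }
}
```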

sunfishcode · 2018-07-07T21:54:17Z

Closing; we'll now track this at #393.

denismerigoux force-pushed the efficient_licm branch from 26f243c to c5170d2 Compare August 3, 2017 00:05

bjorn3 suggested changes Aug 3, 2017

sunfishcode reviewed Aug 3, 2017

stoklund reviewed Aug 3, 2017

sunfishcode reviewed Aug 3, 2017
denismerigoux force-pushed the efficient_licm branch 4 times, most recently from 6330b21 to 7d565cf Compare August 4, 2017 04:12

denismerigoux added 5 commits August 7, 2017 09:31

Added last instruction information to loop analysis

d0b7d82

Loop analysis now performs ebb splitting in case of multi-loop ebbs

Added ebb splitting update for dominator tree

7100fa0

More checks for dominator tree integrity More testing for loop analysis ebb splitting

First round of fixes after review

ea19598

Not done yet

Fixed bug in loop analysis ebb splitting algorithm

fc18e95

Adapted PR to changes in master

1721584

denismerigoux force-pushed the efficient_licm branch from 7d565cf to 1721584 Compare August 7, 2017 16:35

denismerigoux mentioned this pull request Aug 7, 2017

Optimized LICM pass #138

Closed

denismerigoux added 7 commits August 10, 2017 11:40

Bugfixes after test with real-world code

3ccc9dd

Added last instruction information to loop analysis

a05653b

Loop analysis now performs ebb splitting in case of multi-loop ebbs

Added ebb splitting update for dominator tree

00784e0

More checks for dominator tree integrity More testing for loop analysis ebb splitting

First round of fixes after review

05f4d66

Not done yet

Fixed bug in loop analysis ebb splitting algorithm

5e51997

Adapted to EntityMap changes

9edafee

Merge branch 'efficient_licm' of github.com:denismerigoux/cretonne in…

da50c1c

…to efficient_licm

denismerigoux force-pushed the efficient_licm branch from d8169bf to da50c1c Compare August 26, 2017 14:51

denismerigoux mentioned this pull request Sep 30, 2017

Loop basic blocks splitting #106

Closed

sunfishcode closed this Jul 7, 2018

sunfishcode mentioned this pull request Feb 28, 2020

Better LICM bytecodealliance/wasmtime#1038

Open
