# Recursive bisect strategy #1997

Merged
merged 6 commits into from Jun 29, 2015

## Conversation

Projects
None yet
3 participants
Contributor

### urbanautomaton commented Jun 15, 2015

Hi,

First, thank you very much for the new `--bisect` feature - it's already saved me and a colleague hours of painstaking manual debugging, and it doesn't get much better than that. :)

While looking at the implementation recently, I came across a way I think it might be possible to improve the bisection strategy in `ExampleMinimizer`. Presently a round-based approach is used, which attempts to find ~50% of the current residual candidates to ignore in each round. It first searches for contiguous blocks of candidates to ignore, then for more complex combinations of the candidate set. If it can't find ~50% of the candidate set to ignore at any stage, it concludes that the run is irreducible.

In the case where there is a single, singly-dependent failure (i.e. when there is a single expected failure, which depends upon a single preceding example), I believe the current strategy is optimal.

However, when there are multiple preceding examples that combine to cause the expected failure, then there are cases that the current strategy fails to reduce, and cases that it successfully reduces, but could reduce in fewer steps.

#### Examples

Since the problem cases involve expected failures that depend on multiple preceding examples, I first expanded the `FakeBisectRunner` class to support simulating multiply-dependent failures.

I then added failing test cases:

The rounds-based strategy will not reduce the case `C C C C C C C I F` (where C = culprit, I = innocent, and F = expected failure), as the first round fails to remove 50% of the candidates.

Also, the rounds-based strategy takes 70 runs to infer that the following is irreducible: `C C C C C C C C F`. With a recursive strategy, I believe this should be possible in `1 + 2 + 4 + 8 = 15` runs:

• 1 run to determine that at least one of the candidates is responsible
• 2 runs to determine that both halves of the candidates are responsible
• 4 runs to determine that all quarters of the candidates are responsible
• 8 runs to determine that all of the candidates are responsible

(My test case originally had 14 runs, because I hadn't accounted for the first step.)

#### Recursive strategy

Instead of searching amongst combinations of the candidate set, the recursive algorithm always splits its candidate set into two slices. Informally:

1. Is the candidate set a single example?
• yes: return it as the culprit
• no:
1. Divide the candidate set in two
2. Run each slice with the expected failure
3. Can either set be eliminated?
• yes: recurse on other set
• no: recurse on both sets, retaining other set as "fixed" for each run

(Where "fixed" means while reducing one set, the other set is always included as part of each run.)

In case my lousy prose description makes no sense, here is a visual representation of a simple case where there are two culprit examples, one in each half of the candidate set:

I've implemented the above strategy in this PR, and the additional test cases now pass.

#### Formatter

There remain some failures elsewhere in the suite, however, because the formatter output no longer matches the previous algorithm's, and because the recursive strategy doesn't proceed in rounds, it's difficult to restore these tests without changing the bisection output.

So, before I tackle that: is this something you'd be interested in merging? I'm happy to modify anything that's unclear. :)

Cheers,
Simon

Member

### myronmarston commented Jun 16, 2015

 Thanks for digging into this! I'm quite keen to see the bisector improved, and you've clearly thought a lot about this....so this looks great! I do have a few questions/concerns: How does this affect the common case where there is on culprit causing one or more other failures when it runs before them? You mention that the current algorithm is optimal for this common case, so presumably your recursive algorithm performs less optimally...any ideas what the affect is? (Particularly in terms of runtime and number of runs) The diagram you provided is reaaallly helpful, but there are a couple things that surprise me about it: For the "Recurse on RHS with LHS Fixed" section, why is the last run needed? It looks like you've identified the culprit on the RHS after the 3rd run so the 4th seems unnecessary (and wasteful). For the "Recurse on LHS with RHS Fixed" section, why is the RHS fixed to all 4 examples when you've already identified which of the 4 examples is the culprit? The individual runs will take less time if we don't run the extra examples we've already identified as safe to ignore. Given that this is a departure from a rounds-based strategy, and the formatter output is built around that...do you have thoughts on what the formatter output might look like for this strategy? The diagram is really cool and it makes me wonder if we could do something like that... I'd be interested on seeing some A/B test results of the two strategies on real-world example repos. It'd help me understand how this affects users. Thanks again!
Contributor

### urbanautomaton commented Jun 16, 2015

 Hi Myron, How does this affect the common case where there is one culprit In this instance the algorithms should be identical. The recursive strategy uses the same two initial slices as the rounds-based one, so if there is only a single failure, one of those slices can always be eliminated until the terminating condition is reached. It's only when both of the initial slices are implicated that the algorithms start to differ. There's a minor difference in that the recursive strategy tests the expected failure(s) on their own at the start to eliminate totally independent failures, then proceeds with the knowledge that at least one culprit exists. For singly-dependent failures this makes no difference to the number of runs since the rounds-based strategy also does a run with just the expected failure; it just does it at the end, rather than the beginning. For totally independent failures, the recursive strategy will return in a single run, whereas the rounds-based strategy will take ~log2(n) runs. For multiply-dependent failures the number of runs differs anyway. For the "Recurse on RHS with LHS Fixed" section, why is the last run needed? This is my mistake - the diagram is incorrect at that point, and the last run would be skipped because of the `#find` usage here, combined with the terminating condition that assumes any candidate set of length 1 must be the culprit. why is the RHS fixed to all 4 examples when you've already identified which of the 4 examples is the culprit? At the moment because the code is clearer when the recursive steps are independent. It's certainly an optimisation that could be applied, but my (brief) attempt to do so resulted in a bit of a mess, so I thought I'd submit the PR with the more understandable code. I'll have another look at this. do you have thoughts on what the formatter output might look like for this strategy? I'd love to try something like the diagram. :) The challenging bit would be handling cases where there are more examples than will fit in the terminal width (or 80 chars, or whatever). I'll certainly give it a go - is there a standard width you try to keep your formatter output to? I guess the output also needs to be comprehensible in a non-color terminal, so we couldn't rely solely on colors. I'd be interested on seeing some A/B test results of the two strategies on real-world example repos. Yep, some real-world testing would definitely be nice. I have a good test project (private source unfortunately) that had a bunch of order-dependent failures I can resurrect and try bisecting, but broader testing would be nice, too. I don't suppose the rspec project has any example test suites lying around to hand that are used for this sort of thing? If not I could look at using the github API to gather some, or just make a twitter appeal. :)
Member

### myronmarston commented Jun 16, 2015

 I'd love to try something like the diagram. :) The challenging bit would be handling cases where there are more examples than will fit in the terminal width (or 80 chars, or whatever). I'll certainly give it a go - is there a standard width you try to keep your formatter output to? I guess the output also needs to be comprehensible in a non-color terminal, so we couldn't rely solely on colors. We don't have any width we limit output to. In fact, our formatters commonly go over normal terminal widths and the terminal just wraps it. The dots for the progress formatter are just one line line that the terminal wraps, for example. For the doc formatter, the length is whatever the doc strings provided by the user are, and we do nothing to limit it. Here it could be a problem for it to wrap if it causes it to run together and makes it hard to distinguish different runs. Note that we also support multiple bisect formatters -- currently we have a normal one and a verbose one that is enabled via `--bisect=verbose`. So maybe the diagram could use `--bisect=diagram` or replace the verbose output or something.
Member

### myronmarston commented Jun 16, 2015

 I don't suppose the rspec project has any example test suites lying around to hand that are used for this sort of thing? Sadly, no.

### myronmarston reviewed Jun 16, 2015

 end def bisect_over_recursive(candidate_ids, fixed) return candidate_ids if candidate_ids.length == 1

#### myronmarston Jun 16, 2015

Member

`one?` works nicely for this case:

`return candidate_ids if candidate_ids.one?`

### myronmarston reviewed Jun 16, 2015

 else slices.permutation(2).map do |slice, other_slice| bisect_over_recursive(slice, fixed + other_slice) end.flatten(1)

#### myronmarston Jun 16, 2015

Member

I had to think for a while about this to confirm that there's always only two permutations here. IMO, this would be easier to understand if you explicitly spelled out the two permutations rather than using `permutation(2).map { }.flatten(1)`. This also works out well to let us name the two sides `lhs` and `rhs` which helped me understand the algorithm and your diagram:

```lhs, rhs = candidate_ids.each_slice(slice_size).to_a

# ...

ids_to_ignore = [lhs, rhs].find do |ids|
get_expected_failures_for?(candidate_ids - ids + fixed)
end

if ids_to_ignore
# ...
else
bisect_over_recursive(lhs, fixed + rhs) +
bisect_over_recursive(rhs, fixed + lhs)
end```

This simpler form may also help with this:

At the moment because the code is clearer when the recursive steps are independent. It's certainly an optimisation that could be applied, but my (brief) attempt to do so resulted in a bit of a mess, so I thought I'd submit the PR with the more understandable code. I'll have another look at this.

...since they are no longer generically run through a `map` block the 2nd recursive call can more easily be updated to account for what was learned from the 1st recursive call.

### myronmarston reviewed Jun 16, 2015

 end currently_needed_ids if ids_to_ignore self.remaining_ids = candidate_ids - ids_to_ignore

#### myronmarston Jun 16, 2015

Member

Mutating this on `self` always made me a bit uneasy but it was the best way I could find to make it work before. If you find it works best for the new algorithm, feel free to keep it...but if you can find a clean way to not mutate the state like this, that would be great :).

#### urbanautomaton Jun 16, 2015

Contributor

Yeah - I tried to eliminate it but ended up having problems with `#repro_command_for_currently_needed_ids`. I think to support the graceful abort on SIGINT feature the minimizer will have to store its current progress (since the execution state will be lost on interrupt), so I don't think it's going to be easy to get away from this bit of state.

Maybe it'd be good to write the algorithm purely with local variables, then add explicit `#memoize_progress` calls to write the state, just to make it a bit clearer why the object-level state is needed. I'll give that a go and see if it's any nicer.

Closed

Contributor

### urbanautomaton commented Jun 18, 2015

 Thanks for the code feedback, Myron. I have: Altered the algorithm to track eliminated ids rather than remaining ids, allowing the algorithm to use the knowledge gained from LHS recursion when recursing on the RHS (updated algorithm diagram). Simplified the recursive logic to eliminate the `#permutation` call (don't know what I was thinking there, really :) ) Had another go at eliminating the object-level progress state, but it got messy again. In the end I think if it has to exist for the graceful-exit-on-SIGINT feature, it's less confusing to just accept and use it, rather than having multiple sources of truth for algorithm progress. Updated the progress and verbose formatters as minimally as possible to make sense with the new algorithm - here's a gist with sample output. Patched up the specs and features involving formatter output to match the new display (for the debug formatter, this had the nice side-effect of demonstrating that the algorithms are essentially identical for the singly-dependent case). I'll definitely continue to work on getting a diagrammatic formatter going, but from my POV it'd be nice to do that as a separate PR after doing the real-world testing on the new algorithm and getting this merged (if it's accepted, of course :)). Would that be a good idea? If you'd like it all in one PR though, that's fine with me. The full test suite now passes for me locally; I've fixed the travis failures I understand, but some of the suites seem to be failing despite having only pending failures.
Member

### myronmarston commented Jun 19, 2015

 The build is failing due to missing code coverage. To see the missing coverage, run `script/rspec_with_simplecov` and the open `coverage/index.html`. It looks like the bisect formatter has some uncovered lines. Let me know if you want help with that.

### myronmarston reviewed Jun 19, 2015

 Bisecting over non-failing examples 1-9 .. ignoring examples 6-9 Bisecting over non-failing examples 1-5 .. ignoring examples 4-5 Bisecting over non-failing examples 1-3 .. ignoring example 3 Bisecting over non-failing examples 1-2 . ignoring example 1

#### myronmarston Jun 19, 2015

Member

The duration of each round was really nice before. Can we include the duration on these lines as well?

Also, how does this format it when the set being bisected over isn't contiguous? e.g. when you've fixed the LHS and are running part of the RHS.

#### urbanautomaton Jun 19, 2015

Contributor

Yep, I'll look at putting that back. It dropped out because there wasn't a natural equivalent of `:bisect_round_finished` in the recursive scheme, but I'll work it back in.

There's an illustration of the more complex recursive case here - in the case where both sides need recursing on it just doesn't display any ignorable ids at the moment. I'm thinking of adding text like "Both sides implicated - splitting..." to indicate this situation. Will follow up with a proper test to illustrate.

### myronmarston reviewed Jun 19, 2015

 bisect_over_recursive(candidate_ids) end def bisect_over_recursive(candidate_ids)

#### myronmarston Jun 19, 2015

Member

Instead of `bisect_over(candidate_ids)` and `bisect_over_recursive(candidate_ids)`, WDYT about calling the methods `bisect()` and `bisect_over(candidate_ids)`? The former method is called only once and already assumes the `candidate_ids` list is `non_failing_example_ids` so it's not really needed as an argument. Plus it performs the entire bisect procedure over the entire id set, not just over a subset, so `bisect` seems like a better method name. I also prefer not to put `recursive` in a method name -- that's obvious from seeing it call itself.

#### urbanautomaton Jun 19, 2015

Contributor

Agreed, will rename as suggested.

### myronmarston reviewed Jun 19, 2015

 @@ -8,7 +6,7 @@ module Bisect # repeatedly running different subsets of the suite. class ExampleMinimizer attr_reader :runner, :reporter, :all_example_ids, :failed_example_ids attr_accessor :remaining_ids attr_accessor :eliminated_ids

#### myronmarston Jun 19, 2015

Member

Why the switch from `remaining_ids` to `eliminated_ids`? They are just inverses (bearing the relationship `remaining_ids = non_failing_example_ids - eliminated_ids`) so it seems like whats possible or easy with one is just as possible or easy with the other, right? I think tracking `remaining_ids` has a few advantages over tracking `eliminated_ids`:

• It's what was already in place, so sticking with it means less churn here...which would help keep the diff on the PR focused on behavioral changes rather than including an equivalent structural change.
• `remaining_ids` seems more fundamental to me. After all, that's what the final repo command is based on. In addition, we keep passing `remaining_ids` as part of the notification events so it's clearly important externally. Contrast that to `eliminated_ids` -- as far as I can see it's only used within this class.
• Having `remaining_ids` be a logical attribute that is computed on every call (e.g. for every notification that includes it) is going to be slower than if we tracked it directly. (Admittedly, this perf difference is not going to be noticable.)

Thoughts? If you still wanted to write code in terms of eliminating ids, you could define an `eliminate_ids` helper method that would remove the ids from `remaining_ids`.

#### urbanautomaton Jun 19, 2015

Contributor

Hmm. I'd thought it was because it made calculating the working set for each run easier, but on reflection it doesn't at all. Will revert. :)

Member

### myronmarston commented Jun 19, 2015

 This is looking really good, @urbanautomaton. I haven't had a chance to finish my review and have to get ready for work but I'll try to finish my review later today or tomorrow.
Contributor

### urbanautomaton commented Jun 19, 2015

 Thanks @myronmarston - I've only been able to grab 30 minutes here and there to work on this since submitting it, so a steady drip of feedback is perfect. :)

Contributor

### urbanautomaton commented Jun 21, 2015

 Okay, feedback addressed so far: Renamed bisect_over/bisect_over_recursive as suggested Reverted to tracking remaining ids instead of eliminated ones Restored durations to formatter output for both simple and complex cases Fixed the coverage issues for the bisect formatter For the latter issue, I figured that rather than try to cover formatter edge cases in either the `Bisect::Coordinator` spec or the feature tests, it made most sense to have a dedicated spec for `BisectProgressFormatter`. I added an example for the case I needed, but it now looks weirdly incomplete. Since I'm planning to try a diagrammatic version soon, though, I don't really want to exhaustively test the current behaviour only to immediately rewrite it. Are you happy for me to fill this spec out in a follow-up PR with the diagram formatter?

### myronmarston reviewed Jun 22, 2015

 @@ -106,8 +107,10 @@ Feature: Bisect - ./spec/calculator_7_spec.rb[1:1] - ./spec/calculator_8_spec.rb[1:1] - ./spec/calculator_9_spec.rb[1:1] Checking that failures are order-dependent.. - Running: rspec ./spec/calculator_1_spec.rb[1:1] --seed 1234 (n.nnnn seconds) failure is order-dependent

#### myronmarston Jun 22, 2015

Member

The "failure is order dependent" bit isn't separated at all from the command so it kinda runs together. Thoughts on adding an elipses or something similar to separate?

#### urbanautomaton Jun 23, 2015

Contributor

Sure, I'll sort something out there (might put it on a separate line):

``````Checking that failures are order-dependent...
- Running: <snip>
Failure is order-dependent
``````

(with the above-mentioned changes to the "is order-dependent" copy, of course)

### myronmarston reviewed Jun 22, 2015

 Bisecting over non-failing examples 1-9 .. ignoring examples 6-9 (0.30166 seconds) Bisecting over non-failing examples 1-5 .. ignoring examples 4-5 (0.30306 seconds) Bisecting over non-failing examples 1-3 .. ignoring example 3 (0.33292 seconds) Bisecting over non-failing examples 1-2 . ignoring example 1 (0.16476 seconds)

#### myronmarston Jun 22, 2015

Member

Numbering the rounds was kinda nice before. What do you think about adding back the `Round x:` label?

#### urbanautomaton Jun 23, 2015

Contributor

I left it out because in this algorithm there aren't really sequential rounds (they're executed sequentially, but really the doubly-recursive case is a branch), so I couldn't think of a neat way to map the doubly-recursive case onto the old rounds-counting system.

I tried adding an object-level counter and incrementing it every time `bisect_over` is entered, but I didn't like the extra state (and it didn't really represent what the algorithm was doing very well). I'd be happy to put it back though if you think it's okay.

#### myronmarston Jun 23, 2015

Member

Maybe the round counter can be kept in the formatter? It's really a formatting concern and it seems simpler to handle it there.

### myronmarston reviewed Jun 22, 2015

 @@ -54,12 +54,12 @@ Feature: Bisect Bisect started using options: "--seed 1234" Running suite to find failures... (0.16755 seconds) Starting bisect with 1 failing example and 9 non-failing examples. Checking that failures are order-dependent... failure is order-dependent

#### myronmarston Jun 22, 2015

Member

Technically, we don't actually know that the failure is order dependent -- it merely appears to be. But it could flap for other reasons (e.g. based on randomness or some other indeterminacy) and it could just happen to fail with all examples and pass when run on its own. I don't feel strongly about this, but phrasing this as `failure appears to be order-dependent` is more accurate. Thoughts?

#### urbanautomaton Jun 23, 2015

Contributor

Yep, will make that change.

### myronmarston reviewed Jun 22, 2015

 @@ -5,8 +5,8 @@ module RSpec::Core::Formatters BisectProgressFormatter = Class.new(remove_const :BisectProgressFormatter) do RSpec::Core::Formatters.register self def bisect_round_finished(notification) return super unless notification.round == 3 def bisect_round_started(notification)

#### myronmarston Jun 22, 2015

Member

It looks like you've removed `bisect_round_finished` but kept `bisect_round_started`. Seems a little odd that they aren't paired. Not sure if there's anything to do about that though...

#### urbanautomaton Jun 23, 2015

Contributor

Yeah, I don't really see a way around this one - there are two exit conditions for a round now, and because it's recursive we can only really say a round "ends" when all of its sub-rounds end, which doesn't really help much. So I ended up with `bisect_round_started` and two exit conditions, `bisect_ignoring_ids` and `bisect_multiple_culprits_detected`.

I'll try to think of better names for these to make it clear they're a related set of notifications.

#### urbanautomaton Jun 26, 2015

Contributor

I've renamed these to give them all the same prefix:

• `bisect_round_started`
• `bisect_round_ignoring_ids`
• `bisect_round_detected_multiple_culprits`

There's still the split for the different exit conditions, but at least now they're all more obviously a set.

### myronmarston reviewed Jun 22, 2015

 ids_to_ignore = subsets.find do |ids| notify( :bisect_round_started, :candidates_range => candidates_range(candidate_ids),

#### myronmarston Jun 22, 2015

Member

IMO, this would read more naturally as `candidate_range` rather than `candidates_range` -- kinda like you hear people say "age range" but not "ages range".

#### urbanautomaton Jun 23, 2015

Contributor

Agreed, will change.

### myronmarston reviewed Jun 22, 2015

#### myronmarston Jun 22, 2015

Member

Seeing this conditional makes me think that maybe it would be better to have separate notifications for the two cases, e.g. `bisect_dependency_check_succeeded` and `bisect_dependency_check_failed` or similar. Thoughts?

#### urbanautomaton Jun 23, 2015

Contributor

Heh, I actually had them separate initially exactly as you suggest, but joined them into one notification. Will put them back.

### myronmarston reviewed Jun 22, 2015

#### myronmarston Jun 22, 2015

Member

I'm seeing duplication between the `range_desc = [ ... ].join('-')` bit in the method above and this bit here. WDYT about passing a custom type instead of a `Range` object that has a `description` method that takes care of this formatting? After all, we don't actually need any of the range semantics....just an object with a begin and an end.

### myronmarston reviewed Jun 22, 2015

 end end end end

#### myronmarston Jun 22, 2015

Member

This certainly works to cover the previously uncovered lines but I think I'd prefer to see them covered with a more integrated-style test (probably with the `FakeRunner`, though, for simplicity and faster speed). I see a couple specific benefits of a more integrated style test over this approach:

• One of the big advantages I see from code coverage tools is that they can act as an automated dead code finder. Consider if you test this by having an example in `coordinator_spec.rb` that causes these events to be exercised, and asserts on the total output, rather than testing them directly here. In the future if we make changes that removes the need for these events, the code coverage will notify us these events are not covered and it'll serve it's purpose as a dead code finder. OTOH, when you test them directly, they will always be covered, even if we don't actually need or use them anymore.
• Here you've tested that when the formatter receives the event with `dependent_failures == false`, it outputs "is not order-dependent"...but as far as I can see, you have no test that demonstrates that the event is ever sent with that value. By having a more integrated test you can easily cover both sides of the logic.

Obviously, I don't advocate always using a more integrated style (trade offs and all that) but it has some compelling advantages here, particular with the fact that we have the `FakeRunner`. Thoughts?

#### urbanautomaton Jun 23, 2015

Contributor

Yep, very happy to put it in the coordinator spec instead.

### myronmarston reviewed Jun 22, 2015

 @@ -1,3 +1,5 @@ require 'set'

#### myronmarston Jun 22, 2015

Member

Do you mind using `RSpec::Core::Set` rather than stdlib `Set`?

We try hard to minimize how much of the stdlib we depend on (to limit situations where gem authors could have passing RSpec suites while forgetting to load a bit of stdlib they use) and defined `RSpec::Core::Set` specifically for situations where we want a set-like collection.

Member

👍

#### urbanautomaton Jun 23, 2015

Contributor

Sure, makes total sense. Will reimplement with arrays, I think (unless you're happy for me to start adding methods like `#superset?` to the substitute `Set` implementation...)

### myronmarston reviewed Jun 24, 2015

 ) end it 'reduces a multiply-dependent failure' do

#### myronmarston Jun 24, 2015

Member

Do you mean "multiple-dependent failure" instead of "multiply-dependent failure"?

#### urbanautomaton Jun 24, 2015

Contributor

It's the adverb form (c.f. "singly") - happy to change it though if it's unusual in US English.

#### myronmarston Jun 24, 2015

Member

Please do. I've never heard "multiply" used as anything other than a verb in the arithmetic sense. I assumed it was a typo and I think other US English speakers would think the same.

#### JonRowe Jun 24, 2015

Member

I've never heard of singly either :P

#### urbanautomaton Jun 24, 2015

Contributor

Gadzooks! All right then, away with it. :)

"Multiple-dependent" sounds odd to me in turn, though. How's `it "reduces a failure with multiple dependencies"`?

Member

👍

Contributor

### urbanautomaton commented Jun 26, 2015

 Okay, thanks for all the latest feedback - I think I've covered it all: Round counting is back in, tracked in the formatter An `ExampleRange` object is introduced to help formatting the round info Improved formatting of the dependency check in the debug formatter Assorted naming and wording feedback. Edit: there's a single JRuby failure which appears to be related to my changes (at least, it appears in a bisect-related spec), but is pretty baffling. Don't suppose anyone's seen something similar in the suite before?
Member

### myronmarston commented Jun 26, 2015

 Edit: there's a single JRuby failure which appears to be related to my changes (at least, it appears in a bisect-related spec), but is pretty baffling. Don't suppose anyone's seen something similar in the suite before? We periodically get file descriptor errors like that in our builds for JRuby. I have no idea why, but it's always transient so we just kick the build and run it again. That's what I just did. It would be nice to not have to do that but since it has to do with a JVM resource limit and not a logic bug in RSpec or our specs, I'm not sure what to do about it :(.

### myronmarston reviewed Jun 26, 2015

 it 'detects an unminimisable failure in the minimum number of runs' do counting_minimizer.find_minimal_repro expect(counting_reporter.round_count).to eq(15)

#### myronmarston Jun 26, 2015

Member

It's unclear from this example why 15 is the minimum number of runs. It kinda feels like a magic number right now. Can you add a comment (or update the docstring, if the explanation is short enough) explaining why 15 is the minimum?

### myronmarston reviewed Jun 26, 2015

 | |The minimal reproduction command is: | rspec 2.rb[1:1] EOS

#### myronmarston Jun 26, 2015

Member

Not a merge blocker, but I much prefer if the lines with `|` are indented 2 extra spaces so they nested under the `expect` line. Structurally I think of them as being "child" lines rather than "sibling" lines to the expectation line so indenting them 2 extra spaces makes that clear. I like the `EOS` where it is.

#### urbanautomaton Jun 26, 2015

Contributor

Oh! Yeah, totally agreed, not sure how I managed that. Vim paste-with-autoindent maybe.

Member

### myronmarston commented Jun 26, 2015

 Thanks, @urbanautomaton, this is fantastic! I left two more comments that are cosmetic changes (formatting / comments). I'm completely happy (ecstatic, even) with this otherwise. Do you want to rebase into a smaller number of commits before we merge? It's not strictly needed but given the back-and-forth and the fact that we're up to 26 commits it could help cleanup the history a bit.

### myronmarston reviewed Jun 26, 2015

 if start == finish "example #{start}" else "examples #{[start, finish].join("-")}"

#### myronmarston Jun 26, 2015

Member

This seems simpler as:

`"examples #{start}-#{finish}"`

...which should also be a bit faster -- it's fewer object allocations (no unnecessary array allocation or string allocation for `"-"`) and fewer method calls (no `join` call).

``` Support multiple dependent failures in FakeRunner ```
```To illustrate cases for which a recursive bisection strategy is faster than
subset enumeration, FakeRunner needs to support failures that are dependent on
multiple candidate examples.```
``` 0a16613 ```

### urbanautomaton added some commits Jun 14, 2015

``` Test the number of rounds to 'reduce' pathological case ```
```If an expected failure depends on a large number of the passing examples, the
subset enumeration strategy can take a large number of rounds to minimise the
run.

Add a test to count rounds in the worst case, where a failure depends on all
of the preceding examples. With 8 candidates preceding a failure, a recursive
strategy could return in 15 rounds (1 + 2 + 4 + 8).```
``` 13f89e3 ```
``` Add bisection test case where culprits > 50% of candidates ```
``` e8f8a53 ```
``` Add bisection test case for reducing independent failures ```
``` 8048b6f ```

Contributor

### urbanautomaton commented Jun 28, 2015

 Okay, remaining bits of feedback addressed - I've rebased into a smaller number of commits, too. The rebase isn't perfect - because of all the back-and-forth I wasn't able to do it totally cleanly, so there are a couple of additions in 6a22644 that only make complete sense in the light of cf1482a (relating to formatter notifications). If you're happy with that then we could just leave it - alternatively I could just squash those commits together and have one big commit that lands both the algorithm and formatter modifications at once. Thanks once again for all the feedback - the PR is so much better for it. I've always been appreciative of the hard work that goes into maintaining RSpec, but I don't think I've really understood just how much work that actually is. So, cheers. :)
Member

### myronmarston commented Jun 29, 2015

 Thanks, @urbanautomaton! The way you've squashed this looks great to me. It tells a very clear story, which is quite helpful. I did notice one of the commit messages that I think is slightly inaccurate (in a way that can cause confusion), and if I'm correct, I'd like the message amended before we merge. I'm happy to take care of amending the message (or you can take care of it; I don't particularly care) but can you first confirm for me whether or not I'm correct about the message? I'm talking specifically about the message for 6a22644: ``````[✘] - expected failure (failing) [✔︎] - expected failure (passing) . - uninvolved example % - culprit Failing run: . % . . . . % . [✘] Divide candidates in two and test: . . % . [✔︎] . % . . [✔︎] Both sides implicated Recurse on LHS with RHS fixed . . . . % . [✔︎] . % . . % . [✘] % . . % . [✘] Identified ^ Recurse on RHS with LHS result % % . [✘] % . . [✔︎] % . [✔︎] % % [✘] Identified ^ Both sides returned, final results: % % [✘] `````` Specifically, these lines: ``````Recurse on RHS with LHS result % % . [✘] % . . [✔︎] `````` It shows that it first discovers on that the failure happens with using the RH quarter of the RHS...and then for some reason it proceeds to also try the LH quarter of the RHS, when it shouldn't need to. Given that you use `Enumerable#find` to find test the two quarters, it should abort as soon as it finds a quarter that causes the expected failures so it wouldn't actually try the 2nd quarter, right? Actually, the `find` is run against `[lhs, rhs]` so really I think that should be: ``````Recurse on RHS with LHS result % . . [✔︎] % % . [✘] `````` ...where the LH quarter of the RHS is tested before the RH quarter of the RHS. Not trying to be anal here but the usefulness of the graphical diagram (which is really, really great, BTW!) depends on it being accurate or else it causes confusion so I think it's worth getting right. Can you confirm if my understanding is correct?

### urbanautomaton added some commits Jun 14, 2015

``` Use recursive strategy for bisection ```
```To speed up bisection of complex dependent failures, a recursive bisection
strategy is introduced in place of permutation searching. The strategy works
as follows:

1. Split the candidate set into two slices
2. Test each slice with the expected failure(s)
3. If either slice can be ignored:
- recurse on the other slice
4. If neither slice can be ignored, for each slice:
- recurse on slice, retaining other slice for test runs

This is easier to illustrate graphically - shown below is a case where a
single expected failure depends on two preceding non-failing examples:

[✘] - expected failure (failing)
[✔︎] - expected failure (passing)
.   - uninvolved example
%   - culprit

Failing run:
. % . . . . % . [✘]

Divide candidates in two and test:
. . % . [✔︎]
. % . .         [✔︎]
Both sides implicated

Recurse on LHS with RHS fixed
. . . . % . [✔︎]
. %     . . % . [✘]
%     . . % . [✘]
Identified ^

Recurse on RHS with LHS result
%         % . [✘]
%           . [✔︎]
%         %   [✘]
Identified ^

Both sides returned, final results:
%         %   [✘]```
``` d177934 ```
``` Basic recursive bisect formatter ```
```This modifies BisectProgressFormatter to more closely model the recursive
bisection algorithm, fixing the associated BisectCoordinator specs.```
``` 5a2f476 ```

Contributor

### urbanautomaton commented Jun 29, 2015

 Hmm - I think you're right about the former, but not the latter. The `find` runs on the lhs first, but the block tests whether the failure exists without the tested side: ```ids_to_ignore, duration = track_duration do [lhs, rhs].find do |ids| get_expected_failures_for?(remaining_ids - ids) end end``` So I think your first version is correct: Edit: sorry, I keep trying to reply in too little time. So I think the following is correct: ``````Recurse on RHS with LHS result % % . [✘] % . [✔︎] % % [✘] `````` and have amended the commit accordingly.

### myronmarston added a commit that referenced this pull request Jun 29, 2015

``` Merge pull request #1997 from urbanautomaton/recursive-bisect-strategy ```
`Recursive bisect strategy`
``` 86c4fd4 ```

### myronmarston merged commit `86c4fd4` into rspec:master Jun 29, 2015 2 checks passed

#### 2 checks passed

continuous-integration/appveyor AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Member

### myronmarston commented Jun 29, 2015

 👍 Merged. Thanks again!

### myronmarston added a commit that referenced this pull request Jun 29, 2015

``` Changelog for #1997. ```
`[ci skip]`
``` 9f99b6a ```