Add new --only-failures CLI option #1888

Merged
merged 23 commits into master from rerun-failures on Mar 16, 2015

Conversation

myronmarston
Member

(Note: this PR has changed significantly compared to when it was first opened. See this comment for current status).

This is the start of the implementation of a new --rerun-failures CLI flag that runs just the failures from the last run. This leverages the new example IDs to ensure it selects the exact examples that failed the last time. My hope with this feature is that it "just works" the first time a user decides they want to run just the failures from the last run, w/o having to configure anything in advance to enable it.
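
For context, the example IDs mentioned here are the scoped rerun identifiers RSpec prints for each example; roughly, they look like this (the path is illustrative):

```
./spec/models/user_spec.rb[1:2]
```

i.e. the spec file path plus the example's position within its nested groups.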

As part of the implementation, this creates a new .rspec_last_run directory in the user's home directory that persists the list of failures from the last run so we can use it on the next run. I can imagine us finding other uses for this directory in the future.

There are a lot of open questions:

  • Do we always want to write out the failures to .rspec_last_run? Specific cases I can think of where we may not want to:
    • All specs pass -- do we blow away what was there before and replace it with nothing?
    • The user uses --fail-fast -- in that case it's going to abort after one failure and the user may want to get all failures, and then use --rerun-failures --fail-fast to repeatedly run just one of the remaining failing specs...but if we replaced the persisted failures with a single --fail-fast spec it wouldn't support this workflow.
    • The user runs only one example (via a location or id filter).
  • What do we do if we can't write out the failures list to disk? Emitting an error or warning on every run would be very annoying/noisy for users who don't want to use this option.
  • What do we do if --rerun-failures is passed and there are no persisted failures from a prior run? Do we abort with an error or treat it like "all examples filtered out" and respect the run_all_when_everything_filtered config option?
  • What do we do if the user tries to pass --rerun-failures and also specifies specific spec files to load, or locations/ids to run? Should we disallow that or try to combine them somehow? For example, maybe we could take the intersection of the two? So rspec spec/models --rerun-failures would run just the failures in the model specs. This situation may also relate to whether or not we update the persisted failures every time.
  • The recorded example ids may no longer be valid or may refer to different examples than they did when they were recorded (such as when changing branches or pulling new commits). Should we try to detect this and do anything about it?

/cc @rspec/rspec

@JonRowe
Member

JonRowe commented Mar 1, 2015

  • Do we always want to write out the failures to .rspec_last_run? Specific cases I can think of where we may not want to:
    • All specs pass -- do we blow away what was there before and replace it with nothing?
    • The user uses --fail-fast -- in that case it's going to abort after one failure and the user may want to get all failures, and then use --rerun-failures --fail-fast to repeatedly run just one of the remaining failing specs...but if we replaced the persisted failures with a single --fail-fast spec it wouldn't support this workflow.
    • The user runs only one example (via a location or id filter).

I don't think we should write to this directory unless the user opts in; it'd be pretty rude to just start writing to disk, and I can think of a few scenarios where this would be undesirable (for example, CI systems)

  • What do we do if we can't write out the failures list to disk? Emitting an error or warning on every run would be very annoying/noisy for users who don't want to use this option.

I think if we only write when told to, it's ok to emit an error when we can't.

  • What do we do if --rerun-failures is passed and there are no persisted failures from a prior run? Do we abort with an error or treat it like "all examples filtered out" and respect the run_all_when_everything_filtered config option?

I like the latter idea.

  • What do we do if the user tries to pass --rerun-failures and also specifies specific spec files to load, or locations/ids to run? Should we disallow that or try to combine them somehow? For example, maybe we could take the intersection of the two? So rspec spec/models --rerun-failures would run just the failures in the model specs. This situation may also relate to whether or not we update the persisted failures every time.

Combine them, but we'd need to not blow away any existing cache at this point; I'd use this feature this way.

  • The recorded example ids may no longer be valid or may refer to different examples than they did when they were recorded (such as when changing branches or pulling new commits). Should we try to detect this and do anything about it?

I think we should try, maybe hash the file contents?

@myronmarston
Member Author

I don't think we should write to this directory unless the user opts in; it'd be pretty rude to just start writing to disk, and I can think of a few scenarios where this would be undesirable (for example, CI systems)

My hope was to make this feature usable w/o requiring the user to have the forethought to configure it. They could run rspec --help, notice the --rerun-failures flag and use it immediately.

That said, it makes it hard to report write errors w/o appearing to nag the user about something they haven't opted in to, so I think I'm coming around to the idea of:

  • Adding a config.persistence_directory option that specifies where we should write to. When not set we would avoid writing out the failures and emit an error when the user tries to use --rerun-failures instructing them to set the config option first.
  • Adding this option to the recommended settings section of spec_helper.rb generated by rspec --init.
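
For illustration, here's a rough sketch of the `persistence_directory` opt-in proposed in the first bullet above. The option name and value are placeholders from this discussion, not a settled API (the setting that ultimately landed later in this PR is `example_status_persistence_file_path`):

```ruby
RSpec.configure do |config|
  # Hypothetical opt-in: tell RSpec where it may persist run state.
  # When this is unset, nothing is written to disk and using
  # `--rerun-failures` produces an error telling the user to set it.
  config.persistence_directory = "spec/.rspec_persistence"
end
```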

Combine them, but we'd need to not blow away any existing cache at this point; I'd use this feature this way.

By "combine them" do you mean "take the intersection"?

As for not blowing away the existing cache...I think it's confusing for us to only update the persisted failures sometimes but it's definitely desirable not to lose the state about other examples that weren't part of the run...which gives me an idea for a different direction we could take this feature that I think would work much, much better:

  • Rather than persisting a list of examples that failed on the last run, use persistence to keep track of the status (e.g. pass/fail/pending) of each example the last time it ran (which may or may not have been part of the last run).
  • When the suite completes, we would update the persisted example list using the following logic:
    • For examples that ran as part of that run, we would overwrite their last_run_result (or store the example for the first time if its new).
    • For examples that did not run that time, we would either leave the last_run_result as-is if the example may still exist or delete the example's record from the file if we are sure it no longer exists.
      • If the example's rerun_file_path was one of the loaded spec files, then we can see if the id matches the id of any of the loaded examples (including ones that did not run due to filtering or --fail-fast or whatever). If there's a loaded example with the same id, we'd leave the last_run_result record as-is. Otherwise we can delete the record because we know the example no longer exists.
      • If the example's rerun_file_path was not one of the loaded spec files, we would check and see if File.exist?(rerun_file_path) and either leave the last_run_result record alone, if the file does exist, or delete the record if the file does not exist.
  • The semantics (and --help description) of --rerun-failures would change from "run just the examples that failed on the last run" to "run just the examples that failed the last time they ran". --rerun-failures becomes just another filter. In fact, we may want to call it --only-failures to make it more clear how it works in conjunction with specifying some files or directories. IMO, rspec spec/models --only-failures makes it clear that it's running the failures from spec/models, whereas a user could reasonably think that rspec spec/models --rerun-failures may run all specs in spec/models and all failures (that is, set union rather than set intersection).
  • I can imagine other uses for persisting the last_run_result of each example. For example, in our formatter output we could draw attention to examples that are newly failing or newly passing as opposed to failing again or passing again. Sometimes I find myself scrolling between the output from the prior run and the last run to figure out what the diff is, but we could have our formatters do that work! Obviously, I consider this out of scope for this PR--it's more just one idea of something else we might do with this data once we have it. I may even add metadata[:last_run_result] to expose this directly on example metadata (and then --only-failures would become a simple filter on last_run_result == :failed).
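
A rough sketch of the pruning logic described above; all names here (`records`, `loaded_spec_files`, `loaded_example_ids`) are hypothetical, not the actual implementation:

```ruby
# Decide which persisted records to keep after a run. Each record is a
# hash with :id and :rerun_file_path keys. Illustrative only.
def prune_stale_records(records, loaded_spec_files, loaded_example_ids)
  records.select do |record|
    if loaded_spec_files.include?(record[:rerun_file_path])
      # The file was loaded, so we know exactly which ids still exist
      # (including examples filtered out of this run): keep only those.
      loaded_example_ids.include?(record[:id])
    else
      # The file wasn't loaded: keep the record as long as the file
      # still exists on disk; if it's gone, the example surely is too.
      File.exist?(record[:rerun_file_path])
    end
  end
end
```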

I think we should try, maybe hash the file contents?

Problem is, the user is going to want to edit the spec file to address the failures, and then rerun them...which would produce a different checksum just like changing branches would. I think the best we could do is to store the example description w/ the id, and use that as a signal to infer whether the id refers to the same example as before or not.

That would probably work better....but I'm concerned that it is simply a heuristic and may make this feature act in confusing, inconsistent ways. For example, the user may change the doc string to correct a typo and they would still consider it to be the same example. Ultimately, I think "is this the same example as before?" is a question that can only be answered by human judgement, and we're better off trying to avoid guessing. The downsides of running the wrong examples after changing branches aren't really that high -- it would just run some passing examples and skip some failed ones. We can always revisit if someone comes up with a way to detect this that we're satisfied with.

Thoughts on the alternate direction I've proposed above?

@JonRowe
Member

JonRowe commented Mar 2, 2015

By "combine them" do you mean "take the intersection"?

Yes

so I think I'm coming around to the idea of:

  • Adding a config.persistence_directory option that specifies where we should write to. When not set we would avoid writing out the failures and emit an error when the user tries to use --rerun-failures instructing them to set the config option first.
  • Adding this option to the recommended settings section of spec_helper.rb generated by rspec --init.

Yep, I'm ok with this, and when not set we should disable all of the below for performance.

As for not blowing away the existing cache...I think it's confusing for us to only update the persisted failures sometimes but it's definitely desirable not to lose the state about other examples that weren't part of the run...which gives me an idea for a different direction we could take this feature that I think would work much, much better:

  • Rather than persisting a list of examples that failed on the last run, use persistence to keep track of the status (e.g. pass/fail/pending) of each example the last time it ran (which may or may not have been part of the last run).

And the last time it ran, maybe? We could log a quick stat line: `pass/fail, #{Time.now}, #{time_taken}`

  • When the suite completes, we would update the persisted example list using the following logic:
    • For examples that ran as part of that run, we would overwrite their last_run_result (or store the example for the first time if its new).
    • For examples that did not run that time, we would either leave the last_run_result as-is if the example may still exist or delete the example's record from the file if we are sure it no longer exists.
      • If the example's rerun_file_path was one of the loaded spec files, then we can see if the id matches the id of any of the loaded examples (including ones that did not run due to filtering or --fail-fast or whatever). If there's a loaded example with the same id, we'd leave the last_run_result record as-is. Otherwise we can delete the record because we know the example no longer exists.
      • If the example's rerun_file_path was not one of the loaded spec files, we would check and see if File.exist?(rerun_file_path) and either leave the last_run_result record alone, if the file does exist, or delete the record if the file does not exist.

As long as we can turn all this off in case it adds a large performance overhead (which is what I mean by turning it off when `persistence_directory` isn't set above :) )

  • The semantics (and --help description) of --rerun-failures would change from "run just the examples that failed on the last run" to "run just the examples that failed the last time they ran". --rerun-failures becomes just another filter. In fact, we may want to call it --only-failures to make it more clear how it works in conjunction with specifying some files or directories. IMO, rspec spec/models --only-failures makes it clear that it's running the failures from spec/models, whereas a user could reasonably think that rspec spec/models --rerun-failures may run all specs in spec/models and all failures (that is, set union rather than set intersection).

This makes more sense

Problem is, the user is going to want to edit the spec file to address the failures, and then rerun them...which would produce a different checksum just like changing branches would. I think the best we could do is to store the example description w/ the id, and use that as a signal to infer whether the id refers to the same example as before or not.

Sure, that works :) We can just warn them when it changes or

@myronmarston
Member Author

And the last time it ran, maybe? We could log a quick stat line: `pass/fail, #{Time.now}, #{time_taken}`.

I'm thinking we may want to add more data like that but I'd rather wait to add that until we have a use for it. For now I just want to get a serialization format in place that will allow us to easily add that in the future. Or maybe I'll change my mind and add it now :).

This does bring up a key question: how should we serialize this? JSON would be a natural choice but we'd like to avoid loading that stdlib, of course. We could vendor okjson, perhaps...although I'm interested in other ideas. A single JSON string in a file isn't terribly readable if you pop it open in an editor :(.

Sure, that works :) We can just warn them when it changes or

Or what?

Anyhow, I advocate putting no effort into "example id references a different example than before" detection. Too many open questions, edge cases, etc. We can revisit in the future.

@JonRowe
Member

JonRowe commented Mar 2, 2015

Er, or not run them, not sure how that got truncated!

@JonRowe
Member

JonRowe commented Mar 3, 2015

This does bring up a key question: how should we serialize this? JSON would be a natural choice but we'd like to avoid loading that stdlib, of course. We could vendor okjson, perhaps...although I'm interested in other ideas. A single JSON string in a file isn't terribly readable if you pop it open in an editor :(.

I'd stick with something simpler. I mean, one example per line with a few values may as well be CSV!

@myronmarston
Member Author

I'd stick with something simpler. I mean, one example per line with a few values may as well be CSV!

That would probably work. I worry a bit about dealing with quoting/escaping (e.g. a file name that has a comma in it...or should we not support that?), though. To do proper CSV parsing with all the edge cases we'd probably have to load the CSV stdlib, but maybe we can get away with a simple `File.read(...).lines.map { |line| line.split(",") }`.
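
A minimal sketch of that line-based approach, assuming a plain comma separator and no escaping (so ids containing commas would break it; purely illustrative):

```ruby
# Write one example per line, e.g. "id,status". No quoting or escaping.
def dump_statuses(rows, path)
  File.open(path, "w") do |f|
    rows.each { |row| f.puts(row.join(",")) }
  end
end

def load_statuses(path)
  File.read(path).lines.map { |line| line.chomp.split(",") }
end

# dump_statuses([["./spec/a_spec.rb[1:1]", "failed"]], "statuses.txt")
# load_statuses("statuses.txt") #=> [["./spec/a_spec.rb[1:1]", "failed"]]
```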

@JonRowe
Member

JonRowe commented Mar 3, 2015

Well, we don't have to use `,` of course; we could pick a char less likely to be in filenames (although `,` is probably pretty rare, it is valid in a filename).

But I agree we should keep it as simple as possible so we can do a simple split!

In my first pass I passed in the index but later changed to passing in an index provider and forgot to update the argument name.
- Stop memoizing `example`. It made it difficult to construct multiple different examples.
- Explicitly create the examples where they are needed.
The result object is really just a value object, so why fake it out?
@myronmarston
Member Author

OK, I've pushed what I've done so far for this. It's not quite done yet but I wanted to get Travis builds going and give people a chance to review what I have so far.

Still TODO:

  • Make --only-failures skip the loading of files that have no failures (this can make it significantly faster!).
  • Consider changing the implementation of that last bullet point: what I just pushed works only when rspec --only-failures is run (with no file or dir arg) but thinking about it more, when rspec spec/models --only-failures is run it would still be nice to only load the files in spec/models that have failures.
  • Set example.metadata[:last_run_status] to "unknown" when example_status_persistence_file_path is not configured. Currently it is set to nil, but unknown is what we use for examples for which we don't know the status yet (e.g. they are filtered out or have never run for some reason) so to be consistent we should set it to "unknown".
  • Emit a useful error message when --only-failures is used and example_status_persistence_file_path has not been configured. (This is tricky, though: when the CLI option is processed, spec_helper probably hasn't been loaded yet, and that's where example_status_persistence_file_path would usually be set.)
  • Emit a useful error message when we try to read from or write to the persistence file and it fails for some reason (user permissions or whatever). Just letting it fail with the full RSpec backtrace won't be very friendly, as the backtrace will entirely be our code, so we want a friendlier error.
  • Update the generated spec_helper.rb so that it sets example_status_persistence_file_path in the recommended settings section.
  • Fix world.last_run_statuses and world.spec_files_with_failures so that if they are first accessed before config.example_status_persistence_file_path is set, they will work properly after that has been set. I may move them onto the config object to facilitate this.
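
For anyone following along, here's roughly what the finished feature is intended to look like; the persistence file path is just an example:

```ruby
# spec_helper.rb -- opt in to status persistence so that
# `--only-failures` has data to filter on.
RSpec.configure do |config|
  config.example_status_persistence_file_path = "spec/examples.txt"
end
```

With that set, `rspec --only-failures` reruns only the examples that failed the last time they ran, and `rspec spec/models --only-failures` runs just the failing examples within `spec/models` (intersection, not union).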

@emilebosch

Oh my is this finally going to be there then 👯

@myronmarston
Member Author

/cc @JonRowe

@myronmarston force-pushed the rerun-failures branch 4 times, most recently from f749f79 to 312fd77, on March 12, 2015 at 17:50.
Loading ALL spec files when only a handful have failures is inefficient and could make the run take significantly longer. Note that we only do this if no file or directory arg is passed to `rspec`. If the user passes anything, we load what they tell us to load.
`configuration_spec.rb` is too huge. The API needs to remain large to support the configurability of RSpec, but we don’t need to keep all the specs in one file. I want to start breaking off different chunks into their own files. This is a first step towards that.
It’s easier to make this work by moving the `spec_files_with_failures` and the `last_run_statuses` methods off of `world` and over to `configuration`.
It made more sense when these methods were on `world` as they were before, but now they are on config so we want to avoid that.
It’s more appropriate there.
@myronmarston force-pushed the rerun-failures branch 2 times, most recently from 22b018d to 85bcc85, on March 15, 2015 at 05:40.
Before, we only did so when no file or directory arg was passed to `rspec`, but it's better to take the intersection of files matching the provided args and files with failures.
This ensures that the value is consistent — before, for an unknown case it could either be `nil` or `"unknown"`.

We've also moved the constant into configuration, so that we can avoid the cost of loading `example_status_persister` (via constant access, since it is set up to be autoloaded) if the user hasn't configured the feature.
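
A rough sketch of the intersection behavior the commit above describes; the helper and argument names are illustrative, not the actual internals:

```ruby
# With no file/dir args, load every spec file known to have failures;
# otherwise load only the requested files that have failures.
def files_to_load(requested_files, spec_files_with_failures)
  return spec_files_with_failures if requested_files.empty?
  requested_files & spec_files_with_failures
end
```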
@myronmarston
Member Author

OK, this is ready for a review, @rspec/rspec. I've done all the stuff I planned to do.

@myronmarston changed the title from "[WIP] Rerun failures" to "Add new --only-failures CLI option" on Mar 15, 2015.
@JonRowe
Member

JonRowe commented Mar 15, 2015

I've had two passes over this now, and can't fault it, but you know what they say about large PRs ;)

@myronmarston force-pushed the rerun-failures branch 2 times, most recently from 7f78843 to 360910a, on March 15, 2015 at 21:50.
myronmarston added a commit that referenced this pull request Mar 16, 2015
Add new --only-failures CLI option
@myronmarston merged commit 4a7e748 into master on Mar 16, 2015.
@myronmarston deleted the rerun-failures branch on March 16, 2015 at 04:45.
@myronmarston
Member Author

Thanks, @JonRowe. I guess I should have left some extra trailing blank lines on some files so you'd have something to point out? Anyhow, I got it green and merged it.

@JonRowe
Member

JonRowe commented Mar 16, 2015

I was referring to the saying:

Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.

:P

MatheusRich pushed a commit to MatheusRich/rspec-core that referenced this pull request Oct 30, 2020
Add new --only-failures CLI option