Add new --only-failures CLI option #1888

Merged
merged 23 commits into master from rerun-failures on Mar 16, 2015

Conversation

myronmarston
Member

(Note: this PR has changed significantly compared to when it was first opened. See this comment for current status).

This is the start of the implementation of a new --rerun-failures CLI flag that runs just the failures from the last run. This leverages the new example IDs to ensure it selects the exact examples that failed the last time. My hope with this feature is that it "just works" the first time a user decides they want to run just the failures from the last run, w/o having to configure anything in advance to enable it.
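
For context, the example IDs mentioned here are the scoped rerun identifiers RSpec prints for each example; roughly, they look like this (the path is illustrative):

```
./spec/models/user_spec.rb[1:2]
```

i.e. the spec file path plus the example's position within its nested groups.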

As part of the implementation, this creates a new .rspec_last_run directory in the user's home directory that persists the list of failures from the last run so we can use it on the next run. I can imagine us finding other uses for this directory in the future.

There are a lot of open questions:

  • Do we always want to write out the failures to .rspec_last_run? Specific cases I can think of where we may not want to:
    • All specs pass -- do we blow away what was there before and replace it with nothing?
    • The user uses --fail-fast -- in that case it's going to abort after one failure and the user may want to get all failures, and then use --rerun-failures --fail-fast to repeatedly run just one of the remaining failing specs...but if we replaced the persisted failures with a single --fail-fast spec it wouldn't support this workflow.
    • The user runs only one example (via a location or id filter).
  • What do we do if we can't write out the failures list to disk? Emitting an error or warning on every run would be very annoying/noisy for users who don't want to use this option.
  • What do we do if --rerun-failures is passed and there are no persisted failures from a prior run? Do we abort with an error or treat it like "all examples filtered out" and respect the run_all_when_everything_filtered config option?
  • What do we do if the user tries to pass --rerun-failures and also specifies specific spec files to load, or locations/ids to run? Should we disallow that or try to combine them somehow? For example, maybe we could take the intersection of the two? So rspec spec/models --rerun-failures would run just the failures in the model specs. This situation may also relate to whether or not we update the persisted failures every time.
  • The recorded example ids may no longer be valid or may refer to different examples than they did when they were recorded (such as when changing branches or pulling new commits). Should we try to detect this and do anything about it?

/cc @rspec/rspec

@JonRowe
Member

JonRowe commented Mar 1, 2015

  • Do we always want to write out the failures to .rspec_last_run? Specific cases I can think of where we may not want to:
    • All specs pass -- do we blow away what was there before and replace it with nothing?
    • The user uses --fail-fast -- in that case it's going to abort after one failure and the user may want to get all failures, and then use --rerun-failures --fail-fast to repeatedly run just one of the remaining failing specs...but if we replaced the persisted failures with a single --fail-fast spec it wouldn't support this workflow.
    • The user runs only one example (via a location or id filter).

I don't think we should write to this directory unless the user opts in; it'd be pretty rude to just start writing to disk, and I can think of a few scenarios where this would be undesirable (for example, CI systems)

  • What do we do if we can't write out the failures list to disk? Emitting an error or warning on every run would be very annoying/noisy for users who don't want to use this option.

I think if we only write when told to, it's ok to emit an error when we can't.

  • What do we do if --rerun-failures is passed and there are no persisted failures from a prior run? Do we abort with an error or treat it like "all examples filtered out" and respect the run_all_when_everything_filtered config option?

I like the latter idea.

  • What do we do if the user tries to pass --rerun-failures and also specifies specific spec files to load, or locations/ids to run? Should we disallow that or try to combine them somehow? For example, maybe we could take the intersection of the two? So rspec spec/models --rerun-failures would run just the failures in the model specs. This situation may also relate to whether or not we update the persisted failures every time.

Combine them, but we'd need to not blow away any existing cache at this point; I'd use this feature this way.

  • The recorded example ids may no longer be valid or may refer to different examples than they did when they were recorded (such as when changing branches or pulling new commits). Should we try to detect this and do anything about it?

I think we should try, maybe hash the file contents?

@myronmarston
Member Author

I don't think we should write to this directory unless the user opts in; it'd be pretty rude to just start writing to disk, and I can think of a few scenarios where this would be undesirable (for example, CI systems)

My hope was to make this feature usable w/o requiring the user to have the forethought to configure it. They could run rspec --help, notice the --rerun-failures flag and use it immediately.

That said, it makes it hard to report write errors w/o appearing to nag the user about something they haven't opted in to, so I think I'm coming around to the idea of:

  • Adding a config.persistence_directory option that specifies where we should write to. When not set we would avoid writing out the failures and emit an error when the user tries to use --rerun-failures instructing them to set the config option first.
  • Adding this option to the recommended settings section of spec_helper.rb generated by rspec --init.
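
For illustration, here's a rough sketch of the `persistence_directory` opt-in proposed in the first bullet above. The option name and value are placeholders from this discussion, not a settled API (the setting that ultimately landed later in this PR is `example_status_persistence_file_path`):

```ruby
RSpec.configure do |config|
  # Hypothetical opt-in: tell RSpec where it may persist run state.
  # When this is unset, nothing is written to disk and using
  # `--rerun-failures` produces an error telling the user to set it.
  config.persistence_directory = "spec/.rspec_persistence"
end
```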

Combine them, but we'd need to not blow away any existing cache at this point; I'd use this feature this way.

By "combine them" do you mean "take the intersection"?

As for not blowing away the existing cache...I think it's confusing for us to only update the persisted failures sometimes but it's definitely desirable not to lose the state about other examples that weren't part of the run...which gives me an idea for a different direction we could take this feature that I think would work much, much better:

  • Rather than persisting a list of examples that failed on the last run, use persistence to keep track of the status (e.g. pass/fail/pending) of each example the last time it ran (which may or may not have been part of the last run).
  • When the suite completes, we would update the persisted example list using the following logic:
    • For examples that ran as part of that run, we would overwrite their last_run_result (or store the example for the first time if its new).
    • For examples that did not run that time, we would either leave the last_run_result as-is if the example may still exist or delete the example's record from the file if we are sure it no longer exists.
      • If the example's rerun_file_path was one of the loaded spec files, then we can see if the id matches the id of any of the loaded examples (including ones that did not run due to filtering or --fail-fast or whatever). If there's a loaded example with the same id, we'd leave the last_run_result record as-is. Otherwise we can delete the record because we know the example no longer exists.
      • If the example's rerun_file_path was not one of the loaded spec files, we would check and see if File.exist?(rerun_file_path) and either leave the last_run_result record alone, if the file does exist, or delete the record if the file does not exist.
  • The semantics (and --help description) of --rerun-failures would change from "run just the examples that failed on the last run" to "run just the examples that failed the last time they ran". --rerun-failures becomes just another filter. In fact, we may want to call it --only-failures to make it more clear how it works in conjunction with specifying some files or directories. IMO, rspec spec/models --only-failures makes it clear that it's running the failures from spec/models, whereas a user could reasonably think that rspec spec/models --rerun-failures may run all specs in spec/models and all failures (that is, set union rather than set intersection).
  • I can imagine other uses for persisting the last_run_result of each example. For example, in our formatter output we could draw attention to examples that are newly failing or newly passing as opposed to failing again or passing again. Sometimes I find myself scrolling between the output from the prior run and the last run to figure out what the diff is, but we could have our formatters do that work! Obviously, I consider this out of scope for this PR--it's more just one idea of something else we might do with this data once we have it. I may even add metadata[:last_run_result] to expose this directly on example metadata (and then --only-failures would become a simple filter on last_run_result == :failed).
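
A rough sketch of the pruning logic described above; all names here (`records`, `loaded_spec_files`, `loaded_example_ids`) are hypothetical, not the actual implementation:

```ruby
# Decide which persisted records to keep after a run. Each record is a
# hash with :id and :rerun_file_path keys. Illustrative only.
def prune_stale_records(records, loaded_spec_files, loaded_example_ids)
  records.select do |record|
    if loaded_spec_files.include?(record[:rerun_file_path])
      # The file was loaded, so we know exactly which ids still exist
      # (including examples filtered out of this run): keep only those.
      loaded_example_ids.include?(record[:id])
    else
      # The file wasn't loaded: keep the record as long as the file
      # still exists on disk; if it's gone, the example surely is too.
      File.exist?(record[:rerun_file_path])
    end
  end
end
```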

I think we should try, maybe hash the file contents?

Problem is, the user is going to want to edit the spec file to address the failures, and then rerun them...which would produce a different checksum just like changing branches would. I think the best we could do is to store the example description w/ the id, and use that as a signal to infer whether the id refers to the same example as before or not.

That would probably work better....but I'm concerned that it is simply a heuristic and may make this feature act in confusing, inconsistent ways. For example, the user may change the doc string to correct a typo and they would still consider it to be the same example. Ultimately, I think "is this the same example as before?" is a question that can only be answered by human judgement, and we're better off trying to avoid guessing. The downsides of running the wrong examples after changing branches aren't really that high -- it would just run some passing examples and skip some failed ones. We can always revisit if someone comes up with a way to detect this that we're satisfied with.

Thoughts on the alternate direction I've proposed above?

@JonRowe
Member

JonRowe commented Mar 2, 2015

By "combine them" do you mean "take the intersection"?

Yes

so I think I'm coming around to the idea of:

  • Adding a config.persistence_directory option that specifies where we should write to. When not set we would avoid writing out the failures and emit an error when the user tries to use --rerun-failures instructing them to set the config option first.
  • Adding this option to the recommended settings section of spec_helper.rb generated by rspec --init.

Yep, I'm ok with this, and when not set we should disable all of the below for performance.

As for not blowing away the existing cache...I think it's confusing for us to only update the persisted failures sometimes but it's definitely desirable not to lose the state about other examples that weren't part of the run...which gives me an idea for a different direction we could take this feature that I think would work much, much better:

  • Rather than persisting a list of examples that failed on the last run, use persistence to keep track of the status (e.g. pass/fail/pending) of each example the last time it ran (which may or may not have been part of the last run).

And the last time it ran, maybe? We could log a quick stat line: `pass/fail, #{Time.now}, #{time_taken}`

  • When the suite completes, we would update the persisted example list using the following logic:
    • For examples that ran as part of that run, we would overwrite their last_run_result (or store the example for the first time if its new).
    • For examples that did not run that time, we would either leave the last_run_result as-is if the example may still exist or delete the example's record from the file if we are sure it no longer exists.
      • If the example's rerun_file_path was one of the loaded spec files, then we can see if the id matches the id of any of the loaded examples (including ones that did not run due to filtering or --fail-fast or whatever). If there's a loaded example with the same id, we'd leave the last_run_result record as-is. Otherwise we can delete the record because we know the example no longer exists.
      • If the example's rerun_file_path was not one of the loaded spec files, we would check and see if File.exist?(rerun_file_path) and either leave the last_run_result record alone, if the file does exist, or delete the record if the file does not exist.

As long as we can turn all this off in case it adds a large performance overhead (which is what I mean by turning it off when `persistence_directory` isn't set above :) )

  • The semantics (and --help description) of --rerun-failures would change from "run just the examples that failed on the last run" to "run just the examples that failed the last time they ran". --rerun-failures becomes just another filter. In fact, we may want to call it --only-failures to make it more clear how it works in conjunction with specifying some files or directories. IMO, rspec spec/models --only-failures makes it clear that it's running the failures from spec/models, whereas a user could reasonably think that rspec spec/models --rerun-failures may run all specs in spec/models and all failures (that is, set union rather than set intersection).

This makes more sense

Problem is, the user is going to want to edit the spec file to address the failures, and then rerun them...which would produce a different checksum just like changing branches would. I think the best we could do is to store the example description w/ the id, and use that as a signal to infer whether the id refers to the same example as before or not.

Sure, that works :) We can just warn them when it changes or

@myronmarston
Member Author

And the last time it ran, maybe? We could log a quick stat line: `pass/fail, #{Time.now}, #{time_taken}`.

I'm thinking we may want to add more data like that but I'd rather wait to add that until we have a use for it. For now I just want to get a serialization format in place that will allow us to easily add that in the future. Or maybe I'll change my mind and add it now :).

This does bring up a key question: how should we serialize this? JSON would be a natural choice but we'd like to avoid loading that stdlib, of course. We could vendor okjson, perhaps...although I'm interested in other ideas. A single JSON string in a file isn't terribly readable if you pop it open in an editor :(.

Sure, that works :) We can just warn them when it changes or

Or what?

Anyhow, I advocate putting no effort into "example id references a different example than before" detection. Too many open questions, edge cases, etc. We can revisit in the future.

@JonRowe
Member

JonRowe commented Mar 2, 2015

Er, or not run them, not sure how that got truncated!

@JonRowe
Member

JonRowe commented Mar 3, 2015

This does bring up a key question: how should we serialize this? JSON would be a natural choice but we'd like to avoid loading that stdlib, of course. We could vendor okjson, perhaps...although I'm interested in other ideas. A single JSON string in a file isn't terribly readable if you pop it open in an editor :(.

I'd stick with something simpler. I mean, one example per line with a few values may as well be CSV!

@myronmarston
Member Author

I'd stick with something simpler. I mean, one example per line with a few values may as well be CSV!

That would probably work. I worry a bit about dealing with quoting/escaping (e.g. a file name that has a comma in it...or should we not support that?), though. To do proper CSV parsing with all the edge cases we'd probably have to load the CSV stdlib, but maybe we can get away with a simple `File.read(...).lines.map { |line| line.split(",") }`.
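
A minimal sketch of that line-based approach, assuming a plain comma separator and no escaping (so ids containing commas would break it; purely illustrative):

```ruby
# Write one example per line, e.g. "id,status". No quoting or escaping.
def dump_statuses(rows, path)
  File.open(path, "w") do |f|
    rows.each { |row| f.puts(row.join(",")) }
  end
end

def load_statuses(path)
  File.read(path).lines.map { |line| line.chomp.split(",") }
end

# dump_statuses([["./spec/a_spec.rb[1:1]", "failed"]], "statuses.txt")
# load_statuses("statuses.txt") #=> [["./spec/a_spec.rb[1:1]", "failed"]]
```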

@JonRowe
Member

JonRowe commented Mar 3, 2015

Well, we don't have to use `,` of course; we could pick a char less likely to be in filenames (although `,` is probably pretty rare, it is valid in a filename).

But I agree we should keep it as simple as possible so we can do a simple split!

In my first pass I passed in the index but later changed to passing in an index provider and forgot to update the argument name.
- Stop memoizing `example`. It made it difficult to construct multiple different examples.
- Explicitly create the examples where they are needed.
The result object is really just a value object, so why fake it out?
@myronmarston
Member Author

OK, I've pushed what I've done so far for this. It's not quite done yet but I wanted to get Travis builds going and give people a chance to review what I have so far.

Still TODO:

  • Make --only-failures skip the loading of files that have no failures (this can make it significantly faster!).
  • Consider changing the implementation of that last bullet point: what I just pushed works only when rspec --only-failures is run (with no file or dir arg) but thinking about it more, when rspec spec/models --only-failures is run it would still be nice to only load the files in spec/models that have failures.
  • Set example.metadata[:last_run_status] to "unknown" when example_status_persistence_file_path is not configured. Currently it is set to nil, but unknown is what we use for examples for which we don't know the status yet (e.g. they are filtered out or have never run for some reason) so to be consistent we should set it to "unknown".
  • Emit a useful error message when --only-failures is used and example_status_persistence_file_path has not been configured. (This is tricky, though: when the CLI option is processed, spec_helper probably hasn't been loaded yet, and that's where example_status_persistence_file_path would usually be set.)
  • Emit a useful error message when we try to read from or write to the persistence file and it fails for some reason (user permissions or whatever). Just letting it fail with the full RSpec backtrace won't be very friendly, as the backtrace will entirely be our code, so we want a friendlier error.
  • Update the generated spec_helper.rb so that it sets example_status_persistence_file_path in the recommended settings section.
  • Fix world.last_run_statuses and world.spec_files_with_failures so that if they are first accessed before config.example_status_persistence_file_path is set, they will work properly after that has been set. I may move them onto the config object to facilitate this.
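
For anyone following along, here's roughly what the finished feature is intended to look like; the persistence file path is just an example:

```ruby
# spec_helper.rb -- opt in to status persistence so that
# `--only-failures` has data to filter on.
RSpec.configure do |config|
  config.example_status_persistence_file_path = "spec/examples.txt"
end
```

With that set, `rspec --only-failures` reruns only the examples that failed the last time they ran, and `rspec spec/models --only-failures` runs just the failing examples within `spec/models` (intersection, not union).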

@emilebosch

Oh my is this finally going to be there then 👯

@myronmarston
Member Author

/cc @JonRowe

@myronmarston force-pushed the rerun-failures branch 4 times, most recently from f749f79 to 312fd77, on March 12, 2015 at 17:50.
Loading ALL spec files when only a handful have failures is inefficient and could make the run take significantly longer. Note that we only do this if no file or directory arg is passed to `rspec`. If the user passes anything, we load what they tell us to load.
`configuration_spec.rb` is too huge. The API needs to remain large to support the configurability of RSpec, but we don’t need to keep all the specs in one file. I want to start breaking off different chunks into their own files. This is a first step towards that.
It’s easier to make this work by moving the `spec_files_with_failures` and the `last_run_statuses` methods off of `world` and over to `configuration`.
It made more sense when these methods were on `world` as they were before, but now they are on config so we want to avoid that.
It’s more appropriate there.
@myronmarston force-pushed the rerun-failures branch 2 times, most recently from 22b018d to 85bcc85, on March 15, 2015 at 05:40.
Before, we only did so when no file or directory arg was passed to `rspec`, but it's better to take the intersection of files matching the provided args and files with failures.
This ensures that the value is consistent — before, for an unknown case it could either be `nil` or `"unknown"`.

We've also moved the constant into configuration, so that we can avoid the cost of loading `example_status_persister` (via constant access, since it is set up to be autoloaded) if the user hasn't configured the feature.
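
A rough sketch of the intersection behavior the commit above describes; the helper and argument names are illustrative, not the actual internals:

```ruby
# With no file/dir args, load every spec file known to have failures;
# otherwise load only the requested files that have failures.
def files_to_load(requested_files, spec_files_with_failures)
  return spec_files_with_failures if requested_files.empty?
  requested_files & spec_files_with_failures
end
```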
@myronmarston
Member Author

OK, this is ready for a review, @rspec/rspec. I've done all the stuff I planned to do.

@myronmarston changed the title from "[WIP] Rerun failures" to "Add new --only-failures CLI option" on Mar 15, 2015.
@JonRowe
Member

JonRowe commented Mar 15, 2015

I've had two passes over this now, and can't fault it, but you know what they say about large PRs ;)

@myronmarston force-pushed the rerun-failures branch 2 times, most recently from 7f78843 to 360910a, on March 15, 2015 at 21:50.
myronmarston added a commit that referenced this pull request Mar 16, 2015
Add new --only-failures CLI option
@myronmarston merged commit 4a7e748 into master on Mar 16, 2015.
@myronmarston deleted the rerun-failures branch on March 16, 2015 at 04:45.
@myronmarston
Member Author

Thanks, @JonRowe. I guess I should have left some extra trailing blank lines on some files so you'd have something to point out? Anyhow, I got it green and merged it.

@JonRowe
Member

JonRowe commented Mar 16, 2015

I was referring to the saying:

Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.

:P

MatheusRich pushed a commit to MatheusRich/rspec-core that referenced this pull request Oct 30, 2020
Add new --only-failures CLI option