Add support for "--failed" in ExUnit #7373

Closed · wants to merge 5 commits into master

@myronmarston
Contributor

myronmarston commented Feb 20, 2018

This is a follow up to #7082 that adds support for --only-failures using the new ExUnit manifest.

Note that I put some effort into clearly laying out the history in this PR, with explanatory commit messages, so I'd recommend merging instead of doing a squash+merge but ultimately it's up to y'all, of course.

end
@spec add_test(t, ExUnit.Test.t()) :: t
def add_test(manifest, %ExUnit.Test{tags: %{file: file}})
    when not is_binary(file),
    do: manifest

@myronmarston

myronmarston Feb 20, 2018

Contributor

It's unfortunate this function clause is needed, but without it, ex_unit_test.exs experiences order-dependent failures, due to this test. Apparently when you have a setup block like setup do: {:ok, file: :foo} it still updates the test's :file tag even though it also triggers an exception telling the user about the mistake. The file: :foo tag would then get stored in the manifest as the file (as we always keep the results from the new manifest). Then, on the next test run, when ExUnit.Manifest.merge/2 is inspecting the old manifest entries, it would try to check File.regular?(:foo) and get an exception. This function clause ignores the test entirely when its :file tag is invalid, working around the problem.

That said, while this fixes the exception, I think a problem still remains. If a user did setup do: {:ok, file: "string"} we would store the test in the manifest with the wrong :file value. I haven't spent any time looking into it yet, but I think a better fix would be to prevent the tag from updating in this case so that the file passed to this function is always the real file the test came from.
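The guard-clause workaround can be illustrated with a standalone sketch (ManifestSketch and its plain-map test shape are invented here; the real ExUnit.Manifest uses records and more structure):

```elixir
# Standalone sketch of the workaround: a test whose :file tag is not a
# binary is ignored instead of being stored with a bogus path that would
# later crash the File.regular?/1 check during manifest merging.
defmodule ManifestSketch do
  def add_test(manifest, %{tags: %{file: file}}) when not is_binary(file),
    do: manifest

  def add_test(manifest, %{tags: %{file: file}}),
    do: [file | manifest]
end

# A setup block that wrongly returned {:ok, file: :foo} leaves the tag as
# an atom, so the test is skipped rather than recorded:
[] = ManifestSketch.add_test([], %{tags: %{file: :foo}})
["a_test.exs"] = ManifestSketch.add_test([], %{tags: %{file: "a_test.exs"}})
```

The second clause shows the normal path: a binary :file tag is recorded as usual.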

@@ -4,19 +4,41 @@ defmodule ExUnit.Manifest do
  import Record
  defrecord :entry, [:last_run_status, :file]
  @opaque t :: [{test_id, entry}]
  defstruct entries: [], path: nil

@josevalim

josevalim Feb 20, 2018

Member

I know this is somewhat of a nitpick, but is the path really a property of the manifest?

A manifest, once loaded, can be written anywhere on disk, and not simply on the path it was originally created at. So it feels like we are trying to couple two things that don't really belong together. :) And as a consequence the code got much more complex too. Simple functions like merge now need to extract values from structs, discard paths, and whatnot.

@josevalim

josevalim Feb 20, 2018

Member

Although I understand now that you did it to avoid loading the manifest multiple times. I am thinking that it is ok to read the manifest twice when the --only-failures flag is given, since we have really optimized that.

So maybe a better way to go about this is to have a new function in ExUnit.Filters that returns the failed files from the manifest?

Another option is to have a function in ExUnit.Filters that reads the manifest and returns all failed files and all failed tests per module. This way, we won't tell ExUnit to run only last_status_run:failed, which we said we don't want to expose to developers, but instead we simply tell it which tests to run in any given module. With this approach, we don't need to touch the manifest in the runner at all.

manifest =
  Mix.Project.manifest_path()
  |> Path.join(".ex_unit_results.elixir")
  |> ExUnit.Manifest.read()

@josevalim

josevalim Feb 20, 2018

Member

I would prefer if we didn't access ExUnit.Manifest in the Mix application. It is private API. I truly prefer the previous approach where we simply gave it a path. :)

@josevalim

Member

josevalim commented Feb 20, 2018

Thank you @myronmarston! I have added some comments.

We will also be glad to merge the commits instead of squashing them but could you please make sure you don't add a dot at the end of the commit titles?

@lexmag lexmag changed the title from Add support for `--only-failures`. to Add support for "--only-failures" in ExUnit Feb 20, 2018

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

@josevalim thanks for the feedback. I wasn't a huge fan of converting the manifest to a struct and storing the path in it but I assumed loading the manifest only once was a requirement. Now that we've relaxed that restriction this is easier :).

So maybe a better way to go about this is to have a new function in ExUnit.Filters that returns the failed files from the manifest?

This is the route I took.

Another option is to have a function in ExUnit.Filters that reads the manifest and returns all failed files and all failed tests per module. This way, we won't tell ExUnit to run only last_status_run:failed, which we said we don't want to expose to developers, but instead we simply tell it which tests to run in any given module. With this approach, we don't need to touch the manifest in the runner at all.

I didn't quite understand your idea here, so I didn't attempt it.

We will also be glad to merge the commits instead of squashing them but could you please make sure you don't add a dot at the end of the commit titles?

Done.

BTW, do you have thoughts on my comment above about how setup do: {:ok, file: "not_the_right_file"} can cause problems? I'm not sure what to do about that.

def get_files_with_failures(entries) do
  entries
  |> Stream.filter(fn {_, entry(last_run_status: status)} -> status == :failed end)
  |> MapSet.new(fn {_, entry(file: file)} -> file end)
end

@josevalim

josevalim Feb 24, 2018

Member

I think the fastest implementation of this function would be:

for {_, entry(last_run_status: :failed, file: file)} <- entries, do: file, uniq: true

We can do everything in one pass and we can return a simple list back, which is what we end up traversing later on anyway.
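The single-pass comprehension can be sketched against dummy manifest entries ({:entry, status, file} tuples below are stand-ins for the Record-based entry, invented for illustration):

```elixir
# Dummy manifest entries shaped {test_id, {:entry, status, file}}:
entries = [
  {{ATest, :t1}, {:entry, :failed, "a_test.exs"}},
  {{ATest, :t2}, {:entry, :passed, "a_test.exs"}},
  {{BTest, :t3}, {:entry, :failed, "b_test.exs"}},
  {{ATest, :t4}, {:entry, :failed, "a_test.exs"}}
]

# One pass over the list; non-matching (non-failed) entries are skipped by
# the pattern, and uniq: true drops duplicate files as they are generated.
failed_files =
  for {_, {:entry, :failed, file}} <- entries, uniq: true, do: file

# => ["a_test.exs", "b_test.exs"]
```

Filtering and deduplication both happen in the one traversal, which is the point of the suggestion.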

@myronmarston

myronmarston Feb 24, 2018

Contributor

I think it's important that we return a set. For one, conceptually, a set is the right data structure since we want to return a collection of unique files and we don't care about the order. More importantly, we want a constant-time membership check when we use this later in mix. Otherwise, the filter_test_files_using_manifest/3 function in mix is going to be an O(n * m) operation (where n == number of test files and m == number of test files with failures).

We can do everything in one pass

With the Stream.filter, we're only traversing the list once, anyway, right?

On a side note, I didn't know that for comprehensions support a uniq option. TIL!
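The constant-time membership check being argued for looks like this on the mix side (file names invented for the sketch):

```elixir
files_with_failures = MapSet.new(["/proj/test/a_test.exs", "/proj/test/b_test.exs"])
all_files = ["/proj/test/a_test.exs", "/proj/test/c_test.exs"]

# Each MapSet.member?/2 call is effectively constant time, so the whole
# filter is O(n) in the number of test files rather than O(n * m) as it
# would be with a plain list membership check.
kept = Enum.filter(all_files, &MapSet.member?(files_with_failures, &1))
# => ["/proj/test/a_test.exs"]
```

With a plain list, `&1 in files_with_failures` would scan the list once per file, which is the O(n * m) cost described above.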

@josevalim

Member

josevalim commented Feb 24, 2018

@myronmarston this is perfect. I have just added one comment about a piece of code that we can execute faster but I can also address it after merging.

The last topic I wanted to discuss is what happens in some situations:

  • --only-failures is used but we pass a directory to mix test, such as mix test test/bar
  • --only-failures is used but we have no pending failures
  • --only-failures and --stale are used together

For the first, I think we should only keep the directories given by the user, so we need to always filter on top of that.

For the second, I think we should print a message saying "There are no pending failures. Re-running all the suite". I want to do that to avoid people using --only-failures to get a build to eventually pass.

For the third, I am thinking that if both are passed, --stale would only be triggered after the failures are fixed, so it builds on top of the behaviour outlined for the second.

Thoughts?

BTW, do you have thoughts on my comment above about how setup do: {:ok, file: "not_the_write_file"} can cause problems? I'm not sure what to do about that.

I haven't looked into it yet, I was planning to do after merging. :)

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

For the first, I think we should only keep the directories given by the user, so we need to always filter on top of that.

Agreed. That's definitely the behavior I'd expect, and it's what we did for RSpec.

For the second, I think we should print a message saying "There are no pending failures. Re-running all the suite". I want to do that to avoid people using --only-failures to get a build to eventually pass.

I lean towards running nothing--IMO, it's the most consistent. I view --only-failures as being a filter applied before running the tests:

all_tests
|> Stream.filter(&test_failed_last_run?/1)
|> Enum.each(&run_test/1)

With a piece of code like this, if Stream.filter decided to include everything when no tests returned true for test_failed_last_run?/1, we'd consider it a bug. Here it's essentially the same: I told Elixir to run only failures; to run the entire suite is to essentially ignore what I asked it to do.

I took a look at what happens now when you do mix test --only unknown_tag:value, and I see that it does not run anything. It prints a message and exits with non-zero status:

$ mix test --only foo:bar; echo $?

Including tags: [foo: "bar"]
Excluding tags: [:test]

Finished in 0.2 seconds
7 doctests, 20 tests, 0 failures, 27 excluded

Randomized with seed 377368
The --only option was given to "mix test" but no test executed
1

Since we convert --only-failures to --only last_run_status:failed under the covers, I think we would get this for free, and I think it makes sense to be consistent with that existing behavior.

For the third, I am thinking that if both are passed, --stale would only be triggered after the failures are fixed, so it builds on top of the behaviour outlined for the second.

TBH, I haven't really used --stale (if you recall, for the main elixir project I worked on, we had 40+ subapps in an umbrella and created a mix test_all task that ran them all as a single suite, but that wasn't compatible with --stale). That said, the behavior you describe isn't what I'd expect. In general, are multiple filters ANDed or ORed together? I'd expect them to be ANDed (a test has to match both filter 1 and filter 2 to be included in the run). e.g. I'd expect mix test --only a:1 --only b:2 to do this:

all_tests
|> Stream.filter(fn test -> has_tag?(test, :a, 1) end)
|> Stream.filter(fn test -> has_tag?(test, :b, 2) end)
|> Enum.each(&run_test/1)

I'm not sure if that's what multiple --only options already does (I haven't tried it), but that interpretation fits with what we discussed when you pass a directory and --only-failures.

For --stale I'd expect it to work the same. I'd expect mix to only run tests that both failed the last time they ran and that are stale due to me changing modules/functions they depend upon since the last time they ran. This would be a good way to run just the failures that my recent code changes might have fixed.

Ultimately, I think it makes more sense to decide how you want multiple filters to work in general, and apply that consistently to every sort of filter, rather than taking it on a case-by-case basis.

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

One last question: is the manifest guaranteed to return absolute or relative file names?

It returns absolute paths. For example, here's what gets printed when I inspect the map set before using it in the mix test task:

#MapSet<["/Users/myron/code/elixir/lib/mix/tmp/Mix.Tasks.TestTest/test --only-failures_ loads only files with failures and runs just the failures/test/passing_and_failing_test_only_failures.exs"]>

When we filter files against the manifest, are those files also guaranteed to be absolute or relative?

We expand the paths when we do the filtering:

Enum.filter(files, &MapSet.member?(files_with_failures, Path.expand(&1)))

I believe Path.expand handles .. in path names like the example you asked about. The docs say:

Expands the path relative to the path given as the second argument, expanding any . and .. characters.
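For example (directory names invented), Path.expand/2 resolves . and .. against the base directory:

```elixir
# .. and . segments are resolved relative to the second argument:
"/proj/test/foo_test.exs" = Path.expand("test/../test/foo_test.exs", "/proj")
"/proj/lib" = Path.expand("./lib", "/proj")
```

So a relative path handed to mix test, even one containing .., normalizes to the same absolute form stored in the manifest.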

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

OK, I've switched to the for comprehension instead of using Stream.filter, and also updated the docs for ExUnit.Filters.files_with_failures/1 to mention the paths are absolute. Let me know if you want anything else before this is merged!

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

One of the downsides of the current behavior is that I have to consistently flip between flags. I use --only-failures until all tests pass, then I remove the flag, and repeat.

It sounds like you're expecting --only-failures to be a "normal" option that you usually pass when you run mix test. My experience may not match that of others, but for RSpec, I've never used it as a normal option (and doubt I would for mix test, either). My norm is to run the test file for the specific module I'm working on (I have a vim keybinding that makes this easy). Once that's done, run the whole suite, commit and push. The situations where I tend to use --only-failures are when I'm changing some interface that causes failures across my test suite. In that case, I repeatedly use --only-failures (or, for RSpec, --next-failure, but that's not a feature we're discussing adding here) until my suite is green. Then I can run the entire suite once more (without any flags) to check it is all green.

For RSpec we exit with a 0 status when all tests are filtered out. One nice thing about that is that it is easy to compose a command to do what you're talking about:

$ rspec --only-failures && rspec && commit -m "Finished refactoring"

With mix test --only-failures exiting with a 1, you can't compose a command like this, unfortunately.

Also, my experience and approach to running my suite isn't anyone else's, so I don't know how much it should affect the direction we go here. That said, rspec --only-failures has been available in RSpec for 2.5 years and it's always run nothing when there are no failures, and we've never gotten a user request to change that behavior.

All that said: I honestly don't feel very strongly about this, so go whichever way you feel is best!

There is a similar concern about --stale. It doesn't compose well with --only-failures because they would filter each other and only a subset of --only-failures would run. It probably makes sense but it wouldn't be useful.

I have some thoughts on this but I have to run, so I'll try to respond later.

@josevalim

Member

josevalim commented Feb 24, 2018

And now I am back at the computer. --only composes with OR. The issue is that --stale filters the files being loaded so it doesn't work quite the same. Further, --only-failures uses both --only and file filtering, which ends up behaving slightly differently.

Thank you for the detailed description about the workflow. Given that mix test with a filter that doesn't trigger any test exits with reason 1, the only way to get the workflow you propose:

$ rspec --only-failures && rspec && commit -m "Finished refactoring"

Would be if --only-failures always exits with reason 1 except when it runs the whole suite because there are no more failures. Although it is also unclear to me if this is reasonable behaviour altogether.

I am looking forward to your feedback on --stale. :)

@myronmarston

Contributor

myronmarston commented Feb 25, 2018

--only composes with OR.

That definitely changes things! I didn't realize that. Among other things, it makes me wonder if we should not have --only-failures do the file filtering (or perhaps only do it when there are no other filters in play). I was viewing the file filtering as a pure optimization that has no effect on observable behavior (outside of the test suite finishing faster!). For RSpec, that's the case, because when we have multiple filters, we take the set intersection of them. So discarding files that have no failures is a safe operation since we would never run any tests in those files anyway. If ExUnit's semantic is to take the set union of multiple filters, then discarding files with no failures isn't necessarily safe, and perhaps we shouldn't do it. Or, maybe if it's not too complicated we can have it only filter files if it's safe to do so (e.g. if no other filters are in play), as it still provides a really nice performance benefit.

All that said, the fact that you union multiple filters seems to add complexity, as it gets in the way of a filter doing optimizations like in this case--which it sounds like is a problem for both --stale and --only-failures. I imagine that ship has sailed, but if you decide to change how multiple filters compose, I think it would simplify things and make it easier to know what the "right" behavior in these situations is.

Given the current situation, I'm not sure how --stale --only-failures should compose. It seems like your current filtering semantics would dictate that it should run all tests that either are stale or failed on their last run, but I don't know if that's easily achievable with the file filtering that's being done. An alternate behavior for --stale --only-failures (which is what I initially expected) is that it would run just the failures that your code changes might affect--that is the failures that might now pass. (In theory, the non-stale failures should still fail unless they are non-deterministic).
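The two composition semantics being contrasted can be sketched over sets of test ids (the ids are invented):

```elixir
failed = MapSet.new([:t1, :t2])
stale  = MapSet.new([:t2, :t3])

# OR (ExUnit's --only semantics): run anything matching either filter.
union = MapSet.union(failed, stale)         # t1, t2, and t3

# AND (RSpec-style intersection): run only what matches both filters.
both = MapSet.intersection(failed, stale)   # just t2
```

Under intersection semantics, dropping files with no failures is harmless; under union semantics it can silently exclude tests the other filter would have selected.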

Would be if --only-failures always exits with reason 1 except when it runs the whole suite because there are no more failures. Although it is also unclear to me if this is reasonable behaviour altogether.

I can understand the reasons for exiting with 1, but IMO, it's surprising. My expectation of a test framework's exit status is that it should be equivalent to either of these expressions:

if Enum.all?(tests, &passed?/1), do: 0, else: 1
if Enum.any?(tests, &failed?/1), do: 1, else: 0

In both these cases, if tests is empty, the expression would return 0 and that's what I'd expect from a test framework, for the same reasons that Enum.all?([], foo) always returns true and Enum.any?([], foo) always returns false.
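The vacuous-truth behavior of Enum.all?/2 and Enum.any?/2 on an empty list is easy to check:

```elixir
# With no tests at all, "all passed" is vacuously true and "any failed"
# is vacuously false, so both expressions would yield exit status 0.
true = Enum.all?([], fn _test -> false end)
false = Enum.any?([], fn _test -> true end)
```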

I see now that mix test on a project with no tests exits with a 1 (and a nice message: "There are no tests to run"). I find this surprising, too, but I'm glad to see it's at least consistent that it exits with a 1 whenever no tests ran, regardless of why :).

Not sure if any of my ramblings are helpful or not, but that's all I've got tonight.

@josevalim

Member

josevalim commented Feb 25, 2018

Very helpful, as always!

It seems like your current filtering semantics would dictate that it should run all tests that either are stale or failed on their last run, but I don't know if that's easily achievable with the file filtering that's being done.

Right, unfortunately I don't think it is easily achievable.

I would suggest for us to at least rename --only-failures to something we no longer associate with regular filters. To mirror --stale, I propose --failed, but I am open to suggestions. It is also worth noting that --stale does not exit with status 1 when no tests run, so the behaviour between the two would be consistent here.

To recap the whole discussion: the reason why having no tests exit with reason 1 is to make it easier to catch failures in CIs and other places where you may filter tests for the wrong reasons and never find out.

However, we can say that --stale and --failed are not really part of CI but rather your development workflow. At the end of the day, we want this workflow to be possible:

mix test --failed && mix test --stale && mix test

So I think calling it --failed and making it mirror --stale is the way to go. However, we would need to document this workflow explicitly and outline that passing --failed --stale at the same time won't work as they will filter each other. We already have a section called "Stale" in the task docs; we can rename it to "Stale and Failed" and describe the workflow there.

Thoughts?

@myronmarston

Contributor

myronmarston commented Feb 26, 2018

I would suggest for us to at least rename --only-failures to something we no longer associate with regular filters. To mirror --stale, I propose --failed, but I am open to suggestions.

I like --failed.

To recap the whole discussion: the reason why having no tests exit with reason 1 is to make it easier to catch failures in CIs and other places where you may filter tests for the wrong reasons and never find out.

I get the reasoning behind this, but IMO, it doesn't quite fulfill the stated purpose. Consider:

  • If we're talking about a filter that shouldn't be in place on CI, then it shouldn't be in place regardless of whether it filters out all tests or only some of them. But if it filters out only some tests, the current solution doesn't do anything to surface the problem to the developer. And 9 times out of 10, if you're going to accidentally leave a filter in place, it's going to be a filter that still leaves some tests running.
  • It's hard for me to imagine how you could "accidentally" set a filter on CI, given you'd specifically have to edit .travis.yml or whatever your CI build script is.

However, we would need to document this workflow explicitly and outline that passing --failed --stale at the same time won't work as they will filter each other.

If we don't support --failed --stale, I think we should programmatically detect it and raise an error, and not simply rely on users reading the docs. That said, it seems like --failed --stale is really just a special case of the more general issue that neither --failed nor --stale are able to cleanly compose with other filters, due to the file filtering they do. If that's correct, should we actually document and enforce programmatically that neither option can be combined with another filtering option?

@josevalim

Member

josevalim commented Feb 26, 2018

@myronmarston many filters are placed on CI because they are given in two different shapes: via the command line and in your test_helper.exs. For example, in Ecto, each adapter sets a bunch of filters on the test helper file.

There is also the case where you have integration tests against external APIs. In such cases, the integration tests are usually disabled by default and you may run your suite on CI like this:

$ mix test
$ mix test --only integration

A typo on "integration" means you would never run those tests and they would always pass.

If we don't support --failed --stale, I think we should programmatically detect it and raise an error, and not simply rely on users reading the docs.

I like this direction as well. 👍

the more general issue that neither --failed nor --stale are able to cleanly compose with other filters

They compose fine with other filters. For example, I still use --stale with Ecto, which relies on many filters, and there is nothing stopping you from using mix test --failed --only integration when working on your integration tests. --failed and --stale work as an AND with other filters and that can be desired.

To clarify the next steps:

  • Rename --only-failures to --failed
  • Make sure --failed exits with status 0 if nothing runs
  • Raise if --failed and --stale are used together

Thoughts?

@myronmarston

Contributor

myronmarston commented Feb 26, 2018

A typo on "integration" means you would never run those tests and they would always pass.

Thanks for the example. I've never set up a CI build that way but I can see the reason to do it and why exiting with 1 is helpful there.

--failed and --stale work as an AND with other filters and that can be desired.

I'm confused now :(. Let's put aside --stale for a second since I don't actually know much about how it works, and focus on --failed. Earlier you had said:

--only composes with OR

Under the covers, --failed is translated to --only last_run_status:failed, and also applies file filtering as an optimization. If --only filters composed with AND, I think --failed --only integration would work fine. It would run only the tests tagged with :integration that failed the last time they ran. However, given that you said that --only composes with OR, I think --failed --only integration would attempt to run the set union of all tests that failed on their last run and all tests tagged with :integration. That would be OK if not for the fact that --failed does the file filtering. Because of it, the actual behavior we'll get is to run all tests that failed on their last run plus all tests tagged with :integration from files that have at least one failing test. :integration tests from files with no failing tests will not be run. I don't think that behavior is what any user will expect, and it's pretty confusing. So, if --only filters compose with OR, I don't think we should allow --failed to be combined with any other filter, unless we remove the file filtering (at least for situations when there are other filters in effect).
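The mismatch described here can be sketched with invented tests shaped as {id, file, tags, last_run_status}:

```elixir
tests = [
  {:t1, "a_test.exs", [:integration], :passed},
  {:t2, "b_test.exs", [], :failed},
  {:t3, "b_test.exs", [:integration], :passed}
]

failed_files = MapSet.new(["b_test.exs"])

# Step 1: --failed's file filtering drops a_test.exs before loading...
loaded = Enum.filter(tests, fn {_, file, _, _} -> MapSet.member?(failed_files, file) end)

# Step 2: ...so the OR filter (failed OR :integration) only ever sees what
# survived step 1; :t1 can never run even though it is tagged :integration.
ran =
  Enum.filter(loaded, fn {_, _, tags, status} ->
    status == :failed or :integration in tags
  end)

# ran contains :t2 and :t3, but never :t1
```

This is exactly the surprising outcome: an :integration test is silently skipped just because its file happens to contain no failures.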

Regarding --stale, earlier you said:

The issue is that --stale filters the files being loaded so it doesn't work quite the same.

Since --stale also filters files, I thought it suffered from the same issue as --failed, but if it doesn't use --only under the covers and therefore is composable with AND, it's probably not a problem for it to be combined with another filter.

Regarding your next steps: those all sound good. I'll plan to address them in this PR, hopefully in the next few days.

@josevalim

Member

josevalim commented Feb 26, 2018

Since --stale also filters files, I thought it suffered from the same issue as --failed, but if it doesn't use --only under the covers and therefore is composable with AND, it's probably not a problem for it to be combined with another filter.

Yes, I forgot that --failed does add a --only behind the scenes, which makes it behave like an OR. In other words, if you pass --failed --only integration it will run all of the integration tests in the loaded files. Your summary is perfect.

I think we can go ahead with the next steps but we may need to do something regarding the composition of --failed/--only before we say this is fully done. I will think about a couple options we may have here. Suggestions are also welcome!

@josevalim

Member

josevalim commented Feb 26, 2018

I will think about a couple options we may have here.

One option is, instead of passing only: [last_run_status: :failed], we pass to the runner the exact test_ids of the tests we want to run. Since we already get the files back from the manifest, we could get the list of test ids as well. I see two benefits with this option:

  1. The runner no longer needs to care about the manifest, since it will no longer filter on last_run_status
  2. We get the composition right

The downside is that we are adding a new feature to the runner which is a bit awkward. Maybe a more general and less awkward way to implement this feature is to introduce the concept of a test filter, which allows us to discard tests without tracking them as excluded or skipped. Basically a filter we would apply here: https://github.com/elixir-lang/elixir/blob/master/lib/ex_unit/lib/ex_unit/runner.ex#L136
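A hedged sketch of the test-id approach (the map shape and names are invented; the real runner is more involved):

```elixir
# Instead of filtering on a last_run_status tag, the runner would receive
# the exact ids to keep and silently discard everything else, without
# reporting the discarded tests as excluded or skipped.
failed_ids = MapSet.new([{FooTest, :"test adds numbers"}])

tests = [
  %{module: FooTest, name: :"test adds numbers"},
  %{module: FooTest, name: :"test subtracts numbers"}
]

kept =
  Enum.filter(tests, fn %{module: m, name: n} ->
    MapSet.member?(failed_ids, {m, n})
  end)
# => only the previously failed test remains
```

Because the id set comes straight from the manifest read in mix, the runner itself stays manifest-agnostic.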

@myronmarston

Contributor

myronmarston commented Feb 28, 2018

@josevalim that makes sense, but if we do that, we lose the ability to do mix test --only last_run_status:passed or mix test --only last_run_status:unknown. These are far less useful than filtering to only failed tests (which is why I wouldn't advocate surfacing a new CLI option to users like we are doing for this case), but there are situations where they are useful:

  • --only last_run_status:passed is useful when you've got some new failing tests for a feature you're working on, and realize a particular refactoring to some interface in your project would make the feature easier to build. After making the interface change, you want to make sure you didn't break anything (but don't care to run the new failing tests for the new feature). --only last_run_status:passed is perfect for these cases.
  • --only last_run_status:unknown isn't quite as useful, but it lets you run just tests that haven't run locally before. That can be useful when you change branches or added a bunch of new tests or something.

Ultimately, our goal here was to add --failed, not those features, so losing them is probably OK. Figured it was worth calling out, though.

@josevalim

Member

josevalim commented Mar 3, 2018

Thanks @myronmarston!

It is always easy to add features later, removing them is right. So in my opinion is totally fine to remove the --only last_run_status:... for now. We can always add it later! ❤️

* `:async` - if the test case is in async mode
* `:registered` - used for `ExUnit.Case.register_attribute/3` values
* `:describe` - the describe block the test belongs to
* `:last_run_status` - status (`:passed`, `:failed`, or `:unknown`) from the test's last run

@josevalim

josevalim Mar 3, 2018

Member

So we probably won't need this anymore. :)

:registered,
:describe,
:type,
:last_run_status

@josevalim

josevalim Mar 3, 2018

Member

Nor this!

myronmarston added some commits Feb 20, 2018

Deal with `:file` tags that were incorrectly overridden by a `setup` block

Given a `setup` block like:

    setup do
      {:ok, file: :foo}
    end

...ExUnit raises an exception, but it also overrides the test's
`:file` tag, which could cause problems for the manifest since it
looks up the existence of the file on the file system, and would
get an error if it was not a string.

@myronmarston myronmarston changed the title from Add support for "--only-failures" in ExUnit to Add support for "--failed" in ExUnit Mar 3, 2018

@myronmarston

Contributor

myronmarston commented Mar 3, 2018

It is always easy to add features later, removing them is right. So in my opinion is totally fine to remove the --only last_run_status:... for now. We can always add it later! ❤️

Sounds good! This isn't always my instinct, but I think Elixir is as good as it is because you consistently resist introducing things before they are really needed, ensuring it doesn't accumulate bloat.

Anyhow, I've made the changes you've requested and implemented it how we discussed. However, there's one more change I'd like to make--but I wanted to see what you think first. Now that we are only utilizing the failed tests stored in the manifest, there really isn't a reason to store tests with any other status. And there's no need to store the last_run_status at all. In fact, we can get rid of the entry record entirely, and store the manifest as a map of %{test_id => file}. Such an approach should cause the manifest to be much smaller both on disk and in memory (as failures are usually a small number compared to the total number of tests) and should be even faster to read from disk, write back to disk, merge, etc.

Thoughts?
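A rough sketch of what that simplified shape could look like (hypothetical module and function names, not the final API from this PR):

```elixir
defmodule FailuresManifestSketch do
  @moduledoc "Illustrative sketch: the manifest as a plain map of test id to file."

  @type test_id :: {module(), atom()}
  @type t :: %{optional(test_id) => Path.t()}

  def new, do: %{}

  # Keep a test only while it is failing; a pass removes it, so the
  # manifest tends toward being empty on a healthy suite.
  def put_test(manifest, test_id, file, :failed), do: Map.put(manifest, test_id, file)
  def put_test(manifest, test_id, _file, _passed), do: Map.delete(manifest, test_id)

  # Files containing at least one failure: the minimum set of files
  # `mix test --failed` needs to load.
  def files_with_failures(manifest), do: MapSet.new(Map.values(manifest))
end
```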

def failed_test_ids(manifest) do
  for({test_id, entry(last_run_status: :failed)} <- manifest, do: test_id, uniq: true)
  |> MapSet.new()
end

@myronmarston

myronmarston Mar 3, 2018

Contributor

It's sub-optimal that both of these new functions filter to the failed statuses. If we switch to storing only failures in the manifest (as I explain in my larger comment on the PR), no filtering is needed, and this isn't a concern. OTOH, if we decide not to implement that optimization, we may want to merge these into one function that returns a tuple so it can filter only once.
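If the record-based manifest were kept, the two passes could be collapsed into a single traversal returning both sets as a tuple, along the lines of this sketch (it assumes the `entry` record also carries a `:file` field):

```elixir
# Single pass over the manifest, accumulating both the failed test
# ids and the files that contain them.
def failed_test_info(manifest) do
  Enum.reduce(manifest, {MapSet.new(), MapSet.new()}, fn
    {test_id, entry(last_run_status: :failed, file: file)}, {ids, files} ->
      {MapSet.put(ids, test_id), MapSet.put(files, file)}

    _entry, acc ->
      acc
  end)
end
```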

@josevalim

Member

josevalim commented Mar 3, 2018

It is always easy to add features later, removing them is right

Of course I meant to say "removing them is hard".

And there's no need to store the last_run_status at all. In fact, we can get rid of the entry record entirely, and store the manifest as a map of %{test_id => file}

I like this, let's go in this direction. :)

myronmarston added some commits Feb 18, 2018

Add support for `mix test --failed`
This uses the ExUnit manifest to filter to only tests that failed
the last time they ran. As an optimization, we filter out files
that have no failures so we load the minimum set of files necessary.
Optimize ExUnit manifest to store only failures
Since that is the only information we care about, we do not
have to store every test with its status. This allows us to
vastly simplify the manifest, as we no longer have to use an
`:entry` record. Reading and writing the manifest should be
faster now, since the manifest tends toward being empty.

Since we are only storing failures, I have renamed the manifest
(and the `:manifest_file` option and file path) to indicate that
it deals only with failures.
@myronmarston

Contributor

myronmarston commented Mar 5, 2018

OK, @josevalim I've implemented what we discussed. PTAL!

@@ -410,8 +424,30 @@ defmodule Mix.Tasks.Test do
end
end
@manifest_file_name ".ex_unit_failures.elixir"

@josevalim

josevalim Mar 5, 2018

Member

Since mix is the one specifying the file name, maybe we should call it .mix_test_failures?

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

|> MapSet.new()
end
@spec put_test(t, ExUnit.Test.t()) :: t

@josevalim

josevalim Mar 5, 2018

Member

Let's add a TODO here so we don't forget to check why we need this clause after merging. :)

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

@@ -132,8 +133,9 @@ defmodule ExUnit.Runner do
tests = shuffle(config, tests)
include = config.include
exclude = config.exclude
test_ids = config.only_test_ids

@josevalim

josevalim Mar 5, 2018

Member

We need to document this and the failure_manifest_file in lib/ex_unit.ex.

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

@@ -117,6 +124,46 @@ defmodule Mix.Tasks.TestTest do
end
end
test "--failed: loads only files with failures and runs just the failures" do

@josevalim

josevalim Mar 5, 2018

Member

This test is beautiful. 😍
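For flavor, a test along those lines might look roughly like this (illustrative fixture name and failure counts; not the actual test from the PR):

```elixir
test "--failed: loads only files with failures and runs just the failures" do
  in_fixture("test_failed", fn ->
    # First run records the failures in the manifest.
    output = mix(["test"])
    assert output =~ "4 tests, 2 failures"

    # Second run loads only the failing files and re-runs those tests.
    output = mix(["test", "--failed"])
    assert output =~ "2 tests, 2 failures"
  end)
end
```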

@josevalim

Member

josevalim commented Mar 5, 2018

@myronmarston I have left only three minor comments. Everything else is perfect!

Thank you so much for all of the work and discussions around this feature.

Note to merger: REBASE and not squash.

@myronmarston

Contributor

myronmarston commented Mar 5, 2018

@josevalim comments addressed!

Address code review feedback
- Rename manifest file to indicate that mix owns the file
- Leave a TODO
- Document new ExUnit config options
- Fix ENV var typo
@josevalim

Member

josevalim commented Mar 5, 2018

I have merged manually, thank you!
❤️ 💚 💙 💛 💜

@josevalim josevalim closed this Mar 5, 2018

@josevalim

Member

josevalim commented Mar 5, 2018

Btw, I could not reproduce the nil file scenario. I have tried to return file: nil in setup but it did not allow me. I have tried to set it to nil via a tag but it did not allow me either. For now I have removed the check, hoping the case will not arise. If you have a consistent way of reproducing it, please let me know!

@myronmarston

Contributor

myronmarston commented Mar 5, 2018

I have merged manually, thank you!

It looks like you might not have pushed what you manually merged (at least not to master...at least I don't see it!). Mind pushing it so I can work off it to try to repro the file issue?

@josevalim

Member

josevalim commented Mar 5, 2018

In my defense I pushed it but the server refused it. :P Pushed now :D

@myronmarston myronmarston deleted the myronmarston:myron/only-failures branch Mar 6, 2018

@myronmarston

Contributor

myronmarston commented Mar 6, 2018

For now I have removed the check, hoping the case will not arise. If you have a consistent way of reproducing it, please let me know!

I can't repro it now, either. Not sure what fixed it...but I guess that's good news?
