Add support for "--failed" in ExUnit #7373

Closed · wants to merge 5 commits into master

@myronmarston
Contributor

myronmarston commented Feb 20, 2018

This is a follow up to #7082 that adds support for --only-failures using the new ExUnit manifest.

Note that I put some effort into clearly laying out the history in this PR, with explanatory commit messages, so I'd recommend merging instead of doing a squash+merge but ultimately it's up to y'all, of course.

end
@spec add_test(t, ExUnit.Test.t()) :: t
def add_test(manifest, %ExUnit.Test{tags: %{file: file}})
    when not is_binary(file),
    do: manifest

@myronmarston

myronmarston Feb 20, 2018

Contributor

It's unfortunate this function clause is needed, but without it, ex_unit_test.exs experiences order-dependent failures, due to this test. Apparently when you have a setup block like setup do: {:ok, file: :foo} it still updates the test's :file tag even though it also triggers an exception telling the user about the mistake. The file: :foo tag would then get stored in the manifest as the file (as we always keep the results from the new manifest). Then, on the next test run, when ExUnit.Manifest.merge/2 is inspecting the old manifest entries, it would try to check File.regular?(:foo) and get an exception. This function clause ignores the test entirely when its :file tag is invalid, working around the problem.

That said, while this fixes the exception, I think a problem still remains. If a user did setup do: {:ok, file: "string"} we would store the test in the manifest with the wrong :file value. I haven't spent any time looking into it yet, but I think a better fix would be to prevent the tag from updating in this case so that the file passed to this function is always the real file the test came from.
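The guard-clause workaround can be illustrated with a standalone sketch (ManifestSketch and its plain-map test shape are invented here; the real ExUnit.Manifest uses records and more structure):

```elixir
# Standalone sketch of the workaround: a test whose :file tag is not a
# binary is ignored instead of being stored with a bogus path that would
# later crash the File.regular?/1 check during manifest merging.
defmodule ManifestSketch do
  def add_test(manifest, %{tags: %{file: file}}) when not is_binary(file),
    do: manifest

  def add_test(manifest, %{tags: %{file: file}}),
    do: [file | manifest]
end

# A setup block that wrongly returned {:ok, file: :foo} leaves the tag as
# an atom, so the test is skipped rather than recorded:
[] = ManifestSketch.add_test([], %{tags: %{file: :foo}})
["a_test.exs"] = ManifestSketch.add_test([], %{tags: %{file: "a_test.exs"}})
```

The second clause shows the normal path: a binary :file tag is recorded as usual.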

@@ -4,19 +4,41 @@ defmodule ExUnit.Manifest do
  import Record
  defrecord :entry, [:last_run_status, :file]
  @opaque t :: [{test_id, entry}]
  defstruct entries: [], path: nil

@josevalim

josevalim Feb 20, 2018

Member

I know this is somewhat of a nitpick, but is the path really a property of the manifest?

A manifest, once loaded, can be written anywhere on disk, and not simply on the path it was originally created at. So it feels like we are trying to couple two things that don't really belong together. :) And as a consequence the code got much more complex too. Simple functions like merge now need to extract values from structs, discard paths, and whatnot.

@josevalim

josevalim Feb 20, 2018

Member

Although I understand now that you did it to avoid loading the manifest multiple times. I am thinking that it is ok to read the manifest twice when the --only-failures flag is given, since we have really optimized that.

So maybe a better way to go about this is to have a new function in ExUnit.Filters that returns the failed files from the manifest?

Another option is to have a function in ExUnit.Filters that reads the manifest and returns all failed files and all failed tests per module. This way, we won't tell ExUnit to run only last_status_run:failed, which we said we don't want to expose to developers, but instead we simply tell it which tests to run in any given module. With this approach, we don't need to touch the manifest in the runner at all.

manifest =
  Mix.Project.manifest_path()
  |> Path.join(".ex_unit_results.elixir")
  |> ExUnit.Manifest.read()

@josevalim

josevalim Feb 20, 2018

Member

I would prefer if we didn't access ExUnit.Manifest in the Mix application. It is private API. I truly prefer the previous approach where we simply gave it a path. :)

@josevalim

Member

josevalim commented Feb 20, 2018

Thank you @myronmarston! I have added some comments.

We will also be glad to merge the commits instead of squashing them but could you please make sure you don't add a dot at the end of the commit titles?

@lexmag lexmag changed the title from Add support for `--only-failures`. to Add support for "--only-failures" in ExUnit Feb 20, 2018

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

@josevalim thanks for the feedback. I wasn't a huge fan of converting the manifest to a struct and storing the path in it but I assumed loading the manifest only once was a requirement. Now that we've relaxed that restriction this is easier :).

So maybe a better way to go about this is to have a new function in ExUnit.Filters that returns the failed files from the manifest?

This is the route I took.

Another option is to have a function in ExUnit.Filters that reads the manifest and returns all failed files and all failed tests per module. This way, we won't tell ExUnit to run only last_status_run:failed, which we said we don't want to expose to developers, but instead we simply tell it which tests to run in any given module. With this approach, we don't need to touch the manifest in the runner at all.

I didn't quite understand your idea here, so I didn't attempt it.

We will also be glad to merge the commits instead of squashing them but could you please make sure you don't add a dot at the end of the commit titles?

Done.

BTW, do you have thoughts on my comment above about how setup do: {:ok, file: "not_the_right_file"} can cause problems? I'm not sure what to do about that.

def get_files_with_failures(entries) do
  entries
  |> Stream.filter(fn {_, entry(last_run_status: status)} -> status == :failed end)
  |> MapSet.new(fn {_, entry(file: file)} -> file end)
end

@josevalim

josevalim Feb 24, 2018

Member

I think the fastest implementation of this function would be:

for {_, entry(last_run_status: :failed, file: file)} <- entries, do: file, uniq: true

We can do everything in one pass and we can return a simple list back, which is what we end up traversing later on anyway.
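The single-pass comprehension can be sketched against dummy manifest entries ({:entry, status, file} tuples below are stand-ins for the Record-based entry, invented for illustration):

```elixir
# Dummy manifest entries shaped {test_id, {:entry, status, file}}:
entries = [
  {{ATest, :t1}, {:entry, :failed, "a_test.exs"}},
  {{ATest, :t2}, {:entry, :passed, "a_test.exs"}},
  {{BTest, :t3}, {:entry, :failed, "b_test.exs"}},
  {{ATest, :t4}, {:entry, :failed, "a_test.exs"}}
]

# One pass over the list; non-matching (non-failed) entries are skipped by
# the pattern, and uniq: true drops duplicate files as they are generated.
failed_files =
  for {_, {:entry, :failed, file}} <- entries, uniq: true, do: file

# => ["a_test.exs", "b_test.exs"]
```

Filtering and deduplication both happen in the one traversal, which is the point of the suggestion.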

@myronmarston

myronmarston Feb 24, 2018

Contributor

I think it's important that we return a set. For one, conceptually, a set is the right data structure since we want to return a collection of unique files and we don't care about the order. More importantly, we want a constant-time membership check when we use this later in mix. Otherwise, the filter_test_files_using_manifest/3 function in mix is going to be an O(n * m) operation (where n == number of test files and m == number of test files with failures).

We can do everything in one pass

With the Stream.filter, we're only traversing the list once, anyway, right?

On a side note, I didn't know that for comprehensions support a uniq option. TIL!
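The constant-time membership check being argued for looks like this on the mix side (file names invented for the sketch):

```elixir
files_with_failures = MapSet.new(["/proj/test/a_test.exs", "/proj/test/b_test.exs"])
all_files = ["/proj/test/a_test.exs", "/proj/test/c_test.exs"]

# Each MapSet.member?/2 call is effectively constant time, so the whole
# filter is O(n) in the number of test files rather than O(n * m) as it
# would be with a plain list membership check.
kept = Enum.filter(all_files, &MapSet.member?(files_with_failures, &1))
# => ["/proj/test/a_test.exs"]
```

With a plain list, `&1 in files_with_failures` would scan the list once per file, which is the O(n * m) cost described above.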

@josevalim

Member

josevalim commented Feb 24, 2018

@myronmarston this is perfect. I have just added one comment about a piece of code that we can execute faster but I can also address it after merging.

The last topic I wanted to discuss is what happens in some situations:

  • --only-failures is used but we pass a directory to mix test, such as mix test test/bar
  • --only-failures is used but we have no pending failures
  • --only-failures and --stale are used together

For the first, I think we should only keep the directories given by the user, so we need to always filter on top of that.

For the second, I think we should print a message saying "There are no pending failures. Re-running all the suite". I want to do that to avoid people using --only-failures to get a build to eventually pass.

For the third, I am thinking that if both are passed, --stale would only be triggered after the failures are fixed, so it builds on top of the behaviour outlined for the second.

Thoughts?

BTW, do you have thoughts on my comment above about how setup do: {:ok, file: "not_the_write_file"} can cause problems? I'm not sure what to do about that.

I haven't looked into it yet, I was planning to do after merging. :)

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

For the first, I think we should only keep the directories given by the user, so we need to always filter on top of that.

Agreed. That's definitely the behavior I'd expect, and it's what we did for RSpec.

For the second, I think we should print a message saying "There are no pending failures. Re-running all the suite". I want to do that to avoid people using --only-failures to get a build to eventually pass.

I lean towards running nothing--IMO, it's the most consistent. I view --only-failures as being a filter applied before running the tests:

all_tests
|> Stream.filter(&test_failed_last_run?/1)
|> Enum.each(&run_test/1)

With a piece of code like this, if Stream.filter decided to include everything when no tests returned true for test_failed_last_run?/1, we'd consider it a bug. Here it's essentially the same: I told Elixir to run only failures; to run the entire suite is to essentially ignore what I asked it to do.

I took a look at what happens now when you do mix test --only unknown_tag:value, and I see that it does not run anything. It prints a message and exits with non-zero status:

$ mix test --only foo:bar; echo $?

Including tags: [foo: "bar"]
Excluding tags: [:test]

Finished in 0.2 seconds
7 doctests, 20 tests, 0 failures, 27 excluded

Randomized with seed 377368
The --only option was given to "mix test" but no test executed
1

Since we convert --only-failures to --only last_run_status:failed under the covers, I think we would get this for free, and I think it makes sense to be consistent with that existing behavior.

For the third, I am thinking that if both are passed, --stale would only be triggered after the failures are fixed, so it builds on top of the behaviour outlined for the second.

TBH, I haven't really used --stale (if you recall, for the main elixir project I worked on, we had 40+ subapps in an umbrella and created a mix test_all task that ran them all as a single suite, but that wasn't compatible with --stale). That said, the behavior you describe isn't what I'd expect. In general, are multiple filters ANDed or ORed together? I'd expect them to be ANDed (a test has to match both filter 1 and filter 2 to be included in the run). e.g. I'd expect mix test --only a:1 --only b:2 to do this:

all_tests
|> Stream.filter(fn test -> has_tag?(test, :a, 1) end)
|> Stream.filter(fn test -> has_tag?(test, :b, 2) end)
|> Enum.each(&run_test/1)

I'm not sure if that's what multiple --only options already does (I haven't tried it), but that interpretation fits with what we discussed when you pass a directory and --only-failures.

For --stale I'd expect it to work the same. I'd expect mix to only run tests that both failed the last time they ran and that are stale due to me changing modules/functions they depend upon since the last time they ran. This would be a good way to run just the failures that my recent code changes might have fixed.

Ultimately, I think it makes more sense to decide how you want multiple filters to work in general, and apply that consistently to every sort of filter, rather than taking it on a case-by-case basis.

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

One last question: is the manifest guaranteed to return absolute or relative file names?

It returns absolute paths. For example, here's what gets printed when I inspect the map set before using it in the mix test task:

#MapSet<["/Users/myron/code/elixir/lib/mix/tmp/Mix.Tasks.TestTest/test --only-failures_ loads only files with failures and runs just the failures/test/passing_and_failing_test_only_failures.exs"]>

When we filter files against the manifest, are those files also guaranteed to be absolute or relative?

We expand the paths when we do the filtering:

Enum.filter(files, &MapSet.member?(files_with_failures, Path.expand(&1)))

I believe Path.expand handles .. in path names like the example you asked about. The docs say:

Expands the path relative to the path given as the second argument, expanding any . and .. characters.
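For example (directory names invented), Path.expand/2 resolves . and .. against the base directory:

```elixir
# .. and . segments are resolved relative to the second argument:
"/proj/test/foo_test.exs" = Path.expand("test/../test/foo_test.exs", "/proj")
"/proj/lib" = Path.expand("./lib", "/proj")
```

So a relative path handed to mix test, even one containing .., normalizes to the same absolute form stored in the manifest.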

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

OK, I've switched to the for comprehension instead of using Stream.filter, and also updated the docs for ExUnit.Filters.files_with_failures/1 to mention the paths are absolute. Let me know if you want anything else before this is merged!

@myronmarston

Contributor

myronmarston commented Feb 24, 2018

One of the downsides of the current behavior is that I have to consistently flip between flags. I use --only-failures until all tests pass, then I remove the flag, and repeat.

It sounds like you're expecting --only-failures to be a "normal" option that you usually pass when you run mix test. My experience may not match that of others, but for RSpec, I've never used it as a normal option (and doubt I would for mix test, either). My norm is to run the test file for the specific module I'm working on (I have a vim keybinding that makes this easy). Once that's done, run the whole suite, commit and push. The situations where I tend to use --only-failures are when I'm changing some interface that causes failures across my test suite. In that case, I repeatedly use --only-failures (or, for RSpec, --next-failure, but that's not a feature we're discussing adding here) until my suite is green. Then I can run the entire suite once more (without any flags) to check it is all green.

For RSpec we exit with a 0 status when all tests are filtered out. One nice thing about that is that it is easy to compose a command to do what you're talking about:

$ rspec --only-failures && rspec && commit -m "Finished refactoring"

With mix test --only-failures exiting with a 1, you can't compose a command like this, unfortunately.

Also, my experience and approach to running my suite isn't anyone else's, so I don't know how much it should affect the direction we go here. That said, rspec --only-failures has been available in RSpec for 2.5 years and it's always run nothing when there are no failures, and we've never gotten a user request to change that behavior.

All that said: I honestly don't feel very strongly about this, so go whichever way you feel is best!

There is a similar concern about --stale. It doesn't compose well with --only-failures because they would filter each other and only a subset of --only-failures would run. It probably makes sense but it wouldn't be useful.

I have some thoughts on this but I have to run, so I'll try to respond later.

@josevalim

Member

josevalim commented Feb 24, 2018

And now I am back at the computer. --only composes with OR. The issue is that --stale filters the files being loaded so it doesn't work quite the same. Further, --only-failures uses both --only and file filtering, which ends up behaving slightly differently.

Thank you for the detailed description about the workflow. Given that mix test with a filter that doesn't trigger any test exits with reason 1, the only way to get the workflow you propose:

$ rspec --only-failures && rspec && commit -m "Finished refactoring"

Would be if --only-failures always exits with reason 1 except when it runs the whole suite because there are no more failures. Although it is also unclear to me if this is reasonable behaviour altogether.

I am looking forward to your feedback on --stale. :)

@myronmarston

Contributor

myronmarston commented Feb 25, 2018

--only composes with OR.

That definitely changes things! I didn't realize that. Among other things, it makes me wonder if we should not have --only-failures do the file filtering (or perhaps only do it when there are no other filters in play). I was viewing the file filtering as a pure optimization that has no effect on observable behavior (outside of the test suite finishing faster!). For RSpec, that's the case, because when we have multiple filters, we take the set intersection of them. So discarding files that have no failures is a safe operation since we would never run any tests in those files anyway. If ExUnit's semantic is to take the set union of multiple filters, then discarding files with no failures isn't necessarily safe, and perhaps we shouldn't do it. Or, maybe if it's not too complicated we can have it only filter files if it's safe to do so (e.g. if no other filters are in play), as it still provides a really nice performance benefit.

All that said, the fact that you union multiple filters seems to add complexity, as it gets in the way of a filter doing optimizations like in this case--which it sounds like is a problem for both --stale and --only-failures. I imagine that ship has sailed, but if you decide to change how multiple filters compose, I think it would simplify things and make it easier to know what the "right" behavior in these situations is.

Given the current situation, I'm not sure how --stale --only-failures should compose. It seems like your current filtering semantics would dictate that it should run all tests that either are stale or failed on their last run, but I don't know if that's easily achievable with the file filtering that's being done. An alternate behavior for --stale --only-failures (which is what I initially expected) is that it would run just the failures that your code changes might affect--that is the failures that might now pass. (In theory, the non-stale failures should still fail unless they are non-deterministic).
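The two composition semantics being contrasted can be sketched over sets of test ids (the ids are invented):

```elixir
failed = MapSet.new([:t1, :t2])
stale  = MapSet.new([:t2, :t3])

# OR (ExUnit's --only semantics): run anything matching either filter.
union = MapSet.union(failed, stale)         # t1, t2, and t3

# AND (RSpec-style intersection): run only what matches both filters.
both = MapSet.intersection(failed, stale)   # just t2
```

Under intersection semantics, dropping files with no failures is harmless; under union semantics it can silently exclude tests the other filter would have selected.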

Would be if --only-failures always exits with reason 1 except when it runs the whole suite because there are no more failures. Although it is also unclear to me if this is reasonable behaviour altogether.

I can understand the reasons for exiting with 1, but IMO, it's surprising. My expectation of a test framework's exit status is that it should be equivalent to either of these expressions:

if Enum.all?(tests, &passed?/1), do: 0, else: 1
if Enum.any?(tests, &failed?/1), do: 1, else: 0

In both these cases, if tests is empty, the expression would return 0 and that's what I'd expect from a test framework, for the same reasons that Enum.all?([], foo) always returns true and Enum.any?([], foo) always returns false.
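The vacuous-truth behavior of Enum.all?/2 and Enum.any?/2 on an empty list is easy to check:

```elixir
# With no tests at all, "all passed" is vacuously true and "any failed"
# is vacuously false, so both expressions would yield exit status 0.
true = Enum.all?([], fn _test -> false end)
false = Enum.any?([], fn _test -> true end)
```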

I see now that mix test on a project with no tests exits with a 1 (and a nice message: "There are no tests to run"). I find this surprising, too, but I'm glad to see it's at least consistent that it exits with a 1 whenever no tests ran, regardless of why :).

Not sure if any of my ramblings are helpful or not, but that's all I've got tonight.

@josevalim

Member

josevalim commented Feb 25, 2018

Very helpful, as always!

It seems like your current filtering semantics would dictate that it should run all tests that either are stale or failed on their last run, but I don't know if that's easily achievable with the file filtering that's being done.

Right, unfortunately I don't think it is easily achievable.

I would suggest for us to at least rename --only-failures to something we no longer associate with regular filters. To mirror --stale, I propose --failed, but I am open to suggestions. It is also worth noting that --stale does not exit with status 1 when no tests run, so the behaviour between the two would be consistent here.

To recap the whole discussion: the reason why having no tests exit with reason 1 is to make it easier to catch failures in CIs and other places where you may filter tests for the wrong reasons and never find out.

However, we can say that --stale and --failed are not really part of CI but rather your development workflow. At the end of the day, we want this workflow to be possible:

mix test --failed && mix test --stale && mix test

So I think calling it --failed and making it mirror --stale is the way to go. However, we would need to document this workflow explicitly and outline that passing --failed --stale at the same time won't work as they will filter each other. We already have a section called "Stale" in the task docs; we can rename it to "Stale and Failed" and describe the workflow there.

Thoughts?

@myronmarston

Contributor

myronmarston commented Feb 26, 2018

I would suggest for us to at least rename --only-failures to something we no longer associate with regular filters. To mirror --stale, I propose --failed, but I am open to suggestions.

I like --failed.

To recap the whole discussion: the reason why having no tests exit with reason 1 is to make it easier to catch failures in CIs and other places where you may filter tests for the wrong reasons and never find out.

I get the reasoning behind this, but IMO, it doesn't quite fulfill the stated purpose. Consider:

  • If we're talking about a filter that shouldn't be in place on CI, then it shouldn't be in place regardless of whether it filters out all tests or only some of them. But if it filters out only some tests, the current solution doesn't do anything to surface the problem to the developer. And 9 times out of 10, if you're going to accidentally leave a filter in place, it's going to be a filter that still leaves some tests running.
  • It's hard for me to imagine how you could "accidentally" set a filter on CI, given you'd specifically have to edit .travis.yml or whatever your CI build script is.

However, we would need to document this workflow explicitly and outline that passing --failed --stale at the same time won't work as they will filter each other.

If we don't support --failed --stale, I think we should programmatically detect it and raise an error, and not simply rely on users reading the docs. That said, it seems like --failed --stale is really just a special case of the more general issue that neither --failed nor --stale are able to cleanly compose with other filters, due to the file filtering they do. If that's correct, should we actually document and enforce programmatically that neither option can be combined with another filtering option?

@josevalim

Member

josevalim commented Feb 26, 2018

@myronmarston many filters are placed on CI because they are given in two different shapes: via the command line and in your test_helper.exs. For example, in Ecto, each adapter sets a bunch of filters on the test helper file.

There is also the case where you have integration tests against external APIs. In such cases, the integration tests are usually disabled by default and you may run your suite on CI like this:

$ mix test
$ mix test --only integration

A typo on "integration" means you would never run those tests and they would always pass.

If we don't support --failed --stale, I think we should programmatically detect it and raise an error, and not simply rely on users reading the docs.

I like this direction as well. 👍

the more general issue that neither --failed nor --stale are able to cleanly compose with other filters

They compose fine with other filters. For example, I still use --stale with Ecto, which relies on many filters, and there is nothing stopping you from using mix test --failed --only integration when working on your integration tests. --failed and --stale work as an AND with other filters and that can be desired.

To clarify the next steps:

  • Rename --only-failures to --failed
  • Make sure --failed exits with status 0 if nothing runs
  • Raise if --failed and --stale are used together

Thoughts?

@myronmarston

Contributor

myronmarston commented Feb 26, 2018

A typo on "integration" means you would never run those tests and they would always pass.

Thanks for the example. I've never set up a CI build that way but I can see the reason to do it and why exiting with 1 is helpful there.

--failed and --stale work as an AND with other filters and that can be desired.

I'm confused now :(. Let's put aside --stale for a second since I don't actually know much about how it works, and focus on --failed. Earlier you had said:

--only composes with OR

Under the covers, --failed is translated to --only last_run_status:failed, and also applies file filtering as an optimization. If --only filters composed with AND, I think --failed --only integration would work fine. It would run only the tests tagged with :integration that failed the last time they ran. However, given that you said that --only composes with OR, I think --failed --only integration would attempt to run the set union of all tests that failed on their last run and all tests tagged with :integration. That would be OK if not for the fact that --failed does the file filtering. Because of it, the actual behavior we'll get is to run all tests that failed on their last run plus all tests tagged with :integration from files that have at least one failing test. :integration tests from files with no failing tests will not be run. I don't think that behavior is what any user will expect, and it's pretty confusing. So, if --only filters compose with OR, I don't think we should allow --failed to be combined with any other filter, unless we remove the file filtering (at least for situations when there are other filters in effect).
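The mismatch described here can be sketched with invented tests shaped as {id, file, tags, last_run_status}:

```elixir
tests = [
  {:t1, "a_test.exs", [:integration], :passed},
  {:t2, "b_test.exs", [], :failed},
  {:t3, "b_test.exs", [:integration], :passed}
]

failed_files = MapSet.new(["b_test.exs"])

# Step 1: --failed's file filtering drops a_test.exs before loading...
loaded = Enum.filter(tests, fn {_, file, _, _} -> MapSet.member?(failed_files, file) end)

# Step 2: ...so the OR filter (failed OR :integration) only ever sees what
# survived step 1; :t1 can never run even though it is tagged :integration.
ran =
  Enum.filter(loaded, fn {_, _, tags, status} ->
    status == :failed or :integration in tags
  end)

# ran contains :t2 and :t3, but never :t1
```

This is exactly the surprising outcome: an :integration test is silently skipped just because its file happens to contain no failures.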

Regarding --stale, earlier you said:

The issue is that --stale filters the files being loaded so it doesn't work quite the same.

Since --stale also filters files, I thought it suffered from the same issue as --failed, but if it doesn't use --only under the covers and therefore is composable with AND, it's probably not a problem for it to be combined with another filter.

Regarding your next steps: those all sound good. I'll plan to address them in this PR, hopefully in the next few days.

@josevalim

Member

josevalim commented Feb 26, 2018

Since --stale also filters files, I thought it suffered from the same issue as --failed, but if it doesn't use --only under the covers and therefore is composable with AND, it's probably not a problem for it to be combined with another filter.

Yes, I forgot that --failed does add a --only behind the scenes, which makes it behave like an OR. In other words, if you pass --failed --only integration it will run all of the integration tests in the loaded files. Your summary is perfect.

I think we can go ahead with the next steps but we may need to do something regarding the composition of --failed/--only before we say this is fully done. I will think about a couple options we may have here. Suggestions are also welcome!

@josevalim

Member

josevalim commented Feb 26, 2018

I will think about a couple options we may have here.

One option is, instead of passing only: [last_run_status: :failed], we pass to the runner the exact test_ids of the tests we want to run. Since we already get the files back from the manifest, we could get the list of test ids as well. I see two benefits with this option:

  1. The runner no longer needs to care about the manifest, since it will no longer filter on last_run_status
  2. We get the composition right

The downside is that we are adding a new feature to the runner which is a bit awkward. Maybe a more general and less awkward way to implement this feature is to introduce the concept of a test filter, which allows us to discard tests without tracking them as excluded or skipped. Basically a filter we would apply here: https://github.com/elixir-lang/elixir/blob/master/lib/ex_unit/lib/ex_unit/runner.ex#L136
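A hedged sketch of the test-id approach (the map shape and names are invented; the real runner is more involved):

```elixir
# Instead of filtering on a last_run_status tag, the runner would receive
# the exact ids to keep and silently discard everything else, without
# reporting the discarded tests as excluded or skipped.
failed_ids = MapSet.new([{FooTest, :"test adds numbers"}])

tests = [
  %{module: FooTest, name: :"test adds numbers"},
  %{module: FooTest, name: :"test subtracts numbers"}
]

kept =
  Enum.filter(tests, fn %{module: m, name: n} ->
    MapSet.member?(failed_ids, {m, n})
  end)
# => only the previously failed test remains
```

Because the id set comes straight from the manifest read in mix, the runner itself stays manifest-agnostic.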

@myronmarston

Contributor

myronmarston commented Feb 28, 2018

@josevalim that makes sense, but if we do that, we lose the ability to do mix test --only last_run_status:passed or mix test --only last_run_status:unknown. These are far less useful than filtering to only failed tests (which is why I wouldn't advocate surfacing a new CLI option to users like we are doing for this case), but there are situations where they are useful:

  • --only last_run_status:passed is useful when you've got some new failing tests for a feature you're working on, and realize a particular refactoring to some interface in your project would make the feature easier to build. After making the interface change, you want to make sure you didn't break anything (but don't care to run the new failing tests for the new feature). --only last_run_status:passed is perfect for these cases.
  • --only last_run_status:unknown isn't quite as useful, but it lets you run just tests that haven't run locally before. That can be useful when you change branches or added a bunch of new tests or something.

Ultimately, our goal here was to add --failed, not those features, so losing them is probably OK. Figured it was worth calling out, though.

@josevalim

Member

josevalim commented Mar 3, 2018

Thanks @myronmarston!

It is always easy to add features later, removing them is right. So in my opinion is totally fine to remove the --only last_run_status:... for now. We can always add it later! ❤️

* `:async` - if the test case is in async mode
* `:registered` - used for `ExUnit.Case.register_attribute/3` values
* `:describe` - the describe block the test belongs to
* `:last_run_status` - status (`:passed`, `:failed`, or `:unknown`) from the test's last run

@josevalim

josevalim Mar 3, 2018

Member

So we probably won't need this anymore. :)

:registered,
:describe,
:type,
:last_run_status

@josevalim

josevalim Mar 3, 2018

Member

Nor this!

myronmarston added some commits Feb 20, 2018

Deal with `:file` tags that were incorrectly overridden by a `setup` block

Given a `setup` block like:

    setup do
      {:ok, file: :foo}
    end

...ExUnit raises an exception, but it also overrides the test's
`:file` tag, which could cause problems for the manifest since it
looks up the existence of the file on the file system, and would
get an error if it was not a string.

@myronmarston myronmarston changed the title from Add support for "--only-failures" in ExUnit to Add support for "--failed" in ExUnit Mar 3, 2018

@myronmarston

Contributor

myronmarston commented Mar 3, 2018

It is always easy to add features later, removing them is right. So in my opinion is totally fine to remove the --only last_run_status:... for now. We can always add it later! ❤️

Sounds good! This isn't always my instinct, but I think Elixir is as good as it is because you consistently resist introducing things before they are really needed, ensuring it doesn't accumulate bloat.

Anyhow, I've made the changes you've requested and implemented it how we discussed. However, there's one more change I'd like to make--but I wanted to see what you think first. Now that we are only utilizing the failed tests stored in the manifest, there really isn't a reason to store tests with any other status. And there's no need to store the last_run_status at all. In fact, we can get rid of the entry record entirely, and store the manifest as a map of %{test_id => file}. Such an approach should cause the manifest to be much smaller both on disk and in memory (as failures are usually a small number compared to the total number of tests) and should be even faster to read from disk, write back to disk, merge, etc.

Thoughts?
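A rough sketch of what that simplified shape could look like (hypothetical module and function names, not the final API from this PR):

```elixir
defmodule FailuresManifestSketch do
  @moduledoc "Illustrative sketch: the manifest as a plain map of test id to file."

  @type test_id :: {module(), atom()}
  @type t :: %{optional(test_id) => Path.t()}

  def new, do: %{}

  # Keep a test only while it is failing; a pass removes it, so the
  # manifest tends toward being empty on a healthy suite.
  def put_test(manifest, test_id, file, :failed), do: Map.put(manifest, test_id, file)
  def put_test(manifest, test_id, _file, _passed), do: Map.delete(manifest, test_id)

  # Files containing at least one failure: the minimum set of files
  # `mix test --failed` needs to load.
  def files_with_failures(manifest), do: MapSet.new(Map.values(manifest))
end
```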

def failed_test_ids(manifest) do
  for({test_id, entry(last_run_status: :failed)} <- manifest, do: test_id, uniq: true)
  |> MapSet.new()
end

@myronmarston

myronmarston Mar 3, 2018

Contributor

It's sub-optimal that both of these new functions filter to the failed statuses. If we switch to storing only failures in the manifest (as I explain in my larger comment on the PR), no filtering is needed, and this isn't a concern. OTOH, if we decide not to implement that optimization, we may want to merge these into one function that returns a tuple so it can filter only once.
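If the record-based manifest were kept, the two passes could be collapsed into a single traversal returning both sets as a tuple, along the lines of this sketch (it assumes the `entry` record also carries a `:file` field):

```elixir
# Single pass over the manifest, accumulating both the failed test
# ids and the files that contain them.
def failed_test_info(manifest) do
  Enum.reduce(manifest, {MapSet.new(), MapSet.new()}, fn
    {test_id, entry(last_run_status: :failed, file: file)}, {ids, files} ->
      {MapSet.put(ids, test_id), MapSet.put(files, file)}

    _entry, acc ->
      acc
  end)
end
```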

@josevalim

Member

josevalim commented Mar 3, 2018

It is always easy to add features later, removing them is right

Of course I meant to say "removing them is hard".

And there's no need to store the last_run_status at all. In fact, we can get rid of the entry record entirely, and store the manifest as a map of %{test_id => file}

I like this, let's go in this direction. :)

myronmarston added some commits Feb 18, 2018

Add support for `mix test --failed`
This uses the ExUnit manifest to filter to only tests that failed
the last time they ran. As an optimization, we filter out files
that have no failures so we load the minimum set of files necessary.
Optimize ExUnit manifest to store only failures
Since that is the only information we care about, we do not
have to store every test with its status. This allows us to
vastly simplify the manifest, as we no longer have to use an
`:entry` record. Reading and writing the manifest should be
faster now, since the manifest tends toward being empty.

Since we are only storing failures, I have renamed the manifest
(and the `:manifest_file` option and file path) to indicate that
it deals only with failures.
@myronmarston

Contributor

myronmarston commented Mar 5, 2018

OK, @josevalim I've implemented what we discussed. PTAL!

@@ -410,8 +424,30 @@ defmodule Mix.Tasks.Test do
end
end
@manifest_file_name ".ex_unit_failures.elixir"

@josevalim

josevalim Mar 5, 2018

Member

Since mix is the one specifying the file name, maybe we should call it .mix_test_failures?

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

|> MapSet.new()
end
@spec put_test(t, ExUnit.Test.t()) :: t

@josevalim

josevalim Mar 5, 2018

Member

Let's add a TODO here so we don't forget to check why we need this clause after merging. :)

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

@@ -132,8 +133,9 @@ defmodule ExUnit.Runner do
tests = shuffle(config, tests)
include = config.include
exclude = config.exclude
test_ids = config.only_test_ids

@josevalim

josevalim Mar 5, 2018

Member

We need to document this and the failure_manifest_file in lib/ex_unit.ex.

@myronmarston

myronmarston Mar 5, 2018

Contributor

Done.

@@ -117,6 +124,46 @@ defmodule Mix.Tasks.TestTest do
end
end
test "--failed: loads only files with failures and runs just the failures" do

@josevalim

josevalim Mar 5, 2018

Member

This test is beautiful. 😍
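For flavor, a test along those lines might look roughly like this (illustrative fixture name and failure counts; not the actual test from the PR):

```elixir
test "--failed: loads only files with failures and runs just the failures" do
  in_fixture("test_failed", fn ->
    # First run records the failures in the manifest.
    output = mix(["test"])
    assert output =~ "4 tests, 2 failures"

    # Second run loads only the failing files and re-runs those tests.
    output = mix(["test", "--failed"])
    assert output =~ "2 tests, 2 failures"
  end)
end
```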

@josevalim

Member

josevalim commented Mar 5, 2018

@myronmarston I have left only three minor comments. Everything else is perfect!

Thank you so much for all of the work and discussions around this feature.

Note to merger: REBASE and not squash.

@myronmarston

Contributor

myronmarston commented Mar 5, 2018

@josevalim comments addressed!

Address code review feedback
- Rename manifest file to indicate that mix owns the file
- Leave a TODO
- Document new ExUnit config options
- Fix ENV var typo
@josevalim

Member

josevalim commented Mar 5, 2018

I have merged manually, thank you!
❤️ 💚 💙 💛 💜

@josevalim josevalim closed this Mar 5, 2018

@josevalim

Member

josevalim commented Mar 5, 2018

Btw, I could not reproduce the nil file scenario. I have tried to return file: nil in setup but it did not allow me. I have tried to set it to nil via a tag but it did not allow me either. For now I have removed the check, hoping the case will not arise. If you have a consistent way of reproducing it, please let me know!

@myronmarston

Contributor

myronmarston commented Mar 5, 2018

I have merged manually, thank you!

It looks like you might not have pushed what you manually merged (at least not to master...at least I don't see it!). Mind pushing it so I can work off it to try to repro the file issue?

@josevalim

Member

josevalim commented Mar 5, 2018

In my defense I pushed it but the server refused it. :P Pushed now :D

@myronmarston myronmarston deleted the myronmarston:myron/only-failures branch Mar 6, 2018

@myronmarston

Contributor

myronmarston commented Mar 6, 2018

For now I have removed the check, hoping the case will not arise. If you have a consistent way of reproducing it, please let me know!

I can't repro it now, either. Not sure what fixed it...but I guess that's good news?
