-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Add support for "--failed" in ExUnit #7373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/ex_unit/lib/ex_unit/manifest.ex
Outdated
end | ||
|
||
@spec add_test(t, ExUnit.Test.t()) :: t | ||
def add_test(manifest, %ExUnit.Test{tags: %{file: file}}) | ||
when not is_binary(file), | ||
do: manifest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate this function clause is needed, but without it, ex_unit_test.exs
experiences order-dependent failures, due to this test. Apparently when you have a setup
block like setup do: {:ok, file: :foo}
it still updates the test's :file
tag even though it also triggers an exception telling the user the mistake. The file: :foo
tag would then get stored in the manifest as the file (as we always keep the results from the new manifest). Then, on the next test run, when ExUnit.Manifest.merge/2
is inspecting the old manifest entries, it would try to check File.regular?(:foo)
and get an exception. This function clause ignores the test entirely when it's :file
tag is invalid, working around the problem.
That said, while this fixes the exception, I think a problem still remains. If a user did setup do: {:ok, file: "string"}
we would store the test in the manifest with the wrong :file
value. I haven't spent any time looking into it yet, but I think a better fix would be to prevent the tag from updating in this case so that the file
passed to this function is always the real file the test came from.
lib/ex_unit/lib/ex_unit/manifest.ex
Outdated
@@ -4,19 +4,41 @@ defmodule ExUnit.Manifest do | |||
import Record | |||
|
|||
defrecord :entry, [:last_run_status, :file] | |||
@opaque t :: [{test_id, entry}] | |||
defstruct entries: [], path: nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this somewhat of a nitpick but is the path really a property of the manifest?
A manifest, once loaded, can be written anywhere in disk, and not simply on the path it was originally created at. So it feels we are trying to couple two things that don't really belong together. :) And as a consequence the code got much more complex too. Simple functions like merge now need to extract values from structs, discard paths and what not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I understand now that you did it to avoid loading the manifest multiple times. I am thinking that it is ok to read the manifest twice when the --only-failures
flag is given, since we have really optimized that.
So maybe a better way to go about this is to have a new function in ExUnit.Filters
that returns the failed files from the manifest?
Another option is to have a function in ExUnit.Filters
that reads the manifest and returns all failed files and all failed tests per module. This way, we won't tell ExUnit to run only last_status_run:failed, which we said we don't want to expose to developers, but instead we simply tell it which tests to run in any given module. With this approach, we don't need to touch the manifest in the runner at all.
lib/mix/lib/mix/tasks/test.ex
Outdated
manifest = | ||
Mix.Project.manifest_path() | ||
|> Path.join(".ex_unit_results.elixir") | ||
|> ExUnit.Manifest.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we didn't access ExUnit.Manifest in the Mix application. It is private API. I truly prefer the previous approach where we simply gave it a path. :)
Thank you @myronmarston! I have added some comments. We will also be glad to merge the commits instead of squashing them but could you please make sure you don't add a dot at the end of the commit titles? |
--only-failures
.a800c71
to
a829c1a
Compare
@josevalim thanks for the feedback. I wasn't a huge fan of converting the manifest to a struct and storing the path in it but I assumed loading the manifest only once was a requirement. Now that we've relaxed that restriction this is easier :).
This is the route I took.
I didn't quite understand your idea here, so I didn't attempt it.
Done. BTW, do you have thoughts on my comment above about how |
lib/ex_unit/lib/ex_unit/manifest.ex
Outdated
def get_files_with_failures(entries) do | ||
entries | ||
|> Stream.filter(fn {_, entry(last_run_status: status)} -> status == :failed end) | ||
|> MapSet.new(fn {_, entry(file: file)} -> file end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fastest implementation of this function would be:
for {_, entry(last_run_status: :failed, file: file)} <- entries, do: file, uniq: true
We can do everything in one pass and we can return a simple list back, which is what we end-up traversing later on anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important that we return a set. For one, conceptually, a set is the right data structure since we want to return a collection of unique files and we don't care about the order. More importantly, we want a constant-time membership check when we use this later in mix. Otherwise, the filter_test_files_using_manifest/3
function in mix is going to be an O(n * m)
operation (where n == number of test files and m == number of test files with failures).
We can do everything in one pass
With the Stream.filter
, we're only traversing the list once, anyway, right?
On a side note, I didn't know that for
comprehensions support a uniq
option. TIL!
@myronmarston this is perfect. I have just added one comment about a piece of code that we can execute faster but I can also address it after merging. The last topic I wanted to discuss is what happens in some situations:
For the first, I think we should only keep the directories given by the user, so we need to always filter on top of that. For the second, I think we should print a message saying "There are no pending failures. Re-running all the suite". I want to do that to avoid people using --only-failures to get a build to eventually pass. For the third, I am thinking that if both are passed, Thoughts?
I haven't looked into it yet, I was planning to do after merging. :) |
Oh, we already filter based on the manifest, that was one of my concerns,
great! And you are right, a set is better!
The difference between stream filter and a comprehension is that the filter
inlines at runtime, the comprehension at compile time. But given we need to
return a set, i don’t this difference matters now.
--
*José Valimwww.plataformatec.com.br
<http://www.plataformatec.com.br/>Founder and Director of R&D*
|
And to further clarify, it is still most likely that the eager approach is
the most performant one.
So I would do a comprehension that retrieves a list of unique files and
then pass that to MapSet.new.
The stream is only worth it for large data sets.
One last question: is the manifest guaranteed to return absolute or
relative file names? When we filter files against the manifest, are those
files also guaranteed to be absolute or relative?
If they are relative, there is a chance they won’t match if you invoke
something like “mix test test/../test”.
--
*José Valimwww.plataformatec.com.br
<http://www.plataformatec.com.br/>Founder and Director of R&D*
|
Agreed. That's definitely the behavior I'd expect, and it's what we did for RSpec.
I lean towards running nothing--IMO, it's the most consistent. I view all_tests
|> Stream.filter(&test_failed_last_run?/1)
|> Enum.each(&run_test/1) With a piece of code like this, if I took a look at what happens now when you do
Since we convert
TBH, I haven't really used all_tests
|> Stream.filter(fn test -> has_tag?(test, :a, 1) end)
|> Stream.filter(fn test -> has_tag?(test, :b, 2) end)
|> Enum.each(&run_test/1) I'm not sure if that's what multiple For Ultimately, I think it makes more sense to decide with how you want multiple filters to work in general, and apply that consistently to every sort of filter, rather than taking it on a case-by-case basis. |
It returns absolute files. For example, here's what gets printed when I inspect the map set before using it in the
We expand the paths when we do the filtering: Enum.filter(files, &MapSet.member?(files_with_failures, Path.expand(&1))) I believe
|
a829c1a
to
1c8ef9e
Compare
OK, I've switched to the |
Oh, I forgot we recently changed —only to exit with non-zero status in case
of missing entries. Being consistent there would make me happy but I would
still like to discuss alternatives.
One of the downsides of the current behavior is that I have to consistently
flip between flags. I use —only-failures until all tests pass, then I
remove the flag, and repeat.
Maybe —only-failures is not the best flag name, but wouldn’t it be better a
flag that I can always pass as argument? Something like —failures-first or
—focus-on-failures?
There is a similar concern about —stale. It doesn’t compose well with
—only-failures because they would filter each other and only a subset of
—only-failures would run. It probably makes sense but it not wouldn’t be
useful.
--
*José Valimwww.plataformatec.com.br
<http://www.plataformatec.com.br/>Founder and Director of R&D*
|
It sounds like you're expecting For RSpec we exit with a 0 status when all tests are filtered out. Once nice thing about that is that it is easy to compose a command to do what you're talking about: $ rspec --only-failures && rspec && commit -m "Finished refactoring" With Also, my experience and approach to running my suite isn't anyone else's, so I don't know how much it should affect the direction we go here. That said, All that said: I honestly don't feel very strongly about this, so go whichever way you feel is best!
I have some thoughts on this but I have to run, so I'll try to respond later. |
And now I am back at the computer. Thank you for the detailed description about the workflow. Given that
Would be if I am looking forward to your feedback on |
That definitely changes things! I didn't realize that. Among other things, it makes me wonder if we should not have All that said, the fact that you union multiple filters seems to add complexity, as it gets in the way of a filter doing optimizations like in this case--which it sounds like is a problem for both Given the current situation, I'm not sure how
I can understand the reasons for exiting with 1, but IMO, it's surprising. My expectation of a test framework's exit status is that it should be equivalent to either of these expressions: if Enum.all?(tests, &passed?/1), do: 0, else: 1 if Enum.any?(tests, &failed?/1), do: 1, else: 0 In both these cases, if I see now that Not sure if any of my ramblings are helpful or not, but that's all I've got tonight. |
Very helpful, as always!
Right, unfortunately I don't think it is easily achievable. I would suggest for us to at least rename To recap the whole discussion: the reason why having no tests exit with reason 1 is to make it easier to catch failures in CIs and other places where you may filter tests for the wrong reasons and never find it out. However, we can say that
So I think calling it Thoughts? |
I like
I get the reasoning behind this, but IMO, it doesn't quite fulfill the stated purpose. Consider:
If we don't support |
@myronmarston many filters are placed on CI because they are given in two different shapes: via the command line and in your test_helper.exs. For example, in Ecto, each adapter sets a bunch of filters on the test helper file. There is also the case you have integration tests with external APIs. In such cases, the integration tests are usually disabled by default and you may run your suite on CI like this:
A typo on "integration" means you would never run those tests and they would always pass.
I like this direction as well. 👍
They compose fine with other filters. For example, I still use To clarify the next steps:
Thoughts? |
Thanks for the example. I've never setup a CI build that way but I can see the reason to do it and why exiting with 1 is helpful there.
I'm confused now :(. Let's put aside
Under the covers, Regarding
Since Regarding your next steps: those all sound good. I'll plan to address them in this PR, hopefully in the next few days. |
Yes, I forgot that I think we can go ahead with the next steps but we may need to do something regarding the composition of --failed/--only before we say this is fully done. I will think about a couple options we may have here. Suggestions are also welcome! |
One option is, instead of passing
The downside is that we are adding a new feature to the runner which is a bit awkward. Maybe a more general and less awkward way to implement this feature is to introduce the concept of a test filter, which allows us to discard tests without tracking them as excluded or skipped. Basically a filter we would apply here: https://github.com/elixir-lang/elixir/blob/master/lib/ex_unit/lib/ex_unit/runner.ex#L136 |
@josevalim that makes sense, but if we do that, we lose the ability to do
Ultimately, our goal here wasn't to add |
Thanks @myronmarston! It is always easy to add features later, removing them is right. So in my opinion is totally fine to remove the |
lib/ex_unit/lib/ex_unit/case.ex
Outdated
* `:async` - if the test case is in async mode | ||
* `:registered` - used for `ExUnit.Case.register_attribute/3` values | ||
* `:describe` - the describe block the test belongs to | ||
* `:last_run_status` - status (`:passed`, `:failed`, or `:unknown`) from the test's last run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we probably won't need this anymore. :)
lib/ex_unit/lib/ex_unit/case.ex
Outdated
:registered, | ||
:describe, | ||
:type, | ||
:last_run_status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nor this!
…lock Given a `setup` block like: setup do {:ok, file: :foo} end ...ExUnit raises an exception, but it also overrides the test's `:file` tag, which could cause problems for the manifest since it looks up the existence of the file on the file system, and would get an error if it was not a string.
1c8ef9e
to
8bdb1cb
Compare
Sounds good! This isn't always my instinct, but I think Elixir is as good as it is because you consistently resist introducing things before they are really needed, ensuring it doesn't accumulate bloat. Anyhow, I've made the changes you've requested and implemented it how we discussed. However, there's one more change I'd like to make--but I wanted to see what you think first. Now that we are only utilizing the failed tests stored in the manifest, there really isn't a reason to store tests with any other status. And there's no need to store the Thoughts? |
lib/ex_unit/lib/ex_unit/manifest.ex
Outdated
def failed_test_ids(manifest) do | ||
for({test_id, entry(last_run_status: :failed)} <- manifest, do: test_id, uniq: true) | ||
|> MapSet.new() | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's sub-optimal that both of these new functions filter to the failed statuses. If we switch to storing only failures in the manifest (as I explain in my larger comment on the PR), no filtering is needed, and this isn't a concern. OTOH, if we decide not to implement that optimization, we may want to merge these into one function that returns a tuple so it can filter only once.
Of course I meant to say "removing them is hard".
I like this, let's go in this direction. :) |
This uses the ExUnit manifest to filter to only tests that failed the last time they ran. As an optimization, we filter out files that have no failures so we load the minimum set of files necessary.
Since that is the only information we care about, we do not have to store every test with its status. This allows us to vastly simplify the manifest, as we no longer have to use an `:entry` record. Reading and writing the manifest should be faster now, since the manifest tends toward being empty. Since we are only storing failures, I have renamed the manifest (and the `:manifest_file` option and file path) to indicate that it deals only with failures.
8bdb1cb
to
a28bb62
Compare
OK, @josevalim I've implemented what we discussed. PTAL! |
lib/mix/lib/mix/tasks/test.ex
Outdated
@@ -410,8 +424,30 @@ defmodule Mix.Tasks.Test do | |||
end | |||
end | |||
|
|||
@manifest_file_name ".ex_unit_failures.elixir" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since mix is the one specifying the file name, maybe we should call it .mix_test_failures
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|> MapSet.new() | ||
end | ||
|
||
@spec put_test(t, ExUnit.Test.t()) :: t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a TODO here so we don't forget to check why we need this clause after merging. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -132,8 +133,9 @@ defmodule ExUnit.Runner do | |||
tests = shuffle(config, tests) | |||
include = config.include | |||
exclude = config.exclude | |||
test_ids = config.only_test_ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to document this and the failure_manifest_file in lib/ex_unit.ex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -117,6 +124,46 @@ defmodule Mix.Tasks.TestTest do | |||
end | |||
end | |||
|
|||
test "--failed: loads only files with failures and runs just the failures" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is beautiful. 😍
@myronmarston I have left only three minor comments. Everything else is perfect! Thank you so much for all of the work and discussions around this feature. Note to merger: REBASE and not squash. |
@josevalim comments addressed! |
- Rename manifest fail to indicate that mix owns the file - Leave a TODO - Document new ExUnit config options - Fix ENV var typo
571b334
to
c50937a
Compare
I have merged manually, thank you! |
Btw, I could not reproduce the nil file scenario. I have tried to return |
It looks like you might not have pushed what you manually merged (at least not to |
In my defense I pushed it but the server refused it. :P Pushed now :D |
I can't repro it now, either. Not sure what fixed it...but I guess that's good news? |
This is a follow up to #7082 that adds support for
--only-failures
using the new ExUnit manifest.Note that I put some effort into clearly laying out the history in this PR, with explanatory commit messages, so I'd recommend merging instead of doing a squash+merge but ultimately it's up to y'all, of course.