add --sort-first and --sort-last to mix test #13589
What if we have --sort-first and --sort-last accept files, and we use Code.require_file to explicitly require those files and return the test modules, which we then schedule first and last accordingly? I think the issue is that async ones will run immediately, but we probably have ways to work around that too?
Perhaps we don't need to do anything? The ones loaded first will run first, the ones loaded last will run last. Perhaps the only challenge is that async tests always run before the sync ones, but we can add that as a footnote? We only guarantee execution order within those groups?
It works for sync tests when we Enum.reverse the modules in I think the problem for async tests is similar: while we might compile the async test last, it is still prepended to the list of async modules in |
Sounds good to me!
Perhaps we should make it use |
Good idea with In the current state, testing in the phoenix repo
correctly runs the async resource_test first, the async token_test as the last async test right before the sync websocket_channels_test and the sync phx.gen.notifier_test right at the end. |
In the case of phoenixframework/phoenix@2c70068 even --sort-first does not help in every run as apparently other async tests are able to require the file just in time. In that case the initial |
Just wondering (apologies if it is a silly question), could something like this be achieved with consecutive calls using different tag selections?
mix test --only first
mix test --except first --except last
mix test --only last
I suppose it has the same downside regarding parallel compilation though.
@sabiwara the goal is to find dependency between tests, so separate runs won't help us. :( |
If you don't want to use a :queue, another idea is to have a "prepend/append" mode inside the ExUnit server and swap accordingly, but :queue seems to be the semantically correct operation anyway. |
async_modules = :queue.to_list(state.async_modules) |> Enum.uniq() |> :queue.from_list()
sync_modules = :queue.to_list(state.sync_modules) |> Enum.uniq() |> :queue.from_list()
Not really pretty
btw @josevalim speaking about reducing complexity, I've seen that you changed the server to append tests instead of prepending them in b665ddd, so the changes in ExUnit.Server are not necessary any more. The :queue may be a little bit faster, so keep it or not?
Did I? I think the code is still prepending, since &1 (the value in state) is on the right side?
well never mind, you're right of course
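To make the point about `&1` concrete, here is a minimal sketch. It is not the actual ExUnit.Server code: the `state` map and the module names are made up for illustration. It only shows how the position of `&1` in a capture decides whether a new element is prepended or appended.

```elixir
# Hypothetical state, loosely modelled on a server that tracks test modules.
state = %{sync_modules: [FirstTest, SecondTest]}

# `&1` (the existing list) on the right-hand side: the new module is PREPENDED.
prepended = update_in(state.sync_modules, &[ThirdTest | &1])

# `&1` on the left-hand side of `++`: the new module is APPENDED.
appended = update_in(state.sync_modules, &(&1 ++ [ThirdTest]))

IO.inspect(prepended.sync_modules)
IO.inspect(appended.sync_modules)
```

With a plain list, prepending is O(1) while `++` copies the left-hand list, which is one reason a `:queue` was suggested for FIFO ordering.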
Maybe it would be more helpful to have the option to enforce a deterministic compilation order for tests? At least as for CI, where the speed difference might not matter for smaller projects? I feel like this could make reproducing such problems much easier, even without |
@SteffenDE we could make it so And if we use the seed to It depends on what you feel would be easier to expose those Phoenix bugs. Thoughts? |
8cf6513
to
134eace
Compare
# require one file at a time
failed =
  for file <- test_files do
    do_require(false, [file], parallel_require_callbacks, warnings_as_errors?)
I think we can use Code.require_file instead... but I wonder if that largely changes the semantics of the code?
I changed it to still use the ParallelCompiler because of the parallel_require_callbacks and the warnings_as_errors? option (134eace), otherwise --stale or --warnings-as-errors won't work if combined with --sort-first/last or --trace-require.
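As a rough illustration of "require one file at a time while still going through the parallel compiler", the sketch below calls `Kernel.ParallelCompiler.require/2` once per file. It does not reproduce the PR's private `do_require` helper or its callbacks; the temporary directory, file names and `SortSketch.*` module names are assumptions made up for this example.

```elixir
# Create two throwaway files in the system temp dir (hypothetical names).
dir = Path.join(System.tmp_dir!(), "sort_first_sketch_#{System.unique_integer([:positive])}")
File.mkdir_p!(dir)

files =
  for name <- ["a", "b"] do
    path = Path.join(dir, "#{name}_sketch.exs")
    File.write!(path, "defmodule SortSketch.#{String.upcase(name)} do\nend\n")
    path
  end

# Requiring one file per call forces a sequential load order,
# while still going through the parallel compiler.
loaded =
  Enum.flat_map(files, fn file ->
    case Kernel.ParallelCompiler.require([file]) do
      {:ok, modules, _warnings} -> modules
      {:error, _errors, _warnings} -> raise "failed to require #{file}"
    end
  end)

IO.inspect(loaded)
File.rm_rf!(dir)
```

Passing the whole list in a single call would let the compiler load the files concurrently, so the order in which modules become available would no longer be guaranteed.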
@@ -278,13 +280,42 @@ defmodule ExUnit.Runner do
      end
    end

    to_run = sort_first_last(config, to_run)
I don't think these changes are necessary? We only sort files, not actual tests?
If we only want to support sorting files, then yes. But as I am using the existing ExUnit filters, I saw the opportunity to also support --sort-first=mytag to sort tests themselves.
I see. Let's start with either tags or files first. I think files will reduce complexity, so that gets my vote. :)
Hmm, so it would only remove the changes in ExUnit.Runner, but then I'd probably need to validate that the parameters given to --sort-first/last only use file: filters. Or would you recommend not using the ExUnit filter syntax and passing files directly? Instead of:
mix test --sort-first file:path/to/test.exs --sort-first file:path/to/other/test.exs
just:
mix test --sort-first path/to/test.exs --sort-first path/to/other/test.exs
Maybe test complexity would be reduced, I'll need to look at those.
Exactly, just:
mix test --sort-first path/to/test.exs --sort-first path/to/other/test.exs
I've been thinking about another flag that would make debugging flaky tests caused by global state easier: Combining this with So maybe it makes sense to split this:
Properly naming the flags can be further discussed. Maybe |
Wouldn't it be the same as
It seems this is definitely a minimum requirement. I am afraid, however, this is not enough for reproducing a workflow. For example, imagine CI fails. Maybe, however, the flag we need to add is |
Not exactly the same, as even with
Exactly. It only helps for runs with the same settings. But it's still a win, as you can then try running locally until you can hopefully reproduce it with So I'd propose the following:
After all this I'm not fully convinced that |
I think we only need to add
Do we even need the queue for FIFO ordering? If they are loaded in order, it is deterministic anyway. I think we only need FIFO ordering if we add --sort-first and --sort-last. So overall, I think it is:
|
I don't think so. Imagine a test (let's call it FooTest) that takes a non-deterministic amount of time to run. Maybe it does a database operation or something. For now let's assume that it sometimes takes 1 second and sometimes up to 5. And as compilation happens concurrently we could have:
The order of test runs now is: FooTest, BazTest, then whatever is last compiled while BazTest runs, ...
Now another run, FooTest takes 5 seconds to run. While FooTest runs, more than two other tests are compiled. The order of test runs is: FooTest, LastCompiledTest, SecondLastCompiledTest, ..., BazTest, BarTest
So for deterministic order either compilation would need to wait for each test to complete, or we need a FIFO order, right?
I see. That's why you also proposed It feels that loading tests and running tests as two distinct events, instead of concurrently, would be even better if we want to guarantee determinism? So maybe we should add these two options:
|
I don't know if it's "better". Phoenix:
I like that! While initially implementing |
So let's do |
That sounds good to me! I’ll close this one and work on new PRs. Thank you so much for the guidance :) |
In order to have more deterministic test runs when using `--max-cases 1` and `--max-requires 1` (elixir-lang#13635) (see also elixir-lang#13589), we need to run tests in compilation order (FIFO). In the past, ExUnit.Server prepended new tests to the front of a list, which resulted in the most recently added test being run first.
Let's quickly demonstrate the problem this causes for deterministic runs with a simple example: Imagine a test (let's call it FooTest) that takes a non-deterministic amount of time to run. For now let's assume that it sometimes takes 1 second and sometimes up to 5. And as async tests execute in parallel with compilation of other test files, we could have the following scenario:
FooTest is compiled and, because it's async, it is immediately started. It takes 1 second to run. In this 1 second two more tests are compiled. First BarTest is prepended to the list, then BazTest. The order of test runs now is: FooTest, BazTest, then whatever is last compiled while BazTest runs, ...
Now another run, FooTest takes 5 seconds to run. While FooTest runs, more than two other tests are compiled. The order of test runs is: FooTest, LastCompiledTest, SecondLastCompiledTest, ..., BazTest, BarTest
This can be fixed either by appending new test modules to the end of the list, or (and that's what this commit does) by using a `:queue` instead.
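The difference between the two orderings can be sketched in a few lines. This is only an illustration of list prepending versus Erlang's `:queue`, not the ExUnit.Server implementation; `QuxTest` is a hypothetical third module added to make the ordering visible.

```elixir
# Modules in the order they finish compiling (QuxTest is a made-up extra module).
compiled = [BarTest, BazTest, QuxTest]

# Prepending: each new module goes to the front, so the newest one runs first (LIFO).
lifo = Enum.reduce(compiled, [], fn mod, acc -> [mod | acc] end)

# FIFO: enqueue at the rear with :queue.in/2, dequeue from the front with :queue.out/1.
queue = Enum.reduce(compiled, :queue.new(), fn mod, q -> :queue.in(mod, q) end)
{{:value, next}, rest} = :queue.out(queue)
fifo = [next | :queue.to_list(rest)]

IO.inspect(lifo)
IO.inspect(fifo)
```

With the queue, the run order matches the compilation order no matter how many modules were compiled while an earlier test was still running.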
This is based on an idea @josevalim had when debugging Phoenix tests. It is not working quite as intended though (yet?).
The goal is to provide an option to sort specific modules to be compiled first / last (and I extended this to also specify the order of tests based on the ExUnit filter syntax).
The problem as far as I can tell is that the test files are still compiled in parallel, so changing the order of the files passed to Kernel.ParallelCompiler.require does not guarantee that the modules are indeed compiled first or last.
Examples on how to use:
This would ideally compile the given file first and run all tests matching the mytag:true filter after all the other tests in the file.