
[ExUnit] Execute only one test with the line filter, per file with multiple test modules #11949

Conversation

studzien

Intro

This PR is the continuation of #11863.
This PR changes the behavior of ExUnit's filter.
It ensures that only the test referenced in the filter is executed when more than one test module is defined in a single test file (see the example below).
Defining multiple test modules in one file is handy when parallelizing tests aggressively, for example by defining each test in its own async: true test module.

Example

Given the following test module:

defmodule MultipleTestModulesTest do
  defmodule TestModule1 do
    use ExUnit.Case

    test "one" do
      assert "one" == 1
    end
  end

  defmodule TestModule2 do
    use ExUnit.Case

    test "two" do
      assert "two" == 2
    end
  end
end

Currently, on main, when we run the test with the line filter mix test test/multiple_test_modules_in_file.exs:14, both tests are executed. The tests in a file are filtered on a module-by-module basis -- the first test from every module that starts before line 14 is executed:

% mix test test/multiple_test_modules_in_file.exs:14
Excluding tags: [:test]
Including tags: [line: "14"]



  1) test two (MultipleTestModulesTest.TestModule2)
     test/multiple_test_modules_in_file.exs:13
     Assertion with == failed
     code:  assert "two" == 2
     left:  "two"
     right: 2
     stacktrace:
       test/multiple_test_modules_in_file.exs:14: (test)



  2) test one (MultipleTestModulesTest.TestModule1)
     test/multiple_test_modules_in_file.exs:5
     Assertion with == failed
     code:  assert "one" == 1
     left:  "one"
     right: 1
     stacktrace:
       test/multiple_test_modules_in_file.exs:6: (test)


Finished in 0.02 seconds (0.00s async, 0.02s sync)
2 tests, 2 failures

The changes proposed in this PR consider both the first and the last line of every test and ensure that a test is executed only if the filter line falls within that range:

% mix test test/multiple_test_modules_in_file.exs:14
Excluding tags: [:test]
Including tags: [line: "14"]



  1) test two (MultipleTestModulesTest.TestModule2)
     test/multiple_test_modules_in_file.exs:13
     Assertion with == failed
     code:  assert "two" == 2
     left:  "two"
     right: 2
     stacktrace:
       test/multiple_test_modules_in_file.exs:14: (test)


Finished in 0.02 seconds (0.00s async, 0.02s sync)
2 tests, 1 failure, 1 excluded
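The range check described above can be sketched as a simplified model. Note that the `LineRangeFilter` module and the `first_line`/`last_line` fields are illustrative, not ExUnit's internal representation:

```elixir
# Simplified model of the proposed filter: a test matches a line filter
# only when the line falls between the test's first and last line.
defmodule LineRangeFilter do
  def match?(%{first_line: first, last_line: last}, line) when is_integer(line) do
    line >= first and line <= last
  end
end

tests = [
  %{name: "one", first_line: 5, last_line: 7},
  %{name: "two", first_line: 13, last_line: 15}
]

# With the filter line 14, only "two" is selected.
selected = Enum.filter(tests, &LineRangeFilter.match?(&1, 14))
```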

Considerations

  • Mapping test modules to the files they are defined in is performed once for every batch of async tests and once for the sync tests. Storing the test module -> file mapping in the ExUnit.Server's state could solve this;
  • Since async tests start running before all the test modules are loaded, I'm wondering whether there's a possibility of race conditions here. This should not be a problem if the test modules from a single file are loaded in order, but I'm not sure that's guaranteed.
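The module -> file mapping mentioned in the first point can be derived from standard module metadata; a minimal sketch (the `ModuleFileMap` name is illustrative):

```elixir
# Sketch of grouping loaded test modules by the source file they were
# compiled from. module_info(:compile)[:source] is standard Erlang module
# metadata and returns the source path as a charlist.
defmodule ModuleFileMap do
  def modules_per_file(modules) do
    Enum.group_by(modules, fn module ->
      module.module_info(:compile)[:source] |> List.to_string()
    end)
  end
end
```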

@@ -103,7 +104,9 @@ defmodule ExUnit.Runner do

# Slots are available, start with async modules
modules = ExUnit.Server.take_async_modules(available) ->
running = spawn_modules(config, modules, running)
tests_per_file = tests_per_file(ExUnit.Server.get_all_modules())
Member


Unfortunately this is problematic because modules are loaded dynamically. So there is a chance I have two modules in the same file, the first one is loaded, the second one is not yet loaded, so get_all_modules won't return the second one and we will still run more tests than we should. :(

Member


Unfortunately I can't come up with any other ideas either. The best option I can think of is to look into the test manifest and transform line: ... into line: ..., max_line: .... But even then it is a bit tricky.

Author


I get this; I just wasn't sure whether all test cases from one file are loaded at once. That would solve this, but it would still be pretty ugly because it would be very implicit.

Another thought I had (which might be what you're suggesting) is to do it independently of the current line tag (so the line tag behavior stays as it is),
and to introduce a pair of min_line and max_line tags, executing all the tests from that line range when both are passed.
So something similar to the first attempt at solving this, but independent of the current behavior; in mix this could be passed like mix test test_file.exs:10-30
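A hypothetical parser for such a path:first-last argument (the 10-30 range syntax is only a proposal in this thread; the `RangeArg` name and return shape are illustrative, not part of mix test today):

```elixir
# Hypothetical parser for the proposed `path:first-last` syntax.
# Returns {path, min_line, max_line}; a single `path:line` form maps to
# a one-line range, and a bare path has no line constraint.
defmodule RangeArg do
  def parse(arg) do
    case String.split(arg, ":") do
      [path, range] ->
        case String.split(range, "-") do
          [line] ->
            n = String.to_integer(line)
            {path, n, n}

          [first, last] ->
            {path, String.to_integer(first), String.to_integer(last)}
        end

      [path] ->
        {path, nil, nil}
    end
  end
end
```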

Member


So something similar to the first attempt at solving this, but independent of the current behavior; in mix this could be passed like mix test test_file.exs:10-30

Oh, I like this approach! It may give additional features too.

@josevalim
Member

Hi @studzien! I just read the blog post (awesome job!) and I am wondering if a better solution to this problem is for us to introduce:

async: :per_module | :per_test | true | false 

(where true is equivalent to :per_module).

This would mean the original root issue is no longer relevant to you: you would no longer need one module per test, only one per file, and that should compile much faster too. Would you like to work on such a PR?
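Normalizing such an option could look like this (a sketch only; :per_test is a proposed value in this thread, with true kept as an alias for :per_module, and the `AsyncOption` module name is illustrative):

```elixir
# Sketch of normalizing the proposed async option, where `true` is
# equivalent to :per_module and `false` keeps the module synchronous.
defmodule AsyncOption do
  def normalize(true), do: :per_module
  def normalize(false), do: false
  def normalize(mode) when mode in [:per_module, :per_test], do: mode
end
```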

@studzien
Author

studzien commented Jul 7, 2022

Hi @studzien! I just read the blog post (awesome job!) and I am wondering if a better solution to this problem is for us to introduce:

async: :per_module | :per_test | true | false 

(where true is equivalent to :per_module).

This would mean the original root issue is no longer relevant to you: you would no longer need one module per test, only one per file, and that should compile much faster too. Would you like to work on such a PR?

Yup, sounds good! Sure!

@josevalim
Member

The only thing I am not sure about is what to do with the max_cases configuration. But I think we can say that it applies to both the number of modules and the number of tests. This means that you may have max_cases * max_cases tests running at the same time, but we can add more fine-grained controls later.

@drtheuns

@josevalim I've been looking into a possible implementation for this. I see two paths in the current runner.ex code:

  1. For async: :per_test, replace run_tests/3 with a function similar to async_loop/3 (but for spawning tests instead). This will result in the potential max_cases * max_cases scenario.

  2. Make the process that was spawned for the module (in spawn_modules/3) its own little server that suspends once setup_all has run. It should then somehow push all the tests that need to be executed for the module back into async_loop (maybe a new suspended parameter that, when non-empty, takes precedence over fetching new async modules?). async_loop would then schedule work according to the max_cases parameter by sending tests to the suspended module process. With this approach, no more than max_cases tests will be executed at once. I think this would result in the least surprising behaviour, but the implementation is a bit more hands-on.

Which one would be preferred? (or an alternative approach)

@josevalim
Member

I like suggestion 2. Maybe we could push {pid, ref} to ExUnit.Server and then the runner, once it sees such a "case", sends a message to it so it runs the next test.
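The handshake suggested here — registering a {pid, ref} and having the runner message it — can be sketched as a toy model with plain message passing (names are illustrative; this is not the actual ExUnit.Server implementation):

```elixir
# Toy model of the suggested handshake: a "module" process announces a
# {pid, ref} pair, suspends, and runs its next test only when the runner
# messages it.
defmodule SuspendedModule do
  def start(runner) do
    spawn(fn ->
      ref = make_ref()
      # Announce ourselves as a suspended "case" and wait to be scheduled.
      send(runner, {:suspended, self(), ref})

      receive do
        {:run_next, ^ref} -> send(runner, {:test_finished, ref})
      end
    end)
  end
end

runner = self()
SuspendedModule.start(runner)

# The runner sees the suspended case and tells it to run its next test.
result =
  receive do
    {:suspended, pid, ref} ->
      send(pid, {:run_next, ref})

      receive do
        {:test_finished, ^ref} -> :ok
      end
  end
```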

@drtheuns

I made some progress on this, but feel a little blocked in the implementation currently. Code here: main...drtheuns:elixir:main

I've set it up as follows:

When async: :per_test is set, the module test process will block in run_module with its own little async_module_loop. It will spawn a process for each test that waits until it is told to execute by async_loop.

However, I've run into the following problem: async_loop may finish processing the async modules before the module process has had a chance to push its tests into ExUnit.Server. As a result, async_loop blocks until the async modules are finished (the true -> branch), without ever fetching the pending tests from the server. I feel a bit apprehensive about making spawn_modules blocking, because I don't want to slow down existing async: true tests, especially if a module has a slow setup_all.

I've thought about introducing an await_modules list into async_loop that holds the :per_test refs that haven't yet notified the loop that they're done pushing tests onto ExUnit.Server. This way, async_loop has a bit more control over the blocking/awaiting.

Another issue I've thought about is max_cases. If async_loop has max_cases = 10 and spawns 10 module processes before it is able to see any :per_test processes, then it's basically stuck: it blocks until a module is done, but a module will never be done because its tests aren't being executed. So perhaps the :per_test module processes shouldn't count towards the max_cases check?

@josevalim
Member

What if we have been approaching this issue the wrong way?

Maybe, instead of swapping it per module, maybe we should swap the whole suite to run tests async via a flag? This way we don't need to consider the interplay of those two options and it should simplify the implementation. In both cases, the concurrency is controlled with max_cases.

WDYT?

@drtheuns

Do you mean something like?

ExUnit.configure(async_mode: :per_test | :per_module)

which would then determine how the runner will treat async: true code?

In the :per_module case, we can just keep the existing implementation.

For the :per_test case we could then exclude the module processes from the max_cases count, because other than running setup_all they're not actually executing tests. Perhaps we could still limit the number of async modules that are spawned, but at least they won't block the execution of the actual tests.

Is this what you had in mind?

@josevalim
Member

Yes, that's what I had in mind. :)

@josevalim
Member

Closing this one for now, due to lack of activity. Thank you.

@josevalim josevalim closed this Jan 17, 2024