New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelize tests only when there are enough to justify the parallelization overhead #42761
Conversation
activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb
Outdated
Show resolved
Hide resolved
This would need a changelog entry. I'm a big fan of the concept. |
Yes thanks @ghiculescu it's there in the "pending" list. I'll add :) |
2e95bb8
to
3306df3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how I feel about this feature as it silently changes behavior for existing applications. I think if someone isn't explicitly turning off parallelization then there should be a warning that parallelization isn't turned on.
activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb
Outdated
Show resolved
Hide resolved
8daa191
to
6a41aa1
Compare
0d7d5ee
to
b927c28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good, I like that new warning and the method name changes. I think ParallelizeWhenSuitableExecutor
should just be ParallelizeExecutor
since it always gets initialized. Once that's fixed, can you also squash your commits into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple more comments after I hit enter on that last one, sorry for the notification noise.
I think that parallelization should always be on if PARALLEL_WORKERS
is set above 1 because that means it's an explicit setting and I think it would be odd to have to change a config and an env var.
activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb
Outdated
Show resolved
Hide resolved
activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb
Outdated
Show resolved
Hide resolved
3cabec5
to
2de5298
Compare
@eileencodes I think I addressed all your comments. I changed things a bit to move some configuration decisions into the new I still have pending to add some tests for this to wrap things up. Also I'll try to determine a more-informed default. I'll do that today or tomorrow. Thanks for the review and feedback 🙏. |
4ba649b
to
93a3af3
Compare
@dhh suggested the possibility of adapting the number of workers to the number of tests. E.g: 30 tests, 1 worker. 60 tests, 2 workers, etc. I think it would be great and it would be totally feasible with this implementation. I'd leave that for a future iteration though. |
743e065
to
ca96366
Compare
I wrote this script for trying to determine a proper default threshold here. It measures and compares times for running tests in parallel versus running them sequentially. When running the script with HEY and Basecamp I got 79 and 85, respectively. For measuring, I established a synthetic test duration of 100ms each test. If no delay, threshold goes ridiculously high (it's always faster to run sequentially). I lean towards setting the default threshold conservatively at 50. |
24de208
to
42f83a5
Compare
Parallelizing tests has a cost in terms of database setup and fixture loading. This change makes Rails disable parallelization when the number of tests is below a configurable threshold. When running tests in parallel each process gets its own database instance. On each execution, each process will update each database schema (if needed) and load all the fixtures. This can be very expensive for non trivial datasets. As an example, for HEY, when running a single file with 18 tests, running tests in parallel in my box adds an overhead of 13 seconds versus not parallelizing them. Of course parallelizing is totally worthy when there are many tests to run, but not when running just a few tests. The threshold is configurable via config.active_support.test_parallelization_minimum_number_of_tests, which is 30 50 by default. This also adds some tracing to know how tests are being executed: When in parallel: ``` Running 2829 tests in parallel in 8 processes ``` When not in parallel: ``` Running 15 tests in a single process (parallelization threshold is 30) ```
42f83a5
to
ecc5afe
Compare
This is good to merge from my side @eileencodes |
…test name. - Related to rails#42761 - Used `parallelized?` method instead of calling a method `should_parallelize?` to figure out if parallezation is enabled. - Fixed semantics of the test name corresponding to the change - Updated test name as per the code review suggestion.
…test rails#42761 made the old system of disable paralleling testing when only one test file was included obsolete.
…test #42761 made the old system of disable paralleling testing when only one test file was included obsolete.
Parallelizing tests has a cost in terms of database setup and fixture loading. This change makes Rails disable parallelization when the number of tests is below a configurable threshold.
When running tests in parallel each process gets its own database instance. On each execution, each process will update each database schema (if needed) and load all the fixtures. This can be very expensive for non trivial datasets.
As an example, for HEY, when running a single file with 18 tests, running tests in parallel in my box adds an overhead of 13 seconds versus not parallelizing them. Of course parallelizing is totally worthy when there are many tests to run, but not when running just a few tests.
The threshold is configurable via
config.active_support.test_parallelization_minimum_number_of_tests
, which is3050 by default.This also adds some tracing to know how tests are being executed:
When in parallel:
When not in parallel:
Pending:
cc @eileencodes