Parallelize tests only when there are enough to justify the parallelization overhead #42761

jorgemanrubia · 2021-07-12T15:52:20Z

Parallelizing tests has a cost in terms of database setup and fixture loading. This change makes Rails disable parallelization when the number of tests is below a configurable threshold.

When running tests in parallel each process gets its own database instance. On each execution, each process will update each database schema (if needed) and load all the fixtures. This can be very expensive for non trivial datasets.

As an example, for HEY, when running a single file with 18 tests, running tests in parallel in my box adds an overhead of 13 seconds versus not parallelizing them. Of course parallelizing is totally worthy when there are many tests to run, but not when running just a few tests.

The threshold is configurable via config.active_support.test_parallelization_minimum_number_of_tests, which is 30 50 by default.

This also adds some tracing to know how tests are being executed:

When in parallel:

Running 2829 tests in parallel using 8 processes

When not in parallel:

Running 15 tests in a single process (parallelization threshold is 30)

Pending:

Document
Changelog
Tests
Refine default (30 is pretty much a wild guess). Update: see this.

cc @eileencodes

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb

ghiculescu · 2021-07-12T15:56:40Z

This would need a changelog entry. I'm a big fan of the concept.

jorgemanrubia · 2021-07-12T15:57:44Z

Yes thanks @ghiculescu it's there in the "pending" list. I'll add :)

eileencodes

I don't know how I feel about this feature as it silently changes behavior for existing applications. I think if someone isn't explicitly turning off parallelization then there should be a warning that parallelization isn't turned on.

activesupport/lib/active_support/test_case.rb

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb

activesupport/CHANGELOG.md

eileencodes

This is looking good, I like that new warning and the method name changes. I think ParallelizeWhenSuitableExecutor should just be ParallelizeExecutor since it always gets initialized. Once that's fixed, can you also squash your commits into one?

eileencodes

I have a couple more comments after I hit enter on that last one, sorry for the notification noise.

I think that parallelization should always be on if PARALLEL_WORKERS is set above 1 because that means it's an explicit setting and I think it would be odd to have to change a config and an env var.

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb

jorgemanrubia · 2021-07-13T11:29:38Z

@eileencodes I think I addressed all your comments. I changed things a bit to move some configuration decisions into the new ParalellizeExecutor.

I still have pending to add some tests for this to wrap things up. Also I'll try to determine a more-informed default. I'll do that today or tomorrow.

Thanks for the review and feedback 🙏.

jorgemanrubia · 2021-07-13T11:49:04Z

@dhh suggested the possibility of adapting the number of workers to the number of tests. E.g: 30 tests, 1 worker. 60 tests, 2 workers, etc. I think it would be great and it would be totally feasible with this implementation. I'd leave that for a future iteration though.

jorgemanrubia · 2021-07-14T10:58:22Z

I wrote this script for trying to determine a proper default threshold here. It measures and compares times for running tests in parallel versus running them sequentially. When running the script with HEY and Basecamp I got 79 and 85, respectively.

For measuring, I established a synthetic test duration of 100ms each test. If no delay, threshold goes ridiculously high (it's always faster to run sequentially).

I lean towards setting the default threshold conservatively at 50.

Parallelizing tests has a cost in terms of database setup and fixture loading. This change makes Rails disable parallelization when the number of tests is below a configurable threshold. When running tests in parallel each process gets its own database instance. On each execution, each process will update each database schema (if needed) and load all the fixtures. This can be very expensive for non trivial datasets. As an example, for HEY, when running a single file with 18 tests, running tests in parallel in my box adds an overhead of 13 seconds versus not parallelizing them. Of course parallelizing is totally worthy when there are many tests to run, but not when running just a few tests. The threshold is configurable via config.active_support.test_parallelization_minimum_number_of_tests, which is 30 50 by default. This also adds some tracing to know how tests are being executed: When in parallel: ``` Running 2829 tests in parallel in 8 processes ``` When not in parallel: ``` Running 15 tests in a single process (parallelization threshold is 30) ```

jorgemanrubia · 2021-07-14T13:52:45Z

This is good to merge from my side @eileencodes

…test name. - Related to rails#42761 - Used `parallelized?` method instead of calling a method `should_parallelize?` to figure out if parallezation is enabled. - Fixed semantics of the test name corresponding to the change - Updated test name as per the code review suggestion.

…test rails#42761 made the old system of disable paralleling testing when only one test file was included obsolete.

…test #42761 made the old system of disable paralleling testing when only one test file was included obsolete.

rails-bot bot added the activesupport label Jul 12, 2021

jorgemanrubia commented Jul 12, 2021

View reviewed changes

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb Outdated Show resolved Hide resolved

jorgemanrubia force-pushed the smart-parallel-tests branch from 2e95bb8 to 3306df3 Compare July 12, 2021 16:02

rails-bot bot added the docs label Jul 12, 2021

eileencodes reviewed Jul 12, 2021

View reviewed changes

activesupport/lib/active_support/test_case.rb Outdated Show resolved Hide resolved

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb Outdated Show resolved Hide resolved

activesupport/CHANGELOG.md Outdated Show resolved Hide resolved

jorgemanrubia force-pushed the smart-parallel-tests branch 2 times, most recently from 8daa191 to 6a41aa1 Compare July 12, 2021 20:37

rails-bot bot added the railties label Jul 12, 2021

jorgemanrubia force-pushed the smart-parallel-tests branch from 0d7d5ee to b927c28 Compare July 12, 2021 20:38

eileencodes reviewed Jul 12, 2021

View reviewed changes

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb Outdated Show resolved Hide resolved

activesupport/lib/active_support/testing/parallelize_when_suitable_executor.rb Outdated Show resolved Hide resolved

jorgemanrubia force-pushed the smart-parallel-tests branch 2 times, most recently from 3cabec5 to 2de5298 Compare July 13, 2021 11:26

jorgemanrubia force-pushed the smart-parallel-tests branch 2 times, most recently from 4ba649b to 93a3af3 Compare July 13, 2021 11:35

jorgemanrubia changed the title ~~Parallelize tests only when there are a number enough to make it worthy~~ Parallelize tests only when there are enough to compensate the parallelization overhead Jul 13, 2021

jorgemanrubia changed the title ~~Parallelize tests only when there are enough to compensate the parallelization overhead~~ Parallelize tests only when there are enough to justify the parallelization overhead Jul 13, 2021

jorgemanrubia force-pushed the smart-parallel-tests branch from 743e065 to ca96366 Compare July 14, 2021 09:23

jorgemanrubia force-pushed the smart-parallel-tests branch 4 times, most recently from 24de208 to 42f83a5 Compare July 14, 2021 11:23

jorgemanrubia force-pushed the smart-parallel-tests branch from 42f83a5 to ecc5afe Compare July 14, 2021 11:36

eileencodes merged commit c183397 into rails:main Jul 14, 2021

akshaymohite mentioned this pull request Jul 14, 2021

Used already defined method parallelized? and fixed semantics of test name. #42785

Merged

jorgemanrubia mentioned this pull request Jul 14, 2021

Add an option to configure parallelization threshold at the test case level #42789

Merged

jorgemanrubia mentioned this pull request Jul 19, 2021

Remove mechanism to disable test parallelization based on number of test files #42811

Merged

jorgemanrubia added a commit to basecamp/rails that referenced this pull request Jul 19, 2021

Remove mechanism to disable test parallelization when runnin gonly 1 …

debbbbc

…test rails#42761 made the old system of disable paralleling testing when only one test file was included obsolete.

jeremy pushed a commit that referenced this pull request Jul 27, 2021

Remove mechanism to disable test parallelization when runnin gonly 1 …

a29884d

…test #42761 made the old system of disable paralleling testing when only one test file was included obsolete.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parallelize tests only when there are enough to justify the parallelization overhead #42761

Parallelize tests only when there are enough to justify the parallelization overhead #42761

jorgemanrubia commented Jul 12, 2021 •

edited

ghiculescu commented Jul 12, 2021

jorgemanrubia commented Jul 12, 2021 •

edited

eileencodes left a comment

eileencodes left a comment

eileencodes left a comment

jorgemanrubia commented Jul 13, 2021 •

edited

jorgemanrubia commented Jul 13, 2021 •

edited

jorgemanrubia commented Jul 14, 2021 •

edited

jorgemanrubia commented Jul 14, 2021 •

edited

Parallelize tests only when there are enough to justify the parallelization overhead #42761

Parallelize tests only when there are enough to justify the parallelization overhead #42761

Conversation

jorgemanrubia commented Jul 12, 2021 • edited

ghiculescu commented Jul 12, 2021

jorgemanrubia commented Jul 12, 2021 • edited

eileencodes left a comment

Choose a reason for hiding this comment

eileencodes left a comment

Choose a reason for hiding this comment

eileencodes left a comment

Choose a reason for hiding this comment

jorgemanrubia commented Jul 13, 2021 • edited

jorgemanrubia commented Jul 13, 2021 • edited

jorgemanrubia commented Jul 14, 2021 • edited

jorgemanrubia commented Jul 14, 2021 • edited

jorgemanrubia commented Jul 12, 2021 •

edited

jorgemanrubia commented Jul 12, 2021 •

edited

jorgemanrubia commented Jul 13, 2021 •

edited

jorgemanrubia commented Jul 13, 2021 •

edited

jorgemanrubia commented Jul 14, 2021 •

edited

jorgemanrubia commented Jul 14, 2021 •

edited