Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make parallel testing db faster #33479

Merged

Conversation

twitnithegirl
Copy link
Contributor

@twitnithegirl twitnithegirl commented Jul 31, 2018

Summary

Currently TestDatabases.create_and_migrate for parallel tests uses migrate. For a large number of workers this makes the tests run pretty slowly. The PR changes this to load_schema which shows an increase of speed.

Benchmarks below are for a sample app with 6 tables, 10 workers, and generated tests.

Before: screen shot 2018-07-27 at 2 19 14 pm
After: screen shot 2018-07-27 at 2 19 09 pm

/cc @eileencodes

@rails-bot
Copy link

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @kaspth (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@twitnithegirl
Copy link
Contributor Author

twitnithegirl commented Jul 31, 2018

I'm just dropping the CI failures here so I know what to look at tomorrow:

Ruby 2.4.4, 2.5.1 & Head GEM-Railties:

.F
Failure:
ApplicationTests::TestTest#test_hooks_for_plugins [test/application/test_test.rb:319]:
Expected "before: false\n-- create_table(:users)\n   -> 0.0036s\nafter: true\nRun options: --seed 33770\n\n# Running:\n\n.\n\nFinished in 0.010207s, 97.9722 runs/s, 0.0000 assertions/s.\n1 runs, 0 assertions, 0 failures, 0 errors, 0 skips\n" to include "before: false\nafter: true".
rails test test/application/test_test.rb:281
.F
Failure:
ApplicationTests::TestTest#test_ruby_schema_migrations [test/application/test_test.rb:143]:
`Expected "-- create_table(:users)\n   -> 0.0054s\nRun options: --seed 8880\n\n# Running:\n\n.\n\nFinished in 0.016965s, 58.9435 runs/s, 0.0000 assertions/s.\n1 runs, 0 assertions, 0 failures, 0 errors, 0 skips\n" to not include "create_table(:users)".

ensure
ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations[Rails.env])
ENV["VERBOSE"] = old
end

def self.drop(i, spec_name:)
def self.drop(spec_name:)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not the i be used here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a couple of looks to see why this is okay: create_and_migrate modifies the original connection spec hash, so the rest of the (forked) app, including this method, sees that already-suffixed DB name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had put the i in when I wrote this and while pairing with Britni on fixing this realized that it isn't used so we should delete it.

@@ -247,6 +247,7 @@ def structure_load(*arguments)

def load_schema(configuration, format = ActiveRecord::Base.schema_format, file = nil, environment = env, spec_name = "primary") # :nodoc:
file ||= dump_filename(spec_name, format)
verbose_was, Migration.verbose = Migration.verbose, verbose?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Migration.verbose thread-safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is any of this? It runs in a fork, so I think it should be fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't the parallel test also have the threaded mode? I guess this don't run on those threads but before it in a fork.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does have a threaded mode, but we don't make new databases when we're using the threaded parallelizer because it uses new connections instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these changes for #load_schema? Is this method supposed to work with migrations?
If remove these change, tests in railties/test/application/test_test.rb succeed. See #33479 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't do this running your tests in parallel will display schema output for every single process which is noisy and unnecessary.

@kaspth
Copy link
Contributor

kaspth commented Jul 31, 2018

Cool! What happens if I create a migration that adds some columns, forget to migrate the test database, and then run my tests. Do the tests fail in a sufficiently obvious manner that I can know I need to run the migration?

@eileencodes
Copy link
Member

@kaspth this doesn't change how your tests would run if you didn't run a migration. You'd still get blocked by pending migrations error because the create_and_migrate part runs after the tests boot. This just speeds up setting up the databases for test-database-0 and test-database-1 and doesn't affect test-database.

ActiveSupport::Testing::Parallelization.run_cleanup_hook do |i|
drop(i, spec_name: Rails.env)
ActiveSupport::Testing::Parallelization.run_cleanup_hook do |_|
drop(spec_name: Rails.env)
end

def self.create_and_migrate(i, spec_name:)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also rename this method to create_and_load since we're no longer migrating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Thanks.

@rafaelfranca
Copy link
Member

Thank you @eileencodes for the clarifications, now I understand better the parallel test implementation.

Nice improvement @twitnithegirl 👏.

@twitnithegirl
Copy link
Contributor Author

twitnithegirl commented Jul 31, 2018

@rafael @eileencodes @kaspth I just pushed a change that I think will fix CI for railties/application/test_test.rb. But I feel like we're making some strange assumptions. It seems that we assume that migrations are verbose by default but that schema loading is silent by default. If you don't set ENV['VERBOSE'] then verbose? evaluates to true because anything but "false" evaluates to true. The tests, at least in railties, assume that not setting verbose will result in shema_loads being silent. Granted, before we weren't even bothering with Migration.verbose at all in the schema_load but I think that verbose? itself is misleading, or at least has some disjointed logic in practice.

@eileencodes
Copy link
Member

@twitnithegirl can you rebase and squash your commits into one?

@twitnithegirl
Copy link
Contributor Author

@eileencodes sure, should I update the changelog too?

@twitnithegirl twitnithegirl force-pushed the make_parallel_testing_db_faster branch from 6ac0e99 to e6173c9 Compare August 1, 2018 01:26
@twitnithegirl
Copy link
Contributor Author

All squashed up

@eileencodes eileencodes merged commit 78992d6 into rails:master Aug 1, 2018
@eileencodes
Copy link
Member

Thanks @twitnithegirl - no need for a changelog since this is a performance improvement and not actually in a version of Rails yet.

bogdanvlviv added a commit to bogdanvlviv/rails that referenced this pull request Aug 8, 2018
rails#33479 changed `#load_schema` to
prevent displaying schema load on running tests in parallel.
We should test this in order to prevent any regression in the future.

Context rails#33479 (comment)
bogdanvlviv added a commit to bogdanvlviv/rails that referenced this pull request Dec 3, 2018
- Fix formatting
- Don't repeat "Active Record automatically handles creating and migrating a new
  database for each worker to use."
- Tell that AR loads the schema to a database for each process(Related to rails#33479)
- Clarify that `parallelize_teardown` is executed for each process
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants