New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel testing #31900

Merged
merged 1 commit into from Feb 16, 2018

Conversation

@eileencodes
Member

eileencodes commented Feb 5, 2018

@tenderlove and I worked on this which adds parallel testing to Rails applications by default. New applications will have parallel testing enabled by default, and older applications can add it to their test helper:

class ActiveSupport::TestCase
  parallelize(workers: 2)
end

Parallel testing in this implementation utilizes forking processes over threads. The reason we (tenderlove and eileencodes) chose forking processes over threads is forking will be faster with single databases, which most applications will use locally. Using threads is beneficial when tests are IO bound but the majority of tests are not IO bound. NOTE: after some experimentation we added a threaded parallelization method as well, but forked processes are still the default.

If an application doesn't want to use parallel testing they can either remove the parallelize block from the test application or set PARALLEL_WORKERS to 1 or fewer. For environments where you want to change the default number of workers from what you've set in your test_helper.rb you can export / set an environment variable to change the number of workers used. The following will use 15 workers and split the tests into 15 processes.

PARALLEL_WORKERS=15 bin/rails test

Note: While parallel testing will work with multiple primary databases, it currently doesn't rollback fixtures correctly. I'm actively working on fixing that but decided it was out of scope for this particular feature, since fixing it is not a feature of parallel testing but rather a bug / inconsistency in how Rails is handled. The fix for that should be coming shortly. Parallel testing and multiple primary databases does work with tests if not using fixtures. I'm not sure why I thought this but I just tested it locally again and the fixtures work. I think I had a bug in my setup the last time I tested this.

If you have multiple databases they can be setup like this in your test_helper.rb

class ActiveSupport::TestCase
  parallelize(workers: 2)

  parallelize_setup do |worker|
    # create a db w/ worker. Runs after processes are forked
  end

  parallelize_teardown do |worker|
    # delete the test databases or other cleanup. Runs before processes are closed
  end
end

To do:

  • Documentation
  • Guides
  • CHANGELOG

cc/ @tenderlove @dhh

@eileencodes eileencodes added this to the 6.0.0 milestone Feb 5, 2018

@eileencodes eileencodes self-assigned this Feb 5, 2018

@tenderlove

This comment has been minimized.

Member

tenderlove commented Feb 5, 2018

This looks great! 3 things off the top of my head:

  1. I'm not sure if Windows supports unix sockets. We should test that and possibly opt-out Windows folks.
  2. JRuby definitely doesn't support fork. We might be able to get this to work with Process.spawn out of the box, but maybe we should opt-out JRuby folks for now too?
  3. I like the class-level parallelize API for setting the number of workers, but do we care if people call the method more than once? (I think the answer is "no", but I just want to check)
@nynhex

This comment has been minimized.

nynhex commented Feb 5, 2018

This is great work! Thanks @eileencodes @tenderlove ❤️

@jasl

This comment has been minimized.

Contributor

jasl commented Feb 6, 2018

It seems Windows 10 upcoming RS4 release will support unix sockets partially, see https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/

@metacritical

This comment has been minimized.

metacritical commented Feb 6, 2018

Does anyone use rails on Windows either development or production? MRI is still large i dont know anyone using JRUby on production. Opting them out for this release is OK.

@GBH

This comment has been minimized.

Contributor

GBH commented Feb 6, 2018

Is this a replacement for https://github.com/grosser/parallel_tests ?

@eileencodes

This comment has been minimized.

Member

eileencodes commented Feb 6, 2018

@metacritical I don't want to discount users of JRuby or Windows, however I think since we're a ways out from Rails 6.0 there's a ton of potential for this feature to evolve. I'd love to see a way to use threads over processes if someone finds that useful. I don't think it's necessary for the first iteration though. Once we have an API nailed down and this merged we can iterate on it and add a threads feature.

@GBH yes technically it replaces it, but not because we think the gem is doing anything incorrect. We're not personally using it at GitHub but we wrote our own implementation based on how we parallelize our own tests.

activesupport/lib/active_support/testing/parallelization.rb Outdated
@url = "drbunix://#{file}"
@pool = []
DRb.start_service(@url, @queue)

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 6, 2018

Contributor

I think we should generate uri by DRb in order to guarantee uniqueness. See #31591.

activesupport/lib/active_support/testing/parallelization.rb Outdated
require "drb"
require "drb/unix"
require "tempfile"

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 6, 2018

Contributor

looks like we can remove require "tempfile" since we don't use Tempfile in this file.

activerecord/lib/active_record/test_databases.rb Outdated
ActiveRecord::Tasks::DatabaseTasks.create(connection_spec)
ActiveRecord::Base.establish_connection(connection_spec)
if ActiveRecord::Base.connection.migration_context.needs_migration?
ActiveRecord::Tasks::DatabaseTasks.migrate

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 6, 2018

Contributor

I think we should always execute (and rely on) ActiveRecord::Tasks::DatabaseTasks.migrate to ensure full setup for test database since this creates tables like schema_migrations, ar_internal_metadata etc.

This comment has been minimized.

@eileencodes

eileencodes Feb 7, 2018

Member

They'll always need a migration because they'll always be brand new. I think however I'm going to drop this implementation and load up the structure/schema instead because it will be faster.

activerecord/lib/active_record/test_databases.rb Outdated
ActiveRecord::Tasks::DatabaseTasks.migrate
end
ensure
ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations[Rails.env])

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 6, 2018

Contributor

It is just question: can't realize why do we need to re-establish the connection?

This comment has been minimized.

@eileencodes

eileencodes Feb 7, 2018

Member

We need to re-establish the original connection to AR Base since we want the test connection, not the other db connection. Otherwise the tests will try to all run against AR Base with the other db connection, not the test connection.

activerecord/lib/active_record/test_databases.rb Outdated
require "active_support/testing/parallelization"
module ActiveRecord
module TestDatabases

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 6, 2018

Contributor

I think we should express this module as private api by # :nodoc:

This comment has been minimized.

@eileencodes

eileencodes Feb 7, 2018

Member

I haven't decided yet if I want to make this private or document it. I think it could be useful for setting up test dbs when you have multiple databases.

This comment has been minimized.

@eileencodes

eileencodes Feb 15, 2018

Member

Decided to no doc this for now. Easier to document later than undocument.

@alissonbrunosa

This comment has been minimized.

alissonbrunosa commented Feb 7, 2018

I'm afraid this problem is beyond the scope, but is there any way to define whether a test should run on threads or processes? For instance, if I had a test that is IO bound, it would be awesome to run on threads.
I don’t know, maybe this will solve the JRuby’s problem with fork as well.

@yeongsheng-tan

This comment has been minimized.

yeongsheng-tan commented Feb 7, 2018

Nice and sweet. 👏💖

end
end
def <<(o)

This comment has been minimized.

@simi

simi Feb 7, 2018

Contributor

What not use delegator for << and pop methods?

end
end
def <<(o)

This comment has been minimized.

@dogweather

dogweather Feb 7, 2018

Contributor

I'm wondering what o is — object? Is there some name we could give it that'd help describe what it's intended to be?

@pangolingo

This comment has been minimized.

pangolingo commented Feb 7, 2018

Sometimes I set breakpoints in tests with Pry or Byebug. I assume the might not work with multiple processes. Could we add a note to the docs mentioning that? (Sorry I haven't had a chance to test this for myself.)

@yuki24

This comment has been minimized.

Contributor

yuki24 commented Feb 8, 2018

@pangolingo when debugging with pry or byebug you should probably run a single test rather than all of them. But at least byebug works well in a multi-threaded environment (pauses other threads when it hits byebug and resumes upon continue). I'm not sure how it behaves in a forked process.

activesupport/lib/active_support/test_case.rb Outdated
end
# Cleanup required for parallel testing. This can be used to drop databases
# if you're app uses multiple write/read databases or other clean up before

This comment has been minimized.

@tjschuck

tjschuck Feb 14, 2018

Contributor

you're => your 😅

guides/source/testing.md Outdated
end
```
The number of workers passes is the number of times the process will be forked. You may want to

This comment has been minimized.

@tjschuck

tjschuck Feb 14, 2018

Contributor

passes => passed

guides/source/testing.md Outdated
process. The databases will be suffixed with the number corresponding to the worker. For example, if you
have 2 workers the tests will create `test-database-0` and `test-database-1` respectively.
If the number of workers passes is 1 or fewer the processes will not be forked and the tests will not

This comment has been minimized.

@tjschuck

tjschuck Feb 14, 2018

Contributor

passes => passed

@eileencodes

This comment has been minimized.

Member

eileencodes commented Feb 15, 2018

Decided to add a threaded parallelizer after all. JRuby apps will automatically be generated using the threaded one. If you want to use threads just add with: :threads as a keyword argument.

I've updated docs, guides, and added a changelog.

connection_spec["database"] += "-#{i}"
ActiveRecord::Tasks::DatabaseTasks.create(connection_spec)
ActiveRecord::Base.establish_connection(connection_spec)
ActiveRecord::Tasks::DatabaseTasks.migrate

This comment has been minimized.

@eileencodes

eileencodes Feb 15, 2018

Member

I plan on replacing this with a structure load or a straight copy of the database later on but currently structure load doesn't work with multiple databases so sticking with migrate for now.

@eileencodes eileencodes changed the title from WIP: Parallel testing to Parallel testing Feb 15, 2018

Add test parallelization to Rails
Provides both a forked process and threaded parallelization options. To
use add `parallelize` to your test suite.

Takes a `workers` argument that controls how many times the process
is forked. For each process a new database will be created suffixed
with the worker number; test-database-0 and test-database-1
respectively.

If `ENV["PARALLEL_WORKERS"]` is set the workers argument will be ignored
and the environment variable will be used instead. This is useful for CI
environments, or other environments where you may need more workers than
you do for local testing.

If the number of workers is set to `1` or fewer, the tests will not be
parallelized.

The default parallelization method is to fork processes. If you'd like to
use threads instead you can pass `with: :threads` to the `parallelize`
method. Note the threaded parallelization does not create multiple
database and will not work with system tests at this time.

parallelize(workers: 2, with: :threads)

The threaded parallelization uses Minitest's parallel exector directly.
The processes paralleliztion uses a Ruby Drb server.

For parallelization via threads a setup hook and cleanup hook are
provided.

```
class ActiveSupport::TestCase
  parallelize_setup do |worker|
    # setup databases
  end

  parallelize_teardown do |worker|
    # cleanup database
  end

  parallelize(workers: 2)
end
```

[Eileen M. Uchitelle, Aaron Patterson]
connection_spec = ActiveRecord::Base.configurations[spec_name]
connection_spec["database"] += "-#{i}"

This comment has been minimized.

@simi

simi Feb 16, 2018

Contributor

What about underscore "_#{i}" to keep file based databases using underscores in file names?

This comment has been minimized.

@eileencodes

eileencodes Feb 16, 2018

Member

They're temporary databases, unless it's going to break file based dbs I don't think underscore vs dash is a big deal.

@eileencodes eileencodes merged commit 7286d81 into rails:master Feb 16, 2018

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
codeclimate All good!
Details

@eileencodes eileencodes deleted the eileencodes:parallel-testing branch Feb 16, 2018

@IgorDobryn

This comment has been minimized.

IgorDobryn commented Feb 18, 2018

Awesome!

@Fudoshiki

This comment has been minimized.

Contributor

Fudoshiki commented Feb 21, 2018

@eileencodes
In rails 5.2

# frozen_string_literal: true

ENV['RAILS_ENV'] ||= 'test'
require_relative '../config/environment'
require 'rails/test_help'
require 'rubocop/rake_task'

RuboCop::RakeTask.new

class ActiveSupport::TestCase
  # Run tests in parallel with specified workers
  parallelize(workers: 2)

  # Add more helper methods to be used by all tests here...
end

rails t show failed rubocop tests

Now

# frozen_string_literal: true

ENV['RAILS_ENV'] ||= 'test'
require_relative '../config/environment'
require 'rails/test_help'
require 'rubocop/rake_task'

RuboCop::RakeTask.new

class ActiveSupport::TestCase
  # Add more helper methods to be used by all tests here...
end

rails t ignoring RuboCop::RakeTask.new

How use that now?

@eileencodes

This comment has been minimized.

Member

eileencodes commented Feb 24, 2018

@Fudoshiki can you open a new issue explaining the problem you're having? From your comment I don't understand what the issue is. Thanks!

when :threads
Minitest::Parallel::Executor.new(workers)
else
raise ArgumentError, "#{with} is not a supported parallelization exectutor."

This comment has been minimized.

@dapicester

dapicester Feb 26, 2018

I believe this exectutor is a typo.

This comment has been minimized.

@bogdanvlviv

bogdanvlviv Feb 26, 2018

Contributor

It was fixed by a4e226f

@johnvross

This comment has been minimized.

johnvross commented Mar 19, 2018

Someone asked if anyone is using windows for rails development. Yes. I work for a state agency and use the windows subsystem for linux on windows for all development since the fall creators update and they fixed filewatchers. Just FYI we are out here.

@chrishough

This comment has been minimized.

chrishough commented Mar 19, 2018

Does this apply to both rspec and minitest?

@eileencodes

This comment has been minimized.

Member

eileencodes commented Mar 19, 2018

Hey @johnvross I know you're out there and I'm sorry someone asked that question as it's not an opinion the core team holds. I think you should be supported through the threaded parallelizer, but I'm not sure the Unix sockets that dRB relies on will work for you.

@chrishough this is using Minitest's parallel executor so I don't think it will work for rspec per say but we wrote the API in such a way that it's easy for us to add support for another parallelizer. This feature is still very new and it will be awhile before Rails 6 is released.

@chrishough

This comment has been minimized.

chrishough commented Mar 19, 2018

Thanks @eileencodes. I was hoping it would replace https://github.com/grosser/parallel_tests, and I am definitely curious to see how this plays out.

@WaKeMaTTa

This comment has been minimized.

WaKeMaTTa commented May 24, 2018

@eileencodes if we use rspec or cucumber how we can use this feature?

@thepeoplesbourgeois

This comment has been minimized.

thepeoplesbourgeois commented Jul 16, 2018

@WaKeMaTTa I think that's the topic of the comments a few notes above yours; I think an adapter for rspec/cucumber needs to be written for this parallelizer

@WaKeMaTTa

This comment has been minimized.

WaKeMaTTa commented Jul 16, 2018

@thepeoplesbourgeois you are right. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment