Accept flag --parallel to run spec files in parallel #1690

Closed
jennifer-shehane opened this issue May 8, 2018 · 7 comments · Fixed by #2154 · 4 remaining pull requests
Labels: Cypress Cloud (feature request or issue in Cypress Cloud, not App); pkg/server (this is due to an issue in the packages/server directory)


@jennifer-shehane
Member

jennifer-shehane commented May 8, 2018

What users want

To run spec files in parallel in order to speed up test runs (particularly when in CI)

How we will do this

Specs will have the option of running in parallel by specifying a --parallel flag in cypress run.

Spec files will run in parallel when the --parallel flag is passed and the runs are determined to be within the same homogeneous environment (Cypress determines this from each run's ci-group and group).

  • ci-group - defined by the CI provider (usually a unique id per build, like CI_BUILD_NUM). It can also be specified with the flag ci-group-id if Cypress is unable to determine one. All this means is that a "run" in Cypress is the equivalent of a "run/build/workflow" in your CI provider. TLDR: you can't parallelize tests across different CI providers or different CI runs.
  • group - within a single ci-group, determined by the unique combination of OS name, OS version, browser name, browser version, and defined specs. You can define and name groups yourself, but this is optional. Otherwise Cypress will figure out the groups automatically.
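
As a rough illustration of the group rule above, two runs fall into the same group only when all five attributes match. The sketch below is hypothetical: the function name `groupKey`, its field names, and the key format are assumptions for illustration, not how Cypress actually computes groups.

```javascript
// Hypothetical sketch: derive a comparable key from the attributes that
// the proposal says define a group. Not Cypress's real implementation.
function groupKey({ osName, osVersion, browserName, browserVersion, specs }) {
  // Sort a copy of the spec list so that ordering does not affect the key.
  const specList = [...specs].sort().join(',')

  return [osName, osVersion, browserName, browserVersion, specList].join('|')
}

const runA = groupKey({
  osName: 'linux',
  osVersion: '16.04',
  browserName: 'chrome',
  browserVersion: '67',
  specs: ['b.js', 'a.js'],
})

const runB = groupKey({
  osName: 'linux',
  osVersion: '16.04',
  browserName: 'chrome',
  browserVersion: '67',
  specs: ['a.js', 'b.js'],
})

console.log(runA === runB) // same environment and same specs
```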

We will automatically balance each group's specs across machines using forecasts calculated from their previous run data:

  • Machine A gets Spec 5,9,10
  • Machine B gets Spec 4,1
  • Machine C gets Spec 6,11,8,2
  • Machine D gets Spec 3,7

Each machine will run at the same time and simply ask "for the next spec file". That means the work completes as fast as possible without needing to manually balance or tweak anything.
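
The "ask for the next spec file" behaviour can be pictured with a small simulation. This is only a sketch under stated assumptions: the function `simulate`, the forecast durations, and the choice to hand out the longest forecast first are all made up for illustration and say nothing about how the Dashboard Service really schedules specs.

```javascript
// Hypothetical simulation: whichever machine frees up first takes the next
// spec from a shared queue. Durations are invented forecast values in seconds.
function simulate(forecastSeconds, machineCount) {
  // Assumption: longest forecast first, so long specs do not end up last.
  const queue = Object.entries(forecastSeconds).sort((a, b) => b[1] - a[1])

  const machines = Array.from({ length: machineCount }, (_, i) => ({
    name: `machine-${i + 1}`,
    busyUntil: 0,
    specs: [],
  }))

  for (const [spec, seconds] of queue) {
    // The machine that becomes idle earliest asks for the next spec.
    const next = machines.reduce((a, b) => (b.busyUntil < a.busyUntil ? b : a))

    next.specs.push(spec)
    next.busyUntil += seconds
  }

  return {
    machines,
    totalSeconds: Math.max(...machines.map((m) => m.busyUntil)),
  }
}

const result = simulate({ 'a.js': 30, 'b.js': 20, 'c.js': 10, 'd.js': 10 }, 2)

console.log(result.machines.map((m) => `${m.name}: ${m.specs.join(', ')}`))
console.log(`total: ${result.totalSeconds}s`)
```

Run serially, these four made-up specs would take 70 seconds; spreading them over two machines this way finishes in 40.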

This will allow you to test files and have them parallelized within one run like:

my CI .yml file

// job 1
cypress run --record --parallel

Cypress will parallelize all spec files within the project across all available machines.

This will allow me to test several apps (like a monorepo) within one run and group them like:

my CI .yml file

// job 1
cypress run --group=app-1 --config integrationFolder=app-1/cypress/integration
// job 2
cypress run --group=app-2 --config integrationFolder=app-2/cypress/integration

This will allow me to test several environments within one run and group them like:

my CI .yml file

// job 1
cypress run --group=development --config=baseUrl=http://develop-app.com --env DEVELOP=true
// job 2
cypress run --group=staging --config=baseUrl=http://staging-app.com --env DEVELOP=true

What you will see

  • The Dashboard Service will display the specs within a run grouped by group
  • Each failure will show which group it is in
  • You will be able to visualize exactly how the specs ran across the machines and also visualize how any future runs will be affected by adding or removing machines.

Dashboard Service comps
[Two screenshots of Dashboard Service comps, taken 2018-07-18]

@Toxicable
Contributor

Toxicable commented Jul 19, 2018

Would it be possible to add a simple and easy to use degree of parallelism flag?
ie: I want to run 3 specs at the same time: --parallelism=3
or
I want to run all specs at the same time: --parallelism=all, where all = number of cores - 2

This will make it much easier for Cypress to utilise all resources on whatever system it's running on, without adding the extra complexity of having to refactor all your specs into different directories and add the groups proposed.

But I still think it's important to have both options available to fit all use cases

@jennifer-shehane
Member Author

Hey @Toxicable, thank you for submitting feedback on our proposal! We really want to get this right.

As it stands, the proposal above is missing a bit of context about the --specs flag, which probably should have been included. You will be able to pass this flag to define the specs to run in the cypress run command - this will be a pretty key flag in some use cases.

Could you give us some more information on your use case? What does your project look like that you want to run tests on? How are you currently choosing which files are included in a run?

@Toxicable
Contributor

Toxicable commented Jul 19, 2018

@jennifer-shehane I don't see the usefulness of the --specs flag at scale.
We have hundreds of spec files, and listing off each of them, either manually or with scripts, would be horrible.
Another issue is that our CI systems have between 48 - 128 CPU cores, so it would be tedious to manage the number of cypress run commands to run, and we'd probably be underutilizing the resources available.

We currently point Cypress at a single directory and expect that it'll run all spec files there.
It would be amazing if we could simply tell Cypress to run all the files in that directory in parallel using all cores available with a single cypress run command.

@brian-mann
Member

@Toxicable Cores aren't really the problem here; memory is likely the bottleneck. How much memory do your machines have?

Also, are they Linux machines? Browsers behave differently when they are out of focus, and only a single browser can be in focus... but on Linux with the use of xvfb they will behave as if they have focus (which is what we do today).

Regardless though - I believe our implementation will work flawlessly for you. It's up to you to determine how many machines you want to run in parallel - and you can easily do this with a node script or even a bash command.

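const _ = require('lodash')
const os = require('os')
const cypress = require('cypress')

// kick off one cypress run per two CPU cores (at least one run)
// note: os.cpus is a function and must be called to get the core list
const runCount = Math.max(1, Math.floor(os.cpus().length / 2))

const runs = _.times(runCount, (i) => {
  return cypress.run({ record: true, parallel: true }) // this is it
})

Promise.all(runs)
.then((resultsArray) => {
  // add up the failures from every run so the exit code is a single number
  const totalFailed = _.sumBy(resultsArray, "totalFailed")

  process.exit(totalFailed)
})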

This kicks off a bunch of cypress runs based on half of your CPU cores. You may have other problems trying to run this all on a single machine. For instance, your server or database would have to support multiple sessions at the same time. If you do any kind of seeding, you will have to take that into account.

That's why we generally suggest parallelizing at the OS level. For instance, you could take your beefy machine and cut it into a dozen docker containers running all at the same time. It would work the same way. In most CI providers, that's how it works - they are all isolated from each other.

The --parallel flag automatically load balances the specs, you don't have to specify anything.

The grouping as @jennifer-shehane suggested is only if you want to chunk your specs by a particular group and then manage / parallelize them differently. If you have a single group, then you don't do anything.

EDIT: I just realized that spawning a bunch of cypress runs together will clobber things like screenshotsFolder and videosFolder. You could pass a configuration option using the (i) index to create those dynamically.

Also, I think there may be an issue because we don't create dynamic user data dir profiles for chrome or electron. There's already an issue open with that, and that would enable chrome to be opened multiple times and prevent session clashing with cookies or localstorage. That's a really simple, easy fix.

@Toxicable
Contributor

@brian-mann They are Linux servers with equivalent amounts of RAM, between 128GB - 512GB I believe.
Could you elaborate on the RAM part:
What part of Cypress causes it to use large amounts of RAM?
Chrome instances? Electron instances? Test length?

I must have misunderstood part of how the parallel flag works.
With your example is there some sort of global variable /singleton that hands off specs to the workers?

If that is all the config needed, aside from app-level stuff such as database config as you mentioned, then I think you're right about it fitting our use case.

@jennifer-shehane
Member Author

Yeah, I think I was so concerned about covering the complicated edge cases in the proposal that I glossed over the standard use cases. I've updated it to hopefully be clearer that:

  • the --group flag and name are optional (only use it if you need it!)
  • Cypress will automatically parallelize all specs in your project when run with the --parallel flag if they all run within a homogeneous environment (aka all run in the same OS, version, and browser - everyone's standard use of cypress run --record today)

@brian-mann
Member

@Toxicable it's not Cypress that eats up the memory - it is the browser. I would allocate at least 4GB per "instance". Sounds like you have plenty though, so it should be fine. Once we create new profiles for each cypress run I think this will all "just work".

Disregard my previous comment about needing to generate new screenshotsFolder and videosFolder - @jennifer-shehane pointed out that all you have to do is turn off trashAssetsBeforeRuns in the config. That would prevent race conditions where one run accidentally blows away assets from another current run.
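
For anyone following along, the trashAssetsBeforeRuns suggestion could be wired into the script from earlier in the thread roughly like this. It is a minimal sketch: the helper name `buildRunOptions` is made up, and it assumes the setting is passed through the `config` option of `cypress.run` rather than set in the project's configuration file.

```javascript
// Hypothetical helper: options shared by every concurrent cypress run.
// `record` and `parallel` come from the script earlier in the thread.
function buildRunOptions() {
  return {
    record: true,
    parallel: true,
    // Keep screenshots and videos written by other runs that are still going.
    config: { trashAssetsBeforeRuns: false },
  }
}

// Usage (assumes the cypress package is installed):
//   const cypress = require('cypress')
//   cypress.run(buildRunOptions())
console.log(JSON.stringify(buildRunOptions()))
```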
