
local runner spawns one transpilation process for each feature file with cucumber framework #4949

Closed
joscha opened this issue Jan 10, 2020 · 30 comments


joscha commented Jan 10, 2020

Environment (please complete the following information):

  • WebdriverIO version: 5.18.4
  • Mode: not sure what that means - local?
  • If WDIO Testrunner, running sync/async: sync
  • Node.js version: v10.16.0
  • NPM version: not using npm; yarn 1.21.1
  • Browser name and version: Chrome, but does not matter for this problem
  • Platform name and version: OSX Mojave 10.14.6 (18G87)
  • Additional wdio packages used (if applicable):
"@wdio/cli": "^5.18.4",
"@wdio/cucumber-framework": "^5.16.15",
"chai": "^4.2.0",
"chai-as-promised": "^7.1.1",
"ts-node": "^8.5.4",
"tsconfig-paths": "^3.9.0",
"typescript": "^3.7.4"
"@wdio/local-runner": "^5.18.4",
"@wdio/mocha-framework": "^5.16.15",
"@wdio/spec-reporter": "^5.16.11",
"@wdio/sync": "^5.17.0",
"chromedriver": "^79.0.0",
"wdio-chromedriver-service": "^5.0.2"

Config of WebdriverIO

https://github.com/joscha/wdio-ts-no-recompile/blob/master/wdio.conf.ts

EDIT: https://github.com/joscha/wdio-ts-no-recompile/blob/master/wdio.conf.js (removed one layer of Typescript to show the problem without noise)

Describe the bug

When using the Cucumber framework with the local runner and TypeScript, the local runner spawns one subprocess per feature file, so the project gets transpiled n times for n feature files.

Our (mono-)repository has ~250 feature files and tens of thousands of TypeScript files, which means the overhead is immense (about 1 minute for each transpilation step, and that's without type-checking enabled).

We are trying to upgrade from wdio 4 (4.14.4) to wdio 5 and with the current local runner it doesn't seem possible to make this work.

Current thought is to do one of the following:

  • Open a PR against the current local runner in this repository to teach it to run all feature files in one process if so desired (something like maxWorkers=1)
  • Write our own runner (am I thinking of doing this in the right spot, or should it be a launcher?); the downside is that we lose all the nice things the local runner brings (logging, REPL, etc.)
  • Write a service to make this work (I am unsure whether this would work; hints appreciated)

To Reproduce
Steps to reproduce the behavior:

  • Add TypeScript to your wdio project
  • Enable Cucumber
  • Add more than one feature file
  • Run the tests with the local runner
  • Run wdio with TS_NODE_DEBUG=true to see the transpilation output (you should see one transpilation for each feature file being run)

Example repo is here: https://github.com/joscha/wdio-ts-no-recompile - you can run yarn test in it to see log output from ts-node three times (once for the wdio config and once for each feature file)
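
For reference, the relevant shape of the config looks roughly like this (a sketch only - the real config is in the linked repo, and the paths here are illustrative):

    // wdio.conf.js (sketch)
    exports.config = {
        runner: 'local',
        framework: 'cucumber',
        specs: ['./test/features/**/*.feature'],
        cucumberOpts: {
            // these modules are require()d inside every forked worker,
            // which is exactly where the repeated transpilation happens
            requireModule: ['tsconfig-paths/register', 'ts-node/register'],
            require: ['./test/steps/**/*.ts']
        }
        // ...
    }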

Expected behavior

Transpilation should only happen once for all feature files.

cc @olebedev @gustavohenke

mgrybyk commented Jan 10, 2020

There is nothing that can be done on our side; transpile your code before running tests instead of using in-memory transpilers.

It's not about WebdriverIO but about how Node.js works. Memory cannot be shared between the main and forked Node.js instances, so every package is loaded in every Node.js process, including the in-memory transpilers that are required to load TypeScript files.

joscha commented Jan 10, 2020

There is nothing that can be done on our side; transpile your code before running tests instead of using in-memory transpilers.

That means I have a disconnect between the code I am developing and the code that actually runs, i.e. an indirection between the files I watch and the ones I edit. This has potential implications for sourcemaps/errors, breakpoints, etc.

It's not about WebdriverIO but about how Node.js works. Memory cannot be shared between the main and forked Node.js instances, so every package is loaded in every Node.js process, including the in-memory transpilers that are required to load TypeScript files.

I am not sure this is 100% correct. wdio 4 works fine with this, i.e. I have one transpilation process and all features are run against its output - the problem is the runner in wdio 5, which creates child processes. I understand that the current setup of the runner can't work this way, but it basically means that wdio 5 is unusable for non-native JS on a big codebase; I'd appreciate some help in solving this - wdyt about any of the possible solutions I outlined above (own runner, changing the OOTB local runner to be able to use one process, service)?

mgrybyk commented Jan 10, 2020

Yes, it's correct; wdio v5 works differently.
I've already pointed out a solution - handle the transpilation process yourself. You can transpile files in the onPrepare hook, watch for changes in the transpiled files, etc.
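
A minimal sketch of that suggestion (assuming tsc emits to dist/; the paths and tsconfig name are assumptions):

    // wdio.conf.js (sketch): compile once in the launcher, before any worker forks
    const { execSync } = require('child_process')

    exports.config = {
        cucumberOpts: {
            // point the workers at the compiled .js output instead of the .ts sources
            require: ['./dist/test/steps/**/*.js']
        },
        onPrepare: () => {
            // runs once in the launcher process
            execSync('npx tsc -p tsconfig.json', { stdio: 'inherit' })
        }
        // ...
    }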

christian-bromann commented:

Mhm... I wonder how v4 is different from v5. In both v4 and v5 we fork a child process, and in both we transpile code in the forked process.

I can see the following solutions:

  • compile code in the onPrepare hook and point spec files and helpers to the compiled version; this way you only do it once for all workers
  • compile only the required files instead of all of them

@joscha thanks for reporting this. I think this is an interesting use case which we haven't run into yet. I would love to provide a built in solution for this. What do you think would work best?

joscha commented Jan 10, 2020

I think ideally we'd transpile in memory once and then use the result for all feature runs. Ideally, watch mode would be based on the actual files loaded, so my preferred solution would be to modify the local runner so it can be configured to use one child process and then run each feature file sequentially against it. I played around with ts-node-dev and can imagine it working for this.
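
To illustrate the idea (a hypothetical sketch, not wdio code): keep one forked worker alive and feed it the feature files one at a time over IPC, so the in-memory transpiler only runs once in that process:

    const child = require('child_process')

    const worker = child.fork('run.js') // transpiles the TypeScript once on startup

    function runSequentially (features) {
        return features.reduce(
            (prev, feature) => prev.then(() => new Promise((resolve) => {
                worker.once('message', resolve) // assumes the worker reports completion via IPC
                worker.send({ command: 'run', specs: [feature] })
            })),
            Promise.resolve()
        )
    }

    runSequentially(['1.feature', '2.feature']).then(() => worker.kill())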

The optimum would be a configuration option for the OOTB local runner to support this, but if you think it is an edge case and don't want to support it the second best option would be to write our own runner and use it with a custom config.

christian-bromann commented Jan 10, 2020

then run each feature file sequentially against it

To clarify, your scenario is running tests in watch mode and recompile them when a feature file is changed, right?

for the OOTB local runner

What is the OOTB local runner?

joscha commented Jan 11, 2020

To clarify, your scenario is running tests in watch mode and recompile them when a feature file is changed, right?

Not yet. I can't even get a normal test run through, because even in non-watch mode there is a transpile step per feature file (see the sample repository; you can run yarn test in it to see the output). But in the end, yes, tests should rerun whenever a feature file changes and/or a ts file the steps depend on.

What is the OOTB local runner?

Sorry, OOTB = out of the box - the wdio-local-runner package in this repo.

joscha commented Jan 13, 2020

Sorry, OOTB = out of the box - the wdio-local-runner package in this repo.

e.g.:

    const childProcess = this.childProcess = child.fork(path.join(__dirname, 'run.js'), argv, {
        cwd: process.cwd(),
        env: runnerEnv,
        execArgv,
        stdio: ['inherit', 'pipe', 'pipe', 'ipc']
    })
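
That fork is the crux: every forked process starts with an empty require cache, so anything hooked in via require() (such as ts-node/register) runs again per worker. As a rough illustration (not wdio code):

    const child = require('child_process')

    // each fork is a fresh Node.js process with its own require cache,
    // so an in-memory transpiler re-registers (and re-compiles) in every one
    for (const feature of ['1.feature', '2.feature']) {
        child.fork('run.js', [feature])
    }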

joscha commented Jan 13, 2020

I tried a few things now:

  • removing the TS-based config - this reduces some of the overhead, as at least not all files of the repo need to be compiled each time. This makes things better, but it doesn't change the per-feature transpilation.
  • reusing the workers across suites, e.g.:

    const suite = options.cid.split('-')[0];
    let worker;
    if (!this.workerPool[suite]) {
        log.info(`New worker for ${suite}`)
        worker = new _worker.default(this.config, options, this.stdout, this.stderr);
        this.workerPool[suite] = worker;
    } else {
        log.info(`Reusing worker for ${suite}`)
        worker = this.workerPool[suite];
    }

but unfortunately this doesn't work properly, as workers build up state when they are first created:

    this.cid = cid
    this.config = config
    this.configFile = configFile
    this.caps = caps
    this.specs = specs
    this.server = server || {}
    this.execArgv = execArgv
    this.retries = retries
    this.isBusy = false
    this.stdout = stdout
    this.stderr = stderr

I think making the workers stateless could work, but I am not sure right now how they report whether they are busy or not, so I am not 100% confident this will work.

  • writing my own runner (this seems to be much harder than it should be); I thought I could create a Runner instance in the same process and then reuse it, but the default Runner expects to be run in a child process, see:

    process.send({ name: 'testFrameworkInit', content: { cid, caps, specs, hasTests: this.framework.hasTests() } })

Thoughts:

  • Change the way the spec runner groups features together, so that they all appear in one suite (see runSpecs() below), because currently the runner receives one event per spec file:

    runSpecs () {
        let config = this.configParser.getConfig()

        /**
         * stop spawning new processes when CTRL+C was triggered
         */
        if (this.hasTriggeredExitRoutine) {
            return true
        }

        while (this.getNumberOfRunningInstances() < config.maxInstances) {
            let schedulableCaps = this.schedule
                /**
                 * bail if number of errors exceeds allowed
                 */
                .filter(() => {
                    const filter = typeof config.bail !== 'number' || config.bail < 1 ||
                        config.bail > this.runnerFailed

                    /**
                     * clear number of specs when filter is false
                     */
                    if (!filter) {
                        this.schedule.forEach((t) => { t.specs = [] })
                    }

                    return filter
                })
                /**
                 * make sure complete number of running instances is not higher than general maxInstances number
                 */
                .filter(() => this.getNumberOfRunningInstances() < config.maxInstances)
                /**
                 * make sure the capability has available capacities
                 */
                .filter((a) => a.availableInstances > 0)
                /**
                 * make sure capability has still caps to run
                 */
                .filter((a) => a.specs.length > 0)
                /**
                 * make sure we are running caps with less running instances first
                 */
                .sort((a, b) => a.runningInstances > b.runningInstances)

            /**
             * continue if no capability were schedulable
             */
            if (schedulableCaps.length === 0) {
                break
            }

            let specs = schedulableCaps[0].specs.shift()
            this.startInstance(
                specs.files,
                schedulableCaps[0].caps,
                schedulableCaps[0].cid,
                schedulableCaps[0].seleniumServer,
                specs.rid,
                specs.retries
            )
            schedulableCaps[0].availableInstances--
            schedulableCaps[0].runningInstances++
        }
    }

For example, for 1.feature and 2.feature we don't get a single event like this:

{ cid: '0-0',
  command: 'run',
  configFile: '/Users/joscha/work/wdio-ts-no-recompile/wdio.conf.js',
  argv:  { ...},
  caps: { maxInstances: 5, browserName: 'chrome' },
  specs:
   [
     '/Users/joscha/work/wdio-ts-no-recompile/test/features/1.feature',
     '/Users/joscha/work/wdio-ts-no-recompile/test/features/2.feature'
   ],
  server: { ... },
  execArgv: [],
  retries: undefined
}

but two separate events like this:

{ cid: '0-0',
  command: 'run',
  configFile: '/Users/joscha/work/wdio-ts-no-recompile/wdio.conf.js',
  argv:  { ...},
  caps: { maxInstances: 5, browserName: 'chrome' },
  specs:
   [ '/Users/joscha/work/wdio-ts-no-recompile/test/features/1.feature' ],
  server: { ... },
  execArgv: [],
  retries: undefined
}
{ cid: '0-1',
  command: 'run',
  configFile: '/Users/joscha/work/wdio-ts-no-recompile/wdio.conf.js',
  argv:  { ...},
  caps: { maxInstances: 5, browserName: 'chrome' },
  specs:
   [ '/Users/joscha/work/wdio-ts-no-recompile/test/features/2.feature' ],
  server: { ... },
  execArgv: [],
  retries: undefined
}

joscha commented Jan 13, 2020

It seems as if I can fix the performance problem by combining spec files into one suite, e.g. if I change this part of the Launcher:

    this.schedule.push({
        cid: cid++,
        caps: capabilities,
        specs: this.configParser.getSpecs(capabilities.specs, capabilities.exclude).map(s => ({ files: [s], retries: specFileRetries })),
        availableInstances: capabilities.maxInstances || config.maxInstancesPerCapability,
        runningInstances: 0,
        seleniumServer: { hostname: config.hostname, port: config.port, protocol: config.protocol }
    })

from:

    specs: this.configParser.getSpecs(capabilities.specs, capabilities.exclude).map(s => ({
        files: [s],
        retries: specFileRetries
    })),

to:

    specs: [{
        files: this.configParser.getSpecs(capabilities.specs, capabilities.exclude),
        retries: specFileRetries
    }],
With this change all tests pass, but the spec reporter doesn't print the results properly anymore, e.g.:

(screenshot: broken spec reporter output)

instead of the correct:

(screenshot: correct spec reporter output)

joscha commented Jan 13, 2020

@christian-bromann the attempt at merging the spec files into one suite seems to be the most promising. Is there any downside you can see? If we made this configurable via a maxSpecsPerSuite option which defaults to 1 (the current behaviour) and I fix up the reporting, is this something you could accept to merge?
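
For illustration, the option would sit in the config roughly like this (maxSpecsPerSuite is the hypothetical name proposed above, not an existing wdio option):

    // wdio.conf.js (hypothetical sketch of the proposal)
    exports.config = {
        maxInstances: 5,
        // batch up to 250 spec files into a single worker process;
        // the default of 1 keeps the current one-worker-per-spec behaviour
        maxSpecsPerSuite: 250
        // ...
    }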

christian-bromann commented:

Is there any downside you can see?

Yes, these spec files are then run consecutively, which slows down your test execution time.

If we made this configurable via a maxSpecsPerSuite which defaults to 1 (current behaviour) and I fix up the reporting, is this something you could accept to merge?

Sounds like an acceptable solution to me. I would love to get the reporting fixed as part of it.


joscha commented Jan 16, 2020

I have a fix + tests in https://github.com/webdriverio/webdriverio/compare/master...joscha:joscha/specs-per-suite?expand=1

But for the reporting output, it seems it's not actually the spec reporter that is the problem, but the data that's being passed on to the reporters. It seems incorrect:

[ SuiteStats {
    type: 'feature',
    start: 2020-01-16T06:09:34.905Z,
    _duration: 13807,
    uid: 'Performing a Yahoo Search 21',
    cid: '0-0',
    title: 'Performing a Yahoo Search 2',
    fullTitle: undefined,
    tags: [],
    tests: [],
    hooks: [],
    suites: [ [SuiteStats], [SuiteStats] ],
    hooksAndTests: [],
    end: 2020-01-16T06:09:48.712Z },
  SuiteStats {
    type: 'scenario',
    start: 2020-01-16T06:09:34.908Z,
    _duration: 7897,
    uid: 'Performing a search operation11',
    cid: '0-0',
    title: 'Performing a search operation',
    fullTitle:
     'Performing a Yahoo Search 11: Performing a search operation',
    tags: [],
    tests: [ [TestStats], [TestStats], [TestStats], [TestStats] ],
    hooks: [ [HookStats], [HookStats] ],
    suites: [],
    hooksAndTests:
     [ [HookStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [HookStats] ],
    end: 2020-01-16T06:09:42.805Z },
  SuiteStats {
    type: 'scenario',
    start: 2020-01-16T06:09:42.805Z,
    _duration: 5906,
    uid: 'Performing a search operation11',
    cid: '0-0',
    title: 'Performing a search operation',
    fullTitle:
     'Performing a Yahoo Search 21: Performing a search operation',
    tags: [],
    tests: [ [TestStats], [TestStats], [TestStats], [TestStats] ],
    hooks: [ [HookStats], [HookStats] ],
    suites: [],
    hooksAndTests:
     [ [HookStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [HookStats] ],
    end: 2020-01-16T06:09:48.711Z },
  SuiteStats {
    type: 'scenario',
    start: 2020-01-16T06:09:34.908Z,
    _duration: 7897,
    uid: 'Performing a search operation11',
    cid: '0-0',
    title: 'Performing a search operation',
    fullTitle:
     'Performing a Yahoo Search 11: Performing a search operation',
    tags: [],
    tests: [ [TestStats], [TestStats], [TestStats], [TestStats] ],
    hooks: [ [HookStats], [HookStats] ],
    suites: [],
    hooksAndTests:
     [ [HookStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [HookStats] ],
    end: 2020-01-16T06:09:42.805Z },
  SuiteStats {
    type: 'scenario',
    start: 2020-01-16T06:09:42.805Z,
    _duration: 5906,
    uid: 'Performing a search operation11',
    cid: '0-0',
    title: 'Performing a search operation',
    fullTitle:
     'Performing a Yahoo Search 21: Performing a search operation',
    tags: [],
    tests: [ [TestStats], [TestStats], [TestStats], [TestStats] ],
    hooks: [ [HookStats], [HookStats] ],
    suites: [],
    hooksAndTests:
     [ [HookStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [TestStats],
       [HookStats] ],
    end: 2020-01-16T06:09:48.711Z } ]

Some data seems to be jumbled up somehow - note that the scenarios coming from different features all carry the identical uid 'Performing a search operation11' even though their fullTitle differs. I think it might be the test uid; any hints on where I should look would be greatly appreciated.

joscha commented Jan 20, 2020

Tried to find out where the reporting goes wrong. I think it is incorrect because the launcher currently assumes that there is one spec per run. I added d78f22f to fix this, but it turns out that the run specs are not reported properly either. job:end, for example, doesn't even carry the information on a per-spec basis, i.e. you can't tell which of the specs were successful and which ones weren't.

@christian-bromann is there any way I can pair with someone on this? I feel like maybe there are other options to solve this problem that I am missing?

joscha commented Jan 23, 2020

Trying to make specs batchable seems to be a rabbit hole.

There are multiple places that have code like this:

        if (passed) {
            this.result.passed++
            this.onSpecPass(cid, job, retryAttempts)
        } else if (retry) {
            this.totalWorkerCnt++
            this.result.retries++
            this.onSpecRetry(cid, job, retryAttempts)
        } else {
            this.result.failed++
            this.onSpecFailure(cid, job, retryAttempts)
        }

(see how the passed, retried and failed state is just incremented?)

This code would at least need to become:

        if (passed) {
            this.result.passed += job.specs.length
            this.onSpecPass(cid, job, retryAttempts)
        } else if (retry) {
            this.totalWorkerCnt++
            this.result.retries += job.specs.length
            this.onSpecRetry(cid, job, retryAttempts)
        } else {
            this.result.failed += job.specs.length
            this.onSpecFailure(cid, job, retryAttempts)
        }

but even that is not entirely correct, because each spec of a job has its own status, whereas webdriverio currently assumes that there is only ever one spec per job, so a lot of the conditions are simplified to just counting up or down by 1. To fix these places, the way we store state about specs inside suites needs to be changed, which affects pretty much the whole codebase, from the launcher through the runner to the reporters.
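
A hypothetical sketch of what per-spec bookkeeping could look like instead (specResults is an invented name; nothing like it exists in wdio today):

    // count each spec of the job by its own status instead of
    // bumping a single counter once per job
    for (const spec of job.specs) {
        if (job.specResults[spec] === 'passed') {
            this.result.passed++
        } else {
            this.result.failed++
        }
    }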

joscha commented Jan 23, 2020

I unsuccessfully tried the following:

  • batching specs (it works for running, but reporting is broken)
  • reusing the worker instance, keeping the runner child process inside it alive across multiple runs (I get some strange configuration errors)
  • replacing the whole local-runner with my own; I should probably give this another try, but I am worried that even if I can make it work, we lose all the debugger and REPL features that are part of the official local-runner

Things that I haven't tried yet:

  • spawning new workers but keeping the internal runner instance around as a shared global

mgrybyk commented Jan 23, 2020

How about giving manual transpilation another shot? I used it in a big project and didn't experience any serious issues.


mgrybyk commented Jan 23, 2020

Yes, I transpile code in the onPrepare hook, so it happens only once, and I do some magic for debugging when running a single file in VSCode, because the ts paths need to be changed to js.
I haven't tried watching that many files, but it should be more or less the same: watch the transpiled files instead, and whenever you change your ts files, transpile them again.

I had some minor issues, but nothing critical. I ended up using ts-node when I need to debug a single file, and transpiling files in advance for all other cases.

Consider using source maps for your ts files.
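
A minimal sketch of that workflow (assuming "sourceMap": true and an outDir in tsconfig.json - both assumptions): run the TypeScript compiler in watch mode next to wdio, so the transpiled files stay fresh while you edit the .ts sources:

    // watch.js (sketch)
    const { spawn } = require('child_process')

    const tsc = spawn('npx', ['tsc', '--watch', '--preserveWatchOutput'], {
        stdio: 'inherit'
    })

    process.on('exit', () => tsc.kill())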

joscha commented Jan 23, 2020

I think ideally the debug case and the normal case would be the same. I am really keen to make this work out of the box, so that all the docs etc. hold true for this case as well and I don't have 200+ developers relearning how to debug just because we do a wdio upgrade.

I also think that this case comes up a lot more often than you know - if we had a small codebase that transpiles fast, I wouldn't have noticed either, even though some seconds are lost in each test run.

The reason I noticed is that the transpilation literally takes 40s to 1m, and wdio 5+ becomes unusable for us.

If we do the calculation for a smaller company and assume 100 feature files and a small transpilation overhead of 5s, then we are still looking at an overall test slowdown of ~8 min (99*5/60 ≈ 8.25 min), which is bearable but something I'd still call substantial.
Everyone could build their own onPrepare workaround and quirky debug experience, but I really think this should be fast by default.

mgrybyk commented Jan 23, 2020

Absolutely agree, feel free to help!

joscha commented Jan 24, 2020

I opened a draft pull request for discussion: #4975. It solves the problem of multiple transpilations whilst keeping reporting intact. I was hoping to contain the changes to the wdio-local-runner package only, but it seems a few other packages also assume they are executed only once per process, and some assume they are always running in a child process, so the solution is not complete. Feedback would be highly appreciated.

christian-bromann commented:

@joscha I am trying to pick up this conversation again and I am wondering if it would be possible to extend the example repo so that we can see the increased transpilation time there (not up to a minute, but something like 10s). What is weird is that we have been using child processes in v4 as well, and I don't see what has changed in v5 that would make this a problem.

joscha commented Apr 6, 2020

Okay, I will try to extend the example. Did you check out my first fix where I batch tests together? It solves the problem, and you can even see the speed increase in the simple repository I created.

christian-bromann commented:

Did you check out my first fix where I batch tests together?

Yes, I commented on it.

you can even see the speed increase in the simple repository I created.

It runs faster because it doesn't spin up a new session for every small feature file. However, this would tremendously slow down tests with larger test files and more capabilities, as they are not run in parallel.


olebedev commented Apr 9, 2020

Hi @christian-bromann,

I created a draft PR which shows the issue in the extended example repo; please take a look: #5233

UPD: the codebase is not large enough to show a noticeable compile time. Allow me to improve it; I will post a link to a separate repo here.

/cc @joscha


olebedev commented Apr 9, 2020

Here is the PR, joscha/wdio-ts-no-recompile#2, that adds some ts files to increase the compile time enough to make it noticeable. @christian-bromann please take a look.

joscha commented Apr 22, 2020

@christian-bromann did you have time to check out the sample yet?

christian-bromann commented:

We will tackle running multiple specs in a single worker in #6469
