WIP: t31135461 / Refine jest-worker API #6676

Closed

Kureev wants to merge 19 commits into master from Kureev:feature/T31135461

Conversation

6 participants
@Kureev

Kureev commented Jul 11, 2018

Summary

Node 10 shipped with a "threading API" (worker_threads) that uses SharedArrayBuffers to communicate between the main process and its child threads. Since jest-worker is a parallelization library, we can take advantage of this API when available. At the same time, we'll decouple our scheduling logic from our communication logic for better reusability.

Here are some key things:

  • Decouple the scheduling logic (i.e. everything that isn't specifically about sending and receiving jobs).
  • Add an option to use "threads" in the library; when available, we'll favor them (see the sketch just below this list).
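
For reference, the availability check boils down to something like this (a minimal sketch; the variable name and the log line are illustrative):

// worker_threads only resolves on Node >= 10 started with --experimental-worker
// (or on later versions where it is unflagged); otherwise we fall back to the
// existing child_process-based workers.
let useWorkerThreads;
try {
  require('worker_threads');
  useWorkerThreads = true;
} catch (error) {
  useWorkerThreads = false;
}

console.log(useWorkerThreads ? 'using worker_threads' : 'using child_process');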

Performance

➜ node --experimental-worker --expose-gc test.js empty 10000
---------------------------------------------------------------------------
jest-worker: { globalTime: 272, processingTime: 217 }
worker-farm: { globalTime: 521, processingTime: 462 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 244, processingTime: 210 }
worker-farm: { globalTime: 502, processingTime: 451 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 290, processingTime: 253 }
worker-farm: { globalTime: 455, processingTime: 433 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 262, processingTime: 229 }
worker-farm: { globalTime: 668, processingTime: 634 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 280, processingTime: 222 }
worker-farm: { globalTime: 600, processingTime: 548 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 222, processingTime: 181 }
worker-farm: { globalTime: 490, processingTime: 459 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 225, processingTime: 181 }
worker-farm: { globalTime: 469, processingTime: 428 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 220, processingTime: 180 }
worker-farm: { globalTime: 437, processingTime: 407 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 214, processingTime: 176 }
worker-farm: { globalTime: 489, processingTime: 444 }
---------------------------------------------------------------------------
jest-worker: { globalTime: 241, processingTime: 191 }
worker-farm: { globalTime: 440, processingTime: 412 }
---------------------------------------------------------------------------
total worker-farm: { wFGT: 5071, wFPT: 4678 }
total jest-worker: { jWGT: 2470, jWPT: 2040 }
---------------------------------------------------------------------------
% improvement over 10000 calls (global time): 51.29165845000986
% improvement over 10000 calls (processing time): 56.39162035057717

Test plan

TBA

@Kureev Kureev self-assigned this Jul 11, 2018

@Kureev Kureev requested review from rafeca and mjesun Jul 11, 2018

@Kureev

Author

Kureev commented Jul 11, 2018

Please take into account that this is a WIP PR. Currently, a few parts are missing:

  • Retry in case of a failure
  • Tests
  • Usage docs

this._locks = [];
this._offset = 0;

// If we exceeded the amount of retries, we will emulate an error reply

@Kureev

Kureev Jul 11, 2018

Author

This part will be rewritten and removed from here. It is part of the missing functionality I mentioned in the first comment on the PR.
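
For context, the intent behind that diff comment is roughly the following (a sketch with made-up names, not the code that will land here):

// Once a call has failed more than maxRetries times, stop re-queueing it and
// synthesize an error reply so the caller's promise rejects like any other error.
function onWorkerExit(task, retries, maxRetries, requeue, onEnd) {
  if (retries > maxRetries) {
    onEnd(new Error('Call retries were exceeded'), null);
  } else {
    requeue(task);
  }
}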

const numOfWorkers = this._workerPool.getWorkers().length;
for (let i = 0; i < numOfWorkers; i++) {
const workerIdx = (this._offset + i) % numOfWorkers;
this.enqueue(task, workerIdx);

@mjesun

mjesun Jul 11, 2018

Contributor

Separate the initial part of the function into another one, to avoid recursion.

@Kureev

Kureev Jul 11, 2018

Author

I can use enqueue(QueueChildMessage, number) to schedule a job on a specific worker and push(QueueChildMessage) to schedule a job on all workers. wdyt?

@mjesun

mjesun Jul 11, 2018

Contributor

What I mean is to move the for loop into the parent implementation, where you will always call enqueue with a workerId. That way workerId won't be optional anymore. This will also allow you to remove getWorkers, because you won't need it to get the length anymore.

@Kureev

Kureev Jul 11, 2018

Author

From the design perspective I'd say Farm shouldn't be concerned with how WorkerPool spreads tasks among workers. It's fine for Farm to pin a job to a worker if it has a reference to it, but it isn't great if Farm contains the queue-distribution logic itself 🤔

@mjesun

mjesun Jul 11, 2018

Contributor

👍🏻 Let's separate it into a different method as discussed offline.
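
For anyone following along, the split agreed on here looks roughly like this (a simplified sketch; names follow the snippets above where possible, and the per-worker wrapping of tasks is illustrative):

// push() owns the fan-out across workers; enqueue() always receives an explicit
// workerId and only appends to that single worker's queue.
class Farm {
  constructor(numOfWorkers) {
    this._numOfWorkers = numOfWorkers;
    this._queue = [];
    this._last = [];
    this._offset = 0;
  }

  push(task) {
    for (let i = 0; i < this._numOfWorkers; i++) {
      this.enqueue(task, (this._offset + i) % this._numOfWorkers);
    }
    this._offset++;
  }

  enqueue(task, workerId) {
    // Wrap the task per worker so each queue keeps its own "next" pointer.
    const item = {next: null, task};
    if (this._last[workerId]) {
      this._last[workerId].next = item;
    } else {
      this._queue[workerId] = item;
    }
    this._last[workerId] = item;
  }
}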

_options: FarmOptions;
_stderr: Readable;
_stdout: Readable;
_workers: Array<WorkerInterface>;

@mjesun

mjesun Jul 11, 2018

Contributor

The interface shouldn't have any of the private variables.

send(request: ChildMessage, onProcessStart: OnStart, onProcessEnd: OnEnd) {
onProcessStart(this);
this._onProcessEnd = onProcessEnd;
// $FlowFixMe

@mjesun

mjesun Jul 11, 2018

Contributor

?

@Kureev

Kureev Jul 11, 2018

Author

¿
this._child.send(request) doesn't expect to receive a request type

]);
}

function execMethod(method: string, args: $ReadOnlyArray<any>): void {

@mjesun

mjesun Jul 11, 2018

Contributor

Can you create a common file within the workers/ folder that abstracts this functionality? It looks like child and threadChild are 90% the same.
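
A rough illustration of that kind of extraction (the helper name, the reply shape, and the module layout are hypothetical, not this PR's code):

// The transport-agnostic part: processChild would call this with a
// process.send-based reply, threadChild with parentPort.postMessage.
function execMethod(userModule, method, args, reply) {
  let result;
  try {
    result = userModule[method](...args);
  } catch (error) {
    reply({ok: false, error: error.message});
    return;
  }
  // Supports both sync return values and promises.
  Promise.resolve(result).then(
    value => reply({ok: true, result: value}),
    error => reply({ok: false, error: error.message}),
  );
}

// Example: execMethod({double: n => n * 2}, 'double', [21], console.log);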

@@ -32,14 +34,16 @@ let file = null;
* If an invalid message is detected, the child will exit (by throwing) with a
* non-zero exit code.
*/
process.on('message', (request: any /* Should be ChildMessage */) => {
process.on('message', (request: any) => {

@mjesun

mjesun Jul 11, 2018

Contributor

Let's rename this file to processChild as opposed to threadChild.

const workerOptions: WorkerOptions = {
forkOptions: options.forkOptions || {},
maxRetries: options.maxRetries || 3,
useNodeWorkersIfPossible: options.useNodeWorkersIfPossible,

@mjesun

mjesun Jul 11, 2018

Contributor

Let's remove this and make it true always.

@SimenB

SimenB Jul 11, 2018

Collaborator

It makes sense as an option, doesn't it? Seems like an opt-in thing (beyond using a version of node where it's unflagged)

@Kureev

Kureev Jul 11, 2018

Author

There is no real reason why you'd want to use child_process instead of worker_threads when the latter is available. According to our benchmarks, the worker_threads implementation runs roughly twice as fast.

@SimenB

SimenB Jul 11, 2018

Collaborator

API stability; there might be behavioural differences. Upgrading my app from Node 10 to 11 (or whenever they remove the flag) shouldn't change the underlying implementation, IMO. The docs for the module say it uses multiple processes, and silently not doing that, depending on Node version, seems risky to me.

FWIW, I could get behind setting it to true by default so it's possible to opt out.

@Kureev

Kureev Jul 11, 2018

Author

Upgrading my app from node 10 to 11 (or whenever they remove the flag) shouldn't change the underlying implementation IMO

I think the underlying implementation is not something that should concern an average library user. The public interface stays exactly the same (which is what's crucial for all library users), while the "engine" gets faster. The same happens with the language itself and with other libraries like React (the Fiber rewrite), etc.

The docs for the module says it uses multiple processes, and just silently not doing that, depending on node version, seems risky to me

I'm completely on the same page, but we're going to cover all of that with specs, so there should be no unexpected regression bugs.

I think @mjesun can give you more insights.

@mjesun

mjesun Jul 11, 2018

Contributor

Let's re-write the docs as "we'll choose the most performant implementation possible". People can enforce one or the other by explicitly passing the corresponding worker pool.

@SimenB

SimenB Jul 11, 2018

Collaborator

Sounds good to me!
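
Roughly, the usage that came out of this thread would look like the following (a sketch; the commented-out override is hypothetical and the module path is a placeholder):

const Worker = require('jest-worker').default;

// Default behaviour: pick the most performant implementation available
// (worker_threads when present, child_process otherwise).
const worker = new Worker(require.resolve('./heavy-computation'), {
  numWorkers: 4,
  // Hypothetical escape hatch: force one implementation by passing the
  // corresponding worker pool explicitly, e.g.
  // WorkerPool: ChildProcessWorkerPool,
});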

return;
}
const task = {onEnd, onStart, request};
const workerId = worker ? worker.getWorkerId() : undefined;

@mjesun

mjesun Jul 11, 2018

Contributor

Let's unify the interface. Either we work with worker instances or with worker ids, but I see a mix in the API.

@Kureev

Kureev Jul 11, 2018

Author

Done. We use workerIds everywhere up to the pool


if (job.owner) {
this._queue[workerId] = job.next ? job.next : null;
return this.run(workerId);

@mjesun

mjesun Jul 11, 2018

Contributor

Remove the recursion here and use a while loop to find the first available job. Don't lock by worker, lock by job.
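
In other words, something along these lines (a standalone sketch, not the PR's exact code):

// Walk one worker's queue iteratively until a job nobody owns is found,
// instead of recursing back into run(). Each node is shaped like the
// QueueChildMessage objects in the snippets above ({owner, next, ...}).
function getNextJob(queueHead) {
  let job = queueHead;
  while (job && job.owner) {
    job = job.next;
  }
  return job;
}

// Example: the first job is already taken, so the second one is returned.
const head = {owner: true, next: {owner: null, next: null, id: 'b'}};
console.log(getNextJob(head).id); // 'b'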

@SimenB

Collaborator

SimenB commented Jul 11, 2018

Ooooooh, nice!

@SimenB
Collaborator

SimenB left a comment

This is also missing docs, in addition to the WIP points highlighted in a comment 🙂

Excited about this!

@@ -0,0 +1,33 @@
/**
* Copyright (c) 2017-present, Facebook, Inc. All rights reserved.

@SimenB

SimenB Jul 11, 2018

Collaborator

2018? I have to admit I don't know what these should say.

@Kureev

Author

Kureev commented Jul 11, 2018

@SimenB you haven't seen the performance part yet 😉

P.S. Added "Usage docs" as a missing part. I think it should come along with this PR to avoid inconsistency

if (workerId != null) {
if (this._queue[workerId]) {
this._last[workerId].next = task;
this._last[workerId] = task;

@mjesun

mjesun Jul 11, 2018

Contributor

This line is common to both branches, so it can be moved out of the conditional.


this._workerPool.send(worker, job.request, job.onStart, onEnd);

job.owner = worker;

@mjesun

mjesun Jul 11, 2018

Contributor

Convert owner to a boolean.

const job = this._queue[workerId];
const worker = this._workerPool.getWorkers()[workerId];

if (!job) {


this.lock(workerId);

this._workerPool.send(worker, job.request, job.onStart, onEnd);

@mjesun

mjesun Jul 11, 2018

Contributor

That's an example of API unification: let's use workerId.

// $FlowFixMe: This has to be a dynamic require.
const child = require(workerPath);
this._cacheKeys = Object.create(null);
this._options = Object.assign({}, defaultFarmOptions, options);

@mjesun

mjesun Jul 11, 2018

Contributor

Make sure to make all configuration non-optional from here down the line. E.g. your input says numWorkers: ?number, but the _options object passed from here on should not be optional (i.e. numWorkers: number).
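
The usual pattern for that is to normalize once at the public boundary, for example (a generic sketch; forkOptions and maxRetries mirror the snippet above, the numWorkers default is illustrative):

const os = require('os');

// Fill in every optional field once so everything downstream can rely on a
// fully populated options object (numWorkers: number, not numWorkers: ?number).
function normalizeFarmOptions(options = {}) {
  return {
    forkOptions: options.forkOptions || {},
    maxRetries: options.maxRetries || 3,
    numWorkers: options.numWorkers || os.cpus().length - 1,
  };
}

console.log(normalizeFarmOptions({numWorkers: 4}));
// { forkOptions: {}, maxRetries: 3, numWorkers: 4 }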

PARENT_MESSAGE_ERROR,
CHILD_MESSAGE_INITIALIZE,

@mjesun

mjesun Jul 11, 2018

Contributor

Alphabetical order


const onEnd = (error: ?Error, result: mixed, worker: WorkerInterface) => {
this.unlock(workerId);
job.onEnd(error, result, worker);

@mjesun

mjesun Jul 11, 2018

Contributor

Can we swap this with the previous line? Conceptually it sounds better to keep the worker locked while finishing, just in case onEnd decides to go wild.
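
In other words, the callback from the snippet above would become (same fragment with the two lines swapped, types elided):

const onEnd = (error, result, worker) => {
  // Keep the worker locked until the job has fully finished, so a misbehaving
  // onEnd callback can't observe the worker as free.
  job.onEnd(error, result, worker);
  this.unlock(workerId);
};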

}

getNextJob(workerId: number): ?QueueChildMessage {
if (!this._queue[workerId]) {

@mjesun

mjesun Jul 11, 2018

Contributor

Cache the entire queue in const queue = this._queue[workerId].

break;
}
this._queue[workerId] = this._queue[workerId].next;
}


this._workers[i] = worker;

// $FlowFixMe

@mjesun

mjesun Jul 11, 2018

Contributor

Please add an explanation to all FlowFixMes.

You can get rid of these by redeclaring the methods at the top as properties (class methods are covariant in Flow). For instance, putting getStderr: () => BaseWorkerPool; at the top would do the trick.

Anyway, I'm not sure why we need to bind them all.

@Kureev

Kureev Jul 11, 2018

Author

We actually don't need them anymore. I'll remove them accordingly.

@Kureev Kureev force-pushed the Kureev:feature/T31135461 branch from 1ad9aa5 to 29e04d2 Jul 12, 2018

@SimenB

Collaborator

SimenB commented Aug 15, 2018

Any news on this one? I'm excited!

@SimenB

Collaborator

SimenB commented Sep 15, 2018

@mjesun @Kureev ping 🙂 Would be awesome to land this. Currently landing some breaking changes, so this should be fine to land as part of that if needed

@Kureev

Author

Kureev commented Sep 24, 2018

Hi @SimenB! :) Sorry for being silent. I'm working on a different team now, so maybe someone would like to take this over? IIRC, only the tests are missing now. cc @mjesun

@SimenB SimenB added this to the Jest 24 milestone Oct 16, 2018

@SimenB

Collaborator

SimenB commented Oct 17, 2018

@mjesun @rubennorte anyone at FB working on this? If not, anyone able to say what works and what doesn't with this PR so we can land it?

@rubennorte

Contributor

rubennorte commented Oct 17, 2018

@SimenB AFAIK no one is working on this right now. I might take a look in a few weeks, but I don't think it's going to be ready for the next major. Feel free to take over from here if you (or anyone else) want to work on this.

@SimenB

Collaborator

SimenB commented Oct 17, 2018

Might look into it if you could say a bit about what's missing. Just getting CI green? More tests?

@rickhanlonii

Member

rickhanlonii commented Nov 25, 2018

I merged in master and wrote tests for this on #7408

Separated the PRs to get a clean review 👍

@SimenB

Collaborator

SimenB commented Dec 5, 2018

Landed

@SimenB SimenB closed this Dec 5, 2018

@SimenB SimenB referenced this pull request Jan 22, 2019

Merged

Add Jest 24 blog post #7670
