WIP: cluster: make scheduler configurable #11546

cjihrig · 2017-02-25T00:05:23Z

This is a WIP for making the cluster scheduler configurable by users. This still needs docs, tests, etc.

In the current implementation, the cluster module exports a Scheduler class. The class provides the following methods/hooks:

constructor() - Used to configure data structures for managing connections (handles) and cluster workers.
addConnection() - Called when a new connection is received. This is where the user would store the incoming connection handle to be distributed later.
addWorker() - Called when a new cluster worker is added. This is where the user would add the worker into the pool of workers.
removeWorker() - Called when a cluster worker is being removed from the pool.
shutdown() - Called when the scheduler is finished working. This would be used to clean up any lingering resources such as handles.
schedule() - This is the scheduling algorithm. The output of the algorithm should be a connection handle to distribute, and the worker to distribute it to.
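
To make the hook list above concrete, here is a minimal sketch of what a round-robin scheduler written against those hooks might look like. The class name, the argument lists, and the `{ handle, worker }` return shape are assumptions made for illustration; the PR description does not spell them out, and the real round-robin implementation tracks free workers differently.

```javascript
// Hypothetical sketch: hook names come from the list above, everything else
// (arguments, return value, field names) is assumed for illustration.
class RoundRobinScheduler {
  constructor() {
    this.handles = [];  // pending connection handles, oldest first
    this.free = [];     // workers eligible to receive a connection
  }

  // Called when a new connection is received.
  addConnection(handle) {
    this.handles.push(handle);
  }

  // Called when a worker has been added to the cluster.
  addWorker(worker) {
    this.free.push(worker);
  }

  // Called when a worker is being removed from the pool.
  removeWorker(worker) {
    const index = this.free.indexOf(worker);
    if (index !== -1) this.free.splice(index, 1);
  }

  // Called when the scheduler is done: close any handles still queued.
  shutdown() {
    let handle;
    while ((handle = this.handles.shift()) !== undefined) {
      handle.close();
    }
  }

  // Returns the next { handle, worker } pair, or null if there is nothing
  // to distribute or nobody to distribute it to.
  schedule() {
    if (this.handles.length === 0 || this.free.length === 0) return null;
    const handle = this.handles.shift();
    const worker = this.free.shift();
    this.free.push(worker);  // rotate the chosen worker to the back
    return { handle, worker };
  }
}
```

Only schedule() contains any policy here; the other hooks just maintain two arrays, which is the motivation for the "Remaining questions" below.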

I tried to minimize the exposure of the cluster's inner workings to incoming connection handles. Worker objects are already part of the public API.

Remaining questions:

Should we enforce that handles from incoming connections are stored in a queue (array)? This would eliminate the need for the current addConnection() hook.
Should we enforce that workers are stored in a "worker id to worker object" map, like they currently are? This would eliminate the need for the addWorker() and removeWorker() hooks.
If we can get rid of the connections and workers hooks, then the shutdown hook can probably go. Then we could drop the Scheduler class, and just provide a schedule() function that takes the available workers and handles as input, and schedules accordingly. Ideally, I'd like to go this simpler route, but want to check with some people who are actually doing this in practice to make sure it would be flexible enough. cc: @Yemanu and @redonkulus from RFC cluster: make scheduler pluggable #10880. Would this approach work for your needs?

Closes #10880

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines

Affected core subsystem(s)

cluster

Fishrock123 · 2017-02-25T00:38:15Z

lib/internal/cluster/round_robin_handle.js

 const { sendHelper } = require('internal/cluster/utils');
-const getOwnPropertyNames = Object.getOwnPropertyNames;
+const { create, getOwnPropertyNames } = Object;


not really sure I'll ever get used to that one

The variable name aliasing for this style is even harder to remember/get used to.

Fishrock123 · 2017-02-25T00:40:12Z

lib/internal/cluster/round_robin_handle.js

+  }
+  shutdown() {
+    for (var handle; handle = this.handles.shift(); handle.close())
+      ;


possibly neater/more readable as

var handle; while (handle = this.handles.shift()) { handle.close(); }

linter might complain less too?

bnoordhuis · 2017-02-27T15:54:37Z

This class-based approach seems way overkill. Why is plugging in a custom schedule function not sufficient?

cjihrig · 2017-02-27T15:57:30Z

I agree, and I think that it should be sufficient (see "Remaining questions" in the PR description). It really just comes down to whether or not people who do this need to define their own data structures for managing the workers and handles, or if we can standardize on the current data structures.

cjihrig · 2017-03-01T16:42:48Z

Ping @Yemanu and @redonkulus. I'd like your feedback before dropping the class based approach.

yemanett · 2017-03-02T18:43:59Z

@cjihrig, we still want to use all functionality provided by the RR scheduler, but we want to customize the schedule(callback) method. It looks like RR is no longer visible to the public API, right?

cjihrig · 2017-03-03T14:54:59Z

@yemanett all of the functionality will still be available, but you'll be responsible for scheduling connections to the worker yourself. For RR, that's pretty simple: you grab the first worker out of an array, then put that worker in the back of the array.
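
As a rough illustration of the round-robin step described above, assuming the workers live in a plain array owned by the caller (the function name `pickRoundRobin` is made up for this sketch):

```javascript
// Sketch only: `workers` is assumed to be a plain array owned by the caller.
function pickRoundRobin(workers) {
  if (workers.length === 0) return undefined;
  const worker = workers.shift();  // grab the first worker out of the array
  workers.push(worker);            // then put that worker at the back
  return worker;
}

// Strings stand in for worker objects here.
const pool = ['w1', 'w2', 'w3'];
const picks = [
  pickRoundRobin(pool),
  pickRoundRobin(pool),
  pickRoundRobin(pool),
  pickRoundRobin(pool),
];
// picks is ['w1', 'w2', 'w3', 'w1']
```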

My question is whether or not you need special hooks for managing the data structures for the connection queue and workers. My guess is no.

yemanett · 2017-03-03T20:14:13Z

@cjihrig sorry, I missed line https://github.com/cjihrig/node-1/blob/c47fe62a1d53959a12c741731608b7efaf5856e1/lib/internal/cluster/master.js#L300 which led me to my previous incorrect comment.

Either way should be fine, but the class-based approach feels more intuitive. In that case we need just addWorker(), removeWorker(), and schedule().

cjihrig · 2017-03-05T00:53:32Z

@yemanett may I ask how you plan to use the addWorker() and removeWorker() hooks? Note that these are called when a worker is added or removed. They are not used to add and remove workers.

If I recall correctly, your original use case was to take workers offline for some amount of time. In that scenario, I would recommend that the child process sends an IPC message to the cluster master saying that it needs to go offline. The master can then take that information into account when calling schedule().

yemanett · 2017-03-05T23:47:00Z

@cjihrig Basically, we are planning to use it with our application runner module, which manages the node processes, including re-spawning a new worker when a child process dies. The application runner requires cluster. The runner loads the scheduler script in the master process. The scheduler takes a process out of rotation based on its algorithm by calling the removeWorker(w) method, which removes w from the ‘this.free’ list and notifies the worker via an IPC message. The worker can then perform some tasks and notify the master via IPC to put it back in rotation, in which case the master calls addWorker(w) to add w into the this.free list.

cjihrig · 2017-03-06T01:25:49Z

The scheduler takes a process out of rotation based on its algorithm by calling the removeWorker(w) method which removes w from the ‘this.free’ list and notify the worker via IPC message.

That's not how removeWorker() works. It isn't used to take a worker offline. It's a function that is called once the worker is already offline. In your case, the scheduler would take the worker out of rotation based on its own algorithm, or some message received from the worker. With the worker then marked as offline, the schedule() function just wouldn't route any requests to it.

The worker can then perform some tasks and notify master via IPC to put it back in rotation, in which case master calls addWorker(w) to add w into this.free list.

Also not how addWorker() works. It would be invoked when a cluster worker comes online. In your case, the worker would send an IPC message to the master, and the master would mark the worker as online again. Then schedule() would see that worker as a valid receiver of requests.

Based on this, it still seems like schedule() is all that's needed.
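
A minimal sketch of that idea, under stated assumptions: the message shape (`{ cmd: 'offline' }` / `{ cmd: 'online' }`), the `offline` set, and the function names are all hypothetical, and schedule() is assumed to receive an array of worker objects that each have an `id`.

```javascript
// Hypothetical message protocol: { cmd: 'offline' } and { cmd: 'online' }.
const offline = new Set();  // ids of workers currently out of rotation

// Would be called in the master for every IPC message from a worker.
function handleWorkerMessage(worker, message) {
  if (!message) return;
  if (message.cmd === 'offline') offline.add(worker.id);
  else if (message.cmd === 'online') offline.delete(worker.id);
}

let cursor = 0;  // index of the next worker to consider

// Round-robin over `workers`, skipping any worker marked offline.
// Does not mutate `workers`; returns undefined if nobody is available.
function schedule(workers) {
  for (let i = 0; i < workers.length; i++) {
    const index = (cursor + i) % workers.length;
    const worker = workers[index];
    if (!offline.has(worker.id)) {
      cursor = (index + 1) % workers.length;
      return worker;
    }
  }
  return undefined;
}
```

In a real master process, handleWorkerMessage() could be wired up with `cluster.on('message', (worker, message) => handleWorkerMessage(worker, message))`, and a worker would announce itself with `process.send({ cmd: 'offline' })`. No addWorker() or removeWorker() hook is involved.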

yemanett · 2017-03-06T03:03:22Z

@cjihrig yes, schedule() is what we need, but currently it takes a worker and a handle. How do we get the handle?

bnoordhuis · 2017-03-06T09:36:44Z

What do you need the handle for? schedule()'s job is to pick the next worker. Ideally, you don't need to look at the handle for that.

That said, schedule() should be flexible enough to support a use case like IP-based load balancing. That's difficult though because the cluster module operates on internal handle objects, not on net.Socket objects.

yemanett · 2017-03-06T17:07:48Z

@bnoordhuis

What do you need the handle for? schedule()'s job is to pick the next worker. Ideally, you don't need to look at the handle for that.

Because the above proposal says "Then we could drop the Scheduler class, and just provide a schedule() function that takes the available workers and handles as input, and schedules accordingly."

That said, schedule() should be flexible enough to support a use case like IP-based load balancing. That's difficult though because the cluster module operates on internal handle objects, not on net.Socket objects.

Our use case is simple: we just want to take a worker out of rotation (OOR) so that the master won't distribute new incoming requests to the worker while it is OOR. We still want to continue to use the RR scheduler; we just need to add functionality to take a worker offline. The sequence diagram looks something like this

cjihrig · 2017-03-14T14:55:53Z

That said, schedule() should be flexible enough to support a use case like IP-based load balancing. That's difficult though because the cluster module operates on internal handle objects, not on net.Socket objects.

@bnoordhuis do you have any preference between:

1. Dropping the handles from schedule(). IP-based load balancing wouldn't really be possible.
2. Passing the raw handles to schedule(). This would expose internals.
3. Wrapping the handles in a net.Socket. There is probably some performance overhead.

bnoordhuis · 2017-03-14T15:10:41Z

I'd say option 3, probably with a way to opt in so people with no need for the handle don't have to pay the overhead.

cjihrig · 2017-03-28T03:02:03Z

@bnoordhuis I've gotten rid of the scheduler class. You would just have to pass in a scheduler function now. You can request that the socket be passed in by adding a flag to the scheduler function. Before I go about writing docs and tests, what do you think?
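
To illustrate the kind of scheduler function this enables, here is a sketch of IP-based load balancing. The thread does not show the flag name or the argument order, so `wantsSocket` and the `(workers, socket)` signature are placeholders; `remoteAddress` is the standard net.Socket property.

```javascript
// Simple deterministic string hash, kept within unsigned 32-bit range.
function hashAddress(address) {
  let hash = 0;
  for (let i = 0; i < address.length; i++) {
    hash = (hash * 31 + address.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Hypothetical scheduler: maps each remote address to a fixed worker,
// so repeat connections from one client land on the same worker.
function ipHashScheduler(workers, socket) {
  if (workers.length === 0) return undefined;
  const address = socket.remoteAddress || '';
  return workers[hashAddress(address) % workers.length];
}

// Placeholder for the opt-in flag mentioned above; the real name is not
// given in this thread.
ipHashScheduler.wantsSocket = true;
```

Note that the mapping shifts whenever the number of workers changes, so this gives stickiness only while the worker pool is stable.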

bnoordhuis

Looks like an alright approach to me.

bnoordhuis · 2017-03-28T17:02:22Z

lib/internal/cluster/round_robin_handle.js

+    socket = new net.Socket({
+      handle,
+      readable: false,
+      writable: false,


readable = writable = false. Is that intentional?

I figured that the socket should only be used to determine where to pass the handle, and not actually read or written.

yemanett · 2017-03-29T22:00:54Z

@cjihrig, the scheduler works nicely for me. The changes look good to me.

Thanks,

cjihrig · 2017-03-30T14:21:28Z

@bnoordhuis updated.

Removed the cluster.SCHED_CUSTOM constant.
Added stronger language to the docs about altering workers

I went with scheduler over schedule since the function is the thing that determines the schedule.

bnoordhuis

LGTM although the wording in the documentation about not modifying the array could still be stronger.

cjihrig · 2017-04-03T12:25:05Z

@bnoordhuis do you have any specific wording in mind? I'm happy to try to scare users.

bnoordhuis · 2017-04-03T17:39:03Z

"The array should under no condition be mutated"?

cjihrig · 2017-04-04T13:29:26Z

OK, I'll update to that, although we do mutate the array in the round robin scheduler.

bnoordhuis · 2017-04-04T15:18:18Z

Yes, but we are allowed to break our own rules.
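
For context on why the wording matters: documentation is the only guard discussed in this thread. As a purely hypothetical alternative (not something the PR does), a caller could hand user schedulers a frozen copy of the array, so that accidental mutation fails loudly instead of corrupting shared state. `callScheduler` is an invented helper name.

```javascript
// Hypothetical wrapper, not part of the cluster module.
function callScheduler(scheduler, workers) {
  const snapshot = Object.freeze(workers.slice());  // shallow, frozen copy
  return scheduler(snapshot);
}

// Reads the array without changing it.
const wellBehaved = (workers) => workers[0];

// Array.prototype.shift() throws a TypeError on a frozen array.
const mutating = (workers) => workers.shift();
```

The copy costs an allocation per call, which is one reason a documentation-only rule might be preferred on a hot path.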

cjihrig · 2017-04-07T15:17:16Z

Updated to the suggested wording.

CI: https://ci.nodejs.org/job/node-test-pull-request/7265/. Of course Windows wouldn't work.

yemanett · 2017-04-27T18:11:51Z

The following test is failing: test/windows-fanned

yemanett · 2017-05-04T19:53:10Z

@cjihrig - do you know why the test test/windows-fanned is failing? The 'Detail' link returns status 502.

cjihrig · 2017-05-05T06:30:40Z

@yemanett the CI results are only kept around temporarily. It looks like those ones are no longer available.

yemanett · 2017-05-08T19:13:29Z

@cjihrig - when will this be merged?

cjihrig · 2017-05-09T18:50:40Z

@yemanett It needs a rebase on the documentation, and some work to make the CI pass on Windows.

redonkulus · 2017-05-09T23:03:13Z

@cjihrig are you still actively working on this PR? If so, when do you think it can be completed?

cjihrig · 2017-05-10T19:00:21Z

@redonkulus still working on it, but if you wanted to debug the test on Windows, I wouldn't try to stop you.

yemanett · 2017-05-30T15:00:47Z

@cjihrig do you think the PR issue will be resolved and merged soon?

Trott · 2017-07-27T00:15:45Z

Labeling stalled. Feel free to remove the label if that's wrong.

cjihrig · 2017-08-12T18:00:55Z

Someone else can take this over if they want.

nodejs-github-bot added the build Issues and PRs related to build files or the CI. label Feb 25, 2017

cjihrig mentioned this pull request Feb 25, 2017

RFC cluster: make scheduler pluggable #10880

Closed

Fishrock123 reviewed Feb 25, 2017

mscdex added cluster Issues and PRs related to the cluster subsystem. and removed build Issues and PRs related to build files or the CI. labels Feb 25, 2017

vmarchaud mentioned this pull request Mar 19, 2017

Sticky Sessions ? Unitech/pm2#389

Closed

This was referenced Mar 26, 2017

cluster:RR - Added public API pausing/unpausing a worker #10369

Closed

cluster:take a worker offline and force GC #7695

Closed

cjihrig force-pushed the scheduler-new branch 3 times, most recently from b20bf08 to 63ec46c Compare March 28, 2017 02:46

bnoordhuis reviewed Mar 28, 2017

cjihrig force-pushed the scheduler-new branch 2 times, most recently from e36a038 to 6a01cda Compare March 29, 2017 01:39

cjihrig force-pushed the scheduler-new branch from 6bb24ab to c3416a4 Compare March 30, 2017 14:19

bnoordhuis approved these changes Apr 3, 2017

jasnell approved these changes Apr 4, 2017

cjihrig added 2 commits April 7, 2017 10:05

cluster: make scheduler configurable

ea168b7

pr updates

fefd041

cjihrig force-pushed the scheduler-new branch from c3416a4 to fefd041 Compare April 7, 2017 14:14

refack force-pushed the master branch from 16073c0 to fbe946b Compare April 14, 2017 04:11

Trott added the stalled Issues and PRs that are stalled. label Jul 27, 2017

cjihrig closed this Aug 12, 2017

soyuka mentioned this pull request Jan 25, 2018

Sticky session cluster #18315

Closed

mscdex mentioned this pull request Oct 12, 2018

cluster: implement API for pluggable distribute() for round-robin scheduler #6001 #23478

Closed

cjihrig mentioned this pull request Sep 25, 2019

cluster: remove the useless parameter for master #29470

Closed

bnoordhuis mentioned this pull request May 1, 2020

Cluster: how to inspect and act upon incoming requests nodejs/help#2664

Closed
