Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Commit

Permalink
cluster: centralize removal from workers list.
Browse files Browse the repository at this point in the history
Currently, cluster workers can be removed from the workers list in three
different places:
- In the exit event handler for the worker process.
- In the disconnect event handler of the worker process.
- In the disconnect event handler of the cluster master.

However, handles for a given worker are cleaned up only in one of these
places: in the cluster master's disconnect event handler.

Because these events happen asynchronously, it is possible that the
workers list is empty before we even clean up one handle. This makes
the assert that makes sure that no handle is left when the workers
list is empty fail.

This commit removes the worker from the cluster.workers list only when
the worker is dead _and_ disconnected, at which point we're sure that
its associated handles are cleaned up.

Fixes #8191 and #8192.
  • Loading branch information
Julien Gilli committed Aug 21, 2014
1 parent 0d357fa commit 7b432fb
Show file tree
Hide file tree
Showing 2 changed files with 56 additions and 16 deletions.
6 changes: 4 additions & 2 deletions doc/api/cluster.markdown
Expand Up @@ -344,8 +344,10 @@ A hash that stores the active worker objects, keyed by `id` field. Makes it
easy to loop through all the workers. It is only available in the master
process.

A worker is removed from cluster.workers just before the `'disconnect'` or
`'exit'` event is emitted.
A worker is removed from cluster.workers after the worker has disconnected _and_
exited. The order between these two events cannot be determined in advance.
However, it is guaranteed that the removal from the cluster.workers list happens
before last `'disconnect'` or `'exit'` event is emitted.

// Go through all workers
function eachWorker(callback) {
Expand Down
66 changes: 52 additions & 14 deletions lib/cluster.js
Expand Up @@ -64,6 +64,14 @@ Worker.prototype.send = function() {
this.process.send.apply(this.process, arguments);
};

Worker.prototype.isDead = function isDead() {
return this.state === 'dead';
};

Worker.prototype.isDisconnected = function isDisconnected() {
return this.state === 'disconnected';
};

// Master/worker specific methods are defined in the *Init() functions.

function SharedHandle(key, address, port, addressType, backlog, fd) {
Expand Down Expand Up @@ -310,20 +318,62 @@ function masterInit() {
id: id,
process: workerProcess
});

function removeWorker(worker) {
assert(worker);

delete cluster.workers[worker.id];

if (Object.keys(cluster.workers).length === 0) {
assert(Object.keys(handles).length === 0, 'Resource leak detected.');
intercom.emit('disconnect');
}
}

function removeHandlesForWorker(worker) {
assert(worker);

for (var key in handles) {
var handle = handles[key];
if (handle.remove(worker)) delete handles[key];
}
}

worker.process.once('exit', function(exitCode, signalCode) {
/*
* Remove the worker from the workers list only
* if it has disconnected, otherwise we might
* still want to access it.
*/
if (worker.isDisconnected()) removeWorker(worker);

worker.suicide = !!worker.suicide;
worker.state = 'dead';
worker.emit('exit', exitCode, signalCode);
cluster.emit('exit', worker, exitCode, signalCode);
delete cluster.workers[worker.id];
});

worker.process.once('disconnect', function() {
/*
* Now is a good time to remove the handles
* associated with this worker because it is
* not connected to the master anymore.
*/
removeHandlesForWorker(worker);

/*
* Remove the worker from the workers list only
* if its process has exited. Otherwise, we might
* still want to access it.
*/
if (worker.isDead()) removeWorker(worker);

worker.suicide = !!worker.suicide;
worker.state = 'disconnected';
worker.emit('disconnect');
cluster.emit('disconnect', worker);
delete cluster.workers[worker.id];
});

worker.process.on('internalMessage', internal(worker, onmessage));
process.nextTick(function() {
cluster.emit('fork', worker);
Expand All @@ -345,18 +395,6 @@ function masterInit() {
if (cb) intercom.once('disconnect', cb);
};

cluster.on('disconnect', function(worker) {
delete cluster.workers[worker.id];
for (var key in handles) {
var handle = handles[key];
if (handle.remove(worker)) delete handles[key];
}
if (Object.keys(cluster.workers).length === 0) {
assert(Object.keys(handles).length === 0, 'Resource leak detected.');
intercom.emit('disconnect');
}
});

Worker.prototype.disconnect = function() {
this.suicide = true;
send(this, { act: 'disconnect' });
Expand Down

0 comments on commit 7b432fb

Please sign in to comment.