ERR_IPC_CHANNEL_CLOSED and EPIPE errors on exit event #5

Closed
RolandoAndrade opened this issue Oct 7, 2022 · 1 comment

RolandoAndrade commented Oct 7, 2022

Hey! Nice work! I found an issue when I try to run a process in cluster mode using this library.

Description

When I run processes in cluster mode with the @socket.io/pm2 library and then try to delete them with pm2 delete all, I get the following errors:

PM2             | Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed
PM2             |     at ChildProcess.target.send (internal/child_process.js:680:16)
PM2             |     at Worker.send (internal/cluster/worker.js:45:28)
PM2             |     at EventEmitter.<anonymous> (/usr/lib/node_modules/@socket.io/pm2/node_modules/@socket.io/cluster-adapter/dist/index.js:392:32)
PM2             |     at EventEmitter.emit (events.js:314:20)
PM2             |     at EventEmitter.emit (domain.js:483:12)
PM2             |     at ChildProcess.<anonymous> (internal/cluster/master.js:191:13)
PM2             |     at Object.onceWrapper (events.js:421:26)
PM2             |     at ChildProcess.emit (events.js:314:20)
PM2             |     at ChildProcess.EventEmitter.emit (domain.js:506:15)
PM2             |     at Process.ChildProcess._handle.onexit (internal/child_process.js:276:12) {
PM2             |   code: 'ERR_IPC_CHANNEL_CLOSED'
PM2             | }
PM2             | Error: write EPIPE
PM2             |     at ChildProcess.target._send (internal/child_process.js:807:20)
PM2             |     at ChildProcess.target.send (internal/child_process.js:678:19)
PM2             |     at Worker.send (internal/cluster/worker.js:45:28)
PM2             |     at EventEmitter.<anonymous> (/usr/lib/node_modules/@socket.io/pm2/node_modules/@socket.io/cluster-adapter/dist/index.js:392:32)
PM2             |     at EventEmitter.emit (events.js:314:20)
PM2             |     at EventEmitter.emit (domain.js:483:12)
PM2             |     at ChildProcess.<anonymous> (internal/cluster/master.js:191:13)
PM2             |     at Object.onceWrapper (events.js:421:26)
PM2             |     at ChildProcess.emit (events.js:314:20)
PM2             |     at ChildProcess.EventEmitter.emit (domain.js:506:15) {
PM2             |   errno: 'EPIPE',
PM2             |   code: 'EPIPE',
PM2             |   syscall: 'write'
PM2             | }

It seems that, in the exit handler, the adapter tries to send the WORKER_EXIT message to workers that have already disconnected.

    cluster.on("exit", (worker) => {
        // notify all active workers
        for (const workerId in cluster.workers) {
            if (hasOwnProperty.call(cluster.workers, workerId)) {
                cluster.workers[workerId].send({
                    source: MESSAGE_SOURCE,
                    type: EventType.WORKER_EXIT,
                    data: worker.id,
                });
            }
        }
    });

If pm2 is set up to resurrect on failure, this causes the killed/deleted/stopped processes to be resurrected unexpectedly.

Based on this thread, I think it is a synchronization problem that occurs when multiple processes are closed at once. Deleting the processes one by one works without any problem.

I was able to mitigate the error by adding a callback to handle the exception:

    cluster.on("exit", (worker) => {
        // notify all active workers
        for (const workerId in cluster.workers) {
            if (hasOwnProperty.call(cluster.workers, workerId)) {
                cluster.workers[workerId].send({
                    source: MESSAGE_SOURCE,
                    type: EventType.WORKER_EXIT,
                    data: worker.id,
                }, (err) => {
                    if (err) {
                        // the target worker's IPC channel was already closed
                        if (err.code === 'ERR_IPC_CHANNEL_CLOSED' || err.code === 'EPIPE') {
                            console.warn('Synchronization problem: tried to send a message to a disconnected worker');
                            console.warn(err);
                        } else {
                            throw err;
                        }
                    }
                });
            }
        }
    });
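
A possible variation on this mitigation is to skip workers whose IPC channel is already closed, while still passing a callback in case the channel closes between the check and the write. This is only a sketch under assumptions, not the library's actual fix: notifyWorkerExit and IGNORABLE_CODES are hypothetical names, and the message argument stands in for the adapter's MESSAGE_SOURCE / EventType.WORKER_EXIT fields. worker.isConnected() and the send(message, callback) form are part of Node's cluster API.

```javascript
// Hypothetical helper, not part of @socket.io/cluster-adapter.
// `workers` has the same shape as cluster.workers (id -> Worker).
const IGNORABLE_CODES = new Set(["ERR_IPC_CHANNEL_CLOSED", "EPIPE"]);

function notifyWorkerExit(workers, exitedId, message, onError = console.error) {
  let attempted = 0;
  for (const worker of Object.values(workers)) {
    // cluster.workers can still list workers that have disconnected
    // but not yet exited, so check the channel before sending.
    if (!worker || !worker.isConnected()) continue;
    attempted++;
    worker.send({ ...message, data: exitedId }, (err) => {
      // The channel can still close between the check and the write;
      // report only errors that are not "channel closed" errors.
      if (err && !IGNORABLE_CODES.has(err.code)) onError(err);
    });
  }
  return attempted; // number of workers a send was attempted on
}

// Intended wiring (assumes the adapter's constants are in scope):
// cluster.on("exit", (worker) =>
//   notifyWorkerExit(cluster.workers, worker.id, {
//     source: MESSAGE_SOURCE,
//     type: EventType.WORKER_EXIT,
//   }));
```

Reporting unexpected errors through onError instead of throwing inside the callback avoids turning them into uncaught exceptions in the PM2 daemon.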

Steps to reproduce

  1. Create a sample process (test-process.js).

      process.on('SIGINT', () => {
          console.log(`Received SIGINT.  Shutting down.`);
          process.exit(0);
      });

      let i = 0;

      async function run() {
          while (true) {
              await new Promise(resolve => setTimeout(resolve, 5000));
              console.log(`Number ${i++}.`);
          }
      }

      run().catch(error => {
          console.error('Error!', error);
          process.exit();
      });

  2. Delete pm2 and install @socket.io/pm2.
  3. Run the process in cluster mode.

      pm2 start test-process.js -i 3

  4. Try to delete/stop/kill the processes.

      pm2 delete all

After doing this, you will see the exceptions.

Env details

  • node 16
  • socket.io-cluster-adapter 0.1.0
  • socket.io/pm2 latest

RolandoAndrade changed the title from "ERR_IPC_CHANNEL_CLOSED and EPIPE errors on exit event c" to "ERR_IPC_CHANNEL_CLOSED and EPIPE errors on exit event" on Oct 7, 2022

darrachequesne (Member) commented

This should be fixed by be0a0e3, included in version 0.2.1.

Thanks for the detailed report!

darrachequesne added this to the 0.2.1 milestone on Oct 13, 2022
darrachequesne added the bug (Something isn't working) label on Oct 13, 2022