cluster.worker.on('message', (msg) => ...) fails to register a callback if ESM file extensions is used #48578

jerome-benoit · 2023-06-27T19:20:29Z

Version

v20.3.1

Platform

All supported platforms

Subsystem

No response

What steps will reproduce the bug?

After migrating code to ESM using .mjs file extensions, the poolifier project encountered issues when running its internal benchmark code.

Test case:

main.mjs:

import cluster from 'cluster'

cluster.setupPrimary({ exec: './worker.mjs' })

const worker = cluster.fork()

worker.on('message', message => {
  console.info('message received from worker:', message)
})

worker.on('online', () => {
  console.info('worker is online')
})

worker.on('error', (error) => {
  console.info('worker error', error)
})

worker.on('disconnect', () => {
  console.info('worker disconnected')
})

worker.on('exit', () => {
  console.info('worker exited')
})

worker.send('hello')

worker.mjs:

import cluster from 'cluster'

cluster.worker.on('message', message => {
  console.info('echo message received from main:', message)
  cluster.worker.send(message)
})

node main.mjs is frozen.

How often does it reproduce? Is there a required condition?

100% reproducible

What is the expected behavior? Why is that the expected behavior?

No response

What do you see instead?

Callback is never called if a message is sent from the primary.

Additional information

No response

jerome-benoit · 2023-06-27T19:28:49Z

Additional information: if the example code for IPC with cluster is put in a single ESM file, the callback for the 'message' event is properly registered and called. Poolifier code uses the exec option to specify the worker file path, and in that case cluster.worker.on('',() => {}) seems to have issues with ESM.

aduh95 · 2023-06-27T21:21:30Z

Can you share a repro?

jerome-benoit · 2023-06-27T21:30:12Z

Please read the detailed bug report on the poolifier repo; it contains everything needed to reproduce the issue reliably with poolifier. I do not currently have the time to extract the code from it into a two-file reproduction using the .mjs extension.

aduh95 · 2023-06-27T22:55:52Z

I don't have the time either. I don't plan on working on it without a repro that does not involve external code.

bnoordhuis · 2023-06-28T06:41:46Z

I'll close this until there is a standalone test case. Let me know when I should reopen.

jerome-benoit · 2023-06-28T10:16:48Z

The way this 100% reproducible bug has been handled by the Node.js project falls below the common-sense standards expected of a project whose goal is stability.
When I receive a confirmed, 100% reproducible bug report on one of the FOSS projects I maintain, I would not even dare to put the burden of solving it on the bug reporter: that is disrespectful of the work done to identify the issue and pinpoint the root cause.

bnoordhuis · 2023-06-28T10:38:31Z

The reason we don't usually accept bug reports that manifest in third-party code is that 9 out of 10 times the bug is in said third-party code, not Node.js.

Eventually comes the point where you decide you've wasted enough unpaid hours of your life on other people's crappy code.

Long story short: happy to take a look if you have a minimal test case. But if you are not willing to put in the time, then neither are we.

jerome-benoit · 2023-06-28T10:59:36Z

Please read the bug report I made carefully before blindly applying rules that do not apply to it:

- CommonJS benchmarking code: everything works; the callback registered with cluster.worker.on() in poolifier is called.
- Benchmarking code migrated to ESM: identical code except for the imports, using the very same poolifier code that previously worked (not even the ESM bundle, though the ESM bundle has the same issue): the callback is not registered with cluster.worker.on(), or is never called.

=> Bug confirmed, 100% reproducible. Extracting a standalone test case then becomes a corollary of its resolution, not a prerequisite. The burden is then expected to be shared between the project maintainers and the bug reporter.

bnoordhuis · 2023-06-28T11:07:17Z

Unless they're on your payroll, then no, you don't get to decide how and where other people invest their time.

Either put together a test case or don't, but stop arguing.

jerome-benoit · 2023-06-28T11:36:01Z

So the Node.js project is not interested in releasing stable ESM support in the cluster module by welcoming a proven, confirmed, 100% reproducible bug report? And it prefers to blindly apply rules instead of collaborating with the bug reporter to help plan a fix and make Node.js better?

I will do the standalone test case when time permits. But given the way my first confirmed 100% reproducible bug report on the node.js project has been welcomed, I do not think I will redo it unless it impacts deeply a FOSS project I comaintain.

aduh95 · 2023-06-28T12:09:02Z

What about you? Are you not interested in releasing stable ESM support in the cluster module? I think there's a misunderstanding here: the Node.js project is run by volunteers, and you can't expect that someone else will be interested in investing their time in fixing your bug, especially when you yourself are not. Or you need to pay them (send me an email, I can make myself available).
The Node.js project would gladly accept a PR to fix it, but the Node.js project cannot open PRs by itself, only contributors can; and you are unlikely to find someone to spend time on it if they are not affected by the bug themselves. If you provide a repro (which, as you probably know, is often the hardest part of fixing a bug), you're far more likely to find a volunteer.

> But given the way my first confirmed 100% reproducible bug report on the node.js project has been welcomed, I do not think I will redo it unless it impacts deeply a FOSS project I comaintain.

You'll be deeply missed. If I may make a suggestion: tone down the entitlement and show a little more respect for other people's personal time. You could be amazed what a difference it makes.

jerome-benoit · 2023-06-28T12:44:17Z

I do not expect any fix to be done, I do not expect any answer, I expect nothing from a bug report on a FOSS project, except one thing: that if it is taken into account, it is looked at fairly and in depth by a volunteer who cares about fixing it and has the time to handle it. That is a sign of respect for the time spent by another volunteer to pinpoint the bug. If that's not the view of the Node.js project volunteers, that's fine by me. But it is easy to understand that the volunteer in question will be reluctant to file another detailed bug report on the same project.
If the Node.js project thinks that the present bug report has been handled fairly and respectfully, then this discussion is going nowhere and there's no point in continuing it.

Back to the subject: I'll write the test case reproducing the issue when time permits and attach the files to the bug report. And when time permits, I will probably open a PR to try to fix it, as I usually do, as a long-time FOSS hacker, for the bug reports I open.

jerome-benoit · 2023-07-29T18:21:05Z

The test case:

main.mjs:

import cluster from 'cluster'

cluster.setupPrimary({ exec: './worker.mjs' })

const worker = cluster.fork()

worker.on('message', message => {
  console.info('message received from worker:', message)
})

worker.on('online', () => {
  console.info('worker is online')
})

worker.on('error', (error) => {
  console.info('worker error', error)
})

worker.on('disconnect', () => {
  console.info('worker disconnected')
})

worker.on('exit', () => {
  console.info('worker exited')
})

worker.send('hello')

worker.mjs:

import cluster from 'cluster'

cluster.worker.on('message', message => {
  console.info('echo message received from main:', message)
  cluster.worker.send(message)
})

node main.mjs is frozen.

Please reopen the issue.

jerome-benoit · 2023-07-29T18:24:55Z

main.js:

const cluster = require('cluster')

cluster.setupPrimary({ exec: './worker.js' })

const worker = cluster.fork()

worker.on('message', message => {
  console.info('message received from worker:', message)
})

worker.on('online', () => {
  console.info('worker is online')
})

worker.on('error', (error) => {
  console.info('worker error', error)
})

worker.on('disconnect', () => {
  console.info('worker disconnected')
})

worker.on('exit', () => {
  console.info('worker exited')
})

worker.send('hello')

worker.js:

const cluster = require('cluster')

cluster.worker.on('message', message => {
  console.info('echo message received from main:', message)
  cluster.worker.send(message)
})

node main.js just works.

bnoordhuis · 2023-07-30T09:50:06Z

Thanks, I'll reopen. Can you update your original report with the test cases?

I can make an educated guess as to why it doesn't work:

- cluster.fork() is implemented on top of child_process.fork()
- child_process.fork() spawns the child process with NODE_CHANNEL_FD=<num> set in the environment
- NODE_CHANNEL_FD makes node's bootstrap code call child_process._forkChild() to set up IPC
- the asynchronous loading of worker.mjs makes that malfunction somehow

Interestingly, the worker sends the 'online' internal message. That's done by cluster._setupWorker() and it's called from lib/internal/process/pre_execution.js so at least some of the steps are working, just not all.

Anyway, you know where to start looking now. Pull request welcome.
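
A possible stopgap while the root cause is open, assuming the hang is a timing problem in which the primary's message arrives before worker.mjs has registered its listener (the thread does not confirm this): buffer outgoing messages in the primary until the worker explicitly announces that it is ready. The 'online' event is probably not a usable signal for that, since per the comment above it is sent from the pre-execution bootstrap rather than from the worker module itself. The sketch below is a hypothetical helper, not part of the Node.js API; createReadyGate and the 'worker-ready' sentinel are names invented for illustration.

```javascript
// Hypothetical helper, not a Node.js API.
// Assumed sentinel the worker sends once its 'message' listener is registered.
const READY = 'worker-ready'

// Wraps a cluster Worker (anything exposing on('message', fn) and send(msg))
// and holds outgoing messages until the READY sentinel has been received.
function createReadyGate (worker) {
  let ready = false
  const pending = []

  worker.on('message', message => {
    if (message !== READY || ready) return
    ready = true
    // Flush everything that was queued before the worker announced readiness.
    for (const queued of pending.splice(0)) worker.send(queued)
  })

  return {
    // Sends immediately once the worker is ready, queues otherwise.
    send (message) {
      if (ready) worker.send(message)
      else pending.push(message)
    },
    // Number of messages still waiting for the READY sentinel.
    get pending () {
      return pending.length
    }
  }
}
```

With the test case above, main.mjs would call createReadyGate(worker).send('hello') instead of worker.send('hello'), and worker.mjs would call cluster.worker.send('worker-ready') right after its cluster.worker.on('message', ...) registration. The primary's own 'message' handler would then also see the sentinel and would need to ignore it.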

jerome-benoit · 2023-08-21T19:15:20Z

It will be quite a while before I am able to work on it, so if someone starts working on this issue, please say so and share your progress.

jerome-benoit · 2023-09-22T10:54:40Z

@fabiancook: thanks for your interest. You do need to use poolifier to fix that issue.

ronag · 2023-10-07T16:15:40Z

@mcollina i think this is quite a significant issue in terms of esm stability.

See @bnoordhuis's last comment.

jerome-benoit · 2023-10-07T17:15:35Z

> @mcollina i think this is quite a significant issue in terms of esm stability.

The issue only occurs when the worker file is ESM. The type of the main file has no impact.

st3ffgv4 · 2024-06-13T17:40:26Z

I'm facing the same issue

jerome-benoit mentioned this issue Jun 27, 2023

[BUG] Cluster worker message listener registration failure with ESM files extension poolifier/poolifier#782

Open

bnoordhuis closed this as not planned Jun 28, 2023

bnoordhuis reopened this Jul 30, 2023

bnoordhuis added the confirmed-bug and cluster labels Jul 30, 2023

fabiancook mentioned this issue Sep 22, 2023

[FEATURE] No magnification for module poolifier/poolifier#1279

Closed

1 task

aleksejaku added a commit to aleksejaku/node that referenced this issue Oct 6, 2023

cluster.worker.on('message', (msg) => ...) fails to register a callba…

1856443

…ck if ESM file extensions is used Fixes nodejs#48578

jerome-benoit mentioned this issue May 12, 2024

Add scheduling policy to cluster.settings #49292

Open
