Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed test: test-child-process-fork-regr-gh-2847 #3635

Closed
jbergstroem opened this issue Nov 2, 2015 · 18 comments

Comments

@jbergstroem
Copy link
Member

commented Nov 2, 2015

Looks like this test is starting to fail in our windows environment: fail 1, fail 2

not ok 11 test-child-process-fork-regr-gh-2847.js
# events.js:141
# throw er; // Unhandled 'error' event
# ^
# 
# Error: read ECONNRESET
# at exports._errnoException (util.js:860:11)
# at TCP.onread (net.js:544:26)
@Fishrock123

This comment has been minimized.

Copy link
Member

commented Nov 4, 2015

Also failed in #3631 (comment), I'm not sure if that is one of those two runs.

The test was added in 36b969f by @indutny and then modified in e9e837c by @mhdawson.

@mhdawson

This comment has been minimized.

Copy link
Member

commented Nov 16, 2015

@gireeshpunathil can you take a look at the windows failures since you worked on the the most recent change.

@gireeshpunathil

This comment has been minimized.

Copy link
Member

commented Nov 17, 2015

@mhdawson , sure, I will debug to see what the issue is.

@gireeshpunathil

This comment has been minimized.

Copy link
Member

commented Nov 18, 2015

When run in many iterations, I am able to see it failing with the original test case as well (36b969f). So this is not a regression.

There are three types of errors observed in total. With e9e837c (a) and (b) seem to have eliminated, but (c) persists.

(a)

events.js:141
      throw er; // Unhandled 'error' event
      ^
Error: connect ECONNREFUSED 127.0.0.1:12346
    at Object.exports._errnoException (util.js:915:11)
    at exports._exceptionWithHostPort (util.js:938:20)
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1065:14)

(b)

events.js:141
      throw er; // Unhandled 'error' event
      ^
Error: write EMFILE
    at exports._errnoException (util.js:915:11)
    at ChildProcess.target._send (internal/child_process.js:606:18)
    at internal/child_process.js:469:16
    at Array.forEach (native)
    at ChildProcess.<anonymous> (internal/child_process.js:468:13)
    at emitTwo (events.js:92:20)
    at ChildProcess.emit (events.js:172:7)
    at handleMessage (internal/child_process.js:686:10)
    at Pipe.channel.onread (internal/child_process.js:440:11)

(c)

events.js:141
      throw er; // Unhandled 'error' event
Error: read ECONNRESET
    at exports._errnoException (util.js:915:11)
    at TCP.onread (net.js:544:26)

The summary of the test case is to validate that the worker process being closed have its _channel field nullified. To re-inforce this assertion, a number of requests are sent from the master. The problem with the test case is that it does not handle the failure scenarios which can occur when the server is shutdown at arbitrary time intervals:
i) When an nth connection request is issued, the (n-1)th request could have resulted in a worker closure, followed by server shutdown. This results in failure (a)
ii) When an nth send request is issued, the (n-1)th request could have resulted in a worker closure, followed by server shutdown. This results in failure (b)
iii) Even when (i) and (ii) are false, the server socket could be closed in between several TCP protocol message transfers which constitute the connect or send. This results in failure (c).

Tweaking test case further, I see that even sending just two messages to the worker is sufficient to cause (c). More interestingly, as the error type (c) is coming from the TCP stack, detached from the net connect, send APIs, their error callbacks are incapable of catching this error. This essentially means that we cannot reliably pass messages, or suppress the errors.

Any suggestions to improve the test case is welcome.

Trott added a commit to Trott/io.js that referenced this issue Nov 24, 2015
Trott added a commit to Trott/io.js that referenced this issue Nov 25, 2015
See nodejs#3635 for details and failure
examples.

Ref: nodejs#3635
PR-URL: nodejs#4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Trott added a commit that referenced this issue Dec 1, 2015
See #3635 for details and failure
examples.

Ref: #3635
PR-URL: #4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Trott added a commit that referenced this issue Dec 4, 2015
See #3635 for details and failure
examples.

Ref: #3635
PR-URL: #4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Trott added a commit that referenced this issue Dec 5, 2015
See #3635 for details and failure
examples.

Ref: #3635
PR-URL: #4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Trott added a commit that referenced this issue Dec 17, 2015
See #3635 for details and failure
examples.

Ref: #3635
PR-URL: #4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Trott added a commit that referenced this issue Dec 23, 2015
See #3635 for details and failure
examples.

Ref: #3635
PR-URL: #4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@evanlucas

This comment has been minimized.

Copy link
Member

commented Feb 2, 2016

@rvagg

This comment has been minimized.

@santigimeno

This comment has been minimized.

Copy link
Member

commented Feb 4, 2016

Could you paste the errors? Thanks

@Trott

This comment has been minimized.

Copy link
Member

commented Feb 4, 2016

Oh, right, the CI is locked for now. It's just this:

not ok 7 test-child-process-fork-regr-gh-2847.js
# TIMEOUT
  ---
  duration_ms: 120.61
Trott added a commit to Trott/io.js that referenced this issue Feb 4, 2016
A few tests have started failing on Raspberry Pi devices in CI.
https://ci.nodejs.org/job/node-test-binary-arm/943/

Ref: nodejs#4830
Ref: nodejs#3635
Ref: nodejs#4526
@santigimeno

This comment has been minimized.

Copy link
Member

commented Feb 5, 2016

It reproduces more or less consistently in my OS X box. I'll try to look into it

@santigimeno

This comment has been minimized.

Copy link
Member

commented Feb 6, 2016

I've tried fixing this. See #5121

stefanmb added a commit to stefanmb/node that referenced this issue Feb 23, 2016
Windows is still sometimes failing with ECONNRESET. Bring back the
handling of this error as was initially introduced in PR nodejs#4442.

PR-URL: nodejs#5179
Reviewed-By: Rich Trott <rtrott@gmail.com>
Fixes: nodejs#3635
MylesBorins added a commit that referenced this issue Mar 2, 2016
A few tests have started failing on Raspberry Pi devices in CI.
https://ci.nodejs.org/job/node-test-binary-arm/943/

PR-URL: #5082
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Roman Klauke <romaaan.git@gmail.com>
Ref: #4830
Ref: #3635
Ref: #4526
MylesBorins added a commit that referenced this issue Mar 2, 2016
The test would sometimes time out on some platforms. Take the initial
version of the test and instead of sending 100 handles, just send 2.
It still fails when run with the code before the change was introduced
and passes afterwards.

Remove test from parallel.status.

PR-URL: #5121
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Fixes: #3635
MylesBorins added a commit that referenced this issue Mar 2, 2016
Windows is still sometimes failing with ECONNRESET. Bring back the
handling of this error as was initially introduced in PR #4442.

PR-URL: #5179
Reviewed-By: Rich Trott <rtrott@gmail.com>
Fixes: #3635
scovetta pushed a commit to scovetta/node that referenced this issue Apr 2, 2016
See nodejs#3635 for details and failure
examples.

Ref: nodejs#3635
PR-URL: nodejs#4005
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
scovetta pushed a commit to scovetta/node that referenced this issue Apr 2, 2016
A few tests have started failing on Raspberry Pi devices in CI.
https://ci.nodejs.org/job/node-test-binary-arm/943/

PR-URL: nodejs#5082
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Roman Klauke <romaaan.git@gmail.com>
Ref: nodejs#4830
Ref: nodejs#3635
Ref: nodejs#4526
@refack refack referenced this issue Apr 27, 2017
3 of 3 tasks complete
refack added a commit to refack/node that referenced this issue May 19, 2017
According to the explanation in nodejs#3635#issuecomment-157714683
And as a continuation to nodejs#5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: nodejs#12698
Fixes: nodejs#10286
Refs: nodejs#3635 (comment)
Refs: nodejs#5178
Refs: nodejs#5179
Refs: nodejs#4005
Refs: nodejs#5121
Refs: nodejs#5422
Refs: nodejs#12621 (comment)
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
MylesBorins added a commit that referenced this issue Jul 17, 2017
According to the explanation in #3635#issuecomment-157714683
And as a continuation to #5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: #12698
Fixes: #10286
Refs: #3635 (comment)
Refs: #5178
Refs: #5179
Refs: #4005
Refs: #5121
Refs: #5422
Refs: #12621 (comment)
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
gireeshpunathil added a commit to gireeshpunathil/node that referenced this issue May 26, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
@gireeshpunathil gireeshpunathil referenced this issue May 26, 2018
2 of 4 tasks complete
gireeshpunathil added a commit to gireeshpunathil/node that referenced this issue May 27, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker send scenario too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
gireeshpunathil added a commit to gireeshpunathil/node that referenced this issue May 27, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
gireeshpunathil added a commit that referenced this issue May 29, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: #3635 (comment)
Fixes: #20836
PR-URL: #20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue May 30, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue May 30, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
addaleax added a commit that referenced this issue May 31, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: #3635 (comment)
Fixes: #20836
PR-URL: #20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue Jun 5, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue Jun 9, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue Jun 11, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue Jun 12, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
shisama added a commit to shisama/node that referenced this issue Jun 17, 2018
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.

Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.

The only difference is that the filter is applied at the callback
instead of at the worker's error listener.

Refs: nodejs#3635 (comment)
Fixes: nodejs#20836
PR-URL: nodejs#20973

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.