# test: ignore spurious 'EMFILE' #12698

Merged
merged 1 commit into from May 19, 2017

## Conversation

Member

### refack commented Apr 27, 2017 • edited

As discussed in most recently on #10286 test/parallel/test-child-process-fork-regr-gh-2847.js is a little flaky
Last sighting was

c:\workspace\node-test-binary-windows\RUN_SUBSET\1\VS_VERSION\vcbt2015\label\win10\test\parallel\test-child-process-fork-regr-gh-2847.js:61
throw err;
^

Error: write EMFILE
at exports._errnoException (util.js:1026:11)
at ChildProcess.target._send (internal/child_process.js:663:20)
at ChildProcess.target.send (internal/child_process.js:547:19)
at Worker.send (internal/cluster/worker.js:54:28)
at Socket.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\1\VS_VERSION\vcbt2015\label\win10\test\parallel\test-child-process-fork-regr-gh-2847.js:33:14)
at Object.onceWrapper (events.js:312:19)
at emitNone (events.js:105:13)
at Socket.emit (events.js:207:7)
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1102:10)


Possible explanation #3635 (comment)
So as a follow up to #5422 I suggest we also ignore EMFILE raised from the connection.

Ref: #3635
Ref: #5178
Ref: #5179
Ref: #4005
Ref: #5121
Ref: #5422
Ref: #12621 (comment)
Fixes: #10286

##### Checklist
• make -j4 test (UNIX), or vcbuild test (Windows) passes
• tests and/or benchmarks are included
• commit message follows commit guidelines

test

Member Author

Member

reviewed
Member

### Trott left a comment

 Without a reasonable explanation as to why EMFILE is being raised, I'm not thrilled with the idea of adding it just to make the test pass. The other errors listed make sense to me based on the comment above, but EMFILE seems surprising. This could just be my own ignorance....
Member Author

### refack commented Apr 27, 2017 • edited

 Sorry, I missed adding that reference: #3635 (comment) P.S. The test passes most of the time without this change, so IMHO it's not forcing it to pass, but making is less flaky.
requested changes
Member

### gibfahn left a comment

 @@ -37,7 +37,8 @@ const server = net.createServer(function(s) { // is still happening while the server has been closed. s.on('error', function(err) { if ((err.code !== 'ECONNRESET') && ((err.code !== 'ECONNREFUSED'))) { (err.code !== 'ECONNREFUSED') && (err.code !== 'EMFILE')) {

#### gibfahn Apr 27, 2017

Member

@refack could you put a brief two line explanation of why this error occurs in a comment above? When reading through tests mysteriously ignored errors are always a bit of code smell.

Author Member

ack.
PTAL

Member Author

### refack commented Apr 27, 2017

 @Trott @gibfahn I tried to document both the original purpose of this test, and the possible causes for the "EMFILE". test/parallel/test-child-process-fork-regr-gh-2847.js has a very long history of being flaky. As far as I can tell it's original purpose is to generate a very specific condition that pre #2847 used to cause a segfault, because this is based on asynchronous tasks, sometimes we just hit spurious connection errors. It specific error was last seen on Windows, but I don't know if this is a Windows only scenario.
Member Author

### refack commented Apr 27, 2017

 This exclusion would be a continuation to #5422 /cc @santigimeno
Member

### Trott commented Apr 27, 2017

 Let's see if a stress test can reproduce the error (or other errors) and if load on the machine matters at all. If the answer here is "move to sequential", I'd rather do that than have a bunch of exceptions in the code. Maybe we can even remove an exception or two if we go that route. Or not. Let's see... Stress test on win10 with just one test running at a time: https://ci.nodejs.org/job/node-stress-single-test/1180/nodes=win10/console Stress test on win10 with 100 tests running at a time: https://ci.nodejs.org/job/node-stress-single-test/1181/nodes=win10/console
Member

### Trott commented Apr 28, 2017 • edited

 Stress tests seem to show that EMFILE happens under load. Looks like moving the test to sequential may be the way to go therefore? Although I'd be interested in understanding why load causes EMFILE here. Maybe it's obvious once I look at the test, but I gotta log off right now....
Member Author

### refack commented Apr 28, 2017 • edited

 Maybe windows is actually running out of file handles? More probably high load causes the scheduler to be more aggressive, hence races. Is there a way to test is sequentially but under external synthetic load? IMHO if we want to just keep the original intention of the test (protect from regressing #2847) we could ignore all errors, except a SEGFAULT. If you think this test is useful for covering other cases, then it's worth the extra time...
Member

### Trott commented Apr 28, 2017

 Hope I'm not being presumptuous or anything, but on my own fork/branch: I removed all the guard code except the one that I"m sure is needed, moved to sequential, opened #12713 but labeled it in progress and started a CI stress test using the ALL label at https://ci.nodejs.org/job/node-stress-single-test/1182/ to see if any other guard code needs to be restored. I'm in favor of having this be specific to that regression. I like my tests narrowly focused, especially ones designed for oddball regressions. :-D
Member Author

### refack commented Apr 28, 2017

 Why not have them both, one sequential and unfiltered, and one parallel with the filters? All well documented of course.
Member

### gibfahn commented Apr 28, 2017

 Why not have them both, one sequential and unfiltered, and one parallel with the filters? All well documented of course. Why would we? Moving a test to sequential just means it can't be run in parallel, which is a minor annoyance as it makes our test runs slightly slower. Duplicating the test seems unnecessary. Is there a way to test is sequentially but under external synthetic load? I don't think we have a built in way to do that, I normally just build Node in the background, seems to do the trick (not the Node I'm testing obviously 😁 ). Maybe running benchmarks at the same time would provide enough load?
Member Author

### refack commented Apr 28, 2017

 @gibfahn sorry if I'm not RTFM but how do I trigger a stress test (like on the CI) locally?
Member

### gibfahn commented Apr 28, 2017 • edited

 @refack tools/test.py --repeat 1000 parallel will run the parallel test suite 1000 times. Probably worth doing something like: tools/test.py --repeat 1000 -p tap parallel/test-child-process-fork-regr-gh-2847 | tee out.tap and then grepping for not ok. You could use Powershell's Tee-Object on Windows.
Member Author

### refack commented Apr 28, 2017

 Why would we? Moving a test to sequential just means it can't be run in parallel, which is a minor annoyance as it makes our test runs slightly slower. Duplicating the test seems unnecessary. @gibfahn: @Trott's premise is that in sequential mode the exception exclusions are unnecessary (#12713). Although they are obviously needed in parallel mode. So having them tested in both modes tests this scenario under two different conditions. "Can't have too much testing"? Obviously these reasoning should be well documented within the test. P.S. @Trott you had a typo (sequential}/test-child-process-fork-regr-gh-2847) so I'm re-runing: https://ci.nodejs.org/job/node-stress-single-test/1186/
Member

### gibfahn commented Apr 28, 2017

 "Can't have too much testing"? I disagree. Every test should be testing something useful, otherwise you end up with a huge backlog of tests, and it's not sure what a test is actually for or why it failed. So having them tested in both modes tests this scenario under two different conditions. But EMFILE is only thrown very occasionally, and by ignoring this error as well we're not actually testing the EMFILE exception. If we think we should see this EMFILE error under some conditions, then we should try to get a test that reliably reproduces it, and the test for that probably doesn't need to also test this regression.
Member

### Trott commented Apr 28, 2017 • edited

 @refack tools/test.py --repeat 1000 parallel will run the parallel test suite 1000 times. To stress test a parallel test locally, I also use -j to specify how many tests to run simultaneously. \$ tools/test.py --repeat=92 -j 92 parallel/test-foo-bar

### Trott dismissed their stale reviewApr 28, 2017

Just got EMFILE in a sequential run, and reading the explanation linked above, it seems to be an unexpected if infrequent thing to happen in this test. Clearing my objection.

Member

### Trott commented Apr 28, 2017

 Argh, can't edit the above (I don't think) but I meant to say expected if infrequent rather than unexpected if infrequent.
reviewed
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -35,9 +40,14 @@ const server = net.createServer(function(s) { // Errors can happen if this connection // is still happening while the server has been closed. // https://github.com/nodejs/node/issues/3635#issuecomment-157714683 // If a previous request caused the worker __and__ the server to close // a following "send" request could emit EMFILE, this is a sporius

#### gibfahn Apr 28, 2017

Member

sporius -> spurious

reviewed
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -35,9 +40,14 @@ const server = net.createServer(function(s) { // Errors can happen if this connection

#### gibfahn Apr 28, 2017 • edited

Member

Maybe:
- ECONNREFUSED or ECONNRESET errors can happen if this connection

reviewed
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -35,9 +40,14 @@ const server = net.createServer(function(s) { // Errors can happen if this connection // is still happening while the server has been closed. // https://github.com/nodejs/node/issues/3635#issuecomment-157714683 // If a previous request caused the worker __and__ the server to close

#### gibfahn Apr 28, 2017

Member

- If a previous request...

Basically can it be really clear which explanation is for which errors?

reviewed
Member

### gibfahn left a comment

 Mostly LGTM modulo nits
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -1,3 +1,7 @@ /* Description: * Before #2847 a child process trying (asynchronously) to use a the closed

#### gibfahn Apr 28, 2017

Member

#2847 -> https://github.com/nodejs/node/pull/2847

a the -> a

test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -1,3 +1,7 @@ /* Description: * Before #2847 a child process trying (asynchronously) to use a the closed * channel to it's creator, would have caused a segfault.

#### gibfahn Apr 28, 2017

Member

to connect to ?

reviewed
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -1,3 +1,7 @@ /* Description:

#### gibfahn Apr 28, 2017

Member

Probably don't need the Description:

Member Author

### refack commented Apr 28, 2017

approved these changes
Member

### santigimeno left a comment

 LGTM with one nit
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 // is still happening while the server has been closed. // https://github.com/nodejs/node/issues/3635#issuecomment-157714683 // ECONNREFUSED or ECONNRESET errors can happen if this connection can // happen if this connection is still happening while the server has closed.

#### santigimeno Apr 28, 2017

Member

can happen if this connection is repeated twice

approved these changes
Member

### jasnell left a comment

 Generally LGTM with a couple of nits
test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 * Before https://github.com/nodejs/node/pull/2847 a child process * trying (asynchronously) to use the closed channel to it's creator * would have caused a segfault. */

#### jasnell Apr 28, 2017

Member

Comment block should use the typical // style.

test/parallel/test-child-process-fork-regr-gh-2847.js Outdated
 @@ -6,6 +11,7 @@ const assert = require('assert'); const cluster = require('cluster'); const net = require('net');

#### jasnell Apr 28, 2017

Member

Unnecessary whitespace change

Member Author

### refack commented Apr 28, 2017

 Comments fixed, PTAL. [joke] You are all a bunch of nit-pickers [/joke] JK. Love you all 💌
reviewed
Member

### gibfahn left a comment

 You are all a bunch of nit-pickers Speaking of which... (I see it as a game, can you minimize the number of nits you'll get per PR)
 @@ -1,3 +1,5 @@ // Before https://github.com/nodejs/node/pull/2847 a child process trying // (asynchronously) to use the closed channel to it's creator caused a segfault.

#### gibfahn Apr 28, 2017

Member

it's -> its

 @@ -1,3 +1,5 @@ // Before https://github.com/nodejs/node/pull/2847 a child process trying

#### gibfahn Apr 28, 2017

Member

Sorry, should be an issue, so:

https://github.com/nodejs/node/pull/2847->https://github.com/nodejs/node/issues/2847

Also can you order this as in the guide? Should look like:

'use strict';
const common = require('../common');

// Before https://github.com/nodejs/node/issues/2847 a child process trying
// (asynchronously) to use the closed channel to it's creator caused a segfault.

const assert = require('assert');
approved these changes

### refackforce-pushed the refack:more-for-10286 branch 3 times, most recently Apr 28, 2017

approved these changes
 test: ignore spurious 'EMFILE' 
According to the explanation in #3635#issuecomment-157714683
And as a continuation to #5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: #12698
Fixes: #10286
Refs: #3635 (comment)
Refs: #5178
Refs: #5179
Refs: #4005
Refs: #5121
Refs: #5422
Refs: #12621 (comment)
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
 6f21671 

Member Author

### refack commented May 19, 2017

 Landed in 6f21671
Member Author

### refack commented May 19, 2017

 Post land CI: https://ci.nodejs.org/job/node-test-commit/10004/

### refack deleted the refack:more-for-10286 branch May 19, 2017

referenced this pull request May 28, 2017
referenced this pull request Jun 15, 2017
added a commit that referenced this pull request Jul 17, 2017
 test: ignore spurious 'EMFILE' 
According to the explanation in #3635#issuecomment-157714683
And as a continuation to #5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: #12698
Fixes: #10286
Refs: #3635 (comment)
Refs: #5178
Refs: #5179
Refs: #4005
Refs: #5121
Refs: #5422
Refs: #12621 (comment)
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
 e279eb5 
Member

### MylesBorins commented Jul 17, 2017

 I've landed this on v6.x... please lmk if this was a mistake

### MylesBorins added the land-on-v6.x label Jul 17, 2017

referenced this pull request Jul 18, 2017
referenced this pull request Sep 8, 2017