Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky child-process-exec-kill-throws #12111

Closed
wants to merge 1 commit into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Mar 29, 2017

This is a fix for test-child-process-exec-kill-throws which is currently
flaky on Windows.

A bug in the test was causing the child process to fail for reasons
other than those intended by the test. Instead of failing for exceeding
the maxBuffer setting, the test was failing because it was trying to
load internal/child_process without being passed the
expose-internals flag. Move that module to where only the parent
process (which gets the flag) loads it.

Additionally, improve an assertion message to help debug problems like
this.

Fixes: #12053

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

test child_process

@Trott Trott added child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests. labels Mar 29, 2017
@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

Stress test on master which should show failures: https://ci.nodejs.org/job/node-stress-single-test/1147/nodes=win-vs2015/console

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

Stress test on this PR (should show no failures): https://ci.nodejs.org/job/node-stress-single-test/1148/nodes=win-vs2015/console

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

If the CI and stress tests come up as expected, I would propose that we land this sooner than the 48 hours we normally wait. (I won't argue, though, if someone feels it should wait.)

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

@nodejs/testing

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

Hmmm, CI is not looking good. Looks like it might be failing because something (maybe this test somehow) is leaving around leftover child processes? Will have to look more closely...

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

Confirmed that now that the initial bug in the test is fixed, it now does not reliably terminate the child process. :-|

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

(On the upside, the stress tests are looking good.)

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

Adding a listener for the exit event on the child process. Hoping that fixes everything by preventing the parent process from exiting before it has actually sent the signal. (Perhaps it bails early due to the monkey-patched error?) I might be reaching here, but let's try CI and see:

CI: https://ci.nodejs.org/job/node-test-pull-request/7083/

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

sigh Nope, that didn't fix the "child process not actually terminating" issue...

@santigimeno
Copy link
Member

I think the problem here is that killing the child process will only kill the shell process that executes the nodejs child process leaving the child process orphan and alive. TBH I don't know how kill could work in this scenario. Any ideas?

@bzoz
Copy link
Contributor

bzoz commented Mar 29, 2017

If @santigimeno is right, then try using spawn instead of exec. The spawn function does not create a shell by default.

@gibfahn
Copy link
Member

gibfahn commented Mar 29, 2017

@bzoz Given that the test is called child-process-exec-kill-throws, I'd assume changing to spawn() would defeat the purpose of the test. See the original PR: #11038.

cc/ @cjihrig

@gibfahn
Copy link
Member

gibfahn commented Mar 29, 2017

@Trott this seems like it could be related to #11052, also a child-process-exec test that has issues on windows.

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

@Trott this seems like it could be related to #11052, also a child-process-exec test that has issues on windows.

Possibly, but this isn't Windows-specific. The bug in the test that this fixes was Windows specific. The new issue it exposes now that bug is fixed.. that's on all sorts of platforms. Check the CI results.

@Trott
Copy link
Member Author

Trott commented Mar 29, 2017

(In fact, the current not-cleaning-up-after-itself issue doesn't affect the Windows runs on the last CI, so it may not affect Windows at all!)

This is a fix for test-child-process-exec-kill-throws which is currently
flaky on Windows.

A bug in the test was causing the child process to fail for reasons
other than those intended by the test. Instead of failing for exceeding
the `maxBuffer` setting, the test was failing because it was trying to
load `internal/child_process` without being passed the
`expose-internals` flag. Move that module to where only the parent
process (which gets the flag) loads it.

Additionally, improve an assertion message to help debug problems like
this.

Fixes: nodejs#12053
@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

Unfortunately, I don't think changing to spawn() will work because this test is designed to add coverage for two lines that get run with exec() but now spawn().

I don't think we need a process that runs forever. We just need a process that errors out. So I'm removing the setInterval() and I think this should do it.

@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

This looks much better! And it still exercises the relevant code.

Some Linux hosts in CI are struggling right now, but that's not related to this. I might run another CI again once that is cleared up, just to be cautious.

@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

(Also: I'm super-pleased that the Makefile changes from a few weeks ago caught a test that sometimes leaves stray processes around. Hooray!)

@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

Oh, right, need to do a Windows stress test too:

https://ci.nodejs.org/job/node-stress-single-test/1149/nodes=win-vs2015/console

@Trott
Copy link
Member Author

Trott commented Mar 30, 2017

CI is clean except for one unrelated build failure.

Stress test is still running, but looks great so far (1200 successes and 0 failures so far).

Trott added a commit to Trott/io.js that referenced this pull request Mar 31, 2017
This is a fix for test-child-process-exec-kill-throws which is currently
flaky on Windows.

A bug in the test was causing the child process to fail for reasons
other than those intended by the test. Instead of failing for exceeding
the `maxBuffer` setting, the test was failing because it was trying to
load `internal/child_process` without being passed the
`expose-internals` flag. Move that module to where only the parent
process (which gets the flag) loads it.

Additionally, improve an assertion message to help debug problems like
this.

PR-URL: nodejs#12111
Fixes: nodejs#12053
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
@Trott
Copy link
Member Author

Trott commented Mar 31, 2017

Landed in a10e657

@Trott Trott closed this Mar 31, 2017
@jasnell jasnell mentioned this pull request Apr 4, 2017
italoacasas pushed a commit to italoacasas/node that referenced this pull request Apr 10, 2017
This is a fix for test-child-process-exec-kill-throws which is currently
flaky on Windows.

A bug in the test was causing the child process to fail for reasons
other than those intended by the test. Instead of failing for exceeding
the `maxBuffer` setting, the test was failing because it was trying to
load `internal/child_process` without being passed the
`expose-internals` flag. Move that module to where only the parent
process (which gets the flag) loads it.

Additionally, improve an assertion message to help debug problems like
this.

PR-URL: nodejs#12111
Fixes: nodejs#12053
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
@italoacasas italoacasas mentioned this pull request Apr 10, 2017
2 tasks
@Trott Trott deleted the taco-tuesday branch January 13, 2022 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix flaky parallel/test-child-process-exec-kill-throws
6 participants