Investigate test-inspector-multisession-ws #34730

rexagod · 2020-08-11T17:26:17Z

Encountered this today in node-test-commit for #33636. Looks rare, opening an issue to keep track of this.

Platform: osx
Test: parallel/test-inspector-multisession-ws
Error log:

00:15:17.669 not ok 1417 parallel/test-inspector-multisession-ws
00:15:17.669   ---
00:15:17.669   duration_ms: 15.407
00:15:17.670   severity: fail
00:15:17.670   exitcode: 1
00:15:17.670   stack: |-
00:15:17.670     [test] Connecting to a child Node process
00:15:17.670     [test] Testing /json/list
00:15:17.670     [test] Connecting to a child Node process
00:15:17.670     [test] Testing /json/list
00:15:17.671     [err] Debugger listening on ws://127.0.0.1:53830/43828c02-dd89-4c4f-991b-a038fcc9aae3
00:15:17.671     [err] For help, see: https://nodejs.org/en/docs/inspector
00:15:17.671     [err] 
00:15:17.671     [err] Debugger attached.
00:15:17.671     [err] Debugger attached.
00:15:17.671     [err] 
00:15:17.671     [test] Breaking in code and verifying events are fired
00:15:17.671     Timed out waiting for matching notification (Initial pause))
00:15:17.671     1
00:15:17.671   ...

Ref: https://ci.nodejs.org/job/node-test-commit-osx/35693/nodes=osx1015/testReport/junit/(root)/test/parallel_test_inspector_multisession_ws/

The text was updated successfully, but these errors were encountered:

mhdawson · 2020-10-09T14:50:44Z

Occurred in CI run today as well: https://ci.nodejs.org/job/node-test-commit-osx/36606/

Trott · 2021-04-02T13:44:47Z

https://ci.nodejs.org/job/node-test-commit-osx/39832/nodes=osx1015/console

00:08:19 not ok 1525 parallel/test-inspector-multisession-ws
00:08:19   ---
00:08:19   duration_ms: 15.438
00:08:19   severity: fail
00:08:19   exitcode: 1
00:08:19   stack: |-
00:08:19     [test] Connecting to a child Node process
00:08:19     [test] Testing /json/list
00:08:19     [test] Connecting to a child Node process
00:08:19     [test] Testing /json/list
00:08:19     [err] Debugger listening on ws://127.0.0.1:51733/6ea46706-8716-4a1b-96df-4fa4dad81946
00:08:19     [err] For help, see: https://nodejs.org/en/docs/inspector
00:08:19     [err] 
00:08:19     [err] Debugger attached.
00:08:19     [err] Debugger attached.
00:08:19     [err] 
00:08:19     [test] Breaking in code and verifying events are fired
00:08:19     Timed out waiting for matching notification (Initial pause)
00:08:19     1
00:08:19   ...

Trott · 2021-04-02T14:18:34Z

I'm able to reproduce this locally with tools/test.py --repeat 1024 test/parallel/test-inspector-multisession-ws.js It seems to fail once every few hundred runs.

Trott · 2021-04-14T11:59:40Z

https://ci.nodejs.org/job/node-test-commit-osx/40090/nodes=osx1015/console

00:08:09 not ok 1516 parallel/test-inspector-multisession-ws
00:08:09   ---
00:08:09   duration_ms: 15.585
00:08:09   severity: fail
00:08:09   exitcode: 1
00:08:09   stack: |-
00:08:09     [test] Connecting to a child Node process
00:08:09     [test] Testing /json/list
00:08:09     [test] Connecting to a child Node process
00:08:09     [test] Testing /json/list
00:08:09     [err] Debugger listening on ws://127.0.0.1:61299/45734d86-4ad0-4fa4-87c9-b5db3330abd0
00:08:09     [err] For help, see: https://nodejs.org/en/docs/inspector
00:08:09     [err] 
00:08:09     [err] Debugger attached.
00:08:09     [err] Debugger attached.
00:08:09     [err] 
00:08:09     [test] Breaking in code and verifying events are fired
00:08:09     Timed out waiting for matching notification (Initial pause)
00:08:09     1
00:08:09   ...

Trott · 2021-04-15T05:39:51Z

It's pausing and stalling here:

node/test/common/inspector-helper.js

Lines 246 to 247 in c3a5e15

    
           notification = await new Promise( 
        
             (resolve) => this._notificationCallback = resolve);

tniessen · 2023-03-30T20:25:53Z

Another failure: https://ci.nodejs.org/job/node-test-commit-osx/51271/nodes=osx1015/

20:24:52 not ok 1709 parallel/test-inspector-multisession-ws
20:24:52   ---
20:24:52   duration_ms: 15.456
20:24:52   severity: fail
20:24:52   exitcode: 1
20:24:52   stack: |-
20:24:52     [test] Connecting to a child Node process
20:24:52     [test] Testing /json/list
20:24:52     [test] Connecting to a child Node process
20:24:52     [test] Testing /json/list
20:24:52     [err] Debugger listening on ws://127.0.0.1:58810/6f508bdc-caab-4c65-9e0c-bcfe2963e1e3
20:24:52     [err] For help, see: https://nodejs.org/en/docs/inspector
20:24:52     [err] 
20:24:52     [err] Debugger attached.
20:24:52     [err] Debugger attached.
20:24:52     [err] 
20:24:52     [test] Breaking in code and verifying events are fired
20:24:52     Timed out waiting for matching notification (Initial pause)
20:24:52     1
20:24:52   ...

mmarchini · 2023-06-26T18:22:03Z

Same error seems to be happening on other tests:

test/parallel/test-inspector-wait-for-connection.js (plus a few other runs)
test/parallel/test-inspector-async-stack-traces-set-interval.js

I was able to reproduce test/parallel/test-inspector-wait-for-connection.js locally somewhat consistently by doing tools/test.py --repeat 512 -j8 test/parallel/test-inspector-wait-for-connection.js. The higher the parallelism, the more frequently it seems to happen.

After some digging it looks like (at least for test/parallel/test-inspector-wait-for-connection.js) console.log('before wait for debugger') is called on the child process, then everything is executed on the parent process, causing Runtime.runIfWaitingForDebugger to be sent before the child process can call inspector.waitForDebugger(). Since there's no way for the child process to communicate to the parent process that it's waiting, this becomes non-determininstic and sometimes it fails.

I think the best way to fix these tests would be to add a NodeRuntime.waitingForDebugger and maybe also a NodeWorker.waitingForDebugger, both would be sent when the domain is enabled (if the runtime is waiting for debugger), or when inspector.waitForDebugger() is called. Seems like something that would be useful both for our internal tests as well as libraries/tools so that they don't need to rely on guessing if the runtime is waiting for the debugger or not. I'm happy to send a PR if that seems like a reasonable approach.

mmarchini · 2023-06-29T03:15:11Z

ok I think I have a fix for the flakiness, it's affecting quite a few inspector tests so I'm updating all of them before sending a PR

joyeecheung · 2023-11-16T21:59:08Z

The inspector test flakes are still low-key lurking in the CI. @mmarchini are you still working on it?

`NodeRuntime.waitingForDebugger` is a new Inspector Protocol event that will fire when the process being inspected is waiting for the debugger (for example, when `inspector.waitForDebugger()` is called). This allows inspecting processes to know when the inspected process is waiting for a `Runtime.runIfWaitingForDebugger` message to resume execution. It allows tooling to resume execution of the inspected process as soon as it deems necessary, without having to guess if the inspected process is waiting or not, making the workflow more deterministic. With a more deterministic workflow, it is possible to update Node.js core tests to avoid race conditions that can cause flakiness. Therefore, tests were also changed as following: * Remove no-op Runtime.runIfWaitingForDebugger from tests that don't need it * Use NodeRuntime.waitingForDebugger in all tests that need Runtime.runIfWaitingForDebugger, to ensure order of operations is predictable and correct * Simplify test-inspector-multisession-ws There might be value in adding `NodeWorker.waitingForDebugger` in a future patch, but as of right now, no Node.js core inspector tests using worker threads are not failing due to race conditions. Fixes: nodejs#34730

`NodeRuntime.waitingForDebugger` is a new Inspector Protocol event that will fire when the process being inspected is waiting for the debugger (for example, when `inspector.waitForDebugger()` is called). This allows inspecting processes to know when the inspected process is waiting for a `Runtime.runIfWaitingForDebugger` message to resume execution. It allows tooling to resume execution of the inspected process as soon as it deems necessary, without having to guess if the inspected process is waiting or not, making the workflow more deterministic. With a more deterministic workflow, it is possible to update Node.js core tests to avoid race conditions that can cause flakiness. Therefore, tests were also changed as following: * Remove no-op Runtime.runIfWaitingForDebugger from tests that don't need it * Use NodeRuntime.waitingForDebugger in all tests that need Runtime.runIfWaitingForDebugger, to ensure order of operations is predictable and correct * Simplify test-inspector-multisession-ws There might be value in adding `NodeWorker.waitingForDebugger` in a future patch, but as of right now, no Node.js core inspector tests using worker threads are not failing due to race conditions. Fixes: #34730 PR-URL: #51560 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>

`NodeRuntime.waitingForDebugger` is a new Inspector Protocol event that will fire when the process being inspected is waiting for the debugger (for example, when `inspector.waitForDebugger()` is called). This allows inspecting processes to know when the inspected process is waiting for a `Runtime.runIfWaitingForDebugger` message to resume execution. It allows tooling to resume execution of the inspected process as soon as it deems necessary, without having to guess if the inspected process is waiting or not, making the workflow more deterministic. With a more deterministic workflow, it is possible to update Node.js core tests to avoid race conditions that can cause flakiness. Therefore, tests were also changed as following: * Remove no-op Runtime.runIfWaitingForDebugger from tests that don't need it * Use NodeRuntime.waitingForDebugger in all tests that need Runtime.runIfWaitingForDebugger, to ensure order of operations is predictable and correct * Simplify test-inspector-multisession-ws There might be value in adding `NodeWorker.waitingForDebugger` in a future patch, but as of right now, no Node.js core inspector tests using worker threads are not failing due to race conditions. Fixes: nodejs#34730 PR-URL: nodejs#51560 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>

rexagod added flaky-test Issues and PRs related to the tests with unstable failures on the CI. macos Issues and PRs related to the macOS platform / OSX. labels Aug 11, 2020

mhdawson mentioned this issue Oct 9, 2020

Adds a fuzzer for LoadEnvironment #34844

Closed

1 task

lpinca mentioned this issue Oct 18, 2023

test-inspector-async-stack-traces-set-interval is flaky #50243

Closed

mmarchini mentioned this issue Jan 24, 2024

inspector: add NodeRuntime.waitingForDebugger event #51560

Merged

nodejs-github-bot closed this as completed in #51560 Feb 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate test-inspector-multisession-ws #34730

Investigate test-inspector-multisession-ws #34730

rexagod commented Aug 11, 2020 •

edited

mhdawson commented Oct 9, 2020

Trott commented Apr 2, 2021

Trott commented Apr 2, 2021

Trott commented Apr 14, 2021

Trott commented Apr 15, 2021

tniessen commented Mar 30, 2023

mmarchini commented Jun 26, 2023 •

edited

mmarchini commented Jun 29, 2023

joyeecheung commented Nov 16, 2023

Investigate test-inspector-multisession-ws #34730

Investigate test-inspector-multisession-ws #34730

Comments

rexagod commented Aug 11, 2020 • edited

mhdawson commented Oct 9, 2020

Trott commented Apr 2, 2021

Trott commented Apr 2, 2021

Trott commented Apr 14, 2021

Trott commented Apr 15, 2021

tniessen commented Mar 30, 2023

mmarchini commented Jun 26, 2023 • edited

mmarchini commented Jun 29, 2023

joyeecheung commented Nov 16, 2023

rexagod commented Aug 11, 2020 •

edited

mmarchini commented Jun 26, 2023 •

edited