Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test-http2-large-file is flaky #47409

Open
MoLow opened this issue Apr 4, 2023 · 14 comments
Open

test-http2-large-file is flaky #47409

MoLow opened this issue Apr 4, 2023 · 14 comments
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI.

Comments

@MoLow
Copy link
Member

MoLow commented Apr 4, 2023

Test

test-http2-large-file

Platform

AIX

Console output

22:30:07 not ok 3904 sequential/test-http2-large-file
22:30:07   ---
22:30:07   duration_ms: 120107.72300
22:30:07   severity: fail
22:30:07   exitcode: -15
22:30:07   stack: |-
22:30:07     timeout
22:30:07   ...

Build links

Additional information

No response

@MoLow MoLow added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Apr 4, 2023
@github-actions github-actions bot added the aix Issues and PRs related to the AIX platform. label Apr 4, 2023
@MoLow
Copy link
Member Author

MoLow commented Apr 4, 2023

I took https://ci.nodejs.org/computer/test-osuosl-aix72-ppc64_be-2/ offline to investigate

@MoLow
Copy link
Member Author

MoLow commented Apr 4, 2023

on AIX tools/test.py sequential/test-http2-large-file --timeout 30 --repeat 50 reproduces easily
however, this change causes the test to be very stable and I have no idea why:

diff --git a/test/sequential/test-http2-large-file.js b/test/sequential/test-http2-large-file.js
index 2f2cc2c80d..3afcec73f4 100644
--- a/test/sequential/test-http2-large-file.js
+++ b/test/sequential/test-http2-large-file.js
@@ -24,6 +24,7 @@ server.listen(0, common.mustCall(() => {
                                { settings: { initialWindowSize: 6553500 } });
   const request = client.request({ ':method': 'POST' });
   function writeChunk() {
+    console.log({ remaining });
     if (remaining > 0) {
       remaining -= chunkLength;
       request.write(chunk, writeChunk);

@Trott
Copy link
Member

Trott commented Apr 4, 2023

on AIX tools/test.py sequential/test-http2-large-file --timeout 30 --repeat 50 reproduces easily however, this change causes the test to be very stable and I have no idea why:

diff --git a/test/sequential/test-http2-large-file.js b/test/sequential/test-http2-large-file.js
index 2f2cc2c80d..3afcec73f4 100644
--- a/test/sequential/test-http2-large-file.js
+++ b/test/sequential/test-http2-large-file.js
@@ -24,6 +24,7 @@ server.listen(0, common.mustCall(() => {
                                { settings: { initialWindowSize: 6553500 } });
   const request = client.request({ ':method': 'POST' });
   function writeChunk() {
+    console.log({ remaining });
     if (remaining > 0) {
       remaining -= chunkLength;
       request.write(chunk, writeChunk);

There's a long history of console.log() statements causing tests to succeed that otherwise fail because they are blocking or otherwise causes an I/O delay that "fixes" a race condition. I can probably find an issue from 2015 about this!

@MoLow
Copy link
Member Author

MoLow commented Apr 4, 2023

it does not seem to be a race condition, this did not fix the issue:

diff --git a/test/sequential/test-http2-large-file.js b/test/sequential/test-http2-large-file.js
index 2f2cc2c80d..300e0642ab 100644
--- a/test/sequential/test-http2-large-file.js
+++ b/test/sequential/test-http2-large-file.js
@@ -24,12 +24,14 @@ server.listen(0, common.mustCall(() => {
                                { settings: { initialWindowSize: 6553500 } });
   const request = client.request({ ':method': 'POST' });
   function writeChunk() {
-    if (remaining > 0) {
+   setTimeout(() => {
+   if (remaining > 0) {
       remaining -= chunkLength;
       request.write(chunk, writeChunk);
     } else {
       request.end();
     }
+   }, 10);

@Trott
Copy link
Member

Trott commented Apr 4, 2023

I forget (or maybe never knew!) the details, but console.log() has weird side effects that cause I/O flushing or other things and the side effects vary from platform to platform. Maybe someone on @nodejs/console or @nodejs/libuv would know more. If I recall correctly, it blocks on some platforms but not on others.

@MoLow
Copy link
Member Author

MoLow commented Apr 4, 2023

process.stdout.write(''); does not make the issue go await either so it is definitely very specific to console.log

@richardlau
Copy link
Member

@miladfarca has been investigating this as an upstream V8 issue: https://bugs.chromium.org/p/v8/issues/detail?id=13902

@MoLow
Copy link
Member Author

MoLow commented Apr 4, 2023

@targos
Copy link
Member

targos commented Jan 11, 2024

Happened on Windows and Linux with https://ci.nodejs.org/job/node-test-pull-request/56726/

@targos
Copy link
Member

targos commented Jan 11, 2024

The V8 issue talks about disabling concurrent sparkplug to fix it temporarily but we already do it: 7071c1d

Related:

@targos
Copy link
Member

targos commented Jan 23, 2024

This really happens a lot lately. I opened #51549

From nodejs/reliability#772:

Reason sequential/test-http2-large-file
Type JS_TEST_FAILURE
Failed PR 17 (https://github.com/nodejs/node/pull/45427/, https://github.com/nodejs/node/pull/51401/, https://github.com/nodejs/node/pull/51410/, https://github.com/nodejs/node/pull/50960/, https://github.com/nodejs/node/pull/51422/, https://github.com/nodejs/node/pull/51431/, https://github.com/nodejs/node/pull/51412/, https://github.com/nodejs/node/pull/51352/, https://github.com/nodejs/node/pull/51172/, https://github.com/nodejs/node/pull/51446/, https://github.com/nodejs/node/pull/51459/, https://github.com/nodejs/node/pull/51044/, https://github.com/nodejs/node/pull/51437/, https://github.com/nodejs/node/pull/51461/, https://github.com/nodejs/node/pull/51456/, https://github.com/nodejs/node/pull/51455/, https://github.com/nodejs/node/pull/51472/)
Appeared test-rackspace-win2022_vs2022-x64-1, test-azure_msft-win11_vs2022-x64-3, test-rackspace-win2019_vs2019-x64-1, test-rackspace-win2019_vs2019-x64-2, test-azure_msft-win10_vs2019-x64-1, test-rackspace-win2022_vs2022-x64-4, test-azure_msft-win11_vs2022-x64-1, test-rackspace-win2019_vs2019-x64-3, test-rackspace-win2022_vs2022-x64-6, test-azure_msft-win11_vs2022-x64-2, test-digitalocean-fedora38-x64-1, test-digitalocean-ubuntu2204-x64-2, test-rackspace-win2019_vs2019-x64-4
First CI https://ci.nodejs.org/job/node-test-pull-request/56712/
Last CI https://ci.nodejs.org/job/node-test-pull-request/56791/
Example
not ok 950 sequential/test-http2-large-file
  ---
  duration_ms: 120192.99600
  severity: fail
  exitcode: 1
  stack: |-
    timeout
  ...

@shnooshnoo
Copy link

The first time it occurred in this build https://ci.nodejs.org/job/node-daily-master/3286/, so i guess something that landed around 7th of Jan might've caused this. From what I can see the difference is this commit 1a5acd0

nodejs-github-bot pushed a commit that referenced this issue Jan 25, 2024
Refs: #47409
PR-URL: #51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
marco-ippolito pushed a commit to marco-ippolito/node that referenced this issue Feb 2, 2024
Refs: nodejs#47409
PR-URL: nodejs#51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
rdw-msft pushed a commit to rdw-msft/node that referenced this issue Feb 9, 2024
Refs: nodejs#47409
PR-URL: nodejs#51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
targos added a commit that referenced this issue Feb 15, 2024
Refs: #47409
PR-URL: #51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
marco-ippolito pushed a commit to marco-ippolito/node that referenced this issue Feb 19, 2024
Refs: nodejs#47409
PR-URL: nodejs#51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
@mhdawson
Copy link
Member

@mhdawson mhdawson removed the aix Issues and PRs related to the AIX platform. label Mar 25, 2024
richardlau pushed a commit that referenced this issue Mar 25, 2024
Refs: #47409
PR-URL: #51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
richardlau pushed a commit that referenced this issue Mar 25, 2024
Refs: #47409
PR-URL: #51549
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
@mhdawson
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI.
Projects
None yet
Development

No branches or pull requests

6 participants