
test/parallel/test-worker-http2-stream-terminate.js #43084

Closed

RaisinTen opened this issue May 13, 2022 · 11 comments · Fixed by #45438
Labels
async_hooks: Issues and PRs related to the async hooks subsystem.
flaky-test: Issues and PRs related to the tests with unstable failures on the CI.

Comments

@RaisinTen
Contributor

RaisinTen commented May 13, 2022

Test

test-worker-http2-stream-terminate

Platform

node-test-binary-windows-js-suites

Console output

21:46:33 not ok 773 parallel/test-worker-http2-stream-terminate
21:46:33   ---
21:46:33   duration_ms: 16.879
21:46:33   severity: crashed
21:46:33   exitcode: 3221225477
21:46:33   stack: |-
21:46:33     Error: async hook stack has become corrupted (actual: 5, expected: 28983)
21:46:33      1: 00007FF67A2BAD6F node_api_throw_syntax_error+174655
21:46:33      2: 00007FF67A2B2E3C node_api_throw_syntax_error+142092
21:46:33      3: 00007FF67A1B756A node::PromiseRejectCallback+3658
21:46:33      4: 00007FF67A2E07E3 v8::internal::compiler::Operator::EffectOutputCount+547
21:46:33      5: 00007FF67AC9419D v8::internal::Builtins::code+248653
21:46:33      6: 00007FF67AC93DA9 v8::internal::Builtins::code+247641
21:46:33      7: 00007FF67AC9406C v8::internal::Builtins::code+248348
21:46:33      8: 00007FF67AC93ED0 v8::internal::Builtins::code+247936
21:46:33      9: 000002503B14853D 
21:46:33   ...
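
For what it's worth, the Windows failure is a native crash rather than a JavaScript assertion: `severity: crashed` plus that exit code means the worker process died with an access violation. A quick check (3221225477 is the unsigned 32-bit form of NTSTATUS 0xC0000005, STATUS_ACCESS_VIOLATION):

```js
// The Windows exit code above, printed as hex:
// 3221225477 === 0xC0000005 (STATUS_ACCESS_VIOLATION), i.e. a native crash.
console.log((3221225477).toString(16).toUpperCase()); // 'C0000005'
```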

Platform

smartos20-64

Console output

19:50:30 not ok 2984 parallel/test-worker-http2-stream-terminate
19:50:30   ---
19:50:30   duration_ms: 17.692
19:50:30   severity: fail
19:50:30   exitcode: 1
19:50:30   stack: |-
19:50:30     Error: async hook stack has become corrupted (actual: 335, expected: 368)
19:50:30      1: 133dc99 node::AsyncHooks::FailWithCorruptedAsyncStack(double) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30      2: 1305294 node::AsyncWrap::PopAsyncContext(v8::FunctionCallbackInfo<v8::Value> const&) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30      3: 16630a1 v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30      4: 1662637 v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30      5: 1662b57 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30      6: fffffc7fb9b07b39 
19:50:30      7: fffffc7fb1c5622a 
19:50:30      8: fffffc7fb1c56b14 
19:50:30      9: fffffc7fb1c5b4f9 
19:50:30     10: fffffc7fb9a8ba90 
19:50:30     11: fffffc7fb1c73b67 
19:50:30     12: 1fe02dc Builtins_JSEntryTrampoline [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     13: 1fe0003 Builtins_JSEntry [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     14: 174b022 v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     15: 174c094 v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     16: 161e1e0 v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     17: 12ed4d6 node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     18: 12ed6ba node::MakeCallback(v8::Isolate*, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     19: 1337bf3 node::Environment::CheckImmediate(uv_check_s*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     20: 1fbbf29 uv__run_check [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     21: 1fb44c4 uv_run [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     22: 12ee776 node::SpinEventLoop(node::Environment*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     23: 14bdeaf node::worker::Worker::Run() [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     24: 14be439 node::worker::Worker::StartThread(v8::FunctionCallbackInfo<v8::Value> const&)::{lambda(void*)#1}::_FUN(void*) [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos20-64/out/Release/node]
19:50:30     25: fffffc7fef036b6c _thrp_setup [/lib/amd64/libc.so.1]
19:50:30     26: fffffc7fef036e80 _lwp_start [/lib/amd64/libc.so.1]
19:50:30   ...

Platform

freebsd12-x64

Console output

06:15:51 not ok 2822 parallel/test-worker-http2-stream-terminate
06:15:57   ---
06:15:57   duration_ms: 7.851
06:15:57   severity: fail
06:15:57   exitcode: 1
06:15:57   stack: |-
06:15:57     Error: async hook stack has become corrupted (actual: 11, expected: 10888)
06:15:57      1: 0x3349f06 node::AsyncHooks::FailWithCorruptedAsyncStack(double) [/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd12-x64/out/Release/node]
06:15:57      2: 0x32e17ad node::AsyncHooks::pop_async_context(double) [/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd12-x64/out/Release/node]
06:15:57      3: 0x32ee439 node::AsyncWrap::PopAsyncContext(v8::FunctionCallbackInfo<node::AsyncWrap::PopAsyncContext::Value> const&) [/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd12-x64/out/Release/node]
06:15:57      4: 0x35b1b28 v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd12-x64/out/Release/node]
06:15:57      5: 0x35b163e v8::internal::Isolate*<v8::internal::Object> v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>(int, v8::internal::Object*, v8::internal::HeapObject) [/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd12-x64/out/Release/node]
06:15:57   ...

Build links

Additional information

No response

@RaisinTen RaisinTen added the flaky-test label on May 13, 2022
@F3n67u
Member

F3n67u commented May 29, 2022

This flaky test causes a lot of Jenkins build failures; see nodejs/reliability#292. Can we mark this test as flaky?

@RaisinTen
Contributor Author

Temporarily marking the test flaky until someone comes up with a proper fix sounds reasonable to me given that it is causing so many CI failures.

@mhdawson
Member

RaisinTen added a commit to RaisinTen/node that referenced this issue Jun 14, 2022
The test is causing a lot of CI failures, so I'd say that we should mark
this test flaky until someone comes up with a proper fix.

Refs: nodejs#43084
Signed-off-by: Darshan Sen <raisinten@gmail.com>
@RaisinTen
Contributor Author

Sent #43425 to mark the test as flaky on Windows.
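
For context, "marking the test as flaky" means adding an entry to the test suite's status file rather than changing the test itself. A minimal sketch of the kind of entry such a PR adds to `test/parallel/parallel.status`, assuming the usual section and annotation conventions (the exact entry in #43425 may differ):

```text
[$system==win32]
# https://github.com/nodejs/node/issues/43084
test-worker-http2-stream-terminate: PASS,FLAKY
```

With an entry like this the test still runs on Windows CI, but a failure is reported as flaky instead of failing the build.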

nodejs-github-bot pushed a commit that referenced this issue Jun 18, 2022
The test is causing a lot of CI failures, so I'd say that we should mark
this test flaky until someone comes up with a proper fix.

Refs: #43084
Signed-off-by: Darshan Sen <raisinten@gmail.com>

PR-URL: #43425
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
@F3n67u
Member

F3n67u commented Jun 29, 2022

Sent #43425 to mark the test as flaky on Windows.

Hi @RaisinTen, the test-worker-http2-stream-terminate test is still failing on other platforms, such as Ubuntu and FreeBSD. Can we mark this test as flaky on all platforms?

Reason parallel/test-worker-http2-stream-terminate
Type JS_TEST_FAILURE
Failed PR 6 (#43569, #43585, #43519, #43580, #43589, #43567)
Appeared test-digitalocean-freebsd12-x64-1, test-digitalocean-freebsd12-x64-2, test-digitalocean-alpine311_container-x64-1, test-digitalocean-alpine312_container-x64-1, test-digitalocean-alpine312_container-x64-2, test-digitalocean-ubuntu1804_sharedlibs_container-x64-6, test-digitalocean-ubuntu1804_sharedlibs_container-x64-4, test-digitalocean-ubuntu1804_sharedlibs_container-x64-8, test-joyent-smartos20-x64-3
First CI https://ci.nodejs.org/job/node-test-pull-request/44852/
Last CI https://ci.nodejs.org/job/node-test-pull-request/44919/

This data is extracted from today's nodejs/reliability#315

@RaisinTen
Contributor Author

Can we mark this test as flaky on all platforms?

@F3n67u Sounds good to me

@F3n67u
Member

F3n67u commented Jun 29, 2022

I created a PR to mark test-worker-http2-stream-terminate as flaky on all platforms: #43620.

@RaisinTen
Contributor Author

RaisinTen commented Jul 10, 2022

cc @nodejs/async_hooks

For some reason the ids don't match up in this check:

async_id_fields_[kExecutionAsyncId] != async_id)) {
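
For readers not familiar with the native bookkeeping: every MakeCallback-style entry pushes an async context and must pop the same async id on the way out. Below is a rough JavaScript model of that invariant, purely illustrative and not the actual C++ code in src/env-inl.h / AsyncWrap::PopAsyncContext(); the actual/expected wording matches the CI failures above:

```js
// Illustrative model of the native async-context stack check.
const stack = [];          // saved execution async ids
let executionAsyncId = 0;  // models async_id_fields_[kExecutionAsyncId]

function pushAsyncContext(asyncId) {
  stack.push(executionAsyncId);  // remember the id we were running as
  executionAsyncId = asyncId;    // now running as `asyncId`
}

function popAsyncContext(asyncId) {
  // The id being popped must be the id we are currently executing as.
  if (executionAsyncId !== asyncId) {
    // "actual" is the current execution async id, "expected" is the id the
    // caller asked to pop -- the same wording as in the logs above.
    throw new Error('async hook stack has become corrupted ' +
                    `(actual: ${executionAsyncId}, expected: ${asyncId})`);
  }
  executionAsyncId = stack.pop();
}

pushAsyncContext(5);
popAsyncContext(28983);  // throws: (actual: 5, expected: 28983)
```

If worker teardown interrupts a callback between its push and its matching pop, or two contexts get interleaved, the pop sees a different id than it expects, which would be consistent with the mismatched numbers in the logs above.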

@RaisinTen RaisinTen added the async_hooks label on Jul 10, 2022
targos pushed a commit that referenced this issue Jul 12, 2022
targos pushed a commit that referenced this issue Jul 18, 2022
targos pushed a commit that referenced this issue Jul 31, 2022
kvakil added a commit to kvakil/node that referenced this issue Aug 2, 2022
These tests seem to timeout quite often. I don't know why, but one
possible reason is that they are starting a lot of threads. It seems
that tests in `test/parallel` are assumed to only start one thread each,
so having 11 threads running at a time feels like a lot.

It also seems that these tests fail in a correlated fashion: take a look
at [this reliability report][]. The failures all occur on the same build
machines on the same PRs. This suggests to me some sort of CPU
contention.

[this reliability report]: nodejs/reliability#334

On my Linux machine, decreasing the parallelism & iterations here reduces
the `user` time from ~11.5 seconds to ~2 seconds, depending on the test.
I have seen these tests take 30-60 seconds on CI (Alpine in particular).

I went back to the diffs that introduced these changes
and verified that they failed at least 90% of the time with the reduced
iteration count, which feels sufficient.

Refs: nodejs#43499
Refs: nodejs#43084
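
To make the parallelism and iteration tradeoff concrete, here is a hypothetical sketch of the stress pattern this family of tests follows; it is not the actual test/parallel/test-worker-http2-stream-terminate.js, and the names and constants are illustrative. Each "lane" repeatedly spawns a worker that sets up an http2 round trip and is terminated mid-flight, so total runtime scales with PARALLELISM * ITERATIONS, which is what the commit above dials down:

```js
'use strict';
// Hypothetical sketch of the stress pattern, not the real test file.
const { Worker, isMainThread } = require('worker_threads');

const PARALLELISM = 4; // illustrative values; the commit lowers constants
const ITERATIONS = 5;  // like these to keep the test within CI timeouts

if (isMainThread) {
  for (let lane = 0; lane < PARALLELISM; lane++) {
    let remaining = ITERATIONS;
    const spawn = () => {
      const worker = new Worker(__filename);
      worker.on('exit', () => {
        if (--remaining > 0) spawn();
      });
      // Tear the worker down while its http2 stream is likely still active.
      setTimeout(() => worker.terminate(), 10);
    };
    spawn();
  }
} else {
  // Inside the worker: spin up an http2 server and client, then get killed.
  const http2 = require('http2');
  const server = http2.createServer((req, res) => res.end('ok'));
  server.listen(0, () => {
    const client = http2.connect(`http://localhost:${server.address().port}`);
    client.request().resume();
  });
}
```

Dropping constants like these keeps the crash window reachable while cutting wall-clock time, which is the tradeoff the commit message describes.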
nodejs-github-bot pushed a commit that referenced this issue Aug 5, 2022
These tests seem to timeout quite often. I don't know why, but one
possible reason is that they are starting a lot of threads. It seems
that tests in `test/parallel` are assumed to only start one thread each,
so having 11 threads running at a time feels like a lot.

It also seems that these tests fail in a correlated fashion: take a look
at [this reliability report][]. The failures all occur on the same build
machines on the same PRs. This suggests to me some sort of CPU
contention.

[this reliability report]: nodejs/reliability#334

On my Linux machine, decreasing the parallelism & iterations here reduces
the `user` time from ~11.5 seconds to ~2 seconds, depending on the test.
I have seen these tests take 30-60 seconds on CI (Alpine in particular).

I went back to the diffs that introduced these changes
and verified that they failed at least 90% of the time with the reduced
iteration count, which feels sufficient.

Refs: #43499
Refs: #43084
PR-URL: #44090
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
danielleadams pushed a commit that referenced this issue Aug 16, 2022
ruyadorno pushed a commit that referenced this issue Aug 23, 2022
targos pushed a commit that referenced this issue Sep 5, 2022
Fyko pushed a commit to Fyko/node that referenced this issue Sep 15, 2022
guangwong pushed a commit to noslate-project/node that referenced this issue Oct 10, 2022
juanarbol pushed a commit that referenced this issue Oct 10, 2022
juanarbol pushed a commit that referenced this issue Oct 11, 2022
@ywave620
Contributor

Hey there. I would like to investigate this issue. Is there any recent failure? I searched in the CI system (https://github.com/nodejs/reliability/issues?q=is%3Aissue+is%3Aopen+test-worker-http2-stream-terminate), and it looks like this test has passed consistently since 2022-07-15.

@Trott
Member

Trott commented Nov 12, 2022

Hey there. I would like to investigate this issue. Is there any recent failure? I searched in the CI system (https://github.com/nodejs/reliability/issues?q=is%3Aissue+is%3Aopen+test-worker-http2-stream-terminate), and it looks like this test has passed consistently since 2022-07-15.

I don't think it will show up in those reports now that it is marked as unreliable. I think the only way to make progress on this is to figure out how to cause the issue on your own machine or else to remove the unreliable designation in parallel.status and see if it starts causing failures again.

@Trott
Member

Trott commented Nov 12, 2022

I don't think it will show up in those reports now that it is marked as unreliable. I think the only way to make progress on this is to figure out how to cause the issue on your own machine or else to remove the unreliable designation in parallel.status and see if it starts causing failures again.

I suppose we could also try running it in CI with the stress test job and have it ignore the flaky designation, treating failures as failures.

Trott added a commit to Trott/io.js that referenced this issue Nov 12, 2022
Trott added a commit to Trott/io.js that referenced this issue Nov 13, 2022
nodejs-github-bot pushed a commit that referenced this issue Nov 14, 2022
Closes: #43084
PR-URL: #45438
Fixes: #43084
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
ruyadorno pushed a commit that referenced this issue Nov 21, 2022
danielleadams pushed a commit that referenced this issue Dec 30, 2022
danielleadams pushed a commit that referenced this issue Dec 30, 2022
guangwong pushed a commit to noslate-project/node that referenced this issue Jan 3, 2023
guangwong pushed a commit to noslate-project/node that referenced this issue Jan 3, 2023
danielleadams pushed a commit that referenced this issue Jan 3, 2023
danielleadams pushed a commit that referenced this issue Jan 4, 2023