async-hooks.test-emit-after-on-destroyed is flaky #50245

Closed
anonrig opened this issue Oct 18, 2023 · 7 comments · Fixed by #51995
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI.

Comments

@anonrig
Member

anonrig commented Oct 18, 2023

Test

async-hooks.test-emit-after-on-destroyed

Platform

Other

Console output

not ok 32 async-hooks/test-emit-after-on-destroyed
  ---
  duration_ms: 403.10800
  severity: fail
  exitcode: 1
  stack: |-
    node:assert:125
      throw new AssertionError(obj);
      ^
    
    AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
    
    null !== 1
    
        at ChildProcess.<anonymous> (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/test/async-hooks/test-emit-after-on-destroyed.js:56:12)
        at ChildProcess.<anonymous> (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/test/common/index.js:476:15)
        at ChildProcess.emit (node:events:515:28)
        at maybeClose (node:internal/child_process:1105:16)
        at Socket.<anonymous> (node:internal/child_process:457:11)
        at Socket.emit (node:events:515:28)
        at Pipe.<anonymous> (node:net:337:12) {
      generatedMessage: true,
      code: 'ERR_ASSERTION',
      actual: null,
      expected: 1,
      operator: 'strictEqual'
    }
    
    Node.js v21.0.0
  ...

Build links

Additional information

No response

anonrig added the flaky-test label Oct 18, 2023
anonrig added a commit to anonrig/node that referenced this issue Oct 18, 2023
@Flarna
Member

Flarna commented Oct 18, 2023

This seems to be a child process issue, not an async hooks one.
The assert checks the exit code passed to the child process `close` event, which is a number according to the docs, but here it is null.

@richardlau
Member

> Seems to be a child process issue not async hooks. The assert checks the exit code given via the childprocess close event which is a number according to docs but here it is null.

Usually the exit code being null means that the child was ended by a signal.
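
To illustrate the point above, here is a minimal sketch, assuming a POSIX-like platform, of how the `close` event reports a child that was ended by a signal. The helper names `describeExit` and `runChild` are hypothetical and are not part of the test in question; `spawn`, `subprocess.kill()`, and the `(code, signal)` arguments of `close` are the documented `child_process` API.

```javascript
'use strict';
const { spawn } = require('node:child_process');

// Hypothetical helper: turn the (code, signal) pair from the 'close' event
// into a readable message. When a signal ends the child, code is null.
function describeExit(code, signal) {
  if (code === null && signal !== null) {
    return `terminated by signal ${signal}`;
  }
  return `exited with code ${code}`;
}

// Hypothetical helper: spawn a child that exits with code 1 after 200 ms.
// If killSignal is given, the child is signaled before it can exit on its own.
function runChild(killSignal) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, [
      '-e',
      'setTimeout(() => process.exit(1), 200)',
    ]);
    child.on('error', reject);
    child.on('close', (code, signal) => resolve({ code, signal }));
    if (killSignal) child.kill(killSignal);
  });
}
```

With `runChild('SIGTERM')`, the promise should resolve with `code` set to `null` and `signal` set to `'SIGTERM'`; with `runChild()`, the child exits on its own with `code` 1 and `signal` `null`. Including `describeExit(code, signal)` in an assertion message would make a failure like the one reported here easier to diagnose.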

@anonrig
Member Author

anonrig commented Oct 18, 2023

A similar test is flaky as well: #50262

nodejs-github-bot pushed a commit that referenced this issue Oct 20, 2023
Ref: #50245
PR-URL: #50246
Refs: #50245
Reviewed-By: Filip Skokan <panva.ip@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
@abmusse
Contributor

abmusse commented Jan 29, 2024

I started to investigate this issue last week.

I asked @mhdawson to spin up some stress tests

ref: https://ci.nodejs.org/job/node-stress-single-test/nodes=rhel8-ppc64le/469/console

We ran this test case 1000 times on rhel8 ppc64le. This did not reproduce the error though.

I then built Node.js 21.0.0 (the version in use when the issue was reported) on my local machine (Fedora 39). I ran this test 100k times there and still could not reproduce the failure.

$ tools/test.py -j 16 --repeat=100000 async-hooks/test-emit-after-on-destroyed
[14:17|% 100|+ 100000|-   0]: Done                               

All tests passed.

As noted before in #50245 (comment), the failure is most likely due to the child being ended by a signal, but we will need to find a way to reproduce the failure to get more information on the signal and why the test case is being signaled.

@mhdawson

Any further suggestions on how to reproduce the error?

Maybe it occurs more when running all the test cases together?

OR

Maybe it presents itself more frequently on some platforms? The original issue indicates it occurred on ppc64 AIX. Maybe stress testing AIX would reproduce the error.
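
One way to gather more information on a rare failure like this is a small repeat harness that records the terminating signal as well as the exit code. The following is a rough sketch under that assumption, not how `tools/test.py` works internally; `outcomeKey` and `stressRun` are hypothetical names, and the test path in the usage comment is the one from the stack trace above, relative to the repository root.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// Hypothetical helper: classify one run by signal (if any) or exit code.
function outcomeKey(status, signal) {
  return signal !== null ? `signal:${signal}` : `code:${status}`;
}

// Hypothetical helper: run `node <args>` repeatedly and tally the outcomes.
// spawnSync reports the exit code as `status` and the signal as `signal`.
function stressRun(args, iterations) {
  const tally = new Map();
  for (let i = 0; i < iterations; i++) {
    const { status, signal, error } = spawnSync(process.execPath, args, {
      stdio: 'ignore',
    });
    if (error) throw error; // the child could not be spawned at all
    const key = outcomeKey(status, signal);
    tally.set(key, (tally.get(key) || 0) + 1);
  }
  return tally;
}

// Example (assumed to be run from the repository root):
// console.log(stressRun(
//   ['test/async-hooks/test-emit-after-on-destroyed.js'], 1000));
```

Any `signal:*` entry in the resulting tally would name the signal that ended the process, which is the detail missing from the original CI output.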

@mhdawson
Member

@abmusse I think trying the stress test on one of the platforms where we saw the failure makes sense. I think that @richardlau mentioned you still have access to one of the AIX machines from an earlier investigation so trying the 100k run there would be a good next step.

@abmusse
Contributor

abmusse commented Feb 2, 2024

@mhdawson

Today I ran a 100k stress test on one of our AIX machines.

$ tools/test.py --repeat=100000 async-hooks/test-emit-after-on-destroyed
[59:37|% 100|+ 100000|-   0]: Done   

Running it 100k times on AIX didn't reproduce the error.

@abmusse
Contributor

abmusse commented Mar 5, 2024

I suggest we un-mark this test as flaky, as running it 100k times did not reproduce the error.

I will keep an eye on it, and if it becomes flaky again I will handle marking it as flaky.

abmusse added a commit to abmusse/node that referenced this issue Mar 6, 2024
I tested running the test case 100k times on the AIX CI machine
and was unable to reproduce the error. Also, it has not shown up
recently as flaky on the CI. I suggest we mark this
as un-flaky.

ref: nodejs#50245 (comment)
richardlau linked a pull request Mar 7, 2024 that will close this issue
nodejs-github-bot pushed a commit that referenced this issue Mar 8, 2024
I tested running the test case 100k times on the AIX CI machine
and was unable to reproduce the error. Also, it has not shown up
recently as flaky on the CI. I suggest we mark this
as un-flaky.

ref: #50245 (comment)
PR-URL: #51995
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>