Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dd-trace leaks handles in jest test suite, resulting in incomplete CI test tracing #2134

Closed
jmichel17 opened this issue Jun 17, 2022 · 11 comments
Assignees
Labels
bug Something isn't working ci-app

Comments

@jmichel17
Copy link

Expected behaviour

Instrumenting jest tests with dd-trace should not leak any handles.

Actual behaviour

My jest test suite contains 76 tests in it. When executing all of these tests in a Github actions job, only ~32 were actually getting reported to the Datadog CI test dashboard. Adding an arbitrarily selected 10 second wait after the jest tests completed resulted in 75 of the 76 getting reported up to the dashboard. Presumably, the difference in tracing coverage was due to the fact that the dd-trace library was not flushing its traces.

I've yet to see all 76 tests reported. On the runs where 75 were reported with the help of a wait, the following error was logged:

A worker process has failed to exit gracefully and has been force exited. This is likely caused by tests leaking due to improper teardown. Try running with --detectOpenHandles to find leaks. Active timers can also cause this, ensure that .unref() was called on them.

In combination with jest's --detectOpenHandles and --runInBand options plus the leaked-handles npm package (https://github.com/Raynos/leaked-handles), I discovered there is evidence dd-trace is the culprit leaking handles and potentially causing the forced exit. And in turn probably also the cause of the 76th test not getting reported.

The pertinent logs produced by leaked-handles are as follows:

http handle leaked at one of: 
    at retriableRequest (/code/packages/test/node_modules/dd-trace/packages/dd-trace/src/exporters/common/request.js:56:26)
http stream {
  fd: -1,
  readable: true,
  writable: true,
  address: {},
  method: 'POST',
  path: '/telemetry/proxy/api/v2/apmtelemetry',
  host: 'datadog-agent:8126'
}
http handle leaked at one of: 
    at /code/packages/test/node_modules/dd-trace/packages/datadog-instrumentations/src/http/client.js:62:29
http stream {
  fd: -1,
  readable: true,
  writable: true,
  address: {},
  method: 'PUT',
  path: '/v0.4/traces',
  host: 'datadog-agent:8126'
}

and...

Jest has detected the following 1 open handle potentially keeping Jest from exiting:
  ●  bound-anonymous-fn
      35 |   async ({ url, method, data }) => {
      36 |     try {
    > 37 |       const response = await fetch(baseUrl + url, {
         |                              ^
      38 |         method: method,
      39 |         headers: {
      40 |           Authorization: 'Token ' + getAuth().access,
      at new exports.AsyncResource (../../packages/test/node_modules/dd-trace/packages/datadog-instrumentations/src/helpers/instrument.js:59:27)
      at Socket.connect (../../packages/test/node_modules/dd-trace/packages/datadog-instrumentations/src/net.js:35:27)
      at Agent.fakeCreateConnection [as createConnection] (../../node_modules/leaked-handles/stacks.js:100:41)
      at new NodeClientRequest (../../node_modules/@mswjs/interceptors/src/interceptors/ClientRequest/NodeClientRequest.ts:68:5)
      at ../../node_modules/@mswjs/interceptors/src/interceptors/ClientRequest/http.request.ts:23:12
      at ../../node_modules/node-fetch/lib/index.js:1468:15
      at Object.fetch (../../node_modules/node-fetch/lib/index.js:1437:9)
      at Object.fetch (../../node_modules/cross-fetch/dist/node-ponyfill.js:10:20)
      at ../online-banking/src/store/api/index.ts:37:30
      at ../../node_modules/@reduxjs/toolkit/src/query/core/buildThunks.ts:286:24
      at step (../../node_modules/@reduxjs/toolkit/dist/query/rtk-query.cjs.development.js:23:23)
      at Object.next (../../node_modules/@reduxjs/toolkit/dist/query/rtk-query.cjs.development.js:4:53)
      at ../../node_modules/@reduxjs/toolkit/dist/query/rtk-query.cjs.development.js:110:65
      at __async (../../node_modules/@reduxjs/toolkit/dist/query/rtk-query.cjs.development.js:92:12)
      at executeEndpoint (../../node_modules/@reduxjs/toolkit/src/query/core/buildThunks.ts:265:8)
      at ../../node_modules/@reduxjs/toolkit/src/createAsyncThunk.ts:605:15
      at step (../../node_modules/@reduxjs/toolkit/dist/redux-toolkit.cjs.development.js:38:23)
      at Object.next (../../node_modules/@reduxjs/toolkit/dist/redux-toolkit.cjs.development.js:19:53)
      at ../../node_modules/@reduxjs/toolkit/dist/redux-toolkit.cjs.development.js:112:65
      at __async (../../node_modules/@reduxjs/toolkit/dist/redux-toolkit.cjs.development.js:94:12)
      at ../../node_modules/@reduxjs/toolkit/src/createAsyncThunk.ts:580:42
      at ../../node_modules/@reduxjs/toolkit/src/createAsyncThunk.ts:651:16
      at ../../node_modules/redux-thunk/lib/index.js:23:18
      at ../../node_modules/@reduxjs/toolkit/src/immutableStateInvariantMiddleware.ts:258:32
      at dispatch (../../node_modules/redux/lib/redux.js:667:28)
      at ../../node_modules/@reduxjs/toolkit/src/query/core/buildInitiate.ts:277:29
      at ../../node_modules/redux-thunk/lib/index.js:23:18
      at ../../node_modules/@reduxjs/toolkit/src/immutableStateInvariantMiddleware.ts:258:32
      at ../../node_modules/@reduxjs/toolkit/src/query/react/buildHooks.ts:667:27
      at invokePassiveEffectCreate (../../node_modules/react-dom/cjs/react-dom.development.js:23487:20)
      at HTMLUnknownElement.callCallback (../../node_modules/react-dom/cjs/react-dom.development.js:3945:14)
      at HTMLUnknownElement.callTheUserObjectsOperation (../../node_modules/jsdom/lib/jsdom/living/generated/EventListener.js:26:30)
      at innerInvokeEventListeners (../../node_modules/jsdom/lib/jsdom/living/events/EventTarget-impl.js:338:25)
      at invokeEventListeners (../../node_modules/jsdom/lib/jsdom/living/events/EventTarget-impl.js:274:3)
      at HTMLUnknownElementImpl._dispatch (../../node_modules/jsdom/lib/jsdom/living/events/EventTarget-impl.js:221:9)
      at HTMLUnknownElementImpl.dispatchEvent (../../node_modules/jsdom/lib/jsdom/living/events/EventTarget-impl.js:94:17)
      at HTMLUnknownElement.dispatchEvent (../../node_modules/jsdom/lib/jsdom/living/generated/EventTarget.js:231:34)
      at Object.invokeGuardedCallbackDev (../../node_modules/react-dom/cjs/react-dom.development.js:3994:16)
      at invokeGuardedCallback (../../node_modules/react-dom/cjs/react-dom.development.js:[405](https://github.com/avantibank/avanti-main/runs/6943246960?check_suite_focus=true#step:5:406)6:31)
      at flushPassiveEffectsImpl (../../node_modules/react-dom/cjs/react-dom.development.js:23574:9)
      at unstable_runWithPriority (../../node_modules/scheduler/cjs/scheduler.development.js:468:12)
      at runWithPriority$1 (../../node_modules/react-dom/cjs/react-dom.development.js:11276:10)
      at flushPassiveEffects (../../node_modules/react-dom/cjs/react-dom.development.js:23[447](https://github.com/avantibank/avanti-main/runs/6943246960?check_suite_focus=true#step:5:448):14)
      at Object.<anonymous>.flushWork (../../node_modules/react-dom/cjs/react-dom-test-utils.development.js:992:10)
      at act (../../node_modules/react-dom/cjs/react-dom-test-utils.development.js:1107:9)
      at render (../../node_modules/@testing-library/react-hooks/lib/dom/pure.js:79:26)
      at renderHook (../../node_modules/@testing-library/react-hooks/lib/core/index.js:114:5)
      at Object.<anonymous> (../online-banking/src/hooks/useTxsForCurrentAccount.spec.tsx:86:24)
      at Object.bound (../../packages/test/node_modules/dd-trace/packages/datadog-instrumentations/src/helpers/instrument.js:71:26)

Steps to reproduce

The trouble in reproducing this is we aren't doing anything exotic in our jest test suite. We run our tests with the jest-environment-jsdom test environment and include the dd-trace library per your documentation:

require('dd-trace/ci/jest/env')

Environment

@jmichel17 jmichel17 added the bug Something isn't working label Jun 17, 2022
@rochdev rochdev added the ci-app label Jun 20, 2022
@juan-fernandez
Copy link
Collaborator

hi @jmichel17, thanks for reporting this issue.

Since you're using github actions, would it be possible for you to use the agentless mode? https://docs.datadoghq.com/continuous_integration/setup_tests/javascript/?tab=cloudciprovideragentless

This should make the setup easier and hopefully fix the flush issue. As for the leak, they seem to be from requests to the agent, so this change should improve that as well.

@jmichel17
Copy link
Author

@juan-fernandez Thank you for the response.

Unfortunately, I have been unable to resolve this for my team as of yet. I tried utilizing agentless mode to ensure full flushing. In addition I tried all of the following but with still incomplete reporting up to the DD CI dashboard:

  1. Override dd-trace 's flushInterval from the default 400,000 down to 1
  2. Extended the default jest-circus runner and tapped into lifecycle events (teardown and run_finished) to try and sleep for 15 seconds after each event to give the async datadog reporter time to complete
  3. Tried the same sleep technique inside a jest globalTeardown script
  4. Tried to sleep after running jest test at the package.json script level

@jmichel17
Copy link
Author

I should also add... I also tried upgrading to Node v16 in case that mattered, but unfortunately got the same result.

@juan-fernandez
Copy link
Collaborator

hi @jmichel17 , thanks for the information. There seems to be an error before the flush. Did you try to run your tests with DD_TRACE_DEBUG=1 DD_TRACE_LOG_LEVEL=debug? It might give us an indication of what's happening. Additionally, could you use NODE_OPTIONS=-r dd-trace/ci/init for the initialization rather than using the test environment? It shouldn't make a difference but it's slightly simpler (more info in https://docs.datadoghq.com/continuous_integration/setup_tests/javascript/?tabs=onpremisesciproviderdatadogagent#instrument-your-tests)

@juan-fernandez juan-fernandez self-assigned this Jul 8, 2022
@davidzambrana
Copy link

Hi,

I am experiencing the very same on my side. In our case, we have a test suite of 218 test cases and what we see it's being reported is actually a random number of tests that can go from 90 to 160, just by rerunning the same job.

Just tried with the options you commented @juan-fernandez and I did see something "interesting"
For many payloads I see that the parent_id is set to 0000000000000000 while in many others I do see a more true-ish value such as 7c7ce0be7d14822d

Encoding payload: [{"trace_id":"43ac2edc28887db4","span_id":"3a185e3b9a5258eb","parent_id":"0000000000000000",

I also saw other traces like the following

PASS src/components/Landing/.../....spec.tsx
Extract from carrier: {"x-datadog-trace-id":"6613246714027553703","x-datadog-parent-id":"8970291666562089517"}.
PASS src/contexts/Localization/....spec.ts
Extract from carrier: {"x-datadog-trace-id":"1938286422715588783","x-datadog-parent-id":"0000000000000000"}.

or like this

• GET http://www.apitesturl.com/api/payee/trend/payouts
      
      If you still wish to intercept this unhandled request, please create a request handler for it.
      Read more: https://mswjs.io/docs/getting-started/mocks
      at Object.warn (node_modules/msw/src/utils/internal/devUtils.ts:17:11)
      at applyStrategy (node_modules/msw/src/utils/request/onUnhandledRequest.ts:208:18)
      at onUnhandledRequest (node_modules/msw/src/utils/request/onUnhandledRequest.ts:233:3)
      at handleRequest (node_modules/msw/src/utils/handleRequest.ts:80:5)
      at setupServerListener (node_modules/msw/src/node/createSetupServer.ts:75:24)
    console.warn
      [MSW] Warning: captured a request without a matching request handler:```

Do you have any suggestion?

Thanks

@juan-fernandez
Copy link
Collaborator

a 0 parent id is expected for test spans, so I don't think that's an issue. Could you provide the following info:

  • dd-trace and jest versions
  • whether you're using agentless or agent
  • how you initialise dd-trace
  • do you see any errors if you set DD_TRACE_DEBUG=1 and DD_TRACE_LOG_LEVEL=error ?

@davidzambrana
Copy link

Oh ok good to know.

The versions I am using are dd-trace: 4.13.1 jest: 27.5.1, using agentless (DD_CIVISIBILITY_AGENTLESS_ENABLED: true).

I run it with this command where I use react-scripts test for this repo

DD_CIVISIBILITY_AGENTLESS_ENABLED=true DD_TRACE_DEBUG=1 DD_TRACE_LOG_LEVEL=debug DD_SITE=datadoghq.com NODE_OPTIONS="-r dd-trace/ci/init" DD_ENV=ci DD_SERVICE=my-service yarn test:dd src --coverage

where test:dd is the following:

"NODE_OPTIONS=\"-r dd-trace/ci/init\" DD_ENV=ci DD_SERVICE=my-service react-scripts test"

Worth mentioning that in other repo, I run it with jest (without any react-scripts on top of it) and I do not have any issues.

I've just run it using the options DD_TRACE_DEBUG=1 DD_TRACE_LOG_LEVEL=error as you recommended and I see the following.

Error: Intelligent Test Runner configuration could not be fetched. Request to settings endpoint was not done because Datadog application key is not defined.
    at error (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/writer.js:104:11)
    at onError (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/writer.js:81:16)
    at Channel.publish (node:diagnostics_channel:56:[9]
    at Object.error (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/index.js:67:20)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/plugins/ci_plugin.js:36:15
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/exporters/ci-visibility-exporter.js:146:9
    at getItrConfiguration (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/intelligent-test-runner/get-itr-configuration.js:42:14)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/exporters/ci-visibility-exporter.js:140:7
Error: No original method shard.
    at assertMethod (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:133:[11]
    at wrapMethod (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:39:3)
    at Object.wrap (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:82:7)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-instrumentations/src/jest.js:198:11
    at requireOrImportModule (/home/runner/work/my-project/my-project/node_modules/jest-util/build/requireOrImportModule.js:53:28)
    at runJest (/home/runner/work/my-project/my-project/node_modules/@jest/core/build/runJest.js:[26]

Will check the application key thing just in case

@juan-fernandez
Copy link
Collaborator

juan-fernandez commented Sep 20, 2023

Oh ok good to know.

The versions I am using are dd-trace: 4.13.1 jest: 27.5.1, using agentless (DD_CIVISIBILITY_AGENTLESS_ENABLED: true).

I run it with this command where I use react-scripts test for this repo

DD_CIVISIBILITY_AGENTLESS_ENABLED=true DD_TRACE_DEBUG=1 DD_TRACE_LOG_LEVEL=debug DD_SITE=datadoghq.com NODE_OPTIONS="-r dd-trace/ci/init" DD_ENV=ci DD_SERVICE=my-service yarn test:dd src --coverage

where test:dd is the following:

"NODE_OPTIONS=\"-r dd-trace/ci/init\" DD_ENV=ci DD_SERVICE=my-service react-scripts test"

Worth mentioning that in other repo, I run it with jest (without any react-scripts on top of it) and I do not have any issues.

I've just run it using the options DD_TRACE_DEBUG=1 DD_TRACE_LOG_LEVEL=error as you recommended and I see the following.

Error: Intelligent Test Runner configuration could not be fetched. Request to settings endpoint was not done because Datadog application key is not defined.
    at error (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/writer.js:104:11)
    at onError (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/writer.js:81:16)
    at Channel.publish (node:diagnostics_channel:56:[9]
    at Object.error (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/log/index.js:67:20)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/plugins/ci_plugin.js:36:15
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/exporters/ci-visibility-exporter.js:146:9
    at getItrConfiguration (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/intelligent-test-runner/get-itr-configuration.js:42:14)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/dd-trace/src/ci-visibility/exporters/ci-visibility-exporter.js:140:7
Error: No original method shard.
    at assertMethod (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:133:[11]
    at wrapMethod (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:39:3)
    at Object.wrap (/home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-shimmer/src/shimmer.js:82:7)
    at /home/runner/work/my-project/my-project/node_modules/dd-trace/packages/datadog-instrumentations/src/jest.js:198:11
    at requireOrImportModule (/home/runner/work/my-project/my-project/node_modules/jest-util/build/requireOrImportModule.js:53:28)
    at runJest (/home/runner/work/my-project/my-project/node_modules/@jest/core/build/runJest.js:[26]

Will check the application key thing just in case

thanks, that's helpful. Don't worry about the application key, it's not mandatory.

This error message Error: No original method shard. is worrisome though.
We don't test react-scripts so that might be related.

What jest testRunner are you using? jest@27 has jest-jasmine2 as default, so if there's none specified, that'd be the one.

@davidzambrana
Copy link

It's jest-circus as I can see traces of it in the logs

@hxstdio
Copy link

hxstdio commented Mar 21, 2024

We encountered the same issue. All tests were executed in a Github action job, but none of them were reported in the datadog CI test dashboard.
The log displayed the error message "Error: No original method shard.”

We resolved this issue by upgrading Jest from version v27 to v29. Although we have yet to determine the root cause, the test report is now displaying normally in the datadog CI test dashboard.

@juan-fernandez
Copy link
Collaborator

closing due to lack of activity. Feel free to reopen if you encounter this issue again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ci-app
Projects
None yet
Development

No branches or pull requests

5 participants