
dd-trace with datadog-lambda-js in Lambda throws error #3699

Open
kasleet opened this issue Oct 10, 2023 · 0 comments
Labels: bug (Something isn't working), serverless

kasleet commented Oct 10, 2023

Expected behaviour
We are using dd-trace and datadog-lambda-js in our Lambdas for logging and tracing, which works quite well. However, we encountered the following scenario in which dd-trace throws an error:

  1. We have a Lambda that is triggered by a message in SQS. The timeout of the Lambda is set to 30 seconds; the visibility timeout of the queue is set to 3 minutes.
  2. The Lambda reads a new message from SQS, processes it, and fails with an error (which is expected in our use case).
  3. Once the message becomes visible again (~3 minutes later), the Lambda tries to process it again.
  4. This second attempt immediately fails with an exception inside the dd-trace lambda/handler.js crashFlush method. Somehow, tracer._processor is undefined, so the .killAll() call fails (https://github.com/DataDog/dd-trace-js/blob/8511189932652f362baaeab2323cb9f94ed79975/packages/dd-trace/src/lambda/handler.js#L56C3-L56C30)

After some debugging, we saw in the dd-trace logs that the Lambda RequestId and the traceId of the two requests are the same. We suspect that the old context from the first request is somehow reused by, or passed to, the second request inside dd-trace. This would also explain why crashFlush is called: dd-trace believes the Lambda has timed out. (It still doesn't explain why _processor is undefined, but that is probably not relevant for now.)
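To illustrate the suspicion above: dd-trace derives its timeout handling from the invocation context, so a stale context from a previous request, whose deadline is long past, would look as if the current invocation had already timed out. This is only a hypothetical sketch; the names below are stand-ins, not dd-trace's actual internals (the real logic lives in packages/dd-trace/src/lambda/handler.js):

```typescript
// Hypothetical illustration: why a reused (stale) Lambda context can look
// like a timeout to a tracer that derives its flush decision from the context.

interface LambdaContextLike {
  awsRequestId: string;
  // Milliseconds left before the function is killed, derived from a deadline.
  getRemainingTimeInMillis(): number;
}

// Build a context whose deadline is a fixed point in time.
const makeContext = (awsRequestId: string, deadlineMs: number): LambdaContextLike => ({
  awsRequestId,
  getRemainingTimeInMillis: () => deadlineMs - Date.now(),
});

// Stand-in for the tracer's decision: treat the invocation as timed out
// (and crash-flush) when no time is left on the context.
const looksTimedOut = (ctx: LambdaContextLike): boolean =>
  ctx.getRemainingTimeInMillis() <= 0;

// First invocation: 30 s budget, so plenty of time left.
const first = makeContext('req-1', Date.now() + 30_000);
console.log(looksTimedOut(first)); // false

// ~3 minutes later the stale context of request 1 is reused for request 2:
// its deadline is long past, so any timeout check fires immediately.
const stale = makeContext('req-1', Date.now() - 150_000);
console.log(looksTimedOut(stale)); // true
```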

Here is an obfuscated CloudWatch screenshot showing this behaviour; I have marked the important parts of both requests in different colors.

[Screenshot (2023-10-10): obfuscated CloudWatch logs of both requests, with the relevant lines highlighted in color]

I don't know whether this is a bug; it could well be that we are using or configuring dd-trace and datadog-lambda-js incorrectly.

Actual behaviour
The second request should be executed with its own, current context, without throwing an error.

Steps to reproduce
As described above. Here is a code snippet showing our setup (we built our own custom 'lambda middleware'):

import type { Context } from 'aws-lambda'

const middleware = <TEvent, TResult>(
  event: TEvent,
  context: Context,
  next: (event: TEvent) => Promise<TResult>,
): Promise<TResult> => {
  // Note: both modules are required and the handler is re-wrapped
  // on every invocation.
  const { datadog } = require('datadog-lambda-js')
  const { tracer } = require('dd-trace')

  const wrapped = datadog((e: TEvent) => {
    // tracer.init() also runs once per invocation here.
    tracer.init({})

    return tracer.trace(
      'init middleware',
      () => {
        return next(e)
      },
    )
  }, { logForwarding: true })

  return wrapped(event, context)
}
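Possibly relevant: our middleware calls datadog() and tracer.init() on every invocation, whereas the usual pattern is to initialize and wrap once at module load (cold start). A minimal sketch of that ordering, with stand-in stubs replacing datadog-lambda-js and dd-trace so the snippet is self-contained (the stubs are assumptions, not the libraries' real behaviour):

```typescript
// Stand-ins for datadog-lambda-js's datadog() and dd-trace's tracer.init(),
// counting calls so the wrapping order is observable.
let initCalls = 0
let wrapCalls = 0

const tracer = { init: (_opts: object) => { initCalls++ } }

const datadog = <E, R>(handler: (e: E) => Promise<R>) => {
  wrapCalls++ // the real datadog() also sets up per-handler state when called
  return (event: E) => handler(event)
}

// Initialize and wrap ONCE at module scope (cold start), not inside the
// middleware, so no wrapper state can carry over between invocations.
tracer.init({})
const handler = datadog(async (event: { id: string }) => `handled ${event.id}`)

const main = async () => {
  await handler({ id: 'req-1' })
  await handler({ id: 'req-2' })
  console.log(initCalls, wrapCalls) // 1 1 — init and wrap ran once for two invocations
}
main()
```

We are not sure whether the per-invocation wrapping in our middleware is actually what carries the stale context across requests, but it seemed worth pointing out.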

Environment

  • Operating system/Lambda: Runtime Version: nodejs:18.v13, Runtime Version ARN: arn:aws:lambda:eu-central-1::runtime:0229ff5ced939264450549058d8f267110e92677c27063e6dcd781a280f2462b
  • Tracer version: newest / arn:aws:lambda:eu-central-1:464622532012:layer:Datadog-Node18-x:98
@kasleet kasleet added the bug Something isn't working label Oct 10, 2023