Description
Expected behaviour
We are using dd-trace and datadog-lambda-js in our Lambdas for logging and tracing, which works out quite well.
We encountered the following scenario, where dd-trace will throw an error:
- We have a Lambda which is triggered by a message in SQS. The timeout of the lambda is set to 30 seconds. The visibilityTimeout of the queue is set to 3 minutes.
- The Lambda reads a new message from SQS, processes it, and fails with an error (in our use case this is totally fine).
- After the message is visible again (~ 3 minutes later), the Lambda will try to process the message again.
- This immediately fails with an exception within the dd-trace lambda/handler.js crashFlush method. Somehow, tracer._processor is undefined and the .killAll() call fails (https://github.com/DataDog/dd-trace-js/blob/8511189932652f362baaeab2323cb9f94ed79975/packages/dd-trace/src/lambda/handler.js#L56C3-L56C30)
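To make the failure mode concrete, here is a minimal sketch of what calling a method on a missing processor looks like, and how a guard would avoid the exception. This is not the dd-trace source: `TracerLike`, `SpanProcessor`, `crashFlushUnguarded` and `crashFlushGuarded` are hypothetical names used only for illustration, and only the `_processor` / `killAll()` shape is taken from the description above.

```typescript
// Hypothetical stand-ins for the shapes described in this issue.
interface SpanProcessor {
  killAll(): void
}

interface TracerLike {
  _processor?: SpanProcessor
}

// Mirrors the reported failure: if _processor is undefined, the property
// access on it throws a TypeError at runtime.
function crashFlushUnguarded(tracer: TracerLike): void {
  (tracer._processor as SpanProcessor).killAll()
}

// Guarded variant: skips the call when no processor is attached and reports
// whether killAll() was actually invoked.
function crashFlushGuarded(tracer: TracerLike): boolean {
  if (!tracer._processor) {
    return false
  }
  tracer._processor.killAll()
  return true
}
```

A guard like this would only hide the symptom; it would not explain why the timeout path is taken for the second request in the first place.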
After some debugging, we saw in the dd-trace logs that the Lambda RequestId and the traceId of the requests are the same. We think that the old context of the first request is used by, or passed to, the second request in dd-trace. This would also explain why crashFlush is called: dd-trace thinks that the Lambda timed out. It still doesn't explain why _processor is undefined, but I think that's not relevant for now.
Here is an obfuscated CloudWatch screenshot that shows this behaviour. I tried to mark the important parts with different colors for both requests.
I don't know if this is a bug; it could very well be that we are using or configuring dd-trace and datadog-lambda-js incorrectly.
Actual behaviour
The second request should be executed with its own, current context and should not throw an error.
Steps to reproduce
As described above. Here is a code snippet regarding our setup (we created our own custom 'lambda middleware' setup).
import type { Context } from 'aws-lambda'

const middleware = <TEvent, TResult>(
  event: TEvent,
  context: Context,
  next: (event: TEvent) => Promise<TResult>,
): Promise<TResult> => {
  const { datadog } = require('datadog-lambda-js')
  const { tracer } = require('dd-trace')
  const wrapped = datadog((e: TEvent) => {
    tracer.init({})
    return tracer.trace(
      `init middleware`,
      () => {
        return next(e)
      },
    )
  }, { logForwarding: true })
  return wrapped(event, context)
}
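For comparison, the snippet above wraps the handler and calls tracer.init on every invocation. A possible variation, shown as a sketch only, is to build the wrapped handler once per cold start and reuse it on warm invocations. It is not confirmed that per-invocation wrapping causes the error described here. The names `Handler`, `Wrap` and `wrapOnce` are hypothetical; `wrap` stands in for a wrapper such as `datadog` from datadog-lambda-js, and the context is typed as `unknown` to keep the example free of external imports.

```typescript
// Hypothetical handler shape; a real setup would use Context from 'aws-lambda'.
type Handler<E, R> = (event: E, context: unknown) => Promise<R>

// Stand-in for a handler wrapper such as `datadog` (assumption: it takes a
// handler and returns a handler with the same signature).
type Wrap = <E, R>(handler: Handler<E, R>) => Handler<E, R>

// Applies `wrap` exactly once, when wrapOnce is called, and returns a handler
// that reuses the same wrapped function for every invocation.
function wrapOnce<E, R>(
  wrap: Wrap,
  next: (event: E) => Promise<R>,
): Handler<E, R> {
  const wrapped = wrap<E, R>((event) => next(event))
  return (event, context) => wrapped(event, context)
}
```

In a real Lambda, `wrapOnce` would be called at module scope, next to a single tracer.init call, so that both run once per execution environment instead of once per message.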
Environment
- Operating system/Lambda: Runtime Version: nodejs:18.v13 Runtime Version ARN: arn:aws:lambda:eu-central-1::runtime:0229ff5ced939264450549058d8f267110e92677c27063e6dcd781a280f2462b
- Tracer version: newest / arn:aws:lambda:eu-central-1:464622532012:layer:Datadog-Node18-x:98
