feat: enable Datadog APM error tracking with a tracing layer #1626
set error fields on otel span directly when emitting errors, enabling log and apm error tracking
this strategy does not give us the error type
…tadog-error-tracking
I agree with this. How I see it is that at some point when we clarify how to use
Why is this the case?
I'm not sure, but it's listed in the requirements here.
Super excited by what this enables! Thanks @oddgrd ! 😍
Left some notes. I think this PR warrants a demo to all the engineers, as we should use the dyn StdError whenever we are logging an error.
add comment about order of tracing layers, remove old unsafe
Sure, I'm happy to do a demo! I'm also planning to document how we should format errors in the M&O Wiki.
Great job 🎉 I agree with your comments regarding Valuable and that we should work iteratively. If we get the errors from the logs there, it will already be highly valuable (no pun intended).
I'm hoping a demo will make it clear why this is valuable and the way we should be doing it 😄
Description of change
Add an error layer to the tracing subscriber to capture any dyn std::error::Error, and set the fields required for Datadog APM error tracking in the otel data extension on the current span.
This also fixes the issue where the error message field was being overwritten by the http.status_code field, by setting the status code in the otel data extension rather than directly on the span.
Todo
&(dyn std::error::Error) field where possible. (Done in fa8d1b5)
More remains for both points above, but I have tried to remove all obvious "errors that shouldn't be errors", and I've migrated the errors in all error events to &dyn std::error::Error where it was straightforward. I think the clean-up can continue in future PRs.
Caveats
dyn Error, but the errors will still be distinguished in error tracking by error.message and error.stack (Datadog generates a fingerprint for each error based on error.message, error.type and error.stack).
Alternatives considered
How has this been tested? (if applicable)
Tested with a local environment against Datadog.
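For reviewers who want a feel for what the error layer in the description has to derive from a type-erased error, here is a minimal std-only sketch. It is not the code in this PR: ErrorFields, error_fields and ConfigError are hypothetical names, and mapping the source() chain to error.stack is an assumption made for illustration. It also shows why error.type is missing, since a dyn Error does not expose the name of its concrete type.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical holder for two of the fields named in the description.
/// `error.type` is left out on purpose: a `dyn Error` does not expose
/// the name of its concrete type.
#[derive(Debug, Default, PartialEq)]
struct ErrorFields {
    message: String,
    stack: String,
}

/// Build the fields from a type-erased error.
/// Assumption: the chain of `source()` causes stands in for `error.stack`.
fn error_fields(err: &(dyn Error + 'static)) -> ErrorFields {
    let message = err.to_string();

    // Walk the chain of underlying causes, outermost first.
    let mut causes = Vec::new();
    let mut current = err.source();
    while let Some(cause) = current {
        causes.push(cause.to_string());
        current = cause.source();
    }

    ErrorFields {
        message,
        stack: causes.join("\n"),
    }
}

/// Hypothetical application error wrapping a lower-level parse failure.
#[derive(Debug)]
struct ConfigError {
    cause: std::num::ParseIntError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid port in config")
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

fn main() {
    let err = ConfigError {
        cause: "abc".parse::<u16>().unwrap_err(),
    };
    let fields = error_fields(&err);
    println!("error.message = {}", fields.message);
    println!("error.stack = {}", fields.stack);
}
```

Passing the error as &dyn Error, rather than a pre-formatted string, is what keeps the source() chain available to a layer like this.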