Skip to content

Do not use span.RecordError for terminating errors #7470

Open
@pellared

Description

@pellared

Why

What is the value of recording the error as event if the type is set as attribute and value (err.Error()) as status description?

An exception SHOULD be recorded as an Event on the span during which it occurred if and only if it remains unhandled when the span ends and causes the span status to be set to ERROR.

I argue that error result is not an exception. For me a panic that was not being recovered is an unhandled exception. We already cover this part of this specification here:

https://github.com/open-telemetry/opentelemetry-go/blob/bc531cb3f9ce9c328b12e473b6409784728dc608/sdk/trace/span.go#L465-L482

Additionally see: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/recording-errors.md#recording-errors-on-spans

Therefore, we should not call span.RecordError(err).

Originally posted by @pellared in #7254

My proposal is to not use span.RecordError for terminating errors per recording errors on spans semantic conventions.

During the SIG meeting, it was also noted that span.RecordError captures a stack trace. However, since the stack trace reflects where the error was recorded, not where it originated, I don’t find it particularly valuable. Additionally, capturing and emitting a stack trace has a performance cost. In most languages with exception handling (such as Java or .NET), the stack trace is already included in the thrown exception so the cost of capturing the stack trace was already taken and the stack trace is taken in the place where the exception was thrown.

@pellared does that mean span.RecordError should essentially never be called by instrumentation, since panic recovery is already handled by the SDK? Almost makes it seem like we should deprecate RecordError in the API,

I believe span.RecordError remains valuable, particularly for non-terminating errors. A common use case is retry logic, where errors occur but don't immediately end the span. This is especially relevant in the context of custom instrumentation, where developers have full control and can explicitly record such intermediate errors.

While instrumentation libraries typically just return errors, they could also benefit from span.RecordError if the underlying library provides hooks into retry or failure logic. In such cases, recording retry attempts as span events via span.RecordError gives users better visibility into transient issues without prematurely ending the span.

For these reasons, I don’t think span.RecordError should be deprecated. It might be worth capturing these use cases in the error recording semantic conventions (cc @lmolkova).

Side note: There is an OTEP open-telemetry/opentelemetry-specification#4333 that proposes to record exceptions/errors as log records instead of span events.

and document that End should handle recording exceptions for panics.

I think that documenting that span.End should handle recording exceptions for panics is a good idea.

In my opinion, this is how we implement recording an exception in the most possible user-friendly way.

Originally posted by @pellared in #7254

What

Remove usages of span.RecordError where there is a single error that terminates the span.

Impediments

There is no currently no easy to set error.type attribute.

Related issue:

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions