Skip to content

otelhttp: Cannot record error for spans when producing new semantic convention #7254

@XSAM

Description

@XSAM
Member

Description

otelhttp only uses span.RecordError(err) when errType is not empty

if err != nil {
// set error type attribute if the error is part of the predefined
// error types.
// otherwise, record it as an exception
if errType := t.semconv.ErrorType(err); errType.Valid() {
span.SetAttributes(errType)
} else {
span.RecordError(err)
}

But ErrorType method never returns an empty attribute.

func (n CurrentHTTPClient) ErrorType(err error) attribute.KeyValue {
t := reflect.TypeOf(err)
var value string
if t.PkgPath() == "" && t.Name() == "" {
// Likely a builtin type.
value = t.String()
} else {
value = fmt.Sprintf("%s.%s", t.PkgPath(), t.Name())
}
if value == "" {
return semconvNew.ErrorTypeOther
}
return semconvNew.ErrorTypeKey.String(value)
}

Expected behavior

I don't have a preference on whether we should record error.type with exception.type at the same time. But the comment and the logic here seem conflicted.

Activity

added theissue type on Apr 23, 2025
changed the title [-]otelhttp: Cannot record error when producing new semantic convention[/-] [+]otelhttp: Cannot record error for spans when producing new semantic convention[/+] on Apr 23, 2025
dmathieu

dmathieu commented on Apr 23, 2025

@dmathieu
Member

I don't know what was the intent there. But it feels like ErroTypeOther should be the case where we want to fallback to recording an error?

XSAM

XSAM commented on Apr 23, 2025

@XSAM
MemberAuthor

But it feels like ErroTypeOther should be the case where we want to fallback to recording an error?

I don't find a way to make it return ErrorTypeOther. In most cases, it returns semconv.ErrorTypeKey.String("*errors.errorString").

For instance, errors created by errors.New() like driver.ErrBadConn. The type is *errors.errorString.

I know it is difficult (or impossible) to get this via reflect, but do you think returning semconv.ErrorTypeKey.String("database/sql/driver.ErrBadConn") is the ideal result for driver.ErrBadConn?

where we want to fallback to recording an error?

I feel we should always make span recording the error, as this is the only way users can choose to record exception.message and exception.stacktrace. While only know the error type does not seem to help a lot, these two attributes are very helpful when debugging.

@open-telemetry/go-maintainers What do you think about this issue?

XSAM

XSAM commented on Apr 24, 2025

@XSAM
MemberAuthor

More information: https://opentelemetry.io/docs/specs/otel/trace/exceptions/ (not in the semconv doc)

The spec about exceptions says

An exception SHOULD be recorded as an Event on the span during which it occurred if and only if it remains unhandled when the span ends and causes the span status to be set to ERROR.

It even has a pseudo code that sets the error.type attribute for spans and calls recordException at the same time.

Span span = myTracer.startSpan(/*...*/);
try {
  // Code that does the actual work which the Span represents
} catch (Throwable e) {
  span.recordException(e);
  span.setAttribute(AttributeKey.stringKey("error.type"), e.getClass().getCanonicalName())
  span.setStatus(StatusCode.ERROR, e.getMessage());
  throw e;
} finally {
  span.end();
}

So, a good start for us is to change the code from

		// set error type attribute if the error is part of the predefined
		// error types.
		// otherwise, record it as an exception
		if errType := t.semconv.ErrorType(err); errType.Valid() {
			span.SetAttributes(errType)
		} else {
			span.RecordError(err)
		}

		span.SetStatus(codes.Error, err.Error())

to

		// ErrorType should always return a valid error.type` for error
		span.SetAttributes(t.semconv.ErrorType(err))
		span.RecordError(err)
		span.SetStatus(codes.Error, err.Error())
pellared

pellared commented on Jun 11, 2025

@pellared
Member

What is the value of recording the error as event if the type is set as attribute and value (err.Error()) as status description?

An exception SHOULD be recorded as an Event on the span during which it occurred if and only if it remains unhandled when the span ends and causes the span status to be set to ERROR.

I argue that error result is not an exception. For me a panic that was not being recovered is an unhandled exception. We already cover this part of this specification here:

https://github.com/open-telemetry/opentelemetry-go/blob/bc531cb3f9ce9c328b12e473b6409784728dc608/sdk/trace/span.go#L465-L482

Additionally see: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/recording-errors.md#recording-errors-on-spans

Therefore, we should not call span.RecordError(err).

dashpole

dashpole commented on Jun 12, 2025

@dashpole
Contributor

@pellared does that mean span.RecordError should essentially never be called by instrumentation, since panic recovery is already handled by the SDK?

Almost makes it seem like we should deprecate RecordError in the API, and document that End should handle recording exceptions for panics.

pellared

pellared commented on Jun 13, 2025

@pellared
Member

My proposal is to not use span.RecordError for terminating errors per recording errors on spans semantic conventions.

During the SIG meeting, it was also noted that span.RecordError captures a stack trace. However, since the stack trace reflects where the error was recorded, not where it originated, I don’t find it particularly valuable. Additionally, capturing and emitting a stack trace has a performance cost. In most languages with exception handling (such as Java or .NET), the stack trace is already included in the thrown exception so the cost of capturing the stack trace was already taken and the stack trace is taken in the place where the exception was thrown.

@pellared does that mean span.RecordError should essentially never be called by instrumentation, since panic recovery is already handled by the SDK? Almost makes it seem like we should deprecate RecordError in the API,

I believe span.RecordError remains valuable, particularly for non-terminating errors. A common use case is retry logic, where errors occur but don't immediately end the span. This is especially relevant in the context of custom instrumentation, where developers have full control and can explicitly record such intermediate errors.

While instrumentation libraries typically just return errors, they could also benefit from span.RecordError if the underlying library provides hooks into retry or failure logic. In such cases, recording retry attempts as span events via span.RecordError gives users better visibility into transient issues without prematurely ending the span.

For these reasons, I don’t think span.RecordError should be deprecated. It might be worth capturing these use cases in the error recording semantic conventions (cc @lmolkova).

Side note: There is an OTEP open-telemetry/opentelemetry-specification#4333 that proposes to record exceptions/errors as log records instead of span events.

and document that End should handle recording exceptions for panics.

I think that documenting that span.End should handle recording exceptions for panics is a good idea.

In my opinion, this is how we implement recording an exception in the most possible user-friendly way.

pellared

pellared commented on Jun 16, 2025

@pellared
Member

I created

I think it might be also good to add some "error recording" recommendations in https://pkg.go.dev/go.opentelemetry.io/otel/trace. However, we should first clarify it on the semantic conventions side.

I believe span.RecordError remains valuable, particularly for non-terminating errors. A common use case is retry logic, where errors occur but don't immediately end the span. This is especially relevant in the context of custom instrumentation, where developers have full control and can explicitly record such intermediate errors.

This seems not compliant with https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/recording-errors.md#recording-errors-on-metrics

Errors that were retried or handled (allowing an operation to complete gracefully) SHOULD NOT be recorded on spans or metrics that describe this operation.

Currently, the only use case I see the only use of span.RecordError to use it for "unhandled exceptions" (which for Go can be done in the SDK) and to record multiple errors like here:

if len(c.Errors) > 0 {
span.SetStatus(codes.Error, c.Errors.String())
for _, err := range c.Errors {
span.RecordError(err.Err)
}
}

I do not find the semantic conventions clear and probably some work is also needed there.

CC @lmolkova

XSAM

XSAM commented on Aug 3, 2025

@XSAM
MemberAuthor

Sorry for joining the discussion late.

@pellared I am confused about what we should do to handle errors, especially non-terminating errors. Should we do nothing if we encounter an error?

For instance, when facing code like this, do we do nothing for the error we receive?

ctx, span := tracer.Tracer(instrumentationName).Start(ctx, "example")
defer span.End()

err := query(ctx, db)
if err != nil {
	return err
}
return nil

Or, should we record this error?

ctx, span := tracer.Tracer(instrumentationName).Start(ctx, "example")
defer span.End()

err := query(ctx, db)
if err != nil {
	span.RecordError(err)
	return err
}
return nil

I don't think we should just ignore the error without recording it in the span. RecordError has semconv.ExceptionMessage(err.Error()) to record error message, which is always useful when debugging.

Are you trying to limit the usage of RecordError to only instrumentation? Why can the instrumentation record the error message?

pellared

pellared commented on Aug 4, 2025

@pellared
Member

For instance, when facing code like this,

It looks like a terminating error to me which falls into https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/recording-errors.md#recording-errors-on-spans:

When the operation ends with an error, instrumentation:

XSAM

XSAM commented on Aug 5, 2025

@XSAM
MemberAuthor

I see. So, your proposal is

ctx, span := tracer.Tracer(instrumentationName).Start(ctx, "example")
defer span.End()

err := query(ctx, db)
if err != nil {
        span.SetAttributes(semconv.ErrorType(err))
	span.SetStatus(codes.Error, err.Error())
	return err
}
return nil

It makes sense.

pellared

pellared commented on Aug 5, 2025

@pellared
Member

Correct

XSAM

XSAM commented on Aug 5, 2025

@XSAM
MemberAuthor

In such cases, recording retry attempts as span events via span.RecordError gives users better visibility into transient issues without prematurely ending the span.

Would it be strange to say we only record exception events when panic and retriable error happens? Retriable errors do not seem to be categorized as a type of exception.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

      Participants

      @dmathieu@dashpole@pellared@XSAM

      Issue actions

        otelhttp: Cannot record error for spans when producing new semantic convention · Issue #7254 · open-telemetry/opentelemetry-go-contrib