
feat: OTel context propagation#716

Open
AzHicham wants to merge 4 commits into apalis-dev:main from AzHicham:haz/otel

Contributor

@AzHicham AzHicham commented Mar 25, 2026

Description

  • Refactored TracingContext to carry W3C propagation headers (traceparent, tracestate) as opaque metadata.
  • Added OTel propagation behavior directly on TracingContext via Extractor/Injector.
  • Introduced:
    • TracingContext::current() to capture context from Span::current()
    • TracingContext::restore(&span) to restore parent context during task span creation
  • Updated ContextualTaskSpan to restore parent OTel context from task metadata.
  • Updated tracing example to use a real tracing-opentelemetry subscriber/tracer setup and print span context for verification.
  • Added integration coverage for producer→consumer propagation in:
    • apalis/tests/otel_context_propagation.rs

Type of Change

  • New feature (non-breaking change which adds functionality)

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • I have run the existing tests and they pass
  • I have run cargo fmt and cargo clippy

Checklist

  • My code follows the code style of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

  • Verified locally with:
    • cargo run -p tracing-example
    • cargo test -p apalis --test otel_context_propagation -- --nocapture
  • The tracing example now shows non-zero trace/span IDs and preserved traceparent on contextual task execution.

span_id: Option<String>,
trace_flags: Option<u8>,
trace_state: Option<String>,
traceparent: Option<String>,
Contributor Author

traceparent is enough to recreate the OTel context!
Do we really need tracestate?

Contributor Author

Maybe we should keep it for vendor-specific OTel context.

Member

I am curious about removing these fields. Aren't there cases where one would want to use the removed fields? Maybe we need a type called OtelTraceContext for OTel-specific contexts and keep TraceContext for non-OTel cases. I am not an expert in telemetry, so bear with me if I am not viewing this the right way.

Contributor Author

Well, I think you are right in a way!
TraceContext is too generic; we can rename it OtelTraceContext!
TraceContext could still be used for another purpose, but IMO not for forwarding trace_id, span_id, etc. Those are too OTel-specific.
Btw, traceparent contains all the removed fields, just formatted according to the OTel spec.
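
For readers following the thread, here is a minimal sketch of how a version-00 W3C `traceparent` value packs those fields. `TraceFields` and `to_traceparent` are hypothetical names used only for illustration and are not part of apalis; the IDs are the ones from the example logs posted in this PR. Note that `tracestate` travels as a separate header and is not encoded inside `traceparent`.

```rust
/// Hypothetical stand-in for the individual fields discussed above.
/// IDs are assumed to already be lowercase hex strings.
struct TraceFields {
    trace_id: String, // 32 hex characters
    span_id: String,  // 16 hex characters
    trace_flags: u8,  // e.g. 1 = sampled
}

/// Builds `00-<trace-id>-<parent-id>-<flags>`, the version-00 layout
/// of the W3C `traceparent` header.
fn to_traceparent(f: &TraceFields) -> String {
    format!("00-{}-{}-{:02x}", f.trace_id, f.span_id, f.trace_flags)
}
```

With the producer IDs from the logs and `trace_flags = 1`, this yields `00-9ff8092f18c1f5a09c72502004c8fc6a-6a1ccde6e61308cd-01`.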

Member

> TraceContext is too generic we can rename it OtelTraceContext !

I think we can have both and implement From<TraceContext> for OtelTraceContext.
This would also mean we don't change anything in apalis-core. OtelTraceContext can go in the apalis crate, behind the opentelemetry feature flag. What do you think?

Contributor Author

OK, but which one do we need to send with tasks?
Do you mean the user must be able to choose between TraceContext and OtelTraceContext? If yes, is From<TraceContext> for OtelTraceContext useless?

Contributor Author

The advantage of OtelTraceContext is that on creation via OtelTraceContext::current() the context is filled in automatically, if there is one, with no need to set each field. With TraceContext we must set them ourselves.

Contributor Author

One last remark: TraceContext contains, IMO, OTel-specific fields (trace_id, span_id, etc.). It may be better to have TraceContext as a wrapper over a HashMap for generic context transmission,
and to have OtelTraceContext for OpenTelemetry, with traceparent = trace_id + span_id + ...
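
A rough sketch of what such a HashMap wrapper could look like. The `Injector` and `Extractor` traits below are local stand-ins modelled on the shape of OpenTelemetry's propagation traits, defined here only so the example is self-contained; the `TraceContext` type is likewise an assumption and not the code in this PR.

```rust
use std::collections::HashMap;

/// Local stand-in for a propagation "injector" (assumption: simplified).
trait Injector {
    fn set(&mut self, key: &str, value: String);
}

/// Local stand-in for a propagation "extractor" (assumption: simplified).
trait Extractor {
    fn get(&self, key: &str) -> Option<&str>;
    fn keys(&self) -> Vec<&str>;
}

/// Generic context carrier: opaque key/value metadata attached to a task.
#[derive(Debug, Default, Clone)]
struct TraceContext(HashMap<String, String>);

impl Injector for TraceContext {
    fn set(&mut self, key: &str, value: String) {
        // Header names are case-insensitive, so keys are stored lowercased.
        self.0.insert(key.to_ascii_lowercase(), value);
    }
}

impl Extractor for TraceContext {
    fn get(&self, key: &str) -> Option<&str> {
        self.0.get(&key.to_ascii_lowercase()).map(String::as_str)
    }

    fn keys(&self) -> Vec<&str> {
        self.0.keys().map(String::as_str).collect()
    }
}
```

With this shape, a propagator only ever sees keys such as `traceparent` and `tracestate`, and the carrier itself stays free of OTel-specific fields.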

Member

> From<TraceContext> for OtelTraceContext is useless ?

Yes, From<TraceContext> for OtelTraceContext might not be that valuable.

But I am thinking of cases where one wants to transfer the context from one application to another, or a scenario where OtelTraceContext::current() might not cut it (maybe it's not the same span).

To use the OtelTraceContext, we can have something like:

.layer(TraceLayer::new().make_span_with(OtelContextualSpan::new()))
.layer(OpenTelemetryMetricsLayer::default())

and a user can do:


async fn produce_task_with_ctx(storage: &mut JsonStorage<Email>) -> Result<()> {
    let email = Email {
        to: "test@example".to_string(),
        text: "Test background job from apalis".to_string(),
        subject: "Welcome Sentry Email".to_string(),
    };
    // This might come from a http request etc
    let context = OtelTraceContext::current();
    let task = Task::builder(email).meta(context).build();
    storage.push_task(task).await?;
    Ok(())
}

In some backends, such as Redis and Postgres, metadata is stored in a k->v structure, which makes it easier to query than a single string.

I think we might also want something like FromStr for OtelTraceContext (or OtelTraceContext::new(string)) for flexibility, since one cannot always build the context from the current scope.
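
As an illustration of the FromStr idea, here is a minimal sketch. The `OtelTraceContext` below is a hypothetical stand-in that only holds the raw `traceparent` string; it is not the type from this PR. `InvalidTraceparent` and `is_lower_hex` are also made-up names. The validation covers only version `00` of the W3C format, which is an assumption made to keep the example short.

```rust
use std::str::FromStr;

/// Hypothetical stand-in: stores a validated `traceparent` as an opaque string.
#[derive(Debug, Clone, PartialEq)]
struct OtelTraceContext {
    traceparent: String,
}

/// Hypothetical error type for a malformed `traceparent`.
#[derive(Debug, PartialEq)]
struct InvalidTraceparent;

/// True if `s` is exactly `len` lowercase hex characters.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

impl FromStr for OtelTraceContext {
    type Err = InvalidTraceparent;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Expected layout: 00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>
        let parts: Vec<&str> = s.split('-').collect();
        let valid = parts.len() == 4
            && parts[0] == "00"
            && is_lower_hex(parts[1], 32)
            && is_lower_hex(parts[2], 16)
            && is_lower_hex(parts[3], 2)
            // All-zero trace and parent IDs are invalid in the W3C format.
            && parts[1].bytes().any(|b| b != b'0')
            && parts[2].bytes().any(|b| b != b'0');

        if valid {
            Ok(Self { traceparent: s.to_string() })
        } else {
            Err(InvalidTraceparent)
        }
    }
}
```

A caller that receives the header from another application could then write `header.parse::<OtelTraceContext>()` instead of relying on the current span.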

> it may be better to have TraceContext as a wrapper over a HashMap for generic ctx transmission

Sure, I will look into this.

Member

I see you have pushed some of the changes. Let me look through them.

pub fn with_trace_flags(mut self, trace_flags: u8) -> Self {
self.trace_flags = Some(trace_flags);
self
pub fn current() -> Self {
Contributor Author

@AzHicham AzHicham Mar 25, 2026

Maybe this should be the new fn?
fn new is currently private... it's needed only for a test.

}
};

tracing_ctx.restore(&span);
Contributor Author

magic happens here


impl Error for InvalidEmailError {}

fn print_otel_context(message: &str) {
Contributor Author

This will generate this kind of log:

INFO produce_task: produce_task trace_id=6e288a541db0109a841fada148b7bfc7 span_id=fc6ace54ead0dc2b
INFO produce_task_with_ctx: produce_task_with_ctx trace_id=9ff8092f18c1f5a09c72502004c8fc6a span_id=6a1ccde6e61308cd
...
INFO email_service trace_id=b60bb828c0c54dca3ea7063919acc0f3 span_id=3a2e9333c835397b
INFO email_service trace_id=9ff8092f18c1f5a09c72502004c8fc6a span_id=9da96fa05218cc5e

As expected, produce_task does not transmit the OTel context, so we do not see its trace_id in email_service.
BUT produce_task_with_ctx, running with trace_id 9ff8092f18c1f5a09c72502004c8fc6a, forwards the OTel context, resulting in the same trace_id in email_service, as expected.

@AzHicham AzHicham marked this pull request as ready for review March 25, 2026 21:25
@AzHicham AzHicham requested a review from geofmureithi as a code owner March 25, 2026 21:25
.with_trace_flags(1)
.with_trace_state("key=value");
// Capture whichever tracing context is currently active.
let _guard = tracing::info_span!("enqueue-email").entered();
Member

Shouldn't there already be an existing span in this case? Is there a reason to enter a new span here?

Contributor Author

Not really. A span is not created automatically; one must either:

  • create it with a tracing::xxx_span! macro
  • use the #[instrument] macro
  • or use actix or axum (or any other lib) with tracing enabled to delegate span creation, on each request for example

I have removed it from this example (monitor) anyway because it is not really related to OTel (the tracing example is better).

Member

geofmureithi commented Mar 26, 2026

Thanks for this contribution!
My main question is here @AzHicham

Contributor Author

AzHicham commented Mar 26, 2026

> Thanks for this contribution! My main question is here @AzHicham

I have pushed 5e89d52 with your idea of splitting generic TraceContext and OtelTraceContext :)

};

#[cfg(feature = "opentelemetry")]
OtelTraceContext::from(tracing_ctx).restore(&span);
Member

Hmmm.. Shouldn't OtelTraceContext have its own specific span?

Contributor Author

@AzHicham AzHicham Mar 26, 2026

What do you mean by its own specific span?
The span is created here before running a task. Then we restore the context by setting a parent, i.e. we tell the newly created span that the trace_id is the one coming from the producer; the span_id is new, and the parent span is the one coming from the producer.
Then we "enter" the span.
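
To make the ID relationship concrete, here is a small model of what restoring the parent amounts to. It does not call tracing or OpenTelemetry; `SpanIds` and `restore_parent` are hypothetical names, and the new span ID is passed in by the caller rather than generated. The values in the usage note come from the logs posted earlier in this PR, assuming the producer span was the active one when the context was captured.

```rust
/// Hypothetical view of the identifiers a consumer-side span ends up with.
#[derive(Debug, PartialEq)]
struct SpanIds {
    trace_id: String,
    span_id: String,
    parent_span_id: Option<String>,
}

/// Derives the consumer span's identifiers from the producer's `traceparent`:
/// the trace_id is inherited, the span_id is fresh, and the producer's
/// span_id becomes the parent. Returns None if the value has too few parts.
fn restore_parent(traceparent: &str, new_span_id: &str) -> Option<SpanIds> {
    let mut parts = traceparent.split('-');
    let _version = parts.next()?;
    let trace_id = parts.next()?;
    let producer_span_id = parts.next()?;

    Some(SpanIds {
        trace_id: trace_id.to_string(),
        span_id: new_span_id.to_string(),
        parent_span_id: Some(producer_span_id.to_string()),
    })
}
```

Calling `restore_parent("00-9ff8092f18c1f5a09c72502004c8fc6a-6a1ccde6e61308cd-01", "9da96fa05218cc5e")` gives a span that shares the producer's trace_id, has its own span_id, and points back to `6a1ccde6e61308cd` as its parent.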

/// This type provides a clearer API surface for users that explicitly work with
/// OpenTelemetry context propagation, while preserving `apalis-core` compatibility.
#[derive(Debug, Default, Clone)]
pub struct OtelTraceContext(pub(crate) TracingContext);
Member

I like this approach
