Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate OTEL context between ZIO and non-ZIO code #580

Merged
merged 4 commits into from
Oct 10, 2022

Conversation

dmytr
Copy link
Contributor

@dmytr dmytr commented Sep 6, 2022

Motivation

At the moment OTEL context is not propagated between ZIO and non-ZIO code. Of course, there are functions from scopedEffect family provided by Tracing, but they expect an unsafe size-effecting code as their argument.

In case if you need to propagate context into libraries that wrap non-ZIO code into ZIO, but don't use tracing - there is no guarantee that underlying non-ZIO code will have access to the correct context.

And in cases when non-ZIO code invokes ZIO code (for example in zhttp, which starts fibers from Netty code) context is not (correctly) propagated as well.

This change makes it possible to use automatic instrumentation provided by OpenTelemetry JVM agent.

Related: #561

How this works

This solution uses a special ZIO supervisor that takes a snapshot of the current OpenTelemetry context before a new ZIO fiber is started and uses it as a parent context in the scope of that fiber.

Each time a fiber is suspended, supervisor takes a fresh snapshot of the current context and then restores the root context. When fiber is resumed, it restores the last context snapshot for that fiber.

When fiber is completed, supervisor restores the root context.

Probably, this is not the most correct or efficient implementation, but it does seem to work.

There is an end-to-end setup that uses this kind of supervisor here: https://github.com/dmytr/zio-otel-instrumentation.

import io.opentelemetry.context.Context
import zio._

import java.util.concurrent.ConcurrentHashMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it can be replaced with https://zio.dev/reference/sync/concurrentmap. As a consequence, you will have Supervisor[UIO[Unit]] and @nowarn annotation might be omitted.

Copy link
Contributor Author

@dmytr dmytr Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid we'll have to stick to the regular ConcurrentHashMap because all callbacks in Supervisor are unsafe and return Unit. Unsafe side effects have to be executed within those callbacks for context to be properly propagated into non-ZIO code.

Copy link
Contributor Author

@dmytr dmytr Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nowarn basically tells us that we are throwing away Scope that we get when we call Context.makeCurrent(). Ideally we shouldn't do this. I saw that other tracing libraries (like Datadog's tracing SDK) use stacks of scopes. So potentially we could change storage to be of type ConcurrentHashMap[FiberId, Stack[Scope]]. But this is kind of involved and apparently this all works even when we do throw scopes away.

@grouzen
Copy link
Contributor

grouzen commented Sep 15, 2022

@dmytr Thanks for your contribution! Sorry for the delay, I honestly need some time to process this PR.

@grouzen
Copy link
Contributor

grouzen commented Sep 29, 2022

Thanks again, @dmytr, for your contribution! It looks good to me.

Copy link
Contributor

@easel easel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice contribution. Given the interface surface area change is very small, I'd advocate for merging and releasing soon to make it easier to get more feedback. In general, I think the approach is the right one -- configure the zio runtime appropriately and context propagates as expected into and out-of the zio runtime.

@dmytr dmytr marked this pull request as ready for review October 4, 2022 17:56
@dmytr dmytr requested a review from a team as a code owner October 4, 2022 17:56
@grouzen
Copy link
Contributor

grouzen commented Oct 10, 2022

@dmytr Hey! Can we merge it?

@dmytr
Copy link
Contributor Author

dmytr commented Oct 10, 2022

@grouzen yes, please. I don't have access to do this.

@grouzen grouzen merged commit fcbd309 into zio:series/2.x Oct 10, 2022
@dmytr dmytr deleted the otel-supervisor branch October 10, 2022 17:53
@dmytr
Copy link
Contributor Author

dmytr commented Oct 10, 2022

I've updated my test harness to use the published snapshot version: dmytr/zio-otel-instrumentation#2
It works as expected in that specific setup 🥳

@grouzen
Copy link
Contributor

grouzen commented Oct 10, 2022

I'll cut a release ASAP.
Update:
https://github.com/zio/zio-telemetry/releases/tag/v2.0.3

Doikor added a commit to Doikor/zio-telemetry that referenced this pull request Oct 18, 2022
It was reverted to the old style by accident (I think?) in zio#580
Doikor added a commit to Doikor/zio-telemetry that referenced this pull request Oct 18, 2022
It was reverted to the old style by accident (I think?) in zio#580
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants