Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an OpenTelemetry context mechanism. #1658

Merged
merged 17 commits into from Oct 5, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Sep 17, 2020

We have had an open issue, #575, to find an alternative to an external dependency for context propagation. As the spec states that for languages with no built in mechanism, like Go, or no obviously popular library, like JS's cls-hooked, we should create our own. So let's follow the spec and move on.

The API is heavily inspired by gRPC and Armeria. Much code / docs are borrowed too and I kept license headers in tact to reflect that.

  • Context is immutable. However, ContextStorage.rootContext() is customizable to allow storing a mutable value at the very earliest point possible, effectively making context mutable if that is needed by a user. One reason someone may do this is to store their context in ours
  • attach returns a try-with-resources Scope instead of having a detach
  • Getting values is from the context, not from the key (personally find more natural)
  • Wrappers for pretty much any callback-like type in JDK. If targeting JDK8 could add even more for the completable stuff
  • Context storage can be customized using SPI with normal mechanisms, with less of the trickiness of mechanisms like gRPC

One key point is that because storage is customizable, this can be used fine with any other context mechanism, it is just the API OpenTelemetry happens to use. If a user prefers a different mechanism, it is not that difficult to delegate our API to them and users can keep on using what they're used to. We can provide extensions implementing using other context propagation libraries anytime.

Give it a first look to see if it generally looks ok, and I'll add tests, etc.

/cc @zoercai

public final class Context {

/** Returns the root {@link Context} which all other {@link Context} are derived from. */
public static Context root() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

What is the motivation for exposing the root? When is this useful?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment. I'd be happier if we don't expose it now, in order to keep things simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use it in instrumentation unfortunately. We're not supposed to, but there are cases where our automatic propagation doesn't work correctly and we have to force a new local root

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/7e41b516ee793f2cb12c4f0893ec00c66837f7ee/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/tracer/HttpServerTracer.java#L257

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to call it root? Perhaps empty makes more sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed this comment - @tylerbenson I'm not sure about empty because a context override could change the default context to include something (needed if someone needs to stuff mutable state in there). I could rename to defaultContext() (not default() because Java), but any thoughts on which sounds best?

}

/** Returns a new context with the given key value set. */
public <V1, V2, V3> Context withValues(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all the overrides at this point? I'd like to figure out how minimal we can make this API for starters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with removing for now - but IIRC we did already have a case of adding two values at the same time in instrumentation. That doesn't mean the chaining mechanism is that much worse, but a bit less efficiency.

FWIW, I was half-jokingly thinking of going up to 5 values to have something "better" than gRPC API :P

@jkwatson
Copy link
Contributor

This looks like a great start! Thanks for pulling this together so quickly!

* assert Context.current() == prevCtx;
* }</pre>
*/
public Scope attach() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Definitely something I've always wanted to have in grpc.Context

@carlosalberto
Copy link
Contributor

Hey @anuraaga, thanks for prototyping this out.

As the spec states that for languages with no built in mechanism, like Go, or no obviously popular library, like JS's cls-hooked, we should create our own. So let's follow the spec and move on.

Historically this was added to cover up Ruby, as there was no existent implementation, as opposed to Java, where there are more than a few bullet-proof implementations (although it got slightly hard, slightly political, which one to settle down with).

However, ContextStorage.rootContext() is customizable

As commented, is there any specific case for which we truly need this?

Wrappers for pretty much any callback-like type in JDK

We originally had similar wrappers in Tracer itself (inherited by OC), but decided to move them to its own tracer-propagation package under extensions/. We may need to do it later on, or at least have them in a separate class/package in the same artifact (so the core Context API stays simple).

Give it a first look to see if it generally looks ok, and I'll add tests, etc.

I'd advise to not go further till this idea overall has been accepted, or otherwise you may lose cycles if eventually this change is not accepted 😢

IMHO this is a good way to go, even though we 'duplicate' the grpc.Context logic - otherwise fine. But I'm wondering about the typical case that @bogdandrutu used to address for these kind of changes, regarding the double wrapping.

Will include this PR in tomorrow's agenda so we can discuss it properly.

Disclaimer: I had a hyper simple prototype long ago, where you had a simple Context API, implementing almost nothing, simply falling back to grpc.Context or any other implementation, such as Reactor's, which would have been easier to maintain. Of course, the advantage of this very PR is that we provide a full, out-of-the-box implementation.

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically this was added to cover up Ruby, as there was no existent implementation, as opposed to Java, where there are more than a few bullet-proof implementations (although it got slightly hard, slightly political, which one to settle down with).

Quoting the spec again "Languages are expected to use the single, widely used Context implementation if one exists for them. In the cases where an extremely clear, pre-existing option is not available, OpenTelemetry MUST provide its own Context implementation."

There's obviously no extremely clear option, hence the tension - perhaps Ruby was involved in the initial thought process, but it seems like a nice spec statement to avoid the exact political issues you mention. As this issue has been sitting too long, I encourage everyone just taking a fresh take, and the simplest solution is to just do what the spec says, provide our Context implementation.

As commented, is there any specific case for which we truly need this?

Yeah this is important for interop with other contexts - if the context is immutable, the only way to interop with it from another system is when it's attached, as that's when it's finally in a threadlocal. It may never be attached, or is often too late when it is. So because this has a goal of interop, allowing interception at the moment a "context is created" is very important. @trask mentioned he has a use case for this as well.

We originally had similar wrappers in Tracer itself (inherited by OC), but decided to move them to its own tracer-propagation package under extensions/.

I can't find the wrappers in trace_propagators extension - maybe we're referring to something different here. These are just the way to propagate context using JDK's mechanisms like Runnable. Context propagation is the most important feature of context, I'd say they're first-class APIs. Having them in the same class or not is just about usability of the API - I'd say having them on Context directly is the easiest for users, alternative would probably be something using static methods.

FWIW, since this is OpenTelemetry Context, I'm interesting in not keeping it too simple but making it easy to use for OTel users. This means merging TracingContextUtils in as well in the future - currently it's a bunch of static methods just living in the middle of nowhere. They, ahem, lack context making them a hard API to use in my experience in the instrumentation.

Will include this PR in tomorrow's agenda so we can discuss it properly.

Thanks - hoping to get this issue done with so we can launch!

public final class Context {

/** Returns the root {@link Context} which all other {@link Context} are derived from. */
public static Context root() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use it in instrumentation unfortunately. We're not supposed to, but there are cases where our automatic propagation doesn't work correctly and we have to force a new local root

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/7e41b516ee793f2cb12c4f0893ec00c66837f7ee/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/tracer/HttpServerTracer.java#L257

}

/** Returns a new context with the given key value set. */
public <V1, V2, V3> Context withValues(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with removing for now - but IIRC we did already have a case of adding two values at the same time in instrumentation. That doesn't mean the chaining mechanism is that much worse, but a bit less efficiency.

FWIW, I was half-jokingly thinking of going up to 5 values to have something "better" than gRPC API :P

@carlosalberto
Copy link
Contributor

carlosalberto commented Sep 18, 2020

the simplest solution is to just do what the spec says

Sadly no. The simplest solution would be to use the solid, robust, well understood and independent grpc.Context implementation (Reactor's context would have been another possibility, but the fact it does not exist as a standalone artifact doesn't make it eligible).

Overall, I just want to make clear that staying with grpc.Context is not the end of the world (in the bad scenario of not agreeing on taking this path).

Context propagation is the most important feature of context, I'd say they're first-class APIs.

I prefer having them in its own class or package, just as grpc.Context does. I think having all of this functionality in the same artifact is enough to make it a first-class component. But that's a detail we can discuss later as well.

This means merging TracingContextUtils in as well in the future

This may be hard. We really don't want to have any bit of trace(or baggage or any other component) in the Context layer, as discussed in previous prototypes, like one that @trask had (although I agree that we need to make TraceContextUtil cleaner).

Thanks again for the prototype and hope we can come to a decision after discussing it with @bogdandrutu tomorrow!

* take care to avoid excessive dependence on context when designing an API.
* </ul>
*/
public final class Context {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this an interface, now that we're going to java 8?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on this, @anuraaga ?

@bogdandrutu
Copy link
Member

Just to make sure it is documented, and I can tell everyone "I told you" :)

This will make manual instrumentation a nightmare when it comes to deal with third party libraries instrumented with other Context, not that gRPC was solving that, but we were aiming for unification, now we are encouraging more and more contexts.

otherThread.submit(io.grpc.Context.current().wrap(runnable)).get();
assertThat(grpcValue).hasValue("japan");

// Since gRPC context is inside the OTel context, propagating gRPC context does not
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using storage override with immutable contexts, this is where there can be gaps - you pick one context propagation mechanism and go with it.

However, since Context is an interface now it would be fairly simple to allow providers to implement that too to close this gap without any breaking changes. We do still need a default context, DefaultContext here for users of libraries with no context propagation mechanism.

/**
* The storage for storing and retrieving the current {@link Context}.
*
* <p>If you want to implement your own storage or add some hooks when a {@link Context} is attached
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this easier without SPI we can add a static method that, when called close to app startup, does similar configuration (probably similar to what setResource type of things will end up at in the future). Left it out for now so this currently isn't completely trivial.

* <p>Inspired by popcnt-based compression seen in Ideal Hash Trees, Phil Bagwell (2000). The rest
* of the implementation is ignorant of/ignores the paper.
*/
final class PersistentHashArrayMappedTrie {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is actually faster than a normal CoW array for our propagation cases. Will benchmark and examine that in the future.

@anuraaga anuraaga marked this pull request as ready for review September 25, 2020 09:19
@codecov
Copy link

codecov bot commented Sep 25, 2020

Codecov Report

Merging #1658 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1658      +/-   ##
============================================
- Coverage     85.17%   85.14%   -0.04%     
  Complexity     1351     1351              
============================================
  Files           166      166              
  Lines          5276     5276              
  Branches        549      549              
============================================
- Hits           4494     4492       -2     
- Misses          578      579       +1     
- Partials        204      205       +1     
Impacted Files Coverage Δ Complexity Δ
...try/sdk/metrics/aggregator/LongMinMaxSumCount.java 96.07% <0.00%> (-3.93%) 6.00% <0.00%> (ø%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed16964...62131e2. Read the comment docs.

@FroMage
Copy link

FroMage commented Sep 25, 2020

Hi,

Following up on the discussion about MicroProfile-Context-Propagation, I see that with this PR, MP-CP will be able to propagate your context to users of MP-CP, which is great.

On the other hand, the lambda wrappers and your ContextExecutorService will not allow you to execute tasks that will have more than your context.

So, for instance, if we use open-telemetry in a Java project that relies on many contexts, as long as we're using MP-CP we will be able to propagate the CDI, JAX-RS, Transaction and OpenTelemetry contexts, which is great. But if OpenTelemetry runs a user task, they will get random contexts for CDI, JAX-RS and Transaction, because you only propagate the OpenTelemetry context.

Now, I don't know enough about open-telemetry to guess if that will matter, because it depends what kinds of tasks you are wrapping. If they're internal tasks to open-telemetry, that don't call user code, that'll be fine. If on the other hand, they do call user code, the integrator will have to go through hoops to pre-wrap the user code with MP-CP to propagate every context (excluding the OpenTelemetry one, I assume, to not have to restore it twice) before passing that lambda to OpenTelemetry.

I do think it's a pity to see OpenTelemetry re-implement parts of MP-CP (the wrapping and executor service) but only for a single context, when a standard solution exists and could be used (from what I see in this PR), but this is probably due to your JDK7 requirement.

@kenfinnigan
Copy link
Member

I found out on another issue that JDK 8 is now in play, so maybe it's possible to re-evaluate using MP-CP as a solution?

@jkwatson
Copy link
Contributor

I found out on another issue that JDK 8 is now in play, so maybe it's possible to re-evaluate using MP-CP as a solution?

I think the original motivation still precludes this. MP-CP might be super awesome and the new hotness, but it's definitely not "the single, widely used Context implementation" for java at this point.

@kenfinnigan
Copy link
Member

And that's totally fine, wanted to raise the possibility but I understand the reasoning

@jkwatson
Copy link
Contributor

This looks great. I don't think this module should be in the extensions directory, though, as it will be a top-level construct needed by the API. I think it should just be at the top level.

@kenfinnigan
Copy link
Member

If this were approved, would it replace the dependency on opentelemetry-context-prop in opentelemetry-api?

If so, should this work move from the extension to opentelemetry-context-prop to replace what's there?

Or is the intention to support both?

@jkwatson
Copy link
Contributor

If this were approved, would it replace the dependency on opentelemetry-context-prop in opentelemetry-api?

If so, should this work move from the extension to opentelemetry-context-prop to replace what's there?

Or is the intention to support both?

The plan would be to have opentelemetry-context-prop depend on this, rather than on grpc-context. opentelemetry-api would continue to depend on the context propagation package, I believe.

@kenfinnigan
Copy link
Member

Thanks @jkwatson, from taking another look at the code that makes sense.

Overuse of the same term for different pieces was throwing me.

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this module should be in the extensions directory, though, as it will be a top-level construct needed by the API.

Yeah figured we hadn't quite decided a final home for this yet so had it in extensions for incubation, but went ahead and moved to top-level.

While not explored at all, I see a possibility of putting the interfaces in API and DefaultContext and friends in SDK even :P

Overuse of the same term for different pieces was throwing me.

I'm always confused too :) The ContextUtils will go away, and I think we can merge the propagators into API. @jkwatson do you know if they need to be a separate artifact?

import java.util.logging.Level;
import java.util.logging.Logger;

public class ContextStorageOverride extends Context.Storage {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's public I don't think IntelliJ will try to delete it (if it was suggesting me an annotation I'd add it since don't know what it is :P). Added a comment though

Copy link
Contributor

@jkwatson jkwatson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one more extension point and an example of directly using the grpc context from the otel api when that interop is enabled. This allows both gRPC or OTel API to be used for context propagation without chance of data loss.

* Returns a {@link ContextKey} for the given name. This is only useful when integrating with a
* separate context propagation mechanism, where
*/
default <T> ContextKey<T> contextKey(String name) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this extension point to allow customizing the implementation of a key to match a storage.


otherThread.submit(Context.current().wrap(runnable)).get();

assertThat(grpcValue).hasValue("japan");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works

@anuraaga
Copy link
Contributor Author

I added a couple of examples of context interop with Brave. Since Brave is a tracing library, the interop will require more, such as a way to map spans to/from OTel and Brave, or at least just allow parent/child between them. But figured it could provide some hints /cc @marcingrzejszczak

Copy link
Contributor

@jkwatson jkwatson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it!

@carlosalberto
Copy link
Contributor

Hey, I will do a late review first thing tomorrow. Sorry for the delay.

*/
public interface ContextStorageProvider {

/** Returns the {@link ContextStorage} to use to store {@link DefaultContext}. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is DefaultContext really the expected one? Shouldn't be Context instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh thought I caught all these, thanks!

@carlosalberto
Copy link
Contributor

Overall LGTM - my only problem is that I think the Brave/Grpc.Context parts should exist in an extension package, as they are NOT really needed (nor will be) by the majority of users. We already have exporters in a separate package, so logically speaking we should do the same ;)

PS - Let's also give @bogdandrutu some time to review.

@anuraaga
Copy link
Contributor Author

@carlosalberto The gRPC / brave are both examples and also verification of some of the interop parts, so I put them in tests, they're not in the module. I think it's worth keeping there to be able to verify interop pieces while iterating, if they also make sense to present to users (which they probably will at some point), then I'll move them to the extensions.

Yesterday Bogdan mentioned he's generally ok with this and working with this approach - @bogdandrutu can you confirm you're ok with it? Let me know if you want to review this more :)

@carlosalberto
Copy link
Contributor

The gRPC / brave are both examples and also verification of some of the interop parts, so I put them in tests, they're not in the module.

Oh, my bad. Verified this is the case. LGTM.

@carlosalberto carlosalberto self-requested a review October 1, 2020 17:30
@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 5, 2020

@bogdandrutu You asked for one day and got three - going to go ahead and merge but let me know if anything comes up.

@anuraaga anuraaga merged commit 21fbb36 into open-telemetry:master Oct 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants