Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracer refresh #576

Merged
merged 17 commits into from
Apr 10, 2019
Merged

Tracer refresh #576

merged 17 commits into from
Apr 10, 2019

Conversation

ivantopo
Copy link
Contributor

@ivantopo ivantopo commented Apr 1, 2019

⚠️ Warning: This PR builds on top of #574 and should not be merged until that PR is merged.

Background and Motivations

Similarly to other PRs raised over the last few weeks, this PR was rooted on the fact that we are breaking a few things on Kamon 2.0 and there has been a number of changes we wanted to do since 1.0 came out so better now than ever.

Changes in this PR

Remove the SpanContext

The information that used to be contained by the SpanContext (containing spanId, traceId, parentId and sampling decision) has been split it two different places:

  • The Span, which now has .id and .parentId members
  • A new Trace abstraction that contains information that is common to all spans in the same trace. This new Trace is a tiny class containing the .id and .samplingDecision members.
    One of the motivations of this change was confusion between the Context a the SpanContext, which are not the same thing but usually make people think whether they should be propagating one or the other. Also, the SpanContext had sort of a mixed level of abstraction since some information there should never change between Spans (traceId and samplingDecision) and some other must always change between Spans (id and parentId) so it seemed only logical to have the Span properties as part of the Span and the shared properties in a different abstraction, like the new Trace class. Finally, having this distinction can allow us to introduce Trace-related features without changing the Span interface, like "debug" traces which we might be adding pretty soon.

Remove SpanCustomizer in favor of PreStart/PreFinish transformations

The SpanCustomizer had a noble goal: allow users to provide customizations to be applied to Spans that are created beyond their control, by adding a SpanCustomizer instance to the current Context and making the instrumentation look for them and apply. One example would be, adding a SpanCustomizer to the current context right before an HTTP or JDBC call, so that the user could modify the characteristics of that Span created via automatic instrumentation. This falls short on situations where the instrumentation might be creating several Spans or the creation happens before the execution hits user code (like on the server side of operations). So, we retired the SpanCustomizer and added to types of transformations to the tracer itself:

  • PreStartTransformations are called right before a SpanBuilder will turn into a started Span.
  • PreFinishTransformations are called right before a Span is turned into a Finished Span.
    With these two types of transformations pretty much any change can be applied to the Spans, without directly modifying user's code so this wins over the old SpanCustomizer.

Additional information on Spans

Now Spans have a few extra bits of information:

  • Kind which tells whether a Span is representing a Server, Client, Producer, Consumer or Internal operation. Somehow this information was always there, but as a Span tag that might or might not be added by instrumentation. Starting with this PR there is a explicit place for the Span Kind and it should be always provided.
  • Position which tells where does a Span sit regarding to the trace it belongs to. A Span could be a Root, a LocalRoot or in a Unknown position. I have to say that I am slightly torn about this addition. I think it will be very useful in the near future when we get into showing Spans on the Status Page so I would like to introduce it right away but other than "i'm sure we will use it", I don't have a major motivation for this.

Changes to the SpanBuilder and Span APIs

Notable changes are:

  • We are now using the TagSet abstraction for both metric and span tags. This was introduced on Introduce a common abstractions to handle tags #572 and with this PR we are even introducing a faster TagSet.Builder implementation.
  • We are no longer using the .withXXX method names. As a convention throughout the codebase, we will only use .withXXX methods when they return a new, independent copy of the object on which the method was called and that was not the case with the SpanBuilder interface as it always returning the same instance (itself).
  • Now it is possible to configure whether Spans will get a tag with the full error stacktrace object or not.

SpanReporters now can differentiate Span from Metric tags

Before this PR all tags were put together in a single map, making it impossible for reporters to figure out which tags were used to track metrics and which ones were not. Now they are passed on to reporters as separate TagSet instances.

Introducing a new AdaptiveSampler

This was a quite challenging one. There were a few recurring issues we wanted to address with this sampler:

  • Being able to "mute" certain operations that are not relevant for tracing (like, status checks).
  • Being able to give more/less priority to particular operations. Things like "get me at least 100 traces of operationA and no more than 50 of operationB"
  • Being able to set a global limit on the amount of sampled traces coming out of the service.
    All of the requirements above are addressed by the new adaptive sampler shipped with Kamon. I very much would like to have it as the default sampler in Kamon 2.0, but we will most likely deploy it to our APM production environment when the first RC comes out and decide based on how it behaves.

Samplers now receive a SpanBuilder instance

Instead of getting the operation name and tags as separate parameters, Samplers are now getting a SpanBuilder instance from which they can read the information they need to take a decision (if any) and, importantly, since they get access to a SpanBuilder, Samplers could now add additional information to Spans like the probability used or any other relevant information.

Additional xxxSpanBuilder methods on the Tracer

We want to encourage all instrumentation developers to provide the right Span Kind and the "component" metric tag, since they help figure out which parts of the instrumentation are producing more/less metrics and having a proper SpanKind helps mapping operations properly on the SpanReporters, so we added .serverSpanBuilder(operationName, component), .clientSpanBuilder.... and so on variations to the tracer. Users can still just create a plain SpanBuilder as before, but we will encourage use of these specialized versions as much as possible.

IdentityProvider is now IdentifierScheme

This is just a naming change, since the previous name didn't seem very much appropriate. We also added simpler versions to the configuration: instead of providing a FQCN for the known schemes you could simply configure them as single or double.

@jtjeferreira
Copy link
Contributor

Remove SpanCustomizer in favor of PreStart/PreFinish transformations

The SpanCustomizer had a noble goal: allow users to provide customizations to be applied to Spans that are created beyond their control, by adding a SpanCustomizer instance to the current Context and making the instrumentation look for them and apply. One example would be, adding a SpanCustomizer to the current context right before an HTTP or JDBC call, so that the user could modify the characteristics of that Span created via automatic instrumentation. This falls short on situations where the instrumentation might be creating several Spans or the creation happens before the execution hits user code (like on the server side of operations). So, we retired the SpanCustomizer and added to types of transformations to the tracer itself:

Can you provide an example of an PreStartTransformation for JDBC or play-ws?

@ivantopo
Copy link
Contributor Author

ivantopo commented Apr 1, 2019

Hey @jtjeferreira, I see a few ways in which people using this could migrate, all depending on how involved the original customizer was. If we just look at the common case which was providing a decent operation name, you would have done it like this with Kamon 1.1:

Kamon.withContextKey(
  SpanCustomizer.ContextKey, 
  SpanCustomizer.forOperationName("myCustomName")) {
      // here goes the code that ends up making the JDBC or HTTP client call
  }

In that case, the forOperationName customizer would just apply the provided name as the operation name on the Span. You could replicate the same behavior by having a PreStartTransformation that searches for a custom operation name in the Context and applies it. If you decided to do that with a Context tag it would look like this:

Kamon.withContextTag("customOperationName", "myCustomName") {
  // here goes the code that ends up making the JDBC or HTTP client call
}

And your transformation would look like this:

class OperationNameTransformation extends PreStartTransformation {
  def transform(spanBuilder: SpanBuilder): SpanBuilder =
    Kamon.currentContext()
      .getTag(option("customOperationName"))
      .fold(spanBuilder)(customName => spanBuilder.name(customName))
}

That would be doing the exact same thing. Your question made me think that maybe we can ship this 5-liner with Kamon core, just not enabled by default, so it would be easier to migrate for people using this.

Couple disclaimers:

  • I did not compile that code, just wrote it out of my head but you get the idea.
  • There is no Kamon.withContextTag(...) at the moment but I just noticed how useful it would be so will be adding it soon.

@jtjeferreira
Copy link
Contributor

Thanks for the detailed example. What about the alternative of adding the Transformation to the context in code instead of configuration, like this:

Kamon.withContextTag("customTransformation", new OperationNameTransformation()) {
  // here goes the code that ends up making the JDBC or HTTP client call
}

I could even pass some context to the transformation so that I can use it to customize the span.

Kamon.withContextTag("customTransformation", new OperationNameTransformation(context)) {
  // here goes the code that ends up making the JDBC or HTTP client call
}

My use cases would be to enrich SOAP or GraphQL HTTP calls with more rich information like the SOAPAction or the Graphql query without having to get those from the low level request

@ivantopo ivantopo force-pushed the tracer-refresh branch 2 times, most recently from b10e8ac to c72b5fc Compare April 8, 2019 15:25
@ivantopo
Copy link
Contributor Author

ivantopo commented Apr 8, 2019

@jtjeferreira what you are saying about adding the transformation to the Context is something that sounds reasonable and would ease the migration for people using the SpanCustomizer. I just went ahead and added an implementation to this PR. A couple considerations on the latest changes:

  • Changed from "transformation" into "hook".. the signature of the previous "transform" method required users to return a SpanBuilder/Span instance, which would usually be the same we passed in when invoking the transform function but if the user decided by any reason to return a different instance then it could lead to undefined behavior. The hooks are SpanBuilder/Span => Unit and they can do whatever they want with their arguments without opening the the chance for weird behavior.
  • Even though Kamon now includes these hooks, they are not enabled by default. The reasoning being that very few users will actually need them and there is that little unnecessary extra processing time spent on trying to find the hooks.

Going back to the example above, you would need the following:

// application.conf
kamon.trace.hooks.pre-start = [ "kamon.trace.Hooks$PreStart$FromContext" ]
// application code

Kamon.withContext(PreStart.Key, PreStart.updateOperationName("customOperation")) {
  // here goes the code that ends up making the HTTP/JDBC call
}

Note: The example above is almost a copy/paste from the current tests 😃

@ivantopo ivantopo marked this pull request as ready for review April 10, 2019 22:28
@ivantopo
Copy link
Contributor Author

Folks, I'm going to merge this PR since there is a lot of other work happening on top of this. If there are any additional comments we could still address them in the next week or two so please feel free to comment.

In the meantime, I'm going to be pushing for the Kamon 2.0 release! Hopefully we will be getting something to test by the end of the week 🎉

@ivantopo ivantopo merged commit 72dea7a into kamon-io:master Apr 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants