Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract tracer implementations via an API #1757

Merged
merged 46 commits into from
Oct 23, 2020

Conversation

marcingrzejszczak
Copy link
Contributor

@marcingrzejszczak marcingrzejszczak commented Oct 22, 2020

Rationale

Spring Cloud Sleuth currently is an autoconfiguration over Brave. It also consists of various instrumentation mechanisms for libraries that are not supported by Brave (e.g. Spring Cloud Circuitbreaker).

We would like to abstract Brave so that Spring Cloud Sleuth becomes an autoconfiguration for any tracer implementation that is compatible with Spring Cloud Sleuth. That way Spring Cloud Sleuth in its core module would consist of an API and various tracer implementations would implement that API which would also allow automatic instrumentation of libraries that are supported by Spring Cloud Sleuth.

OpenTelemetry Support

Thanks to doing this abstraction we are able to support new tracer implementations, not only Brave. We've decided to add support for the OpenTelemetry SDK as the second one. If in the future if we decide to add new tracers then it will be just a matter of adding a new module that bridges to the Spring Cloud Sleuth one. Thanks to abstraction of tests as well we will be easily able to plug that tracer mechanism into our current suite of tests.

Design

In spring-cloud-sleuth-core we have created the new API in the org.springframework.cloud.sleuth.api package. That API was heavily inspired and sometimes literally taken from Brave (we wanted the potential transition to be as easy as possible). In some of the API interfaces we took inspiration from the OpenTelemetry Java API.

The work is still in progress because we want to create an abstraction over the messaging parts.

Changed Modules

spring-cloud-sleuth-core now consists of the API, instrumentation (async, circuitbreaker, messaging (via spring integration), quartz, reactor, rxjava, scheduling and web) and its autoconfiguration. It also consists some common properties and annotation configuration setup (as it used to).

New Modules

New modules were created - spring-cloud-sleuth-brave and spring-cloud-sleuth-otel and spring-cloud-starter-sleuth-otel

In spring-cloud-sleuth-brave we have autoconfigurations for the bridge between Sleuth and Brave and Brave autoconfiguration as such.

In spring-cloud-sleuth-otel we have autoconfigurations for the bridge between Sleuth and OpenTelemetry and OpenTelemetry autoconfiguration as such.

Reorganization of Tests

The tests/common module consists of numerous abstract tests that then would be reused by various tracer implementations to assert whether they are complaint against Spring Cloud Sleuth instrumentation requirements. That way we write test once and we apply configuration for the tracers.

Additional Changes

Multiple Propagation Options

We've added a feature to allow multiple propagation context mechanisms. The spring.sleuth.propagation.type property is a list where you can provide various options such as B3 (the default), but also AWS or W3C. OpenTelemetry also supports JAEGER and OT_TRACER. Of course, you can also provide your own, custom mechanism.

Documentation Rewrite

We've decided to rewrite the documentation to be more consistent with Spring Boot standards (cc @Buzzardo ).

How to Use It / How Does It Affect Me?

If you're a current Spring Cloud Sleuth user you will not see any changes since you will still be using Brave's API directly. If you were using spring-cloud-starter-sleuth dependency you automatically get the Brave tracer implementation. If you used spring-cloud-starter-zipkin only, you need to change it into two dependencies: spring-cloud-sleuth-zipkin and spring-cloud-starter-sleuth.

If you are a new user you should do the following

  • Pick which tracer implementation you want to work with (Brave or OpenTelemetry)
  • Decide if you want to use Spring Cloud Sleuth's API or the direct tracer's API

If you want to migrate to the Spring Cloud Sleuth API then the migration path should be relatively simple since the API resembles Brave's one a lot.

Breaking Changes

This section might be expanded in time and will end up on a wiki page.

We did our best to make the changes least demanding for our users since backward compatibility is critical to us. However since this is a major release change we've decided to perform some backward incompatible changes.

  • spring-cloud-starter-zipkin module was removed - to maintain backward compatibility you need to change it into two dependencies: spring-cloud-sleuth-zipkin and spring-cloud-starter-sleuth
  • If for some reason you depended on Sleuth's autoconfiguration classes, they might have moved to a different package (some of the autoconfigs that were in core module now ended up in brave under a different package name

@codecov
Copy link

codecov bot commented Oct 22, 2020

Codecov Report

Merging #1757 into master will decrease coverage by 22.35%.
The diff coverage is 35.18%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master    #1757       +/-   ##
=============================================
- Coverage     55.46%   33.10%   -22.36%     
+ Complexity      785      724       -61     
=============================================
  Files           141      238       +97     
  Lines          4383     6259     +1876     
  Branches        521      693      +172     
=============================================
- Hits           2431     2072      -359     
- Misses         1748     4000     +2252     
+ Partials        204      187       -17     
Impacted Files Coverage Δ Complexity Δ
...k/cloud/sleuth/brave/bridge/BraveBaggageEntry.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...ork/cloud/sleuth/brave/bridge/BraveScopedSpan.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...rk/cloud/sleuth/brave/bridge/BraveSpanBuilder.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...k/cloud/sleuth/brave/bridge/BraveTraceContext.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...euth/brave/bridge/http/BraveHttpClientRequest.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...uth/brave/bridge/http/BraveHttpClientResponse.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...euth/brave/bridge/http/BraveHttpRequestParser.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...ud/sleuth/brave/bridge/http/BraveHttpResponse.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...uth/brave/bridge/http/BraveHttpResponseParser.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...euth/brave/bridge/http/BraveHttpServerRequest.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
... and 292 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6ebce4...8e8d5d6. Read the comment docs.

pom.xml Outdated Show resolved Hide resolved
* @author Marcin Grzejszczak
* @since 3.0.0
*/
@Configuration(proxyBeanMethods = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean proxyBeanMethods = false ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I didn't cause at that time I was working with the BaggageManager abstraction and I needed one to be a bean but not autowirable anywhere. Regardless, I'll have to come back to it anyways so maybe that will change

@codefromthecrypt
Copy link
Contributor

TL;DR; please add to the description exactly what has been or is planned to move to this new abstraction into the PR description. This includes any remote spring projects, as the abstraction itself is the trickiest part of this.


I think in general people would like the idea of abstraction in, well abstract. Even if the abstraction were perfect, and more people were hired to support this, there are other support impacts of doing this. Plus, Autoconfiguration is different than making another tracing api. It is possible to auto-configure different tracers, but difficult to get abstraction to actually work, especially with fast moving apis underneath such as open telemetry. Anyway, lets get to abstraction.

For example, OpenTracing abstraction had gaps in it that made use cases difficult to really port between things. For example, the opaque span context, and lack of access to internal lifecycle of the span (ex data changing after it was recorded). This resulted in many things pinning straight to Jaeger (hunch).

One reason for the brave SpanHandler design (particularly reference tracing part) was to drive other libraries. It is extremely difficult to do design and implement a non-lossy abstraction, especially multi-threaded. It has really hard to reproduce things to ensure that you have an internal hook when a span forked and is put in scope, and ending. And we had to do that while still accommodating all the various MDC sync requests. This was really so hard, and would be hard to reproduce.

Sleuth used brave in the first place. Higher re-use, and support leverage. Not having to know which version of spring boot has which tracer api. Being able to upgrade safely, getting model improvements for free, having some folks with deeper knowledge doing the "hard parts" and ensuring those are tested. Meanwhile end users have had the benefit of that. They have their own instrumentation which works across spring boot versions and even in non spring environments. The new abstraction will only exist and be usable in the specific version of spring cloud release train. Even if lossiness and bugs are overcome, the pinning will return for any instrumentation on this api.

In short, the highest risk is the new api its existence. Auto-configuration does not require abstraction always. In doing another abstraction, anything using it will be locked to it. Maybe you should note which instrumentation will use the abstract api you have here, and what compatability guarantees you will make on that? This will help move things into a position of knowledge as no one is going to read this PR and taking things on faith requires well, faith. Adding some high level descriptions of precisely the features impacted (MDC sync, context, what have you) will give a clearer sense of the risk this PR will add.

@marcingrzejszczak
Copy link
Contributor Author

marcingrzejszczak commented Oct 23, 2020

Hey @adriancole

TL;DR; the description is updated. From the current user perspective the impact is close to none. For the future users they have various choices to pick. Any Brave updates can be immediately applied to Sleuth. Remark about Boot / Cloud compatibility is not applicable.


TL;DR; please add to the description exactly what has been or is planned to move to this new abstraction into the PR description. This includes any remote spring projects, as the abstraction itself is the trickiest part of this.

I've updated the description by literaly adding "spring-cloud-sleuth-core now consists of the API, instrumentation (async, circuitbreaker, messaging (via spring integration), quartz, reactor, rxjava, scheduling and web and its autoconfiguration. It also consists some common properties and annotation configuration setup (as it used to).".

In general there are almost no visible changes for the user (as I wrote in the PR description).

Sleuth used brave in the first place. Higher re-use, and support leverage. Not having to know which version of spring boot has which tracer api. Being able to upgrade safely, getting model improvements for free, having some folks with deeper knowledge doing the "hard parts" and ensuring those are tested. Meanwhile end users have had the benefit of that. They have their own instrumentation which works across spring boot versions and even in non spring environments. The new abstraction will only exist and be usable in the specific version of spring cloud release train. Even if lossiness and bugs are overcome, the pinning will return for any instrumentation on this api.

I think you misunderstood the whole change, let me try to explain this again.

Sleuth's new API is used for instrumentation and span lifecycle management. Sleuth was always bound to a given boot and cloud version so I think that this is not a new thing. Users can use Brave directly. Thanks to this change they can use other tracers too.

In short, the highest risk is the new api its existence. Auto-configuration does not require abstraction always. In doing another abstraction, anything using it will be locked to it. Maybe you should note which instrumentation will use the abstract api you have here, and what compatability guarantees you will make on that? This will help move things into a position of knowledge as no one is going to read this PR and taking things on faith requires well, faith. Adding some high level descriptions of precisely the features impacted (MDC sync, context, what have you) will give a clearer sense of the risk this PR will add.

Everything is described in the docs. At this point async, circuitbreaker, messaging (via spring integration), quartz, reactor, rxjava, scheduling and web are in core and Brave specific modules grpc , messaging (kafka, rabbit...), mongodb, opentracing, redis, rpc and web (sampling mechanism) are using Brave.

@marcingrzejszczak marcingrzejszczak merged commit 6e306e5 into master Oct 23, 2020
2020.0.0-M5 automation moved this from In progress to Done Oct 23, 2020
@marcingrzejszczak marcingrzejszczak deleted the otel_abstracted_new_api branch October 23, 2020 06:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
2020.0.0-M5
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants