RFC: Integrate with a distributed tracing system like OpenTracing or Zipkin #1436
Comments
It would be really nice to have the proxy add details about things only it knows when it records its spans. Examples (many of these are forward-looking to features not yet implemented):
|
@jakerobb Great list, thanks! |
@jakerobb @serkangunes Is there any active dev on this? I'm trying to figure out where it is on the backlog and if it's worth me taking a look if I have time. |
Not afaik. Go for it! |
I have started looking at OpenTracing and experimented a little bit on my dev cluster. I've done a small internal demo and we all agree that implementing this throughout our services is a must. The good thing about it is that we can add instrumentation in an incremental/iterative way, leveraging automagic instrumentation/span creation tools already available to us, such as:
And then gradually adding manual instrumentation where we need it. To me, the immediate and indispensable value of Linkerd2 is in how it gives me visibility into my system "for free", without any complicated or invasive setup. The very simple act of injecting the proxy into my deployments gives me tremendous value in terms of knowing what's going on inside my cluster. Therefore, having automagic cross-service OpenTracing trace data seems like a perfect and natural fit for Linkerd2, where it would complement the visibility Linkerd2 already provides. It would allow one to go from zero to minimal useful tracing in no time. Obviously, manual instrumentation will always be required to get even more out of the traces, but this would set a solid foundation on top of which manual instrumentation could be added. Anyways, I realize that I am not adding much value to this RFC in terms of the "what/how", but I felt compelled to add my voice to the "why". 🤓 |
One approach to consider is OpenCensus. In theory this could allow for the exporting of traces to multiple backends including Jaeger, Zipkin etc. (and metrics). The glaring issue right now is that there is currently no support for Rust. There doesn't appear to be community recognized Rust work for Zipkin but I did find palantir/rust-zipkin. On the OpenTracing front I found opentracing/opentracing-rust, opentracingrust and rustracing. |
We are looking at Linkerd2. I like its lightweight install process. Distributed tracing is one of the features we hope Linkerd2 will support. What I like about Linkerd 1 is its out-of-the-box support for distributed tracing and circuit breaking. I hope Linkerd2 will reach feature parity with Linkerd 1. |
FYI, OpenTracing is not a tracing system, so the subject line is a bit confused. It is like saying JDBC is the same as Oracle. |
Also, "OpenTracing trace data" implies another confusion, as the project defines no data format. If this is about the programming API, then it would be about how you attach extra data, e.g. by adding more instrumentation to the proxy. OpenTracing defines nothing useful for service abstraction: no headers, no data format, nothing. Many times people misunderstand and think OpenTracing (an API with no data format or headers) is the same as Zipkin (which defines both). Whatever comes out of this, better to not add to the confusion. Census is more accurate as it explicitly supports header propagation formats including B3, and you also have a consistent implementation regardless of which backend processes the data. In fact they have defined an intermediate data format intended to be processed similar to how most people process Zipkin data. If it is intentional to not use the most used format (Zipkin data and B3), for whatever reason :P then at the moment, the only alternative for this abstraction is Census. |
Whoa, my worlds colliding. Hi Adrian! It’s also important to note that you can’t have tracing “for free”: even if all instrumentation is offloaded to the proxy, you still have to update your services so that any outgoing requests propagate the tracing headers (e.g. Zipkin’s B3 headers) that came in with the triggering request/message/etc. That’s not a particularly large amount of work, but it’s strictly necessary. |
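To illustrate the propagation step described above, here is a minimal sketch in Python. The header names are the ones defined by Zipkin's B3 propagation format; the function name `propagated_headers` and the plain-dict request model are assumptions made for illustration, not part of Linkerd or any tracing library.

```python
# B3 header names (multi-header form plus the single "b3" header).
# Matching is case-insensitive because HTTP header names are.
B3_HEADERS = frozenset({
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
})


def propagated_headers(incoming):
    """Return the B3 headers of an incoming request, unchanged.

    `incoming` is assumed to be a plain dict of header name -> value.
    The service copies these onto every outgoing request it makes while
    handling that incoming request, so the proxy can join the spans.
    """
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() in B3_HEADERS
    }


# Hypothetical usage: merge the propagated headers into an outgoing call.
incoming = {
    "Host": "orders",
    "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "X-B3-Sampled": "1",
}
outgoing = {"Accept": "application/json", **propagated_headers(incoming)}
```

Only the trace headers are copied; unrelated headers such as `Host` stay with the incoming request.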
Hi again, Jake!
Yep. There is actually an FAQ the Istio folks are writing which covers this, but yeah, I
agree: something needs to handle at least propagation.
I suppose my point is that the brand name of the library is less important, but
if brand names are being chosen, it is important that they have the scope to promise
something like B3.
I think before (Linkerd 1) it might not have been confusing, as the feature
wasn't "opentracing-washed", in other words.
|
for those wondering. opentracing exposes no data needed for B3 to work
https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/SpanContext.java
if OT does b3 it is only because an implementation that supports b3 was
chosen (like zipkin or jaeger). there is a proposal for adding trace id to
the context but yeah still no sampling data. this means it is impossible to
portably create b3 headers using opentracing.
census not only defines b3-compatible trace identifiers but also a sampling
api (also absent from OT)
https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Sampling.md
so anyway this is a technical concern and I am plagued very often by people
misunderstanding and wondering why OT doesn't magically do things when the
short answer is because it doesn't!
so anyway rambling over.. hopefully someone finds this helpful info!
|
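As a rough sketch of the point made above, the following Python shows which pieces of state are needed to write B3 headers: a trace ID, a span ID, and (optionally) a sampling decision. The `TraceContext` class and `to_b3_headers` function are hypothetical names for illustration; only the header names and the `1`/`0` sampling values come from the B3 format. An API whose span context exposes none of these fields cannot produce the headers without knowing the concrete implementation behind it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TraceContext:
    """Hypothetical context carrying what B3 needs to be written out."""
    trace_id: str                   # 16 or 32 lower-hex characters
    span_id: str                    # 16 lower-hex characters
    sampled: Optional[bool] = None  # None means the decision is deferred


def to_b3_headers(ctx: TraceContext) -> Dict[str, str]:
    """Encode the context using B3 multi-header propagation."""
    headers = {
        "X-B3-TraceId": ctx.trace_id,
        "X-B3-SpanId": ctx.span_id,
    }
    # The sampling header is omitted when no decision has been made,
    # leaving the choice to the next hop.
    if ctx.sampled is not None:
        headers["X-B3-Sampled"] = "1" if ctx.sampled else "0"
    return headers


ctx = TraceContext("463ac35c9f6413ad48485a3953bb6124", "a2fb4a1d1a96d312", True)
b3 = to_b3_headers(ctx)
```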
What is the status of this? |
@renannprado On the big list of things we want to do, but haven't found anyone to implement yet. Want to give it a try? |
@wmorgan I actually want to help, but I wonder if I can. Besides that, what does Linkerd2 lack or provide in terms of distributed tracing? Looking at this thread, I couldn't really tell what exactly is missing. Of course it requires some cooperation from the applications for this to work, but besides that I would like to understand what's missing today for distributed tracing. Is there no support whatsoever, i.e. would it start from scratch? I won't promise anything, but if I understand all of that I might give it a try and show you guys if there's progress. Thanks! |
User Stories
Rationale
The distributed nature of microservices makes it difficult to find out where the points of failure are. Having support for distributed tracing within the service mesh will free the microservices from integration code and keep them cleaner.