Running multiple tracing implementations simultaneously #9
Comments
Higher Order Tracers (HOT) & Migrating Between Vendors
I think there's a case for Higher Order Tracers, i.e. Tracers that combine other tracers (at construction time) e.g.
Any in-band keys should be namespaced so that more than one tracer can work in parallel. I think this (impressively) might be the only extra constraint that has to hold to enable HOTs. There will be a need to add some "magic" for handling error cases (i.e. vectorized)... but it shouldn't be a huge deal. Obviously there could be others like
Something like this would allow one to switch vendors without downtime, since it allows a transition period (could be days) while both the old and the new tracing systems are running. As far as I can see, it is close to impossible to switch with zero downtime unless you have such a feature. |
Hopefully different tracers would be using different header keys anyway, so I would punt on the namespacing, especially since introducing namespacing is also incompatible with the existing deployment. |
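To make the "different header keys" point concrete, here is a minimal sketch, assuming each tracer writes its context under its own key prefix. `ToyTracer`, `inject_all` and the `x-old-`/`x-new-` prefixes are hypothetical names for illustration, not part of any OpenTracing API. The helper refuses to overwrite a key that another tracer already wrote, so a collision surfaces immediately instead of silently corrupting one tracer's context.

```python
class ToyTracer:
    """Hypothetical tracer that writes its context under its own key prefix."""

    def __init__(self, prefix):
        self.prefix = prefix

    def inject(self, span_context, carrier):
        # Write every context field into the carrier under this tracer's prefix.
        for key, value in span_context.items():
            carrier[self.prefix + key] = value

    def extract(self, carrier):
        # Read back only the keys that carry this tracer's prefix.
        n = len(self.prefix)
        return {k[n:]: v for k, v in carrier.items() if k.startswith(self.prefix)}


def inject_all(tracers, contexts, carrier):
    """Inject every tracer's context; raise if two tracers use the same key."""
    for tracer, ctx in zip(tracers, contexts):
        scratch = {}
        tracer.inject(ctx, scratch)
        clash = scratch.keys() & carrier.keys()
        if clash:
            raise ValueError(f"header key collision: {sorted(clash)}")
        carrier.update(scratch)
    return carrier


old = ToyTracer("x-old-")
new = ToyTracer("x-new-")
headers = inject_all([old, new], [{"traceid": "a1"}, {"traceid": "b2"}], {})
```

With distinct prefixes, both contexts travel in the same carrier and each tracer extracts only its own slice.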
@yurishkuro - I agree... no "formal" namespacing. Just that people should be aware of aliases when they design their protocols. One interesting corner-case is baggage where I can imagine |
From the previous discussion...
I think the only mathematically correct (i.e. functional) way to do this is to not discard any information, i.e. something like
The same thinking applies to the return/error codes. |
If you assume that the tracers already use prefixed keys, then encoding of the baggage is no different from encoding of the span context. Errors are harder. I would go with a configuration option for the multiplexing tracer to either fail fast or to do best effort and aggregate all errors into one. |
Let's assume that I upgrade from a tracer that doesn't support baggage propagation to one that supports it (or the reverse). I think this use case could easily hint towards a good solution. |
I may have misread your point about baggage - so no issues for encoding/decoding, but just what to return from MultiplexingSpan.getBaggage - I would go with a simple solution and return whatever the first tracer returns, and a more complicated solution is again an internally configurable strategy in the tracer on what to do in case of conflicts (I wouldn't worry until there's a real need). |
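A minimal sketch of that suggestion, assuming a span that simply wraps one span per underlying tracer. `ToySpan`, `MultiplexingSpan`, `first_wins` and `first_present` are hypothetical names, and the method names are illustrative rather than taken from a specific OpenTracing binding. The default resolver returns whatever the first tracer returns; the alternative shows how a conflict strategy could be swapped in later, for example when the first tracer does not support baggage.

```python
class ToySpan:
    """Hypothetical span holding baggage in a plain dict."""

    def __init__(self, baggage=None):
        self._baggage = dict(baggage or {})

    def set_baggage_item(self, key, value):
        self._baggage[key] = value
        return self

    def get_baggage_item(self, key):
        return self._baggage.get(key)


def first_wins(values):
    """Simple strategy: return whatever the first tracer's span returned."""
    return values[0] if values else None


def first_present(values):
    """Alternative strategy: skip spans that have no value for the key."""
    return next((v for v in values if v is not None), None)


class MultiplexingSpan:
    """Hypothetical span that fans out to one span per underlying tracer."""

    def __init__(self, spans, resolve=first_wins):
        self._spans = list(spans)
        self._resolve = resolve

    def set_baggage_item(self, key, value):
        # Writes go to every underlying span so no tracer loses information.
        for span in self._spans:
            span.set_baggage_item(key, value)
        return self

    def get_baggage_item(self, key):
        # Reads collect every answer and let the strategy pick one.
        return self._resolve([s.get_baggage_item(key) for s in self._spans])
```

Keeping the resolver as a constructor argument means the simple behavior is the default and nothing more complicated has to be designed until there is a real need.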
No - you are correct. There were two issues with baggage. One is the one you mentioned above, about the prefixed keys, i.e. if someone sets |
If the mux tracer can be configured with error handling strategies like FAIL_FAST and ACCUMULATE, I think it mostly solves the issue with extract(). I don't think there's a way to define the "correct" behavior of the mux, so give users a choice to pick the behavior and do best effort. There's probably a need for a 3rd error strategy - to succeed if at least one tracer succeeds, because the user code might be checking the error and not using the returned span context as parent in case of an error. |
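The three strategies described above could be sketched roughly as follows. This is an illustration under stated assumptions, not an existing API: `ErrorStrategy`, `MuxError` and `mux_extract` are hypothetical, `ANY_SUCCESS` is an invented name for the third strategy, and each tracer is reduced to a plain callable that either returns a context or raises.

```python
from enum import Enum


class ErrorStrategy(Enum):
    FAIL_FAST = "fail_fast"      # re-raise the first error immediately
    ACCUMULATE = "accumulate"    # try every tracer, then raise one combined error
    ANY_SUCCESS = "any_success"  # hypothetical name: succeed if any tracer succeeds


class MuxError(Exception):
    """Aggregates the errors raised by individual tracers."""

    def __init__(self, errors):
        super().__init__("; ".join(f"{name}: {err!r}" for name, err in errors))
        self.errors = errors


def mux_extract(extractors, carrier, strategy):
    """Run each extractor (name -> callable) and apply the error strategy.

    Returns a dict mapping tracer name to the context it extracted.
    """
    contexts, errors = {}, []
    for name, extract in extractors.items():
        try:
            contexts[name] = extract(carrier)
        except Exception as err:
            if strategy is ErrorStrategy.FAIL_FAST:
                raise
            errors.append((name, err))
    # ACCUMULATE fails on any error; ANY_SUCCESS fails only if nothing succeeded.
    if errors and (strategy is ErrorStrategy.ACCUMULATE or not contexts):
        raise MuxError(errors)
    return contexts
```

Under `ANY_SUCCESS`, calling code that checks for an error still gets a usable parent context as long as one tracer understood the carrier, which is the case the third strategy is meant to cover.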
What are your thoughts on |
I agree on this a lot. Overall you want to be able to upgrade/downgrade gracefully in terms of Tracer features. If baggage or a specific type of carrier wasn't supported and it's supported with the new tracing system I guess you want to print an |
Bringing #28 (comment) here...
Yes exactly. There should be some kind of metric and you can't switch off till you see no-hits.
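One way to read the "metric" idea is to count, per tracer, how many incoming carriers still contain that tracer's context. The sketch below assumes exactly that reading; `ExtractHitCounter` and its methods are hypothetical, and each tracer is again reduced to a callable that returns a context or `None`.

```python
from collections import Counter


class ExtractHitCounter:
    """Hypothetical wrapper counting which tracers still find their context."""

    def __init__(self, extractors):
        self._extractors = dict(extractors)  # name -> callable(carrier)
        self.hits = Counter()

    def extract(self, carrier):
        found = {}
        for name, extract in self._extractors.items():
            ctx = extract(carrier)
            if ctx:
                # A hit means some upstream service still emits this format.
                self.hits[name] += 1
                found[name] = ctx
        return found

    def reset(self):
        """Start a new observation window."""
        self.hits.clear()

    def safe_to_remove(self, name):
        """True if no hit was seen for `name` in the current window."""
        return self.hits[name] == 0
```

In practice the counter would be exported to whatever metrics system is in use and watched over a long enough window before the old tracer is switched off.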
I think that the first concerns mostly what we talk about above. I'm not quite clear on the second one... |
Error handling strategies are a very important part of multi-tracer implementations, and we should choose them carefully. With the wrong strategy, one tracer implementation will fail because of another implementation. Such as: extract in
This bug will be hard to find from either the tracer-user side or the tracer-developer side. |
Most likely, multi-tracers execute like a chain, as I said above. Of course, with some error handling strategies. |
@yurishkuro, I prefer strategies that can isolate the tracer implementations; we could provide |
@lookfwd sorry for the delay, having a crazy week. The primary value prop around interoperation of Tracers is not actually so that an org can necessarily switch from one tracing vendor to another with zero hiccups, although that's another nice possible benefit :) The main advantage I see is actually for 3rd-party OSS/etc projects that are used by many organizations and need to provide tracing instrumentation without making assumptions about the environments they're integrated into. E.g., take DropWizard: there are companies using DropWizard and Jaeger; companies using DropWizard and Zipkin; and companies using DropWizard and LightStep. By instrumenting DropWizard with OpenTracing, the vendor-neutrality is preserved at the DW level and the vendors don't all need to independently instrument DW. As for the idea of informally namespacing headers: sure. Though IMO this is redundant to discuss at the OT level since that's true for any possible HTTP header (given that HTTP headers are a shared namespace to begin with). I don't strongly object to making the docs longer / more complex to accommodate this advice, but I am skeptical that it's needed to achieve your desired result :) |
The main advantage I see is actually for 3rd-party OSS/etc projects that
are used by *many* organizations and need to provide tracing
instrumentation without making assumptions about the environments they're
integrated into. E.g., take DropWizard: there are companies using
DropWizard and Jaeger; companies using DropWizard and Zipkin; and companies
using DropWizard and LightStep. By instrumenting DropWizard with
OpenTracing, the vendor-neutrality is preserved at the DW level *and*
the vendors don't all need to independently instrument DW.
That would be possible if people could rely on the API. Currently, it is
blind: features are added without an inventory of which systems support
them. It is limited to whoever is paying attention which usually is several
people at most (out of hundreds of people actively designing tracing
systems). If the goal is for people to rely on this abstraction for
frameworks, users will need to know which features are risky and which
aren't. In other words, that would be a completely independent issue which
would track the viability of APIs, for example encouraging use of more
stable or more often implemented features when attempting to design generic
framework instrumentation. Until then, the promise of instrumentation with
no assumptions about environments is unlikely to be credible.
|
@bensigelman Awesome - that wasn't clear to me
Aren't they making the assumption there's only a single tracing framework? How can one use DropWizard with both Jaeger and Zipkin? Frankly, adding instrumentation is adding quite some non-trivial code on a potentially large (e.g. N>>1000) number of applications, and hopefully their instrumentation code might stay in place (without the need for any changes) potentially for 5+ years. I believe that an application developer wants to make the most out of the added instrumentation code which means it's quite likely one might like to be running solutions from 2-3 different vendors + a few other potentially irrelevant smaller tracing-like applications that implement a limited part of OT API. |
It's a pretty reasonable assumption, but it's not a concern of Dropwizard instrumentation. If someone manages to make a Multiplexing tracer, as discussed on this thread, then a Dropwizard user can use two tracers simultaneously, and DW doesn't need to change anything in its instrumentation.
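The point that framework instrumentation does not need to change can be sketched as below. Everything here is a hypothetical stand-in: `handle_request` plays the role of framework instrumentation, and `RecordingTracer` and `MuxTracer` are toy classes sharing a single `start_span` method rather than real OpenTracing implementations.

```python
class RecordingTracer:
    """Hypothetical vendor tracer that just records operation names."""

    def __init__(self, name):
        self.name = name
        self.operations = []

    def start_span(self, operation_name):
        self.operations.append(operation_name)
        return (self.name, operation_name)


class MuxTracer:
    """Hypothetical multiplexing tracer exposing the same interface."""

    def __init__(self, tracers):
        self._tracers = list(tracers)

    def start_span(self, operation_name):
        # Forward the call to every underlying tracer.
        return [t.start_span(operation_name) for t in self._tracers]


def handle_request(tracer, path):
    """Stand-in for framework instrumentation: it only knows `start_span`."""
    tracer.start_span("GET " + path)
    return "ok"


jaeger_like = RecordingTracer("a")
zipkin_like = RecordingTracer("b")
handle_request(jaeger_like, "/one")                          # single tracer
handle_request(MuxTracer([jaeger_like, zipkin_like]), "/two")  # two at once
```

The instrumented function is identical in both calls; only the object handed to it differs, which is why the multiplexing question is not a concern of the framework itself.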
I don't think Ben was suggesting that. Unless OpenTracing 2.0 comes up with a common in-band format for context propagation, even if you use multiple tracers each one of them will still see only the slice of architecture that directly runs with that tracer. |
Exactly - the question is - does OT impose some constraints that prevent such a tracer from being possible? My guess is that we wouldn't like that. For example, as we mentioned previously, |
P.S. I realize that it's mostly talking and thinking here. I think I will try to write/plug a mixing tracer next time I have time to play with Zipkin-Python-OT |
There's a first ugly throw-away C++ draft of some concepts here. |
Continuing the discussion from opentracing/opentracing.io#66
cc: @yurishkuro @bensigelman