Support new OpenTracing API #1196
Comments
This sounds good to me. cc @adriancole @rshriram @RomanDzhabarov @fabolive for further comment. |
I was doing some very initial testing with @RomanDzhabarov to upgrade to lightstep-tracer-cpp 0.38 (currently on 0.36). We should probably put this on hold and wait for the implementation suggested by @tedsuo? |
@moderation yes I would probably just skip the upgrade at this point. |
OpenTracing efforts usually come with mapping problems, which surface as poorly encoded data in Zipkin, spans not appearing as client/server, and so on. These add support burden to our Gitter channel and issues list. The mapping problems happen because OT is usually tested as a mock rather than with a micro-integration test (i.e., to the degree tests exist, they don't relate directly to the data emitted). In other words, adding any layer of indirection is likely to undo the clean integration currently in Envoy. Solving problems will become harder, and will not be possible directly in Envoy code. TL;DR: I would strongly discourage ripping out the existing Zipkin support until tests are permanently in place showing that Envoy works with Zipkin via OT, not just that OT variants work. |
Agreed. Nothing will change until we have confidence that any new solution is supported by the community and works. I'm mainly saying that I have no objection to people attempting to tackle this if they are willing to do all the work. |
Yes, @adriancole, this is also part of an effort to validate that the new API is correct: we want to prove that we can bind a Zipkin C++ implementation to it and have it work in real systems like Envoy and NGINX. (I'd also like to talk to you about test harnesses and other compatibility guarantees for OT, but that is probably for a different thread.) We're also fine with supporting the C++ integration with Zipkin and ensuring it is stable until the API has reached maturity. We don't plan on making a PR until we're confident that there is a solid integration that has parity with the existing instrumentation. |
Thanks for the offer @tedsuo, and glad you are on board with making the first version of this portable with Zipkin. As an organization of mostly volunteers, I'm more concerned about future versions. Don't read this as skepticism about you personally, as your statement sounds great. What I'm generally concerned about is a scenario where OT is inserted and Zipkin is later removed or degraded due to API decisions not made in the interest of the OpenZipkin community of end users. We've had problems with this in the past, where APIs crucial to Zipkin were on the chopping block and would have been removed if @basvanbeek or I hadn't noticed. If integration tests are here and someone in OT makes a decision not in favor of Zipkin users (regardless of intent), the build breaks. Fixing the build then means either sacrificing these users or making the OT -> Zipkin story right again. TL;DR: if the goal includes removing the code that just went in for Zipkin support, let's make sure integration tests happen here and are left in for as long as Envoy wants to support the OpenZipkin community. I think this "tests here" part is the only thing different from your plan. |
Sounds good @adriancole, we'll include Zipkin-based integration tests when we make our PR. |
Hello Everyone, I extracted out the Zipkin tracing code from envoy into a stand-alone library and added a bridge to the proposed C++ OpenTracing API. I also wrote a version of LightStep’s tracer that supports the new API. I’m starting to look into how best to integrate envoy with the OpenTracing API. @mattklein123 One thing I was wondering — Would you be willing to use the recorders that come with the Zipkin and LightStep libraries instead of having custom ones built into envoy, provided that Zipkin and LightStep support a way to get access to reporting statistics so you can continue recording things like the dropped span counts? Doing so could simplify envoy's tracing code and make it easier to support features like Kafka collection since you could just turn on the functionality provided by Zipkin's tracing library. |
@rnburn my primary requirements for the tracing libraries are that:
As long as the previous two points are handled, the more we can push into the libraries the better IMO. |
One thing I mentioned in another discussion about this is that diversity of
interest in maintaining the code is important. For example, the current zipkin code
(which this effort aims to remove) was reviewed by folks at Lyft and IBM.
The one mentioned here is a replacement, and currently single-author (no
offence). In zipkin itself, there's a community provided c/c++ library
https://github.com/flier/zipkin-cpp, which has a bit more history (even if
short) and importantly it is used in production.
If we are ripping out months of reviewed code, the replacement should have
more, not less, ecosystem behind it. Can we please check with the folks who
wrote the existing one and get buy-in that zipkin support or performance
will not be degraded by accident as a result of being pushed under an
umbrella?
|
Lightstep C++ tracer 0.4.0 released - https://github.com/lightstep/lightstep-tracer-cpp/releases/tag/v0.4.0. Envoy still using 0.36 from February 2017. /cc @tedsuo |
I am fine with this, as long as functionality is maintained and the output remains consistent with what we have in Istio today. I am mostly concerned about the steps that we take today to ensure that the tags are not duplicated between client and server [some of these steps have been undone, as @fabolive and I noticed yesterday]. Put more generally (this might be a dumb question): do all tracing systems have to make decisions similar to the ones we make for Zipkin (delineating cs/sr/ss/cr, deciding whether a new trace in front-envoy [gateway] should have a server-receive/client-send tag, etc.)? If not, then with an OpenTracing implementation, will the Zipkin-specific instrumentation points in the code go away? |
cc @objectiser |
Hey @rshriram, not all tracing systems use the cs/sr/ss/cr fields -- LightStep, for example, doesn't have them. I'm not a Zipkin expert, but I believe they can be controlled in Zipkin via OpenTracing by specifying the span.kind tag. |
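To illustrate the span.kind point above, here is a minimal sketch of how a bridge might map the OpenTracing `span.kind` tag onto Zipkin's cs/cr and sr/ss annotation pairs. The struct and function names are hypothetical and are not taken from Envoy, Zipkin, or any OpenTracing library; only the tag values `client`/`server` and the four annotation names come from the two systems.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type (illustration only).
struct ZipkinAnnotations {
  bool known;          // false when span.kind has no client/server mapping
  std::string start;   // annotation recorded when the span starts
  std::string finish;  // annotation recorded when the span finishes
};

// Hypothetical helper: translate an OpenTracing span.kind value into the
// Zipkin core annotations that delineate the span.
ZipkinAnnotations annotationsForSpanKind(const std::string& span_kind) {
  if (span_kind == "client") {
    return {true, "cs", "cr"};  // client send / client receive
  }
  if (span_kind == "server") {
    return {true, "sr", "ss"};  // server receive / server send
  }
  return {false, "", ""};  // other kinds: no cs/sr/ss/cr pair in this sketch
}
```

A tracer without these fields (LightStep, per the comment above) would simply ignore the mapping and keep `span.kind` as an ordinary tag.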
When folks ask about "if only zipkin does x" remember there are a lot of
proxies for existing systems, including jaeger, influx, google cloud,
datadog, and soon dynatrace and AWS. All accept zipkin format. Oracle even
demoed zipkin integration at javaone this year though I am unsure if they
are also a proxy. Point is there is a heck of a lot of interop, and many of
these have no opentracing alternatives. Zipkin as defined here is formats
and it is bigger than just zipkin the server.
Adding support for the c++ OT library (cut 1.0 in the last week, with only
3 contributors this year) sounds nice and all, but removing direct zipkin
format will punt to historically troublesome opentracing translators, and
with it water down the most effective, plugin-free point of interop in
envoy.
|
All: I'm going to put it in my GH bio. We are not deleting direct Zipkin format, probably ever. |
P.S. Has anyone done a benchmark test on the implicit proposal to switch to
translators? I know the zipkin code got a lot of scrutiny wrt performance
and you can't see all of the code as it is in another repo. Do we have
integrated microbenchmarks?
|
re: performance, all of the tracers need perf work. There are quite a few issues in both cases. There are not currently any integrated microbenchmarks. Would love for someone to work on this. |
We'll definitely be benchmarking this work, and the OT C++ API in general. Related to that, we're discussing adding a key lookup interface for Carriers on Extract, to avoid iterating over all of the keys. This is the only place so far where we have seen the current API cause an implementation difference. Please have a look if you are interested: opentracing/opentracing-cpp#25. Once that is resolved, we will have a PR ready for review. @rshriram not all tracing systems have zipkin-like functionality around
Lightstep 0.5.0 C++ tracer released - https://github.com/lightstep/lightstep-tracer-cpp/releases/tag/v0.5.0. Envoy still using 0.36 from February 2017. /cc @tedsuo. I suggest we keep this issue focused on upgrading Lightstep and not about Zipkin (@mattklein123 has it in his Github bio that Zipkin will not be deleted). opentracing/opentracing-cpp#25 merged 2 days ago. |
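For anyone following the carrier discussion, the difference between iterating over every key on Extract and looking up a single key can be sketched roughly as follows. This is an illustration under stated assumptions, not the opentracing-cpp API: `MapCarrier`, `forEachKey`, `lookupKey`, and the two extract functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical carrier wrapping request headers (illustration only).
class MapCarrier {
public:
  explicit MapCarrier(std::map<std::string, std::string> headers)
      : headers_(std::move(headers)) {}

  // Iteration-style access: the callback is invoked for every header.
  void forEachKey(const std::function<void(const std::string&,
                                           const std::string&)>& f) const {
    for (const auto& kv : headers_) {
      f(kv.first, kv.second);
    }
  }

  // Lookup-style access: fetches one header without visiting the others.
  bool lookupKey(const std::string& key, std::string& value) const {
    auto it = headers_.find(key);
    if (it == headers_.end()) {
      return false;
    }
    value = it->second;
    return true;
  }

private:
  std::map<std::string, std::string> headers_;
};

// Extract by scanning all headers; cost grows with the number of headers.
bool extractByIteration(const MapCarrier& carrier, const std::string& key,
                        std::string& value) {
  bool found = false;
  carrier.forEachKey([&](const std::string& k, const std::string& v) {
    if (k == key) {
      value = v;
      found = true;
    }
  });
  return found;
}

// Extract by asking the carrier for just the key the tracer needs.
bool extractByLookup(const MapCarrier& carrier, const std::string& key,
                     std::string& value) {
  return carrier.lookupKey(key, value);
}
```

Both functions return the same result; the lookup variant just lets a tracer that only needs a few known headers skip the rest of the request.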
I will defer to @tedsuo on how to proceed. I think the focus is to deprecate the existing LS tracer and use the new OT one. |
I just put in #2017. It adds common OpenTracing instrumentation that can be shared across tracers and switches the LightStep tracer to use it. |
Thanks Ryan! Yes, I would like us to use lightstep behind OT, and vet the performance. |
Also, while we can continue to compile tracers directly into envoy, I would encourage Envoy maintainers to consider some form of dynamic loading to allow users to install instrumentation of their choosing. There's a discussion on this issue here: opentracing/opentracing-cpp#28 |
Does this also provide the capability to use Jaeger as a tracing solution, since it is based on the OpenTracing standard? |
A new version of the OpenTracing API for C++11 is almost ready (opentracing/opentracing-cpp#11). Once it has been merged, we will be changing the LightStep tracer so that it is OT-compatible. It would be great if we could then do the following with Envoy:
We're happy to put the work in to make this happen. Any recommendations or objections?