As a developer/operator I want to be able to see what queries Atlas is running #495
Comments
I believe that Brave is the library we'd want to look at: https://github.com/openzipkin/brave |
Do we have a good sense of how often this would be used? This question has come up several times, and I would like to better understand its value in the field. I'm having trouble seeing it prioritized anytime soon, given our current spree of correctness issues and our more general lack of debugging tools until console can be easily deployed. |
@jboreiko I filed this mainly for tracking, as I had talked about some of this with @rjullman. Now that we've migrated the C*KVS and DbKVS over, we've lost some of the more granular distributed tracing we previously had, which is very useful when tracking down service latency/response time where Atlas is one piece of a broader service composition. This partially relates to palantir/conjure-java-runtime#115, and the main challenge here is hooking up cross-thread tracing via |
Is there a plan for Cassandra? I know that since 3.4 it has allowed custom tracers to be plugged in, as per http://thelastpickle.com/blog/2015/12/07/using-zipkin-for-full-stack-tracing-including-cassandra.html, but both Atlas and Phoenix are on much older versions. What are your thoughts on this? |
http-remoting recently added Zipkin support via Brave in palantir/conjure-java-runtime#142 though we probably need to help push openzipkin/brave#166 along if we want to use that for internal service tracing. |
I received approval from internal open source to work on this, but I am currently hosed by other work. I also talked extensively to @adriancole about the design, so somebody just needs to do it. |
Created http-remoting PR palantir/conjure-java-runtime#235 to get the initial plumbing for Dropwizard-based services & clients to have a Brave tracer available. We still need to fix the span collection plumbing, sampling configuration, and a few more things. @clockfort mentioned he's looking at adding tracing, so assuming we inject a |
@schlosna glad to help get this pushed through. I was looking at this a bit a few weeks ago and was unsure how to get this to work when AtlasDB doesn't control the service endpoints. Sounds like this solves that problem. |
Slight update to this ticket: we're going to be supporting Cassandra 3.7 soon (#1147), which will have Zipkin integration. |
@schlosna @jboreiko @clockfort ping - is anyone doing anything about this ticket, or planning to do anything soon? It's been a P1 for almost 2 months with no action. |
Out of curiosity - is this already done? |
When running AtlasDB under real workloads, one wants to be able to enable additional tracing and produce tracing spans in a format that can be consumed by a distributed tracing tool such as Zipkin. This would include the raw Cassandra Thrift, CQL, and/or SQL queries being executed.
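To make the request concrete, here is a rough sketch of the shape this could take: each key-value store call is wrapped so that a span carrying the raw query text and its duration is reported to a sink. Everything here (`QueryTracingSketch`, `Span`, `SpanSink`, `traced`, the `kvs.get` span name, and the example query) is hypothetical and written in plain Java for illustration; it is not the AtlasDB or Brave API, and a real implementation would presumably hand spans to a Zipkin-compatible reporter instead of a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class QueryTracingSketch {

    /** Hypothetical span: an operation name, the raw query text, and how long it took. */
    static final class Span {
        final String name;
        final String query;
        final long durationNanos;

        Span(String name, String query, long durationNanos) {
            this.name = name;
            this.query = query;
            this.durationNanos = durationNanos;
        }
    }

    /** Hypothetical sink; a real one would forward spans to a tracing backend. */
    interface SpanSink {
        void report(Span span);
    }

    /** Runs the call and reports a span for it, even if the call throws. */
    static <T> T traced(String name, String query, SpanSink sink, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            sink.report(new Span(name, query, System.nanoTime() - start));
        }
    }

    public static void main(String[] args) {
        List<Span> collected = new ArrayList<>();
        // The supplier stands in for the actual Thrift/CQL/SQL execution.
        String row = traced("kvs.get", "SELECT val FROM tbl WHERE key = ?",
                collected::add, () -> "row");
        for (Span span : collected) {
            System.out.println(span.name + " [" + span.query + "] took "
                    + span.durationNanos + " ns, result=" + row);
        }
    }
}
```

Reporting in a `finally` block means failed queries still produce a span, which is usually when the raw query text is most useful.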