Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client interceptors depend on server Endpoint set #66

Closed
fedor57 opened this issue Mar 3, 2015 · 4 comments
Closed

Client interceptors depend on server Endpoint set #66

fedor57 opened this issue Mar 3, 2015 · 4 comments
Labels
Milestone

Comments

@fedor57
Copy link

fedor57 commented Mar 3, 2015

Hi,

I have a test with a structure Browser -> Service2.Method2 -> Service1.Method1

On Service2 by some reason interceptors are installed separately for receiving the request for Method2 and sending the request for Service1.Method1 as a client. Those interceptors use ServerAndClientSpanStateImpl.

If I don't install interceptors for receiving request for Method2 the following happens:

  • ClientRequestInterceptor.handle is called by Out interceptor for client request
  • it tries to submit "request" binary annotation by clientTracer.submitBinaryAnnotation
  • it calls ClientTracerImpl.getClientEndPoint()
  • it checks serviceName and if it is not null, tries to create an EndPoint using copy constructor
  • the last one assumes that Endpoint was already stored in ThreadLocal because probably we've sent some annotations like "sr" already and so filled the Endpoint
  • but in my case the Endpoint was not submitted yet, it has null value, so I get "null reference exception" inside copy constructor
  • just adding a test if (endPoint != null) probably will not work well since the serviceName on client will be lost. And it is not clear what should be passed as "spanname", "ip" if I use client only and override serviceName? Is EndPoint capable of storing such a state and Zipkin to process it?

I don't understand what is the source of problem here:

  1. incorrect getClientEndPoint implementation?
  2. incorrect logic in ClientTracerImpl in an untested scenario, so it incorrectly assumes that EndPoint must be already submitted without testing it?
  3. incorrect usage of class ServerAndClientSpanStateImpl in the case when the Method2 is not traced itself so I have to use only ClientState and ClientTracer?

I hope this is 1 or 2. If it's 3 that's bad since I will have to create a different code for client tracing depending on tracing of current method.

P.S. Also - the whole algorithm for providing service/span names from client and server side is not clear, including the purpose of X-B3-Span-Name header. That happens if they don't match on client/server side. And what are the best practices? I think it's worth to have it all described in one readme.

Thank you for an attention!

@kristofa
Copy link
Member

Hey @fedor57 ,

So far with all implementations we have the endpoint set as part of the server side integration code.
This does not mean however that we should fail with NullPointerException at client side when endpoint has not been set. This should get fixed.

I also agree that we should be able to start a span from server or client side. Currently it looks like we have a dependency on the server side.

The X-B3-Span-Name header was introduced by me and had the purpose of aligning client/server side snap names. If a span name was created at client side which was not easily derivable at server side passing the name and re-using it at server side was a way to solve this. However I'm thinking of abandoning this approach and choosing simple span names, see #59.

I want to spend some time soon on the work needed for brave 3.0 and I'll take these 2 issues + documentation into account.

@kristofa kristofa added the bug label Mar 15, 2015
@fedor57
Copy link
Author

fedor57 commented Mar 16, 2015

Thanx, Kristof. I have answered on service/span naming on #59, regarding the correct approach for clients:

If current request had no instrumentation but the current module had a Zipkin library attached with client requests instrumented we should start a span automatically. No point in having "cs" without parent-span-id and break 3 client calls into different traces.

But theoretically there is a case when it shouldn't. Imagine you have a testing utility that fires 1000 requests from logs from different clients.

If we autostart the span it should start at least at the beginning of the first call and end at least at the end of the last.

But also we should be able to mark the beginning earlier and end later.

For the IP call we can use not Start/Done, but call some annotation method like AddAnnotation("Start"), AddAnnotation("End") - so those does not assume that current span exists or not, but makes sure that it will exist after the call. Or we can use AddAnnotation() without parameters that just checks the the current moment belongs to some span.

There is a problem with end of the span if no "ss" interceptor will be fired. So if I don't call Done - the trace is created but the current span is never finished. Maybe this to be considered - to autoclose span after timeout and mark it somehow?

@kristofa
Copy link
Member

@fedor57 This issue should be fixed with brave 3.0.0-rc-1.

@kristofa
Copy link
Member

Fixed. EndpointSubmitter does not exist anymore. No need to submit endpoint upfront.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants