Client interceptors depend on server Endpoint set #66

fedor57 · 2015-03-03T14:48:43Z

Hi,

I have a test with a structure Browser -> Service2.Method2 -> Service1.Method1

On Service2 by some reason interceptors are installed separately for receiving the request for Method2 and sending the request for Service1.Method1 as a client. Those interceptors use ServerAndClientSpanStateImpl.

If I don't install interceptors for receiving request for Method2 the following happens:

ClientRequestInterceptor.handle is called by Out interceptor for client request
it tries to submit "request" binary annotation by clientTracer.submitBinaryAnnotation
it calls ClientTracerImpl.getClientEndPoint()
it checks serviceName and if it is not null, tries to create an EndPoint using copy constructor
the last one assumes that Endpoint was already stored in ThreadLocal because probably we've sent some annotations like "sr" already and so filled the Endpoint
but in my case the Endpoint was not submitted yet, it has null value, so I get "null reference exception" inside copy constructor
just adding a test if (endPoint != null) probably will not work well since the serviceName on client will be lost. And it is not clear what should be passed as "spanname", "ip" if I use client only and override serviceName? Is EndPoint capable of storing such a state and Zipkin to process it?

I don't understand what is the source of problem here:

incorrect getClientEndPoint implementation?
incorrect logic in ClientTracerImpl in an untested scenario, so it incorrectly assumes that EndPoint must be already submitted without testing it?
incorrect usage of class ServerAndClientSpanStateImpl in the case when the Method2 is not traced itself so I have to use only ClientState and ClientTracer?

I hope this is 1 or 2. If it's 3 that's bad since I will have to create a different code for client tracing depending on tracing of current method.

P.S. Also - the whole algorithm for providing service/span names from client and server side is not clear, including the purpose of X-B3-Span-Name header. That happens if they don't match on client/server side. And what are the best practices? I think it's worth to have it all described in one readme.

Thank you for an attention!

kristofa · 2015-03-15T16:43:30Z

Hey @fedor57 ,

So far with all implementations we have the endpoint set as part of the server side integration code.
This does not mean however that we should fail with NullPointerException at client side when endpoint has not been set. This should get fixed.

I also agree that we should be able to start a span from server or client side. Currently it looks like we have a dependency on the server side.

The X-B3-Span-Name header was introduced by me and had the purpose of aligning client/server side snap names. If a span name was created at client side which was not easily derivable at server side passing the name and re-using it at server side was a way to solve this. However I'm thinking of abandoning this approach and choosing simple span names, see #59.

I want to spend some time soon on the work needed for brave 3.0 and I'll take these 2 issues + documentation into account.

fedor57 · 2015-03-16T09:37:23Z

Thanx, Kristof. I have answered on service/span naming on #59, regarding the correct approach for clients:

If current request had no instrumentation but the current module had a Zipkin library attached with client requests instrumented we should start a span automatically. No point in having "cs" without parent-span-id and break 3 client calls into different traces.

But theoretically there is a case when it shouldn't. Imagine you have a testing utility that fires 1000 requests from logs from different clients.

If we autostart the span it should start at least at the beginning of the first call and end at least at the end of the last.

But also we should be able to mark the beginning earlier and end later.

For the IP call we can use not Start/Done, but call some annotation method like AddAnnotation("Start"), AddAnnotation("End") - so those does not assume that current span exists or not, but makes sure that it will exist after the call. Or we can use AddAnnotation() without parameters that just checks the the current moment belongs to some span.

There is a problem with end of the span if no "ss" interceptor will be fired. So if I don't call Done - the trace is created but the current span is never finished. Maybe this to be considered - to autoclose span after timeout and mark it somehow?

kristofa · 2015-08-31T18:53:25Z

@fedor57 This issue should be fixed with brave 3.0.0-rc-1.

kristofa · 2015-09-21T18:20:13Z

Fixed. EndpointSubmitter does not exist anymore. No need to submit endpoint upfront.

kristofa added the bug label Mar 15, 2015

kristofa mentioned this issue Jun 21, 2015

Brave 3.0.0 #59

Closed

kristofa mentioned this issue Jun 30, 2015

Fix NPE with MySQL when EndPoint not submitted yet #80

Merged

kristofa added this to the brave-3.0.0 milestone Jul 9, 2015

kristofa closed this as completed Sep 21, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Client interceptors depend on server Endpoint set #66

Client interceptors depend on server Endpoint set #66

fedor57 commented Mar 3, 2015

kristofa commented Mar 15, 2015

fedor57 commented Mar 16, 2015

kristofa commented Aug 31, 2015

kristofa commented Sep 21, 2015

Client interceptors depend on server Endpoint set #66

Client interceptors depend on server Endpoint set #66

Comments

fedor57 commented Mar 3, 2015

kristofa commented Mar 15, 2015

fedor57 commented Mar 16, 2015

kristofa commented Aug 31, 2015

kristofa commented Sep 21, 2015