Span functions (SetTag, LogKV, etc.) only operate on spans initially marked as sampled #391
Comments
This probably extends to quite a few other places as well, e.g. here, and I'm wondering if forcing span collection via the AFAICT the OpenTracing spec doesn't actually say what the tracers should do in cases where collection is forced, and simply states the following:
However, even if setting the sampling priority is only intended for debugging purposes, why wouldn't one want tags, logs, child spans, etc. assigned as per normal?
Using |
It's the only way I can see to decide whether a span should be sampled after initialization, which is a requirement for this use-case. From what I can tell, OpenTracing doesn't support this specifically, but it also doesn't call for implementations to short-circuit operations on as-yet-unsampled spans (which seems to be the issue here); I could be wrong. I'm willing to do the work here provisionally, but I'm guessing the operations short-circuit for performance reasons, and you might not be willing to incur the penalty for only a small number of users, though the overhead might be negligible.
Again, I just want to confirm that you understand that with this approach you will capture exactly one span out of the whole trace. Is that your intention?
Ah, apologies. No, my intention is that I'm able to start a span, attach tags, logs, and child spans as per usual, but defer the decision of whether the span should be sampled or not until some time before I realize this isn't possible in the current state of the code, and my understanding is that As stated above, I'm willing to explore a correct solution and do the work for implementing it, assuming the business case is strong enough. Otherwise, the only (extremely hacky) workaround I can think of is having the initial decision made by a |
I understand the issue, but not your objective. If you have a trace with 100 spans and one of them failed, do you want to see the whole trace in Jaeger or just that single failed span?
Just the span that was "forced" plus any of its child spans, though in our case this would be the root span (there are no other sibling spans). What we essentially do is start a span for the incoming HTTP request, then have underlying handlers start child spans for blocks of business logic (attaching tags and logs along the way). Once the HTTP handler is about to return, we check the code returned by the underlying handlers, and set
It won't work for child spans, as they would already have finished by the time the main span knows it wants to be sampled.
Then the use-case here is perhaps impossible under OpenTracing. I don't mean to waste any more of your time, feel free to close this.
It is possible, but not through client work alone. It's called tail-based sampling, which is partially already supported in OpenCensus Collector.
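To illustrate the idea of tail-based sampling mentioned above, here is a minimal sketch in Go. It assumes finished spans are buffered per trace and the keep/drop decision is made only once the whole trace is complete. The names `finishedSpan`, `traceBuffer`, `Add`, and `Complete` are hypothetical and chosen for illustration; this is not the API of Jaeger or of the OpenCensus Collector.

```go
package main

import "fmt"

// finishedSpan is a hypothetical record of a span that has already finished.
type finishedSpan struct {
	TraceID string
	Name    string
	Err     bool
}

// traceBuffer (hypothetical) holds finished spans per trace until the
// trace as a whole can be judged.
type traceBuffer struct {
	pending map[string][]finishedSpan
}

func newTraceBuffer() *traceBuffer {
	return &traceBuffer{pending: make(map[string][]finishedSpan)}
}

// Add buffers a finished span instead of deciding its fate immediately.
func (b *traceBuffer) Add(s finishedSpan) {
	b.pending[s.TraceID] = append(b.pending[s.TraceID], s)
}

// Complete is called once the trace is done. It returns every buffered
// span of the trace if any span reported an error, and nil otherwise.
func (b *traceBuffer) Complete(traceID string) []finishedSpan {
	spans := b.pending[traceID]
	delete(b.pending, traceID)
	for _, s := range spans {
		if s.Err {
			return spans
		}
	}
	return nil
}

func main() {
	buf := newTraceBuffer()
	// Child spans finish first; the root finishes last and carries the error.
	buf.Add(finishedSpan{TraceID: "t1", Name: "load-user"})
	buf.Add(finishedSpan{TraceID: "t1", Name: "render"})
	buf.Add(finishedSpan{TraceID: "t1", Name: "http-request", Err: true})
	fmt.Println("spans kept:", len(buf.Complete("t1")))
}
```

Because the decision is deferred until after the children have finished, the already-finished child spans can still be kept, which is what a client-side decision at the root span cannot achieve on its own. The cost is that every span must be recorded and buffered somewhere until the decision is made.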
That makes sense, thanks a million. We use Jaeger server-side, so I'll need to dig deeper.
Note that the recently added delayed sampling (#449) allows you to add a sampler that will do this.
Wow, thanks! I'll run some tests and try and get this in, thanks for the ping and your work!
some docs in the readme: https://github.com/jaegertracing/jaeger-client-go#delayed-sampling
Requirement - what kind of business use case are you trying to solve?
Span values (tags, logs, etc.) can only be set on spans marked as "sampled", a state generally decided on StartSpan via the configured Sampler.
However, it's possible to "force" span collection via the sampling.priority tag, which sets the "sampled" state to true after the fact, with any tags set before this being ignored (perhaps counter-intuitively). In our case, this is used to force span collection on 5xx errors, which are rare but important enough to collect regardless of Sampler configuration.
Problem - what in Jaeger blocks you from solving the requirement?
Tags and logs collected before collection is forced are essentially lost, having never been attached in the first place. This is supposedly done on purpose as an optimization.
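The behaviour described above can be reproduced with a small toy model. This is only a sketch: `toySpan` is a hypothetical stand-in written for illustration, not the Jaeger span type, and it assumes the short-circuit amounts to "drop the write unless the span is already sampled".

```go
package main

import "fmt"

// toySpan is a hypothetical stand-in for a tracer span (not the Jaeger type).
type toySpan struct {
	sampled bool
	tags    map[string]interface{}
}

// SetTag mimics the short-circuit: the "sampling.priority" tag may flip
// the sampled flag after the fact, but any other tag is silently dropped
// while the span is still unsampled.
func (s *toySpan) SetTag(key string, value interface{}) {
	if key == "sampling.priority" {
		if p, ok := value.(int); ok {
			s.sampled = p > 0
		}
		return
	}
	if !s.sampled {
		return // write is lost; nothing is buffered for later
	}
	if s.tags == nil {
		s.tags = make(map[string]interface{})
	}
	s.tags[key] = value
}

func main() {
	span := &toySpan{} // starts unsampled, as if the sampler said "no"
	span.SetTag("http.method", "GET")    // dropped
	span.SetTag("sampling.priority", 1)  // forces sampling
	span.SetTag("http.status_code", 500) // kept
	_, hasMethod := span.tags["http.method"]
	fmt.Println("http.method kept:", hasMethod)
	fmt.Println("tags recorded:", len(span.tags))
}
```

In this model only the tag set after sampling was forced survives, which matches the "sample-on-error" problem: the context gathered before the error is exactly what is missing.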
Proposal - what do you suggest to solve the problem or improve the existing situation?
Calls to Span.SetOperationName, Span.SetTag, Span.LogKV, etc. should not check s.context.IsSampled(), as the result can change after initialization; having these operations never take place is counter-intuitive and prevents "sample-on-error" scenarios as described above.