Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spans/traces leak between requests if child spans not finished correctly #109

Closed
rnorth opened this issue Oct 16, 2018 · 4 comments
Closed

Comments

@rnorth
Copy link
Contributor

rnorth commented Oct 16, 2018

I don't think this is a bug per se, but perhaps a missed opportunity: the current implementation relies on developers' span handling being bug-free, otherwise extremely undesirable traces are generated.

As an example, we had a service where a span had been started but not finished due to a bug in service code. This imbalance between start/finish meant that, when the server started handling a new request, there was already an unrelated span in on the threadlocal. Future requests served on the same thread would all pile on top of the same trace.

The effect for us was massive, useless traces, with many thousands of spans and lasting for minutes or hours.

Obviously 'don't forget to finish spans' would be good advice, but I think the JAX-RS filter could do more to limit the damage when such a bug inevitably occurs.

ServerTracingFilter seems to actively try and join any existing (thread local) spans - I can see that this was a deliberate move as part of #74. I'm wondering if this behaviour could be made configurable, though, so that users can opt out.

If it helps, I've created an example repo to demonstrate the issue:

We have forked ServerTracingFilter locally to do this, so could easily submit a PR if you'd like.

FWIW we did also try a few approaches to fix this in SpanFinishingFilter, but found it difficult to correctly clear out thread local state. In the end, an approach of ignoring the state in ServerTracingFilter was the only thing that worked for us.

I think this is similar to, but not the same as, #87.

@pavolloffay
Copy link
Collaborator

@rnorth thanks for reporting!

What is your fix? Add a parameter to jax-rs filter to skip any thread locals?

@rnorth
Copy link
Contributor Author

rnorth commented Oct 17, 2018

So far (internally) I've just taken the crude approach of disabling it in code - but I'll happily submit as a PR where it is made configurable. I'll add a suitable with method to ServerTracingDynamicFeature.Builder.

I think the only thing I'm not sure of is what the default should be. I'm strongly inclined to not disrupt the current default behaviour, even though in my view it's a bit of a gotcha.

What do you think?

@pavolloffay
Copy link
Collaborator

I am fine with changing the default behavior. We could even set the parent active span as a second child-of. But I am not sure if it is supported by all tracing systems.

rnorth pushed a commit to rnorth/java-jaxrs that referenced this issue Oct 17, 2018
…ured.

The ensures that the damage caused by accidental thread local leakage between requests is minimized
Fixes opentracing-contrib#109
rnorth pushed a commit to rnorth/java-jaxrs that referenced this issue Oct 22, 2018
…ured.

The ensures that the damage caused by accidental thread local leakage between requests is minimized
Fixes opentracing-contrib#109
rnorth added a commit to rnorth/java-jaxrs that referenced this issue Oct 22, 2018
…ured.

The ensures that the damage caused by accidental thread local leakage between requests is minimized
Fixes opentracing-contrib#109
@sjoerdtalsma
Copy link
Contributor

Isn't it better to do a 'hard' clear of the Scope after the request (before it is returned to the threadpool) using opentracing/opentracing-java#313 ?

FWIW we did also try a few approaches to fix this in SpanFinishingFilter, but found it difficult to correctly clear out thread local state

Can you elaborate? Would OT-java PR 313 above help at all or do you need something else?

pavolloffay pushed a commit that referenced this issue Oct 31, 2018
…ured (#110)

* Make active span be ignored by default; used as parent span if configured.

The ensures that the damage caused by accidental thread local leakage between requests is minimized
Fixes #109

* empty commit to refresh PR

* Changes based on review comments from @pavolloffay
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants