
Conversation

philipgian
Contributor

This PR adds three things:

  • The ability to override the codecs used by the tracer on initialization.
  • The ability to override the default one-span-per-rpc behaviour on server-side spans.
    This is quite useful for visualizing the server-side spans on a call as nested spans.
  • An interface and a mechanism that allows an external entity to accumulate and batch-report spans for performance reasons.

Add report_spans method to the Reporter interface to formally support batch-sending spans accumulated by an external entity. This enables accumulating finished spans and reporting them all together using the new 'report_spans' interface.
Add option to override the default one-span-per-rpc behaviour. This is useful for visualizing RPC calls as nested spans.
Add options to provide extra codecs to the tracer. This allows extending or overriding the tracer's codecs during initialization.
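For illustration, the first two options might be used roughly like this at tracer construction time. This is a minimal sketch: the extra_codecs and one_span_per_rpc keyword arguments are the options proposed here (not an existing API), and the HeadersCodec class is a hypothetical custom codec.

# Illustrative sketch only; the extra_codecs and one_span_per_rpc kwargs are
# the proposed options, not part of the current Tracer API.
from opentracing.propagation import Format

from jaeger_client import ConstSampler, Tracer
from jaeger_client.codecs import Codec
from jaeger_client.reporter import NullReporter


class HeadersCodec(Codec):
    """Hypothetical codec that would replace the default HTTP_HEADERS propagation."""

    def inject(self, span_context, carrier):
        pass  # write the span context into the carrier dict

    def extract(self, carrier):
        pass  # read a span context back out of the carrier dict


tracer = Tracer(
    service_name='my-service',
    reporter=NullReporter(),
    sampler=ConstSampler(True),
    # proposed: extend or override the default codecs at initialization
    extra_codecs={Format.HTTP_HEADERS: HeadersCodec()},
    # proposed: represent the server side of an RPC as a nested child span
    one_span_per_rpc=False,
)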
@CLAassistant

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@coveralls

Coverage Status

Coverage decreased (-0.7%) to 93.329% when pulling c621a81 on philipgian:master into 626ee22 on uber:master.

def report_span(self, span):
    self.reporter.report_span(span)

def report_spans(self, spans):

Member

Nothing calls this method; why do we need it?


self.end_time = finish_time or time.time()  # no locking
if not defer_report:
    self.tracer.report_span(self)

Member


Why is this needed? What else would trigger reporting if not span.finish()?

@yurishkuro
Member

Regarding the objectives of this PR:

  • the first two points would be best done as separate PRs, as otherwise it's hard to provide a sensible commit message describing the change
  • I don't quite follow why you need the last point. "an external entity to accumulate and batch reporting" is precisely what the Reporter interface is for. Specifically, the RemoteReporter does the batching and uses Senders to actually ship the data out of process. There may be a way to refactor the remote reporter to not depend on Tornado, with a separate thread - it was just easier to write with Tornado's producer-consumer framework.
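
For context, a minimal sketch of what "batching inside the reporter" could look like, assuming only the report_span(span) contract visible in the snippet above; the BatchingReporter class and its sender argument are hypothetical, not part of the library.

class BatchingReporter(object):
    """Hypothetical reporter that buffers spans and ships them in batches.

    It relies only on the existing contract: the tracer calls
    report_span(span) once for every finished span.
    """

    def __init__(self, sender, batch_size=100):
        self._sender = sender          # anything with a send(list_of_spans) method
        self._batch_size = batch_size
        self._buffer = []              # no locking shown; real code would need it

    def report_span(self, span):
        self._buffer.append(span)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        if self._buffer:
            self._sender.send(self._buffer)
            self._buffer = []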

@philipgian
Contributor Author

  • Regarding the first two commits, from what you are saying I'm guessing that you squash the commits while merging, so having them in one PR is a problem. I will make two extra pull requests.

Regarding the batch reporting I went with the interface and the external entity approach for two reasons:

  • It's more flexible.
  • It's less intrusive.

What I am trying to achieve is to have a SpanManager which does something similar to the opentracing-instrumentation Python module. The goal is to track the spans created during the execution of a request, accumulate them (possibly in a thread-local variable), and submit them for reporting when the request is finished, roughly as sketched below. This way the SpanManager knows when to submit all the related spans and keeps the reporter unaware of this.
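
A rough sketch of that idea; all names here (SpanManager, track, request_finished) are hypothetical, and report_spans is the batch interface this PR proposes rather than an existing method.

import threading


class SpanManager(object):
    """Hypothetical manager: collects the spans created while serving a
    request and submits them together when the request finishes."""

    def __init__(self, reporter):
        self._reporter = reporter
        self._local = threading.local()   # one span list per thread/request

    def _spans(self):
        if not hasattr(self._local, 'spans'):
            self._local.spans = []
        return self._local.spans

    def track(self, span):
        # called for every span created during the current request
        self._spans().append(span)

    def request_finished(self):
        spans, self._local.spans = self._spans(), []
        # report_spans is the batch interface proposed in this PR; with a
        # plain reporter this would be a loop over report_span instead
        self._reporter.report_spans(spans)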

I understand that a reporter capable of batch-sending spans is also an advantage, both for getting them out of the critical path and for delivering them to the trace collector.

Perhaps the best approach is a combination of both solutions: an external SpanManager submitting spans in batches to the Reporter would allow the Reporter's helper thread (or something similar) to wake up less frequently or to submit to the trace collector more efficiently.

@yurishkuro
Member

yurishkuro commented Feb 15, 2017

SpanManager would be something that only knows about the OpenTracing API. The Reporter and the relevant methods are internal implementation details of the Jaeger tracer, so SpanManager cannot make use of them. The only thing SpanManager can do is call finish() on the spans. So I do not think any extension to reporting is needed.

@philipgian
Contributor Author

I understand what you are saying, but I was aiming to have a SpanManager that checked whether the tracer supports batch reporting and, if so, gave the Tracer a hint that the Span will be reported later in a batch; otherwise it would fall back to the standard OpenTracing API and just call finish(). That way the SpanManager could work with all OpenTracing-compatible Tracers, but work better with those supporting this "extension".

Anyway, if you object to having this merged, I will withdraw the changes and make only two new PRs with the first two bullets, and maybe proceed with a different approach.

@yurishkuro
Member

I am definitely not in favor of extending the API, mostly because I still do not understand your use case. Maybe you have some instrumentation code you can demonstrate? Whatever the responsibilities of your SpanManager are, it cannot control when span.finish() is called. For example, if a user wants to wrap a function into a span, they can do this:

with tracer.start_span('my function', child_of=get_current_span()) as span:
    # do the work
    # maybe log to span
    span.log(..)

This will automatically call finish() when the span is logically completed, and neither the tracer nor the SpanManager can do anything to delay that. But internally the call to finish() simply delegates to tracer.report_span(), which passes the span to the reporter. If any batching needs to happen (and it already does in our RemoteReporter), it can happen inside the reporter.

@philipgian
Contributor Author

SpanManager has to manage, accumulate, and batch-report only the spans it manages. And for those, it can require that they are finished only through it and disallow a manual finish() (e.g., by overriding the finish() method on the Span it returns, roughly as sketched below). In your example the span is created manually by the user and the SpanManager knows nothing about it, so of course it cannot control it.
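
To make the deferral concrete, a hypothetical managed span might look like the sketch below; it reuses the end_time assignment visible in the diff above but hands the span to the manager instead of the tracer. ManagedSpan and manager.track are made-up names, not Jaeger's API.

import time


class ManagedSpan(object):
    """Hypothetical wrapper returned by the SpanManager: it records the
    finish time itself and defers reporting to the manager."""

    def __init__(self, span, manager):
        self._span = span
        self._manager = manager

    def __getattr__(self, name):
        # delegate everything except finish() to the real span
        return getattr(self._span, name)

    def finish(self, finish_time=None):
        self._span.end_time = finish_time or time.time()  # same as Span.finish()
        self._manager.track(self._span)   # reported later, in a batch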

The point is not to prevent batch reporting or to make it mandatory, but to facilitate moving in this direction. I understand that the Reporter() solution works better, but I also consider batch-reporting support complementary, as it can also work with non-batch-aware Reporters which are not tied to a specific framework and which will be called only once in the lifetime of a request.

This extension was not meant to be compatible with a certain API, but to enable certain functionality without breaking any API compatibility. Anyway, as I said, I totally understand your objection, and if you are still not convinced of the functionality it enables, I will withdraw this PR and work in a different direction.

@yurishkuro
Member

At this point I would advise perhaps just monkey-patching the methods onto the tracer and reporter in your own module. It goes without saying that you're making your module dependent on implementation details of Jaeger, as other OpenTracing tracers may not even have the concept of a "reporter" (at minimum it might be called differently).
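
A sketch of that monkey-patching approach, kept entirely in the user's own module and assuming only the tracer.report_span method quoted earlier in this thread; install_batch_reporting and _report_spans are made-up names.

# your_module.py -- illustrative only; relies on Jaeger implementation details
import types


def _report_spans(self, spans):
    """Batch entry point patched onto the tracer: reports each span in turn."""
    for span in spans:
        self.report_span(span)


def install_batch_reporting(tracer):
    """Attach a hypothetical report_spans method to a jaeger_client Tracer instance."""
    tracer.report_spans = types.MethodType(_report_spans, tracer)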

We try to keep to the "three strikes" rule, i.e. we don't introduce API changes or features unless there are three use cases for them. I am happy to revisit if you release your span manager code in the future so that we can talk in more concrete terms.

In fact, if you think you're building some useful generic functionality, it might be worth opening an issue in the opentracing-python repo (or even the specification repo) so that other people can weigh in. I think your use case sits between two audiences: the end users using OT for instrumentation on one hand, and the maintainers of the tracers on the other. So far OT has had no opinion on how the maintainers implement their OT API, so this kind of common functionality that affects how tracers work largely does not exist. For example, the process of registering custom codecs is implementation-specific.

@philipgian
Contributor Author

OK, thanks for the comments. I'm closing this now as discussed. I've opened #32 as the first of the two PRs.

@philipgian philipgian closed this Feb 20, 2017