Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

Closed

Conversation

jsuereth
Copy link
Contributor

This PR is for discussion purposes against #570. It represents one option for how we integrate Resource through the SDK.

Basic outline:

  • TracerProvider is associated with a Resource
  • Every time a SpanProcessor is attached to a TracerProvider, it is given a pointer to the resource owned by the TracerProvider
  • When a SpanProcessor calls Export they must pass the associated Resource or the default.
  • It looks like we're missing robust tests around OTLP exporter, which I'll add before this can be submitted. Specifically I want to write a test ensuring resource-labels are appropraitely added to OTLP protos and passed around correctly.

Implications:

  • We've been debating "sharing" SpanProcessors. This PR would make that impossible. Once a SpanProcessor is given to a TracerProvider, we hijack the whole thing for the given TracerProvider
  • SpanProcessor is currently a shared_ptr to a TracerProvider. I think we'd need to turn that into a unique_ptrto guarantee ownership rules around passing Resource * downstream.

Other Alternatives:

  • I had another idea where instead of passing Resource downstream, we actually have Recordable remember its Tracer at the time of creation (and Tracer remember its TracerProvider) which should give us access to both InstrumentationLibrary and Resource. This has the same limitation where we'd need to enforce lifecycles of Exporters (the owners of Recordable) and no longer allow sharing.

Given the nuance and tracking information required from TracerProvider/Tracer for Resource/InstrumentationLibrary I actually think we should abandon the idea of sharing physical instances of Exporters/Processors and instead share configuration/setup (with duplicate classes) to keep this integration simple and easy to maintain.

Wanted to elicit feedback first:

@lalitb @pyohannes @maxgolov @ThomsonTan

@ThomsonTan
Copy link
Contributor

Should the "[WIP]" in the title be removed?

@jsuereth
Copy link
Contributor Author

@ThomsonTan I'll add more tests if we think this is the direction we want to go WRT to Resource, but I'd like to hear from the other contributors (and yourself) if you like what you see here first :) It has the caveat I mentioned and I want to make sure those are acceptable.

@jsuereth jsuereth changed the title [WIP] Feed Resource to Exporter [DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter Feb 11, 2021
@lalitb
Copy link
Member

lalitb commented Feb 11, 2021

@jsuereth Thanks for the design proposal. I can see the benefits arising from this proposal. Couple of points.

  1. Specs considers Resources and InstrumentationLibrary as (additional ) span attributes, and should be accessible from span.
    https://github.com/open-telemetry/opentelemetry-specification/blob/03a6f0fd9d1d2f5e08858b102ba620ebf3e4a6fa/specification/trace/sdk.md#additional-span-interfaces

Readable span: A function receiving this as argument MUST be able to access all information that was added to the span, as listed in the API spec. In particular, it MUST also be able to access the InstrumentationLibrary and Resource information (implicitly) associated with the span. It must also be able to reliably determine whether the Span has ended (some languages might implement this by having an end timestamp of null, others might have an explicit hasEnded boolean).

Are we going to loose this functionality with the approach.

  1. I agree on the proposed design being much simpler than Add Resources to Recordable interface. #570 . Probably, I am missing to understand the performance benefit we will achieve against passing resource reference through Recordable interface.

@jsuereth jsuereth marked this pull request as draft February 11, 2021 17:15
@pyohannes
Copy link
Contributor

SpanProcessor is currently a shared_ptr to a TracerProvider. I think we'd need to turn that into a unique_ptr to guarantee ownership rules around passing Resource * downstream.

Theoretically, a TracerProvider can hand out multiple different Tracer instances, and each Tracer currently has to hold on to the SpanProcessor. We might run into troubles there when changing to an unique pointer here.

@pyohannes
Copy link
Contributor

pyohannes commented Feb 12, 2021

I'm torn about what approach to take.

  1. The specification suggests that resources should be bound to spans (maybe via making it possible to navigate to the tracer via the span). I like that from a conceptual point of view, because it doesn't put any constraints on the tracer/processor/exporter architecture.
  2. The approach in this PR provides a more performant implementation, and it also makes more performant exporter implementations possible.

I have a slight preference for (1). Some reasons for that:

  • I think ideally the span processor should only hold state information that is necessary for processing spans. The makes for a clean design.
  • SpanProcessor is an extension point of the SDK (users can implement their own span processors). If the span processor has to store and pass on resources this gets more complicated and error prone.
  • We don't have to violate the spec when it comes to the Export method of the exporter.

@reyang
Copy link
Member

reyang commented Feb 12, 2021

@jsuereth this is exactly what folks did in OpenTelemetry .NET! 😄

@CodeBlanch made it happen with open-telemetry/opentelemetry-dotnet#1463.

@reyang
Copy link
Member

reyang commented Feb 12, 2021

Not blocking this PR - I think we want to enforce the exporter and processor ownership - e.g. an exporter should be owned by one processor, and a processor should be owned by the provider (we probably need to discuss this in the spec SIG).

Since both exporter and processor have Shutdown function that should be only called once, it doesn't make sense for them to "belong" to multiple providers/pipelines (e.g. what do we do if an exporter is registered to two providers, do we shutdown the exporter when the 1st provider dtor happened, or we use ref-counting approach? - both seem to be cumbersome).

@utpilla FYI since you're doing some investigation on OpenTelemetry .NET object lifecycle management. (cc @cijothomas)

@lalitb
Copy link
Member

lalitb commented Feb 12, 2021

Not blocking this PR - I think we want to enforce the exporter and processor ownership - e.g. an exporter should be owned by one processor, and a processor should be owned by the provider (we probably need to discuss this in the spec SIG).

Since both exporter and processor have Shutdown function that should be only called once, it doesn't make sense for them to "belong" to multiple providers/pipelines (e.g. what do we do if an exporter is registered to two providers, do we shutdown the exporter when the 1st provider dtor happened, or we use ref-counting approach? - both seem to be cumbersome).

@utpilla FYI since you're doing some investigation on OpenTelemetry .NET object lifecycle management. (cc @cijothomas)

Thanks Reiley for providing .Net apporach. Proposal looks good to me with below assumptions:

  • Exporters won't be attached to more than one span processor.
  • Span processor won't be attached with more than one tracer provider.
  • A process having multiple tracer provides ( and thus multiple resources ) won't be sharing the same exporter pipeline.

And with below changes:

  • As per Specs, resources and instrumentation library are extended span properties, and gettable from it. And we may have to store the references to tracer and tracer provider in span to achieve this.
  • Exporter::Export() should be aligned with specs, so we may have to maintain the reference to tracer provider in exporter to fetch resources.
  • In case resources are made mutable/append-only in specs ( Adding resource attributes post-creation (e.g. via auto-discovery) opentelemetry-specification#1298 ), the current design proposal should have minimal changes, which can be tricky if exporters are using optimization to pre-convert the resources to required format.

I can see performance benefit of this approach for languages which doesn't support references to shared data, and so resources need to be copied to each span data. In our case, we can pass reference to resources as part of span data to exporters without performance overhead. The benefit can be optimization at exporter level ( as Johannes mentioned), where exporter can convert the resources to required format once instead of doing for every span data.

@ThomsonTan
Copy link
Contributor

For the scenario TracerProvider updates Resource after Span creation, if Resource is owned by TracerProvider which belongs to Span logically, this means Spans will always refer to last set Resource which seems different than the current spec requires.

@jsuereth
Copy link
Contributor Author

let me collate concerns and answer questions here:

  1. @pyohannes: Theoretically, a TracerProvider can hand out multiple different Tracer instances, and each Tracer currently has to hold on to the SpanProcessor. We might run into troubles there when changing to an unique pointer here.
  • I think we can make Tracer take the SpanProcessor by reference. AFAIK, Tracer's lifecycle is entirely contained within TracerProvider, so this should be ok.
  • I do not think this logic applies to Tracer and InstrumentationLibrary.
  1. @lalitb + @pyohannes The specification requires all Spans to have access to the resource that created them.
  • This is the one that I struggled with the most. I can see a variant of this PR where we pass a pointer to the Resource in TracerProvider to the Recordable. Here's why I think it's ok:
    • Recordable is fully owned by an Exporter
    • Exporter (in this proposal) would be fully owned by Processor
    • Processor (in this proposal) would be fully owned by the TracerProvider, so we can guarantee the reference is alive
  • I still don't think this works for InstrumentationLibrary, but I may be able to noodle something out.
  1. @reyang We should enforce the process/exporter ownership.
    • I don't think this needs to be in the Specification, but we can raise it there. I think C++ has some unique challenges to compared to other languages where "sharing" is less of an issue (or indeed, you can't enforce NOT sharing).
    • IF you want to raise this to the specification please do! I personally think this is a localized C++-ish issue. We could see how Rust/Swift are doing this for ideas from closer-to-C++ languages.
  2. @ThomsonTan the scenario TracerProvider updates Resource after Span creation, if Resource is owned by TracerProvider which belongs to Span logically, this means Spans will always refer to last set Resource which seems different than the current spec requires
  • Yes. I'm not sure there's a way to get around this if we want to share the object itself.
  • I think this proposal is likely to be closer in behavior to what changes will occur for mutable Resources. I.e. we should prioritize getting as many labelsfrom Resource into spans we export, which means delay/deferring locking down those labels.

SO next step proposals. I"m going to make several next step comments, please +1 to vote for your choice.

@jsuereth
Copy link
Contributor Author

Proposal #1: Finish the CL as-is

  • Add necessary OTLP tests and submit for full review.
  • Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

@jsuereth
Copy link
Contributor Author

jsuereth commented Feb 12, 2021

Proposal #2: Change Recordable to take a Resource* from TraceProvider/Tracer on creation.

  • add "SetResourceRef" to Recordable
  • Tracer remembers its TracerProvider (or is given a Resource*)
  • Creating a Recordable will assign the resoourceRef
  • Fix up OTLP recordable such that Resource refs are visible to the OTLP exporter, update our logic appropriately.
  • Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

Note: If i understand correctly THIS is what @lalitb suggests. I'll start working a PR for this, but just wanted to check sentiment against the other comments.

@pyohannes
Copy link
Contributor

#2 sounds good to me.

@lalitb
Copy link
Member

lalitb commented Feb 12, 2021

Proposal #2: Change Recordable to take a Resource* from TraceProvider/Tracer on creation.

  • add "SetResourceRef" to Recordable
  • Tracer remembers its TracerProvider (or is given a Resource*)
  • Creating a Recordable will assign the resoourceRef
  • Fix up OTLP recordable such that Resource refs are visible to the OTLP exporter, update our logic appropriately.
  • Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

Note: If i understand correctly THIS is what @lalitb suggests. I'll start working a PR for this, but just wanted to check sentiment against the other comments.

Thanks @jsuereth for summarizing the discussions, and both the alternatives. Yes, I was suggesting close to approach #2, probably if possible to have quick prototype on this ( needn't be complete, spending couple of hours), it would be easy to finalize on one.
We can proceed with this option, if there are sufficient votes.

@jsuereth
Copy link
Contributor Author

I'm taking 3 votes as sufficient. You can watch progress on #580 (or help :) ).

@jsuereth jsuereth closed this Feb 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants