[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

jsuereth · 2021-02-10T21:26:16Z

This PR is for discussion purposes against #570. It represents one option for how we integrate Resource through the SDK.

Basic outline:

TracerProvider is associated with a Resource
Every time a SpanProcessor is attached to a TracerProvider, it is given a pointer to the resource owned by the TracerProvider
When a SpanProcessor calls Export they must pass the associated Resource or the default.
It looks like we're missing robust tests around OTLP exporter, which I'll add before this can be submitted. Specifically I want to write a test ensuring resource-labels are appropraitely added to OTLP protos and passed around correctly.

Implications:

We've been debating "sharing" SpanProcessors. This PR would make that impossible. Once a SpanProcessor is given to a TracerProvider, we hijack the whole thing for the given TracerProvider
SpanProcessor is currently a shared_ptr to a TracerProvider. I think we'd need to turn that into a unique_ptrto guarantee ownership rules around passing Resource * downstream.

Other Alternatives:

I had another idea where instead of passing Resource downstream, we actually have Recordable remember its Tracer at the time of creation (and Tracer remember its TracerProvider) which should give us access to both InstrumentationLibrary and Resource. This has the same limitation where we'd need to enforce lifecycles of Exporters (the owners of Recordable) and no longer allow sharing.

Given the nuance and tracking information required from TracerProvider/Tracer for Resource/InstrumentationLibrary I actually think we should abandon the idea of sharing physical instances of Exporters/Processors and instead share configuration/setup (with duplicate classes) to keep this integration simple and easy to maintain.

Wanted to elicit feedback first:

@lalitb @pyohannes @maxgolov @ThomsonTan

ThomsonTan · 2021-02-10T22:43:51Z

Should the "[WIP]" in the title be removed?

jsuereth · 2021-02-11T00:48:26Z

@ThomsonTan I'll add more tests if we think this is the direction we want to go WRT to Resource, but I'd like to hear from the other contributors (and yourself) if you like what you see here first :) It has the caveat I mentioned and I want to make sure those are acceptable.

lalitb · 2021-02-11T12:28:54Z

@jsuereth Thanks for the design proposal. I can see the benefits arising from this proposal. Couple of points.

Specs considers Resources and InstrumentationLibrary as (additional ) span attributes, and should be accessible from span.
https://github.com/open-telemetry/opentelemetry-specification/blob/03a6f0fd9d1d2f5e08858b102ba620ebf3e4a6fa/specification/trace/sdk.md#additional-span-interfaces

Readable span: A function receiving this as argument MUST be able to access all information that was added to the span, as listed in the API spec. In particular, it MUST also be able to access the InstrumentationLibrary and Resource information (implicitly) associated with the span. It must also be able to reliably determine whether the Span has ended (some languages might implement this by having an end timestamp of null, others might have an explicit hasEnded boolean).

Are we going to loose this functionality with the approach.

I agree on the proposed design being much simpler than Add Resources to Recordable interface. #570 . Probably, I am missing to understand the performance benefit we will achieve against passing resource reference through Recordable interface.

pyohannes · 2021-02-12T02:26:40Z

SpanProcessor is currently a shared_ptr to a TracerProvider. I think we'd need to turn that into a unique_ptr to guarantee ownership rules around passing Resource * downstream.

Theoretically, a TracerProvider can hand out multiple different Tracer instances, and each Tracer currently has to hold on to the SpanProcessor. We might run into troubles there when changing to an unique pointer here.

pyohannes · 2021-02-12T02:30:49Z

I'm torn about what approach to take.

The specification suggests that resources should be bound to spans (maybe via making it possible to navigate to the tracer via the span). I like that from a conceptual point of view, because it doesn't put any constraints on the tracer/processor/exporter architecture.
The approach in this PR provides a more performant implementation, and it also makes more performant exporter implementations possible.

I have a slight preference for (1). Some reasons for that:

I think ideally the span processor should only hold state information that is necessary for processing spans. The makes for a clean design.
SpanProcessor is an extension point of the SDK (users can implement their own span processors). If the span processor has to store and pass on resources this gets more complicated and error prone.
We don't have to violate the spec when it comes to the Export method of the exporter.

reyang · 2021-02-12T05:09:19Z

@jsuereth this is exactly what folks did in OpenTelemetry .NET! 😄

@CodeBlanch made it happen with open-telemetry/opentelemetry-dotnet#1463.

reyang · 2021-02-12T05:41:13Z

Not blocking this PR - I think we want to enforce the exporter and processor ownership - e.g. an exporter should be owned by one processor, and a processor should be owned by the provider (we probably need to discuss this in the spec SIG).

Since both exporter and processor have Shutdown function that should be only called once, it doesn't make sense for them to "belong" to multiple providers/pipelines (e.g. what do we do if an exporter is registered to two providers, do we shutdown the exporter when the 1st provider dtor happened, or we use ref-counting approach? - both seem to be cumbersome).

@utpilla FYI since you're doing some investigation on OpenTelemetry .NET object lifecycle management. (cc @cijothomas)

lalitb · 2021-02-12T06:57:12Z

Not blocking this PR - I think we want to enforce the exporter and processor ownership - e.g. an exporter should be owned by one processor, and a processor should be owned by the provider (we probably need to discuss this in the spec SIG).

Since both exporter and processor have Shutdown function that should be only called once, it doesn't make sense for them to "belong" to multiple providers/pipelines (e.g. what do we do if an exporter is registered to two providers, do we shutdown the exporter when the 1st provider dtor happened, or we use ref-counting approach? - both seem to be cumbersome).

@utpilla FYI since you're doing some investigation on OpenTelemetry .NET object lifecycle management. (cc @cijothomas)

Thanks Reiley for providing .Net apporach. Proposal looks good to me with below assumptions:

Exporters won't be attached to more than one span processor.
Span processor won't be attached with more than one tracer provider.
A process having multiple tracer provides ( and thus multiple resources ) won't be sharing the same exporter pipeline.

And with below changes:

As per Specs, resources and instrumentation library are extended span properties, and gettable from it. And we may have to store the references to tracer and tracer provider in span to achieve this.
Exporter::Export() should be aligned with specs, so we may have to maintain the reference to tracer provider in exporter to fetch resources.
In case resources are made mutable/append-only in specs ( Adding resource attributes post-creation (e.g. via auto-discovery) opentelemetry-specification#1298 ), the current design proposal should have minimal changes, which can be tricky if exporters are using optimization to pre-convert the resources to required format.

I can see performance benefit of this approach for languages which doesn't support references to shared data, and so resources need to be copied to each span data. In our case, we can pass reference to resources as part of span data to exporters without performance overhead. The benefit can be optimization at exporter level ( as Johannes mentioned), where exporter can convert the resources to required format once instead of doing for every span data.

ThomsonTan · 2021-02-12T08:34:04Z

For the scenario TracerProvider updates Resource after Span creation, if Resource is owned by TracerProvider which belongs to Span logically, this means Spans will always refer to last set Resource which seems different than the current spec requires.

jsuereth · 2021-02-12T14:37:22Z

let me collate concerns and answer questions here:

@pyohannes: Theoretically, a TracerProvider can hand out multiple different Tracer instances, and each Tracer currently has to hold on to the SpanProcessor. We might run into troubles there when changing to an unique pointer here.

I think we can make Tracer take the SpanProcessor by reference. AFAIK, Tracer's lifecycle is entirely contained within TracerProvider, so this should be ok.
I do not think this logic applies to Tracer and InstrumentationLibrary.

@lalitb + @pyohannes The specification requires all Spans to have access to the resource that created them.

This is the one that I struggled with the most. I can see a variant of this PR where we pass a pointer to the Resource in TracerProvider to the Recordable. Here's why I think it's ok:
- Recordable is fully owned by an Exporter
- Exporter (in this proposal) would be fully owned by Processor
- Processor (in this proposal) would be fully owned by the TracerProvider, so we can guarantee the reference is alive
I still don't think this works for InstrumentationLibrary, but I may be able to noodle something out.

@reyang We should enforce the process/exporter ownership.
- I don't think this needs to be in the Specification, but we can raise it there. I think C++ has some unique challenges to compared to other languages where "sharing" is less of an issue (or indeed, you can't enforce NOT sharing).
- IF you want to raise this to the specification please do! I personally think this is a localized C++-ish issue. We could see how Rust/Swift are doing this for ideas from closer-to-C++ languages.
@ThomsonTan the scenario TracerProvider updates Resource after Span creation, if Resource is owned by TracerProvider which belongs to Span logically, this means Spans will always refer to last set Resource which seems different than the current spec requires

Yes. I'm not sure there's a way to get around this if we want to share the object itself.
I think this proposal is likely to be closer in behavior to what changes will occur for mutable Resources. I.e. we should prioritize getting as many labelsfrom Resource into spans we export, which means delay/deferring locking down those labels.

SO next step proposals. I"m going to make several next step comments, please +1 to vote for your choice.

jsuereth · 2021-02-12T14:38:29Z

Proposal #1: Finish the CL as-is

Add necessary OTLP tests and submit for full review.
Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

jsuereth · 2021-02-12T14:41:26Z

Proposal #2: Change Recordable to take a Resource* from TraceProvider/Tracer on creation.

add "SetResourceRef" to Recordable
Tracer remembers its TracerProvider (or is given a Resource*)
Creating a Recordable will assign the resoourceRef
Fix up OTLP recordable such that Resource refs are visible to the OTLP exporter, update our logic appropriately.
Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

Note: If i understand correctly THIS is what @lalitb suggests. I'll start working a PR for this, but just wanted to check sentiment against the other comments.

pyohannes · 2021-02-12T17:29:32Z

#2 sounds good to me.

lalitb · 2021-02-12T17:33:39Z

Proposal #2: Change Recordable to take a Resource* from TraceProvider/Tracer on creation.

add "SetResourceRef" to Recordable

Tracer remembers its TracerProvider (or is given a Resource*)

Creating a Recordable will assign the resoourceRef

Fix up OTLP recordable such that Resource refs are visible to the OTLP exporter, update our logic appropriately.

Open a ticket/PR to migrate SpanProcess/Exporter ownership to match what I think is consensus.

Note: If i understand correctly THIS is what @lalitb suggests. I'll start working a PR for this, but just wanted to check sentiment against the other comments.

Thanks @jsuereth for summarizing the discussions, and both the alternatives. Yes, I was suggesting close to approach #2, ~~probably if possible to have quick prototype on this ( needn't be complete, spending couple of hours), it would be easy to finalize on one~~.
We can proceed with this option, if there are sufficient votes.

jsuereth · 2021-02-12T21:50:37Z

I'm taking 3 votes as sufficient. You can watch progress on #580 (or help :) ).

jsuereth added 4 commits February 10, 2021 13:59

Wire Resource through sdk to exporters.

72de436

Update exporters to take in resource (but not use it).

9511a1f

Add attempt at etw fixes from linux.

9b6c301

Wire resource into OTLP exporter.

4da2249

jsuereth requested a review from a team as a code owner February 10, 2021 21:26

jsuereth mentioned this pull request Feb 10, 2021

Add Resources to Recordable interface. #570

Closed

jsuereth changed the title ~~[WIP] Feed Resource to Exporter~~ [DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter Feb 11, 2021

jsuereth marked this pull request as draft February 11, 2021 17:15

jsuereth mentioned this pull request Feb 12, 2021

Add resource to Recordable, make room for InstrumentationLibrary #580

Closed

12 tasks

jsuereth closed this Feb 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

jsuereth commented Feb 10, 2021

ThomsonTan commented Feb 10, 2021

jsuereth commented Feb 11, 2021

lalitb commented Feb 11, 2021 •

edited

Loading

pyohannes commented Feb 12, 2021

pyohannes commented Feb 12, 2021 •

edited

Loading

reyang commented Feb 12, 2021

reyang commented Feb 12, 2021 •

edited

Loading

lalitb commented Feb 12, 2021 •

edited

Loading

ThomsonTan commented Feb 12, 2021

jsuereth commented Feb 12, 2021

jsuereth commented Feb 12, 2021

jsuereth commented Feb 12, 2021 •

edited

Loading

pyohannes commented Feb 12, 2021

lalitb commented Feb 12, 2021 •

edited

Loading

jsuereth commented Feb 12, 2021

[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

[DO NOT MERGE - PLEASE REVIEW DESIGN] Feed Resource to Exporter #575

Conversation

jsuereth commented Feb 10, 2021

ThomsonTan commented Feb 10, 2021

jsuereth commented Feb 11, 2021

lalitb commented Feb 11, 2021 • edited Loading

pyohannes commented Feb 12, 2021

pyohannes commented Feb 12, 2021 • edited Loading

reyang commented Feb 12, 2021

reyang commented Feb 12, 2021 • edited Loading

lalitb commented Feb 12, 2021 • edited Loading

ThomsonTan commented Feb 12, 2021

jsuereth commented Feb 12, 2021

jsuereth commented Feb 12, 2021

jsuereth commented Feb 12, 2021 • edited Loading

pyohannes commented Feb 12, 2021

lalitb commented Feb 12, 2021 • edited Loading

jsuereth commented Feb 12, 2021

lalitb commented Feb 11, 2021 •

edited

Loading

pyohannes commented Feb 12, 2021 •

edited

Loading

reyang commented Feb 12, 2021 •

edited

Loading

lalitb commented Feb 12, 2021 •

edited

Loading

jsuereth commented Feb 12, 2021 •

edited

Loading

lalitb commented Feb 12, 2021 •

edited

Loading