Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading the Go SDK #52

Closed
jmacd opened this issue Jul 11, 2019 · 18 comments · Fixed by open-telemetry/oteps#11
Closed

Loading the Go SDK #52

jmacd opened this issue Jul 11, 2019 · 18 comments · Fixed by open-telemetry/oteps#11

Comments

@jmacd
Copy link
Contributor

jmacd commented Jul 11, 2019

As discussed briefly in the 7/9/2019 OpenTelemetry specification SIG meeting, the current specification does not say exactly how to initialize the SDK, particularly, the standard "global" tracer, meter, and recorder interfaces. It's not clear whether this requires a note in the cross-language specification, but I think we can state requirements for Go here. This is a proposal we might want to adopt:

  • The API must provide getters and setters for global Tracer, Meter, and Recorder interfaces, which comprise the standard SDK, these are expected to be widely used by default
  • To satisfy "Zero-Touch telemetry requirements", the application is not required to call an initializer to install or start the default SDK.
  • Where the language provides dynamic code loading, the API is required to support loading a SDK dynamically based on a language-appropriate external configuration (e.g., environment variables).
  • The API is required to support manual installation of a specific SDK, via a direct dependency.

Given these (proposed) requirements, here is an implementation for Go that works. It requires moving the global setter/getter methods into dedicated packages to avoid an import cycle--it is those global packages that trigger auto-loading the SDK.

jmacd#1

@bogdandrutu
Copy link
Member

Offering a setter may not be feasible in regard to the static initialization order. In other words once I got an instance of the Tracer probably is good to guarantee that the get will always return the same instance.

@tigrannajaryan
Copy link
Member

@jmacd

Do I understand it correct that this proposal is the following: as an end user I enable telemetry sending by importing a package and don't need to make an explicit "enable telemetry" call.

To satisfy "Zero-Touch telemetry requirements", the application is not required to call an initializer to install or start the default SDK.

Is "zero-touch" a valid requirement for manually instrumented code? Doesn't the end user have to at least configure destinations to export the data to? Or we assume that there is a default listener somewhere on "localhost"?

@tigrannajaryan
Copy link
Member

@jmacd I may be missing something, so perhaps you can expand a bit on what problem auto-loading solves. It seem to me that "add an import line to auto-load" vs "add a one-line call to load" are similar amount of effort for the end-user with the later being more explicit (would be my preference), but I may be wrong.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 11, 2019

@bogdandrutu This is a tricky question -- I like the idea that once a tracer is set it will not be set again, but I can't see how to avoid it always. Suppose in my example that the OPENTELEMETRY_LIB environment is not set, meaning the auto-load would not install a Tracer. Some code would then, I suppose, retrieve a NoopTracer. At some point later, the application (either via dependency or explicit setter) will install a Tracer, but some code might have obtained a Noop. I think this might be unavoidable, unless the "zero touch" requirement is not really a requirement.

@tigrannajaryan This question is really trying to establish whether the user will configure the SDK before it's installed, or whether it can be fully automatic. If it's fully automatic, I think we could assume that environment variables will be used for configuration.

@tigrannajaryan To be clear, in the PR I shared above, the loading would be achieved as a side-effect of calling global.Tracer() (or one of the other global getters). There would be no application Init() method. I'm not sure what the requirements really are, so I posted this issue. We could instead decide that end users are required to add a one-line import or call a one-line Init() method, and then none of this auto-load stuff would be needed. However, it doesn't avoid the scenario where a global.Tracer() call returns a no-op tracer when the library is used before initialization.

This comes down to the "zero touch" requirement, which I actually like. To me, this means that if you're using any code that depends on OpenTelemetry and which uses the global tracer, you can auto-enable instrumentation just by setting an environment variable.

@tigrannajaryan
Copy link
Member

To be clear, in the PR I shared above, the loading would be achieved as a side-effect of calling global.Tracer() (or one of the other global getters).

@jmacd my understanding is that global.Tracer() is in API package and can be called by any bit of code, including a third-part library instrumented with OpenTelemetry. global.Tracer() should return minimal Tracer if the app chooses to NOT use the SDK. This mean the decision to load or not load the SDK cannot happen merely as a side-effect of global.Tracer() call, that decision has to be done in some other way.

IMO, it must be either an explicit call e.g. something like registerSDK() or be a side-effect of "import"-ing the SDK package.

This comes down to the "zero touch" requirement, which I actually like. To me, this means that if you're using any code that depends on OpenTelemetry and which uses the global tracer, you can auto-enable instrumentation just by setting an environment variable.

If I am using a third-party library which happens to be instrumented with OpenTelemetry API, I cannot simply enable telemetry collection by setting an environment variable. At the minimum I need to build my code with OpenTelemetry SDK. If my application (or third-party library) is merely built with OpenTelemetry API setting an environment variable is not enough for enabling telemetry collection.

So I am assuming that you actually meant to write "you can auto-enable instrumentation just by setting an environment variable AND by building with OpenTelemetry SDK". Is this so or I still misunderstand what you want to say?

@jmacd
Copy link
Contributor Author

jmacd commented Jul 12, 2019

To review, I see three ways to trigger loading the SDK:

  1. The explicit dependency
import _ "github.com/open-telemetry/opentelemetry-go/v1/sdk"
  1. The explicit initializer
import "github.com/open-telemetry/opentelemetry-go/v1/sdk"

func main() {
  sdk.Register()
}
  1. The automatic load I've described:
import "github.com/open-telemetry/opentelemetry-go/api/trace/global"

func main() {
  // ...
  global.Tracer().Start(...)
}

I believe that with case (1) and (2), it's impossible to prevent multiple SDKs from being registered, possibly overwriting each other. It's also impossible to prevent code which does not explicitly depend on / initialize an SDK from getting the a global Noop tracer before any SDK is registered. I am not against supporting the explicit approaches, but these are drawbacks. This doesn't qualify as "zero touch" because the code has to be modified to bind it to a SDK.

For case (3) as outlined in the associated PR above, we're able to ensure that one SDK is loaded before the first use of global state and satisfy the "zero-touch" requirement (if it's a requirement).

@tigrannajaryan
Copy link
Member

@jmacd thanks for listing the cases, this is helpful for discussion.

I believe that with case (1) and (2), it's impossible to prevent multiple SDKs from being registered, possibly overwriting each other.

if I am not wrong Go guarantees that each package is initialized once only (even if imported from multiple places). So for case(1) if the SDK is registered in init() func of sdk.go (or whatever file is named) it will be registered only once.

It's also impossible to prevent code which does not explicitly depend on / initialize an SDK from getting the a global Noop tracer before any SDK is registered.

True. With this approach I do not see a way to guarantee that SDK is initialized before any API call is made for those API calls that may happen from initialization code of other packages (AFAIK package initialization order is by dependency graph and SDK cannot be a dependency of those other packages - thus no guarantee). This is a problem if we do want availability of OpenTelemetry Tracer from initialization code.

The automatic load I've described:

import "github.com/open-telemetry/opentelemetry-go/api/trace/global"

func main() {
  // ...
  global.Tracer().Start(...)
}

OK, I understand what you suggest now. Re-reading this issue description I believe I agree with the goal. I will post comments in the PR that you posted since I have questions about the specific implementation.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 12, 2019

if I am not wrong Go guarantees that each package is initialized once only (even if imported from multiple places). So for case(1) if the SDK is registered in init() func of sdk.go (or whatever file is named) it will be registered only once.

I was thinking of a scenario where two independent libraries tried to import different SDKs, then one would be registered first, followed by a second registration.

@freeformz
Copy link
Contributor

I don't want to continue to be the sour person here, but these are all considered anti-patterns in Go. Global state is terrible. Libraries instrumented with open telemetry should provide a function that allows users of that library to inject the tracer, etc. I am literally running out the door right now so I can't comment any further, but I'm happy to discuss or comment more later.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 15, 2019

I don't take your remarks as "sour"! In making this proposal I've tried to satisfy what I think are requirements, and I think we're disagreeing about what the requirements should be.

I think your position, "Libraries instrumented with open telemetry should provide a function that allows users of that library to inject the tracer, etc.", is incompatible with "zero touch instrumentation", which has been discussed as one of the goals for OpenTelemetry as a project. This means that libraries should just "be instrumented", not worry about where the tracer comes from. OpenTracing had this feature, too.

This proposal showed that using the Go plugin system offers one way to achieve that requirement (i.e., auto dependency injection), but there are other ways it can be satisfied. Your point about global state is true, but if we strike the requirement to support setting the global value, then we have an immutable initialized-before-use object, not global state.

I'd like to point out that even if we accept the idea that an SDK should not be virtualized, that instead we should have a concrete API and rely on exporters for pluggable functionality (https://github.com/iredelmeier/rfcs/blob/sdk-as-exporter/text/0000-sdk-as-exporter.md), we still have a global initialization problem. Instead of debating how to auto-inject an SDK, then, we will be debating how to auto-inject exporters and how to support exporters that are attached "mid-stream".

Should automatic dependency injection be a requirement? I think it should. I filed this issue in the Go repository to test if we really accept it as a requirement.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 23, 2019

I can see that the use of plugins makes this proposal difficult to accept, but I think we'd have just the same problems without plugins.

If we allow the SDK to auto-initialize, and allow the application to re-initialize inside their main() function, what happens when the user never initializes the SDK? If we've installed a temporary SDK that buffers events "until the SDK is loaded", we'll have to address the case where no SDK is loaded by the application.

I suppose one answer would be to install one of these "buffering" SDKs with a fixed limit of events. If an SDK is installed before it reaches the limit, the events can be replayed for the new SDK. If the limit is reached w/o another SDK registering, then it assumes it's the "real thing" and ... then what? I suppose the SDK would then disable itself.

If the application intends for a real SDK to register, they can easily do that via an import statement, then I'd ask what happens if there are multiple registrations? If we've already replayed buffered events to a new SDK and another one comes along, then what?

I believe these questions will stand with or without a plugin approach.

To summarize, a potential proposal:

(1) the API auto-initalizes a buffering SDK while initializing the globals package, ensuring every event is recorded from the very beginning of the process lifetime
(2) if the buffering SDK fills its buffer, it replaces itself with a Noop SDK
(3) the application can call Init() once to install a new SDK
(4) if a new SDK is installed in place of the buffering SDK, the events are replayed
(5) if the new SDK is installed but it is not the first registration, the new SDK is ignored

That's pretty complicated, but it supports explicit application initialization. Add in the plugin behavior, or force-link a "real" SDK as a dependency, and this approach can also support zero-touch as well.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 23, 2019

@tigrannajaryan I responded to some of the concerns you expressed in the PR here. I'm afraid the proposal I'm left with is too complicated, and that's without trying to figure out how to initialize global resources, which I've also been thinking about.

@tigrannajaryan
Copy link
Member

@jmacd I agree, it looks too complex. I don't have a better proposal though (sorry).

@jmacd
Copy link
Contributor Author

jmacd commented Jul 23, 2019

Considering the thread above, I'm leaning away from auto-installing any SDKs. I think we should require the user to call an Init() method from main() or somewhere in the framework, if there's a framework in use.

This follows a similar suggestion for Ruby, open-telemetry/opentelemetry-ruby#19, where @mwear writes:

The time at which to install instrumentation in a Ruby application can be challenging to get right
To make this straightforward for the reference implementation, the SDK should have an API method that a user can call to install instrumentation at the right time for their application
Various SDK implementations can engage in whatever clever means they need to try to install instrumentation at the right time by calling this API method at the right time on behalf of the user

I think probably the same can be said in Go, and will probably suggest that we put something similar to this phrasing into the cross-language spec.

@freeformz
Copy link
Contributor

@jmacd AFAICT "zero touch instrumentation" is related to compiled code that I don't control though. So, afaict they are orthogonal concerns. I'm talking about code myself or engineering teams I work with will be compiling from scratch.

I think the library we're writing should be available for use by someone implementing "zero touch instrumentation" of Go applications (i.e. they should be able to use this library in their agent), but I don't really understand the need to build the library as zero touch.

@yurishkuro
Copy link
Member

@jmacd

Your point about global state is true, but if we strike the requirement to support setting the global value, then we have an immutable initialized-before-use object, not global state.

How does that help with unit tests? If I am writing instrumentation for some framework, I need to test it, and for that I need to be able to give it different types of telemetry implementations than the prod library. I nearly had a revolt in one group at Uber because of global tracer in OpenTracing for Java.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 31, 2019

@yurishkuro I would suggest that unit tests should not use the global SDK. If pushed, I'd propose a test-specific SDK that supports dynamically resetting the global instance, since the semantics of swapping SDKs isn't a problem in testing environments.

Further, the proposal I just filed (RFC 0010 linked above) specifies that Java should do the Java thing and use the Service Provider Interface.

@yurishkuro
Copy link
Member

@jmacd I don't see how any global instance can work in unit tests, whether it's resettable or not. Unit tests can run in parallel, global kills that possibility. What's worse, when global is a part of the public API, someone writing instrumentation for some framework X would think this is how they are supposed to access the tracer (why bother implementing proper dependency injection wiring when you can cheat and just call a global). Then writing proper unit tests for such instrumentation or the use of that framework elsewhere becomes impossible.

hstan referenced this issue in hstan/opentelemetry-go Oct 15, 2020
…rades (#52)

* Create pre_release.sh script to prepare tagged releases

In addition, with the '-o' option it can also be used to bump the
version of go.opentelemetry.io/otel used by the modules in this repo.

* Apply suggestions from code review

per @MrAlias

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>

* Apply PR suggestions

* fixup! Create pre_release.sh script to prepare tagged releases

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants