open-telemetry · svrnm · Dec 9, 2021 · Dec 13, 2021 · yurishkuro · Dec 13, 2021
@@ -0,0 +1,69 @@
+# Mandatory unique identifier for SDK-based telemetry sources
+
+Provide an explicit mandatory unique identifier for SDK-based telemetry sources.
+
+## Motivation
+
+Having a way to uniquely identify a telemetry source is helpful in many ways, such as processing and storing data from that source, visualizing it in a backend UI, or debugging issues with that source and its data.
+
+For SDK-based telemetry sources, `service.name` (and the related attributes `service.namespace` and `service.instance.id`) is currently the implicit standard for this, because `service.name` is enforced as mandatory by the [Resource SDK specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md#sdk-provided-resource-attributes) and the [Resource Semantic Conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/semantic_conventions/README.md#semantic-attributes-with-sdk-provided-default-value).
+
+However, because those attributes are not **explicitly** defined as the unique identifier of an SDK-based telemetry source, multiple issues call out problems with the current state:
+
+* [opentelemetry-specification/issues#1034](https://github.com/open-telemetry/opentelemetry-specification/issues/1034) calls out that `service.instance.id` is poorly defined right now and should be replaced by something more meaningful that can help to uniquely identify an SDK-based telemetry source.
+* [open-telemetry/opentelemetry-specification#2111](https://github.com/open-telemetry/opentelemetry-specification/pull/2111) calls out that there is no proper definition of what a `Service` is and that a proper definition is important since `service.name` is such an important attribute.
+* [open-telemetry/opentelemetry-specification#2115](https://github.com/open-telemetry/opentelemetry-specification/pull/2115) asks for introducing `app.name` and others alongside `service.name` since client-side applications (browser, mobile) are **not** services and end-users might be confused by calling them a _Service_.
+* [open-telemetry/opentelemetry-specification#2192](https://github.com/open-telemetry/opentelemetry-specification/pull/2192) provides a middle ground between `app.name` and `service.name` by suggesting `telemetry.source.name` as a broader term.
+
+To address all requirements outlined in those approaches, we propose the following combined approach for uniquely identifying an SDK-based telemetry source:
+
+* Introduce a `telemetry.sdk.source.id` attribute, which MUST either be autogenerated by the SDK at application start or be supplied to the SDK via an environment variable. This will be the unique identifier for an SDK-based telemetry source.
+* Remove `service.instance.id` as an attribute, since it is superseded by `telemetry.sdk.source.id`.
+* Replace `service.name` and `service.namespace` with the attributes `telemetry.sdk.source.name` and `telemetry.sdk.source.namespace`, to have a broader term for identification.
+* Make `telemetry.sdk.source.name` the attribute which MUST be provided by the SDK.
+* Provide backward compatibility with `service.name` by adopting [open-telemetry/oteps#161](https://github.com/open-telemetry/oteps/pull/161).
+* Backend-specific exporters that rely on `service.name` should set a default value themselves if the attribute is missing.
+* Add non-overlapping definitions of the terms `Service` and `App` to the specification glossary.
+* Introduce further attributes to describe the telemetry source where needed, e.g. `telemetry.sdk.source.version`, `app.bundle`, `app.short_version`, ...
+
+## Explanation
+
+With those changes in place, the following use cases will be covered:
+
+* If one of many instances of an SDK-based telemetry source is in an erroneous state, the user can quickly identify that instance using the `telemetry.sdk.source.id` and fix the issue. This will improve observability of OTel SDKs themselves.
+* By replacing `service.name` with `telemetry.sdk.source.name`, frontend applications and other sources that are not seen as _Services_ by their application owners can be named in a way their users expect. They can also use different scopes like `app` for additional attributes which might not be reasonable for a backend `service`.
+* Collectors & backends can use `telemetry.sdk.source.id` (or the combination of `id`, `name` and `namespace`) as a unique identifier for storing data, processing data & displaying data.
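
The last use case can be pictured with a minimal sketch, assuming resource attributes arrive as plain dictionaries. The helper names `source_key` and `group_by_source` are hypothetical and not part of any SDK or collector API. The sketch only shows how the combination of `namespace`, `name` and `id` could serve as a grouping key.

```python
from collections import defaultdict


def source_key(attrs):
    """Build an identity key from the proposed telemetry.sdk.source.* attributes."""
    return (
        attrs.get("telemetry.sdk.source.namespace", ""),
        attrs.get("telemetry.sdk.source.name", ""),
        attrs.get("telemetry.sdk.source.id", ""),
    )


def group_by_source(resources):
    """Group resource attribute dictionaries by their source identity key."""
    groups = defaultdict(list)
    for attrs in resources:
        groups[source_key(attrs)].append(attrs)
    return dict(groups)
```

Two instances of the same named source then land in separate groups because their `telemetry.sdk.source.id` values differ.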
+
+## Internal details
+
+Replacing `service.(instance.id|name|namespace)` with `telemetry.sdk.source.(id|name|namespace)` will require a mechanism to provide backward compatibility. For this we suggest adopting [open-telemetry/oteps#161](https://github.com/open-telemetry/oteps/pull/161).
+
+Language-specific implementations of the SDK used for instrumenting backend services will need to update their code to expect `telemetry.sdk.source.*` where `service.*` was used so far. This requires significant effort, although we believe that making this change early is better than continuing with a less invasive change that has different drawbacks (see alternatives below).
+
+Language-specific implementations of the SDK for other kinds of telemetry sources, like client-side applications, gain the flexibility to use a different scope like `app` for additional attributes of their telemetry source.
+
+Implementations of the SDK need to add a mechanism to either load the `telemetry.sdk.source.id` from an environment variable or autogenerate a value at application start. For the autogenerated ID, the existing recommendation for `service.instance.id` (use a random Version 1 or Version 4 RFC 4122 UUID) can be reused.
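
The lookup described above could look roughly like the following sketch. The environment variable name `OTEL_TELEMETRY_SDK_SOURCE_ID` is a placeholder chosen for illustration, since this proposal does not name the variable, and `resolve_source_id` is a hypothetical helper rather than an existing SDK function.

```python
import os
import uuid

# Hypothetical variable name; the proposal does not specify one.
SOURCE_ID_ENV_VAR = "OTEL_TELEMETRY_SDK_SOURCE_ID"


def resolve_source_id(environ=None):
    """Return the configured source ID, or a freshly generated UUID string."""
    env = os.environ if environ is None else environ
    configured = env.get(SOURCE_ID_ENV_VAR, "").strip()
    if configured:
        # An explicitly supplied value takes precedence over autogeneration.
        return configured
    # Otherwise fall back to a random Version 4 RFC 4122 UUID.
    return str(uuid.uuid4())
```

An SDK would presumably call such a function once at application start and reuse the result for the lifetime of the process, so that all telemetry from one instance carries the same ID.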
+
+Different modules in the collector and implementations of the backend will need to adapt to this change. Backend-specific exporters that still require `service.name` would set a default value for it themselves, to satisfy their particular backends.
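
A backend-specific exporter might apply such a default along these lines. This is a sketch under assumptions: `with_service_name` is a hypothetical helper, reusing `telemetry.sdk.source.name` as the first fallback is one possible choice that this proposal does not mandate, and the `unknown_service` placeholder mirrors the default the current Resource SDK specification uses for `service.name`.

```python
def with_service_name(attributes, default="unknown_service"):
    """Return a copy of the resource attributes that always has service.name."""
    result = dict(attributes)  # leave the caller's dictionary untouched
    if not result.get("service.name"):
        # Assumption: prefer the proposed source name, then a fixed placeholder.
        result["service.name"] = result.get("telemetry.sdk.source.name") or default
    return result
```

Existing `service.name` values pass through unchanged, so sources that still emit the old attribute are unaffected.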
+
+## Alternatives
+
+We think that the proposed approach is the best among many. The following list provides existing alternatives and reasons why they have been rejected:
+
+1. Provide a broad definition for the term `Service`, which then would also cover client-side applications. With that, `service.(instance.id|name|namespace)` could be used as the unique identifier. It is possible to extend the definition of `Service` to cover that ([open-telemetry/opentelemetry-specification#2111](https://github.com/open-telemetry/opentelemetry-specification/pull/2111)), but frontend application developers & owners do not think about their applications as services and might be confused by this broad definition. Additionally, it is not foreseeable whether other future SDK-based telemetry sources might need a different name which could not be covered by this definition.
+
+2. Introduce `app.(instance.id|name|namespace)` alongside `service.(instance.id|name|namespace)` and require that either `app.name` or `service.name` MUST be provided by the SDK. While this approach addresses the issues of (1), it comes with the disadvantage that a processor like the collector or backend needs to check multiple attributes to identify the type of the telemetry source. This creates unnecessary overhead. Also, this _may_ lead to attribute explosion if further SDK-based telemetry sources are introduced and each wants its own attributes for an id, name, namespace, version or similar.
+
+## Open questions
+
+* Is the namespace `telemetry.sdk.source.*` suitable? Alternative names could be used:
+  * `telemetry.source.*` as suggested by [open-telemetry/opentelemetry-specification#2192](https://github.com/open-telemetry/opentelemetry-specification/pull/2192). The difference is that it does not state explicitly that only SDK-based telemetry sources are covered. This is not necessarily bad, since other telemetry sources _could_ decide to use it as well.
+  * `telemetry.instance.*`
+  * `source.*`
+  * `telemetry.sdk.*` cannot be used since `telemetry.sdk.name` is already used.
+* Should duplication of attributes be allowed, e.g. so that `telemetry.sdk.source.name`, `service.name` and `app.name` can all be set, or should an attribute that exists in `telemetry.sdk.source.*` not be allowed in `service.*` and `app.*`?
+* How should additional attributes like `version`, `bundle`, `firmware_version`, `short_name`, `short_version` be treated? Does it make sense to provide a rule that attributes common to all sources (like `version`) should also be part of `telemetry.sdk.source.*` and only specific attributes like `bundle` or `firmware_version` should live in a different namespace?
+
+## Future possibilities
+
+While the discussion right now is between backend and frontend services, in the future additional SDK-based telemetry sources like different kinds of devices could be introduced without the need to re-use `service.name` as a mandatory attribute and with the possibility to simply introduce their own scope of additional specific attributes.