From 08688facdde5f1879bdc5ae7d4c9decfb023a670 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Wed, 29 Sep 2021 14:42:52 +0200
Subject: [PATCH 01/26] add OpenTracing spec from apm#32

---
 specs/agents/README.md           |  1 +
 specs/agents/tracing-api-ot.md   | 36 ++++++++++++++++++++++++++++++++
 specs/agents/tracing-api-otel.md |  7 +++++++
 3 files changed, 44 insertions(+)
 create mode 100644 specs/agents/tracing-api-ot.md
 create mode 100644 specs/agents/tracing-api-otel.md

diff --git a/specs/agents/README.md b/specs/agents/README.md
index 9c406c21..60db1ac7 100644
--- a/specs/agents/README.md
+++ b/specs/agents/README.md
@@ -55,6 +55,7 @@ You can find details about each of these in the [APM Data Model](https://www.ela
   - [Messaging systems](tracing-instrumentation-messaging.md)
   - [gRPC](tracing-instrumentation-grpc.md)
   - [GraphQL](tracing-instrumentation-graphql.md)
+  - [OpenTelemetry API Bridge](tracing-otel-api-bridge.md)
 - [Error/exception tracking](error-tracking.md)
 - [Metrics](metrics.md)
 - [Logging Correlation](log-correlation.md)
diff --git a/specs/agents/tracing-api-ot.md b/specs/agents/tracing-api-ot.md
new file mode 100644
index 00000000..3ab7b980
--- /dev/null
+++ b/specs/agents/tracing-api-ot.md
@@ -0,0 +1,36 @@
+## OpenTracing API
+
+[OpenTracing](https://opentracing.io) provides a vendor-neutral API for tracing. It is now deprecated in favor
+of [OpenTelemetry](https://opentelemetry.io).
+
+Support for OpenTelemetry is defined in [OpenTelemetry API bridge](tracing-api-otel.md).
+
+Agents may provide a bridge implementation of OpenTracing API following this specification.
+
+### Tags
+
+- If the bridge sees one of our predefined special purpose tags, it should use the value of the tag to set the
+  associated value, but the tag it self should not be stored. Example: The tag `user.id` should not be stored as a tag,
+  but instead be used to populate `context.user.id` on the active transaction
+- If no `type` tag is provided, the current span/transaction should use whatever their default type normally is
+
+### Logs
+
+- If a "log" is set on a span with an `event` field containing the value `error`, the bridge should do one of the
+  following:
+  - If the log contains an `error.object` field, expect that to be a normal error object and log that however the
+    agent normally logs errors
+  - Alternatively, if the log contains a `message` field, log that however the agent normally logs plain text messages
+
+### Formats
+
+- Tracers should support the text format. The value should be the same format as the http header value
+- Tracers should _not_ support the binary format. Bridges should implement it as a no-op and optionally log a warning
+  the first time the user tries to use the binary format
+
+### Parent/Child relationships
+
+- Tracers should only support a single "child-of" relationship
+  - If a span is given a list of more than one parent relationship, use the first that is of type "child-of"
+  - If the provided list of parent relationships doesn't contain a "child-of", the span should be a root-span
+  - Optionally log a warning the first time an unsupported parent type is seen or if more than one parent is provided
diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
new file mode 100644
index 00000000..b6cdc8af
--- /dev/null
+++ b/specs/agents/tracing-api-otel.md
@@ -0,0 +1,7 @@
+## OpenTelemetry API Bridge
+
+### Support
+
+Agents may provide a bridge implementation of the OpenTelemetry API.
+
+- Rely on vendor-neutral OTel API in place of Elastic API

From 4932d5a41a48031a9fd693f957972a3834cc07e2 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Wed, 29 Sep 2021 14:44:09 +0200
Subject: [PATCH 02/26] add otel bridge spec

---
 specs/agents/tracing-api-otel.md | 50 +++++++++++++++++++++++++++++---
 specs/agents/tracing-api.md      | 10 +++++--
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index b6cdc8af..6d395342 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -1,7 +1,49 @@
-## OpenTelemetry API Bridge
+## OpenTelemetry API (Tracing)
 
-### Support
+[OpenTelemetry](https://opentelemetry.io) (OTel in short) provides a vendor-neutral API that allows to capture tracing, logs and metrics data.
 
-Agents may provide a bridge implementation of the OpenTelemetry API.
+Agents may provide a bridge implementation of OpenTracing Tracing API following this specification.
 
-- Rely on vendor-neutral OTel API in place of Elastic API
+Bridging here means that for each OTel span created with the API, a native span/transaction will be created and sent to APM server.
+
+From the perspective of the application code calling the OTel API, the delegation to a native span/transaction is transparent.
+Also, this means that any OTel processors will be bypassed and ignored by the bridge.
+
+### Attributes mapping
+
+OTel relies on key-value pairs for span attributes. Keys and values are protocol-specific and are defined in [semantic convention](https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions) specification.
+
+In order to minimize the mapping complexity in agents, most of the mapping between OTel attributes and agent protocol will be delegated to APM server:
+- All OTel span attributes should be captured as-is and written to agent protocol.
+- APM server will handle the mapping between OTel attributes and their native transaction/spans equivalents
+- Some native span/transaction attributes will still require mapping within agents for [compatibility with existing features](#compatibility-mapping)
+
+Proposal (WIP)
+`otel.attributes` : flat key-value pair mapping added to `span` and `transaction` objects.
+```json
+{
+  // [...] other span/transaction attributes
+  "otel": {
+    "attributes": {
+      "db.system": "mysql",
+      "db.statement": "SELECT * from table_1"
+    }
+  }
+}
+```
+
+### Compatibility mapping
+
+Agents should ensure compatibility with the following features:
+- breakdown metrics
+- [dropped spans statistics](handling-huge-traces/tracing-spans-dropped-stats.md)
+- [compressed spans](handling-huge-traces/tracing-spans-compress.md)
+
+As a consequence, agents have to infer and provide values for the following attributes:
+- `transaction.name`
+- `transaction.type`
+- `span.name`
+- `span.type`
+- `span.subtype`
+- `span.name`
+- `span.destination.service.resource`
diff --git a/specs/agents/tracing-api.md b/specs/agents/tracing-api.md
index df961c8d..5fbb9fce 100644
--- a/specs/agents/tracing-api.md
+++ b/specs/agents/tracing-api.md
@@ -1,6 +1,9 @@
 ## Tracer APIs
 
-All agents must provide an API to enable developers to instrument their applications manually, in addition to any automatic instrumentation. Agents document their APIs in the elastic.co docs:
+All agents must provide a native API to enable developers to instrument their applications manually, in addition to any
+automatic instrumentation.
+
+Agents document their APIs in the elastic.co docs:
 
 - [Node.js Agent](https://www.elastic.co/guide/en/apm/agent/nodejs/current/api.html)
 - [Go Agent](https://www.elastic.co/guide/en/apm/agent/go/current/api.html)
@@ -10,4 +13,7 @@ All agents must provide an API to enable developers to instrument their applicat
 - [Ruby Agent](https://www.elastic.co/guide/en/apm/agent/ruby/current/api.html)
 - [RUM JS Agent](https://www.elastic.co/guide/en/apm/agent/js-base/current/api.html)
 
-In addition to each agent having a "native" API for instrumentation, they also implement the [OpenTracing APIs](https://opentracing.io). Agents should align implementations according to https://github.com/elastic/apm/issues/32.
+In addition, each agent may "bridge" implementations of vendor-neutral APIs:
+
+- [OpenTracing API](tracing-api-ot.md).
+- [OpenTelemetry API](tracing-api-otel.md).

From 5831f7b8071218974ad7abba3608b750ee59eec5 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Wed, 29 Sep 2021 15:08:24 +0200
Subject: [PATCH 03/26] cleanup + add label fallback

---
 specs/agents/tracing-api-otel.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index 6d395342..fb6662a8 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -18,8 +18,7 @@ In order to minimize the mapping complexity in agents, most of the mapping betwe
 - APM server will handle the mapping between OTel attributes and their native transaction/spans equivalents
 - Some native span/transaction attributes will still require mapping within agents for [compatibility with existing features](#compatibility-mapping)
 
-Proposal (WIP)
-`otel.attributes` : flat key-value pair mapping added to `span` and `transaction` objects.
+OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-value pair mapping added to `span` and `transaction` objects:
 ```json
 {
   // [...] other span/transaction attributes
@@ -32,6 +31,9 @@ Proposal (WIP)
 }
 ```
 
+When the server does not support `otel.attributes` property, agents should use `labels` as fallback with OTel attribute
+name as key.
+
 ### Compatibility mapping
 
 Agents should ensure compatibility with the following features:

From b94c05b2b8098260fe408658f0e417c0c70af1d9 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Wed, 29 Sep 2021 15:16:34 +0200
Subject: [PATCH 04/26] add some clarification on server-side mapping

---
 specs/agents/tracing-api-otel.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index fb6662a8..a14ea485 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -31,9 +31,12 @@ OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-val
 }
 ```
 
-When the server does not support `otel.attributes` property, agents should use `labels` as fallback with OTel attribute
+When the APM server version does not support `otel.attributes` property, agents should use `labels` as fallback with OTel attribute
 name as key.
 
+When the APM server supports `otel.attributes` property, the server-side mapping should be identical to the one
+that is applied to handle native OpenTelemetry Protocol (OTLP) intake.
+
 ### Compatibility mapping
 
 Agents should ensure compatibility with the following features:

From 071253ed6b5ccfa8322d3b4920d607e21b2cc7e7 Mon Sep 17 00:00:00 2001
From: SylvainJuge
Date: Wed, 29 Sep 2021 15:42:09 +0200
Subject: [PATCH 05/26] fix wording specs/agents/tracing-api.md

Co-authored-by: Benjamin Wohlwend
---
 specs/agents/tracing-api.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/specs/agents/tracing-api.md b/specs/agents/tracing-api.md
index 5fbb9fce..2f1cadab 100644
--- a/specs/agents/tracing-api.md
+++ b/specs/agents/tracing-api.md
@@ -13,7 +13,7 @@ Agents document their APIs in the elastic.co docs:
 - [Ruby Agent](https://www.elastic.co/guide/en/apm/agent/ruby/current/api.html)
 - [RUM JS Agent](https://www.elastic.co/guide/en/apm/agent/js-base/current/api.html)
 
-In addition, each agent may "bridge" implementations of vendor-neutral APIs:
+In addition, each agent may provide "bridge" implementations of vendor-neutral APIs:
 
 - [OpenTracing API](tracing-api-ot.md).
 - [OpenTelemetry API](tracing-api-otel.md).

From 27d79c153d4906eb092301e08fab4f520446990e Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Wed, 29 Sep 2021 18:16:08 +0200
Subject: [PATCH 06/26] extend spec with fallbacks + context activations

---
 specs/agents/tracing-api-otel.md | 72 +++++++++++++++++++++++++++++++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index a14ea485..1a949c53 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -9,9 +9,63 @@ Bridging here means that for each OTel span created with the API, a native span/
 From the perspective of the application code calling the OTel API, the delegation to a native span/transaction is transparent.
 Also, this means that any OTel processors will be bypassed and ignored by the bridge.
 
+### Spans and Transactions
+
+OTel only defines Spans, whereas Elastic APM relies on both Spans and Transactions.
+OTel allows users to provide the _remote context_ when creating a span, which is equivalent to providing a parent to a transaction or span,
+it also allows to provide a (local) parent span.
+
+As a result, when creating Spans through OTel API with a bridge, agents must implement the following algorithm:
+
+```javascript
+// otel_span contains the properties set through the OTel API
+span_or_transaction = null;
+if (otel_span.remote_context != null) {
+    span_or_transaction = createTransactionWithParent(otel_span.remote_context);
+} else if (otel_span.parent == null) {
+    span_or_transaction = createRootTransaction();
+} else {
+    span_or_transaction = createSpanWithParent(otel_span.parent);
+}
+```
+
+### Span Kind
+
+OTel spans have an `SpanKind` property ([specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind)) which is close but not strictly equivalent to our definition of spans and transactions.
+
+For both transactions and spans, an optional `otel.span_kind` property will be provided by agents when set through
+the OTel API.
+This value should be stored into Elasticsearch documents to preserve OTel semantics and help future OTel integration.
+
+Possible values are `CLIENT`, `SERVER`, `PRODUCER`, `CONSUMER` and `INTERNAL`, refer to [specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind) for details on semantics.
+
+When `otel.span_kind` is not provided by agent, server should infer it using the following algorithm:
+
+```javascript
+span_kind = null;
+if (isTransaction(item)) {
+    if (item.type == "messaging") {
+        span_kind = "CONSUMER";
+    } else if (item.type == "request") {
+        span_kind = "SERVER";
+    }
+} else {
+    // span
+    if (item.type == "external" || item.type == "storage") {
+        span_kind = "CLIENT";
+    }
+}
+
+if (span_kind == null) {
+    span_kind = "INTERNAL";
+}
+
+```
+
 ### Attributes mapping
 
-OTel relies on key-value pairs for span attributes. Keys and values are protocol-specific and are defined in [semantic convention](https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions) specification.
+OTel relies on key-value pairs for span attributes.
+Keys and values are protocol-specific and are defined in [semantic convention](https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions) specification.
 
 In order to minimize the mapping complexity in agents, most of the mapping between OTel attributes and agent protocol will be delegated to APM server:
 - All OTel span attributes should be captured as-is and written to agent protocol.
@@ -23,6 +77,7 @@ OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-val
 {
   // [...] other span/transaction attributes
   "otel": {
+    "span_kind": "client",
     "attributes": {
       "db.system": "mysql",
       "db.statement": "SELECT * from table_1"
@@ -52,3 +107,18 @@ As a consequence, agents have to infer and provide values for the following attr
 - `span.subtype`
 - `span.name`
 - `span.destination.service.resource`
+
+### Active Spans and Context
+
+OTel has the concept of "active context", which is implemented as a key-value map and is used for local context
+propagation implicitly through thread-locals or explicitly through API.
+
+Our agents may not have a similar abstraction and only have the currently active span or transaction stored into a thread-local stack.
+Making OTel span active means adding a reference to it in the current context, deactivating is restoring the context
+before activation.
+
+As a result, a proper bridge implementation should ensure transparent interoperability between Elastic and OTel spans from their respective APIs
+- When an Elastic span is active, the OTel current context API should have the Elastic span as current
+- When an OTel context is activated, the OTel current context API should provide access to it (likely stored as a thread-local).
+- Activating an OTel span on top of an Elastic span should behave exactly as if the underlying span was created using OTel API.
+- Activating an Elastic span on top of an OTel span should behave like if the underlying span was created from Elastic API.

From e72ad6dd0c6144b967037458fb0f7e7f01a9aadb Mon Sep 17 00:00:00 2001
From: SylvainJuge
Date: Thu, 30 Sep 2021 11:16:38 +0200
Subject: [PATCH 07/26] Apply suggestions from code review

Co-authored-by: Felix Barnsteiner
---
 specs/agents/tracing-api-otel.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index 1a949c53..4c9e7906 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -39,7 +39,7 @@ This value should be stored into Elasticsearch documents to preserve OTel semant
 
 Possible values are `CLIENT`, `SERVER`, `PRODUCER`, `CONSUMER` and `INTERNAL`, refer to [specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind) for details on semantics.
 
-When `otel.span_kind` is not provided by agent, server should infer it using the following algorithm:
+When `otel.span_kind` is not provided by the agent, APM Server should infer it using the following algorithm:
 
 ```javascript
 span_kind = null;
@@ -77,7 +77,7 @@ OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-val
 {
   // [...] other span/transaction attributes
   "otel": {
-    "span_kind": "client",
+    "span_kind": "CLIENT",
     "attributes": {
       "db.system": "mysql",
       "db.statement": "SELECT * from table_1"

From 1a59b14e1f81407c9abb921aed332c9ae43dfecb Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Thu, 30 Sep 2021 11:18:41 +0200
Subject: [PATCH 08/26] remove opentracing from spec

---
 specs/agents/tracing-api-ot.md | 36 ----------------------------------
 specs/agents/tracing-api.md    |  5 +----
 2 files changed, 1 insertion(+), 40 deletions(-)
 delete mode 100644 specs/agents/tracing-api-ot.md

diff --git a/specs/agents/tracing-api-ot.md b/specs/agents/tracing-api-ot.md
deleted file mode 100644
index 3ab7b980..00000000
--- a/specs/agents/tracing-api-ot.md
+++ /dev/null
@@ -1,36 +0,0 @@
-## OpenTracing API
-
-[OpenTracing](https://opentracing.io) provides a vendor-neutral API for tracing. It is now deprecated in favor
-of [OpenTelemetry](https://opentelemetry.io).
-
-Support for OpenTelemetry is defined in [OpenTelemetry API bridge](tracing-api-otel.md).
-
-Agents may provide a bridge implementation of OpenTracing API following this specification.
-
-### Tags
-
-- If the bridge sees one of our predefined special purpose tags, it should use the value of the tag to set the
-  associated value, but the tag it self should not be stored. Example: The tag `user.id` should not be stored as a tag,
-  but instead be used to populate `context.user.id` on the active transaction
-- If no `type` tag is provided, the current span/transaction should use whatever their default type normally is
-
-### Logs
-
-- If a "log" is set on a span with an `event` field containing the value `error`, the bridge should do one of the
-  following:
-  - If the log contains an `error.object` field, expect that to be a normal error object and log that however the
-    agent normally logs errors
-  - Alternatively, if the log contains a `message` field, log that however the agent normally logs plain text messages
-
-### Formats
-
-- Tracers should support the text format. The value should be the same format as the http header value
-- Tracers should _not_ support the binary format. Bridges should implement it as a no-op and optionally log a warning
-  the first time the user tries to use the binary format
-
-### Parent/Child relationships
-
-- Tracers should only support a single "child-of" relationship
-  - If a span is given a list of more than one parent relationship, use the first that is of type "child-of"
-  - If the provided list of parent relationships doesn't contain a "child-of", the span should be a root-span
-  - Optionally log a warning the first time an unsupported parent type is seen or if more than one parent is provided
diff --git a/specs/agents/tracing-api.md b/specs/agents/tracing-api.md
index 2f1cadab..961236b3 100644
--- a/specs/agents/tracing-api.md
+++ b/specs/agents/tracing-api.md
@@ -13,7 +13,4 @@ Agents document their APIs in the elastic.co docs:
 - [Ruby Agent](https://www.elastic.co/guide/en/apm/agent/ruby/current/api.html)
 - [RUM JS Agent](https://www.elastic.co/guide/en/apm/agent/js-base/current/api.html)
 
-In addition, each agent may provide "bridge" implementations of vendor-neutral APIs:
-
-- [OpenTracing API](tracing-api-ot.md).
-- [OpenTelemetry API](tracing-api-otel.md).
+In addition, each agent may provide "bridge" implementations of vendor-neutral [OpenTelemetry API](tracing-api-otel.md).
\ No newline at end of file

From e8e008b44816912e5b34e48b01cb9d449e704eb1 Mon Sep 17 00:00:00 2001
From: SylvainJuge
Date: Fri, 1 Oct 2021 09:17:55 +0200
Subject: [PATCH 09/26] Apply suggestions from code review

Co-authored-by: Felix Barnsteiner
---
 specs/agents/tracing-api-otel.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index 4c9e7906..b3685165 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -118,7 +118,11 @@ Making OTel span active means adding a reference to it in the current context, d
 before activation.
 
 As a result, a proper bridge implementation should ensure transparent interoperability between Elastic and OTel spans from their respective APIs
-- When an Elastic span is active, the OTel current context API should have the Elastic span as current
-- When an OTel context is activated, the OTel current context API should provide access to it (likely stored as a thread-local).
-- Activating an OTel span on top of an Elastic span should behave exactly as if the underlying span was created using OTel API.
-- Activating an Elastic span on top of an OTel span should behave like if the underlying span was created from Elastic API.
+- After activating an Elastic span via the agent's API, the [`Context`] returned via the [get current context API] should contain that Elastic span
+- When an OTel context is [attached] (aka activated), the [get current context API] should return the same [`Context`] instance.
+- Starting an OTel span in the scope of an active Elastic span should make the OTel span a child of the Elastic span.
+- Starting an Elastic span in the scope of an active OTel span should make the Elastic span a child of the OTel span.
+
+[`Context`]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md
+[attached]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md#attach-context
+[get current context API]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md#get-current-context

From bb3b8aea1aa913a91f17f32f47bab79eba064eae Mon Sep 17 00:00:00 2001
From: SylvainJuge
Date: Mon, 4 Oct 2021 09:22:43 +0200
Subject: [PATCH 10/26] Fix typo specs/agents/tracing-api-otel.md

Co-authored-by: Trent Mick
---
 specs/agents/tracing-api-otel.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index b3685165..c2ee187f 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -2,7 +2,7 @@
 
 [OpenTelemetry](https://opentelemetry.io) (OTel in short) provides a vendor-neutral API that allows to capture tracing, logs and metrics data.
 
-Agents may provide a bridge implementation of OpenTracing Tracing API following this specification.
+Agents may provide a bridge implementation of OpenTelemetry Tracing API following this specification.
 
 Bridging here means that for each OTel span created with the API, a native span/transaction will be created and sent to APM server.
 
From fcabe0bf4d044310288b35baec78d2c9273b701a Mon Sep 17 00:00:00 2001
From: stuart nelson
Date: Mon, 11 Oct 2021 11:14:43 +0200
Subject: [PATCH 11/26] Update tracing-api-otel.md

add supported apm-server version for translation
---
 specs/agents/tracing-api-otel.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index c2ee187f..fb2b8661 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -86,6 +86,8 @@ OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-val
 }
 ```
 
+APM server supports the `otel.attributes` property starting with version 7.16.0.
+
 When the APM server version does not support `otel.attributes` property, agents should use `labels` as fallback with OTel attribute
 name as key.
 

From d6a9e984ede2d7f03c64adbe7c48caa205ed6be2 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Mon, 11 Oct 2021 12:03:14 +0200
Subject: [PATCH 12/26] add span type 'db' to spec

---
 specs/agents/tracing-api-otel.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index fb2b8661..fe3a5033 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -51,7 +51,7 @@ if (isTransaction(item)) {
     }
 } else {
     // span
-    if (item.type == "external" || item.type == "storage") {
+    if (item.type == "external" || item.type == "storage" || item.type == "db") {
         span_kind = "CLIENT";
     }
 }

From 7318a1bd9fdfebacb314e421c2808a0c5c61eb31 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Tue, 2 Nov 2021 13:22:24 +0100
Subject: [PATCH 13/26] add type,subtype & resource algorithm

---
 specs/agents/tracing-api-otel.md | 112 ++++++++++++++++++++++++++++---
 1 file changed, 104 insertions(+), 8 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index fe3a5033..f073f00f 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -101,14 +101,110 @@ Agents should ensure compatibility with the following features:
 - [dropped spans statistics](handling-huge-traces/tracing-spans-dropped-stats.md)
 - [compressed spans](handling-huge-traces/tracing-spans-compress.md)
 
-As a consequence, agents have to infer and provide values for the following attributes:
-- `transaction.name`
-- `transaction.type`
-- `span.name`
-- `span.type`
-- `span.subtype`
-- `span.name`
-- `span.destination.service.resource`
+As a consequence, agents must provide values for the following attributes:
+- `transaction.name` or `span.name` : value directly provided by OTel API
+- `transaction.type` : see inference algorithm below
+- `span.type` and `span.subtype` : see inference algorithm below
+- `span.destination.service.resource` : see inference algorithm below
+
+Transaction type:
+```javascript
+a = transaction.otel.attributes;
+span_kind = transaction.otel_span_kind;
+isRpc = a['rpc.system'] !== undefined;
+isHttp = a['http.url'] !== undefined || a['http.scheme'] !== undefined;
+if (span_kind == 'SERVER' && (isRpc || isHttp)) {
+    type = 'request';
+} else if (span_kind == 'CONSUMER' && isMessaging) {
+    type = 'messaging';
+} else {
+    type = 'unknown';
+}
+```
+
+Span type, sub-type and destination service resource
+
+```javascript
+a = span.otel.attributes;
+type = undefined;
+subtype = undefined;
+resource = undefined;
+
+// extracts 'host:port' from URL
+parseNetName = function (url) {
+}
+
+httpPortFromScheme = function (scheme, defaultValue) {
+    if ('http' == scheme) {
+        return 80;
+    } else if ('https' == scheme) {
+        return 443;
+    }
+    return defaultValue;
+}
+
+peerPort = a['net.peer.port'];
+netName = a['net.peer.name'] || a['net.peer.ip'];
+
+if (netName && peerPort > 0) {
+    netName += ':';
+    netName += peerPort;
+}
+
+if (a['db.system']) {
+    type = 'db'
+    subtype = a['db.system'];
+    resource = netName || subtype;
+    if (a['db.name']) {
+        resource += '/'
+        resource += a['db.name'];
+    }
+
+} else if (a['messaging.system']) {
+    type = 'messaging';
+    subtype = a['messaging.system'];
+
+    if (!netName && a['messaging.url']) {
+        netName = parseNetName(a['messaging.url']);
+    }
+    resource = netName || subtype;
+    if (a['messaging.destination']) {
+        resource += '/';
+        resource += a['messaging.destination'];
+    }
+
+} else if (a['rpc.system']) {
+    type = 'external';
+    subtype = a['rpc.system'];
+    resource = netName || subtype;
+    if (a['rpc.service']) {
+        resource += '/';
+        resource += a['rpc.service'];
+    }
+
+} else if (a['http.url'] || a['http.scheme']) {
+    type = 'external';
+    subtype = 'http';
+
+    if (a['http.host'] && a['http.scheme']) {
+        resource = a['http.host'] + ':' + httpPortFromScheme(a['http.scheme']);
+    } else if (a['http.url']) {
+        resource = parseNetName(a['http.url']);
+    }
+}
+
+if (type === undefined) {
+    if (span.otel.span_kind == 'INTERNAL') {
+        type = 'app';
+        subtype = 'internal';
+    } else {
+        type = 'unknown';
+    }
+}
+span.type = type;
+span.subtype = subtype;
+span.destination.service.resource = resource;
+```
 
 ### Active Spans and Context
 

From ac510ac06d564406924e9e1159f90f30ebb8080e Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Tue, 2 Nov 2021 13:54:19 +0100
Subject: [PATCH 14/26] add sub-sections for algorithms

---
 specs/agents/tracing-api-otel.md | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index f073f00f..63e4ae9f 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -107,7 +107,8 @@ As a consequence, agents must provide values for the following attributes:
 - `span.type` and `span.subtype` : see inference algorithm below
 - `span.destination.service.resource` : see inference algorithm below
 
-Transaction type:
+#### Transaction type
+
 ```javascript
 a = transaction.otel.attributes;
 span_kind = transaction.otel_span_kind;
@@ -122,7 +123,7 @@ if (span_kind == 'SERVER' && (isRpc || isHttp)) {
 }
 ```
 
-Span type, sub-type and destination service resource
+#### Span type, sub-type and destination service resource
 
 ```javascript
 a = span.otel.attributes;
@@ -130,17 +131,24 @@ type = undefined;
 subtype = undefined;
 resource = undefined;
 
-// extracts 'host:port' from URL
-parseNetName = function (url) {
-}
-
-httpPortFromScheme = function (scheme, defaultValue) {
+httpPortFromScheme = function (scheme) {
     if ('http' == scheme) {
         return 80;
     } else if ('https' == scheme) {
         return 443;
     }
-    return defaultValue;
+    return -1;
+}
+
+// extracts 'host' or 'host:port' from URL
+parseNetName = function (url) {
+    var u = new URL(url); // https://developer.mozilla.org/en-US/docs/Web/API/URL
+    if (u.port != '') {
+        return u.host; // host:port already in URL
+    } else {
+        var port = httpPortFromScheme(u.protocol.substring(0, u.protocol.length - 1));
+        return port > 0 ? u.host + ':'+ port : u.host;
+    }
 }
 
 peerPort = a['net.peer.port'];
 netName = a['net.peer.name'] || a['net.peer.ip'];

From 97b6ac07bcb71b559daa3f0a5c255236d913a105 Mon Sep 17 00:00:00 2001
From: Sylvain Juge
Date: Tue, 2 Nov 2021 13:55:36 +0100
Subject: [PATCH 15/26] active context impl

---
 specs/agents/tracing-api-otel.md | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md
index 63e4ae9f..bf332169 100644
--- a/specs/agents/tracing-api-otel.md
+++ b/specs/agents/tracing-api-otel.md
@@ -216,14 +216,8 @@ span.destination.service.resource = resource;
 
 ### Active Spans and Context
 
-OTel has the concept of "active context", which is implemented as a key-value map and is used for local context
-propagation implicitly through thread-locals or explicitly through API.
-
-Our agents may not have a similar abstraction and only have the currently active span or transaction stored into a thread-local stack.
-Making OTel span active means adding a reference to it in the current context, deactivating is restoring the context
-before activation.
- -As a result, a proper bridge implementation should ensure transparent interoperability between Elastic and OTel spans from their respective APIs +When possible, bridge implementation SHOULD ensure proper interoperability between Elastic transactions/spans and OTel spans when +used from their respective APIs: - After activating an Elastic span via the agent's API, the [`Context`] returned via the [get current context API] should contain that Elastic span - When an OTel context is [attached] (aka activated), the [get current context API] should return the same [`Context`] instance. - Starting an OTel span in the scope of an active Elastic span should make the OTel span a child of the Elastic span. @@ -232,3 +226,10 @@ As a result, a proper bridge implementation should ensure transparent interopera [`Context`]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md [attached]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md#attach-context [get current context API]: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/context.md#get-current-context + +Both OTel and our agents have their own definition of what "active context" is, for example: +- Java Agent: Elastic active context is implemented as a thread-local stack +- Java OTel API: active context is implemented as a key-value map propagated through thread local + +In order to avoid potentially complex and tedious synchronization issues between OTel and our existing agent +implementations, the bridge implementation SHOULD provide an abstraction to have a single "active context" storage. 
\ No newline at end of file From 93a4aa502073df0daf9edc1f9e824d99dc193f38 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Tue, 2 Nov 2021 14:10:21 +0100 Subject: [PATCH 16/26] add gherkin spec --- .../agents/gherkin-specs/otel_bridge.feature | 205 ++++++++++++++++++ 1 file changed, 205 insertions(+) create mode 100644 tests/agents/gherkin-specs/otel_bridge.feature diff --git a/tests/agents/gherkin-specs/otel_bridge.feature b/tests/agents/gherkin-specs/otel_bridge.feature new file mode 100644 index 00000000..5823c04f --- /dev/null +++ b/tests/agents/gherkin-specs/otel_bridge.feature @@ -0,0 +1,205 @@ +@opentelemetry-bridge +Feature: OpenTelemetry bridge + + # --- Creating Elastic span or transaction from OTel span + + Scenario: Create transaction from OTel span with remote context + Given an agent + And OTel span is created with remote context as parent + Then Elastic bridged object is a transaction + Then Elastic bridged transaction has remote context as parent + + Scenario: Create root transaction from OTel span without parent + Given an agent + And OTel span is created without parent + Then Elastic bridged object is a transaction + Then Elastic bridged transaction is a root transaction + + Scenario: Create span from OTel span + Given an agent + And OTel span is created with local context as parent + Then Elastic bridged object is a span + Then Elastic bridged span has local context as parent + + # --- TODO : compatibility mapping for server < 7.16 + # --> extra complexity here as it's part of the + + # --- OTel span kind mapping for spans & transactions + + Scenario Outline: OTel span kind for spans & default span type & subtype + Given an agent + And an active transaction + And OTel span is created with kind "<kind>" + Then Elastic bridged object is a span + Then Elastic bridged span OTel kind is "<kind>" + Then Elastic bridged span type is "<default_type>" + Then Elastic bridged span subtype is "<default_subtype>" + Examples: + | kind | default_type | default_subtype | + | INTERNAL | app | internal | 
+ | SERVER | unknown | | + | CLIENT | unknown | | + | PRODUCER | unknown | | + | CONSUMER | unknown | | + + Scenario Outline: OTel span kind for transactions & default transaction type + Given an agent + And OTel span is created with kind "<kind>" + Then Elastic bridged object is a transaction + Then Elastic bridged transaction OTel kind is "<kind>" + Then Elastic bridged transaction type is 'unknown' + Examples: + | kind | + | INTERNAL | + | SERVER | + | CLIENT | + | PRODUCER | + | CONSUMER | + + # --- span type, subtype and action inference from OTel attributes + + # --- HTTP server + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md#http-server + Scenario Outline: HTTP server [ ] + Given an agent + And OTel span is created with kind 'SERVER' + And OTel span has following attributes + | http.url | <http.url> | + | http.scheme | <http.scheme> | + Then Elastic bridged object is a transaction + Then Elastic bridged transaction type is "request" + Examples: + | http.url | http.scheme | + | http://testing.invalid/ | | + | | http | + + # --- HTTP client + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md#http-client + Scenario Outline: HTTP client [ ] + Given an agent + And an active transaction + And OTel span is created with kind 'CLIENT' + And OTel span has following attributes + | http.url | <http.url> | + | http.scheme | <http.scheme> | + | http.host | <http.host> | + | net.peer.ip | <net.peer.ip> | + | net.peer.name | <net.peer.name> | + | net.peer.port | <net.peer.port> | + Then Elastic bridged span type is 'external' + Then Elastic bridged span subtype is 'http' + Then Elastic bridged span OTel attributes are copied as-is + Then Elastic bridged span destination resource is set to "<resource>" + Examples: + | http.url | http.scheme | http.host | net.peer.ip | net.peer.name | net.peer.port | resource | + | https://testing.invalid:8443/ | | | | | | testing.invalid:8443 | + | https://[::1]/ | | | | | | [::1]:443 | + | http://testing.invalid/ | | | | 
| | testing.invalid:80 | + | | http | testing.invalid | | | | testing.invalid:80 | + | | https | testing.invalid | 127.0.0.1 | | | testing.invalid:443 | + | | http | | 127.0.0.1 | | 81 | 127.0.0.1:81 | + | | https | | 127.0.0.1 | | 445 | 127.0.0.1:445 | + | | http | | 127.0.0.1 | host1 | 445 | host1:445 | + | | https | | 127.0.0.1 | host2 | 445 | host2:445 | + + # --- DB client + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md + Scenario Outline: DB client [ ] + Given an agent + And an active transaction + And OTel span is created with kind 'CLIENT' + And OTel span has following attributes + | db.system | <db.system> | + | db.name | <db.name> | + | net.peer.ip | <net.peer.ip> | + | net.peer.name | <net.peer.name> | + | net.peer.port | <net.peer.port> | + Then Elastic bridged span type is 'db' + Then Elastic bridged span subtype is "<db.system>" + Then Elastic bridged span OTel attributes are copied as-is + Then Elastic bridged span destination resource is set to "<resource>" + Examples: + | db.system | db.name | net.peer.ip | net.peer.name | net.peer.port | resource | + | mysql | | | | | mysql | + | oracle | | | oracledb | | oracledb | + | oracle | | 127.0.0.1 | | | 127.0.0.1 | + | mysql | | 127.0.0.1 | dbserver | 3307 | dbserver:3307 | + | mysql | myDb | | | | mysql/myDb | + | oracle | myDb | | oracledb | | oracledb/myDb | + | oracle | myDb | 127.0.0.1 | | | 127.0.0.1/myDb | + | mysql | myDb | 127.0.0.1 | dbserver | 3307 | dbserver:3307/myDb | + + # --- Messaging consumer (transaction consuming/receiving a message) + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md + Scenario: + Given an agent + And an active transaction + And OTel span is created with kind 'CONSUMER' + And OTel span has following attributes + | messaging.system | anything | + Then Elastic bridged transaction type is 'messaging' + + # --- Messaging producer (client span emitting a message) + # 
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md + Scenario Outline: Messaging producer [ ] + Given an agent + And an active transaction + And OTel span is created with kind 'PRODUCER' + And OTel span has following attributes + | messaging.system | <messaging.system> | + | messaging.destination | <messaging.destination> | + | messaging.url | <messaging.url> | + | net.peer.ip | <net.peer.ip> | + | net.peer.name | <net.peer.name> | + | net.peer.port | <net.peer.port> | + Then Elastic bridged span type is 'messaging' + Then Elastic bridged span subtype is "<messaging.system>" + Then Elastic bridged span OTel attributes are copied as-is + Then Elastic bridged span destination resource is set to "<resource>" + Examples: + | messaging.system | messaging.destination | messaging.url | net.peer.ip | net.peer.name | net.peer.port | resource | + | rabbitmq | | amqp://carrot:4444/q1 | | | | carrot:4444 | + | rabbitmq | | | 127.0.0.1 | carrot-server | 7777 | carrot-server:7777 | + | rabbitmq | | | | carrot-server | | carrot-server | + | rabbitmq | | | 127.0.0.1 | | | 127.0.0.1 | + | rabbitmq | myQueue | amqp://carrot:4444/q1 | | | | carrot:4444/myQueue | + | rabbitmq | myQueue | | 127.0.0.1 | carrot-server | 7777 | carrot-server:7777/myQueue | + | rabbitmq | myQueue | | | carrot-server | | carrot-server/myQueue | + | rabbitmq | myQueue | | 127.0.0.1 | | | 127.0.0.1/myQueue | + + # --- RPC client + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/rpc.md + Scenario Outline: RPC client [ ] + Given an agent + And an active transaction + And OTel span is created with kind 'CLIENT' + And OTel span has following attributes + | rpc.system | <rpc.system> | + | rpc.service | <rpc.service> | + | net.peer.ip | <net.peer.ip> | + | net.peer.name | <net.peer.name> | + | net.peer.port | <net.peer.port> | + Then Elastic bridged span type is 'external' + Then Elastic bridged span subtype is "<rpc.system>" + Then Elastic bridged span OTel attributes are copied as-is + Then Elastic bridged span destination resource is set to "<resource>" + Examples: + | rpc.system | rpc.service | 
net.peer.ip | net.peer.name | net.peer.port | resource | + | grpc | | | | | grpc | + | grpc | myService | | | | grpc/myService | + | grpc | myService | | rpc-server | | rpc-server/myService | + | grpc | myService | 127.0.0.1 | rpc-server | | rpc-server/myService | + | grpc | | 127.0.0.1 | rpc-server | 7777 | rpc-server:7777 | + | grpc | myService | 127.0.0.1 | rpc-server | 7777 | rpc-server:7777/myService | + | grpc | myService | 127.0.0.1 | | 7777 | 127.0.0.1:7777/myService | + + # --- RPC server + # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/rpc.md + Scenario: RPC server + Given an agent + And OTel span is created with kind 'SERVER' + And OTel span has following attributes + | rpc.system | grpc | + Then Elastic bridged transaction type is 'request' + + From 3701625c52966f240681c0a347cdc5d88f8971c2 Mon Sep 17 00:00:00 2001 From: SylvainJuge Date: Mon, 15 Nov 2021 08:39:42 +0100 Subject: [PATCH 17/26] Update specs/agents/README.md Co-authored-by: Trent Mick --- specs/agents/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/agents/README.md b/specs/agents/README.md index 60db1ac7..e7c0af1b 100644 --- a/specs/agents/README.md +++ b/specs/agents/README.md @@ -55,7 +55,7 @@ You can find details about each of these in the [APM Data Model](https://www.ela - [Messaging systems](tracing-instrumentation-messaging.md) - [gRPC](tracing-instrumentation-grpc.md) - [GraphQL](tracing-instrumentation-graphql.md) - - [OpenTelemetry API Bridge](tracing-otel-api-bridge.md) + - [OpenTelemetry API Bridge](tracing-api-otel.md) - [Error/exception tracking](error-tracking.md) - [Metrics](metrics.md) - [Logging Correlation](log-correlation.md) From 433e2c0b248d6c487daf4ad71059a81b798aae77 Mon Sep 17 00:00:00 2001 From: SylvainJuge Date: Mon, 15 Nov 2021 10:02:32 +0100 Subject: [PATCH 18/26] Update specs/agents/tracing-api-otel.md Co-authored-by: Felix Barnsteiner --- 
specs/agents/tracing-api-otel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index bf332169..eb8fb799 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -216,7 +216,7 @@ span.destination.service.resource = resource; ### Active Spans and Context -When possible, bridge implementation SHOULD ensure proper interoperability between Elastic transactions/spans and OTel spans when +When possible, bridge implementation MUST ensure proper interoperability between Elastic transactions/spans and OTel spans when used from their respective APIs: - After activating an Elastic span via the agent's API, the [`Context`] returned via the [get current context API] should contain that Elastic span - When an OTel context is [attached] (aka activated), the [get current context API] should return the same [`Context`] instance. From 5a0cd1beadb31882710dcaa2453fbd5f3e3c63d3 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Tue, 7 Dec 2021 16:16:21 +0100 Subject: [PATCH 19/26] add status mapping + configurability --- specs/agents/tracing-api-otel.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index bf332169..e091b0e8 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -2,7 +2,8 @@ [OpenTelemetry](https://opentelemetry.io) (OTel in short) provides a vendor-neutral API that allows to capture tracing, logs and metrics data. -Agents may provide a bridge implementation of OpenTelemetry Tracing API following this specification. +Agents MAY provide a bridge implementation of OpenTelemetry Tracing API following this specification. +When available, implementation MUST be configurable and should be disabled by default when marked as `experimental`. 
Bridging here means that for each OTel span created with the API, a native span/transaction will be created and sent to APM server. @@ -62,6 +63,21 @@ if (span_kind == null) { ``` +### Span status + +OTel spans have a [Status](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status) +field to indicate the status of the underlying task they represent. + +When the [Set Status](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status) on OTel API is used, we can map it directly to `span.outcome`: +- OK => Success +- Error => Failure +- Unset (default) => Unknown + +However, when not provided explicitly agents can infer the outcome from the presence of a reported error. +This behavior is not expected with OTel API with status, thus bridged spans/transactions should NOT have their outcome +altered by reporting (or lack of reporting) of an error. Here the behavior should be identical to when the end-user provides +the outcome explicitly and thus have higher priority over the inferred value. + ### Attributes mapping OTel relies on key-value pairs for span attributes. 
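
The status mapping introduced by this patch can be sketched as follows. This is only a minimal illustration, not part of the specification: the helper names `mapOtelStatusToOutcome` and `bridgedOutcome` are hypothetical, and the lowercase status and outcome strings are assumed to match the values used in the Gherkin spec.

```javascript
// Hypothetical helper (not defined by the spec): maps an OTel status code
// to the Elastic outcome described in the "Span status" section.
function mapOtelStatusToOutcome(statusCode) {
  switch (String(statusCode).toLowerCase()) {
    case 'ok':
      return 'success';
    case 'error':
      return 'failure';
    default:
      // 'unset' (the OTel default) or any unrecognized value
      return 'unknown';
  }
}

// Hypothetical helper: outcome of a bridged span or transaction.
// The reported-error flag is deliberately ignored, because the outcome of a
// bridged span must not be inferred from the presence or absence of errors.
function bridgedOutcome(otelStatusCode, hasReportedError) {
  return mapOtelStatusToOutcome(otelStatusCode);
}
```

For example, under these assumptions `bridgedOutcome('unset', true)` stays `'unknown'` even though an error was reported, which is the behavior the spec asks for.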
From c02707f654df69df5f8929123e0fc51f5891e409 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Tue, 7 Dec 2021 16:25:26 +0100 Subject: [PATCH 20/26] update gherkin spec --- .../agents/gherkin-specs/otel_bridge.feature | 49 +++++++++++++++++-- 1 file changed, 45 insertions(+), 4 deletions(-) diff --git a/tests/agents/gherkin-specs/otel_bridge.feature b/tests/agents/gherkin-specs/otel_bridge.feature index 5823c04f..8d65772b 100644 --- a/tests/agents/gherkin-specs/otel_bridge.feature +++ b/tests/agents/gherkin-specs/otel_bridge.feature @@ -12,17 +12,20 @@ Feature: OpenTelemetry bridge Scenario: Create root transaction from OTel span without parent Given an agent And OTel span is created without parent + And OTel span ends Then Elastic bridged object is a transaction Then Elastic bridged transaction is a root transaction + # outcome should not be inferred from the lack/presence of errors + Then Elastic bridged transaction outcome is "unknown" Scenario: Create span from OTel span Given an agent And OTel span is created with local context as parent + And OTel span ends Then Elastic bridged object is a span Then Elastic bridged span has local context as parent - - # --- TODO : compatibility mapping for server < 7.16 - # --> extra complexity here as it's part of the + # outcome should not be inferred from the lack/presence of errors + Then Elastic bridged span outcome is "unknown" # --- OTel span kind mapping for spans & transactions @@ -30,6 +33,7 @@ Feature: OpenTelemetry bridge Given an agent And an active transaction And OTel span is created with kind "<kind>" + And OTel span ends Then Elastic bridged object is a span Then Elastic bridged span OTel kind is "<kind>" Then Elastic bridged span type is "<default_type>" @@ -45,6 +49,7 @@ Feature: OpenTelemetry bridge Scenario Outline: OTel span kind for transactions & default transaction type Given an agent And OTel span is created with kind "<kind>" + And OTel span ends Then Elastic bridged object is a transaction Then Elastic bridged transaction OTel 
kind is "<kind>" Then Elastic bridged transaction type is 'unknown' @@ -56,6 +61,35 @@ Feature: OpenTelemetry bridge | PRODUCER | | CONSUMER | + # OTel span status mapping for spans & transactions + + Scenario Outline: OTel span mapping with status for transactions + Given an agent + And OTel span is created with kind 'SERVER' + And OTel span status set to "<status>" + And OTel span ends + Then Elastic bridged object is a transaction + Then Elastic bridged transaction outcome is "<outcome>" + Examples: + | status | outcome | + | unset | unknown | + | ok | success | + | error | failure | + + Scenario Outline: OTel span mapping with status for spans + Given an agent + Given an active transaction + And OTel span is created with kind 'INTERNAL' + And OTel span status set to "<status>" + And OTel span ends + Then Elastic bridged object is a span + Then Elastic bridged span outcome is "<outcome>" + Examples: + | status | outcome | + | unset | unknown | + | ok | success | + | error | failure | + # --- span type, subtype and action inference from OTel attributes # --- HTTP server @@ -66,6 +100,7 @@ Feature: OpenTelemetry bridge And OTel span has following attributes | http.url | <http.url> | | http.scheme | <http.scheme> | + And OTel span ends Then Elastic bridged object is a transaction Then Elastic bridged transaction type is "request" Examples: @@ -86,6 +121,7 @@ Feature: OpenTelemetry bridge | net.peer.ip | <net.peer.ip> | | net.peer.name | <net.peer.name> | | net.peer.port | <net.peer.port> | + And OTel span ends Then Elastic bridged span type is 'external' Then Elastic bridged span subtype is 'http' Then Elastic bridged span OTel attributes are copied as-is @@ -114,6 +150,7 @@ Feature: OpenTelemetry bridge | net.peer.ip | <net.peer.ip> | | net.peer.name | <net.peer.name> | | net.peer.port | <net.peer.port> | + And OTel span ends Then Elastic bridged span type is 'db' Then Elastic bridged span subtype is "<db.system>" Then Elastic bridged span OTel attributes are copied as-is @@ -131,12 +168,13 @@ Feature: OpenTelemetry bridge # --- Messaging consumer (transaction consuming/receiving a message) # 
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md - Scenario: + Scenario: Messaging consumer Given an agent And an active transaction And OTel span is created with kind 'CONSUMER' And OTel span has following attributes | messaging.system | anything | + And OTel span ends Then Elastic bridged transaction type is 'messaging' # --- Messaging producer (client span emitting a message) @@ -152,6 +190,7 @@ Feature: OpenTelemetry bridge | net.peer.ip | <net.peer.ip> | | net.peer.name | <net.peer.name> | | net.peer.port | <net.peer.port> | + And OTel span ends Then Elastic bridged span type is 'messaging' Then Elastic bridged span subtype is "<messaging.system>" Then Elastic bridged span OTel attributes are copied as-is @@ -179,6 +218,7 @@ Feature: OpenTelemetry bridge | net.peer.ip | <net.peer.ip> | | net.peer.name | <net.peer.name> | | net.peer.port | <net.peer.port> | + And OTel span ends Then Elastic bridged span type is 'external' Then Elastic bridged span subtype is "<rpc.system>" Then Elastic bridged span OTel attributes are copied as-is @@ -200,6 +240,7 @@ Feature: OpenTelemetry bridge And OTel span is created with kind 'SERVER' And OTel span has following attributes | rpc.system | grpc | + And OTel span ends Then Elastic bridged transaction type is 'request' From a376d5eedda572d729d1252173f54ec6c9312054 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Mon, 7 Feb 2022 15:26:21 +0100 Subject: [PATCH 21/26] add a few clarifications --- specs/agents/tracing-api-otel.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index 4fa5eb28..ca3e0c31 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -40,7 +40,9 @@ This value should be stored into Elasticsearch documents to preserve OTel semant Possible values are `CLIENT`, `SERVER`, `PRODUCER`, `CONSUMER` and `INTERNAL`, refer to 
[specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind) for details on semantics. -When `otel.span_kind` is not provided by the agent, APM Server should infer it using the following algorithm: +By default, OTel spans have their `SpanKind` set to `INTERNAL` by OTel API implementation, so it is assumed to always be provided when using the bridge. + +For existing agents without OTel bridge or for data captured without the bridge, the APM server has to infer the value of `otel.span_kind` with the following algorithm: ```javascript span_kind = null; @@ -63,6 +65,8 @@ if (span_kind == null) { ``` +While being optional, inferring the value of `otel.span_kind` helps to keep the data model closer to the OTel specification, even if the original data was sent using the native agent protocol. + ### Span status OTel spans have a [Status](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status) @@ -130,6 +134,7 @@ a = transation.otel.attributes; span_kind = transaction.otel_span_kind; isRpc = a['rpc.system'] !== undefined; isHttp = a['http.url'] !== undefined || a['http.scheme'] !== undefined; +isMessaging = a['messaging.system'] !== undefined; if (span_kind == 'SERVER' && (isRpc || isHttp)) { type = 'request'; } else if (span_kind == 'CONSUMER' && isMessaging) { From da7447acdaf51bd1372d95c4b726d59a671b58a9 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Mon, 7 Feb 2022 16:08:11 +0100 Subject: [PATCH 22/26] clarify user-experience --- specs/agents/tracing-api-otel.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index ca3e0c31..32582276 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -7,8 +7,18 @@ When available, implementation MUST be configurable and should be disabled by de Bridging here means that for each 
OTel span created with the API, a native span/transaction will be created and sent to APM server. -From the perspective of the application code calling the OTel API, the delegation to a native span/transaction is transparent. -Also, this means that any OTel processors will be bypassed and ignored by the bridge. +### User experience + +On a high-level, from the perspective of the application code, using the OTel bridge should not differ from using the +OTel API. + +The aim of the bridge is to allow any application/library that is instrumented with OTel API to capture OTel spans to +seamlessly delegate to Elastic APM span/transactions. Also, it provides a vendor-neutral alternative to any existing +manual agent API with similar features. + +One major difference though is that since the implementation of OTel API will be delegated to Elastic APM agent, the +whole OTel configuration that might be present in the application code (OTel processor pipeline) or deployment +(env. variables) will be ignored. ### Spans and Transactions From 7b9e59b499dda637ea24f9e1010a9a2a88bad7e8 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Mon, 7 Feb 2022 16:16:59 +0100 Subject: [PATCH 23/26] clarify bridge limitations --- specs/agents/tracing-api-otel.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index 32582276..63fa3ca6 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -10,7 +10,7 @@ Bridging here means that for each OTel span created with the API, a native span/ ### User experience On a high-level, from the perspective of the application code, using the OTel bridge should not differ from using the -OTel API. +OTel API for tracing. See [limitations](#limitations) below for details on the currently unsupported OTel features. 
The aim of the bridge is to allow any application/library that is instrumented with OTel API to capture OTel spans to seamlessly delegate to Elastic APM span/transactions. Also, it provides a vendor-neutral alternative to any existing @@ -20,6 +20,14 @@ One major difference though is that since the implementation of OTel API will be whole OTel configuration that might be present in the application code (OTel processor pipeline) or deployment (env. variables) will be ignored. +### Limitations + +The OTel API/specification goes beyond tracing, as a result, the following OTel features are not supported: +- metrics +- logs +- span events +- span links + ### Spans and Transactions OTel only defines Spans, whereas Elastic APM relies on both Spans and Transactions. From a476547841cae2efa1572d2c266d5f5413b7246f Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Wed, 9 Feb 2022 11:28:48 +0100 Subject: [PATCH 24/26] MAY use labels for server < 7.16 --- specs/agents/tracing-api-otel.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index 63fa3ca6..d2c1579f 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -5,6 +5,8 @@ Agents MAY provide a bridge implementation of OpenTelemetry Tracing API following this specification. When available, implementation MUST be configurable and should be disabled by default when marked as `experimental`. +The bridge implementation relies on APM Server version 7.16 or later, as a result when sending data to a server < 7.16 is recommended. + Bridging here means that for each OTel span created with the API, a native span/transaction will be created and sent to APM server. ### User experience @@ -124,13 +126,10 @@ OpenTelemetry attributes should be stored in `otel.attributes` as a flat key-val } ``` -APM server supports the `otel.attributes` property starting with version 7.16.0. 
- -When the APM server version does not support `otel.attributes` property, agents should use `labels` as fallback with OTel attribute -name as key. +Starting from version 7.16 onwards, APM server must provide a mapping that is equivalent to the native OpenTelemetry Protocol (OTLP) intake for the +fields provided in `otel.attributes`. -When the APM server supports `otel.attributes` property, the server-side mapping should be identical to the one -that is applied to handle native OpenTelemetry Protocol (OTLP) intake. +When sending data to APM server version before 7.16, agents MAY use span and transaction labels as fallback to store OTel attributes to avoid dropping information. ### Compatibility mapping @@ -271,4 +270,4 @@ Both OTel and our agents have their own definition of what "active context" is, - Java OTel API: active context is implemented as a key-value map propagated through thread local In order to avoid potentially complex and tedious synchronization issues between OTel and our existing agent -implementations, the bridge implementation SHOULD provide an abstraction to have a single "active context" storage. \ No newline at end of file +implementations, the bridge implementation SHOULD provide an abstraction to have a single "active context" storage. From c9e3004492110f592c18e1d9120df77cccfd87e6 Mon Sep 17 00:00:00 2001 From: SylvainJuge Date: Thu, 10 Feb 2022 09:11:43 +0100 Subject: [PATCH 25/26] Update specs/agents/tracing-api-otel.md Co-authored-by: Colton Myers --- specs/agents/tracing-api-otel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index d2c1579f..d4bc3b79 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -5,7 +5,7 @@ Agents MAY provide a bridge implementation of OpenTelemetry Tracing API following this specification. 
When available, implementation MUST be configurable and should be disabled by default when marked as `experimental`. -The bridge implementation relies on APM Server version 7.16 or later, as a result when sending data to a server < 7.16 is recommended. +The bridge implementation relies on APM Server version 7.16 or later. Agents SHOULD recommend this minimum version to users in bridge documentation. Bridging here means that for each OTel span created with the API, a native span/transaction will be created and sent to APM server. From b195414aa6abb8a5764290db4570026006a5ce46 Mon Sep 17 00:00:00 2001 From: Sylvain Juge Date: Wed, 23 Feb 2022 10:38:25 +0100 Subject: [PATCH 26/26] clarify error capture + supported features --- specs/agents/tracing-api-otel.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/specs/agents/tracing-api-otel.md b/specs/agents/tracing-api-otel.md index d4bc3b79..f78f81c6 100644 --- a/specs/agents/tracing-api-otel.md +++ b/specs/agents/tracing-api-otel.md @@ -13,6 +13,10 @@ Bridging here means that for each OTel span created with the API, a native span/ On a high-level, from the perspective of the application code, using the OTel bridge should not differ from using the OTel API for tracing. See [limitations](#limitations) below for details on the currently unsupported OTel features. +For tracing the support should include: +- creating spans with attributes +- context propagation +- capturing errors The aim of the bridge is to allow any application/library that is instrumented with OTel API to capture OTel spans to seamlessly delegate to Elastic APM span/transactions. Also, it provides a vendor-neutral alternative to any existing