From f0f993b467d7711faf7d1c004d3c59ef44a6e75e Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 20:46:46 -0800 Subject: [PATCH 01/20] metrics prototype scenario --- .../0145-metrics-prototype-scenarios.md | 214 ++++++++++++++++++ 1 file changed, 214 insertions(+) create mode 100644 text/metrics/0145-metrics-prototype-scenarios.md diff --git a/text/metrics/0145-metrics-prototype-scenarios.md b/text/metrics/0145-metrics-prototype-scenarios.md new file mode 100644 index 000000000..b41f6fe59 --- /dev/null +++ b/text/metrics/0145-metrics-prototype-scenarios.md @@ -0,0 +1,214 @@ +# Scenarios for Metrics API/SDK Prototyping + +With the stable release of the tracing specification, the OpenTelemetry +community is willing to spend more energy on metrics API/SDK. The goal is to get +the metrics API/SDK specification to +[`Experimental`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#experimental) +state by end of 6/2021, and make it +[`Stable`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#stable) +before end of 2021: + +* By end of 6/2021, we should have a good confidence that we can recommend + language client owners to work on metrics preview release. This means starting + from 7/1/2021 the specification should not have major surprise or big changes + which would thrust the language client maintainers. + +* By end of 10/2021, we should mark the metrics API/SDK specification as + [`Feature-freeze`](https://github.com/open-telemetry/opentelemetry-specification/blob/1afab39e5658f807315abf2f3256809293bfd421/specification/document-status.md#feature-freeze), + and focusing on bug fixing or editorial changes. + +* By end of 2021, we want to have a stable release of metrics API/SDK + specification, with multiple language SIGs providing RC (release candidate) or + GA (general available) clients. 
+ +In this document, we will focus on two scenarios that we use for prototypeing +metrics API/SDK. The goal is to have two scenarios which clearly capture the +major requirements, so we can work with language client SIGs to prototype, +gather the learnings, determine the scopes and stages. Later the scenarios can +be used as examples and test cases for all the language clients. + +Here are the languages we've agreed to use during the prototyping: + +* C# +* Java +* Python + +Instead of boiling the ocean, we will need to divide the work into multiple +stages: + +1. Do the end-to-end prototype to get the overall understanding of the problem + domain. We should also clarify the scope and be able to articulate it + precisely during this stage, here are some examples: + + * Why do we want to introduce brand new metrics APIs versus taking a well + established API (e.g. Promethues and Micrometer), what makes OpenTelemetry + metrics API different (e.g. Baggage)? + * Do we need to consider OpenCensus Stats API shim, or this is out of scope? + +2. Focus on a core subset of API, cover the end-to-end library instrumentation + scenario. At this stage we don't expect to cover all the APIs as some of them + might be very similar (e.g. if we know how to record an integer, we don't + have to work on float/double as we can add them later by replicating what + we've done for integer). + +3. Focus on a core subset of SDK. This would help us to get the end-to-end + application. + +4. Replicate stage 2 to cover the complete set of APIs. + +5. Replicate stage 4 to cover the complete set of SDKs. + +## Scenario 1: Grocery + +The **Grocery** scenario covers how a developer could use metrics API and SDK in +a final application. 
It is a self-contained application which covers: + +* How to instrument the code in a vendor agnostic way +* How to configure the SDK and exporter + +Considering there might be multiple grocery stores, the metrics we collect will +have the store name as a dimension - which is fairly static (not changing while +the store is running). + +The store has plenty supply of potato and tomato, with the following price: + +* Potato: $1.0 / ea +* Tomato: $3.0 / ea + +Each customer has a unique name (e.g. customerA, customerB), a customer could +come to the store multiple times. Here goes the Python snippet: + +```python +store = GroceryStore("Portland") +store.process_order("customerA", {"potato": 2, "tomato": 3}) +store.process_order("customerB", {"tomato": 10}) +store.process_order("customerC", {"potato": 2}) +store.process_order("customerA", {"tomato": 1}) +``` + +When the store is closed, we will report the following metrics: + +### Order info + +| Store | Customer | Number of Orders | Amount (USD) | +| -------- | --------- | ---------------- | ------------ | +| Portland | customerA | 2 | 14.0 | +| Portland | customerB | 1 | 30.0 | +| Portland | customerC | 1 | 2.0 | + +### Items sold + +| Store | Customer | Item | Count | +| -------- | --------- | ------ | ----- | +| Portland | customerA | potato | 2 | +| Portland | customerA | tomato | 4 | +| Portland | customerB | tomato | 10 | +| Portland | customerC | potato | 2 | + +## Scenario 2: HTTP Server + +The _HTTP Server_ scenario covers how a library developer X could use metrics +API to instrument a library, and how the application developer Y can configure +the library to use OpenTelemetry SDK in a final application. X and Y are working +for different companies and they don't communicate. 
The demo has two parts - the +library (HTTP lib owned by X) and the server app (owned by Y): + +* How developer X could instrument the library code in a vendor agnostic way + * Performance is critical for X + * X doesn't know which metrics and which dimension will Y pick + * X doesn't know the aggregation time window, nor the final destination of the + metrics +* How developer Y could configure the SDK and exporter + * How should Y hook up the metrics SDK with the library + * How should Y configure the time window(s) and destination(s) + * How should Y pick the metrics and the dimensions + +### Library Requirements + +The library developer (developer X) will expose the following metrics out of +box: + +### Pull Metrics + +These are pull metrics - the value is always available, and is only reported and +collected based on the ask from consumer(s). If there is no ask from the +consumer, the value will not be reported at all (e.g. there is no API call to +fetch the room temperature unless someone is asking for the room temperature). + +#### Process CPU Usage + +Note: the **Host Name** should leverage [`OpenTelemetry +Resource`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md), +so it should be covered by the metrics SDK rather than API, and strictly +speaking it is not considered as a "dimension" from the SDK perspective. + +| Host Name | Process ID | CPU% [0.0, 100.0] | +| --------- | ---------- | ----------------- | +| MachineA | 1234 | 15.3 | + +#### System CPU Usage + +| Host Name | CPU% [0, 100] | +| --------- | ------------- | +| MachineA | 30 | + +#### Server Room Temperature + +| Host Name | Temperature (F) | +| --------- | --------------- | +| MachineA | 65.3 | + +### Push Metrics + +These are the push metrics - the value is reported (via the API) when it is +available, and collected (via the SDK) based on the ask from consumer(s). 
If +there is no ask from the consumer, the API will be no-op and the data will be +dropped on the floor. + +#### Received HTTP Requests + +Note: the **Client Type** is passed in via the [`OpenTelemetry +Baggage`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md), +strictly speaking it is not part of the metrics API, but it is considered as a +"dimension" from the metrics SDK perspective. + +| Host Name | Process ID | Client Type | HTTP Method | HTTP Host | HTTP Flavor | Peer IP | Peer Port | Host IP | Host Port | +| --------- | ---------- | ----------- | ----------- | --------- | ----------- | --------- | --------- | --------- | --------- | +| MachineA | 1234 | Android | GET | otel.org | 1.1 | 127.0.0.1 | 51327 | 127.0.0.1 | 80 | +| MachineA | 1234 | Android | POST | otel.org | 1.1 | 127.0.0.1 | 51328 | 127.0.0.1 | 80 | +| MachineA | 1234 | iOS | PUT | otel.org | 1.1 | 127.0.0.1 | 51329 | 127.0.0.1 | 80 | + +#### HTTP Server Duration + +Note: the server duration is only available for **finished HTTP requests**. 
+ +| Host Name | Process ID | Client Type | HTTP Method | HTTP Host | HTTP Status Code | HTTP Flavor | Peer IP | Peer Port | Host IP | Host Port | Duration (ms) | +| --------- | ---------- | ----------- | ----------- | --------- | ---------------- | ----------- | --------- | --------- | --------- | --------- | ------------- | +| MachineA | 1234 | Android | GET | otel.org | 200 | 1.1 | 127.0.0.1 | 51327 | 127.0.0.1 | 80 | 8.5 | +| MachineA | 1234 | Android | POST | otel.org | 304 | 1.1 | 127.0.0.1 | 51328 | 127.0.0.1 | 80 | 100.0 | + +### Application Requirements + +The application owner (developer Y) would only want the following metrics: + +* [System CPU Usage](#system-cpu-usage) reported every 5 seconds +* [Server Room Temperature](#server-room-temperature) reported every minute +* [HTTP Server Duration](#http-server-duration), reported every 5 seconds, with + a subset of the dimensions: + * Host Name + * HTTP Method + * HTTP Host + * HTTP Status Code + * Client Type + * 90%, 95%, 99% and 99.9% latency +* HTTP request counters, reported every 5 seconds: + * Total number of received HTTP requests + * Total number of finished HTTP requests + * Number of currently-in-flight HTTP requests (concurrent HTTP requests) + + | Host Name | Process ID | HTTP Host | Received Requests | Finished Requests | Concurrent Requests | + | --------- | ---------- | --------- | ----------------- | ----------------- | ------------------- | + | MachineA | 1234 | otel.org | 630 | 601 | 29 | + | MachineA | 5678 | otel.org | 1005 | 1001 | 4 | +* Exception samples (examplar) - in case HTTP 5xx happened, developer Y would + want to see a sample request with all the dimensions (IP, Port, etc.) 
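The Grocery scenario introduced by this patch can be checked with a small in-memory sketch. This is not the OpenTelemetry metrics API: the `GroceryStore` class body, its attribute names, and the `PRICES_USD` table are hypothetical and only mirror the `process_order` calls and the prices given in the scenario. It aggregates by the same dimensions as the two tables (store and customer for order info, plus item for items sold), so its totals can be compared against them.

```python
from collections import defaultdict

# Prices are taken from the scenario text; the name PRICES_USD is an assumption.
PRICES_USD = {"potato": 1.00, "tomato": 3.00}


class GroceryStore:
    """Hypothetical stand-in that aggregates in memory instead of calling a metrics API."""

    def __init__(self, name):
        self.name = name
        self.order_count = defaultdict(int)     # (store, customer) -> number of orders
        self.order_amount = defaultdict(float)  # (store, customer) -> amount in USD
        self.items_sold = defaultdict(int)      # (store, customer, item) -> count

    def process_order(self, customer, items):
        key = (self.name, customer)
        self.order_count[key] += 1
        for item, quantity in items.items():
            self.order_amount[key] += PRICES_USD[item] * quantity
            self.items_sold[(self.name, customer, item)] += quantity


store = GroceryStore("Portland")
store.process_order("customerA", {"potato": 2, "tomato": 3})
store.process_order("customerB", {"tomato": 10})
store.process_order("customerC", {"potato": 2})
store.process_order("customerA", {"tomato": 1})

# Report once, as the scenario does "when the store is closed".
for (store_name, customer), orders in sorted(store.order_count.items()):
    amount = store.order_amount[(store_name, customer)]
    print(f"{store_name} {customer} orders={orders} amount={amount:.2f}")
for (store_name, customer, item), count in sorted(store.items_sold.items()):
    print(f"{store_name} {customer} {item} count={count}")
```

In a real prototype the three dictionaries would presumably be replaced by instruments obtained from the metrics API, with store, customer and item passed as dimensions. The sketch only fixes the expected numbers.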
From 0660044d8c7caad52822145dbc96ef2de6738823 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 20:48:47 -0800 Subject: [PATCH 02/20] rename --- ...prototype-scenarios.md => 0146-metrics-prototype-scenarios.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/metrics/{0145-metrics-prototype-scenarios.md => 0146-metrics-prototype-scenarios.md} (100%) diff --git a/text/metrics/0145-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md similarity index 100% rename from text/metrics/0145-metrics-prototype-scenarios.md rename to text/metrics/0146-metrics-prototype-scenarios.md From 1e9eaf16950caef0b045d60d18f21afd944098bc Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 20:55:06 -0800 Subject: [PATCH 03/20] fix typo --- text/metrics/0146-metrics-prototype-scenarios.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index b41f6fe59..94e57a004 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -41,7 +41,7 @@ stages: precisely during this stage, here are some examples: * Why do we want to introduce brand new metrics APIs versus taking a well - established API (e.g. Promethues and Micrometer), what makes OpenTelemetry + established API (e.g. Prometheus and Micrometer), what makes OpenTelemetry metrics API different (e.g. Baggage)? * Do we need to consider OpenCensus Stats API shim, or this is out of scope? 
From c04673d78d8e737182b6545d5cd78856f7d2a95e Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 21:59:31 -0800 Subject: [PATCH 04/20] Update text/metrics/0146-metrics-prototype-scenarios.md Co-authored-by: Cijo Thomas --- text/metrics/0146-metrics-prototype-scenarios.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 94e57a004..3034e9a50 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -210,5 +210,5 @@ The application owner (developer Y) would only want the following metrics: | --------- | ---------- | --------- | ----------------- | ----------------- | ------------------- | | MachineA | 1234 | otel.org | 630 | 601 | 29 | | MachineA | 5678 | otel.org | 1005 | 1001 | 4 | -* Exception samples (examplar) - in case HTTP 5xx happened, developer Y would +* Exception samples (exemplar) - in case HTTP 5xx happened, developer Y would want to see a sample request with all the dimensions (IP, Port, etc.) 
From 11c07e2c6098411532560d81931f8dec9d0822a0 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 21:59:47 -0800 Subject: [PATCH 05/20] Update text/metrics/0146-metrics-prototype-scenarios.md Co-authored-by: Cijo Thomas --- text/metrics/0146-metrics-prototype-scenarios.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 3034e9a50..2da55f9fe 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -200,7 +200,7 @@ The application owner (developer Y) would only want the following metrics: * HTTP Host * HTTP Status Code * Client Type - * 90%, 95%, 99% and 99.9% latency + * 90%, 95%, 99% and 99.9% server duration * HTTP request counters, reported every 5 seconds: * Total number of received HTTP requests * Total number of finished HTTP requests From 39dee6da8d0939a55df63ff08bb03b26306e735a Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 22:00:00 -0800 Subject: [PATCH 06/20] Update text/metrics/0146-metrics-prototype-scenarios.md Co-authored-by: Cijo Thomas --- text/metrics/0146-metrics-prototype-scenarios.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 2da55f9fe..bb8ab118a 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -21,7 +21,7 @@ before end of 2021: specification, with multiple language SIGs providing RC (release candidate) or GA (general available) clients. -In this document, we will focus on two scenarios that we use for prototypeing +In this document, we will focus on two scenarios that we use for prototyping metrics API/SDK. 
The goal is to have two scenarios which clearly capture the major requirements, so we can work with language client SIGs to prototype, gather the learnings, determine the scopes and stages. Later the scenarios can From caa8bde134c69b4156477637c7411851807c007a Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 22:00:10 -0800 Subject: [PATCH 07/20] Update text/metrics/0146-metrics-prototype-scenarios.md Co-authored-by: Cijo Thomas --- text/metrics/0146-metrics-prototype-scenarios.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index bb8ab118a..e8b5c86c5 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -115,7 +115,7 @@ library (HTTP lib owned by X) and the server app (owned by Y): * How developer X could instrument the library code in a vendor agnostic way * Performance is critical for X - * X doesn't know which metrics and which dimension will Y pick + * X doesn't know which metrics and which dimensions will Y pick * X doesn't know the aggregation time window, nor the final destination of the metrics * How developer Y could configure the SDK and exporter From 5dfb0e98fd677dbf0db70f102d312d317d1c8213 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 22:06:53 -0800 Subject: [PATCH 08/20] clarify GA/stable --- text/metrics/0146-metrics-prototype-scenarios.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index e8b5c86c5..9c74e0ad3 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -19,7 +19,8 @@ before end of 2021: * By end of 2021, we want to have a stable release of metrics API/SDK specification, with multiple language SIGs providing RC (release candidate) or - 
GA (general available) clients. + [stable](https://github.com/open-telemetry/opentelemetry-specification/blob/9047c91412d3d4b7f28b0f7346d8c5034b509849/specification/versioning-and-stability.md#stable) + clients. In this document, we will focus on two scenarios that we use for prototyping metrics API/SDK. The goal is to have two scenarios which clearly capture the From 9346bae3e74d1175c1f2676a0af4e91f3a5a78a5 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Wed, 17 Feb 2021 22:18:02 -0800 Subject: [PATCH 09/20] add example to exemplar --- text/metrics/0146-metrics-prototype-scenarios.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 9c74e0ad3..543859e41 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -212,4 +212,9 @@ The application owner (developer Y) would only want the following metrics: | MachineA | 1234 | otel.org | 630 | 601 | 29 | | MachineA | 5678 | otel.org | 1005 | 1001 | 4 | * Exception samples (exemplar) - in case HTTP 5xx happened, developer Y would - want to see a sample request with all the dimensions (IP, Port, etc.) + want to see a sample request with trace id, span id and all the dimensions + (IP, Port, etc.) + + | Trace ID | Span ID | Host Name | Process ID | Client Type | HTTP Method | HTTP Host | HTTP Status Code | HTTP Flavor | Peer IP | Peer Port | Host IP | Host Port | Exception | + | -------------------------------- | ---------------- | --------- | ---------- | ----------- | ----------- | --------- | ---------------- | ----------- | --------- | --------- | --------- | --------- | -------------------- | + | 8389584945550f40820b96ce1ceb9299 | 745239d26e408342 | MachineA | 1234 | iOS | PUT | otel.org | 500 | 1.1 | 127.0.0.1 | 51329 | 127.0.0.1 | 80 | SocketException(...) 
| From 93290f70bc3e74c7bed778436a6bf28e5b93d07c Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Thu, 18 Feb 2021 09:39:21 -0800 Subject: [PATCH 10/20] adjust the proposed timeline considering most folks will be on vacation in Dec. --- text/metrics/0146-metrics-prototype-scenarios.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 543859e41..5452df538 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -4,16 +4,16 @@ With the stable release of the tracing specification, the OpenTelemetry community is willing to spend more energy on metrics API/SDK. The goal is to get the metrics API/SDK specification to [`Experimental`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#experimental) -state by end of 6/2021, and make it +state by end of 5/2021, and make it [`Stable`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#stable) before end of 2021: -* By end of 6/2021, we should have a good confidence that we can recommend +* By end of 5/2021, we should have a good confidence that we can recommend language client owners to work on metrics preview release. This means starting from 7/1/2021 the specification should not have major surprise or big changes which would thrust the language client maintainers. -* By end of 10/2021, we should mark the metrics API/SDK specification as +* By end of 9/2021, we should mark the metrics API/SDK specification as [`Feature-freeze`](https://github.com/open-telemetry/opentelemetry-specification/blob/1afab39e5658f807315abf2f3256809293bfd421/specification/document-status.md#feature-freeze), and focusing on bug fixing or editorial changes. 
From 51bf6f5b1d5966f0a6cd4d02c467361fc168c056 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Thu, 18 Feb 2021 09:44:46 -0800 Subject: [PATCH 11/20] fix nits --- text/metrics/0146-metrics-prototype-scenarios.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 5452df538..c21428407 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -73,8 +73,8 @@ the store is running). The store has plenty supply of potato and tomato, with the following price: -* Potato: $1.0 / ea -* Tomato: $3.0 / ea +* Potato: $1.00 / ea +* Tomato: $3.00 / ea Each customer has a unique name (e.g. customerA, customerB), a customer could come to the store multiple times. Here goes the Python snippet: @@ -93,9 +93,9 @@ When the store is closed, we will report the following metrics: | Store | Customer | Number of Orders | Amount (USD) | | -------- | --------- | ---------------- | ------------ | -| Portland | customerA | 2 | 14.0 | -| Portland | customerB | 1 | 30.0 | -| Portland | customerC | 1 | 2.0 | +| Portland | customerA | 2 | 14.00 | +| Portland | customerB | 1 | 30.00 | +| Portland | customerC | 1 | 2.00 | ### Items sold From 9ba2d1e6cffabfaea797c5a34ff7fa7de5532f45 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Thu, 18 Feb 2021 17:03:51 -0800 Subject: [PATCH 12/20] adjust the ToC --- .../0146-metrics-prototype-scenarios.md | 70 +++++++++---------- 1 file changed, 35 insertions(+), 35 deletions(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index c21428407..928caeded 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -8,16 +8,18 @@ state by end of 5/2021, and make it 
[`Stable`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#stable) before end of 2021: -* By end of 5/2021, we should have a good confidence that we can recommend +* By end of 5/31/2021, we should have a good confidence that we can recommend language client owners to work on metrics preview release. This means starting - from 7/1/2021 the specification should not have major surprise or big changes - which would thrust the language client maintainers. + from 6/1/2021 the specification should not have major surprises or big changes + which would thrust the language client maintainers, and we start to recommend + client maintainers to implement it. We might introduce additional features but + there should be a high bar. -* By end of 9/2021, we should mark the metrics API/SDK specification as +* By end of 9/30/2021, we should mark the metrics API/SDK specification as [`Feature-freeze`](https://github.com/open-telemetry/opentelemetry-specification/blob/1afab39e5658f807315abf2f3256809293bfd421/specification/document-status.md#feature-freeze), and focusing on bug fixing or editorial changes. -* By end of 2021, we want to have a stable release of metrics API/SDK +* By end of 11/30/2021, we want to have a stable release of metrics API/SDK specification, with multiple language SIGs providing RC (release candidate) or [stable](https://github.com/open-telemetry/opentelemetry-specification/blob/9047c91412d3d4b7f28b0f7346d8c5034b509849/specification/versioning-and-stability.md#stable) clients. 
@@ -89,7 +91,7 @@ store.process_order("customerA", {"tomato": 1}) When the store is closed, we will report the following metrics: -### Order info +**Order info:** | Store | Customer | Number of Orders | Amount (USD) | | -------- | --------- | ---------------- | ------------ | @@ -97,7 +99,7 @@ When the store is closed, we will report the following metrics: | Portland | customerB | 1 | 30.00 | | Portland | customerC | 1 | 2.00 | -### Items sold +**Items sold:** | Store | Customer | Item | Count | | -------- | --------- | ------ | ----- | @@ -126,47 +128,45 @@ library (HTTP lib owned by X) and the server app (owned by Y): ### Library Requirements -The library developer (developer X) will expose the following metrics out of -box: +The library developer (developer X) will provide two libraries: -### Pull Metrics +* Server climate control library - a library which monitors and controls the + temperature and humidity of the server. +* HTTP server library - a library which provides HTTP service. -These are pull metrics - the value is always available, and is only reported and -collected based on the ask from consumer(s). If there is no ask from the -consumer, the value will not be reported at all (e.g. there is no API call to -fetch the room temperature unless someone is asking for the room temperature). +Both libraries will provide out-of-box metrics, the metrics have two categories: -#### Process CPU Usage +* Push metrics - the value is reported (via the API) when it is available, and + collected (via the SDK) based on the ask from consumer(s). If there is no ask + from the consumer, the API will be no-op and the data will be dropped on the + floor. +* Pull metrics - the value is always available, and is only reported and + collected based on the ask from consumer(s). If there is no ask from the + consumer, the value will not be reported at all (e.g. there is no API call to + fetch the temperature unless someone is asking for the temperature). 
+ +#### Server Climate Control Library Note: the **Host Name** should leverage [`OpenTelemetry Resource`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md), so it should be covered by the metrics SDK rather than API, and strictly speaking it is not considered as a "dimension" from the SDK perspective. -| Host Name | Process ID | CPU% [0.0, 100.0] | -| --------- | ---------- | ----------------- | -| MachineA | 1234 | 15.3 | - -#### System CPU Usage - -| Host Name | CPU% [0, 100] | -| --------- | ------------- | -| MachineA | 30 | - -#### Server Room Temperature +**Server temperature:** | Host Name | Temperature (F) | | --------- | --------------- | | MachineA | 65.3 | -### Push Metrics +**Server humidity:** + +| Host Name | Humidity (%) | +| --------- | ------------ | +| MachineA | 21 | -These are the push metrics - the value is reported (via the API) when it is -available, and collected (via the SDK) based on the ask from consumer(s). If -there is no ask from the consumer, the API will be no-op and the data will be -dropped on the floor. +#### HTTP Server Library -#### Received HTTP Requests +**Received HTTP requests:** Note: the **Client Type** is passed in via the [`OpenTelemetry Baggage`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md), @@ -179,7 +179,7 @@ strictly speaking it is not part of the metrics API, but it is considered as a | MachineA | 1234 | Android | POST | otel.org | 1.1 | 127.0.0.1 | 51328 | 127.0.0.1 | 80 | | MachineA | 1234 | iOS | PUT | otel.org | 1.1 | 127.0.0.1 | 51329 | 127.0.0.1 | 80 | -#### HTTP Server Duration +**HTTP server request duration:** Note: the server duration is only available for **finished HTTP requests**. @@ -192,9 +192,9 @@ Note: the server duration is only available for **finished HTTP requests**. 
The application owner (developer Y) would only want the following metrics: -* [System CPU Usage](#system-cpu-usage) reported every 5 seconds +* Server temperature - reported every 5 seconds * [Server Room Temperature](#server-room-temperature) reported every minute -* [HTTP Server Duration](#http-server-duration), reported every 5 seconds, with +* [HTTP Server Request Duration](#http-server-duration), reported every 5 seconds, with a subset of the dimensions: * Host Name * HTTP Method From e69b88a3aaa9ff1ae9bfaa7f6eedf57eb3cd9ddd Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Thu, 18 Feb 2021 21:15:34 -0800 Subject: [PATCH 13/20] fix nits --- text/metrics/0146-metrics-prototype-scenarios.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md index 928caeded..0394428f6 100644 --- a/text/metrics/0146-metrics-prototype-scenarios.md +++ b/text/metrics/0146-metrics-prototype-scenarios.md @@ -193,9 +193,9 @@ Note: the server duration is only available for **finished HTTP requests**. 
 The application owner (developer Y) would only want the following metrics:
 
 * Server temperature - reported every 5 seconds
-* [Server Room Temperature](#server-room-temperature) reported every minute
-* [HTTP Server Request Duration](#http-server-duration), reported every 5 seconds, with
-  a subset of the dimensions:
+* Server humidity - reported every minute
+* HTTP server request duration, reported every 5 seconds, with a subset of the
+  dimensions:
   * Host Name
   * HTTP Method
   * HTTP Host

From dd40330fe07fbd032d83217166cc9d129251524d Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 19 Feb 2021 11:17:23 -0800
Subject: [PATCH 14/20] Update text/metrics/0146-metrics-prototype-scenarios.md

Co-authored-by: Leighton Chen
---
 text/metrics/0146-metrics-prototype-scenarios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index 0394428f6..1de26f288 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -36,7 +36,7 @@ Here are the languages we've agreed to use during the prototyping:
 * Java
 * Python
 
-Instead of boiling the ocean, we will need to divide the work into multiple
+In order to not undertake such an enormous task at once, we will need to have an incremental approach and divide the work into multiple
 stages:
 
 1. Do the end-to-end prototype to get the overall understanding of the problem

From 9e1666cccc10e9423d54594386c57542a546be1e Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 19 Feb 2021 11:17:40 -0800
Subject: [PATCH 15/20] Update text/metrics/0146-metrics-prototype-scenarios.md

Co-authored-by: Leighton Chen
---
 text/metrics/0146-metrics-prototype-scenarios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index 1de26f288..af8f0c871 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -59,7 +59,7 @@ stages:
 
 4. Replicate stage 2 to cover the complete set of APIs.
 
-5. Replicate stage 4 to cover the complete set of SDKs.
+5. Replicate stage 3 to cover the complete set of SDKs.
 
 ## Scenario 1: Grocery
 

From 415c794cf03ae59bf21a9814ab5fb9a60973da5b Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 19 Feb 2021 11:21:13 -0800
Subject: [PATCH 16/20] adjust wording

---
 text/metrics/0146-metrics-prototype-scenarios.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index af8f0c871..103306a31 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -10,10 +10,9 @@ before end of 2021:
 
 * By end of 5/31/2021, we should have a good confidence that we can recommend
   language client owners to work on metrics preview release. This means starting
-  from 6/1/2021 the specification should not have major surprises or big changes
-  which would thrust the language client maintainers, and we start to recommend
-  client maintainers to implement it. We might introduce additional features but
-  there should be a high bar.
+  from 6/1/2021 the specification should not have major surprises or big
+  changes. We will then start recommending client maintainers to implement it.
+  We might introduce additional features but there should be a high bar.
 
 * By end of 9/30/2021, we should mark the metrics API/SDK specification as
   [`Feature-freeze`](https://github.com/open-telemetry/opentelemetry-specification/blob/1afab39e5658f807315abf2f3256809293bfd421/specification/document-status.md#feature-freeze),

From f4a6f5d600ac2fe81b1454d236dd8842072f5638 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 19 Feb 2021 11:22:35 -0800
Subject: [PATCH 17/20] fix typo

---
 text/metrics/0146-metrics-prototype-scenarios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index 103306a31..9dce9d892 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -16,7 +16,7 @@ before end of 2021:
 
 * By end of 9/30/2021, we should mark the metrics API/SDK specification as
   [`Feature-freeze`](https://github.com/open-telemetry/opentelemetry-specification/blob/1afab39e5658f807315abf2f3256809293bfd421/specification/document-status.md#feature-freeze),
-  and focusing on bug fixing or editorial changes.
+  and focus on bug fixing or editorial changes.
 
 * By end of 11/30/2021, we want to have a stable release of metrics API/SDK
   specification, with multiple language SIGs providing RC (release candidate) or

From 71fdfdd7710bdf9fa3e8a34883c78cdac38089d5 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Wed, 24 Feb 2021 16:54:51 -0800
Subject: [PATCH 18/20] address review comment

---
 text/metrics/0146-metrics-prototype-scenarios.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index 9dce9d892..2437a8009 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -120,6 +120,9 @@ library (HTTP lib owned by X) and the server app (owned by Y):
   * X doesn't know which metrics and which dimensions will Y pick
   * X doesn't know the aggregation time window, nor the final destination of
     the metrics
+  * X would like to provide some default recommendation (e.g. default
+    dimensions, aggregation time window, histogram buckets) so consumers of his
+    library can have a better onboarding experience.
 * How developer Y could configure the SDK and exporter
   * How should Y hook up the metrics SDK with the library
   * How should Y configure the time window(s) and destination(s)

From 992e0ab891d0144fae2298a7f82a23f1f87cc135 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Wed, 24 Feb 2021 17:04:19 -0800
Subject: [PATCH 19/20] address comments

---
 text/metrics/0146-metrics-prototype-scenarios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index 2437a8009..f7188e7a4 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -88,7 +88,7 @@ store.process_order("customerC", {"potato": 2})
 store.process_order("customerA", {"tomato": 1})
 ```
 
-When the store is closed, we will report the following metrics:
+We will need the following metrics every minute:
 
 **Order info:**
 

From eaa6a10c7456c196955193663eb1a76ab934316e Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Wed, 24 Feb 2021 18:52:04 -0800
Subject: [PATCH 20/20] Update text/metrics/0146-metrics-prototype-scenarios.md

Co-authored-by: Cijo Thomas
---
 text/metrics/0146-metrics-prototype-scenarios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/metrics/0146-metrics-prototype-scenarios.md b/text/metrics/0146-metrics-prototype-scenarios.md
index f7188e7a4..fe370bde3 100644
--- a/text/metrics/0146-metrics-prototype-scenarios.md
+++ b/text/metrics/0146-metrics-prototype-scenarios.md
@@ -113,7 +113,7 @@ The _HTTP Server_ scenario covers how a library developer X could use metrics
 API to instrument a library, and how the application developer Y can configure
 the library to use OpenTelemetry SDK in a final application. X and Y are working
 for different companies and they don't communicate. The demo has two parts - the
-library (HTTP lib owned by X) and the server app (owned by Y):
+library (HTTP lib and ClimateControl lib owned by X) and the server app (owned by Y):
 
 * How developer X could instrument the library code in a vendor agnostic way
   * Performance is critical for X