From 0024c8f131bc566228fee06aa25ad2e83dc709ab Mon Sep 17 00:00:00 2001 From: Philipp Hofmann Date: Wed, 9 Apr 2025 10:20:38 +0200 Subject: [PATCH 1/5] docs(sdks): Batch Processor This PR migrates the SpansAggregator RFC to the develop docs. As logs also have to use some aggregation, we renamed the SpansAggregator to BatchProcessor so it can be used with any type of telemetry data. --- .../sdk/telemetry/spans/batch-processor.mdx | 146 ++++++++++++++++++ 1 file changed, 146 insertions(+) create mode 100644 develop-docs/sdk/telemetry/spans/batch-processor.mdx diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx new file mode 100644 index 00000000000000..f39f7eb6c5d3b3 --- /dev/null +++ b/develop-docs/sdk/telemetry/spans/batch-processor.mdx @@ -0,0 +1,146 @@ +--- +title: Batch Processor +--- + + + 🚧 This document is work in progress. + + + + This document uses key words such as "MUST", "SHOULD", and "MAY" as defined in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) to indicate requirement levels. + + +When an SDK implements span streaming or logs, it MUST batch multiple spans and logs into envelopes to reduce the number of HTTP. SDKs MUST implement a BatchProcessor to achieve this. The BatchProcessor keeps finished spans and logs in memory and batches them together in envelopes. It uses a combination of timeout and [weight](#weight) to decide when to batch its spans and logs into an envelope and send it to Sentry. +The SDK SHOULD use the BatchProcessor in the client because the transport SHOULD NOT be aware of spans or logs. The SDK MAY deviate from this approach. The SDK MUST call filtering and sampling before adding spans or logs to the BatchProcessor. This concept is similar to [OpenTelemetry's Batch Processors](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). 
+ +The BatchProcessor starts a timeout of `x` seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor sends all spans or logs no matter how many items it contains. The BatchProcessor also sends all items after the SDK captures spans or logs with weight more than `y`. When the BatchProcessor sends all spans or logs, it resets its timeout and removes all spans and logs in the BatchProcessor. When a span and its children have more weight than the max BatchProcessor weight `y`, the BatchProcessor MUST send the spans or logs together in one envelope directly to Sentry. + +The specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/) and uses `x = 10` seconds for the timeout and `y = 1024 * 1024` for the maximum batch byte size in the BatchProcessor. SDKs MAY use different values for `x` and `y` depending on their needs. If the timeout is set to `0`, then the SDK sends every span and log immediately. Initially, we don't plan adding options for these variables, but we can make them configurable if required in the future, similar to the option `maxCacheItems`. The specification uses spans as an example, but the same applies to logs or any other future telemetry data. 
+ + +```Gherkin +Scenario: No spans in BatchProcessor 1 span added + Given no spans in the BatchProcessor + When the SDK adds 1 span + Then the SDK adds this span to the BatchProcessor + And starts a timeout of 10 seconds + And doesn't send the span to Sentry + +Scenario: Span added before timeout exceeds + Given 1 span in the BatchProcessor + Given 9.9 seconds pass + When the SDK adds 1 span + Then the SDK adds this span to the BatchProcessor + And doesn't reset the timeout + And doesn't send the spans in the BatchProcessor to Sentry + +Scenario: Spans with size of y - 1 added, timeout exceeds + Given spans with size of y - 1 in the BatchProcessor + When the timeout exceeds + Then the SDK adds all the spans to one envelope + And sends them to Sentry + And resets the timeout + And clears the BatchProcessor + +Scenario: Spans with size of y added within 9.9 seconds + Given no spans in the BatchProcessor + When the SDK adds spans with a weight of y within 9.9 seconds + Then the SDK puts all spans into one envelope + And sends the envelope to Sentry + And resets the timeout + And clears the BatchProcessor + +Scenario: 1 span added app crashes + Given 1 span in the SpansAggregator + When the SDK detects a crash + Then the SDK does nothing with the BatchProcessor + And loses the spans in the BatchProcessor + +Scenario: Unfinished spans + Given no span is in the SpansAggregator + When the SDK starts a span but doesn't finish it + Then the SpansAggregator is empty + +Scenario: Spans in SpansAggregator, span with children + Given spans with a size of y - 1 in the BatchProcessor + When the SDK finishes a span with one child + Then the SDK puts the spans with a size of y - 1 already in the BatchProcessor into an envelope + And sends the envelope to Sentry. 
+ And stores the span with its child into the BatchProcessor + And resets the timeout + +Scenario: Span with more children than max BatchProcessor weight + Given one span A is in the BatchProcessor + When the SDK starts a span B + And starts child spans with a size of y for span B + When the SDK finishes the span B and all it's children + Then the SDK directly puts all spans of span B into one envelope + And sends the envelope to Sentry. + And doesn't store the spans of span B in the BatchProcessor + And keeps the existing span A in the BatchProcessor + And doesn't reset the timeout + +Scenario: Timeout set to 0 span without children + Given the timeout is set to 0 + When the SDK finishes one span without any children + Then the SDK puts the span into one one envelope + And sends the envelope to Sentry. + +Scenario: Timeout set to 0 span with children + Given the timeout is set to 0 + When the SDK finishes one span with children of a weight of 100 + Then the SDK puts the span with the children into one envelope + And sends the envelope to Sentry. + +Scenario: Timeout set to 0 spans without children + Given the timeout is set to 0 + When the SDK finishes two spans without any children + Then the SDK puts every span into one envelope + And sends both envelopes to Sentry. + +``` + +## Weight + +The SDK MUST implement a way to calculate the weight of a span or a log to manage the BatchProcessor's memory footprint. Depending on the serialization strategy, the SDK MAY either serialize the span or log into bytes and count these or serialize the span or log and recursively count the number of elements in the dictionary. Every key in a dictionary and every element in an array add a weight of one. For a detailed explanation of how to count the weight, see the example below. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and directly pass them to the envelope item to avoid serializing multiple times. 
+ +```JSON +{ + // All simple properties count as 1 so in total 12 + "timestamp": 1705031078.623853, + "start_timestamp": 1705031078.337715, + "description": "ExtraViewController full display", + "op": "ui.load.full_display", + "span_id": "794d0cba0ac64235", + "parent_span_id": "45054abc6ded413a", + "trace_id": "65880cfc084f4bd5ab3abc7d598b3c14", + "status": "ok", + "origin": "manual.ui.time_to_display", + "hash": "a925395473cfe97d", + "sampled": true, + "type": "trace", + + // The data object has 5 simple properties, which count as 5 + // and one list with 3 elements counting as 3 + "data": { + "frames.frozen": 0, + "frames.slow": 1, + "frames.total": 1, + "thread.id": 259, + "thread.name": "main", + "list" : [1, 2, 3] + }, + + // Tags count as 2 + "sentry_tags": { + "environment": "ui-tests", + "main_thread": "true", + }, + + // The weight is + // 12 (simple properties) + // 8 (data) + // 2 (tags) + // = 22 +} +``` From 6c5edb940fd293fccf2e44764aa95c5ecb277907 Mon Sep 17 00:00:00 2001 From: Philipp Hofmann Date: Thu, 10 Apr 2025 09:40:19 +0200 Subject: [PATCH 2/5] many updates --- .../sdk/telemetry/spans/batch-processor.mdx | 139 +++++------------- 1 file changed, 36 insertions(+), 103 deletions(-) diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx index f39f7eb6c5d3b3..abc57b8a9407c3 100644 --- a/develop-docs/sdk/telemetry/spans/batch-processor.mdx +++ b/develop-docs/sdk/telemetry/spans/batch-processor.mdx @@ -10,29 +10,32 @@ title: Batch Processor This document uses key words such as "MUST", "SHOULD", and "MAY" as defined in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) to indicate requirement levels. -When an SDK implements span streaming or logs, it MUST batch multiple spans and logs into envelopes to reduce the number of HTTP. SDKs MUST implement a BatchProcessor to achieve this. The BatchProcessor keeps finished spans and logs in memory and batches them together in envelopes. 
It uses a combination of timeout and [weight](#weight) to decide when to batch its spans and logs into an envelope and send it to Sentry. -The SDK SHOULD use the BatchProcessor in the client because the transport SHOULD NOT be aware of spans or logs. The SDK MAY deviate from this approach. The SDK MUST call filtering and sampling before adding spans or logs to the BatchProcessor. This concept is similar to [OpenTelemetry's Batch Processors](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). +The BatchProcessor batches spans and logs into one envelope to reduce the number of HTTP requests. When an SDK implements span streaming or logs, it MUST use a BatchProcessor, which is similar to [OpenTelemetry's Batch Processors](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). The BatchProcessor holds finished spans and logs in memory and batches them together in envelopes. It uses a combination of time and size-based batching. When writing this, the BatchProcessor only handles spans and logs, but the SDK MAY use it for other telemetry data in the future. -The BatchProcessor starts a timeout of `x` seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor sends all spans or logs no matter how many items it contains. The BatchProcessor also sends all items after the SDK captures spans or logs with weight more than `y`. When the BatchProcessor sends all spans or logs, it resets its timeout and removes all spans and logs in the BatchProcessor. When a span and its children have more weight than the max BatchProcessor weight `y`, the BatchProcessor MUST send the spans or logs together in one envelope directly to Sentry. 
+## Specification -The specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/) and uses `x = 10` seconds for the timeout and `y = 1024 * 1024` for the maximum batch byte size in the BatchProcessor. SDKs MAY use different values for `x` and `y` depending on their needs. If the timeout is set to `0`, then the SDK sends every span and log immediately. Initially, we don't plan adding options for these variables, but we can make them configurable if required in the future, similar to the option `maxCacheItems`. The specification uses spans as an example, but the same applies to logs or any other future telemetry data. +Whenever the SDK finishes a span or captures a log, it MUST put these into the BatchProcessor. The SDK MUST NOT put unfinished spans into the BatchProcessor. The BatchProcessor MUST start a timeout of `x` seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor MUST send all spans or logs, no matter how many items it contains. The BatchProcessor MUST send all items after the SDK captures spans or logs with a size more than `y`. When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK SHOULD drop rate limited spans and logs before putting them into the BatchProcessor to reduce memory usage. + +The SDK MUST calculate the size of a span or a log to manage the BatchProcessor's memory footprint. The SDK MUST serialize the span or log and calculate the size based on the serialized JSON bytes. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and pass these to the envelope to avoid serializing multiple times. 
+ +The detailed specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/) and uses `x = 10` seconds for the timeout and `y = 1024 * 1024` for the maximum batch byte size in the BatchProcessor. SDKs MAY use different values for `x` and `y` depending on their needs. The SDK SHOULD NOT expose x and y via the options. The specification uses spans as an example, but the same applies to logs or any other future telemetry data. ```Gherkin Scenario: No spans in BatchProcessor 1 span added Given no spans in the BatchProcessor - When the SDK adds 1 span - Then the SDK adds this span to the BatchProcessor + When the SDK finishes 1 span + Then the SDK puts this span to the BatchProcessor And starts a timeout of 10 seconds And doesn't send the span to Sentry Scenario: Span added before timeout exceeds - Given 1 span in the BatchProcessor + Given span A in the BatchProcessor Given 9.9 seconds pass - When the SDK adds 1 span - Then the SDK adds this span to the BatchProcessor + When the SDK finishes span B + Then the SDK adds span B to the BatchProcessor And doesn't reset the timeout - And doesn't send the spans in the BatchProcessor to Sentry + And doesn't send the spans A and B in the BatchProcessor to Sentry Scenario: Spans with size of y - 1 added, timeout exceeds Given spans with size of y - 1 in the BatchProcessor @@ -43,104 +46,34 @@ Scenario: Spans with size of y - 1 added, timeout exceeds And clears the BatchProcessor Scenario: Spans with size of y added within 9.9 seconds - Given no spans in the BatchProcessor - When the SDK adds spans with a weight of y within 9.9 seconds - Then the SDK puts all spans into one envelope + Given spans with size of y - 1 in the BatchProcessor + When the SDK finishes another span and puts it into the BatchProcessor + Then the BatchProcessor puts all spans into one envelope And sends the envelope to Sentry And resets the timeout And clears the BatchProcessor -Scenario: 1 span added app crashes - Given 1 
span in the SpansAggregator - When the SDK detects a crash - Then the SDK does nothing with the BatchProcessor - And loses the spans in the BatchProcessor - Scenario: Unfinished spans - Given no span is in the SpansAggregator + Given no span is in the BatchProcessor When the SDK starts a span but doesn't finish it - Then the SpansAggregator is empty - -Scenario: Spans in SpansAggregator, span with children - Given spans with a size of y - 1 in the BatchProcessor - When the SDK finishes a span with one child - Then the SDK puts the spans with a size of y - 1 already in the BatchProcessor into an envelope - And sends the envelope to Sentry. - And stores the span with its child into the BatchProcessor - And resets the timeout - -Scenario: Span with more children than max BatchProcessor weight - Given one span A is in the BatchProcessor - When the SDK starts a span B - And starts child spans with a size of y for span B - When the SDK finishes the span B and all it's children - Then the SDK directly puts all spans of span B into one envelope - And sends the envelope to Sentry. - And doesn't store the spans of span B in the BatchProcessor - And keeps the existing span A in the BatchProcessor - And doesn't reset the timeout - -Scenario: Timeout set to 0 span without children - Given the timeout is set to 0 - When the SDK finishes one span without any children - Then the SDK puts the span into one one envelope - And sends the envelope to Sentry. - -Scenario: Timeout set to 0 span with children - Given the timeout is set to 0 - When the SDK finishes one span with children of a weight of 100 - Then the SDK puts the span with the children into one envelope - And sends the envelope to Sentry. - -Scenario: Timeout set to 0 spans without children - Given the timeout is set to 0 - When the SDK finishes two spans without any children - Then the SDK puts every span into one envelope - And sends both envelopes to Sentry. 
- -``` + Then the BatchProcessor is empty + +Scenario: Span filtered out + Given no span is in the BatchProcessor + When the SDK finishes a span + And the span is filtered out + Then the BatchProcessor is empty + +Scenario: Span not sampled + Given no span is in the BatchProcessor + When the SDK finishes a span + And the span is not sampled + Then the BatchProcessor is empty + +Scenario: 1 span added application crashes + Given 1 span in the BatchProcessor + When the SDK detects a crash + Then the SDK does nothing with the items in the BatchProcessor + And loses the spans in the BatchProcessor -## Weight - -The SDK MUST implement a way to calculate the weight of a span or a log to manage the BatchProcessor's memory footprint. Depending on the serialization strategy, the SDK MAY either serialize the span or log into bytes and count these or serialize the span or log and recursively count the number of elements in the dictionary. Every key in a dictionary and every element in an array add a weight of one. For a detailed explanation of how to count the weight, see the example below. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and directly pass them to the envelope item to avoid serializing multiple times. 
- -```JSON -{ - // All simple properties count as 1 so in total 12 - "timestamp": 1705031078.623853, - "start_timestamp": 1705031078.337715, - "description": "ExtraViewController full display", - "op": "ui.load.full_display", - "span_id": "794d0cba0ac64235", - "parent_span_id": "45054abc6ded413a", - "trace_id": "65880cfc084f4bd5ab3abc7d598b3c14", - "status": "ok", - "origin": "manual.ui.time_to_display", - "hash": "a925395473cfe97d", - "sampled": true, - "type": "trace", - - // The data object has 5 simple properties, which count as 5 - // and one list with 3 elements counting as 3 - "data": { - "frames.frozen": 0, - "frames.slow": 1, - "frames.total": 1, - "thread.id": 259, - "thread.name": "main", - "list" : [1, 2, 3] - }, - - // Tags count as 2 - "sentry_tags": { - "environment": "ui-tests", - "main_thread": "true", - }, - - // The weight is - // 12 (simple properties) - // 8 (data) - // 2 (tags) - // = 22 -} ``` From b04356e7b163f270327b120c863c1806134c44a9 Mon Sep 17 00:00:00 2001 From: Philipp Hofmann Date: Thu, 10 Apr 2025 11:28:29 +0200 Subject: [PATCH 3/5] timeout and max size --- .../sdk/telemetry/spans/batch-processor.mdx | 22 +++++++++++-------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx index abc57b8a9407c3..fc8b3e1c110a6c 100644 --- a/develop-docs/sdk/telemetry/spans/batch-processor.mdx +++ b/develop-docs/sdk/telemetry/spans/batch-processor.mdx @@ -14,11 +14,15 @@ The BatchProcessor batches spans and logs into one envelope to reduce the number ## Specification -Whenever the SDK finishes a span or captures a log, it MUST put these into the BatchProcessor. The SDK MUST NOT put unfinished spans into the BatchProcessor. The BatchProcessor MUST start a timeout of `x` seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor MUST send all spans or logs, no matter how many items it contains. 
The BatchProcessor MUST send all items after the SDK captures spans or logs with a size more than `y`. When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK SHOULD drop rate limited spans and logs before putting them into the BatchProcessor to reduce memory usage. +Whenever the SDK finishes a span or captures a log, it MUST put these into the BatchProcessor. The SDK MUST NOT put unfinished spans into the BatchProcessor. -The SDK MUST calculate the size of a span or a log to manage the BatchProcessor's memory footprint. The SDK MUST serialize the span or log and calculate the size based on the serialized JSON bytes. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and pass these to the envelope to avoid serializing multiple times. +The BatchProcessor MUST start a timeout of 5 seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor MUST send all spans or logs, no matter how many items it contains. The SDK MAY choose a different value for the timeout, but it MUST NOT exceed 30 seconds, as this can lead to problems with the span buffer on the backend, which uses a time interval of 60 seconds for determining segments for spans. -The detailed specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/) and uses `x = 10` seconds for the timeout and `y = 1024 * 1024` for the maximum batch byte size in the BatchProcessor. SDKs MAY use different values for `x` and `y` depending on their needs. The SDK SHOULD NOT expose x and y via the options. The specification uses spans as an example, but the same applies to logs or any other future telemetry data. +The BatchProcessor MUST send all items when the spans or logs it contains exceed 1 MiB in size. 
The SDK MAY choose a different value for the max batch size keeping the [envelope max sizes](/sdk/data-model/envelopes/#size-limits) in mind. The SDK MUST calculate the size of a span or a log to manage the BatchProcessor's memory footprint. The SDK MUST serialize the span or log and calculate the size based on the serialized JSON bytes. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and pass these to the envelope to avoid serializing multiple times. + +When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK SHOULD drop rate limited spans and logs before putting them into the BatchProcessor to reduce memory usage. + +The detailed specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/). The specification uses spans as an example, but the same applies to logs or any other future telemetry data. 
```Gherkin @@ -26,27 +30,27 @@ Scenario: No spans in BatchProcessor 1 span added Given no spans in the BatchProcessor When the SDK finishes 1 span Then the SDK puts this span to the BatchProcessor - And starts a timeout of 10 seconds + And starts a timeout of 5 seconds And doesn't send the span to Sentry Scenario: Span added before timeout exceeds Given span A in the BatchProcessor - Given 9.9 seconds pass + Given 4.9 seconds pass When the SDK finishes span B Then the SDK adds span B to the BatchProcessor And doesn't reset the timeout And doesn't send the spans A and B in the BatchProcessor to Sentry -Scenario: Spans with size of y - 1 added, timeout exceeds - Given spans with size of y - 1 in the BatchProcessor +Scenario: Spans with size of MiB - 1 byte added, timeout exceeds + Given spans with size of MiB - 1 byte in the BatchProcessor When the timeout exceeds Then the SDK adds all the spans to one envelope And sends them to Sentry And resets the timeout And clears the BatchProcessor -Scenario: Spans with size of y added within 9.9 seconds - Given spans with size of y - 1 in the BatchProcessor +Scenario: Spans with size of MiB - 1 byte added within 4.9 seconds + Given spans with size of MiB - 1 byte in the BatchProcessor When the SDK finishes another span and puts it into the BatchProcessor Then the BatchProcessor puts all spans into one envelope And sends the envelope to Sentry From bb430469b2d6e15fb7a6ff7b2112d406ec73280a Mon Sep 17 00:00:00 2001 From: Philipp Hofmann Date: Thu, 10 Apr 2025 11:45:04 +0200 Subject: [PATCH 4/5] change rate limiting --- develop-docs/sdk/telemetry/spans/batch-processor.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx index fc8b3e1c110a6c..48678dd21caa8a 100644 --- a/develop-docs/sdk/telemetry/spans/batch-processor.mdx +++ b/develop-docs/sdk/telemetry/spans/batch-processor.mdx @@ -20,7 +20,7 @@ The 
BatchProcessor MUST start a timeout of 5 seconds when the SDK adds the first The BatchProcessor MUST send all items when the spans or logs it contains exceed 1 MiB in size. The SDK MAY choose a different value for the max batch size keeping the [envelope max sizes](/sdk/data-model/envelopes/#size-limits) in mind. The SDK MUST calculate the size of a span or a log to manage the BatchProcessor's memory footprint. The SDK MUST serialize the span or log and calculate the size based on the serialized JSON bytes. As serialization is expensive, the BatchProcessor SHOULD keep track of the serialized spans and logs and pass these to the envelope to avoid serializing multiple times. -When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK SHOULD drop rate limited spans and logs before putting them into the BatchProcessor to reduce memory usage. +When the BatchProcessor sends all spans or logs, it MUST reset its timeout and remove all spans and logs. The SDK MUST apply filtering and sampling before adding spans or logs to the BatchProcessor. The SDK MUST apply rate limits to spans and logs after they leave the BatchProcessor to send as much data as possible by dropping data as late as possible. The detailed specification is written in the [Gherkin syntax](https://cucumber.io/docs/gherkin/reference/). The specification uses spans as an example, but the same applies to logs or any other future telemetry data. 
From e64244c15e0e52e0920ed4e92841647233765f1f Mon Sep 17 00:00:00 2001 From: Philipp Hofmann Date: Thu, 10 Apr 2025 14:26:02 +0200 Subject: [PATCH 5/5] final touches --- develop-docs/sdk/telemetry/spans/batch-processor.mdx | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/develop-docs/sdk/telemetry/spans/batch-processor.mdx b/develop-docs/sdk/telemetry/spans/batch-processor.mdx index 48678dd21caa8a..1b054cc78fc22b 100644 --- a/develop-docs/sdk/telemetry/spans/batch-processor.mdx +++ b/develop-docs/sdk/telemetry/spans/batch-processor.mdx @@ -10,11 +10,11 @@ title: Batch Processor This document uses key words such as "MUST", "SHOULD", and "MAY" as defined in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) to indicate requirement levels. -The BatchProcessor batches spans and logs into one envelope to reduce the number of HTTP requests. When an SDK implements span streaming or logs, it MUST use a BatchProcessor, which is similar to [OpenTelemetry's Batch Processors](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). The BatchProcessor holds finished spans and logs in memory and batches them together in envelopes. It uses a combination of time and size-based batching. When writing this, the BatchProcessor only handles spans and logs, but the SDK MAY use it for other telemetry data in the future. +The BatchProcessor batches spans and logs into one envelope to reduce the number of HTTP requests. When an SDK implements span streaming or logs, it MUST use a BatchProcessor, which is similar to [OpenTelemetry's Batch Processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). The BatchProcessor holds logs and finished spans in memory and batches them together into envelopes. It uses a combination of time and size-based batching. 
At the time of writing, the BatchProcessor only handles spans and logs, but an SDK MAY use it for other telemetry data in the future. ## Specification -Whenever the SDK finishes a span or captures a log, it MUST put these into the BatchProcessor. The SDK MUST NOT put unfinished spans into the BatchProcessor. +Whenever the SDK finishes a span or captures a log, it MUST put it into the BatchProcessor. The SDK MUST NOT put unfinished spans into the BatchProcessor. The BatchProcessor MUST start a timeout of 5 seconds when the SDK adds the first span or log. When the timeout exceeds, the BatchProcessor MUST send all spans or logs, no matter how many items it contains. The SDK MAY choose a different value for the timeout, but it MUST NOT exceed 30 seconds, as this can lead to problems with the span buffer on the backend, which uses a time interval of 60 seconds for determining segments for spans. @@ -41,16 +41,16 @@ Scenario: Span added before timeout exceeds And doesn't reset the timeout And doesn't send the spans A and B in the BatchProcessor to Sentry -Scenario: Spans with size of MiB - 1 byte added, timeout exceeds - Given spans with size of MiB - 1 byte in the BatchProcessor +Scenario: Spans with size of 1 MiB - 1 byte added, timeout exceeds + Given spans with size of 1 MiB - 1 byte in the BatchProcessor When the timeout exceeds Then the SDK adds all the spans to one envelope And sends them to Sentry And resets the timeout And clears the BatchProcessor -Scenario: Spans with size of MiB - 1 byte added within 4.9 seconds - Given spans with size of MiB - 1 byte in the BatchProcessor +Scenario: Spans with size of 1 MiB - 1 byte added within 4.9 seconds + Given spans with size of 1 MiB - 1 byte in the BatchProcessor When the SDK finishes another span and puts it into the BatchProcessor Then the BatchProcessor puts all spans into one envelope And sends the envelope to Sentry