From fc3c7ebf8890e5ab21abc543ae16fd44f4084d2d Mon Sep 17 00:00:00 2001
From: David Herberth <david.herberth@sentry.io>
Date: Mon, 14 Oct 2024 14:25:29 +0200
Subject: [PATCH 1/2] feat(relay): Add document describing span and transaction
 quotas

---
 .../relay/transaction-span-ratelimits.mdx     | 230 ++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 develop-docs/relay/transaction-span-ratelimits.mdx
diff --git a/develop-docs/relay/transaction-span-ratelimits.mdx b/develop-docs/relay/transaction-span-ratelimits.mdx
new file mode 100644
index 0000000000000..14c3013ae8866
--- /dev/null
+++ b/develop-docs/relay/transaction-span-ratelimits.mdx
@@ -0,0 +1,230 @@
+---
+title: Transaction and Span Rate Limiting
+---
+
+<Alert level="warning" title="Important">
+This document describes a desired state, it is not yet fully implemented in Relay.
+</Alert>
+
+
+Relay enforces quotas defined in Sentry and propagates them as rate limits. In most cases,
+this is a simple one-to-one mapping of data category to envelope item.
+
+The transaction and span rate limits are more complicated. Spans and Transactions both
+have a total and an additional indexed category. They are also closely related, a transaction is a container for spans.
+A dropped transaction results in dropped spans.
+
+The following document describes how transaction and span quotas interact with each other within Relay.
+
+<Alert level="info" title="Important">
+This section describes how Relay has to interpret quotas, SDKs and other consumers
+should read [SDK Development / Rate Limiting](/sdk/rate-limiting/).
+</Alert>
+
+
+Related Documentation:
+- [The Indexed Outcome Category](/application/dynamic-sampling/outcomes/)
+
+
+
+## Enforcement
+
+The following rules hold true:
+- A quota for the transaction category is equivalent to a quota for the spans category.
+- A quota for the spans category is equivalent to a quota for the transaction category.
+- When a transaction is dropped, all the contained spans should be reported in outcomes.
+- If either transactions or spans are rate limited, clients should receive a limit for both categories.
+- Indexed quotas only affect the payload for their respective category.
+
+These rules can be visualized using this table:
+
+|                         | Transaction Payload | Transaction Metric | Span Payload | Span Metric |
+| ----------------------- | ------------------- | ------------------ | ------------ | ----------- |
+| **Transaction**         | ❌                  | ❌                 | ❌           | ❌          |
+| **Transaction Indexed** | ❌                  | ✅                 | ✅           | ✅          |
+| **Span**                | ❌                  | ❌                 | ❌           | ❌          |
+| **Span Indexed**        | ✅                  | ✅                 | ❌           | ✅          |
+
+
+## Outcomes
+
+Outcomes must be generated in Relay for every span and transaction which is dropped. Usually a dropped
+span/transaction results in outcomes for their respective total and indexed category, refer to
+[The Indexed Outcome Category](/application/dynamic-sampling/outcomes/) for details.
+
+This is straight forward for all indexed rate limits, they only drop the payload which results in a single
+negative outcome in the indexed category of the item. A `transaction_indexed` rate limit does not
+cause any spans to be dropped and vice versa.
+
+A span quota and the resulting span rate limit is also trivial for standalone spans received by Relay,
+the standalone span is dropped, and a single outcome is generated.
+
+### Transaction Outcomes
+
+Transactions are containers for spans until they are extracted in Relay. This span extraction can happen at any
+Relay stage: customer managed, PoP-Relay or Processing-Relay. Until spans are extracted from Relay, a dropped transaction
+should count the contained spans and generate an outcome with the contained span quantity + 1, for the segment span
+which would be generated from the transaction itself.
+
+<Note>
+While it is desirable to have span counts correctly extracted from dropped transactions, it may not be feasible
+to do so at any stage of the processing pipeline. For example, it may not be possible to do so (malformed transactions)
+or simply too expensive to compute.
+</Note>
+
+After spans have been extracted, the transaction is no longer a container of span items and just represents itself,
+thus, a dropped transaction with spans already extracted only generates outcomes for the total transactions and
+indexed transaction categories.
+
+<Note>
+Span quotas are equivalent to transactions quotas, so all the above also applies for a span quota.
+</Note>
+
+
+## Examples
+
+### Example: Span Indexed Quota
+
+**Quota**:
+
+```json
+{
+  "categories": ["span_indexed"],
+  "limit": 0
+  // ...
+}
+```
+
+**Transaction**:
+
+```json
+{
+  "type": "transaction",
+  "spans": [
+      { .. },
+      { .. },
+      { .. }
+  ],
+  // ...
+}
+```
+
+An envelope containing a transaction with 3 child spans generates 4 outcomes for rate limited spans in the
+`spans_indexed` category. 1 count for the generated segment span from the transaction and 3 counts for the
+contained spans. The transaction itself will still be ingested.
+
+**Ingestion**:
+
+| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics |
+| ------------------- | ------------------- | ------------ | ------------ |
+| ✅                  | ✅                  | ❌           | ✅           |
+
+**Outcomes**:
+
+| `transaction` | `transaction_indexed` | `span` | `span_indexed` |
+| ------------- | --------------------- | ------ | -------------- |
+| 0             | 0                     | 0      | 4              |
+
+**Rate Limits propagated to SDKs:** None.
+
+
+### Example: Transaction Quota
+
+**Quota**:
+
+```json
+{
+  "categories": ["transaction"],
+  "limit": 0
+  // ...
+}
+```
+
+**Transaction**:
+
+```json
+{
+  "type": "transaction",
+  "spans": [
+      { .. },
+      { .. },
+      { .. }
+  ],
+  // ...
+}
+```
+
+**Ingestion**:
+
+| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics |
+| ------------------- | ------------------- | ------------ | ------------ |
+| ❌                  | ❌                  | ❌           | ❌           |
+
+**Outcomes**:
+
+| `transaction` | `transaction_indexed` | `span` | `span_indexed` |
+| ------------- | --------------------- | ------ | -------------- |
+| 1             | 1                     | 4      | 4              |
+
+**Rate Limits propagated to SDKs:** Transaction, Span.
+
+
+## FAQ / Reasoning
+
+### Why can transaction limits be propagated as span limits and vice versa?
+
+At first glance this is non-obvious, spans can exist without transactions
+and also transactions can exist without (standalone) spans, so why can the quotas
+be used interchangeably?
+
+The reasoning lies in the Sentry product and how the information is used within Sentry.
+Because of Relay, Sentry can safely assume spans exist if the user/SDK sends transactions
+and more and more of the product is built on the basis of spans. From this point of view
+transactions and spans convey the same information, it is just represented differently.
+This paradigm shift from transaction to spans is also visible in the billing. AM2 and before
+only used transactions for billing, whereas AM3 uses spans now. But AM3 still accepts transactions
+it just bills on spans. Disabling ingestion for spans on AM3 therefore also would require the same
+quota to be set for transactions, but since quotas are tracked separately a transaction quota
+may never trigger the span quota or vice versa.
+
+The logical conclusion and simplification is, to just treat span and transaction rate limits
+equally (important: treat the limits the same, the quotas are still tracked separately).
+Another upside is, enforcing these limits on SDKs does not require extra logic, it is all contained
+within Relay.
+
+### Why do span outcomes and transaction outcomes differ?
+
+There can be multiple reasons for this in the entire pipeline, these are just some reasons why there
+can be differences caused by Relay:
+
+- Parsing the span count from a transaction is too expensive and may be omitted. This is an extremely
+important property of Relay, abuse cases must be handled as fast as possible with as little cost ($ and resources)
+as possible. In some cases, it may not be feasible to JSON parse the transaction to extract a span count.
+- A envelope item may be malformed, there will be outcomes generated for the inferred data category (span or transaction),
+but the span count cannot be recovered from an invalid transaction.
+
+
+### I want Relay to enforce a quota, which category should I use?
+
+From top to bottom:
+
+- Do you completely disable ingestion? Configure the limit for the `span` and `transaction` data categories.
+- Is it billing related? For example, spike protection operates on the category of the billing unit
+(AM2: transactions, AM3: spans), use the respective data category for the limit.
+- Use the total category (not indexed) which makes most sense to you. When in doubt, ask the Relay team.
+
+### Should I ever use an indexed category in a Quota?
+
+No.
+
+Indexed quotas are only useful to protect downstream infrastructure through abuse quotas.
+They are inherently more expensive to enforce, cannot be propagated to clients and are generally a sign
+of misconfiguration.
+
+Dynamic-, smart- or client side-sampling should prevent any indexed quota from being enforced.
+
+### What does this mean for standalone spans?
+
+They are not treated differently from extracted spans. After metrics extraction, which may happen in customer
+Relays, there is no more distinction. Having no special treatment for standalone spans also means we do not
+need any special logic in the SDKs.

From 777f9e024ce14dbae43583911bc5fa7580de8db8 Mon Sep 17 00:00:00 2001
From: David Herberth <david.herberth@sentry.io>
Date: Tue, 29 Oct 2024 15:42:43 +0100
Subject: [PATCH 2/2] review changes

---
 .../relay/transaction-span-ratelimits.mdx     | 37 +++++++++----------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/develop-docs/relay/transaction-span-ratelimits.mdx b/develop-docs/relay/transaction-span-ratelimits.mdx
index 14c3013ae8866..e13c3e239fd55 100644
--- a/develop-docs/relay/transaction-span-ratelimits.mdx
+++ b/develop-docs/relay/transaction-span-ratelimits.mdx
@@ -7,11 +7,11 @@ This document describes a desired state, it is not yet fully implemented in Rela
 </Alert>
 
 
-Relay enforces quotas defined in Sentry and propagates them as rate limits. In most cases,
+Relay enforces quotas defined in Sentry and propagates them as rate limits to clients. In most cases,
 this is a simple one-to-one mapping of data category to envelope item.
 
 The transaction and span rate limits are more complicated. Spans and Transactions both
-have a total and an additional indexed category. They are also closely related, a transaction is a container for spans.
+have a "total" and an additional "indexed" category. They are also closely related, a transaction is a container for spans.
 A dropped transaction results in dropped spans.
 
 The following document describes how transaction and span quotas interact with each other within Relay.
@@ -30,21 +30,23 @@ Related Documentation:
 ## Enforcement
 
 The following rules hold true:
-- A quota for the transaction category is equivalent to a quota for the spans category.
-- A quota for the spans category is equivalent to a quota for the transaction category.
+- A quota for the transaction category is also enforced for the span category.
+- A quota for the span category is also enforced for the transaction category.
 - When a transaction is dropped, all the contained spans should be reported in outcomes.
 - If either transactions or spans are rate limited, clients should receive a limit for both categories.
 - Indexed quotas only affect the payload for their respective category.
 
 These rules can be visualized using this table:
 
-|                         | Transaction Payload | Transaction Metric | Span Payload | Span Metric |
+| Rate Limit / Item       | Transaction Payload | Transaction Metric | Span Payload | Span Metric |
 | ----------------------- | ------------------- | ------------------ | ------------ | ----------- |
 | **Transaction**         | ❌                  | ❌                 | ❌           | ❌          |
 | **Transaction Indexed** | ❌                  | ✅                 | ✅           | ✅          |
 | **Span**                | ❌                  | ❌                 | ❌           | ❌          |
 | **Span Indexed**        | ✅                  | ✅                 | ❌           | ✅          |
 
+- ❌: Item rejected.
+- ✅: Item accepted.
 
 ## Outcomes
 
@@ -119,7 +121,7 @@ contained spans. The transaction itself will still be ingested.
 | ------------------- | ------------------- | ------------ | ------------ |
 | ✅                  | ✅                  | ❌           | ✅           |
 
-**Outcomes**:
+**Negative Outcomes**:
 
 | `transaction` | `transaction_indexed` | `span` | `span_indexed` |
 | ------------- | --------------------- | ------ | -------------- |
@@ -160,7 +162,7 @@ contained spans. The transaction itself will still be ingested.
 | ------------------- | ------------------- | ------------ | ------------ |
 | ❌                  | ❌                  | ❌           | ❌           |
 
-**Outcomes**:
+**Negative Outcomes**:
 
 | `transaction` | `transaction_indexed` | `span` | `span_indexed` |
 | ------------- | --------------------- | ------ | -------------- |
@@ -171,7 +173,7 @@ contained spans. The transaction itself will still be ingested.
 
 ## FAQ / Reasoning
 
-### Why can transaction limits be propagated as span limits and vice versa?
+### Why do transaction limits need to be propagated as span limits and vice versa?
 
 At first glance this is non-obvious, spans can exist without transactions
 and also transactions can exist without (standalone) spans, so why can the quotas
@@ -181,18 +183,13 @@ The reasoning lies in the Sentry product and how the information is used within
 Because of Relay, Sentry can safely assume spans exist if the user/SDK sends transactions
 and more and more of the product is built on the basis of spans. From this point of view
 transactions and spans convey the same information, it is just represented differently.
-This paradigm shift from transaction to spans is also visible in the billing. AM2 and before
-only used transactions for billing, whereas AM3 uses spans now. But AM3 still accepts transactions
-it just bills on spans. Disabling ingestion for spans on AM3 therefore also would require the same
-quota to be set for transactions, but since quotas are tracked separately a transaction quota
-may never trigger the span quota or vice versa.
 
-The logical conclusion and simplification is, to just treat span and transaction rate limits
+The logical conclusion and simplification is, to treat span and transaction rate limits
 equally (important: treat the limits the same, the quotas are still tracked separately).
-Another upside is, enforcing these limits on SDKs does not require extra logic, it is all contained
-within Relay.
+Enforcing these limits on SDKs then does not require extra logic, it is all contained
+within Relay and works for any SDK version, future and present.
 
-### Why do span outcomes and transaction outcomes differ?
+### Why can span outcomes and transaction outcomes become inconsistent?
 
 There can be multiple reasons for this in the entire pipeline, these are just some reasons why there
 can be differences caused by Relay:
@@ -209,13 +206,13 @@ but the span count cannot be recovered from an invalid transaction.
 From top to bottom:
 
 - Do you completely disable ingestion? Configure the limit for the `span` and `transaction` data categories.
-- Is it billing related? For example, spike protection operates on the category of the billing unit
-(AM2: transactions, AM3: spans), use the respective data category for the limit.
+- Is it billing related? For example, spike protection operates on the category of the billing unit,
+use the respective data category for the limit.
 - Use the total category (not indexed) which makes most sense to you. When in doubt, ask the Relay team.
 
 ### Should I ever use an indexed category in a Quota?
 
-No.
+No, unless the intention is to protect infrastructure (abuse limits).
 
 Indexed quotas are only useful to protect downstream infrastructure through abuse quotas.
 They are inherently more expensive to enforce, cannot be propagated to clients and are generally a sign