Commit

Improve Cardinality description (#9999)
* Improve Cardinality description

- added more details
- added image

* Undo metric type changes

* Added image for cardinality

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

* Update docs/product/metrics/index.mdx

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>

---------

Co-authored-by: vivianyentran <20403606+vivianyentran@users.noreply.github.com>
2 people authored and matejminar committed Jun 6, 2024
1 parent d122437 commit 262062c
Showing 1 changed file with 26 additions and 5 deletions.
31 changes: 26 additions & 5 deletions docs/product/metrics/index.mdx
@@ -23,19 +23,40 @@ Each metric also needs to have a unit associated with it, so that you know what

## Augmenting Metrics with Tags

Metrics are powerful on their own, but you can enrich them further by adding dimensions in the form of Tags. These are key/value string pairs (for example, `platform:ios`) that are associated with metrics to provide contextual information, and are often used to filter and group them during analysis.
Metrics are powerful on their own, but you can enrich them further by adding dimensions in the form of Tags. Metrics can be categorized, organized, and filtered based on these different dimensions, providing more granularity and flexibility in analyzing and querying your data.

Sentry adds certain common tags by default such as, `transaction`, `environment`, and `release`, but you can also create your own custom tags to track attributes that help organize your data for your specific use case. Some common examples of useful tags are browser name, region, language, or customer.
Tags consist of key-value pairs, where the key represents the tag name and the value represents the tag value. For example, you might have a platform tag, like `platform:android` or `platform:ios`. You can create your own custom tags to track attributes that help organize your data for your specific use case. Other useful tags are browser name, region, language, and customer.

To improve your product experience, Sentry adds certain common tags by default: `transaction`, `environment`, and `release`. This allows you to immediately analyze and segment your metrics data based on these useful dimensions. These tags can also be found on events such as errors and transactions.

## Limits and Restrictions

### Cardinality

In metrics, "cardinality" refers to the number of unique time series generated by tags associated with a metric. The more tags combinations you create, the higher the cardinality.
In metrics, "cardinality" refers to the number of unique time-series generated by tags associated with a metric. The more unique tag combinations you create, the higher the cardinality.

For instance, imagine you are tracking daily logins. You create a metric named `login` and add tags like `platform` (with values `ios`, `android`, `web`) to indicate where logins occur. This will result in 3 distinct time-series:
- metric `login`, tag `platform:android`
- metric `login`, tag `platform:ios`
- metric `login`, tag `platform:web`

Then you decide to add a new tag called `region`, which can be either `US` or `EU`. If all combinations of tag values actually occur, this will result in up to **6 time-series:**
- metric `login`, tag `platform:android`, tag `region:US`
- metric `login`, tag `platform:android`, tag `region:EU`
- metric `login`, tag `platform:ios`, tag `region:US`
- metric `login`, tag `platform:ios`, tag `region:EU`
- metric `login`, tag `platform:web`, tag `region:US`
- metric `login`, tag `platform:web`, tag `region:EU`

Generally, the maximum cardinality for a metric is the product of tag cardinalities. In this example, the cardinality is 3 (`platform`) × 2 (`region`) = 6.
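
The product rule can be checked with a small, SDK-independent sketch. The helper names `max_cardinality` and `enumerate_series` are hypothetical and are not part of any Sentry API; the tag values are the ones from the `login` example.

```python
from itertools import product
from math import prod

# Tag values taken from the `login` example in the text.
tags = {
    "platform": ["android", "ios", "web"],
    "region": ["US", "EU"],
}


def max_cardinality(tag_values):
    """Upper bound on time-series: the product of per-tag value counts."""
    return prod(len(values) for values in tag_values.values())


def enumerate_series(metric, tag_values):
    """List every (metric, tag combination) pair that could occur."""
    keys = list(tag_values)
    return [
        (metric, tuple(zip(keys, combo)))
        for combo in product(*(tag_values[key] for key in keys))
    ]


series = enumerate_series("login", tags)
print(max_cardinality(tags), len(series))
```

This is an upper bound: a combination that never occurs in practice (for example, no `web` logins from `EU`) would not produce a time-series.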

![Cardinality](./img/cardinality.png)

To maintain reasonable cardinality, use tags with a fixed or predictable set of values. Adding a tag such as `user_id`, which will have a different value for each user and can also grow significantly over time, will result in a substantial increase in overall cardinality. Considering that each user might log in from various platforms, this multiplies the number of unique time series needed to analyze the data. For example, with 100,000 user IDs and 3 platform values, you would potentially require 300,000 time series.

For instance, imagine you are tracking daily logins. You create a metric named `login` and add tags like `platform` (with values `ios`, `android`, `web`) to indicate where logins occur. Then you decide to add a new tag called `user_id`, which will have a different value for each user. In metrics, adding a `user_id` tag, which identifies the user logging in, can greatly increase cardinality as user IDs accumulate. Considering that each user might log in from various platforms, this multiplies the number of unique time series needed to analyze the data. For example, with 100,000 user IDs and 3 platform values, you would potentially require 300,000 time series.
Metrics are useful for analyzing your data in aggregate. If you have more than 10 users, is it really useful to track the number of logins for each user from each platform? Probably not. Instead, what might be more valuable is to look at a dimension like `plan_tier` (e.g. `free`, `team`, `business`, `internal users`) to segment your login data based on user categories that are meaningful for your analysis. This approach helps limit cardinality and optimizes metric analysis.
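
To make the difference concrete, here is a rough sketch that counts the unique tag combinations seen in a handful of made-up login events. The event records, field names, and the `observed_series` helper are all hypothetical and chosen only for illustration; nothing here calls a Sentry SDK.

```python
from math import prod

# Hypothetical login events; the field names are illustrative assumptions.
events = [
    {"user_id": "u1", "plan_tier": "free", "platform": "ios"},
    {"user_id": "u2", "plan_tier": "free", "platform": "ios"},
    {"user_id": "u3", "plan_tier": "team", "platform": "web"},
    {"user_id": "u4", "plan_tier": "free", "platform": "ios"},
    {"user_id": "u5", "plan_tier": "business", "platform": "android"},
]


def observed_series(events, tag_keys):
    """Return the set of distinct tag combinations present in the events."""
    return {tuple((key, event[key]) for key in tag_keys) for event in events}


by_user = observed_series(events, ["user_id", "platform"])
by_tier = observed_series(events, ["plan_tier", "platform"])

# Worst-case bounds using the figures from the text:
# 100,000 user IDs or 4 plan tiers, each combined with 3 platforms.
max_with_user_id = prod([100_000, 3])
max_with_plan_tier = prod([4, 3])

print(len(by_user), len(by_tier), max_with_user_id, max_with_plan_tier)
```

With `user_id`, every new user adds at least one series; with `plan_tier`, the count stays bounded by 12 no matter how many users log in.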

To manage cardinality, use tags with fixed sets of values. Instead of `user_id`, consider using tags like `country` (with specific country names) to categorize login locations or `user_segment` (e.g. `business`, `internal users`) to segment your login data based on user categories that are meaningful for your analysis. This approach helps limit cardinality and optimizes metric analysis.
If you really are interested in individual user actions, consider using [transactions](/product/performance/transaction-summary/#what-is-a-transaction) instead of metrics.

<Note>
