feat(generic-metrics): Add support for gauges in dataset and processors (#4912)

* add migrations for tables

* update to use anyLast (this doesn't work)

* try with argMax aggregate function

* rename migrations

* renumber migrations

* renumber migrations again

* add raw_timestamp column

* fix test distributed storage set key

* add granularity-specific retention day columns to raw table

* Sum aggregate function instead of count.

* remove storage set key creations

* remove extra column and add max timestamp aggregation

* add gauges storage set key to migration group

* feat(rust): Update all dependencies in lockfile (#4892)

* feat: Write received_p99 to commit log (#4872)

This lets subscriptions opt into using received_p99 for scheduling instead of the current orig_message_ts.

Needs getsentry/arroyo#295

* Revert "feat: Write received_p99 to commit log (#4872)"

This reverts commit c7db591.

Co-authored-by: lynnagara <1779792+lynnagara@users.noreply.github.com>

* initial pass, writable storage

* fix output type for gauges messages

* note on dlq

* write path working

* add entity, readable storage for querying

* switch to mat view version2

* add tests for gauges processor

* add gauges entity key

* fix readable gauge storage schema

* remove avg from gauges migration

* Revert "feat: Write received_p99 to commit log (#4872)"

This reverts commit c7db591.

Co-authored-by: lynnagara <1779792+lynnagara@users.noreply.github.com>

* ref(subscriptions): Move --delay-seconds from CLI arg to yaml definition (#4915)

The main motivations for this are:
1. The amount of delay depends on the synchronization timestamp used, which is defined at the storage level in code. For example, if "orig_message_ts" is used, a longer delay is applied than if "received_p99" is used, since the received timestamp is set earlier in the pipeline.
2. The same CLI args get applied in all Sentry deployments, and this change makes it easier to keep them in sync.
3. Rolling out different values per storage via CLI would probably break some of our templates and require too much rework.

There are no functional changes here since we have 60 configured everywhere right now.

* feat(rust): Add strategy that does json schema validation (#4901)

* spans: add profile_id to tests (#4827)

* add test for spans profile_id and fix bug where test was dependent on local timezone

* test: Refactor API tests to not reference sessions so it can be removed (#4920)

* fix merge conflict

* remove avgs support in dataset (storage, entity) and processor

* fix aggregate function in entity

* add some comments

---------

Co-authored-by: Lyn Nagara <lyn.nagara@gmail.com>
Co-authored-by: getsentry-bot <bot@sentry.io>
Co-authored-by: lynnagara <1779792+lynnagara@users.noreply.github.com>
Co-authored-by: Dalitso Banda <dalitso.banda@sentry.io>
5 people committed Nov 1, 2023
1 parent ae43829 commit 62f2192
Showing 9 changed files with 651 additions and 4 deletions.
10 changes: 10 additions & 0 deletions snuba/cli/devserver.py
@@ -228,6 +228,16 @@ def devserver(*, bootstrap: bool, workers: bool) -> None:
                *COMMON_CONSUMER_DEV_OPTIONS,
            ],
        ),
        (
            "generic-metrics-gauges-consumer",
            [
                "snuba",
                "consumer",
                "--storage=generic_metrics_gauges_raw",
                "--consumer-group=snuba-gen-metrics-gauges-consumers",
                *COMMON_CONSUMER_DEV_OPTIONS,
            ],
        ),
    ]
    if settings.ENABLE_METRICS_SUBSCRIPTIONS:
        if settings.SEPARATE_SCHEDULER_EXECUTOR_SUBSCRIPTIONS_DEV:
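
The added daemon entry has the same shape as the other consumers in this list: a name paired with an argv list. The following is a minimal sketch of how such an entry is assembled, not the devserver code itself. `consumer_daemon` is a hypothetical helper, and the value given to `COMMON_CONSUMER_DEV_OPTIONS` is a placeholder because the diff does not show its real contents.

```python
from typing import List, Tuple

# Assumption: placeholder flag only. The real COMMON_CONSUMER_DEV_OPTIONS in
# snuba/cli/devserver.py is not shown in this diff.
COMMON_CONSUMER_DEV_OPTIONS: List[str] = ["--placeholder-dev-option"]


def consumer_daemon(
    name: str, storage: str, consumer_group: str
) -> Tuple[str, List[str]]:
    """Hypothetical helper: build a (daemon name, argv) pair for a consumer."""
    return (
        name,
        [
            "snuba",
            "consumer",
            f"--storage={storage}",
            f"--consumer-group={consumer_group}",
            *COMMON_CONSUMER_DEV_OPTIONS,
        ],
    )


# The entry added by this commit, expressed through the helper.
gauges_daemon = consumer_daemon(
    "generic-metrics-gauges-consumer",
    "generic_metrics_gauges_raw",
    "snuba-gen-metrics-gauges-consumers",
)
```

Note that the consumer reads into the writable `generic_metrics_gauges_raw` storage, not the readable `generic_metrics_gauges` storage used for queries.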
1 change: 1 addition & 0 deletions snuba/datasets/configuration/generic_metrics/dataset.yaml
@@ -7,3 +7,4 @@ entities:
  - generic_metrics_distributions
  - generic_metrics_counters
  - generic_org_metrics_counters
  - generic_metrics_gauges
255 changes: 255 additions & 0 deletions snuba/datasets/configuration/generic_metrics/entities/gauges.yaml
@@ -0,0 +1,255 @@
version: v1
kind: entity
name: generic_metrics_gauges

schema:
  [
    { name: org_id, type: UInt, args: { size: 64 } },
    { name: project_id, type: UInt, args: { size: 64 } },
    { name: metric_id, type: UInt, args: { size: 64 } },
    { name: rounded_timestamp, type: DateTime },
    { name: bucketed_time, type: DateTime },
    {
      name: tags,
      type: Nested,
      args:
        {
          subcolumns:
            [
              { name: key, type: UInt, args: { size: 64 } },
              { name: value, type: UInt, args: { size: 64 } },
            ],
        },
    },
    {
      name: min,
      type: AggregateFunction,
      args: { func: min, arg_types: [{ type: Float, args: { size: 64 } }] },
    },
    {
      name: max,
      type: AggregateFunction,
      args: { func: max, arg_types: [{ type: Float, args: { size: 64 } }] },
    },
    {
      name: sum,
      type: AggregateFunction,
      args: { func: sum, arg_types: [{ type: Float, args: { size: 64 } }] },
    },
    {
      name: count,
      type: AggregateFunction,
      args: { func: count, arg_types: [{ type: UInt, args: { size: 64 } }] },
    },
    {
      name: last,
      type: AggregateFunction,
      args: { func: count, arg_types: [{ type: Float, args: { size: 64 } }] },
    },
  ]

storages:
  - storage: generic_metrics_gauges
    translation_mappers:
      functions:
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: min
            to_name: minMerge
            aggr_col_name: min
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: minIf
            to_name: minMergeIf
            aggr_col_name: min
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: max
            to_name: maxMerge
            aggr_col_name: max
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: maxIf
            to_name: maxMergeIf
            aggr_col_name: max
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: sum
            to_name: sumMerge
            aggr_col_name: sum
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: sumIf
            to_name: sumMergeIf
            aggr_col_name: sum
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: count
            to_name: sumMerge
            aggr_col_name: count
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: countIf
            to_name: sumMergeIf
            aggr_col_name: count
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: last
            to_name: argMaxMerge
            aggr_col_name: last
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: lastIf
            to_name: argMaxMergeIf
            aggr_col_name: last
      subscriptables:
        - mapper: SubscriptableMapper
          args:
            from_column_table:
            from_column_name: tags_raw
            to_nested_col_table:
            to_nested_col_name: tags
            value_subcolumn_name: raw_value
        - mapper: SubscriptableMapper
          args:
            from_column_table:
            from_column_name: tags
            to_nested_col_table:
            to_nested_col_name: tags
            value_subcolumn_name: indexed_value
  - storage: generic_metrics_gauges_raw
    is_writable: true
    translation_mappers:
      functions:
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: min
            to_name: minMerge
            aggr_col_name: min
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: minIf
            to_name: minMergeIf
            aggr_col_name: min
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: max
            to_name: maxMerge
            aggr_col_name: max
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: maxIf
            to_name: maxMergeIf
            aggr_col_name: max
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: sum
            to_name: sumMerge
            aggr_col_name: sum
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: sumIf
            to_name: sumMergeIf
            aggr_col_name: sum
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: count
            to_name: sumMerge
            aggr_col_name: count
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: countIf
            to_name: sumMergeIf
            aggr_col_name: count
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: last
            to_name: argMaxMerge
            aggr_col_name: last
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: last
            to_name: argMaxMerge
            aggr_col_name: last
        - mapper: AggregateFunctionMapper
          args:
            column_to_map: value
            from_name: lastIf
            to_name: argMaxMergeIf
            aggr_col_name: last
      subscriptables:
        - mapper: SubscriptableMapper
          args:
            from_column_table:
            from_column_name: tags_raw
            to_nested_col_table:
            to_nested_col_name: tags
            value_subcolumn_name: raw_value
        - mapper: SubscriptableMapper
          args:
            from_column_table:
            from_column_name: tags
            to_nested_col_table:
            to_nested_col_name: tags
            value_subcolumn_name: indexed_value

storage_selector:
  selector: SimpleQueryStorageSelector
  args:
    storage: generic_metrics_gauges

query_processors:
  - processor: TagsTypeTransformer
  - processor: MappedGranularityProcessor
    args:
      accepted_granularities:
        10: 0
        60: 1
        3600: 2
        86400: 3
      default_granularity: 1
  - processor: TimeSeriesProcessor
    args:
      time_group_columns:
        bucketed_time: rounded_timestamp
      time_parse_columns: [rounded_timestamp]
  - processor: ReferrerRateLimiterProcessor
  - processor: OrganizationRateLimiterProcessor
    args:
      org_column: org_id
  - processor: ProjectReferrerRateLimiter
    args:
      project_column: project_id
  - processor: ProjectRateLimiterProcessor
    args:
      project_column: project_id
  - processor: ResourceQuotaProcessor
    args:
      project_field: project_id

validators:
  - validator: EntityRequiredColumnValidator
    args:
      required_filter_columns: ["org_id", "project_id"]
required_time_column: rounded_timestamp
partition_key_column_name: org_id
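
To make the entity config above easier to read, here is a minimal sketch of the two lookups it encodes: the aggregate function translation (for example `min(value)` becoming `minMerge(min)`) and the granularity mapping. It is an illustration under stated assumptions, not Snuba's implementation. `translate_function` and `granularity_enum` are hypothetical helpers, the tables are copied from the YAML, and the exact-match granularity lookup is a simplification that may differ from what `MappedGranularityProcessor` actually accepts.

```python
from typing import Dict, Optional, Tuple

# Copied from the translation_mappers above:
# query function name -> (merge function, aggregate column).
FUNCTION_MAP: Dict[str, Tuple[str, str]] = {
    "min": ("minMerge", "min"),
    "max": ("maxMerge", "max"),
    "sum": ("sumMerge", "sum"),
    "count": ("sumMerge", "count"),  # count is merged with sumMerge
    "last": ("argMaxMerge", "last"),
}

# Copied from accepted_granularities: seconds -> stored granularity value.
ACCEPTED_GRANULARITIES: Dict[int, int] = {10: 0, 60: 1, 3600: 2, 86400: 3}
DEFAULT_GRANULARITY = 1


def translate_function(name: str, condition: Optional[str] = None) -> str:
    """Hypothetical helper: rewrite fn(value) into its merge form as text."""
    is_conditional = name.endswith("If")
    base = name[:-2] if is_conditional else name
    if base not in FUNCTION_MAP:
        raise KeyError(f"no mapper configured for {name!r}")
    merge_fn, aggr_col = FUNCTION_MAP[base]
    if is_conditional:
        if condition is None:
            raise ValueError(f"{name} needs a condition argument")
        # e.g. minIf(value, cond) -> minMergeIf(min, cond)
        return f"{merge_fn}If({aggr_col}, {condition})"
    return f"{merge_fn}({aggr_col})"


def granularity_enum(seconds: Optional[int]) -> int:
    """Hypothetical helper: simplified exact-match granularity lookup."""
    if seconds is None:
        return DEFAULT_GRANULARITY
    if seconds not in ACCEPTED_GRANULARITIES:
        raise ValueError(f"unsupported granularity: {seconds}")
    return ACCEPTED_GRANULARITIES[seconds]
```

For example, `translate_function("countIf", "project_id = 1")` yields `sumMergeIf(count, project_id = 1)`, which reflects the "Sum aggregate function instead of count" change noted in the commit message, and `granularity_enum(3600)` yields `2`.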