@xurui-c xurui-c commented Sep 8, 2025

Redo of #7387

github-actions bot commented Sep 8, 2025

This PR has a migration; here is the generated SQL for ./snuba/migrations/groups.py

-- start migrations

-- forward migration outcomes : 0010_outcomes_daily_fixed_partitioning
Local op: CREATE TABLE IF NOT EXISTS outcomes_daily_local_v2 (org_id UInt64, project_id UInt64, key_id UInt64, timestamp DateTime, outcome UInt8, reason LowCardinality(String), category UInt8, quantity UInt64, times_seen UInt64) ENGINE ReplicatedSummingMergeTree('/clickhouse/tables/outcomes/{shard}/default/outcomes_daily_local_v2', '{replica}') ORDER BY (org_id, project_id, key_id, outcome, reason, timestamp, category) PARTITION BY (toStartOfMonth(timestamp)) TTL timestamp + toIntervalMonth(13);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS outcomes_mv_daily_local_v2 TO outcomes_daily_local_v2 (org_id UInt64, project_id UInt64, key_id UInt64, timestamp DateTime, outcome UInt8, reason String, category UInt8, quantity UInt64, times_seen UInt64) AS 
                    SELECT
                        org_id,
                        project_id,
                        ifNull(key_id, 0) AS key_id,
                        toStartOfDay(timestamp) AS timestamp,
                        outcome,
                        ifNull(reason, 'none') AS reason,
                        category,
                        count() AS times_seen,
                        sum(quantity) AS quantity
                    FROM outcomes_raw_local
                    GROUP BY org_id, project_id, key_id, timestamp, outcome, reason, category
                ;
Distributed op: CREATE TABLE IF NOT EXISTS outcomes_daily_dist_v2 (org_id UInt64, project_id UInt64, key_id UInt64, timestamp DateTime, outcome UInt8, reason LowCardinality(String), category UInt8, quantity UInt64, times_seen UInt64) ENGINE Distributed(`cluster_one_sh`, default, outcomes_daily_local_v2, org_id);
Distributed op: DROP TABLE IF EXISTS outcomes_daily_dist SYNC;
Local op: DROP TABLE IF EXISTS outcomes_mv_daily_local SYNC;
Local op: DROP TABLE IF EXISTS outcomes_daily_local SYNC;
-- end forward migration outcomes : 0010_outcomes_daily_fixed_partitioning




-- backward migration outcomes : 0010_outcomes_daily_fixed_partitioning
Distributed op: DROP TABLE IF EXISTS outcomes_daily_dist_v2 SYNC;
Local op: DROP TABLE IF EXISTS outcomes_mv_daily_local_v2 SYNC;
Local op: DROP TABLE IF EXISTS outcomes_daily_local_v2 SYNC;
-- end backward migration outcomes : 0010_outcomes_daily_fixed_partitioning

@xurui-c xurui-c marked this pull request as ready for review September 8, 2025 17:52
@xurui-c xurui-c requested a review from a team as a code owner September 8, 2025 17:52
Comment on lines 115 to 116
#[serde(default)]
group_first_seen: StringToIntDatetime64,
Potential bug: When the group_first_seen field is missing from an error event, the Rust processor incorrectly defaults it to the epoch timestamp (0) instead of a null value.
  • Description: The ErrorMessage struct in the Rust errors processor marks the group_first_seen field with #[serde(default)]. Because the underlying StringToIntDatetime64 type implements Default by returning 0, any error message from Kafka that omits this field is deserialized with group_first_seen set to the Unix epoch. This is inconsistent with the Python processor, which uses None for missing values, and with the database schema, which allows NULL. As a result, affected issue groups sort as the oldest possible wherever group_first_seen determines ordering, which skews issue prioritization and display order.

  • Suggested fix: Modify the deserialization logic for group_first_seen in the Rust processor. Instead of #[serde(default)] on StringToIntDatetime64, consider wrapping the type in an Option, like Option<StringToIntDatetime64>. This will deserialize a missing field to None, which can then be correctly handled and inserted as NULL into the database, aligning its behavior with the Python processor.
    severity: 0.65, confidence: 0.95
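
The difference between the two behaviours described above can be sketched with the standard library alone. This is only an illustrative model, not the processor's code: Timestamp is a hypothetical stand-in for StringToIntDatetime64, the HashMap stands in for a decoded Kafka message, and the two helper functions are made-up names that mimic what #[serde(default)] and an Option-wrapped field would each do with a missing key.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `StringToIntDatetime64`. As in the comment
/// above, its `Default` value is 0, i.e. the Unix epoch.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Timestamp(u64);

/// Models a non-optional field with `#[serde(default)]`: a missing key
/// silently becomes `Timestamp::default()`, which is `Timestamp(0)`.
fn first_seen_with_default(msg: &HashMap<&str, u64>) -> Timestamp {
    msg.get("group_first_seen")
        .copied()
        .map(Timestamp)
        .unwrap_or_default()
}

/// Models an `Option`-wrapped field: a missing key stays `None`, so the
/// caller can decide to write NULL instead of a fabricated timestamp.
fn first_seen_optional(msg: &HashMap<&str, u64>) -> Option<Timestamp> {
    msg.get("group_first_seen").copied().map(Timestamp)
}
```

With an empty map, first_seen_with_default returns Timestamp(0) while first_seen_optional returns None; with the key present, both carry the same value. The Option variant keeps "missing" distinguishable from "epoch", which is the property the suggested fix relies on.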

@xurui-c xurui-c closed this Sep 8, 2025
@xurui-c xurui-c reopened this Sep 8, 2025
@xurui-c xurui-c force-pushed the meredith/fix-daily-outcomes-redo branch from 53f7fcd to 792ec9d Compare September 8, 2025 18:48
codecov bot commented Sep 8, 2025

⚠️ File not in storage

No result to display due to the CLI not being able to find the file.
Please ensure the file contains junit in the name and automated file search is enabled,
or the desired file specified by the file and search_dir arguments of the CLI.

@xurui-c xurui-c merged commit 5b246aa into master Sep 8, 2025
34 checks passed
@xurui-c xurui-c deleted the meredith/fix-daily-outcomes-redo branch September 8, 2025 19:29