From 0e4b03cccd33cc9df499a893dce2610fb87061f4 Mon Sep 17 00:00:00 2001
From: Alexa Batino <37125652+alexabatino@users.noreply.github.com>
Date: Fri, 26 Apr 2024 10:09:22 -0400
Subject: [PATCH] [Insights] Add WHERE clause so that only metrics from last 6 year periods are returned (Recidiviz/recidiviz-data#29378)

## Description of the change

Importing data to the `metric_benchmarks` table in CA failed because we require the `threshold` column to be non-nullable, but some entries in the view were NULL. These rows were for periods prior to the earliest period we care about (the period ending on the first of the month six months ago). So this PR updates the metric-related view helper to add a filter that keeps only the latest six year periods.

Loaded to `alexa_29366`; you can confirm there are no unexpected NULLs with:

```
SELECT *
FROM `recidiviz-staging.alexa_29366_outliers_views.metric_benchmarks_materialized`
WHERE threshold IS NULL
```

## Type of change

> All pull requests must have at least one of the following labels applied (otherwise the PR will fail):

| Label                       | Description                                                                                                |
|-----------------------------|------------------------------------------------------------------------------------------------------------|
| Type: Bug                   | non-breaking change that fixes an issue                                                                    |
| Type: Feature               | non-breaking change that adds functionality                                                                |
| Type: Breaking Change       | fix or feature that would cause existing functionality to not work as expected                             |
| Type: Non-breaking refactor | change addresses some tech debt item or prepares for a later change, but does not change functionality     |
| Type: Configuration Change  | adjusts configuration to achieve some end related to functionality, development, performance, or security  |
| Type: Dependency Upgrade    | upgrades a project dependency - these changes are not included in release notes                            |

## Related issues

Closes Recidiviz/recidiviz-data#29366

## Checklists

### Development

**This box MUST be checked by the submitter prior to
merging**:

- [ ] **Double- and triple-checked that there is no Personally Identifiable Information (PII) being mistakenly added in this pull request**

These boxes should be checked by the submitter prior to merging:

- [ ] Tests have been written to cover the code changed/added as part of this pull request

### Code review

These boxes should be checked by reviewers prior to merging:

- [ ] This pull request has a descriptive title and information useful to a reviewer
- [ ] Potential security implications or infrastructural changes have been considered, if relevant

GitOrigin-RevId: f943546e98aeb32034bf2c4fef862d45249996f6
---
 .../state/views/outliers/supervision_metrics_helpers.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/recidiviz/calculator/query/state/views/outliers/supervision_metrics_helpers.py b/recidiviz/calculator/query/state/views/outliers/supervision_metrics_helpers.py
index 629f68e376..89e1df9343 100644
--- a/recidiviz/calculator/query/state/views/outliers/supervision_metrics_helpers.py
+++ b/recidiviz/calculator/query/state/views/outliers/supervision_metrics_helpers.py
@@ -57,6 +57,8 @@ def supervision_metric_query_template(
 FROM {source_table}
 WHERE state_code = '{state_code}'
 AND period = "YEAR"
+-- Limit the events lookback to only the necessary periods to minimize the size of the subqueries
+AND end_date >= DATE_SUB(CURRENT_DATE('US/Eastern'), INTERVAL 6 MONTH)
 """
 )
@@ -72,6 +74,8 @@ def supervision_metric_query_template(
 FROM {source_table}
 WHERE state_code = '{state_code}'
 AND period = "YEAR"
+-- Limit the events lookback to only the necessary periods to minimize the size of the subqueries
+AND end_date >= DATE_SUB(CURRENT_DATE('US/Eastern'), INTERVAL 6 MONTH)
 """
 rate_subquery = f"""
@@ -85,6 +89,8 @@ def supervision_metric_query_template(
 FROM {source_table}
 WHERE state_code = '{state_code}'
 AND period = "YEAR"
+-- Limit the events lookback to only the necessary periods to minimize the size of the subqueries
+AND end_date >= DATE_SUB(CURRENT_DATE('US/Eastern'), INTERVAL 6 MONTH)
 """
 subqueries.extend([rate_subquery, count_subquery])
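
The effect of the new `end_date` filter can be sketched in plain Python. This is only an illustration, not code from the repository: `subtract_months`, `retained_end_dates`, and the sample end dates are hypothetical, `today` stands in for `CURRENT_DATE('US/Eastern')`, and the sample assumes that each `YEAR` period ends on the first of a month, as the description suggests.

```python
import calendar
from datetime import date


def subtract_months(day: date, months: int) -> date:
    """Approximate DATE_SUB(day, INTERVAL months MONTH).

    If the target month is shorter, the day is clamped to that month's last day.
    """
    month_index = day.year * 12 + (day.month - 1) - months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def retained_end_dates(
    today: date, end_dates: list[date], lookback_months: int = 6
) -> list[date]:
    """Keep the period end dates that pass `end_date >= cutoff`."""
    cutoff = subtract_months(today, lookback_months)
    return [end_date for end_date in end_dates if end_date >= cutoff]


# Hypothetical monthly end dates for YEAR periods (assumed, not taken from real data).
sample_end_dates = [date(2023, m, 1) for m in range(9, 13)] + [
    date(2024, m, 1) for m in range(1, 5)
]

# `today` is passed explicitly here in place of CURRENT_DATE('US/Eastern').
kept = retained_end_dates(date(2024, 4, 26), sample_end_dates)
print(kept)
```

With `today` set to 2024-04-26 the cutoff is 2023-10-26, so the six periods ending 2023-11-01 through 2024-04-01 are kept and the earlier ones are dropped.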