fix(glam) add fully qualified table names in legacy telemetry queries #5559

Merged 5 commits on May 13, 2024

@edugfilho (Contributor) commented May 10, 2024

This PR adds fully qualified table names to the GLAM ETL legacy queries.
Most of the other changes are formatting.
There's also one filter on @submission_date that I smuggled in, to be extra safe.
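
To illustrate what "fully qualified" means here, the sketch below expands a bare or dataset-qualified table reference into the backtick-quoted `project.dataset.table` form that appears in the diff. The helper `qualify_table` and its default project and dataset are hypothetical, written only for this illustration; they are not part of bigquery-etl.

```python
DEFAULT_PROJECT = "moz-fx-data-shared-prod"
DEFAULT_DATASET = "telemetry_derived"


def qualify_table(reference, project=DEFAULT_PROJECT, dataset=DEFAULT_DATASET):
    """Return a backtick-quoted project.dataset.table name.

    Hypothetical helper: fills in whichever leading parts are missing.
    """
    parts = reference.strip("`").split(".")
    if len(parts) == 1:
        # Bare table name: add both project and dataset.
        parts = [project, dataset] + parts
    elif len(parts) == 2:
        # dataset.table: add only the project.
        parts = [project] + parts
    elif len(parts) != 3:
        raise ValueError(f"unexpected table reference: {reference!r}")
    return "`" + ".".join(parts) + "`"


if __name__ == "__main__":
    print(qualify_table("latest_versions"))
    print(qualify_table("telemetry_derived.clients_histogram_aggregates_v2"))
```

A reference that is already fully qualified passes through unchanged apart from the quoting.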

Checklist for reviewer:

  • Commits should reference a bug or github issue, if relevant (if a bug is referenced, the pull request should include the bug number in the title).
  • If the PR comes from a fork, trigger integration CI tests by running the Push to upstream workflow and provide the <username>:<branch> of the fork as parameter. The parameter will also show up in the logs of the manual-trigger-required-for-fork CI task together with more detailed instructions.
  • If adding a new field to a query, ensure that the schema and dependent downstream schemas have been updated.
  • When adding a new derived dataset, ensure that data is not available already (fully or partially) and recommend extending an existing dataset in favor of creating new ones. Data can be available in the bigquery-etl repository, looker-hub or in looker-spoke-default.

For modifications to schemas in restricted namespaces (see CODEOWNERS):

┆Issue is synchronized with this Jira Task

-    telemetry_derived.clients_histogram_aggregates_v2
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
+  WHERE
+    submission_date = DATE_SUB(DATE(@submission_date), INTERVAL 1 DAY)
@edugfilho (Contributor, Author) commented:

I added this filter to be extra safe. The table is overwritten at every execution, so there should only be one submission date, but the filter guards against there being more.
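
The effect of the added WHERE clause can be sketched in plain Python: only rows dated the day before the run's submission date survive, so a stale leftover row would be dropped. The function `previous_day_rows` and the sample rows are made up for illustration and are not taken from the GLAM tables.

```python
from datetime import date, timedelta


def previous_day_rows(rows, submission_date):
    """Keep only rows whose submission_date is the day before the run date.

    Mirrors: submission_date = DATE_SUB(DATE(@submission_date), INTERVAL 1 DAY)
    """
    target = submission_date - timedelta(days=1)
    return [row for row in rows if row["submission_date"] == target]


if __name__ == "__main__":
    rows = [
        {"client_id": "a", "submission_date": date(2024, 5, 12)},
        # Hypothetical stale row that an overwrite should have removed.
        {"client_id": "b", "submission_date": date(2024, 5, 11)},
    ]
    print(previous_day_rows(rows, date(2024, 5, 13)))
```

When the table holds a single submission date, as expected, the filter keeps every row and changes nothing.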

@edugfilho (Contributor, Author) commented:

Only formatting in this file

@edugfilho changed the title from "fix(glam) fix table names to fully qualified" to "fix(glam) add fully qualified table names in legacy telemetry queries" on May 10, 2024

@edugfilho edugfilho enabled auto-merge (squash) May 13, 2024 18:51
@dataops-ci-bot

Integration report for "Merge branch 'main' into glam-fully-qual-tbls"

sql.diff

diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_new_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_new_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_new_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_new_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -1,10 +1,12 @@
 WITH preconditions AS (
   SELECT
     IF(
-      (SELECT MAX(submission_date) FROM clients_histogram_aggregates_v2) = DATE_SUB(
-        DATE(@submission_date),
-        INTERVAL 1 DAY
-      ),
+      (
+        SELECT
+          MAX(submission_date)
+        FROM
+          `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
+      ) = DATE_SUB(DATE(@submission_date), INTERVAL 1 DAY),
       TRUE,
       ERROR('Pre-condition failed: table clients_histogram_aggregates_v2 must be up to date')
     ) histogram_aggregates_up_to_date
@@ -13,7 +15,7 @@
   SELECT
     * EXCEPT (histogram_aggregates_up_to_date)
   FROM
-    clients_daily_histogram_aggregates_v1,
+    `moz-fx-data-shared-prod.telemetry_derived.clients_daily_histogram_aggregates_v1`,
     preconditions
   WHERE
     preconditions.histogram_aggregates_up_to_date
@@ -64,7 +66,7 @@
   FROM
     filtered_aggregates AS hist_aggs
   LEFT JOIN
-    latest_versions
+    `moz-fx-data-shared-prod.telemetry_derived.latest_versions` AS latest_versions
     ON latest_versions.channel = hist_aggs.channel
   WHERE
     CAST(app_version AS INT64) >= (latest_version - 2)
@@ -85,7 +87,7 @@
     key,
     process,
     agg_type,
-    udf.map_sum(ARRAY_CONCAT_AGG(value)) AS aggregates
+    `moz-fx-data-shared-prod`.udf.map_sum(ARRAY_CONCAT_AGG(value)) AS aggregates
   FROM
     version_filtered_new
   GROUP BY
@@ -105,7 +107,7 @@
     latest_version
 )
 SELECT
-  udf_js.sample_id(client_id) AS sample_id,
+  `moz-fx-data-shared-prod`.udf_js.sample_id(client_id) AS sample_id,
   client_id,
   os,
   app_version,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/metadata.yaml	2024-05-13 18:48:02.000000000 +0000
@@ -23,6 +23,6 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_aggregates_new_v1
-  - clients_histogram_aggregates_v1
-  - latest_versions
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_new_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v1
+  - moz-fx-data-shared-prod.telemetry_derived.latest_versions
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -1,5 +1,6 @@
 CREATE TEMP FUNCTION udf_merged_user_data(old_aggs ANY TYPE, new_aggs ANY TYPE)
-  RETURNS ARRAY<STRUCT<
+RETURNS ARRAY<
+  STRUCT<
     first_bucket INT64,
     last_bucket INT64,
     num_buckets INT64,
@@ -8,19 +9,23 @@
     key STRING,
     process STRING,
     agg_type STRING,
-    aggregates ARRAY<STRUCT<key STRING, value INT64>>>> AS (
+    aggregates ARRAY<STRUCT<key STRING, value INT64>>
+  >
+> AS (
   (
-    WITH unnested AS
-      (SELECT *
-      FROM UNNEST(old_aggs)
-
+    WITH unnested AS (
+      SELECT
+        *
+      FROM
+        UNNEST(old_aggs)
       UNION ALL
-
-      SELECT *
-      FROM UNNEST(new_aggs)),
-
-    aggregated_data AS
-      (SELECT AS STRUCT
+      SELECT
+        *
+      FROM
+        UNNEST(new_aggs)
+    ),
+    aggregated_data AS (
+      SELECT AS STRUCT
         first_bucket,
         last_bucket,
         num_buckets,
@@ -30,7 +35,8 @@
         process,
         agg_type,
         mozfun.map.sum(ARRAY_CONCAT_AGG(aggregates)) AS histogram_aggregates
-      FROM unnested
+      FROM
+        unnested
       GROUP BY
         first_bucket,
         last_bucket,
@@ -39,9 +45,11 @@
         metric_type,
         key,
         process,
-        agg_type)
-
-      SELECT ARRAY_AGG((
+        agg_type
+    )
+    SELECT
+      ARRAY_AGG(
+        (
         first_bucket,
         last_bucket,
         num_buckets,
@@ -50,26 +58,35 @@
         key,
         process,
         agg_type,
-        histogram_aggregates))
-      FROM aggregated_data
+          histogram_aggregates
+        )
+      )
+    FROM
+      aggregated_data
   )
 );
 
-WITH clients_histogram_aggregates_new AS
-  (SELECT *
-  FROM clients_histogram_aggregates_new_v1
-  WHERE sample_id >= @min_sample_id
-    AND sample_id <= @max_sample_id),
-
-clients_histogram_aggregates_partition AS
-  (SELECT *
-  FROM clients_histogram_aggregates_v1
-  WHERE submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY)
+WITH clients_histogram_aggregates_new AS (
+  SELECT
+    *
+  FROM
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_new_v1`
+  WHERE
+    sample_id >= @min_sample_id
+    AND sample_id <= @max_sample_id
+),
+clients_histogram_aggregates_partition AS (
+  SELECT
+    *
+  FROM
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v1`
+  WHERE
+    submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY)
     AND sample_id >= @min_sample_id
-    AND sample_id <= @max_sample_id),
-
-clients_histogram_aggregates_old AS
-  (SELECT
+    AND sample_id <= @max_sample_id
+),
+clients_histogram_aggregates_old AS (
+  SELECT
     sample_id,
     client_id,
     os,
@@ -78,13 +95,16 @@
     hist_aggs.channel AS channel,
     CONCAT(client_id, os, app_version, app_build_id, hist_aggs.channel) AS join_key,
     histogram_aggregates
-  FROM clients_histogram_aggregates_partition AS hist_aggs
-  LEFT JOIN latest_versions
+  FROM
+    clients_histogram_aggregates_partition AS hist_aggs
+  LEFT JOIN
+    `moz-fx-data-shared-prod.telemetry_derived.latest_versions` AS latest_versions
   ON latest_versions.channel = hist_aggs.channel
-  WHERE app_version >= (latest_version - 2)),
-
-merged AS
-  (SELECT
+  WHERE
+    app_version >= (latest_version - 2)
+),
+merged AS (
+  SELECT
     COALESCE(old_data.sample_id, new_data.sample_id) AS sample_id,
     COALESCE(old_data.client_id, new_data.client_id) AS client_id,
     COALESCE(old_data.os, new_data.os) AS os,
@@ -103,12 +123,15 @@
       process,
       agg_type,
       aggregates
-    FROM UNNEST(new_data.histogram_aggregates)
+      FROM
+        UNNEST(new_data.histogram_aggregates)
     ) AS new_aggs
-  FROM clients_histogram_aggregates_old AS old_data
-  FULL OUTER JOIN clients_histogram_aggregates_new AS new_data
-  ON new_data.join_key = old_data.join_key)
-
+  FROM
+    clients_histogram_aggregates_old AS old_data
+  FULL OUTER JOIN
+    clients_histogram_aggregates_new AS new_data
+    ON new_data.join_key = old_data.join_key
+)
 SELECT
   @submission_date AS submission_date,
   sample_id,
@@ -118,4 +141,5 @@
   app_build_id,
   channel,
   udf_merged_user_data(old_aggs, new_aggs) AS histogram_aggregates
-FROM merged
+FROM
+  merged
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/metadata.yaml	2024-05-13 18:48:02.000000000 +0000
@@ -12,7 +12,6 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_aggregates_v2
-  - latest_versions
-  - telemetry_derived.clients_histogram_aggregates_new_v1
-  - telemetry_derived.clients_histogram_aggregates_v2
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_new_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2
+  - moz-fx-data-shared-prod.telemetry_derived.latest_versions
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_aggregates_v2/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -86,10 +86,12 @@
 WITH preconditions AS (
   SELECT
     IF(
-      (SELECT MAX(submission_date) FROM clients_histogram_aggregates_v2) = DATE_SUB(
-        DATE(@submission_date),
-        INTERVAL 1 DAY
-      ),
+      (
+        SELECT
+          MAX(submission_date)
+        FROM
+          `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
+      ) = DATE_SUB(DATE(@submission_date), INTERVAL 1 DAY),
       TRUE,
       ERROR('Pre-condition failed: Current submission_date parameter skips a day or more of data.')
     ) histogram_aggregates_up_to_date
@@ -98,7 +100,7 @@
   SELECT
     * EXCEPT (histogram_aggregates_up_to_date)
   FROM
-    telemetry_derived.clients_histogram_aggregates_new_v1,
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_new_v1`,
     preconditions
   WHERE
     preconditions.histogram_aggregates_up_to_date
@@ -109,7 +111,9 @@
   SELECT
     *
   FROM
-    telemetry_derived.clients_histogram_aggregates_v2
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
+  WHERE
+    submission_date = DATE_SUB(DATE(@submission_date), INTERVAL 1 DAY)
 ),
 clients_histogram_aggregates_old AS (
   SELECT
@@ -125,7 +129,7 @@
   FROM
     clients_histogram_aggregates_partition AS hist_aggs
   LEFT JOIN
-    latest_versions
+    `moz-fx-data-shared-prod.telemetry_derived.latest_versions` AS latest_versions
     ON latest_versions.channel = hist_aggs.channel
   WHERE
     app_version >= (latest_version - 2)
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/metadata.yaml	2024-05-13 18:48:03.000000000 +0000
@@ -12,4 +12,4 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_aggregates_v2
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_bucket_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -18,7 +18,7 @@
     os = 'Windows'
     AND channel = 'release' AS sampled
   FROM
-    clients_histogram_aggregates_v2
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
   CROSS JOIN
     UNNEST(histogram_aggregates)
   WHERE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/metadata.yaml	2024-05-13 18:48:03.000000000 +0000
@@ -12,5 +12,5 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_bucket_counts_v1
-  - clients_non_norm_histogram_bucket_counts_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_bucket_counts_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_non_norm_histogram_bucket_counts_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_histogram_probe_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -1,26 +1,23 @@
-CREATE TEMP FUNCTION
-  udf_get_buckets(min INT64,
-    max INT64,
-    num INT64,
-    metric_type STRING)
-  RETURNS ARRAY<INT64> AS ( (
-    WITH
-      buckets AS (
+CREATE TEMP FUNCTION udf_get_buckets(min INT64, max INT64, num INT64, metric_type STRING)
+RETURNS ARRAY<INT64> AS (
+  (
+    WITH buckets AS (
       SELECT
         CASE
-          WHEN metric_type = 'histogram-exponential' THEN mozfun.glam.histogram_generate_exponential_buckets(min, max, num)
-        ELSE
-        mozfun.glam.histogram_generate_linear_buckets(min,
-          max,
-          num)
-      END
-        AS arr )
+          WHEN metric_type = 'histogram-exponential'
+            THEN mozfun.glam.histogram_generate_exponential_buckets(min, max, num)
+          ELSE mozfun.glam.histogram_generate_linear_buckets(min, max, num)
+        END AS arr
+    )
     SELECT
       ARRAY_AGG(CAST(item AS INT64))
     FROM
       buckets
     CROSS JOIN
-      UNNEST(arr) AS item ) );
+      UNNEST(arr) AS item
+  )
+);
+
 WITH aggregates AS (
   SELECT
     os,
@@ -37,14 +34,15 @@
     agg_type AS client_agg_type,
     'histogram' AS agg_type,
     CAST(ROUND(SUM(record.value)) AS INT64) AS total_users,
-    mozfun.glam.histogram_fill_buckets_dirichlet( mozfun.map.sum(ARRAY_AGG(record)),
-      mozfun.glam.histogram_buckets_cast_string_array(udf_get_buckets(first_bucket,
-          MAX(last_bucket),
-          MAX(num_buckets),
-          metric_type)),
-      CAST(ROUND(SUM(record.value)) AS INT64) ) AS aggregates
+    mozfun.glam.histogram_fill_buckets_dirichlet(
+      mozfun.map.sum(ARRAY_AGG(record)),
+      mozfun.glam.histogram_buckets_cast_string_array(
+        udf_get_buckets(first_bucket, MAX(last_bucket), MAX(num_buckets), metric_type)
+      ),
+      CAST(ROUND(SUM(record.value)) AS INT64)
+    ) AS aggregates
   FROM
-      clients_histogram_bucket_counts_v1 AS bucket_counts
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_bucket_counts_v1` AS bucket_counts
   GROUP BY
     os,
     app_version,
@@ -55,8 +53,9 @@
     KEY,
     process,
     client_agg_type,
-    first_bucket ),
-  non_norm_aggregates AS (
+    first_bucket
+),
+non_norm_aggregates AS (
   SELECT
     os,
     app_version,
@@ -71,13 +70,14 @@
     MAX(num_buckets) AS num_buckets,
     agg_type AS client_agg_type,
     'histogram' AS agg_type,
-    mozfun.glam.histogram_fill_buckets( mozfun.map.sum(ARRAY_AGG(record)),
-      mozfun.glam.histogram_buckets_cast_string_array(udf_get_buckets(first_bucket,
-          MAX(last_bucket),
-          MAX(num_buckets),
-          metric_type))) AS non_norm_aggregates,
+    mozfun.glam.histogram_fill_buckets(
+      mozfun.map.sum(ARRAY_AGG(record)),
+      mozfun.glam.histogram_buckets_cast_string_array(
+        udf_get_buckets(first_bucket, MAX(last_bucket), MAX(num_buckets), metric_type)
+      )
+    ) AS non_norm_aggregates,
   FROM
-    clients_non_norm_histogram_bucket_counts_v1 AS non_norm_bucket_counts
+    `moz-fx-data-shared-prod.telemetry_derived.clients_non_norm_histogram_bucket_counts_v1` AS non_norm_bucket_counts
   GROUP BY
     os,
     app_version,
@@ -88,14 +88,12 @@
     KEY,
     process,
     client_agg_type,
-    first_bucket)
-
-  SELECT
-    IF
-      (os = '*', NULL, os) AS os,
+    first_bucket
+)
+SELECT
+  IF(os = '*', NULL, os) AS os,
     app_version,
-    IF
-      (app_build_id = '*', NULL, app_build_id) AS app_build_id,
+  IF(app_build_id = '*', NULL, app_build_id) AS app_build_id,
     channel,
     metric,
     metric_type,
@@ -109,7 +107,10 @@
     aggregates.total_users,
     aggregates.aggregates,
     non_norm_aggregates.non_norm_aggregates
-  FROM aggregates INNER JOIN non_norm_aggregates
+FROM
+  aggregates
+INNER JOIN
+  non_norm_aggregates
   USING (
     os,
     app_version,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_non_norm_histogram_bucket_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_non_norm_histogram_bucket_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_non_norm_histogram_bucket_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_non_norm_histogram_bucket_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -57,15 +57,15 @@
     -- for context see https://github.com/mozilla/glam/issues/1575#issuecomment-946880387
     CASE
       WHEN channel = 'release'
-        THEN COUNT(DISTINCT client_id) > 625000/(@max_sample_id - @min_sample_id + 1)
+        THEN COUNT(DISTINCT client_id) > 625000 / (@max_sample_id - @min_sample_id + 1)
       WHEN channel = 'beta'
-        THEN COUNT(DISTINCT client_id) > 9000/(@max_sample_id - @min_sample_id + 1)
+        THEN COUNT(DISTINCT client_id) > 9000 / (@max_sample_id - @min_sample_id + 1)
       WHEN channel = 'nightly'
-        THEN COUNT(DISTINCT client_id) > 375/(@max_sample_id - @min_sample_id + 1)
-      ELSE COUNT(DISTINCT client_id) > 100/(@max_sample_id - @min_sample_id + 1)
+        THEN COUNT(DISTINCT client_id) > 375 / (@max_sample_id - @min_sample_id + 1)
+      ELSE COUNT(DISTINCT client_id) > 100 / (@max_sample_id - @min_sample_id + 1)
     END
 ),
-all_combos as (
+all_combos AS (
   SELECT
     * EXCEPT (os, app_build_id),
     COALESCE(combo.os, table.os) AS os,
@@ -80,12 +80,11 @@
 ),
 non_normalized_histograms AS (
   SELECT
-    * EXCEPT (sampled) REPLACE(
-        mozfun.map.sum(ARRAY_CONCAT_AGG(aggregates)) AS aggregates
-      )
+    * EXCEPT (sampled) REPLACE(mozfun.map.sum(ARRAY_CONCAT_AGG(aggregates)) AS aggregates)
   FROM
     all_combos
-  WHERE sample_id >= @min_sample_id
+  WHERE
+    sample_id >= @min_sample_id
     AND sample_id <= @max_sample_id
   GROUP BY
     sample_id,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_aggregates_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_aggregates_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_aggregates_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -3,7 +3,7 @@
     * EXCEPT (app_version),
     CAST(app_version AS INT64) AS app_version
   FROM
-    telemetry_derived.clients_daily_scalar_aggregates_v1
+    `moz-fx-data-shared-prod.telemetry_derived.clients_daily_scalar_aggregates_v1`
   WHERE
     submission_date = @submission_date
 ),
@@ -45,7 +45,7 @@
   FROM
     filtered_aggregates AS scalar_aggs
   LEFT JOIN
-    latest_versions
+    `moz-fx-data-shared-prod.telemetry_derived.latest_versions` AS latest_versions
     USING (channel)
   WHERE
     app_version >= (latest_version - 2)
@@ -112,9 +112,9 @@
     scalar_aggs.channel,
     scalar_aggregates
   FROM
-    telemetry_derived.clients_scalar_aggregates_v1 AS scalar_aggs
+    `moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1` AS scalar_aggs
   LEFT JOIN
-    latest_versions
+    `moz-fx-data-shared-prod.telemetry_derived.latest_versions` AS latest_versions
     USING (channel)
   WHERE
     app_version >= (latest_version - 2)
@@ -142,6 +142,8 @@
   app_version,
   app_build_id,
   channel,
-  udf.merge_scalar_user_data(ARRAY_CONCAT(old_aggs, new_aggs)) AS scalar_aggregates
+  `moz-fx-data-shared-prod`.udf.merge_scalar_user_data(
+    ARRAY_CONCAT(old_aggs, new_aggs)
+  ) AS scalar_aggregates
 FROM
   joined_new_old
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/metadata.yaml	2024-05-13 18:48:02.000000000 +0000
@@ -12,4 +12,4 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_scalar_aggregates_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/clients_scalar_probe_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -95,7 +95,7 @@
     os = 'Windows'
     AND channel = 'release' AS sampled,
   FROM
-    clients_scalar_aggregates_v1
+    `moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1`
   WHERE
     submission_date = @submission_date
     AND (@app_version IS NULL OR app_version = @app_version)
@@ -167,7 +167,9 @@
     IF(app_build_id = '*', NULL, app_build_id) AS app_build_id,
     channel,
     IF(MAX(sampled), 10, 1) AS user_count,
-    udf.merge_scalar_user_data(ARRAY_CONCAT_AGG(scalar_aggregates)) AS scalar_aggregates
+    `moz-fx-data-shared-prod`.udf.merge_scalar_user_data(
+      ARRAY_CONCAT_AGG(scalar_aggregates)
+    ) AS scalar_aggregates
   FROM
     all_combos
   GROUP BY
@@ -261,8 +263,10 @@
     bucketed_scalars
 ),
 valid_booleans_scalars AS (
-  SELECT *
-  FROM booleans_and_scalars
+  SELECT
+    *
+  FROM
+    booleans_and_scalars
   INNER JOIN
     build_ids
     USING (app_build_id, channel)
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/metadata.yaml	2024-05-13 18:48:03.000000000 +0000
@@ -12,5 +12,5 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_aggregates_v2
-  - clients_scalar_aggregates_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2
+  - moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_sample_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -11,7 +11,7 @@
     h1.aggregates,
     IF(os = 'Windows' AND channel = 'release', 10, 1) AS sample_mult
   FROM
-    clients_histogram_aggregates_v2,
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`,
     UNNEST(histogram_aggregates) h1
   WHERE
     submission_date = @submission_date
@@ -25,7 +25,7 @@
     scalar_aggregates,
     IF(os = 'Windows' AND channel = 'release', 10, 1) AS sample_mult
   FROM
-    clients_scalar_aggregates_v1
+    `moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1`
   WHERE
     submission_date = @submission_date
 )
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/metadata.yaml	2024-05-13 18:48:02.000000000 +0000
@@ -12,5 +12,5 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_aggregates_v2
-  - clients_scalar_aggregates_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2
+  - moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/glam_user_counts_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -5,21 +5,22 @@
     app_version,
     app_build_id,
     channel
-  FROM clients_scalar_aggregates_v1
-  WHERE submission_date = @submission_date
-
+  FROM
+    `moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1`
+  WHERE
+    submission_date = @submission_date
   UNION ALL
-
   SELECT
     client_id,
     os,
     app_version,
     app_build_id,
     channel
-  FROM clients_histogram_aggregates_v2
-  WHERE submission_date = @submission_date
+  FROM
+    `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_aggregates_v2`
+  WHERE
+    submission_date = @submission_date
 )
-
 SELECT
   os,
   app_version,
@@ -33,11 +34,9 @@
   app_version,
   app_build_id,
   channel
-
 UNION ALL
-
 SELECT
-  CAST(NULL AS STRING) as os,
+  CAST(NULL AS STRING) AS os,
   app_version,
   app_build_id,
   channel,
@@ -48,9 +47,7 @@
   app_version,
   app_build_id,
   channel
-
 UNION ALL
-
 SELECT
   os,
   CAST(NULL AS INT64) AS app_version,
@@ -63,9 +60,7 @@
   os,
   app_build_id,
   channel
-
 UNION ALL
-
 SELECT
   os,
   app_version,
@@ -78,9 +73,7 @@
   os,
   app_version,
   channel
-
 UNION ALL
-
 SELECT
   os,
   CAST(NULL AS INT64) AS app_version,
@@ -92,9 +85,7 @@
 GROUP BY
   os,
   channel
-
 UNION ALL
-
 SELECT
   CAST(NULL AS STRING) AS os,
   app_version,
@@ -106,9 +97,7 @@
 GROUP BY
   app_version,
   channel
-
 UNION ALL
-
 SELECT
   CAST(NULL AS STRING) AS os,
   app_version,
@@ -119,9 +108,7 @@
   all_clients
 GROUP BY
   app_version
-
 UNION ALL
-
 SELECT
   os,
   CAST(NULL AS INT64) AS app_version,
@@ -132,9 +119,7 @@
   all_clients
 GROUP BY
   os
-
 UNION ALL
-
 SELECT
   CAST(NULL AS STRING) AS os,
   CAST(NULL AS INT64) AS app_version,
@@ -145,9 +130,7 @@
   all_clients
 GROUP BY
   channel
-
 UNION ALL
-
 SELECT
   CAST(NULL AS STRING) AS os,
   CAST(NULL AS INT64) AS app_version,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/metadata.yaml	2024-05-13 18:48:03.000000000 +0000
@@ -12,4 +12,4 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_histogram_probe_counts_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_histogram_probe_counts_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/histogram_percentiles_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -23,4 +23,4 @@
     ('99.9', mozfun.glam.percentile(99.9, non_norm_aggregates, metric_type))
   ] AS non_norm_aggregates
 FROM
-  clients_histogram_probe_counts_v1
+  `moz-fx-data-shared-prod.telemetry_derived.clients_histogram_probe_counts_v1`
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/metadata.yaml	2024-05-13 18:47:58.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/metadata.yaml	2024-05-13 18:48:03.000000000 +0000
@@ -12,4 +12,4 @@
   - workgroup:mozilla-confidential
 references:
   query.sql:
-  - clients_scalar_aggregates_v1
+  - moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/query.sql	2024-05-13 18:46:21.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/scalar_percentiles_v1/query.sql	2024-05-13 18:46:20.000000000 +0000
@@ -1,36 +1,41 @@
 WITH flat_clients_scalar_aggregates AS (
-  SELECT *,
-    os = 'Windows' and channel = 'release' AS sampled,
+  SELECT
+    *,
+    os = 'Windows'
+    AND channel = 'release' AS sampled,
   FROM
-    clients_scalar_aggregates_v1
+    `moz-fx-data-shared-prod.telemetry_derived.clients_scalar_aggregates_v1`
   WHERE
     submission_date = @submission_date
-    AND (
-      @app_version IS NULL
-      OR app_version = @app_version
-    )
+    AND (@app_version IS NULL OR app_version = @app_version)
 ),
-
-static_combos as (
-  SELECT null as os, null as app_build_id
+static_combos AS (
+  SELECT
+    NULL AS os,
+    NULL AS app_build_id
   UNION ALL
-  SELECT null as os, '*' as app_build_id
+  SELECT
+    NULL AS os,
+    '*' AS app_build_id
   UNION ALL
-  SELECT '*' as os, null as app_build_id
+  SELECT
+    '*' AS os,
+    NULL AS app_build_id
   UNION ALL
-  SELECT '*' as os, '*' as app_build_id
+  SELECT
+    '*' AS os,
+    '*' AS app_build_id
 ),
-
 all_combos AS (
   SELECT
-    * EXCEPT(os, app_build_id),
-    COALESCE(combos.os, flat_table.os) as os,
-    COALESCE(combos.app_build_id, flat_table.app_build_id) as app_build_id
+    * EXCEPT (os, app_build_id),
+    COALESCE(combos.os, flat_table.os) AS os,
+    COALESCE(combos.app_build_id, flat_table.app_build_id) AS app_build_id
   FROM
      flat_clients_scalar_aggregates flat_table
   CROSS JOIN
-     static_combos combos),
-
+    static_combos combos
+),
 user_aggregates AS (
   SELECT
     client_id,
@@ -39,7 +44,9 @@
     IF(app_build_id = '*', NULL, app_build_id) AS app_build_id,
     channel,
     IF(MAX(sampled), 10, 1) AS user_count,
-    udf.merge_scalar_user_data(ARRAY_CONCAT_AGG(scalar_aggregates)) AS scalar_aggregates
+    `moz-fx-data-shared-prod`.udf.merge_scalar_user_data(
+      ARRAY_CONCAT_AGG(scalar_aggregates)
+    ) AS scalar_aggregates
   FROM
     all_combos
   GROUP BY
@@ -47,8 +54,8 @@
     os,
     app_version,
     app_build_id,
-    channel),
-
+    channel
+),
 percentiles AS (
   SELECT
     os,
@@ -69,7 +76,8 @@
     APPROX_QUANTILES(value, 1000)  AS aggregates
   FROM
     user_aggregates
-  CROSS JOIN UNNEST(scalar_aggregates)
+  CROSS JOIN
+    UNNEST(scalar_aggregates)
   GROUP BY
     os,
     app_version,
@@ -82,12 +90,15 @@
     client_agg_type
 ),
 aggregated AS (
-  SELECT *
-  REPLACE(mozfun.glam.map_from_array_offsets_precise(
+  SELECT
+    * REPLACE (
+      mozfun.glam.map_from_array_offsets_precise(
     [0.1, 1.0, 5.0, 25.0, 50.0, 75.0, 95.0, 99.0, 99.9],
     aggregates
-  ) AS aggregates)
-  FROM percentiles
+      ) AS aggregates
+    )
+  FROM
+    percentiles
 )
 SELECT
   *,

@edugfilho edugfilho merged commit 8bd936e into main May 13, 2024
20 of 21 checks passed
@edugfilho edugfilho deleted the glam-fully-qual-tbls branch May 13, 2024 18:58
3 participants