
[DENG-3889] Rewrite events_stream_v1 for better performance #5659

Merged (2 commits, May 27, 2024)

Conversation

BenWu (Contributor) commented May 24, 2024

DENG-3889

Changes:

  • Replace two (out of five) of the JavaScript UDF usages with SQL UDFs.
  • Rearrange the CTEs so most of the UDFs are called once per event ping rather than on the unnested events (see the sketch after this list).
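
For context, here is a minimal sketch of the rearrangement. It is a simplified outline, not the full generated query; the table name `accounts_backend.events` and the temp UDFs `from_map_experiment`, `from_map_event_extra`, and `metrics_to_json` are taken from the generated query shown in the integration report below.

-- Sketch only: per-ping conversions (experiments, metrics) run once per ping inside the CTE;
-- only the per-event fields are computed after UNNEST(events).
WITH base AS (
  SELECT
    * REPLACE (metrics_to_json(TO_JSON(metrics)) AS metrics),
    from_map_experiment(ping_info.experiments) AS experiments
  FROM
    `accounts_backend.events`
)
SELECT
  base.* EXCEPT (events),
  event.category AS event_category,
  event.name AS event_name,
  from_map_event_extra(event.extra) AS event_extra
FROM
  base
CROSS JOIN
  UNNEST(events) AS event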

Performance

1% sample (sample_id = 1) of Firefox Desktop for 2024-05-22:

changes                  slot hours   job id (backfill-2)
base                     24.5         bquxjob_7a85e33a_18fab4919e3
rearranged               5.7          bquxjob_4ababcd5_18fab3b62ea
sql udfs                 4.3          bquxjob_129253ac_18fab52d5c9
rearranged + sql udfs    3.4          bquxjob_4145760d_18fab765f03

100% of Firefox Desktop for 2024-05-22 (both runs overwriting a clustered partition):

changes                  slot hours   job id (backfill-2)
base (airflow run)       3910 😵‍💫      bqjob_r6deaccf950e8d7f5_0000018fa35c4fce_1
rearranged + sql udfs    699          bquxjob_5293faeb_18fab5ab707

I'm mostly confident the output is equivalent because the following check doesn't throw any errors:

SELECT
  mozfun.assert.json_equals(from_map_event_extra(event.extra), mozfun.json.from_map(event.extra)),
  mozfun.assert.json_equals(from_map_experiment(ping_info.experiments), mozfun.json.from_map(ping_info.experiments)),
FROM
  `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1`
CROSS JOIN 
  UNNEST(events) AS event
WHERE 
  DATE(submission_timestamp) IN ('2024-05-22')
  AND sample_id = 1

I also ran this check for Fenix and iOS.

Checklist for reviewer:

  • Commits should reference a bug or github issue, if relevant (if a bug is referenced, the pull request should include the bug number in the title).
  • If the PR comes from a fork, trigger integration CI tests by running the Push to upstream workflow and provide the <username>:<branch> of the fork as parameter. The parameter will also show up
    in the logs of the manual-trigger-required-for-fork CI task together with more detailed instructions.
  • If adding a new field to a query, ensure that the schema and dependent downstream schemas have been updated.
  • When adding a new derived dataset, ensure that data is not available already (fully or partially) and recommend extending an existing dataset in favor of creating new ones. Data can be available in the bigquery-etl repository, looker-hub or in looker-spoke-default.

For modifications to schemas in restricted namespaces (see CODEOWNERS):

Issue is synchronized with this Jira Task

BenWu (Contributor, Author) commented May 24, 2024

cc @badboy as a query owner

Comment on lines +3 to +24
CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
RETURNS json AS (
  IF(
    ARRAY_LENGTH(input) = 0,
    NULL,
    JSON_OBJECT(
      ARRAY(SELECT key FROM UNNEST(input)),
      ARRAY(
        SELECT
          CASE
            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
              THEN TO_JSON(SAFE_CAST(value AS BOOL))
            ELSE TO_JSON(value)
          END
        FROM
          UNNEST(input)
      )
    )
  )
);

Member

THIS is the BigQuery SQL I was looking for. I couldn't get it done, which is the sole reason I wrote that JS UDF. Glad someone else is better at it than I am.
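
For illustration, a hypothetical call to the UDF quoted above (the input values are made up, and this assumes from_map_event_extra is defined in the same script); values that parse as numbers or booleans come out typed in the resulting JSON:

-- Hypothetical input, for illustration only.
SELECT
  from_map_event_extra(
    [
      STRUCT('count' AS key, '3' AS value),
      STRUCT('enabled' AS key, 'true' AS value),
      STRUCT('label' AS key, 'home' AS value)
    ]
  ) AS extra_json;
-- extra_json is approximately {"count":3,"enabled":true,"label":"home"}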

badboy (Member) commented May 27, 2024

Fantastic!

@dataops-ci-bot

Integration report for "Merge branch 'main' into benwu/events-stream-udf"

sql.diff

Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry/desktop_retention_clients: schema.yaml
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:24.000000000 +0000
@@ -60,7 +60,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/events_stream_v1/query.sql	2024-05-27 14:57:25.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `accounts_backend.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `accounts_backend.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_cirrus_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_cirrus_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_cirrus_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_cirrus_derived/events_stream_v1/query.sql	2024-05-27 14:57:26.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `accounts_cirrus.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `accounts_cirrus.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:24.000000000 +0000
@@ -60,7 +60,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_frontend_derived/events_stream_v1/query.sql	2024-05-27 14:57:27.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `accounts_frontend.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `accounts_frontend.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:24.000000000 +0000
@@ -50,7 +50,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.bedrock_live.events_v1`
+        `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -70,7 +70,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
+        `moz-fx-data-shared-prod.bedrock_live.events_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -80,7 +80,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/events_stream_v1/query.sql	2024-05-27 14:57:27.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `bedrock.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `bedrock.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/burnham_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/burnham_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/burnham_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/burnham_derived/events_stream_v1/query.sql	2024-05-27 14:57:27.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `burnham.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `burnham.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml	2024-05-27 14:54:51.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml	2024-05-27 15:03:47.000000000 +0000
@@ -1,49 +1,49 @@
 fields:
-- mode: NULLABLE
-  name: submission_date
+- name: submission_date
   type: DATE
-- mode: NULLABLE
-  name: source
+  mode: NULLABLE
+- name: source
   type: STRING
-- mode: NULLABLE
-  name: event_type
+  mode: NULLABLE
+- name: event_type
   type: STRING
-- mode: NULLABLE
-  name: form_factor
+  mode: NULLABLE
+- name: form_factor
   type: STRING
-- mode: NULLABLE
-  name: country
+  mode: NULLABLE
+- name: country
   type: STRING
-- mode: NULLABLE
-  name: subdivision1
+  mode: NULLABLE
+- name: subdivision1
   type: STRING
-- mode: NULLABLE
-  name: advertiser
+  mode: NULLABLE
+- name: advertiser
   type: STRING
-- mode: NULLABLE
-  name: release_channel
+  mode: NULLABLE
+- name: release_channel
   type: STRING
-- mode: NULLABLE
-  name: position
+  mode: NULLABLE
+- name: position
   type: INTEGER
-- mode: NULLABLE
-  name: provider
+  mode: NULLABLE
+- name: provider
   type: STRING
-- mode: NULLABLE
-  name: match_type
+  mode: NULLABLE
+- name: match_type
   type: STRING
-- mode: NULLABLE
-  name: normalized_os
+  mode: NULLABLE
+- name: normalized_os
   type: STRING
-- mode: NULLABLE
-  name: suggest_data_sharing_enabled
+  mode: NULLABLE
+- name: suggest_data_sharing_enabled
   type: BOOLEAN
-- mode: NULLABLE
-  name: event_count
+  mode: NULLABLE
+- name: event_count
   type: INTEGER
-- mode: NULLABLE
-  name: user_count
+  mode: NULLABLE
+- name: user_count
   type: INTEGER
-- mode: NULLABLE
-  name: query_type
+  mode: NULLABLE
+- name: query_type
   type: STRING
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml	2024-05-27 14:54:51.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml	2024-05-27 15:03:47.000000000 +0000
@@ -1,40 +1,40 @@
 fields:
-- mode: NULLABLE
-  name: submission_date
+- name: submission_date
   type: DATE
-- mode: NULLABLE
-  name: form_factor
+  mode: NULLABLE
+- name: form_factor
   type: STRING
-- mode: NULLABLE
-  name: country
+  mode: NULLABLE
+- name: country
   type: STRING
-- mode: NULLABLE
-  name: advertiser
+  mode: NULLABLE
+- name: advertiser
   type: STRING
-- mode: NULLABLE
-  name: normalized_os
+  mode: NULLABLE
+- name: normalized_os
   type: STRING
-- mode: NULLABLE
-  name: release_channel
+  mode: NULLABLE
+- name: release_channel
   type: STRING
-- mode: NULLABLE
-  name: position
+  mode: NULLABLE
+- name: position
   type: INTEGER
-- mode: NULLABLE
-  name: provider
+  mode: NULLABLE
+- name: provider
   type: STRING
-- mode: NULLABLE
-  name: match_type
+  mode: NULLABLE
+- name: match_type
   type: STRING
-- mode: NULLABLE
-  name: suggest_data_sharing_enabled
+  mode: NULLABLE
+- name: suggest_data_sharing_enabled
   type: BOOLEAN
-- mode: NULLABLE
-  name: impression_count
+  mode: NULLABLE
+- name: impression_count
   type: INTEGER
-- mode: NULLABLE
-  name: click_count
+  mode: NULLABLE
+- name: click_count
   type: INTEGER
-- mode: NULLABLE
-  name: query_type
+  mode: NULLABLE
+- name: query_type
   type: STRING
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:25.000000000 +0000
@@ -60,7 +60,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/events_stream_v1/query.sql	2024-05-27 14:55:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/debug_ping_view_derived/events_stream_v1/query.sql	2024-05-27 14:57:28.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `debug_ping_view.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `debug_ping_view.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml	2024-05-27 14:54:51.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml	2024-05-27 15:02:38.000000000 +0000
@@ -26,6 +26,9 @@
 - name: adjust_network
   type: STRING
   mode: NULLABLE
+- name: install_source
+  type: STRING
+  mode: NULLABLE
 - name: retained_week_2
   type: BOOLEAN
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml	2024-05-27 14:54:51.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml	2024-05-27 15:02:38.000000000 +0000
@@ -48,6 +48,10 @@
   description: 'The type of source of a client installation.
 
     '
+- name: install_source
+  type: STRING
+  mode: NULLABLE
+  description: null
 - name: new_profiles
   type: INTEGER
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_defaultagent_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_defaultagent_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_defaultagent_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_defaultagent_derived/events_stream_v1/query.sql	2024-05-27 14:57:29.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `firefox_desktop_background_defaultagent.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `firefox_desktop_background_defaultagent.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:25.000000000 +0000
@@ -50,7 +50,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -60,7 +60,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -70,7 +70,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/events_stream_v1/query.sql	2024-05-27 14:55:32.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/events_stream_v1/query.sql	2024-05-27 14:57:30.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `firefox_desktop_background_tasks.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `firefox_desktop_background_tasks.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
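
(Side note on the generated query above: here is a minimal spot-check of the two SQL UDFs it defines. The literal inputs are invented for illustration, and the query assumes the temp functions are created earlier in the same script.)

```sql
-- Illustrative spot-check only: inputs below are invented, and the temp UDFs
-- are assumed to be created earlier in the same script.
SELECT
  -- numeric and boolean strings get cast, everything else stays a string;
  -- expect roughly {"count":3,"enabled":true,"label":"toolbar"}
  from_map_event_extra(
    [
      STRUCT('count' AS key, '3' AS value),
      STRUCT('enabled' AS key, 'true' AS value),
      STRUCT('label' AS key, 'toolbar' AS value)
    ]
  ) AS event_extra_json,
  -- an empty extras array becomes NULL rather than an empty JSON object
  from_map_event_extra(ARRAY<STRUCT<key STRING, value STRING>>[]) AS empty_extra,
  -- experiment values are nested structs and come through as JSON objects;
  -- expect roughly {"my-experiment":{"branch":"treatment","extra":{"type":"nimbus","enrollment_id":null}}}
  from_map_experiment(
    [
      STRUCT(
        'my-experiment' AS key,
        STRUCT(
          'treatment' AS branch,
          STRUCT('nimbus' AS type, CAST(NULL AS STRING) AS enrollment_id) AS extra
        ) AS value
      )
    ]
  ) AS experiments_json
```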
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_update_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_update_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_update_derived/events_stream_v1/query.sql	2024-05-27 14:55:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_update_derived/events_stream_v1/query.sql	2024-05-27 14:57:30.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and will be converted to json objects
+CREATE TEMP FUNCTION from_map_experiment(
+  input ARRAY<
+    STRUCT<key STRING, value STRUCT<branch STRING, extra STRUCT<type STRING, enrollment_id STRING>>>
+  >
+)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(ARRAY(SELECT key FROM UNNEST(input)), ARRAY(SELECT value FROM UNNEST(input)))
+  )
+);
+
+CREATE TEMP FUNCTION metrics_to_json(metrics JSON)
+RETURNS JSON AS (
+  JSON_STRIP_NULLS(
+    JSON_REMOVE(
+      -- labeled_* are the only ones that SHOULD show up as context for events pings,
+      -- thus we special-case them
+      --
+      -- The JSON_SET/JSON_EXTRACT shenanigans are needed
+      -- because those subfields might not exist, so accessing the columns would fail.
+      -- but accessing non-existent fields in a JSON object simply gives us NULL.
+      JSON_SET(
+        metrics,
+        '$.labeled_counter',
+        mozfun.json.from_nested_map(metrics.labeled_counter),
+        '$.labeled_string',
+        mozfun.json.from_nested_map(metrics.labeled_string),
+        '$.labeled_boolean',
+        mozfun.json.from_nested_map(metrics.labeled_boolean),
+        '$.url',
+        metrics.url2
+      ),
+      '$.url2'
+    ),
+    remove_empty => TRUE
+  )
+);
+
 WITH base AS (
   SELECT
-    * EXCEPT (metrics, events, name, category, extra, timestamp) REPLACE(
+    * REPLACE (
       STRUCT(
         client_info.app_build AS app_build,
         client_info.app_channel AS app_channel,
@@ -25,24 +91,14 @@
         ping_info.end_time,
         ping_info.parsed_end_time,
         ping_info.ping_type
-      ) AS ping_info
+      ) AS ping_info,
+      metrics_to_json(TO_JSON(metrics)) AS metrics
     ),
     client_info.client_id AS client_id,
     ping_info.reason AS reason,
-    `mozfun.json.from_map`(ping_info.experiments) AS experiments,
-    COALESCE(
-      SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
-      SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
-    ) AS event_timestamp,
-    event.category AS event_category,
-    event.name AS event_name,
-    ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
-    `mozfun.json.from_map`(event.extra) AS event_extra,
-    TO_JSON(metrics) AS metrics
+    from_map_experiment(ping_info.experiments) AS experiments,
   FROM
-    `firefox_desktop_background_update.events` AS e
-  CROSS JOIN
-    UNNEST(events) AS event
+    `firefox_desktop_background_update.events`
   WHERE
     {% if is_init() %}
       DATE(submission_timestamp) >= '2023-11-01'
@@ -52,34 +108,16 @@
 )
 --
 SELECT
-  * REPLACE (
-    -- expose as easy to access JSON column,
-    -- strip nulls,
-    -- translate nested array records into a JSON object,
-    -- rename url2 -> url
-    JSON_STRIP_NULLS(
-      JSON_REMOVE(
-        -- labeled_* are the only ones that SHOULD show up as context for events pings,
-        -- thus we special-case them
-        --
-        -- The JSON_SET/JSON_EXTRACT shenanigans are needed
-        -- because those subfields might not exist, so accessing the columns would fail.
-        -- but accessing non-existent fields in a JSON object simply gives us NULL.
-        JSON_SET(
-          metrics,
-          '$.labeled_counter',
-          mozfun.json.from_nested_map(metrics.labeled_counter),
-          '$.labeled_string',
-          mozfun.json.from_nested_map(metrics.labeled_string),
-          '$.labeled_boolean',
-          mozfun.json.from_nested_map(metrics.labeled_boolean),
-          '$.url',
-          metrics.url2
-        ),
-        '$.url2'
-      ),
-      remove_empty => TRUE
-    ) AS metrics
-  )
+  base.* EXCEPT (events),
+  COALESCE(
+    SAFE.TIMESTAMP_MILLIS(SAFE_CAST(mozfun.map.get_key(event.extra, 'glean_timestamp') AS INT64)),
+    SAFE.TIMESTAMP_ADD(ping_info.parsed_start_time, INTERVAL event.timestamp MILLISECOND)
+  ) AS event_timestamp,
+  event.category AS event_category,
+  event.name AS event_name,
+  ARRAY_TO_STRING([event.category, event.name], '.') AS event, -- handles NULL values better
+  from_map_event_extra(event.extra) AS event_extra,
 FROM
   base
+CROSS JOIN
+  UNNEST(events) AS event
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:55:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql	2024-05-27 14:57:26.000000000 +0000
@@ -50,7 +50,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -80,7 +80,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -90,7 +90,7 @@
   LEFT JOIN
     UNNEST(event.extra) AS event_extra
   WHERE
-    DATE(submission_timestamp) >= "2024-05-24"
+    DATE(submission_timestamp) >= "2024-05-27"
   GROUP BY
     submission_date,
     window_start,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/events_stream_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/events_stream_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/events_stream_v1/query.sql	2024-05-27 14:55:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/events_stream_v1/query.sql	2024-05-27 14:57:31.000000000 +0000
@@ -1,7 +1,73 @@
 -- Generated via bigquery_etl.glean_usage
+-- convert array of key value pairs to a json object, cast numbers and booleans if possible
+CREATE TEMP FUNCTION from_map_event_extra(input ARRAY<STRUCT<key STRING, value STRING>>)
+RETURNS json AS (
+  IF(
+    ARRAY_LENGTH(input) = 0,
+    NULL,
+    JSON_OBJECT(
+      ARRAY(SELECT key FROM UNNEST(input)),
+      ARRAY(
+        SELECT
+          CASE
+            WHEN SAFE_CAST(value AS NUMERIC) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS NUMERIC))
+            WHEN SAFE_CAST(value AS BOOL) IS NOT NULL
+              THEN TO_JSON(SAFE_CAST(value AS BOOL))
+            ELSE TO_JSON(value)
+          END
+        FROM
+          UNNEST(input)
+      )
+    )
+  )
+);
+
+-- convert array of key value pairs to a json object
+-- values are nested structs and

⚠️ Only part of the diff is displayed.

Link to full diff
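
For readers skimming the generated diffs: the rearranged queries all follow the same shape, where per-ping conversions happen once in the base CTE and the event unnest is pushed to the final SELECT. A minimal sketch of that shape (table, column, and parameter names are placeholders, not the generated SQL):

```sql
-- Sketch of the rearranged shape: per-ping work (e.g. converting metrics to
-- JSON) is evaluated once per ping in `base`; only cheap per-event fields are
-- computed after UNNEST(events).
WITH base AS (
  SELECT
    submission_timestamp,
    TO_JSON(metrics) AS metrics,  -- once per ping, not once per unnested event
    events                        -- carried along and unnested below
  FROM
    `project.dataset.events_ping_table`  -- placeholder table name
  WHERE
    DATE(submission_timestamp) = @submission_date  -- placeholder parameter
)
SELECT
  base.* EXCEPT (events),
  event.category AS event_category,
  event.name AS event_name
FROM
  base
CROSS JOIN
  UNNEST(events) AS event
```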

@BenWu BenWu merged commit 54eb241 into main May 27, 2024
21 checks passed
@BenWu BenWu deleted the benwu/events-stream-udf branch May 27, 2024 15:27
lucia-vargas-a added a commit that referenced this pull request Jun 6, 2024
* Changes to active_users_aggregates.

* Remove days_since_seen and calculate activity segment.

* Remove is_core_active not required.

* Remove days_seen_bits from mobile queries not required.

* Adjust mobile query to use the baseline_clients_last_seen views.

* Update Focus Android query to use the baseline_clients_last_seen view.

* Get activity segments from the views instead of hard-coding it. Query attribution only when necessary to improve performance. Delay update by 1 day to get metrics ping's data.

* Get app_name from clients_last_seen_v2 view.

* Set activity_segment as NULL for legacy Focus Android.

* Formatting

* Remove first_seen_date and correct column name in desktop_query.sql

* Update active_users_aggregates query for Desktop to get app_name and growth metrics from clients_last_seen_v2, replace language_name with locale and remove search metrics based on sprint decision (see DENG-1989).

* Update active_users_aggregates query for Mobile to remove search metrics based on sprint decision (see DENG-1989) and calculate the min metrics ping received between the current and next date, given that these pings can arrive in the same or next date as the baseline ping.

* Revert changes to Desktop queries in this PR, as this implementation is separated to [PR-5607](#5607).

* Space

* Formatting.

* Change name of activity_segment to segment.

* Add Klar Android query and join in the Mobile view.

* Fix reference to app name.

* Update app_name in checks for Focus

* Add klar android to the checks and update app_name in checks for Focus.

* Formatting.

* Use active_users instead of baseline_clients_last_seen and improve retrieving metrics ping's data.

* Formatting.

* --- (#5623)

updated-dependencies:
- dependency-name: requests
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update backfill.yaml (#5629)

* DENG-3719: Allow setting billing project for managed backfills (#5605)

* Added default billing project and param

* DENG-3851 Updated managed backfill docs with slack channel for notifications (#5630)

* Updated docs

* Update docs/cookbooks/creating_a_derived_dataset.md

Co-authored-by: Alexander <anicholson@mozilla.com>

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Inject some errors to fenix metrics_clients_last_seen_v1 to test observability tooling (#5631)

* feat: add kpi_support_metrics SQL generator (#5594)

* feat: add kpi_support_metrics SQL generator

* feat: replace baseline_clients_last_seen_activity_extended with active_users

* feat: add template to union all views inside telemetry namespace

* feat: remove retention queries for firefox_ios and fenix as now they are generated via kpi_support_metrics generator

* bug: correct app_name based logic used by templating

* bug: fix formatting inside kpi_support_metrics templates

* feat: implement feedback from PR by removing mozilla online logic as no longer needed and correcting app_name setting

* feat: fix active_users union mismatch

* feat: remove scheduling for now, will be added in the next PR.

* feat: make sure we user product_specific_attribution_fields additional field list instead of app_name logic in templates (#5633)

* delete new pageload experiments table (#5634)

* Use information_schema to find experiment tables for shredder (#5635)

* Update retention of search terms sanitized datasets (#5577)

* Be explicit about columns in clients_yearly (#5636)

* Add distribution_id and geo_subdivision to clients_yearly schema (#5637)

* Added two new fields to col list (#5639)

* feat: add backfill.yaml to org_mozilla_fennec_aurora_derived.baseline_clients_daily_v1 (#5632)

* Add flipkart and home_depot views (#5638)

* Add flipkart and home_depot views

* remove manually created views. use the user_facing view syndication instead

* fix yaml lint

* DENG-3288 add Focus iOS and Focus Android and switch to use new views instead of extended views (#5641)

* DENG-3288 Add "is_mobile" to mobile_engagement_clients_v1 and mobile_engagement_v1 (#5648)

* Update `bqetl_subplat` DAG description with new expected Stripe report error message. (#5642)

* feat: add schema files to baseline_clients_daily_v1 and baseline_clients_last_seen_v1 (#5646)

* fix: add fields: map to baseline_daily and baseline_last_seen schemas (#5651)

* feat: update kpi_support_metrics generation to append platform to the unioned view name (#5644)

Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>

* fix(backfills): switch to concurrent.futures to improve debuggability (#5653)

* fix(backfills): followup to concurrent.futures, raise Error if any failed dates (#5654)

* Reprocess past two days in jobs_by_organization_v1 (#5652)

* feat: rename baseline_last_seen references to active_users and add "platform" to only include platform specific fields when it matches (#5645)

* fix(backfills): switch from process to thread to avoid pickling bigquery object (#5656)

* feat: rename kpi_support_metrics to mobile_kpi_support_metrics (#5657)

* [RS-1246] Fix DDG DAU in search_revenue_levers_daily (#5655)

* Fix search_revenue_levers_daily

* Query formatting

---------

Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>

* DENG-2975 active_users (#5478)

* Add is_desktop to engagement and retention tables (#5658)

* add is_desktop to engagement and retention tables

* replace clients_last_seen by active_users

* format

* remove unused normalized_channel

* [DENG-3889] Rewrite events_stream_v1 for better performance (#5659)

* Add more null channels for metrics_clients_last_seen test (#5662)

* chore(glam): accommodate for dag reschedule (#5664)

* [DENG-3889] Replace events_stream temp udfs with inline usages (#5666)

* Add telemetry_derived.latest_version to data-observability-dev (#5663)

* [RS-834] Add country to search_revenue_levers_daily (#4739)

* Add country

* change channel spellings so match revenue table

* update country list

* Reformat

* Update desktop DAU definition

Can't update mobile DAU by engine breakouts w/o changes to mobile_search_clients_daily. But we also don't need that option since default engine is currently buggy and searchers presumably have > 0 active seconds

* Fix sql format

---------

Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>

* Fix PyPI package publishing in CI (#5669)

* Fix PyPI package publishing in CI

* Update .circleci/workflows.yml

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>

---------

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>

* Use ad group ID to get campaign info (#5670)

* feat: move mobile engagement logic into mobile kpi support metrics generator (#5661)

* feat: move mobile_engagement SQL logic into mobile_kpi_support_metrics generator

* feat: remove mobile_engagement queries from telemetry and telemetry_derived

* feat: apply formatting to engagement_clients view and add checks template

* feat: add engagement view template and add engagement templates to the generation logic

* bug: fix engagement query errors

* feat: sneak in removing of retention views inside telemetry (mobile kpi generator will overwrite these anyways)

* feat: remove bqetl_mobile_engagement_model DAG from DAG list as no longer needed.

* feat: fix an engagement_v1 query template and remove an unnecessary comment from the metadata template

* fix: update engagement templates to ensure valid query is rendered

* [DENG-3905] Support queries with temp udfs when billing project is set (#5668)

* DENG-3237: Added deletion dates to deprecated tables (#5660)

* Added deletion dates

* fix: metadata.yaml generation change to allow backfill command execution (#5671)

* RS-788 Add support for organic searches with ads to the mobile search counts tables (#5598)

* adding organic searches with ads to this table

* updating mobile_search_aggregates table with search_with_ads_organic column

* updating the search revenue lever table

- include search_with_ads_organic columns for Bing, Google and DDG

* Fix CI issues

* Fix tests CI failure

* fix tests

* Fix test sql failure

* Update query.sql

reverting back to original code for search_revenue_levers table

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Update desktop active_users view to make dimensions available for KPIs. (#5675)

* Use desktop active_users view to make dimensions available for KPIs.

* Naming adjustment

* Fix reference to app name.

* Use active_users instead of baseline_clients_last_seen and improve retrieving metrics ping's data.

* Formatting.

* Get distribution_id for Fenix.

* Get distribution_id for Fenix.

* Format correction

* Skip unavailable data, because it's not KPI relevant.

* Formatting

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: Teon L Brooks <teon.brooks@gmail.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
Co-authored-by: whd <whd@users.noreply.github.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
Co-authored-by: wil stuckey <wstuckey@mozilla.com>
Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
Co-authored-by: skahmann3 <16420065+skahmann3@users.noreply.github.com>
Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>
Co-authored-by: Vanessa Sabino <vsabino@mozilla.com>
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>
Co-authored-by: Preethi Issac <102384487+pissac17@users.noreply.github.com>
lucia-vargas-a added a commit that referenced this pull request Jun 6, 2024
* Changes to active_users_aggregates.

* GROWTH-147 iOS LTV (#5400)

* Adding a new table (moz-fx-data-shared-prod.firefox_ios_derived.ltv_states_v1) & corresponding view (moz-fx-data-shared-prod.firefox_ios.ltv_states)

* GROWTH-147 backfill (#5407)

* DENG-3462 create view on top of clients_last_seen_v2 with new boolean… (#5406)

* DENG-3462 create view on top of clients_last_seen_v2 with new boolean columns is_dau, is_wau, etc

* DENG-3462 remove is_mobile column

* Backfill new `newtab_clients_daily_v1` table (#5408)

* Backfill new table

* update end_date to incorporate new fields

* GROWTH-147 complete backfill of firefox_ios_derived.ltv_states_v1 (#5409)

* Google ads etl (#5393)

* Add UDFs for extracting campaign info

* Add campaigns table

* Add ad_groups table

* Add schemas

* Uncomment UDFs

* Format sql files

* Reformat UDFs

* Use common most_recent_rec terminology

* Replace 3 UDFs with one

* reformat & fix tests

* Reformat again

* App campaign stats (#5413)

* Extract campaign_id and ad_group_id in fac

* Add Android App Campaign Stats table

* Dont rely on fac view

* Reformat

* DENG-2621: Add pageload derived tables (#5359)

* Added nightly, experiments and 1pct tables

* DENG-3235: Added deletion date field to bigquery_tables_inventory_v1 (#5386)

* Added deletion date field

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>

* DENG-3471 suppression list from campaign monitor (#5412)

* DENG-3471 suppression list from campaign monitor

* Update sql/moz-fx-data-shared-prod/marketing_suppression_list_external/campaign_monitor_suppression_list_v1/metadata.yaml

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

---------

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

* Complete backfill (#5414)

* DENG-3008 Updating Braze Models (#5405)

* updating create statement

* joining on users table to filter for active and ensuring there is at least one subscription

* joining on users to filter for active

* adding dev subscription group

* removing fxa_id in favor of has_fxa

* bringing in update timestamp for downstream use

* updating formatting and adding filter for active users

* adding filter for one active newsletter

* updating tests

* adding fxa id back to users table to join to products

* updating query

* updating values

* updating tests

* fix test for subscriptions

* changing schema to array

* updating format

* updating to pull in all subscriptions with statuses

* removing create statement

* updating subscriptions query to make it an array and updating associated tests

* updating formatting and comment

---------

Co-authored-by: Leli Schiestl <lschiestl@mozilla.com>

* Deng 3471 suppression list out of campaign monitor api (#5416)

* DENG-3471 suppression list from campaign monitor

* Update sql/moz-fx-data-shared-prod/marketing_suppression_list_external/campaign_monitor_suppression_list_v1/metadata.yaml

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

* after seans coder review

---------

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

* fixing the readme (#5418)

* DENG-3511 update funnel_derived_clients to remove channel to match updated private bqetl (#5419)

* DENG-3511 update views to use updated udfs (#5420)

* DENG-3511 remove channel from input since no longer required for updated UDF

* DENG-3511 remove channel from input since no longer required for updated UDF

* DENG-3186 new desktop engagement model (#5415)

* DENG-3186 new desktop engagement model

* DENG-3186 add distribution ID to eng client V1 query

* DENG-3186 - add distribution ID to the schema.yaml for eng client

* DENG-3186 add distribution ID to the agg table

* DENG-3186 adding locale to both client & agg tables

* DENG-3186 add app_version to client & agg tables

* DENG-3168 move lifecycle stage from engagement agg table to engagement agg view

* DENG-3186 - add new desktop_engagement_client_v1 table to shredder

* DENG-3186 update formatting of config.py

* DENG-3186 switch to get country, os, os version, and build # from clients last seen and remove join to clients daily

* DENG-3168 backfill new desktop_engagement_client_v1 table (#5422)

* Update Glean generators checks. (#5323)

* Update Glean generators checks.

* Remove new_profiles check, not required.

* warn or fail depending on range

* Set check to fail if clients are missing, which is an ETL issue.

* Checks fail when difference is higher than 1%.

* Checks fail when difference is higher than 1%.

* Make checks fail on a smaller difference.

* Deng 3410 main suppression list (#5423)

* DENG-3471 suppression list from campaign monitor

* status quo

* DENG-3410 Main Suppression List

* adding metadata

* DENG-3125 Update clients last seen v2 (#5617)

Explicitly adding column order to schema.yaml and query, and will also swap out with newly backfilled table

* Deng 3763 increase schedule frequency (#5590)

* changing to 8 hours to align with schedules

* updating schedule

* updating schedule and aligning checks

* updating error text

* DENG-3462 - update clients last seen to use clients_last_seen_v2 instead of v1 (#5447)

Co-authored-by: Brad Ochocki Szasz <brad.ochocki@gmail.com>

* Revert "RS-788 Add support for organic searches with ads to the mobile search…" (#5676)

This reverts commit 9814059.

* Include VPN iOS network extension app in VPN events ETLs. (#5677)

* feat: remove backfill.yaml for org_mozilla_fennec_aurora_derived/baseline_clients_daily_v1 as it was not used (#5673)

* Normalizing campaign names (#5681)

* Normalizing campaign names

Converted all the campaign names to upper case to prevent us from seeing situations like eu and EU in our segments

* Fixed the placement of the UPPER function

* removing acoustic last engaged timestamp from models and tests (#5678)

* removing acoustic last engaged timestamp

* removing suppression list model and tests

* removing dependency until frequency is increased

* formatting

* formatting

* Add isp name to desktop_active_users (#5686)

* feat: update firefox_android_clients_v1 to pull distribution_id only from the baseline ping (#5685)

* feat: update firefox_android_clients_v1 to pull distribution_id only from the baseline ping

* feat: update firefox_android_clients_v1 baseline test schema to include distribution_id

* fix: resolve distribution_id not in baseline error

---------

Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>

* adding search_with_ads_organic to mobile_clients_daily table (#5683)

* adding search_with_ads_organic to mobile_clients_daily table

* Fix the CI tests

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Fix setting partitioning metadata (#5690)

* DENG-3186 start managed backfill of desktop_engagement_clients_v1 (#5691)

* chore: refactor schema deploys, add and use deploy utils (#5674)

* chore: refactor schema deploys, add and use utils

* Update tests

* Add deploy tests

* Use string representation of table object in log statements

* removing upstream dependencies for now (#5688)

* removing dependencies

* adding comment

* feat: add backfill.yaml to fenix and firefox_ios retention_v1 query (#5679)

* DENG-3186 complete managed backfill of destkop_engagement_clients_v1 (#5695)

* Reverted EXPANSION to Expansion (#5698)

* fix: update entry date to be the same as the end_date (#5699)

* fix(deploy): skip (instead of fail) deploys with explicitly null destination_table (#5700)

* Revert "adding search_with_ads_organic to mobile_clients_daily table (#5683)" (#5701)

This reverts commit 71bbaa2.

* DENG-3186 initiate managed backfill for desktop_engagement_v1 (#5694)

* feat: NULL empty strings in attribution fields when generating retention_clients and engagement_clients results (#5680)

* feat: NULL empty strings in attribution fields when generating retention_clients and engagement_clients results

* fix: fix engagement and retention clients views inside kpi_mobile generator

* Add registration funnels config using legacy events (#5696)

* Add registration funnels using legacy events

* Update registration_funnels_legacy_events.toml

---------

Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Deng 3763 increment changed tables (#5704)

* changing sync tables to incremental

* removing checks no longer needed

* adding timestamp to payload and using it in changed query

* formatting sql

* deleting changed subscriptions

* updating firefox subscriptions query

* updating comments

* DENG-3186 complete managed backfill of desktop engagement v1 (#5705)

* updating labels (#5706)

* removed backfill yaml (#5707)

* Fix mobile_search_clients_daily (#5702)

* Update mobile_search_clients_daily tempate

* Update mobile_search_clients_daily tests

* adding builders application newsletter to map (#5708)

* feat: moving 2024-05-31 fenix_derived.retention_v1 backfill entry to Complete (#5709)

* Clean up views that are no longer needed (#5713)

* feat: make changes to generate retention and engagement for focus and klar products also (#5710)

* add scheduling definition into parameter to fix error where data not … (#5716)

* add scheduling definition into parameter to fix error where data not going into the proper partition

* remove {raw} around date as not using sql generator here

* add backfill.yaml, initiate backfill for desktop_retention_clients_v1 (#5718)

* updating metadata for incremental runs (#5717)

* updating metadata for incremental runs

* changing back to day from hour

* missed one

* updating to remove partition parameter and add tags (#5719)

* The app_name is Firefox Desktop in all cases. (#5720)

* complete managed backfill for desktop-client-retetion-v1 (#5724)

* add backfill.yaml for managed backfill of desktop_retention_v1 aggregation table (#5727)

* feat: add bqetl_mobile_kpi_metrics DAG and schedule mobile_kpi_metrics using it (#5715)

* feat: add backfill.yaml for firefox_ios_derived.retention_v1 (#5728)

* adding search with ads organic to mobile_search_aggregates and search_revenue_levers_daily table (#5682)

* updates to mobile_search_aggregates and search_revenue_levers_table

adding search_with_ads_organic to mobile_search_aggregates and search_revenue_levers_table

* Update query.sql

* Fix schema.yaml and test files

* Fix CI issue

Fix CI issue

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Exclude `accounts_cirrus` Glean app from baseline ping checks as it has no baseline pings (bug 1888718). (#5726)

Cirrus apps generally don't have baseline pings.

This should resolve bug 1888718 "Airflow task bqetl_glean_usage.checks__warn_accounts_cirrus_derived__baseline_clients_last_seen__v1 failing since exec_date 2024-03-23".

* feat: update bqetl_mobile_kpi_metrics DAG start date (#5731)

* feat: update status of firefox_ios_derived.retention_v1 to Complete (#5730)

* change status from Intitiate to Complete (#5732)

* Remove `external_downstream_tasks` metadata from `unified_metrics_v1`. (#5734)

The `kpi_forecasting` DAG doesn't currently have a `wait_for_unified_metrics` task.

* Add additional fields to data observability table to test schema changes (#5735)

* Added dataset id (#5721)

* GROWTH-143 Create new desktop conversion event table (#5733)

* fix(geckoview_version): Replace geckoview.version field with valid gecko.version (#5736)

* removing uniqueness check from sync models (#5739)

* Backfill geckoview_version_v1 (#5738)

* Backfill geckoview_version_v1

* Add schema.yaml to geckoview_version_v1

* RS_1233 add is_enterprise_policies to search_clients_daily_v8 and search_aggregates (#5714)

* RS_1233_Add payload.processes.parent.scalars.policies_is_enterprise to search aggregates and search clients daily table

Add ingpayload.processes.parent.scalars.policies_is_enterprise to
- search_derived/search_aggregates
- search_derived/searc_ clients_daily_v8

* Update query.sql

* update to schema.yaml

* Fix CI issues

* Fix the tests

Fix tests issue

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Hardcode normalized_os as iOS in event_aggregates (#5725)

* GROWTH-143 fix check to use required partition by column and check al… (#5743)

* Add channel to schema_error_counts_v2 (#5744)

* Complete backfill of geckoview_versions (#5742)

* Fix `format_timedelta` function's parsing of negative timedeltas (#5740)

* Fix `format_timedelta` function's parsing of negative timedeltas.

The entire timedelta can be negative.

* Refactor to use a single timedelta regular expression.

* Fix typo in `format_timedelta` function argument.

* Add deprecation label to sql generators. (#5746)

* Deprecate active_users_aggregates_attribution (#5745)

* The app_name is Firefox Desktop in all cases.

* Formatting

* Formatting

* Example config files for BigEye and Monte Carlo (#5737)

* Add BigConfig example

* Add monte carlo config

* Allow multiple monte carlo configs

* Update schema for schema_error_counts_v2 (#5748)

* Deng 3998 deprecate active users aggregates deletion requests (#5747)

* Add deprecation label to sql generators.

* Clarification to not delete the table.

* Update active_users_aggregates query for Desktop (#5607)

* Update active_users_aggregates query for Desktop to get app_name and growth metrics from clients_last_seen_v2, replace language_name with locale and remove search metrics based on sprint decision (see DENG-1989).

* Use active_users view.

* Naming adjustment

---------

Co-authored-by: Brad Ochocki Szasz <brad.ochocki@gmail.com>

* Changes to active_users_aggregates query for Mobile (#5396)

* Changes to active_users_aggregates.

* Remove days_since_seen and calculate activity segment.

* Remove is_core_active not required.

* Remove days_seen_bits from mobile queries not required.

* Adjust mobile query to use the baseline_clients_last_seen views.

* Update Focus Android query to use the baseline_clients_last_seen view.

* Get activity segments from the views instead of hard-coding it. Query attribution only when necessary to improve performance. Delay update by 1 day to get metrics ping's data.

* Get app_name from clients_last_seen_v2 view.

* Set activity_segment as NULL for legacy Focus Android.

* Formatting

* Remove first_seen_date and correct column name in desktop_query.sql

* Update active_users_aggregates query for Desktop to get app_name and growth metrics from clients_last_seen_v2, replace language_name with locale and remove search metrics based on sprint decision (see DENG-1989).

* Update active_users_aggregates query for Mobile to remove search metrics based on sprint decision (see DENG-1989) and calculate the min metrics ping received between the current and next date, given that these pings can arrive in the same or next date as the baseline ping.

* Revert changes to Desktop queries in this PR, as this implementation is separated to [PR-5607](#5607).

* Space

* Formatting.

* Change name of activity_segment to segment.

* Fix reference to app name.

* Update app_name in checks for Focus

* Use active_users instead of baseline_clients_last_seen and improve retrieving metrics ping's data.

* Formatting.

* Get distribution_id for Fenix.

* Get distribution_id for Fenix.

---------

Co-authored-by: Brad Ochocki Szasz <brad.ochocki@gmail.com>

* Create klar_android.active users aggregates (#5624)

* Changes to active_users_aggregates.

* Remove days_since_seen and calculate activity segment.

* Remove is_core_active not required.

* Remove days_seen_bits from mobile queries not required.

* Adjust mobile query to use the baseline_clients_last_seen views.

* Update Focus Android query to use the baseline_clients_last_seen view.

* Get activity segments from the views instead of hard-coding it. Query attribution only when necessary to improve performance. Delay update by 1 day to get metrics ping's data.

* Get app_name from clients_last_seen_v2 view.

* Set activity_segment as NULL for legacy Focus Android.

* Formatting

* Remove first_seen_date and correct column name in desktop_query.sql

* Update active_users_aggregates query for Desktop to get app_name and growth metrics from clients_last_seen_v2, replace language_name with locale and remove search metrics based on sprint decision (see DENG-1989).

* Update active_users_aggregates query for Mobile to remove search metrics based on sprint decision (see DENG-1989) and calculate the min metrics ping received between the current and next date, given that these pings can arrive in the same or next date as the baseline ping.

* Revert changes to Desktop queries in this PR, as this implementation is separated to [PR-5607](#5607).

* Space

* Formatting.

* Change name of activity_segment to segment.

* Add Klar Android query and join in the Mobile view.

* Fix reference to app name.

* Update app_name in checks for Focus

* Add klar android to the checks and update app_name in checks for Focus.

* Formatting.

* Use active_users instead of baseline_clients_last_seen and improve retrieving metrics ping's data.

* Formatting.

* --- (#5623)

updated-dependencies:
- dependency-name: requests
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update backfill.yaml (#5629)

* DENG-3719: Allow setting billing project for managed backfills (#5605)

* Added default billing project and param

* DENG-3851 Updated managed backfill docs with slack channel for notifications (#5630)

* Updated docs

* Update docs/cookbooks/creating_a_derived_dataset.md

Co-authored-by: Alexander <anicholson@mozilla.com>

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Inject some errors to fenix metrics_clients_last_seen_v1 to test observability tooling (#5631)

* feat: add kpi_support_metrics SQL generator (#5594)

* feat: add kpi_support_metrics SQL generator

* feat: replace baseline_clients_last_seen_activity_extended with active_users

* feat: add template to union all views inside telemetry namespace

* feat: remove retention queries for firefox_ios and fenix as now they are generated via kpi_support_metrics generator

* bug: correct app_name based logic used by templating

* bug: fix formatting inside kpi_support_metrics templates

* feat: implement feedback from PR by removing mozilla online logic as no longer needed and correcting app_name setting

* feat: fix active_users union mismatch

* feat: remove scheduling for now, will be added in the next PR.

* feat: make sure we use the product_specific_attribution_fields additional field list instead of app_name logic in templates (#5633)

* delete new pageload experiments table (#5634)

* Use information_schema to find experiment tables for shredder (#5635)

* Update retention of search terms sanitized datasets (#5577)

* Be explicit about columns in clients_yearly (#5636)

* Add distribution_id and geo_subdivision to clients_yearly schema (#5637)

* Added two new fields to col list (#5639)

* feat: add backfill.yaml to org_mozilla_fennec_aurora_derived.baseline_clients_daily_v1 (#5632)

* Add flipkart and home_depot views (#5638)

* Add flipkart and home_depot views

* remove manually created views. use the user_facing view syndication instead

* fix yaml lint

* DENG-3288 add Focus iOS and Focus Android and switch to use new views instead of extended views (#5641)

* DENG-3288 Add "is_mobile" to mobile_engagement_clients_v1 and mobile_engagement_v1 (#5648)

* Update `bqetl_subplat` DAG description with new expected Stripe report error message. (#5642)

* feat: add schema files to baseline_clients_daily_v1 and baseline_clients_last_seen_v1 (#5646)

* fix: add fields: map to baseline_daily and baseline_last_seen schemas (#5651)

* feat: update kpi_support_metrics generation to append platform to the unioned view name (#5644)

Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>

* fix(backfills): switch to concurrent.futures to improve debuggability (#5653)

* fix(backfills): followup to concurrent.futures, raise Error if any failed dates (#5654)

* Reprocess past two days in jobs_by_organization_v1 (#5652)

* feat: rename baseline_last_seen references to active_users and add "platform" to only include platform specific fields when it matches (#5645)

* fix(backfills): switch from process to thread to avoid pickling bigquery object (#5656)

* feat: rename kpi_support_metrics to mobile_kpi_support_metrics (#5657)

* [RS-1246] Fix DDG DAU in search_revenue_levers_daily (#5655)

* Fix search_revenue_levers_daily

* Query formatting

---------

Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>

* DENG-2975 active_users (#5478)

* Add is_desktop to engagement and retention tables (#5658)

* add is_desktop to engagement and retention tables

* replace clients_last_seen by active_users

* format

* remove unused normalized_channel

* [DENG-3889] Rewrite events_stream_v1 for better performance (#5659)

* Add more null channels for metrics_clients_last_seen test (#5662)

* chore(glam): accommodate for dag reschedule (#5664)

* [DENG-3889] Replace events_stream temp udfs with inline usages (#5666)

* Add telemetry_derived.latest_version to data-observability-dev (#5663)

* [RS-834] Add country to search_revenue_levers_daily (#4739)

* Add country

* change channel spellings so match revenue table

* update country list

* Reformat

* Update desktop DAU definition

Can't update mobile DAU by engine breakouts w/o changes to mobile_search_clients_daily. But we also don't need that option since default engine is currently buggy and searchers presumably have > 0 active seconds

* Fix sql format

---------

Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>

* Fix PyPI package publishing in CI (#5669)

* Fix PyPI package publishing in CI

* Update .circleci/workflows.yml

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>

---------

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>

* Use ad group ID to get campaign info (#5670)

* feat: move mobile engagement logic into mobile kpi support metrics generator (#5661)

* feat: move mobile_engagement SQL logic into mobile_kpi_support_metrics generator

* feat: remove mobile_engagement queries from telemetry and telemetry_derived

* feat: apply formatting to engagement_clients view and add checks template

* feat: add engagement view template and add engagement templates to the generation logic

* bug: fix engagement query errors

* feat: sneak in removing of retention views inside telemetry (mobile kpi generator will overwrite these anyways)

* feat: remove bqetl_mobile_engagement_model DAG from DAG list as no longer needed.

* feat: fix an engagement_v1 query template and remove an unnecessary comment from the metadata template

* fix: update engagement templates to ensure valid query is rendered

* [DENG-3905] Support queries with temp udfs when billing project is set (#5668)

* DENG-3237: Added deletion dates to deprecated tables (#5660)

* Added deletion dates

* fix: metadata.yaml generation change to allow backfill command execution (#5671)

* RS-788 Add support for organic searches with ads to the mobile search counts tables (#5598)

* adding organic searches with ads to this table

* updating mobile_search_aggregates table with search_with_ads_organic column

* updating the search revenue lever table

- include search_with_ads_organic columns for Bing, Google and DDG

* Fix CI issues

* Fix tests CI failure

* fix tests

* Fix test sql failure

* Update query.sql

reverting to the original code for the search_revenue_levers table

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Update desktop active_users view to make dimensions available for KPIs. (#5675)

* Use desktop active_users view to make dimensions available for KPIs.

* Naming adjustment

* Fix reference to app name.

* Use active_users instead of baseline_clients_last_seen and improve retrieval of metrics ping data.

* Formatting.

* Get distribution_id for Fenix.

* Get distribution_id for Fenix.

* Format correction

* Skip unavailable data, because it's not KPI relevant.

* Formatting

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: Teon L Brooks <teon.brooks@gmail.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
Co-authored-by: whd <whd@users.noreply.github.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
Co-authored-by: wil stuckey <wstuckey@mozilla.com>
Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
Co-authored-by: skahmann3 <16420065+skahmann3@users.noreply.github.com>
Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>
Co-authored-by: Vanessa Sabino <vsabino@mozilla.com>
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>
Co-authored-by: Preethi Issac <102384487+pissac17@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
Co-authored-by: m-d-bowerman <107562575+m-d-bowerman@users.noreply.github.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
Co-authored-by: Leli Schiestl <lschiestl@mozilla.com>
Co-authored-by: Brad Ochocki Szasz <brad.ochocki@gmail.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: richard baffour <baffour345@gmail.com>
Co-authored-by: Preethi Issac <102384487+pissac17@users.noreply.github.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: ksiegler1 <ksiegler@mozilla.com>
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Teon L Brooks <teon.brooks@gmail.com>
Co-authored-by: whd <whd@users.noreply.github.com>
Co-authored-by: wil stuckey <wstuckey@mozilla.com>
Co-authored-by: skahmann3 <16420065+skahmann3@users.noreply.github.com>
Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>
Co-authored-by: Vanessa Sabino <vsabino@mozilla.com>
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>