Add support for auto generating metrics #1283

mknowlton89 · 2023-05-12T20:11:02Z

Features and Changes

This PR introduces logic to automatically generate metrics for our various event trackers. In its current state, it supports only Segment and Rudderstack. Once I get some initial feedback on the overall structure of the PR, I'll expand it to include additional event trackers, specifically GA4, Firebase, and Amplitude, with more to follow in separate PRs.

To do this, when an organization adds a data source via one of the event trackers supported above, they'll see an option on the NewDataSourceForm to look up which metrics we can generate for them automatically. I built a method within SqlIntegration that adds a supportsAutoGeneratedMetrics property to the integration's getDataSourceProperties, so we can easily determine on both the front end and the back end whether a particular data source supports auto-generating metrics.

isAutoGeneratingMetricsSupported(): boolean {
  const supportedEventTrackers: SchemaFormat[] = ["segment", "rudderstack"];

  if (
    this.settings.schemaFormat &&
    supportedEventTrackers.includes(this.settings.schemaFormat)
  ) {
    return true;
  }
  return false;
}

The lookup hits a new endpoint, /datasource/:datasourceId/auto-metrics, which looks up the unique events tracked by the event tracker. This is possible because each event tracker has a table that lists all of the events it tracks. For Segment and Rudderstack, this is the tracks table.
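To illustrate the lookup, here is a minimal sketch of how a query for the unique tracked events might be built. The helper name buildTrackedEventsQuery, the options shape, and the column names (event, timestamp) are assumptions for illustration and not the PR's actual implementation; only the tracks table name comes from the description above.

```typescript
// Hypothetical helper, not part of the PR. Column names are assumptions.
interface TrackedEventsQueryOptions {
  /** Schema or dataset that holds the event tracker's tables, e.g. "analytics". */
  schemaPath: string;
  /** Table listing the tracked events; "tracks" for Segment and Rudderstack. */
  eventsTable?: string;
}

// Builds a query that returns one row per distinct event, with a row count
// and the most recent time the event was seen.
function buildTrackedEventsQuery(options: TrackedEventsQueryOptions): string {
  const { schemaPath, eventsTable = "tracks" } = options;
  return [
    "SELECT",
    "  event,",
    "  COUNT(*) AS count,",
    "  MAX(timestamp) AS last_tracked_at",
    `FROM ${schemaPath}.${eventsTable}`,
    "GROUP BY event",
  ].join("\n");
}

const exampleTrackedEventsQuery = buildTrackedEventsQuery({
  schemaPath: "analytics",
});
console.log(exampleTrackedEventsQuery);
```

In practice the FROM clause would differ per data warehouse, which is why this sketch takes the schema path as a parameter rather than hardcoding it.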

The endpoint returns a list of events that we think we can generate metrics for, along with SQL queries for the count and binomial metrics derived from the tracked events.
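As a rough sketch of what those derived queries could look like, the snippet below builds binomial and count SQL for a single event. The function names, the user_id and timestamp columns, and the convention that a count metric selects a constant value of 1 per row are assumptions made for illustration; the PR's real query builder may differ.

```typescript
// Hypothetical sketch, not the PR's actual query builder.
type AutoMetricType = "binomial" | "count";

// Doubles single quotes so the event name is safe inside a SQL string literal.
function escapeSqlLiteral(value: string): string {
  return value.replace(/'/g, "''");
}

function buildAutoMetricSql(
  tablePath: string,
  event: string,
  type: AutoMetricType
): string {
  const columns = ["  user_id", "  timestamp"];
  // Assumption: a count metric emits a value of 1 per event row so rows can
  // be summed, while a binomial metric only needs the matching rows.
  if (type === "count") {
    columns.push("  1 AS value");
  }
  return [
    "SELECT",
    columns.join(",\n"),
    `FROM ${tablePath}`,
    `WHERE event = '${escapeSqlLiteral(event)}'`,
  ].join("\n");
}

console.log(buildAutoMetricSql("analytics.tracks", "order_placed", "count"));
```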

In the resulting table, the user can preview the underlying SQL that would eventually power these metrics and opt to create binomial and count metrics for each tracked event. I've added some tooltips throughout the table to provide additional context.

Then, when the user clicks "Save", if they've indicated (via the toggle state) that they want us to create any metrics for them, we'll kick off an async job (createAutoGeneratedMetrics) that creates those metrics.

There is also some logic to pluralize the event names (e.g., for an Order Placed event via Segment, the associated count metric would be Orders Placed).

This is currently handled via a pretty simple map. Eventually, we could leverage ChatGPT for this. If the map has no pluralized version for an event, it simply returns a displayName of Count of {event}.

The seed for the pluralization map came from Segment's documentation: https://segment.com/docs/connections/spec/ecommerce/v2/. This map will likely be a bit of a living document as we expand support.
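The map-with-fallback approach could be sketched roughly as follows. The names PLURALIZED_EVENT_NAMES and getCountDisplayName are hypothetical, and apart from the Order Placed example the entries are illustrative rather than taken from the PR's actual map.

```typescript
// Hypothetical sketch of the pluralization map; entries other than
// "Order Placed" are illustrative assumptions.
const PLURALIZED_EVENT_NAMES = new Map<string, string>([
  ["Order Placed", "Orders Placed"],
  ["Product Viewed", "Products Viewed"],
  ["Product Added", "Products Added"],
]);

// Returns the pluralized display name for a count metric, falling back to
// "Count of {event}" when the map has no entry for the event.
function getCountDisplayName(event: string): string {
  const pluralized = PLURALIZED_EVENT_NAMES.get(event);
  return pluralized !== undefined ? pluralized : `Count of ${event}`;
}

console.log(getCountDisplayName("Order Placed"));
console.log(getCountDisplayName("Signed Up"));
```

A Map is used here instead of a plain object so that event names such as "constructor" cannot accidentally match inherited properties.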

Testing

Set up a Segment destination with BigQuery, add that data source to GrowthBook, and confirm that all tracked events are displayed and created correctly, with the correct underlying SQL query.
Ensure the above works for revenue, count, and binomial metrics.

Screenshots

github-actions · 2023-05-12T20:18:48Z

Your preview environment pr-1283-bttf has been deployed.

packages/back-end/src/models/MetricModel.ts

packages/back-end/src/integrations/SqlIntegration.ts

packages/back-end/types/datasource.d.ts

jdorn · 2023-06-05T19:46:07Z

packages/back-end/src/types/Integration.ts

+  binomialSqlQuery: string;
+  countSqlQuery: string;
+  countDisplayName: string;
+};


This data structure is a little weird. It's mixing form state (e.g. createBinomialFromEvent) with data from the query (e.g. lastTrackedAt). Seems like that should be 2 separate data structures.

Updated this to be a bit more future proof and to separate the concerns.

Currently, this data takes the following shape:

export type TrackedEventData = {
  event: string;
  displayName: string;
  count: number;
  hasUserId: boolean;
  lastTrackedAt: Date;
  metricsToCreate: {
    name: string;
    sql: string;
    type: MetricType;
    shouldCreate?: boolean;
  }[];
};
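With form state (shouldCreate) now nested under metricsToCreate, picking out the metrics the user opted into becomes a simple filter. The sketch below assumes MetricType is a union of "binomial", "count", "duration", and "revenue", and the helper name selectMetricsToCreate is hypothetical; it is meant only to show how the shape above might be consumed before kicking off the async job.

```typescript
// Assumption: MetricType is a string union like this one.
type MetricType = "binomial" | "count" | "duration" | "revenue";

type TrackedEventData = {
  event: string;
  displayName: string;
  count: number;
  hasUserId: boolean;
  lastTrackedAt: Date;
  metricsToCreate: {
    name: string;
    sql: string;
    type: MetricType;
    shouldCreate?: boolean;
  }[];
};

type MetricToCreate = TrackedEventData["metricsToCreate"][number];

// Hypothetical helper: collects only the metrics the user toggled on.
function selectMetricsToCreate(events: TrackedEventData[]): MetricToCreate[] {
  const selected: MetricToCreate[] = [];
  for (const trackedEvent of events) {
    for (const metric of trackedEvent.metricsToCreate) {
      // A missing shouldCreate is treated the same as false.
      if (metric.shouldCreate === true) {
        selected.push(metric);
      }
    }
  }
  return selected;
}
```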

jdorn · 2023-06-05T19:48:50Z

packages/back-end/src/integrations/Athena.ts

-  getInformationSchemaFromClause(): string {
+  getInformationSchemaTableFromClause(databaseName: string): string {
+    return `${databaseName}.information_schema.columns`;
+  }


This method also feels duplicative with the new generateTableName method.

@jdorn Just pushed up a big refactor of generateTableName that handles the logic here as well, so we've now gotten rid of the getInformationSchemaTableFromClause method.

All of the changes were done in this commit.

packages/back-end/src/util/autoGeneratedMetrics.ts

packages/back-end/src/integrations/Redshift.ts

POC for auto-generating metrics from Segment/BigQuery

8b2b643

mknowlton89 added 14 commits May 16, 2023 09:07

Merge branch 'main' into mk/auto-metrics

e24e9f0

Adds some logic to build the metric sql query and only builds those the user has indicated they want us to create for them.

49734e1

Adds logic for schemaFormat to customize timestamp column, and adds very messy logic to customize metric type.

81cab6f

Adds basic support for count and revenue metric types. Still not sure how to go about duration types, atleast with Segment.

7538361

Merge branch 'main' into mk/auto-metrics

71356da

Adds some flexibility to the SQl queries to handle different table names for the tracked events, along with different event column names based on the schemaFormat.

6d4be65

Overall cleanup and renaming of methods.

75aba06

Removing unused import.

3503ef4

Updates the front end to consume the new supportsAutoGeneratedMetrics integration property.

4935fa3

Expanding support for all data warehouses for Segment & Rudderstack - not fully tested yet.

eff2b94

Updates the mssql integration to include an optional default schema input as that is required for us to know which schema to query when looking for the tracked events coming from Rudderstack. Also updating the Snowflake from clause for the query to use the schema.

7753358

Refactors the code a bit to use the same method to get the from clause for both sql queries relating to the auto metrics - the query to get a list of tracked events and the query for the actual metric.

5eeed8f

Improves error handling

6ccfdfe

Resets controller logic back after testing, and renames helper method.

d10ef04

mknowlton89 requested a review from jdorn May 22, 2023 10:40

mknowlton89 self-assigned this May 22, 2023

mknowlton89 requested a review from a team May 22, 2023 10:40

mknowlton89 added 11 commits May 24, 2023 06:49

Refactored a bit based on Jeremy's feedback, needs a lot of cleanup, but committing in, instead of stashing it while I go bash a bug.

145b9cd

Merge branch 'main' into mk/auto-metrics

e2dd627

Starts scaffolding out how we can build additional metrics when retreiving the list of tracked events.

08c37e1

Adds better typing and refactors logic to simplify some things, while also adding hooks to generate additional metrics from single tracked event. Very narrow in scope for now, but getting the building blocks in place.

63d9788

Includes a big refactor that simplifies things a lot - we're now getting all tracked events, and giving the user the option to create a binomial and/or count metric for each event.

5a161ec

Improves the front end experience.

c99eccc

Introduces a method to build a count metric name that is pluralized, and adds polish to the front end. Still need to wire a few things on the front end up, mainly the SQL Preview.

33aa2a5

Improved the UI a bit to include the SQL Preview functionality, and built out the pluralization map further, inspired from Segment's article on e-comm tracking suggestions.

7d275d5

Switched the front end experience over to what Luke suggested - left the previous implementation in, but commented out, and I will make a follow-up commit to remove it, but I just want a snapshot in case we want to revert back to it.

652eb75

Removed commented out code.

10367b3

Fixed a type issue

9785c2d

mknowlton89 added 4 commits June 1, 2023 15:32

Removes console log

b734281

Merge branch 'main' into mk/auto-metrics

05be520

Adds support for screens tables for Segment & Rudderstack

351cb7d

Cleans up the logic for the various data warehouses supported by Segment & Rudderstack and improves error handling.

1e565e9

mknowlton89 changed the title ~~POC for auto-generating metrics from Segment/BigQuery~~ Add support for auto generating metrics Jun 2, 2023

mknowlton89 added 2 commits June 2, 2023 08:49

Merge branch 'main' into mk/auto-metrics

a1393a8

Adds types for includesScreensTable and includesPagesTable that were not commited in last push

aea2c57

mknowlton89 marked this pull request as ready for review June 5, 2023 10:39

Resolving conflicts

9c72050

jdorn reviewed Jun 5, 2023

mknowlton89 added 6 commits June 6, 2023 11:31

Addresses almost all of Jeremy's feedback - updates the data structure we were using, removes the pluralization logic, and added a few todos for future iterations.

09d4262

Removes console log

faf99cb

Fixed a few failing types

081d2ce

Refactored how we were building the FROM clause within SqlIntegration.

dfca94c

Added some styling to the Sql preview row so it matches how we show expanded rows elsewhere in the app.

700f541

Merge branch 'main' into mk/auto-metrics

ab0bd25

mknowlton89 requested a review from jdorn June 6, 2023 20:28

mknowlton89 commented Jun 7, 2023

packages/back-end/src/integrations/Redshift.ts (outdated)

mknowlton89 and others added 7 commits June 7, 2023 07:17

General PR cleanup, improved mobile styling of table.

cc1a23a

Refactors how we build the tablePath based on Jeremy's feedback

4b0aca2

Merge branch 'main' into mk/auto-metrics

55a88c9

Fixes from code review

e37b249

Merge branch 'main' into mk/auto-metrics

3eb2a8f

Removed an incorrect property on the Postgres integration class, and refactors AutoMetricCard to not use hardcoded values.

ff9f3ea

refactors the AutoMetricCard to look for specific metric types to ensure the column order remains correct.

a22af49

jdorn approved these changes Jun 8, 2023

mknowlton89 added 2 commits June 9, 2023 09:40

Merge branch 'main' into mk/auto-metrics

0973a4f

Merge branch 'main' into mk/auto-metrics

dfbe91c

mknowlton89 merged commit 9a07423 into main Jun 9, 2023
3 checks passed

mknowlton89 deleted the mk/auto-metrics branch June 9, 2023 15:09
