New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for auto generating metrics #1283
Conversation
Your preview environment pr-1283-bttf has been deployed. Preview environment endpoints are available at: |
…he user has indicated they want us to create for them.
…ery messy logic to customize metric type.
… how to go about duration types, atleast with Segment.
…mes for the tracked events, along with different event column names based on the schemaFormat.
… integration property.
… not fully tested yet.
…nput as that is required for us to know which schema to query when looking for the tracked events coming from Rudderstack. Also updating the Snowflake from clause for the query to use the schema.
…e for both sql queries relating to the auto metrics - the query to get a list of tracked events and the query for the actual metric.
…but committing in, instead of stashing it while I go bash a bug.
…iving the list of tracked events.
… also adding hooks to generate additional metrics from single tracked event. Very narrow in scope for now, but getting the building blocks in place.
…ing all tracked events, and giving the user the option to create a binomial and/or count metric for each event.
…and adds polish to the front end. Still need to wire a few things on the front end up, mainly the SQL Preview.
…uilt out the pluralization map further, inspired from Segment's article on e-comm tracking suggestions.
…the previous implementation in, but commented out, and I will make a follow-up commit to remove it, but I just want a snapshot in case we want to revert back to it.
…ent & Rudderstack and improves error handling.
binomialSqlQuery: string; | ||
countSqlQuery: string; | ||
countDisplayName: string; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This data structure is a little weird. It's mixing form state (e.g. createBinomialFromEvent
) with data from the query (e.g. lastTrackedAt
). Seems like that should be 2 separate data structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this to be a bit more future proof and to separate the concerns.
Currently, this data takes the following shape:
export type TrackedEventData = {
event: string;
displayName: string;
count: number;
hasUserId: boolean;
lastTrackedAt: Date;
metricsToCreate: {
name: string;
sql: string;
type: MetricType;
shouldCreate?: boolean;
}[];
};
getInformationSchemaFromClause(): string { | ||
getInformationSchemaTableFromClause(databaseName: string): string { | ||
return `${databaseName}.information_schema.columns`; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method also feels duplicative with the new generateTableName
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdorn Just pushed up a big refactor of generateTableName
that handles the logic here as well, so we've now gotten rid of the getInformationSchemaTableFromClause
method.
All of the changes were done in this commit.
…e we were using, removes the pluralization logic, and added a few todos for future iterations.
…xpanded rows elsewhere in the app.
…refactors AutoMetricCard to not use hardcoded values.
…ure the column order remains correct.
Features and Changes
This PR introduces logic to automatically generate metrics for our various event trackers. In it's current state, this only supports
Segment
&Rudderstack
. Once I get some initial feedback on the overall structure of the PR, I'll expand to include additional event trackers, specifically GA4, Firebase, and Amplitude, with more to follow in separate PRs.To do this, when an organization adds a datasource via an event tracker supported above, they'll see an option on the
NewDataSourceForm
to look up what metrics we can generate for them automatically. I built a method withinSqlIntegration
that adds asupportsAutoGeneratedMetrics
property to the integration'sgetDataSourceProperties
so we can easily determine on the front and back end if the particular data source supports auto generating metrics.This hits a new endpoint
/datasource/:datasourceId/auto-metrics
- This endpoint looks up the unique events that are tracked by the event tracker. This is possible since the event trackers all have a table that lists all of the events that they track. In the case ofSegment
&Rudderstack
this lives on thetracks
table.The endpoint returns a list of events that we think we can generate, along with sql queries for
count
andbinomial
metrics derived from the tracked events.Here, the user can preview the underlying SQL that would eventually power these metrics, and opt to create
binomial
andcount
metrics for each tracked event. I've added some tooltips throughout the table to provide some insight.Then, when the user clicks "Save" if there are any metrics they've indicated they want us to create for them, (via the toggle state) we'll kick off an async job (
createAutoGeneratedMetrics
) that creates those metrics.There has also been some logic defined to pluralize the event names (E.G. for an
Order Placed
event via Segment, thecount
metric associated with that would beOrders Placed
.)This is currently handled via a pretty simple map. Eventually, we could leverage ChatGPT for this. If the map is unable to map an event to a pluralized version, it will simply return a displayName of
Count of {event}
.The seed for the pluralization map came from Segment's documentation here - https://segment.com/docs/connections/spec/ecommerce/v2/. This map will likely be a bit of a living document as we expand support.
Testing
Screenshots