Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add & validate cumulative type params & period agg #288

Merged
merged 9 commits into from
Jun 18, 2024

Conversation

courtneyholcomb
Copy link
Contributor

@courtneyholcomb courtneyholcomb commented Jun 10, 2024

Resolves #289
Completes SL-2363
Completes SL-2364

Description

Move cumulative metric type params to a nested key within type_params called cumulative_type_params.
Add new period_agg field with options FIRST, LAST, and AVERAGE. (Note we chose FIRST and LAST because that language aligns with the window function syntax we'll use for those - FIRST_VALUE and LAST_VALUE.) This will be used when aggregating cumulative metrics to non-default time granularities.

Checklist

@cla-bot cla-bot bot added the cla:yes label Jun 10, 2024
@courtneyholcomb courtneyholcomb force-pushed the court/cumulative-params branch 2 times, most recently from d0f9cf0 to d9fb402 Compare June 10, 2024 20:00
@courtneyholcomb courtneyholcomb changed the title Add cumulative type params & period agg Add & validate cumulative type params & period agg Jun 10, 2024
@dbt-labs dbt-labs deleted a comment from github-actions bot Jun 10, 2024
@courtneyholcomb courtneyholcomb marked this pull request as ready for review June 10, 2024 22:48
@dbt-labs dbt-labs deleted a comment from github-actions bot Jun 11, 2024
Copy link
Collaborator

@QMalcolm QMalcolm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! However, I'm trying to think through both the system as a whole (core, DSI, MetricFlow), DSI by itself, the deprecation/migration path in both of those, and the end user experience. Taking that all into account, I think we should make some changes

  1. MetricFlow should only have to pull from one place
    a. Ideally it should be able to pull directly from metric.type_params.cumulative_type_params.<field> and not think about metric_type_params.<deprecated_field>
    b. DSI can ensure this by either
    1. Including a new transformation (I believe MetricFlow still runs transformations) which sets the new paths if they aren't already set when the deprecated paths are.
    2. providing a metric level accessor which does the munging
      • this is problematic in that it creates more "future" work in that we'll want to deprecate the accessor at some point and end up migrating MetricFlow not once but twice.
    3. at post instantition of a metric, populating the cumulative_type_params from the old deprecated fields (only for cumulative_type_param fields which are None)
  2. dbt-core should always populate the cumulative_type_param fields if they are available
    a. if the user specifies them directly, perfect
    b. if the user specifies them via the deprecated fields, but not directly, then we use the deprecated fields

Points (1.b) and (2) mean that our desired state for the time being is that there is duplication. Thus both being defined shouldn't be an error, it should be expected. What should be an error is if they are different.

Additionally, people who use dbt-core have the ability to silence warnings. Which might happen for these warnings as they will likely be noisy. It is basically guaranteed to happen for anybody who currently has a cumulative metric defined. Unfortunately silencing happens via log event type and all DSI validation logs get logged out via the same event type. That means silencing this DSI validation issue will silence all DSI validation issues (not great). This isn't something we should fix in this PR, but it's probably worth SL and Core working together on how to make this better.

Comment on lines 55 to 68
if (
getattr(metric.type_params, field)
and metric.type_params.cumulative_type_params
and getattr(metric.type_params.cumulative_type_params, field)
):
issues.append(
ValidationError(
context=metric_context,
message=(
f"`{field}` set twice in cumulative metric '{metric.name}'. "
"Please remove it from `type_params`."
),
)
)
Copy link
Collaborator

@QMalcolm QMalcolm Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually want them to be duplicated I think. In which case we should change this error to be if the fields are different.

@courtneyholcomb
Copy link
Contributor Author

@QMalcolm thank you for the thorough feedback! Just added a couple commits to add the transformation and update the validation if you want to take another look.

Copy link
Collaborator

@QMalcolm QMalcolm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working through the change requests ❤️ This looks great!

Also, thank you for being smarter than me. In my change requests, I said that we should error if the deprecated path values and the new path values differed. The edge case in what I said is that it should be allowable for either to be None. How you set up the validation rule handles that 🚀

@courtneyholcomb
Copy link
Contributor Author

Note: We have a release for 0.6.0 coming soon. This should not merge until after that release - otherwise users will get validation warnings for cumulative_type_params before those fields are actually supported in MF.

@courtneyholcomb courtneyholcomb merged commit 1b9f2df into main Jun 18, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Add period_agg field for cumulative metrics
2 participants