Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Relax double underscore requirement on metrics #283

Open
3 tasks done
adamcunnington-mlg opened this issue Jun 4, 2024 · 4 comments
Open
3 tasks done
Labels
enhancement New feature or request

Comments

@adamcunnington-mlg
Copy link

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward change to existing dbt-semantic-interfaces functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Both semantic model measure names AND metrics are not allowed double underscores or uppercase characters. Typical error message one runs into:

Invalid name `metric_mrt_CLIENT_AGNOSTIC__3rd_party__consumer_confidence_index__index_modelled_daily` - names may only contain lower case letters, numbers, and underscores. Additionally, names must start with a lower case letter, cannot end with an underscore, cannot contain dunders (double underscores, or __), and must be at least 2 characters long.

I understand that behind the scenes, metricflow does some concatenation using double underscores for the dynamic join traversal - but perhaps this is not necessary for metrics and this can be relaxed for metrics?

This is a HUGE breaking change vs the old metrics specification and has caused an enormous blocker for us migrating thousands of metrics lest we completely break our BI tool (Preset/Apache Superset). Our metrics sync to downstream tools (indeed, the whole point of a centralised semantic layer) and whilst changing the whole spec for metrics was a controversial breaking change in itself, not allowing old metric names (which downstream tools typically use as keys) is a breaking change further. The implication of this is we would have to completely rebuild our downstream dependencies - every single metric would be broken.

P.s. it's not clear whether the current error message relates to semantic model measures OR metrics - but through trial and error, it turns out it's both. The error message should be clearer.

Describe alternatives you've considered

Not upgrading beyond DBT 1.5 but this has scary consequences for stability and lack of features.

Who will this benefit?

Anyone who is beyond DBT 1.5 and has double underscores in their old metric names

Are you interested in contributing this feature?

Don't know the codebase

Anything else?

No response

@adamcunnington-mlg adamcunnington-mlg added the enhancement New feature or request label Jun 4, 2024
@adamcunnington-mlg
Copy link
Author

It would be really helpful to know whether this is doable and likely to be accepted or not. It has huge bearing on our migration path. @plypaul grateful for your input, thank you.

@Jstein77
Copy link

Hey @adamcunnington-mlg , dunders currently do a lot of heavy lifting in Metricflow name spacing. For example, we will append the requested granularity to time dimensions columns in the result table with dunders (ordered_at__week, etc.), and we use them to denote join paths to dimensions, i.e., order__customer__name. This is fundamentally how we name columns in the result set.

We're planning features that would allow query time transformation of metrics like period over period and dynamic windows for cumulative metrics. We'd use the dunder syntax here as well to call out the metric transformation in the column name, i.e., revenue__wow, cumulative_revenue__7d_window. I do think we're overloading the dunder as it serves a bunch of different purposes, and we've chatted internally about different ways to name columns, but there is no active work going on right now to update this.

All that to say is I don't think we'll be relaxing the dunder restriction on metric names in at least the next 6 months. One option we discussed would be to allow users to specify a replacement for dunders in their project configs.

I get that this is not an ideal timeline given you're trying to migrate today. I'm looking to understand the migration effort for you more clearly to see if there is a path to unblocking you. It seems like there are a few pieces of work:

  1. Create semantic models and migrate metrics to use the MetricFlow spec - there is some automated tooling [here](https://github.com/dbt-labs/dbt-converter), and happy to collaborate on this if you're getting stuck.
  2. Update metrics names to remove dunders
  3. Update references to metric names in downstream tools

It sounds like #3 is the most challenging because metric names are imported directly to Superset then referenced in a ton of downstream dashboards/reports. Is there any way to update these metric definitions via an API? We can help write migration tooling if this is an option.

@adamcunnington-mlg
Copy link
Author

adamcunnington-mlg commented Jun 21, 2024

@Jstein77 thank you for the response. The project config option seems like a really trivial change that would unblock me here. I empathise with the challenge but am glad the overloading of a name is recognised here! Feels far too magical!

The migration effort for us is enormous. We're very aware of dbt-converter and we used that as part of an intensive week where 4 developers sunk a combined 120 hours manually migrating ~1600 metrics to MetricFlow. We finished points 1 and 2 and 3 remains. Preset/Superset does have an API but the metric key is the fundamental ID and can't be changed - we'd have to work with Preset to do a crazy export/import operation to fundamentally change values in their DB - it's super high risk and significant effort.

Could the config be considered as a short term workaround? I could simply use a character other than dunder to drive MetricFlow and all of the work could be avoided :(.

@adamcunnington-mlg
Copy link
Author

adamcunnington-mlg commented Jun 27, 2024

@Jstein77 I'd really appreciate your consideration of this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants