I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing metricflow functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Currently, it is possible to configure a time dimension with a granularity that does not match the granularity of the data stored in the warehouse. For example, you could have a column with DAILY granularity while labeling the dimension as YEARLY.
Eventually we'd like to allow users to conform their time dimension inputs from, say, DAILY to YEARLY. Ideally we'd have some validation to catch when this is necessary. Note - this might be impossible to do well, so we might need to think hard about whether or not to build this at all.
Some notes from Slack discussion:
Sampling is a challenge. If we scan everything, or do a row-based sample, we might over-scan a ton of data for this validator. If, on the other hand, we do a simple LIMIT X or block-based sampling (where supported), we run the risk of picking a row block that happens to match the target granularity (in our example, all sampled rows happen to be from January 1st), and the validator passes when it should not.
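To make the sampling risk concrete, here is a minimal sketch in plain Python. The table contents, the `conforms_to_yearly` helper, and the strided sample (a stand-in for a row-based sample that touches the whole table) are all hypothetical and only illustrate the failure mode; this is not metricflow code.

```python
from datetime import date, timedelta


def is_year_start(d: date) -> bool:
    # A YEARLY value truncated to the calendar year lands on January 1st.
    return d.month == 1 and d.day == 1


def conforms_to_yearly(rows) -> bool:
    # True only if every sampled row already sits on a year boundary.
    return all(is_year_start(d) for d in rows)


# Hypothetical table layout: the first block happens to hold only
# January 1st rows, while the rest of the table is DAILY data.
first_block = [date(2018 + i, 1, 1) for i in range(5)]
daily_rows = [date(2022, 1, 1) + timedelta(days=i) for i in range(1, 30)]
table = first_block + daily_rows

limit_sample = table[:5]     # analogous to LIMIT 5 or a single block
strided_sample = table[::7]  # stand-in for a sample spread over all rows

limit_passes = conforms_to_yearly(limit_sample)      # misses the DAILY rows
strided_passes = conforms_to_yearly(strided_sample)  # sees a DAILY row
full_scan_passes = conforms_to_yearly(table)
```

The LIMIT-style sample only ever sees the first block, so it reports the column as YEARLY even though most of the table is DAILY. The spread-out sample catches the mismatch, but at the cost of touching the whole table.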
Truncation point cannot be assumed. If a user has a fiscal year that starts on January 30th, and they pre-normalize their YEARLY data to January 30th, their data might be valid, but if we check against January 1st we'll say it is not. Allowing for this divergence in the validator by default means a more complicated query: we need to distinguish two different types of granularity mismatch in our result set (values at the wrong granularity versus values at the right granularity with a different truncation point), which isn't trivial to generalize.
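The fiscal-year case can be sketched as follows. The `truncate_to_year` function and its `start_month` / `start_day` parameters are hypothetical names chosen for illustration, not an existing metricflow API, and the sketch assumes the start day exists in every year (so not February 29th).

```python
from datetime import date


def truncate_to_year(d: date, start_month: int = 1, start_day: int = 1) -> date:
    # Truncate d to the start of the year that contains it, where the
    # year begins on (start_month, start_day) instead of January 1st.
    start = date(d.year, start_month, start_day)
    if d < start:
        # d falls before this calendar year's start point, so it
        # belongs to the year that started in the previous calendar year.
        start = date(d.year - 1, start_month, start_day)
    return start


# A YEARLY value pre-normalized to a fiscal year starting on January 30th.
fiscal_value = date(2023, 1, 30)

# Checked against January 1st, the value looks like a granularity mismatch.
valid_for_calendar_year = truncate_to_year(fiscal_value) == fiscal_value

# Checked against the user's own truncation point, the same value is valid.
valid_for_fiscal_year = truncate_to_year(fiscal_value, 1, 30) == fiscal_value
```

A validator hard-coded to January 1st would flag this data as invalid, which is why the truncation point likely needs to be a user-overridable parameter.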
There's also the opposite scenario, YEARLY data stored as DAILY, which seems like a nightmare to deal with.
In any case, this validator will likely have to be implemented with a tolerance for false negatives, and maybe it should be a separate option with parameters so users can override the truncation point and sampling configuration. By default we would do a complete scan and assert that every row's date_trunc value matches the input value.
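The default full-scan check described above could look roughly like this. It is a minimal sketch that uses an in-memory SQLite database so it runs anywhere; SQLite has no `date_trunc`, so `date(col, 'start of year')` stands in for the warehouse's `DATE_TRUNC('year', col)`. The `bookings` table, the `ds` column, and the `count_granularity_mismatches` helper are assumptions made up for this example.

```python
import sqlite3


def count_granularity_mismatches(conn: sqlite3.Connection, table: str, column: str) -> int:
    # Full scan: count rows whose value differs from its own year truncation.
    # Identifiers are interpolated directly, so they must come from trusted
    # configuration and never from user input.
    query = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {column} <> date({column}, 'start of year')"
    )
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (ds TEXT)")
conn.executemany(
    "INSERT INTO bookings VALUES (?)",
    [("2021-01-01",), ("2022-01-01",), ("2022-03-15",)],
)

# One row (2022-03-15) is not on a year boundary, so a dimension labeled
# YEARLY over this column would fail validation.
mismatches = count_granularity_mismatches(conn, "bookings", "ds")
is_valid_yearly = mismatches == 0
```

Counting mismatches rather than returning a bare pass/fail leaves room for a tolerance threshold, and the truncation expression is the natural place to plug in a user-supplied truncation point.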
Describe alternatives you've considered
No response
Who will this benefit?
No response
Are you interested in contributing this feature?
No response
Anything else?
No response