[SL-1704] [SL-1703] [Feature] Enable fill_nulls_with and join_to_timespine on metric spec #1031
Comments
Also reported #1098, which is basically a duplicate of this. Adding some business context to push the priority up. Imagine this very common scenario: revenue and costs (especially for different business lines) come from different tables. Now we want time-based reporting (daily, weekly, monthly metric values). As soon as ANY of the input metrics has a NULL for a particular period, the whole thing will be NULL. Total revenue is NULL because we didn't have any revenue to show from business line 3!? As far as I know, there is no workaround for this missing feature. This feature should have the highest priority! It's a significant blocker for real-world use.
@siljamardla I think there is an easy workaround for this. You can add coalesce statements to the
Metric10 is able to safely sum metric7 and metric8 even when there are nulls present.
Definitely agree that it would be better to support plumbing the null values all the way down to the input metric, but with this approach we are still able to accurately calculate the metric value for metric 10 for user id = 1 on 2024-01-01. This makes this feature feel less urgent to me since there is a workaround. I believe adding a coalesce to the
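For anyone who wants to reproduce the problem and the COALESCE workaround outside MetricFlow, here is a minimal sketch using Python's built-in sqlite3. The table names (line1, line2, line3) and values are made up for illustration and are not from the project above; the point is only that a plain sum turns NULL as soon as one input has no row for the period, while the COALESCE version still adds up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables: one revenue table per business line.
CREATE TABLE line1 (day TEXT, revenue REAL);
CREATE TABLE line2 (day TEXT, revenue REAL);
CREATE TABLE line3 (day TEXT, revenue REAL);
INSERT INTO line1 VALUES ('2024-01-01', 100.0);
INSERT INTO line2 VALUES ('2024-01-01', 50.0);
-- line3 intentionally has no row for 2024-01-01.
""")

naive_total, filled_total = conn.execute("""
SELECT
    l1.revenue + l2.revenue + l3.revenue AS naive_total,
    COALESCE(l1.revenue, 0) + COALESCE(l2.revenue, 0) + COALESCE(l3.revenue, 0) AS filled_total
FROM line1 AS l1
LEFT JOIN line2 AS l2 ON l2.day = l1.day
LEFT JOIN line3 AS l3 ON l3.day = l1.day
""").fetchone()

print(naive_total, filled_total)
```

The naive total comes back as NULL (None in Python) and the filled total as 150.0, which is the same behaviour the derived metric expr shows with and without COALESCE.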
Thanks @Jstein77, this looks promising. I'm also extending this to cover the ratio metric case:
I did try adding the coalesce into the
That's why I've used
After some additional testing I have to come back here. I have made updates to my

# Derived metrics
- name: metric7
  label: metric7
  type: derived
  type_params:
    expr: COALESCE(metric1_for_metric7,0) + COALESCE(metric2_for_metric7,0) + COALESCE(metric3_for_metric7,0)
    metrics:
      - name: metric1
        alias: metric1_for_metric7
      - name: metric2
        alias: metric2_for_metric7
      - name: metric3
        alias: metric3_for_metric7
- name: metric8
  label: metric8
  type: derived
  type_params:
    expr: COALESCE(metric4_for_metric8,0) + COALESCE(metric5_for_metric8,0) + COALESCE(metric6_for_metric8,0)
    metrics:
      - name: metric4
        alias: metric4_for_metric8
      - name: metric5
        alias: metric5_for_metric8
      - name: metric6
        alias: metric6_for_metric8
- name: metric9
  label: metric9
  type: derived
  type_params:
    expr: COALESCE(metric7_for_metric9,0) / NULLIF(COALESCE(metric8_for_metric9,0),0)
    metrics:
      - name: metric7
        alias: metric7_for_metric9
      - name: metric8
        alias: metric8_for_metric9
- name: metric10
  label: metric10
  type: derived
  type_params:
    expr: COALESCE(metric7_for_metric9,0) + COALESCE(metric8_for_metric9,0)
    metrics:
      - name: metric7
        alias: metric7_for_metric9
      - name: metric8
        alias: metric8_for_metric9

Querying all of them together:
Will return:
Initially I was going to suggest there's a difference between derived metrics defined on top of simple vs other derived metrics.
However, when I review the compiled code, I think the metric type used in the derived metric is irrelevant. The beginning of the compiled query reads:

SELECT
  COALESCE(subq_4.calendar_date_local, subq_9.calendar_date_local, subq_15.calendar_date_local, subq_21.calendar_date_local, subq_35.calendar_date_local, subq_49.calendar_date_local) AS calendar_date_local
  , COALESCE(subq_4.user_id, subq_9.user_id, subq_15.user_id, subq_21.user_id, subq_35.user_id, subq_49.user_id) AS user_id
  , COALESCE(MAX(subq_4.metric1), 0) AS metric1
  , COALESCE(MAX(subq_4.metric2), 0) AS metric2
  , COALESCE(MAX(subq_4.metric3), 0) AS metric3
  , COALESCE(MAX(subq_9.metric4), 0) AS metric4
  , COALESCE(MAX(subq_9.metric5), 0) AS metric5
  , COALESCE(MAX(subq_9.metric6), 0) AS metric6
  , MAX(subq_15.metric7) AS metric7
  , MAX(subq_21.metric8) AS metric8
  , MAX(subq_35.metric9) AS metric9
  , MAX(subq_49.metric10) AS metric10
FROM ...

All the null handling happens in subq_15, subq_21, subq_35 and subq_49. However, we add more
I have added a third table (in this table we have data for user_id 4 and 5, nothing for 1, 2 and 3):
I have defined a metric on that third table:
I am now querying all of the above plus the new metric:
and voila, we see the
However, it looks like there are no errors in the values any more. Now it's just about rendering the output.

How to fix it?
It should be relatively simple to add a For addition and subtraction metrics (metric7, metric8 and metric10) I would set the fill_nulls_with to 0, while for division I would keep the nulls and for multiplication it might depend on business logic. What do you think? I'm wondering if there are cases not covered by this logic...

Current workarounds
Whenever a user uses a MetricFlow query similar to this, they have to process the output by manually wrapping or not wrapping the output in COALESCE depending on the metric type (that they have to know).
@siljamardla thanks for providing such detailed test cases! Metric10 will still have null values because of the full outer join, but I believe the actual metric value calculation is correct if you use the workaround. I agree with your proposed fix. Adding
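To make the full outer join point concrete, here is a rough sketch in Python's sqlite3. It assumes two already-computed metric subqueries: derived_sum (standing in for metric10, rows for users 1 to 3) and third_metric with a hypothetical metric11 (rows for users 4 and 5). These names are illustrative only, and the FULL OUTER JOIN is emulated with a UNION of keys plus LEFT JOINs so it runs on older SQLite versions. The COALESCE inside the derived expr never runs for users 4 and 5 because the derived subquery has no row for them, so only an outer COALESCE fills the gap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the metric10 subquery: COALESCE was already applied inside it.
CREATE TABLE derived_sum (user_id INTEGER, metric10 REAL);
INSERT INTO derived_sum VALUES (1, 30.0), (2, 0.0), (3, 12.0);
-- Hypothetical metric built on the third table: only users 4 and 5 have rows.
CREATE TABLE third_metric (user_id INTEGER, metric11 REAL);
INSERT INTO third_metric VALUES (4, 7.0), (5, 9.0);
""")

rows = conn.execute("""
WITH keys AS (
    -- Union of keys + LEFT JOINs emulates a FULL OUTER JOIN.
    SELECT user_id FROM derived_sum
    UNION
    SELECT user_id FROM third_metric
)
SELECT
    k.user_id,
    d.metric10 AS metric10_raw,                 -- NULL where the subquery has no row
    COALESCE(d.metric10, 0) AS metric10_filled, -- what a metric-level fill would render
    t.metric11
FROM keys AS k
LEFT JOIN derived_sum AS d ON d.user_id = k.user_id
LEFT JOIN third_metric AS t ON t.user_id = k.user_id
ORDER BY k.user_id
""").fetchall()

for row in rows:
    print(row)
```

For users 4 and 5 the raw column is NULL while the filled column is 0; the values for users 1 to 3 are unchanged, which matches the observation that the calculation is correct and only the rendering is off.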
Is this your first time submitting a feature request?
Describe the feature
Currently, fill_nulls_with and join_to_timespine are only available on metric input measures. This can result in unwanted nulls if the measure is nested in a ratio or derived metric and then joined to other metrics. If we add those options to the metric config, users will be able to resolve that issue and also have more granular control over when nulls are filled.

This means that after the metric is calculated, the metric subquery can be joined to a time spine subquery to fill missing dates, and then coalesce nulls to the integer value specified in the fill_nulls_with property. We may also consider allowing variables and other expressions to be used to fill nulls instead of only integers.

Describe alternatives you've considered
No response
Who will this benefit?
No response
Are you interested in contributing this feature?
No response
Anything else?
No response
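As a rough illustration of the behaviour requested under "Describe the feature" (join the computed metric subquery to a time spine, then fill nulls), here is a small sqlite3 sketch. The table names time_spine and metric_subquery, the column names, and the FILL_NULLS_WITH constant are assumptions made for this example; this is not MetricFlow's generated SQL, only the shape of the idea.

```python
import sqlite3

FILL_NULLS_WITH = 0  # hypothetical value taken from the metric's fill_nulls_with setting

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical time spine: one row per day.
CREATE TABLE time_spine (ds TEXT);
INSERT INTO time_spine VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
-- Hypothetical output of the metric calculation: 2024-01-02 is missing.
CREATE TABLE metric_subquery (ds TEXT, metric_value REAL);
INSERT INTO metric_subquery VALUES ('2024-01-01', 5.0), ('2024-01-03', 8.0);
""")

rows = conn.execute("""
SELECT
    ts.ds,
    COALESCE(m.metric_value, ?) AS metric_value  -- fill after the spine join
FROM time_spine AS ts
LEFT JOIN metric_subquery AS m ON m.ds = ts.ds
ORDER BY ts.ds
""", (FILL_NULLS_WITH,)).fetchall()

for row in rows:
    print(row)
```

The missing date 2024-01-02 appears in the output with the fill value instead of being absent or NULL. Allowing expressions rather than integers, as suggested above, would only change what is passed in place of the constant.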
SL-1703