Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2938] [spike] Automate creation of metricflow_time_spine if the project defines semantic objects #8319

Closed
Tracked by #8125
jtcohen6 opened this issue Aug 4, 2023 · 8 comments
Assignees
Labels
enhancement New feature or request semantic Issues related to the semantic layer

Comments

@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 4, 2023

follow-up to #7938
related to #8172

Big idea

  • Currently, if the user defines semantic objects in their project, but not a model named metricflow_time_spine, we raise an error.
  • We should simply create the model automatically, if it is not found in the project, using the recommended definition.
  • Users should still have the ability to create themselves with a custom implementation if they so choose.

Acceptance criteria (spike)

@jtcohen6 jtcohen6 added the semantic Issues related to the semantic layer label Aug 4, 2023
@github-actions github-actions bot changed the title Automate creation of metricflow_time_spine if the project defines semantic objects [CT-2938] Automate creation of metricflow_time_spine if the project defines semantic objects Aug 4, 2023
@jtcohen6 jtcohen6 added the enhancement New feature or request label Aug 4, 2023
@graciegoheen
Copy link
Contributor

Discussed during estimation meeting: Complications with partial parsing - we've never internally created a model without a matching file to it. If you add a semantic model, you have to add this model. If you remove your last semantic model, you have to remove this model.

@martynydbt martynydbt changed the title [CT-2938] Automate creation of metricflow_time_spine if the project defines semantic objects [CT-2938] [spike] Automate creation of metricflow_time_spine if the project defines semantic objects Aug 7, 2023
@QMalcolm
Copy link
Contributor

I might be crazy but I think this is actually pretty straight forward 🤔

@graciegoheen
Copy link
Contributor

If we create this model automatically, I think we will need to have a cross-database macro for date type - to feed the proper inputs (to_date(...) vs date(...)) into date_spine.

@QMalcolm
Copy link
Contributor

So there are two main concerns I believe:

  1. Squaring partial parsing with generating a metricflow_time_spine model when one isn't specified
  2. Auto-generating the metricflow_time_spine correctly given the any adapter

For Issue (1) there are four possible states

a. metricflow_time_spine was specified by the user and that's still the case
b. metricflow_time_spine was specified by the user and now isn't (and thus should be generated)
c. metricflow_time_spine wasn't specified by the user (and thus generated) and that's still the case
d. metricflow_time_spine wasn't specified by the user (and thus generated) but now it is specified by the user

I think the solution is at the end of parsing if there are semantic layer nodes and no metricflow_time_spine model we add one and mark it as auto generated. At the start of parsing if there is a saved manifest, we drop the metricflow_time_spine node if is marked as having been auto generated. This workflow makes the following happen in the corresponding cases.

a. the metricflow_time_spine is handled by the user specification
b. the user specifed metricflow_time_spine gets dropped during partial parsing, and then re-added via the auto-generation
c. the auto generated metricflow_time_spine gets dropped, and then re-added at the end
d. the auto generated metricflow_time_spine gets dropped, and then the user specified metricflow_time_spine gets added


For issue (2) I don't think we need a cross-database macro for date types, though it would be nice. Instead we could just use the same jinja template we use for the date_spine macro tests, were we do different calls to the macro based on the target data warehouse.

@graciegoheen
Copy link
Contributor

Instead we could just use the same jinja template we use for the date_spine macro tests

What jinja template is this? Curious what it looks like

@QMalcolm
Copy link
Contributor

QMalcolm commented Oct 9, 2023

Instead we could just use the same jinja template we use for the date_spine macro tests

What jinja template is this? Curious what it looks like

@graciegoheen it looks like this currently

    {% if target.type == 'postgres' %}
        {{ date_spine("day", "'2023-09-01'::date", "'2023-09-10'::date") }}

    {% elif target.type == 'bigquery' or target.type == 'redshift' %}
        select cast(date_day as date) as date_day
        from ({{ date_spine("day", "'2023-09-01'", "'2023-09-10'") }})

    {% else %}
        {{ date_spine("day", "'2023-09-01'", "'2023-09-10'") }}
    {% endif %}

@graciegoheen
Copy link
Contributor

Got it - that makes sense!

@dbeatty10 curious your thoughts here - would a cross-database macro for date type be:

  • similar to just cast ... as {{ dbt.type_date() }}
  • or similar to cast_bool_to_text, cast_text_to_date
  • something else?

@dbeatty10
Copy link
Contributor

@graciegoheen this idea makes sense to me to support a built-in metricflow_time_spine 🚀

What do you think about calling it to_date()?

Instead of naming it cast_text_to_date, I'd suggest we call it to_date instead. Even though to_date isn't within the SQL standard, databricks, postgres, redshift, and snowflake all have a to_date function that does what we want. Although bigquery is an outlier, nothing we can't solve with a little dispatch magic ✨

Prototype of to_date()

Assuming to_date() is a cross-database macro that takes an ISO 8601 (YYYY-MM-DD) date string as input, here's a completely untested prototype for dbt-postgres:

{% macro to_date(date_str) %}
  {{ return(adapter.dispatch('to_date', 'dbt') (date_str)) }}
{% endmacro %}

{% macro default__to_date(date_str) -%}
    to_date({{ dbt.string_literal(date_str) }})
{%- endmacro %}

Pulling it all together for metricflow_time_spine

The cross-database Jinja template might look like this:

select cast(date_day as date) as date_day
from ({{ dbt.date_spine("day", dbt.to_date("2023-09-01"), dbt.to_date("2023-09-10")) }})

Appendix

Validation and error checking

If we want, we could always add some format validation to the default implementation of to_date() by using the datetime module:

    {%- set dt = modules.datetime.datetime.strptime(date_str, "%Y-%m-%d") -%}
    ...

type_date macro

We may (or may not) want to also create a cross-database type_date macro (which doesn't exist today). I haven't seen any database that doesn't call this data type DATE, so that makes it either easy-peasy or extraneous depending how you look at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request semantic Issues related to the semantic layer
Projects
None yet
Development

No branches or pull requests

4 participants