Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snowflake External Tables - Ability to add in custom col_expression #140

Closed
dalvarez06 opened this issue May 5, 2022 · 3 comments · Fixed by #275
Closed

Snowflake External Tables - Ability to add in custom col_expression #140

dalvarez06 opened this issue May 5, 2022 · 3 comments · Fixed by #275
Labels
Milestone

Comments

@dalvarez06
Copy link

dalvarez06 commented May 5, 2022

Describe the feature

dbt-external-tables does a great job of creating external tables into snowflake. However there are times where one might want to make there own expression to create a column.

Describe alternatives you've considered

If we look at the below code snippet we see that the only variable that has access to the expression is col_expression. The issue is that we have no control over what gets passed to the col_expression variable other than the name. The below code is from https://github.com/dbt-labs/dbt-external-tables/blob/main/macros/plugins/snowflake/create_external_table.sql

{%- for column in columns %}
    {%- set column_quoted = adapter.quote(column.name) if column.quote else column.name %}
    {%- set col_expression -%}
        {%- set col_id = 'value:c' ~ loop.index if is_csv else 'value:' ~ column_quoted -%}
        (case when is_null_value({{col_id}}) or lower({{col_id}}) = 'null' then null else {{col_id}} end)
    {%- endset %}
    {{column_quoted}} {{column.data_type}} as ({{col_expression}}::{{column.data_type}})
    {{- ',' if not loop.last -}}
{% endfor %}

Additional context

This feature would just be for Snowflake.

Who will this benefit?

Anyone who wants to do more complex expressions in their external tables.

Proposed fix

Here is a quick fix that I wrote. I'm sure there are plenty of ways of doing this. I tested this out and it worked.

columns:
    - name: user_id
      data_type: varchar
    - name: complex_column
      data_type: varchar
      col_expression: custom expression
{%- for column in columns %}
    {%- set column_quoted = adapter.quote(column.name) if column.quote else column.name %}
    {%- set col_expression -%}
        {%- if column.col_expression -%}
            {{column.col_expression}}
        {%- else -%}
            {%- set col_id = 'value:c' ~ loop.index if is_csv else 'value:' ~ column_quoted -%}
            (case when is_null_value({{col_id}}) or lower({{col_id}}) = 'null' then null else {{col_id}} end)
        {%- endif -%}
    {%- endset %}
    {{column_quoted}} {{column.data_type}} as ({{col_expression}}::{{column.data_type}})
    {{- ',' if not loop.last -}}
{% endfor %}
@dalvarez06 dalvarez06 added enhancement New feature or request triage labels May 5, 2022
@nic-sharesies
Copy link

nic-sharesies commented May 6, 2022

+1 on this! I'm about to fork dbt-external-tables because I want a custom column expression to pull the replicated_date out of our parquet files using the Snowpipe option.

@nochimo
Copy link

nochimo commented Dec 12, 2022

+1
This would help a lot some use cases we have. Any news on that?

@adam-campbell-mfe
Copy link

I would greatly support this feature, there has been some discussion on other threads that the external table importing should just do the ET and not T (which should be performed by models). I mostly agree with that idea, however the only way to get snowflake external table metadata into your dbt defined external table i.e. METADATA$FILENAME is via an expression in a partition. Which might not be the way I want to setup my partition for that table.

Another frustration is if want to store dates in a column, but the AUTO detect feature fails due to some non-standard date type I am forced to import the column as string and convert to date down the line. This could potentially lead to a range of issues.

At present the ability to create snowflake external tables using dbt is too limiting for my use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants