Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow query_tag configuration in snapshot model for Snowflake #20

Closed
ErikDataDesigner opened this issue Aug 11, 2021 · 5 comments · Fixed by #48
Closed

Allow query_tag configuration in snapshot model for Snowflake #20

ErikDataDesigner opened this issue Aug 11, 2021 · 5 comments · Fixed by #48
Labels
enhancement New feature or request good_first_issue Good for newcomers

Comments

@ErikDataDesigner
Copy link

Describe the feature

When using Snowflake allow the snapshot config to accept and honor the query_tag parameter.

Additional context

This enhancement is specific for snowflake implementations

Who will this benefit?

Snapshots can be end up being compute intensive, this enhancement would provide a more simplified way for db administrators to collect the data required to monitor and manage the cost.

Example

{% snapshot snapshot_name %}

{{
config(
target_database='db',
target_schema='snapshots',
unique_key='id',
strategy='check',
check_cols=['col_a'],
query_tag='SNAPSHOT'
)
}}

select * from {{ source('db', 'table') }}

{% endsnapshot %}

@jtcohen6
Copy link
Contributor

@ErikDataDesigner Thanks for opening! It looks like query tag configs aren't used on snapshots in Snowflake today.

The query tag config is turned on/off in Snowflake materializations via two macros: set_query_tag() (start) and unset_query_tag() (end). E.g. in the Snowflake table materialization:
https://github.com/dbt-labs/dbt/blob/454168204c90464de1753ced824f5e3417e6203a/plugins/snowflake/dbt/include/snowflake/macros/materializations/table.sql#L3
https://github.com/dbt-labs/dbt/blob/454168204c90464de1753ced824f5e3417e6203a/plugins/snowflake/dbt/include/snowflake/macros/materializations/table.sql#L30

It looks like those macros aren't included in the seed materialization or the snapshot materializations, because in both cases Snowflake uses the default ("global") version.

The short-term answer here would be to copy-paste the materialization code from the global project into the Snowflake plugin, and add those macro calls at the top and bottom. I... don't really like that! The alternative would be to find a hook into the python code, either within the run/seed/snapshot task or the connection logic, where we can string these up by default. I'll do a little more digging, but I'm not so hopeful of my chances.

In the meantime, it is possible to set a default query_tag for all dbt-related queries in the connection profile, which will include queries related to snapshots.

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Oct 12, 2021
@jtcohen6 jtcohen6 added enhancement New feature or request good_first_issue Good for newcomers labels Oct 12, 2021
@anthu
Copy link
Contributor

anthu commented Nov 14, 2021

How about wrapping the standard behaviour of the materialisation? Like:

{% materialization snapshot, adapter='snowflake' %}
    {% set original_query_tag = set_query_tag() %}
        {% set relations = materialization_snapshot_default() %}
    {% do unset_query_tag(original_query_tag) %}

    {{ return(relations) }}
{% endmaterialization %}

And

{% materialization seed, adapter='snowflake' %}
    {% set original_query_tag = set_query_tag() %}
        {% set relations = materialization_seed_default() %}
    {% do unset_query_tag(original_query_tag) %}
    
    {{ return(relations) }}
{% endmaterialization %}

This do work just fine as a project-scoped materialisation.

I would gladly open a PR for this. But I'm struggling to get my head around the Contributions Guide for dbt adapters. The one linked it this repo seems to be out of date (but that might be worth discussed in a separate issue ;) )

@jtcohen6
Copy link
Contributor

@anthu Glad to hear that works locally. It's a pretty elegant way to support query tags for seeds + snapshots on Snowflake, without requiring a ton of duplicated code. I'd be open to a PR for it.

We do need new contributing guides for the adapter repos. For the time being, you can get started by:

  • forking/cloning this repo
  • in a virtualenv: pip install -e . -r dev_requirements.txt (will install latest changes from dbt-core as well)
  • making changes to the code inside dbt/
  • running dbt from that virtualenv

@anthu
Copy link
Contributor

anthu commented Nov 15, 2021

Thanks for the prompt reply.

I got it to run in the meanwhile - the tricky part is definitely setting up the Snowflake connection for the test to run - it simply refused to pick them up:

this worked python3 -m pytest -m profile_snowflake instead simply running pytest.

I will make it ready later - what are your thought on this being more or less an "internal" method?

@jtcohen6
Copy link
Contributor

You're right, this is an "internal" method, and there's no guarantee that materializations will always work this way (creating a macro named materialization_<name> behind the scenes).

Given that dbt Labs maintains this adapter, I feel more comfortable with it. In fact, I think it will be a good check if we ever change this behavior. It isn't exactly documented, so it isn't exactly contracted in v1—but we can fairly assume that there are users who have observed this capability, and now projects which rely on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants