[Bug] clone creating "view pointers" instead of "cloned tables" on 1.8 / "Keep on latest version" #10296
Comments
This has also been observed in 1.7.
I propose overriding the clone materialization so that it logs extra debug information.
To do that, add the following macro to your dbt project:
-- macros/clone_override.sql
{%- materialization clone, default -%}
{%- set relations = {'relations': []} -%}
{%- if not defer_relation -%}
-- nothing to do
{{ log("No relation found in state manifest for " ~ model.unique_id, info=True) }}
{{ return(relations) }}
{%- endif -%}
{%- set existing_relation = load_cached_relation(this) -%}
{%- if existing_relation and not flags.FULL_REFRESH -%}
-- noop!
{{ log("Relation " ~ existing_relation ~ " already exists", info=True) }}
{{ return(relations) }}
{%- endif -%}
{%- set other_existing_relation = load_cached_relation(defer_relation) -%}
{#/* See https://github.com/dbt-labs/dbt-core/issues/10296 */-#}
{{ log('CLONE DEBUG: ' ~ other_existing_relation ~ ' type is ' ~ other_existing_relation.type) }}
{% set check_type_query -%}
select case when table_type = 'BASE TABLE' then 'table' else table_type end as table_type
from {{ defer_relation.database }}.information_schema.tables
where lower(table_schema) = lower('{{ defer_relation.schema }}')
and lower(table_name) = lower('{{ defer_relation.identifier }}')
{%- endset %}
{% set recheck = run_query(check_type_query) %}
{% set defer_rel_type_recheck = recheck.columns['TABLE_TYPE'].values()[0] %}
{{ log('CLONE DEBUG (recheck type): defer relation type is ' ~ defer_rel_type_recheck) }}
-- If this is a database that can do zero-copy cloning of tables, and the other relation is a table, then this will be a table
-- Otherwise, this will be a view
{% set can_clone_table = can_clone_table() %}
{%- if other_existing_relation and (other_existing_relation.type == 'table' or defer_rel_type_recheck == 'table') and can_clone_table -%}
{%- set target_relation = this.incorporate(type='table') -%}
{% if existing_relation is not none and not existing_relation.is_table %}
{{ log("Dropping relation " ~ existing_relation ~ " because it is of type " ~ existing_relation.type) }}
{{ drop_relation_if_exists(existing_relation) }}
{% endif %}
-- as a general rule, data platforms that can clone tables can also do atomic 'create or replace'
{% call statement('main') %}
{% if target_relation and defer_relation and target_relation == defer_relation %}
{{ log("Target relation and defer relation are the same, skipping clone for relation: " ~ target_relation) }}
{% else %}
{{ create_or_replace_clone(target_relation, defer_relation) }}
{% endif %}
{% endcall %}
{% set should_revoke = should_revoke(existing_relation, full_refresh_mode=True) %}
{% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %}
{% do persist_docs(target_relation, model) %}
{{ return({'relations': [target_relation]}) }}
{%- else -%}
{%- set target_relation = this.incorporate(type='view') -%}
-- reuse the view materialization
-- TODO: support actual dispatch for materialization macros
-- Tracking ticket: https://github.com/dbt-labs/dbt-core/issues/7799
{% set search_name = "materialization_view_" ~ adapter.type() %}
{% if not search_name in context %}
{% set search_name = "materialization_view_default" %}
{% endif %}
{% set materialization_macro = context[search_name] %}
{% set relations = materialization_macro() %}
{{ return(relations) }}
{%- endif -%}
{%- endmaterialization -%}
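The branch at the heart of the macro above can be restated outside Jinja. The following is a minimal Python sketch of that decision, not dbt code: `choose_clone_strategy` and its parameters are hypothetical names, and a `cached_type` of `None` is assumed to stand for the case where `load_cached_relation(defer_relation)` finds nothing.

```python
from typing import Optional


def choose_clone_strategy(
    cached_type: Optional[str],
    rechecked_type: Optional[str],
    can_clone_table: bool,
) -> str:
    """Return 'table' for a zero-copy clone or 'view' for a view pointer.

    cached_type:     type dbt holds in its relation cache (None = no cached
                     relation; this encoding is an assumption of the sketch).
    rechecked_type:  type read back from information_schema.tables, with
                     'BASE TABLE' already mapped to 'table' as in the macro.
    can_clone_table: whether the adapter supports zero-copy table clones.
    """
    relation_found = cached_type is not None
    looks_like_table = cached_type == "table" or rechecked_type == "table"
    if relation_found and looks_like_table and can_clone_table:
        return "table"
    return "view"


if __name__ == "__main__":
    # Cached type and recheck disagree: the recheck wins and a table is cloned.
    print(choose_clone_strategy("view", "table", True))
    # No cached relation at all: the view branch is taken regardless.
    print(choose_clone_strategy(None, "table", True))
```

Read this way, the recheck can only rescue a relation that dbt found in its cache but typed as something other than a table. If the relation is missing from the cache entirely, the macro still falls through to the view branch. The comparison is also case-sensitive, so the uppercase `VIEW` returned by `information_schema` never matches `'table'`.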
With these two models:
-- models/foo.sql
{{ config(materialized='table') }}
select 2 id
-- models/bar.sql
{{ config(materialized='view') }}
select * from {{ ref('foo') }}
We get additional logging information in the debug logs:
$ dbt --debug clone -s state:modified+ --defer --state target_old --target dev
10:33:16 Began running node model.my_dbt_project.foo
10:33:16 Re-using an available connection from the pool (formerly list_development_jyeo_dbt_jyeo, now model.my_dbt_project.foo)
10:33:16 Began compiling node model.my_dbt_project.foo
10:33:16 Began executing node model.my_dbt_project.foo
10:33:16 On "model.my_dbt_project.foo": cache miss for schema "prod_jyeo.dbt_jyeo", this is inefficient
10:33:17 Using snowflake connection "model.my_dbt_project.foo"
10:33:17 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "sf", "target_name": "dev", "node_id": "model.my_dbt_project.foo"} */
show objects in prod_jyeo.dbt_jyeo limit 10000
10:33:17 Opening a new connection, currently in state closed
10:33:18 SQL status: SUCCESS 4 in 2.0 seconds
10:33:18 While listing relations in database=prod_jyeo, schema=dbt_jyeo, found: ABC, BAR, BAZ, FOO
10:33:18 CLONE DEBUG: "PROD_JYEO"."DBT_JYEO"."FOO" type is table
10:33:18 Using snowflake connection "model.my_dbt_project.foo"
10:33:18 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "sf", "target_name": "dev", "node_id": "model.my_dbt_project.foo"} */
select case when table_type = 'BASE TABLE' then 'table' else table_type end as table_type
from prod_jyeo.information_schema.tables
where lower(table_schema) = lower('dbt_jyeo')
and lower(table_name) = lower('foo')
10:33:20 SQL status: SUCCESS 1 in 1.0 seconds
10:33:20 CLONE DEBUG (recheck type): defer relation type is table
10:33:20 Writing runtime sql for node "model.my_dbt_project.foo"
10:33:20 Using snowflake connection "model.my_dbt_project.foo"
10:33:20 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "sf", "target_name": "dev", "node_id": "model.my_dbt_project.foo"} */
create or replace
transient
table development_jyeo.dbt_jyeo.foo
clone prod_jyeo.dbt_jyeo.foo
10:33:21 SQL status: SUCCESS 1 in 1.0 seconds
10:33:21 On model.my_dbt_project.foo: Close
10:33:22 Finished running node model.my_dbt_project.foo
10:33:22 Began running node model.my_dbt_project.bar
10:33:22 Re-using an available connection from the pool (formerly model.my_dbt_project.foo, now model.my_dbt_project.bar)
10:33:22 Began compiling node model.my_dbt_project.bar
10:33:22 Began executing node model.my_dbt_project.bar
10:33:22 CLONE DEBUG: "PROD_JYEO"."DBT_JYEO"."BAR" type is view
10:33:22 Using snowflake connection "model.my_dbt_project.bar"
10:33:22 On model.my_dbt_project.bar: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "sf", "target_name": "dev", "node_id": "model.my_dbt_project.bar"} */
select case when table_type = 'BASE TABLE' then 'table' else table_type end as table_type
from prod_jyeo.information_schema.tables
where lower(table_schema) = lower('dbt_jyeo')
and lower(table_name) = lower('bar')
10:33:22 Opening a new connection, currently in state closed
10:33:24 SQL status: SUCCESS 1 in 2.0 seconds
10:33:24 CLONE DEBUG (recheck type): defer relation type is VIEW
10:33:24 Writing runtime sql for node "model.my_dbt_project.bar"
10:33:24 Using snowflake connection "model.my_dbt_project.bar"
10:33:24 On model.my_dbt_project.bar: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "sf", "target_name": "dev", "node_id": "model.my_dbt_project.bar"} */
create or replace view development_jyeo.dbt_jyeo.bar
as (
select * from prod_jyeo.dbt_jyeo.bar
);
10:33:25 SQL status: SUCCESS 1 in 1.0 seconds
10:33:25 On model.my_dbt_project.bar: Close
10:33:25 Finished running node model.my_dbt_project.bar
Prior to cloning the prod object, the debug logs will tell us:
It would be useful to know whether (1) and (2) line up: if they do, we can be 100% certain that Snowflake is reporting the prod object as a confirmed
If (1) and (2) don't line up, I think that is useful to know as well. Additionally, by logging out
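Checking whether the two logged types agree can be automated once the debug log has been saved to a file. The sketch below is one possible way to do that in Python. It assumes the log lines have exactly the `CLONE DEBUG` shapes emitted by the macro above, and that each recheck line follows the cached-type line of the same node, which may not hold when dbt runs with multiple threads. `compare_logged_types` is a hypothetical helper, not part of dbt.

```python
import re
from typing import Dict, Optional, Tuple

# Matches: CLONE DEBUG: "DB"."SCHEMA"."NAME" type is table
CACHED = re.compile(r"CLONE DEBUG: (?P<relation>\S+) type is (?P<type>\S+)")
# Matches: CLONE DEBUG (recheck type): defer relation type is VIEW
RECHECK = re.compile(
    r"CLONE DEBUG \(recheck type\): defer relation type is (?P<type>\S+)"
)


def compare_logged_types(log_text: str) -> Dict[str, Tuple[str, str, bool]]:
    """Map each relation to (cached_type, rechecked_type, types_agree)."""
    results: Dict[str, Tuple[str, str, bool]] = {}
    pending: Optional[Tuple[str, str]] = None  # cached line awaiting its recheck
    for line in log_text.splitlines():
        cached_match = CACHED.search(line)
        if cached_match:
            pending = (cached_match.group("relation"), cached_match.group("type"))
            continue
        recheck_match = RECHECK.search(line)
        if recheck_match and pending is not None:
            relation, cached_type = pending
            rechecked_type = recheck_match.group("type")
            # Compare case-insensitively: the cache logs 'view', Snowflake 'VIEW'.
            agree = cached_type.lower() == rechecked_type.lower()
            results[relation] = (cached_type, rechecked_type, agree)
            pending = None
    return results


if __name__ == "__main__":
    sample = (
        '10:33:18 CLONE DEBUG: "PROD_JYEO"."DBT_JYEO"."FOO" type is table\n'
        "10:33:20 CLONE DEBUG (recheck type): defer relation type is table\n"
    )
    for relation, row in compare_logged_types(sample).items():
        print(relation, row)
```

Any relation whose third field is `False` is one where dbt's cached type and Snowflake's `information_schema` disagree, which is the case worth attaching to this issue.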
Is this a new bug in dbt-core?
Current Behavior
On the latest dbt version ("Keep on latest version") in dbt Cloud, dbt clone sometimes creates a view pointer to the target table instead of an actual cloned table.
Internal thread: https://dbt-labs.slack.com/archives/C05FWBP9X1U/p1717992405960779
Expected Behavior
If the target table exists as a table, it should be cloned via
create or replace table db.ci.foo clone db.prod.foo;
instead of being created as a view pointer:
create or replace view db.ci.foo as (select * from db.prod.foo);
Steps To Reproduce
Not able to repro yet.
Relevant log output
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
I believe this is a core thing as opposed to an adapter thing so filing this here.