
[#2479] Allow unique_key to take a list #4618

Merged: 6 commits merged on Feb 3, 2022
Conversation

@gshank (Contributor) commented Jan 24, 2022

resolves #2479

Description

This derives from pull request #4159.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change

@cla-bot (bot) added the cla:yes label on Jan 24, 2022
@gshank mentioned this pull request on Jan 24, 2022
@jtcohen6 (Contributor) left a comment:

This works with both Snowflake + BigQuery when I run locally. As a follow-up to merging this PR, we should look to add a simple integration test to both of those plugins:

{{ config(
  materialized = 'incremental',
  unique_key = ['id1', 'id2']
)}}

select 1 as id1, 2 as id2

I think the first step in getting the tests to pass should be rebasing against / merging in main. When I run locally, I see: Running with dbt=0.21.0-rc1.

Reviewed diff excerpt (default__get_merge_sql):

DBT_INTERNAL_SOURCE.{{ unique_key }} = DBT_INTERNAL_DEST.{{ unique_key }}
{% endset %}
{% do predicates.append(unique_key_match) %}
{% if unique_key is sequence and unique_key is not mapping and unique_key is not string %}
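Pieced together, here is a minimal sketch of the full branch this diff introduces: one equality predicate per key when unique_key is a list, falling back to the existing single-key behavior otherwise. It is assembled from the excerpt above (predicates being the list the macro builds for the merge on clause); the exact body merged in this PR may differ:

{% if unique_key is sequence and unique_key is not mapping and unique_key is not string %}
    {# list case: append one source-to-dest equality per key column #}
    {% for key in unique_key %}
        {% set this_key_match %}
            DBT_INTERNAL_SOURCE.{{ key }} = DBT_INTERNAL_DEST.{{ key }}
        {% endset %}
        {% do predicates.append(this_key_match) %}
    {% endfor %}
{% else %}
    {# string case: single-column match, the pre-existing behavior #}
    {% set unique_key_match %}
        DBT_INTERNAL_SOURCE.{{ unique_key }} = DBT_INTERNAL_DEST.{{ unique_key }}
    {% endset %}
    {% do predicates.append(unique_key_match) %}
{% endif %}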
Contributor left a comment:
This update to default__get_merge_sql will work for Snowflake + BigQuery out of the box. For other databases:

  • Postgres + Redshift don't support merge; they use get_delete_insert_merge_sql instead. We'd need to refactor the delete statement to support multiple unique keys, which is tricky enough that I think we can open a new issue for it (see the sketch after this list). No need to block this PR in the meantime.
  • Spark reimplements this as spark__get_merge_sql, so we'll need to make a similar change over there if we want to support the same functionality.
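To make that delete-statement refactor concrete, here is a hedged sketch of how the delete portion of get_delete_insert_merge_sql might handle a key list on Postgres/Redshift (target and source mirror the macro's existing arguments; this is one possible shape, not the implementation that eventually shipped):

{% if unique_key is sequence and unique_key is not mapping and unique_key is not string %}
    {# list case: delete target rows that match the source on every key column #}
    delete from {{ target }}
    using {{ source }}
    where (
        {% for key in unique_key %}
            {{ source }}.{{ key }} = {{ target }}.{{ key }}
            {{ "and " if not loop.last }}
        {% endfor %}
    )
{% else %}
    {# string case: single-column membership test, unchanged behavior #}
    delete from {{ target }}
    where {{ unique_key }} in (select {{ unique_key }} from {{ source }})
{% endif %}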

Contributor left a comment:

For the Spark change here, @gshank maybe you can pair with @McKnight-42 or @VersusFacit on making that change.

@McKnight-42 @VersusFacit - this behavior difference would be good to document in the adapter-specific pages if it isn't there yet

Contributor left a comment:

I think we'll also want integration tests for Snowflake + BigQuery, to ensure this change actually achieves the desired result.

triedandtested-dev and others added 6 commits on January 27, 2022

  • `unique_key` can be a string or a list.
  • Extend the functionality to support both single and multiple keys.

Signed-off-by: triedandtested-dev (Bryan Dunkley) <bryan@triedandtested.dev>
@gshank (Contributor, Author) commented Jan 27, 2022

I'm not quite sure what the flow should be here. Should this PR be merged first and then the separate adapter changes be addressed?

@leahwicz (Contributor) commented:

@gshank I'm not opposed to doing those changes in a separate PR as long as this change doesn't break anything w/o those other changes. If we do a separate PR, can you please open a tracking issue so we don't lose track of doing the work?

@gshank (Contributor, Author) commented Jan 28, 2022

@leahwicz I think they would have to be separate PRs, because they're different repositories, right? You can't put changes from multiple repos into the same PR. I don't know if this change would break anything in the other repos, and I'm not sure how to test them against a dbt-core branch.

@leahwicz (Contributor) commented:

> they're different repositories, right?

@gshank that is correct, but I just didn't know if we needed to coordinate with the other repos. I don't want to merge this change if it causes the other repos to start failing tests; instead, we'd coordinate to merge all the PRs on the same day, if that makes sense.

@jtcohen6 (Contributor) commented:

> I don't want to merge this change if it causes the other repos to start failing tests; instead, we'd coordinate to merge all the PRs on the same day.

This is a really good callout in general. In this particular case, I don't believe that this change will (or should) cause failing tests in those other repos. (If it does, that failing test will be meaningful, and indicate we've probably done something wrong here!)

It is incumbent on us to follow up with:

  • PRs to dbt-snowflake + dbt-bigquery to add an integration test for the functionality added in this PR, since this is a case where the "default" merge logic, defined in the dbt-core global project, is not actually exercised on Postgres
  • a PR to dbt-spark that mirrors this functionality, and adds an integration test for it

@McKnight-42 (Contributor) left a comment:

Getting some context, and reading up. LGTM.

@@ -2,6 +2,7 @@

### Features
- New Dockerfile to support specific db adapters and platforms. See docker/README.md for details ([#4495](https://github.com/dbt-labs/dbt-core/issues/4495), [#4487](https://github.com/dbt-labs/dbt-core/pull/4487))
- Allow unique_key to take a list ([#2479](https://github.com/dbt-labs/dbt-core/issues/2479), [#4618](https://github.com/dbt-labs/dbt-core/pull/4618))
Contributor left a comment:

Let's make sure to add the contributor from the original PR to the changelog for v1.1:

- [@triedandtested-dev](https://github.com/triedandtested-dev) ([#4618](https://github.com/dbt-labs/dbt-core/pull/4618))

@cdabel left a comment:

By the way, I noticed that this list option is missing from the docs here:
https://docs.getdbt.com/reference/resource-configs/unique_key#use-a-combination-of-two-columns-as-a-unique-key


Contributor left a comment:

Good call @cdabel!

Just opened an issue for that here: dbt-labs/docs.getdbt.com#4642

@HeddeCrisp commented Mar 16, 2023

I'm encountering an error when running an incremental BigQuery model with multiple unique keys. The error I'm getting is:

10:24:38  Database Error in model sample_model (models/ODL_derived/sample_model.sql)
10:24:38    Syntax error: Unexpected "[" at [546:33]. If this is a table identifier, escape the name with `, e.g. `table.name` rather than [table.name].
10:24:38    compiled Code at target/run/sample_dwh/models/ODL_derived/sample_model.sql

This is caused by the compiled code:

[...] ) as DBT_INTERNAL_SOURCE
        on DBT_INTERNAL_SOURCE.['col1', 'col2', 'col3'] = DBT_INTERNAL_DEST.['col1', 'col2', 'col3']

Config used is:

config(
    name="sample_model"
    , materialized='incremental'
    , unique_key=['col1', 'col2', 'col3']
)

Am I doing something wrong, or is this dbt code that fails on BigQuery with these settings?

@VersusFacit (Contributor) commented:

Heya @HeddeCrisp! It looks like you have the right syntax. Your error is familiar and strikes me as what used to happen before we merged this.

I did a little incremental model test with the following config:

{{
    config(
        materialized='incremental',
        unique_key=['id', 'first_name', 'last_name']
    )
}}

And my dbt version is:

Core:
  - installed: 1.4.5
  - latest:    1.4.5 - Up to date!

Plugins:
  - bigquery: 1.4.3 - Up to date!

So this implies to me that there's not an error on our end with this feature; rather, you might be using a dbt-bigquery version that didn't have this feature yet. What does dbt --version yield on your system?
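For reference, on a version that includes this change, the compiled on clause should render one equality per key instead of splicing the Python list in, i.e. something like the following (illustrative only, using the column names from the snippet above):

on DBT_INTERNAL_SOURCE.col1 = DBT_INTERNAL_DEST.col1
    and DBT_INTERNAL_SOURCE.col2 = DBT_INTERNAL_DEST.col2
    and DBT_INTERNAL_SOURCE.col3 = DBT_INTERNAL_DEST.col3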

@nguyenann13 commented Jun 8, 2023

I'm receiving an error when I pass a list:

Parsing Error
  at path ['unique_key']: ['usr_key', 'usr_mbr_key', 'prgm_key', 'prmsn_key', 'resp_key', 'mbr_cmpy_key', 'usr_prgm_prmsn_key'] is not valid under any of the given schemas

{% snapshot snap_mkp_usr_prgm_prmsn %}

    {{
        config(
          target_schema='edl_workdb',
          strategy='timestamp',
          unique_key=['usr_key', 'usr_mbr_key', 'prgm_key', 'prmsn_key', 'resp_key', 'mbr_cmpy_key', 'usr_prgm_prmsn_key'],
          updated_at='upd_ts',
          invalidate_hard_deletes=True,
        )
    }}

    -- Filter out value analysis app as it's sunset and has duplicates
    select * from {{ source('hzn_mkp', 'mkp_usr_prmsn_app_t') }} where app_key <> 5

{% endsnapshot %}
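Worth noting: this PR added list support for the incremental materialization's unique_key, but the snapshot config schema still expects a single string here, which is what the parsing error is saying. A common workaround, shown only as a sketch reusing the column names above, is to pass one concatenated surrogate-key expression (pick a separator that cannot appear in the data, and add casts if any key column is non-text):

{{
    config(
      target_schema='edl_workdb',
      strategy='timestamp',
      unique_key="usr_key || '-' || usr_mbr_key || '-' || prgm_key || '-' || prmsn_key || '-' || resp_key || '-' || mbr_cmpy_key || '-' || usr_prgm_prmsn_key",
      updated_at='upd_ts',
      invalidate_hard_deletes=True,
    )
}}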

Successfully merging this pull request may close these issues.

[CT-96] Allow unique_key for incremental materializations to take a list
10 participants