Describe the bug
The snapshot calculation relies on the Teradata HASHROW function. The dbt_scd_id is generated for each row from the provided unique_key and the current timestamp. However, HASHROW returns only a 4-byte (32-bit) hash, which is highly prone to collisions on large tables. For instance, the values d3dadd49420542fb49ffbf6a77349b45 and 34f325fe5a4216f27357328b61c9eccb both produce the hash 02-27-E3-B4. Similarly, the numbers 162181727 and 880145039 both produce the hash 2E-5B-FE-DD. In a source with 36 million numbers, we found over 180 thousand duplicate dbt_scd_id values.
These collisions cause the snapshot update to fail with the error: [Error 7547] Target row updated by multiple source rows.
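To put the collision rate in context, here is a rough birthday-bound estimate in Python. It is only a sketch under a stated assumption: it treats the hash as an ideal, uniformly distributed 32-bit value, which is not a model of how HASHROW actually behaves, and the function names are made up for illustration. Under that assumption, 36 million rows should already yield roughly 150 thousand duplicate hashes, the same order of magnitude as the figure reported above.

```python
import math

# Assumption: the hash behaves like an ideal uniform 32-bit function.
# This does not model the real distribution of Teradata's HASHROW.
HASH_BITS = 32


def expected_colliding_pairs(n_rows: int, hash_bits: int = HASH_BITS) -> float:
    """Expected number of row pairs that share a hash (birthday bound)."""
    buckets = 2 ** hash_bits
    return n_rows * (n_rows - 1) / (2 * buckets)


def expected_duplicate_hashes(n_rows: int, hash_bits: int = HASH_BITS) -> float:
    """Expected number of rows whose hash was already taken by another row.

    Uses the approximation distinct ~= buckets * (1 - exp(-n_rows / buckets)).
    """
    buckets = 2 ** hash_bits
    expected_distinct = -buckets * math.expm1(-n_rows / buckets)
    return n_rows - expected_distinct


if __name__ == "__main__":
    n = 36_000_000  # row count taken from the report above
    print(f"expected colliding pairs:  {expected_colliding_pairs(n):,.0f}")
    print(f"expected duplicate hashes: {expected_duplicate_hashes(n):,.0f}")
```

With a 64-bit or wider hash, the same estimate drops far below one expected collision at this row count, which is why a wider hash for dbt_scd_id would avoid the failure.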
Steps To Reproduce
Create a source whose ID column contains the values listed above, then run a snapshot on it.
Expected behavior
The snapshot is calculated without errors.
Screenshots and log output
The output of dbt --version:
Core:
- installed: 1.7.11
- latest: 1.8.0 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- teradata: 1.7.2 - Up to date!
The operating system you're using:
Windows 11
The output of python --version:
Python 3.11.3
Additional context