Describe the bug
We are using DBT Cloud to run multiple DBT jobs. Each runs on a 10-minute cadence, uses the same DBT schema, and connects to a Redshift DW.
So, there's a high probability that two different jobs are attempting to write into tables in the Elementary schema at the same time. When that happens, we run into Serializable isolation violation errors, and our DBT jobs are aborted.
After digging into the debug logs, I found these interesting lines:
2023-09-13 12:42:31.851461 (MainThread): 12:42:31 On master: /* {"app": "dbt", "dbt_version": "1.3.5", "profile_name": "user", "target_name": "default", "connection_name": "master"} */
create temporary table
"dbt_tests__tmp_20230913124231846680124231848433"
as (
SELECT
*
FROM "analytics_elementary"."dbt_tests"
WHERE 1 = 0
);
...
2023-09-13 12:42:47.278637 (MainThread): 12:42:47 On master: /* {"app": "dbt", "dbt_version": "1.3.5", "profile_name": "user", "target_name": "default", "connection_name": "master"} */
begin transaction;
delete from "analytics_elementary"."dbt_tests"; -- truncate supported in Redshift transactions, but causes an immediate commit
insert into "analytics_elementary"."dbt_tests" select * from "dbt_tests__tmp_20230913124231846680124231848433";
commit;
2023-09-13 12:42:48.232158 (MainThread): 12:42:48 Postgres adapter: Postgres error: 1023
DETAIL: Serializable isolation violation on table - 88024998, transactions forming the cycle are: 954315228, 954315158 (pid:1073760694)
So, at 12:42:31 UTC, we create a temp table with the contents of dbt_tests. Then, 16 seconds later, at 12:42:47 UTC, we "truncate" dbt_tests and insert everything back from the temp table. This fails because another DBT job is executing the same DELETE+INSERT commands at the same time.
I've confirmed in svv_table_info that the table 88024998 is indeed dbt_tests, and that the two transaction IDs (954315228, 954315158) are referring to the DELETE+INSERT transaction blocks in two different DBT jobs.
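Stripped of dbt's wrapping, the conflict from the logs boils down to two sessions running the same transaction block concurrently. This is an illustration using the schema and table names from the logs above, not Elementary's exact generated SQL:

```sql
-- Session A (job 1) and Session B (job 2) both execute:
begin transaction;

-- Full-table delete of the shared Elementary table
delete from "analytics_elementary"."dbt_tests";

-- Re-insert everything from this job's own temp table
insert into "analytics_elementary"."dbt_tests"
    select * from "dbt_tests__tmp_<job-specific suffix>";

commit;

-- Under Redshift's serializable isolation, both transactions read and
-- rewrite the same table, forming a cycle; one is aborted with error 1023.
```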
To Reproduce
Steps to reproduce the behavior:
Create a new DBT project with two models, ex: model_1 and model_2
Add Redshift target to the project
Add Elementary package to the project
Execute dbt run --select model_1 and dbt run --select model_2 at the same time
See an error - Serializable isolation violation on table
Expected behavior
Concurrent DBT executions with Elementary enabled work. I imagine the fix would be:
Create a new empty temp table
Insert new records into the temp table
Insert into the final table (analytics_elementary.dbt_tests) everything from the temp table
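The proposed append-only flow could be sketched in SQL like this (hedged: table names are taken from this issue, and the real fix would live in Elementary's materialization macros):

```sql
-- 1. Create a new empty temp table with the same structure
create temporary table "dbt_tests__tmp"
    (like "analytics_elementary"."dbt_tests");

-- 2. Insert only this run's new records into the temp table
insert into "dbt_tests__tmp"
    select ...;  -- records produced by the current run only

-- 3. Append the temp table into the final table; with no DELETE on the
--    shared table, concurrent jobs only add rows and cannot form a
--    write-write cycle under serializable isolation.
insert into "analytics_elementary"."dbt_tests"
    select * from "dbt_tests__tmp";
```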
Environment (please complete the following information):
jakubro changed the title to "Concurrent DBT jobs are failing on populating elementary tables in Redshift" on Sep 13, 2023.
This also happens to us using dbt v1.4 and elementary-data==0.10.0 on Redshift. For now, we had to disable elementary because of this. Looking forward to a solution!
pip freeze:
packages.yml: