
Concurrent DBT jobs are failing on populating elementary tables in Redshift #1161

Closed
jakubro opened this issue Sep 13, 2023 · 2 comments
Labels: Bug (Something isn't working), Triage 👀

Comments


jakubro commented Sep 13, 2023

Describe the bug

We are using DBT cloud to run multiple DBT jobs. Each runs on a 10-minute cadence, uses the same DBT schema, and connects to Redshift DW.

There's therefore a high probability that two different jobs attempt to write to tables in the Elementary schema at the same time. When that happens, we run into Serializable isolation violation errors and our DBT jobs are aborted.

After digging into the debug logs, I found these interesting lines:

2023-09-13 12:42:31.851461 (MainThread): 12:42:31  On master: /* {"app": "dbt", "dbt_version": "1.3.5", "profile_name": "user", "target_name": "default", "connection_name": "master"} */

  create temporary table
    "dbt_tests__tmp_20230913124231846680124231848433"

  as (

        SELECT

            *

        FROM "analytics_elementary"."dbt_tests"
        WHERE 1 = 0

  );

...

2023-09-13 12:42:47.278637 (MainThread): 12:42:47  On master: /* {"app": "dbt", "dbt_version": "1.3.5", "profile_name": "user", "target_name": "default", "connection_name": "master"} */

begin transaction;
delete from "analytics_elementary"."dbt_tests";   -- truncate supported in Redshift transactions, but causes an immediate commit
insert into "analytics_elementary"."dbt_tests" select * from "dbt_tests__tmp_20230913124231846680124231848433";
commit;

2023-09-13 12:42:48.232158 (MainThread): 12:42:48  Postgres adapter: Postgres error: 1023
DETAIL:  Serializable isolation violation on table - 88024998, transactions forming the cycle are: 954315228, 954315158 (pid:1073760694)

So, at 12:42:31 UTC, we create a temp table with the content of dbt_tests. Then, 16 seconds later, at 12:42:47 UTC, we "truncate" dbt_tests and insert everything from the temp table. Then, it fails because another DBT job is executing the same DELETE+INSERT commands.

I've confirmed in svv_table_info that the table 88024998 is indeed dbt_tests, and that the two transaction IDs (954315228, 954315158) are referring to the DELETE+INSERT transaction blocks in two different DBT jobs.
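
For anyone who wants to repeat that check, ad-hoc queries along these lines should work on Redshift (the table id and transaction ids below are the ones from the error above):

-- Map the table id from the error message to a schema/table name.
select "database", "schema", "table", table_id
from svv_table_info
where table_id = 88024998;

-- Inspect the aborted transactions involved in the serializable isolation conflict.
select xact_id, process_id, xact_start_ts, abort_time, table_id
from stl_tr_conflict
where xact_id in (954315228, 954315158);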

To Reproduce

Steps to reproduce the behavior:

  1. Create a new DBT project with two models, e.g. model_1 and model_2
  2. Add Redshift target to the project
  3. Add Elementary package to the project
  4. Execute dbt run --select model_1 and dbt run --select model_2 at the same time
  5. See an error - Serializable isolation violation on table

Expected behavior

Concurrent DBT executions with Elementary enabled should work. I can imagine the fix would be (a rough SQL sketch follows the list below):

  1. Create a new empty temp table
  2. Insert new records into the temp table
  3. Insert everything from the temp table into the final table (analytics_elementary.dbt_tests)
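
A minimal sketch of that append-only flow in Redshift SQL (illustration only, not Elementary's actual upload macro; the per-run staging relation name is made up):

-- 1. Create an empty temp table with the same columns as the target.
create temporary table dbt_tests__tmp (like "analytics_elementary"."dbt_tests");

-- 2. Load only this invocation's new records into the temp table
--    (here from a hypothetical per-run staging relation).
insert into dbt_tests__tmp
select * from "analytics_elementary"."dbt_tests__run_results_stage";

-- 3. Append to the shared table. Because nothing is deleted, two concurrent
--    runs only add rows and should not form the write-write cycle that causes
--    the serializable isolation violation.
insert into "analytics_elementary"."dbt_tests"
select * from dbt_tests__tmp;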

Environment (please complete the following information):

  • dbt - 1.3.2
  • dbt redshift adapter - 1.3.2
  • elementary-data/elementary - 0.10.2

pip freeze:

agate==1.6.3
agate-dbf==0.2.2
agate-excel==0.2.5
agate-sql==0.5.9
attrs==22.2.0
Babel==2.11.0
black==23.7.0
boto3==1.26.46
botocore==1.29.46
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==2.1.1
click==8.1.3
colorama==0.4.5
csvkit==1.1.1
dbfread==2.0.7
dbt-core==1.3.2
dbt-extractor==0.4.1
dbt-postgres==1.3.2
dbt-redshift==1.3.0
et-xmlfile==1.1.0
future==0.18.3
greenlet==2.0.2
hologram==0.0.15
idna==3.4
importlib-metadata==6.0.0
isodate==0.6.1
Jinja2==3.1.2
jmespath==1.0.1
jsonschema==3.2.0
leather==0.3.4
Logbook==1.5.3
MarkupSafe==2.1.1
mashumaro==3.0.4
minimal-snowplow-tracker==0.0.2
msgpack==1.0.4
mypy-extensions==1.0.0
networkx==2.6.3
olefile==0.46
openpyxl==3.1.2
packaging==21.3
parsedatetime==2.4
pathspec==0.9.0
platformdirs==3.10.0
psycopg2-binary==2.9.5
pycparser==2.21
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
python-slugify==7.0.0
pytimeparse==1.1.8
pytz==2022.7
PyYAML==6.0
requests==2.31.0
s3transfer==0.6.0
six==1.16.0
SQLAlchemy==1.4.49
sqlparse==0.4.4
text-unidecode==1.3
tomli==2.0.1
typing_extensions==4.4.0
urllib3==1.26.13
Werkzeug==2.2.3
xlrd==2.0.1
zipp==3.11.0

packages.yml:

packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1 
  - package: elementary-data/elementary
    version: 0.10.2
@jakubro jakubro added Bug Something isn't working Triage 👀 labels Sep 13, 2023
@jakubro jakubro changed the title Concurrent DBT jobs are failing on populatiung elementary tables in Redshift Concurrent DBT jobs are failing on populating elementary tables in Redshift Sep 13, 2023
LePeti (Contributor) commented Oct 4, 2023

This also happens to us using dbt v1.4 and elementary-data==0.10.0 on Redshift. For now, we had to disable elementary because of this. Looking forward to a solution!

haritamar (Collaborator) commented:

Hi all,
Closing this issue since it had no activity in the past 3 months, but if it's still relevant for anyone please feel free to re-open.

Thanks,
Itamar
