Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/faster materializations with fewer transactions #3468

Merged

Conversation

tconbeer
Copy link
Contributor

@tconbeer tconbeer commented Jun 16, 2021

resolves #3112

Description

This PR updates the view and table materializations in the core/.../global_project to check the relation cache for __dbt_tmp and __dbt_backup relations instead of blindly trying to drop those relations. The additional transactions required to drop those temp relations slows down dbt runs, especially on Redshift/RA3, where commits are slow.

I've also removed unused code from those same files (set exists_as_table = ... etc).

Unit and pg integration tests pass on my machine; I've added to an existing integration test for this specific edge case of preexisting temp relations.

This PR should not affect Snowflake or BQ, since those adapters have their own materializations.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jun 16, 2021
@tconbeer tconbeer temporarily deployed to Bigquery June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Postgres June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 16, 2021 22:58 Inactive
@tconbeer tconbeer temporarily deployed to Postgres June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 17, 2021 01:01 Inactive
@tconbeer tconbeer temporarily deployed to Postgres June 17, 2021 14:49 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 17, 2021 14:50 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 17, 2021 14:50 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 17, 2021 14:50 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 17, 2021 14:50 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 17, 2021 14:50 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 17, 2021 14:50 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, @tconbeer. Thanks for taking it on, and for adding the test case.

For consistency's sake, let's also update the code in the default incremental materialization:
https://github.com/fishtown-analytics/dbt/blob/9faa019a0780132c0cd908c1b6b875edc7f041f5/core/dbt/include/global_project/macros/materializations/incremental/incremental.sql#L25-L26
It's less pressing there than in table and view—in practice, it should only be relevant for full-refresh runs—but I'd like to maintain as much coordination as possible among code that the three materializations have in common.

CHANGELOG.md Outdated
@@ -16,6 +16,10 @@ Contributors:

- Handle quoted values within test configs, such as `where` ([#3458](https://github.com/fishtown-analytics/dbt/issues/3458), [#3459](https://github.com/fishtown-analytics/dbt/pull/3459))

### Under the hood

- Improve default view and table materialization performance by checking relational cache before attempting to drop temp relations ([#3112](https://github.com/fishtown-analytics/dbt/issues/3112), [#3468](https://github.com/fishtown-analytics/dbt/pull/3468))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this up to v0.21, and add yourself to the list of Contributors there?

This is a net-new change, so I'd prefer to have it in the next version, rather than trying to sneak it in between v0.20.0rc1 and final. We're going to be disciplined about keeping the v0.21 scope narrow and putting out prereleases early & often.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally. Didn't know where to put this, 0.21 makes more sense.

@tconbeer tconbeer temporarily deployed to Postgres June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Redshift June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Snowflake June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 18, 2021 21:22 Inactive
@tconbeer tconbeer temporarily deployed to Bigquery June 18, 2021 21:22 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @tconbeer! Thank you so much for the contribution :)

@jtcohen6 jtcohen6 merged commit 552e831 into dbt-labs:develop Jun 21, 2021
kwigley pushed a commit that referenced this pull request Jun 24, 2021
* drop if exists in view and table materializations

* Add integration test

* Add changelog entry

* PR Review 1
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
* drop if exists in view and table materializations

* Add integration test

* Add changelog entry

* PR Review 1

automatic commit by git-black, original commits:
  552e831
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dbt runs slowly on Redshift RA3 nodes due to excessive commits
2 participants