Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try using SQL for create_schema #183

Merged
merged 10 commits into from
Jul 14, 2022
Merged

Try using SQL for create_schema #183

merged 10 commits into from
Jul 14, 2022

Conversation

jtcohen6
Copy link
Contributor

@jtcohen6 jtcohen6 commented May 5, 2022

resolves #182

I'd like to avoid duplicating code from SQLAdapter if we can help it, but I'm not sure of the right way to plumb through its methods without turning BigQueryAdapter into an unholy hybrid of BaseAdapter + SQLAdapter.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-bigquery next" section.

@cla-bot cla-bot bot added the cla:yes label May 5, 2022
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented May 5, 2022

  • Unit tests for create_schema + drop_schema fail, which isn't too surprising
  • All our legacy integration tests (tests/integration) are failing as well. Just out of curiosity, I'm going to reorder to run the new functional tests (tests/functional) first, since that was working for me locally

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented May 5, 2022

Cool! Per latest GHA run, this change seems to be working with our new testing framework, just not the legacy one. I'm going to close this PR for now, and consider it blocked by our ongoing test conversion work.

@jtcohen6 jtcohen6 closed this May 5, 2022
@jtcohen6
Copy link
Contributor Author

The main value here is around converting create_schema, since that's what dbt runs at start-up. The only cases where dbt actually uses its own drop_schema method/macro are:

  • clean up after integration/functional test
  • clean up PR schemas in dbt Cloud

I'm running into a very confusing permissions issue at the intersection of the old testing framework and SQL drop schema. So, we could unblock this work for the meantime by moving ahead with just create schema, and opening a new tech debt ticket to track the unfinished work around drop schema. I believe that work will be unblocked once we've totally put to rest the old testing framework.

@jtcohen6 jtcohen6 reopened this May 11, 2022
@jtcohen6 jtcohen6 changed the title Try using SQL create_schema + drop_schema Try using SQL for create_schema May 11, 2022
@McKnight-42
Copy link
Contributor

This is very interesting, do you think we will eventually move completely away from using BaseAdapter for BigQuery?

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jun 7, 2022

This is very interesting, do you think we will eventually move completely away from using BaseAdapter for BigQuery?

@McKnight-42 Very cool question! I think we're still a ways away from getting there, but it does seem like the direction we're heading in, slowly.

That said, work like #136 should convince us, if we weren't convinced already, that BigQuery's Python API might always be half a step ahead of its SQL API, in terms of performance / cost / reliability.

So, I could see us adding bits and pieces of the SQLAdapter here, until at some point we realize that it would be more expedient to inherit from the SQLAdapter instead, and reimplement the pieces that still need to be in Python. For today, I think the gaps are still pretty wide. In the meantime, I'd also say it's good, for our own edification, that our 5 maintained adapters offer a representative cross-section of the oddities that are out there.

@github-christophe-oudar
Copy link
Contributor

I would add that it's indeed pretty frustrating that we can't rely only on SQL on BigQuery. #167 is yet another example on how BigQuery lacks consistency on that topic and makes it harder for its users to rely only on SQL (and that we need some smart custom dbt logic to make it easier to work with for dbt users until Google figures it out).

@jtcohen6 jtcohen6 marked this pull request as ready for review June 15, 2022 08:38
@jtcohen6 jtcohen6 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Jun 15, 2022
@McKnight-42 McKnight-42 self-requested a review June 15, 2022 14:29
Comment on lines 352 to 355
def test_create_schema(self):
relation = BigQueryRelation.create(database='db', schema='schema')
self.adapter.create_schema(relation)
self.mock_connection_manager.create_dataset.assert_called_once_with('db', 'schema')
# def test_create_schema(self):
# relation = BigQueryRelation.create(database='db', schema='schema')
# self.adapter.create_schema(relation)
# self.mock_connection_manager.create_dataset.assert_called_once_with('db', 'schema')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtcohen6 do we plan on leaving the test commented out or is it okay to remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - probably okay to remove. I don't think we can have a dedicated unit test like this (mocking a Python method) if dbt is running real SQL instead. I also don't feel like we need a dedicated test for create_schema, since it's a required "setup" step in every single functional test.

@McKnight-42 McKnight-42 self-requested a review June 16, 2022 15:02
Copy link
Contributor

@McKnight-42 McKnight-42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Looks fine to me!

@jtcohen6 jtcohen6 merged commit 803b954 into main Jul 14, 2022
@jtcohen6 jtcohen6 deleted the update/create-drop-schemas branch July 14, 2022 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-610] Prefer SQL for create_schema + drop_schema
3 participants