
Proposal for modification to drop_test_schema #5198

Merged: 3 commits into main from mcknight/modify_drop_test_schema on May 3, 2022

Conversation

@McKnight-42 (Contributor) commented Apr 29, 2022

Resolves dbt-labs/dbt-redshift#110

Description

Test tables were getting locked by transactions that left them in the database after the tests ended, eventually leading to this error:

psycopg2.errors.InternalError_: 1040
DETAIL:  Maximum tables limit exceeded. The maximum number of tables per cluster is 9900 for this instance type. The limit includes permanent and temporary tables. (pid:25694)

because the database had reached its limit on the number of tables.

Checklist

@cla-bot added the cla:yes label Apr 29, 2022
@github-actions (Contributor) commented:

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@McKnight-42 requested a review from gshank April 29, 2022 20:10
@McKnight-42 self-assigned this Apr 29, 2022
@McKnight-42 (Contributor, Author) commented:

@gshank, @jtcohen6 I was wondering if this change seems reasonable or if it poses any known problems?

@gshank (Contributor) commented Apr 29, 2022

The reason I used adapter.drop_schema was, in theory, to use more adapter-specific macros/methods. Do you know why this works and adapter.drop_schema doesn't?

@McKnight-42 (Contributor, Author) commented Apr 29, 2022

For Redshift, when a table has a transaction currently open against it, the table is locked, which prevents the drop_schema macro from completing successfully because of its interaction with drop schema if exists {{ relation.without_identifier() }} cascade.
The docstring at https://github.com/dbt-labs/dbt-redshift/blob/main/dbt/adapters/redshift/impl.py#L36 talks about this a bit, and so do:
https://docs.aws.amazon.com/redshift/latest/dg/r_DROP_TABLE.html
https://docs.aws.amazon.com/redshift/latest/dg/r_STV_LOCKS.html

I believe this is just a Redshift issue, and I'm a little worried that adjusting dbt-core to fix it might affect other adapters, but I'm not entirely sure.

@McKnight-42 reopened this Apr 29, 2022
# Proposed drop_test_schema body: drop each schema created during the test with raw SQL.
for schema_name in self.created_schemas:
    relation = self.adapter.Relation.create(database=self.database, schema=schema_name)
    schema = self.adapter.quote_as_configured(relation.schema, "schema")
    self.run_sql(f"drop schema if exists {schema} cascade")
Review comment from a Contributor:

I'd definitely prefer us not going back to hard-coded run_sql here. This will definitely cause issues on other adapters.

Do we know why we're running drop schema at a point in time when there are still open transactions against objects in the schema? In theory, every test case uses its own dedicated schema, so by the time we're running drop_test_schema, we should be totally done with it.

@McKnight-42 (Contributor, Author) replied:

@jtcohen6 from what I can tell, a lock seems to get put on things like test_basic and incremental_unique_id because they each reference the same tables across many of their tests (e.g. snapshot_expected and duplicate_insert). I wonder if a test hasn't had enough time to tear down its reference to such a table before the next test kicks off, especially within the same class, so Redshift locks the table to avoid two writers updating it at once, as mentioned in https://docs.aws.amazon.com/redshift/latest/dg/r_STV_LOCKS.html. We were trying to deduce whether the get_connection or even the drop_schema calls we make as part of drop_test_schema were starting a transaction right when we were starting teardown. @kwigley, does that sound correct?

Review comment from a Contributor:

> the same tables

These should be in different schemas, though: a dedicated schema for each test case, for all the resources required for that test. Right?

Review comment from a Contributor:

Yes, every test should get a randomly generated unique schema. So in theory, different tests should not be locking each other's tables. And drop_test_schema should only be executed at the end of the test.

@gshank (Contributor) commented May 2, 2022

It's been a while since I had to interact with Redshift, but I do vaguely remember that dropping tables in Redshift didn't necessarily give back everything... you had to recover the space, etc., by... vacuuming? It's been a while. I'm wondering if this is a Redshift management thing anyway.

@gshank (Contributor) commented May 2, 2022

Also, the drop_schema default implementation (which I believe is used by Redshift) is drop schema if exists {{ relation.without_identifier() }} cascade. So that code should be doing pretty much the same thing as what's happening here.

@McKnight-42 (Contributor, Author) commented May 2, 2022

Today I've continued playing around with which tests do leave files after running. None of the older integration tests have today; it's only the 11 files generated by the new functional tests. So could this be an issue with how the fixtures work, by chance? Could the initiating models/seeds staying open technically count as an active transaction against the tests? Those wouldn't go away until after the schemas are dropped at the end of the test, correct? Which would prevent the default drop_schema functionality from working?

@McKnight-42 closed this May 2, 2022
@McKnight-42 reopened this May 2, 2022
@gshank (Contributor) commented May 2, 2022

One thing that is different is that the older test framework used the initial adapter that was created by test/integration/base.py, which meant that we had to patch providers. In order to avoid that patching, the new tests now use the latest adapter. I wouldn't think that would make a difference. It's pretty much the same thing that you get when you run separate CLI dbt commands (a "new" adapter object every time).

@gshank (Contributor) commented May 2, 2022

By "leave files after running", you actually mean "leave schemas after running"?

@gshank (Contributor) commented May 2, 2022

Also, what query are you running to see what tables are left over, and what are the schema/table names?

@McKnight-42 (Contributor, Author) commented May 2, 2022

> Also, what query are you running to see what tables are left over, and what are the schema/table names?

My general process for looking this over has been to move breakpoints throughout the run; I even tried to debug inside the macro, which is interesting.

Then, in Postico (or whatever client), you need to sign in as a superuser to see lock tables and other relevant information.
Various commands run:

To see if you can drop it:
--drop schema if exists "<table_name>" cascade;
To look at currently locked tables:
select table_id, last_update, lock_owner, lock_owner_pid from stv_locks;
To see which tables hold open locks:
select "table" from svv_table_info where table_id = <insert a table_id from the command above>;
To get more information about a table, like what type of lock is on it:
select * from svv_transactions;

As for running the tests:
python -m pytest tests/functional/adapter/test_basic.py -s, then comparing the generated tables to the tables that remain at various points during the tests, at each breakpoint stop and after test completion.
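
(If it helps, the same checks can be scripted; a minimal sketch follows, with placeholder connection parameters and assuming a superuser login so the system tables are visible:)

import psycopg2

# Placeholders: point these at the test cluster and a superuser account.
conn = psycopg2.connect(host="<cluster-endpoint>", port=5439,
                        dbname="<database>", user="<superuser>", password="<password>")
with conn, conn.cursor() as cur:
    # Current locks, same query as above.
    cur.execute("select table_id, last_update, lock_owner, lock_owner_pid from stv_locks")
    for table_id, last_update, owner, pid in cur.fetchall():
        # Resolve each locked table_id to a table name.
        cur.execute('select "table" from svv_table_info where table_id = %s', (table_id,))
        row = cur.fetchone()
        print(table_id, row[0] if row else "<unknown>", last_update, owner, pid)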

@kwigley (Contributor) commented May 2, 2022

Something wonky is going on here when calling self.adapter.drop_schema(relation) for Redshift specifically, and it is difficult to debug. It involves database transactions and Python threading. I have a feeling it is related to this bit of code: https://github.com/dbt-labs/dbt-redshift/blob/6543b82bf5943e2bc272880984380205437d49e6/dbt/adapters/redshift/impl.py#L36-L52.
This snippet is essentially the equivalent of

-- I'm not sure what happens before this...
commit;
begin;
drop schema if exists "test16515170591869289928_test_basic" cascade;
commit;
begin;

which works perfectly fine, so I'm not 100% sure why the schema is not dropping. While this PR isn't a solution, it does properly drop the schemas created by tests in dbt-redshift. I'm still working to understand why, though.

@gshank (Contributor) commented May 2, 2022

So it sounds like the problem is with the Redshift drop_schema/drop_relation code, not with drop_test_schema per se.

@gshank (Contributor) commented May 2, 2022

The base implementation of "drop_relation" in core/dbt/adapters/sql/impl.py does "self.cache_dropped(relation)", but the redshift version calls the super version... so the only thing different is the fresh_transaction bit?

@McKnight-42 (Contributor, Author) replied:

> The base implementation of "drop_relation" in core/dbt/adapters/sql/impl.py does "self.cache_dropped(relation)", but the redshift version calls the super version... so the only thing different is the fresh_transaction bit?

Yeah, which acquires the exclusive lock and starts a new transaction, but it is supposed to release that again so we can use drop_relation from core.
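
(For illustration, the commit/begin dance that wrapper performs looks roughly like this. This is a sketch only, not the actual dbt-redshift code; drop_lock and connections are stand-in names:)

from contextlib import contextmanager
import threading

drop_lock = threading.Lock()  # stand-in for the adapter-level lock

@contextmanager
def fresh_transaction(connections):
    # Sketch: hold an exclusive lock, commit whatever transaction is open,
    # run the wrapped statement in a fresh transaction, then commit and
    # begin again before releasing the lock. This mirrors the
    # commit; begin; drop ...; commit; begin; sequence shown above.
    with drop_lock:
        connections.commit()
        connections.begin()
        yield
        connections.commit()
        connections.begin()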

@gshank (Contributor) commented May 2, 2022

It looks like adapter.drop_schema doesn't work in Postgres either.

@gshank (Contributor) commented May 2, 2022

I looked at the logs and the 'drop_schema' call was getting rolled back every time. The 'drop_schema' method in core/dbt/adapters/sql/impl.py did not have a commit. I added "self.commit_if_has_connection()" after "self.execute_macro(DROP_SCHEMA_MACRO_NAME, kwargs=kwargs)" and it worked.
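
(In other words, the shape of the fixed method is roughly the following; a sketch, not the verbatim dbt-core code:)

def drop_schema(self, relation):
    # Sketch of SQLAdapter.drop_schema in core/dbt/adapters/sql/impl.py after the fix.
    kwargs = {"relation": relation}
    self.execute_macro(DROP_SCHEMA_MACRO_NAME, kwargs=kwargs)
    # The added line: explicitly commit so the DROP SCHEMA isn't rolled back
    # when the connection is later cleaned up.
    self.commit_if_has_connection()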

@McKnight-42 (Contributor, Author) commented May 2, 2022

Reverted to the previous drop_test_schema method and added the suggested code above.

@McKnight-42 requested a review from jtcohen6 May 3, 2022 15:26
@McKnight-42 marked this pull request as ready for review May 3, 2022 16:23
@McKnight-42 requested review from a team as code owners May 3, 2022 16:23
@gshank (Contributor) left a review comment:

I think we need a test of this functionality. It's supposed to be tested in tests/adapter/dbt/tests/adapter/basic/test_adapter_methods.py, but it's pretty clear that wasn't actually testing that a schema was dropped. In addition, when looking at that test, I see that they specifically decided in #1983 not to do the commit. But that really wasn't working. @jtcohen6, could you weigh in here?

@gshank requested review from jtcohen6 and removed request for jtcohen6 May 3, 2022 16:32
@jtcohen6 (Contributor) commented May 3, 2022

If adding commit_if_has_connection() fixes this, I'm very happy.

@gshank Thanks for finding and linking the context from #1983. I have to disagree with Drew from 2019 on these points, though:

> these sprinkled-in transactional statements make it super hard to reason about what dbt is doing in different parts of the codebase.
> it's the destructive nature of this method that gave me pause

If the commit is necessary, we should have it. It should be sprinkled through the logs wherever it's needed. As far as the "destructive" nature of drop schema: dbt itself doesn't actually call drop_schema anywhere, in the course of its standard operations. It gets called in two highly specific places:

  • While running functional tests on dbt-core (this issue)
  • By dbt Cloud, after a PR is merged, to drop the "PR schema" used by the CI check built from that PR. I just checked into this part of the dbt Cloud codebase, and guess what's there?
            adapter.drop_schema(database, schema)
            adapter.commit_if_has_connection()

Of course, users may also call it for their own purposes. If they do, I think they'd expect it to work :)

@gshank (Contributor) commented May 3, 2022

It's up to you. In theory it should go in that test case that I mentioned, but that's an adapter zone test... I'm okay without one in this case.

@McKnight-42 merged commit c270a77 into main May 3, 2022
@McKnight-42 deleted the mcknight/modify_drop_test_schema branch May 3, 2022 18:10
agoblet pushed a commit to BigDataRepublic/dbt-core that referenced this pull request May 20, 2022
* propseal for modification to drop_test_schema

* changelog

* remove hard coded run_dbt version and put back previous version of drop_test_schema, add commit to drop_schema
codeforkjeff added a commit to codeforkjeff/dbt-sqlite that referenced this pull request Jul 16, 2022
codeforkjeff added a commit to codeforkjeff/dbt-sqlite that referenced this pull request Aug 2, 2022
* reorganize Dockerfile and use docker in github CI
* in dbt-core 1.2.x, commit is now being done in SQLAdapter.drop_schema(), so there's no need for it here anymore. see this PR: dbt-labs/dbt-core#5198
* rm file on disk on DROP SCHEMA
* load text.so sqlean extension to get split_parts fn
* add math.so for functions needed by datediff