-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove TableComparison and convert existing calls to use dbt.tests.util #4986
Conversation
dbt.tests.util. Also update tests for accidentally commented out assert in dbt_run.
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
This pr is not complete--I'm still looking at the run_sql piece--but I wanted to get it out for comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gshank Very impressive work! I'm glad to see this code rightfully retiring. It served us well enough, for a long time, and need serve no longer :)
I used this branch to run our "basic" functional testing suite against dbt-redshift
, dbt-snowflake
, dbt-bigquery
, and dbt-databricks
. All passed. This does make me yearn for an easy way to test changes in a dbt-core
branch by triggering GHAs in plugins: #4988
core/dbt/tests/util.py
Outdated
# Test utilities | ||
# run_dbt | ||
# run_dbt_and_capture | ||
# get_manifest | ||
# copy_file | ||
# rm_file | ||
# write_file | ||
# read_file | ||
# get_artifact | ||
# update_config_file | ||
# get_unique_ids_in_results | ||
# check_results_nodes_by_name | ||
# check_result_nodes_by_unique_id | ||
|
||
# SQL related utilities | ||
# run_sql_with_adapter | ||
# relation_from_name | ||
# check_relation_types (table/view) | ||
# check_relations_equal | ||
# check_relations_equal_with_relations | ||
# check_table_does_exist | ||
# check_table_does_not_exist | ||
# get_relation_columns | ||
# update_rows | ||
# generate_update_clause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which of these do you feel comfortable documenting, as "public" methods? Are there any we want to keep "private"? (The current draft docs mention only run_dbt
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 'get_artifact' should be okay to document. I'm not totally sure about 'get_manifest', mainly because it will load an internal version of the manifest that I'm not sure we want people to rely on. Most of the rest of the test utilities are pretty innocuous. I'm not totally clear on how useful they would be to users, because my imagination is that they'd be testing their own project, and most of those are helpful for modifying/updating a project.
in SQL related utilities, I think update_rows is still kind of funky, i.e. the interface is very specific to a couple of particular tests. And I'd like to look a bit more at run_sql_with_adapter (which is mostly designed to be used by the project.run_sql method). I think check_relations_equal, relation_from_name, check_relation_types, check_table_does_(not)_exist, and get_relation_columns are okay. We will be expecting adapter contributors to be dealing with those utilities. I'm not totally clear how much uptake they'd get with general users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The principal intended user here is very much the adapter plugin maintainer, and I think that should inform how we publish/document the first cut of this interface. If it proves valuable for package/project maintainers as well, all the better.
- Makes sense that
get_manifest
wants to be private, given that it's largely the output of parsing, and ought not vary across adapters - Agree that
update_rows
is not my favorite. I'd like to see all these methods refactored to use SQL wrapped in Jinja macros, rather than in Python f-strings. We can make that change independently of the test utilities, though.
# Uses: | ||
# adapter.config.credentials | ||
# adapter.quote | ||
# adapter.run_sql_for_tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for annotating these so clearly! Did you find a need for any totally new adapter methods?
Below I see we make use of some standard attributes (credentials
, quote
, Relation
) and standard methods (get_colums_in_relation
), as well as some slightly less-standard methods (run_sql_for_tests
, get_rows_different_sql
).
Many adapters won't need to reimplement the default versions, but we'll definitely want to enumerated all of them in our documentation.
ed7b2c1
to
e0cccd7
Compare
e0cccd7
to
ebff78f
Compare
ebff78f
to
0770ed2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -182,6 +228,7 @@ def adapter(unique_schema, project_root, profiles_root, profiles_yml, dbt_projec | |||
runtime_config = RuntimeConfig.from_args(args) | |||
register_adapter(runtime_config) | |||
adapter = get_adapter(runtime_config) | |||
adapter.load_macro_manifest(base_macros_only=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we only load base macros here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This particular adapter is only used until the first dbt command is executed, when a new adapter is built from the full project. For the initial create_schema call and the run_sql commands that might be run in a test prior to a dbt command, the base macros should be enough. I'm trying to limit the pieces of the project that are loaded here because everything used here has to actually work, so if you want to test bad project files or profiles or macros, they will have to be loaded later instead of when the project is initially constructed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes total sense!! Thanks for explaining it!
relation = self.adapter.Relation.create( | ||
database=self.database, schema=self.test_schema | ||
) | ||
self.adapter.create_schema(relation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the fact that we are using more adapter function here!
@@ -28,10 +56,11 @@ def run_dbt(args: List[str] = None, expect_pass=True): | |||
|
|||
print("\n\nInvoking dbt with {}".format(args)) | |||
res, success = handle_and_check(args) | |||
# assert success == expect_pass, "dbt exit state did not match expected" | |||
assert success == expect_pass, "dbt exit state did not match expected" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super helpful for prevent unknown downstream check error!
check_relations_equal(project.adapter, ["summary_expected", "materialized_summary"]) | ||
check_relations_equal(project.adapter, ["summary_expected", "view_summary"]) | ||
check_relations_equal(project.adapter, ["summary_expected", "ephemeral_summary"]) | ||
check_relations_equal(project.adapter, ["summary_expected", "view_using_ref"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like we can merge all these checks into one check? Also okay to leave like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I've updated those tests. No point in leaving bad examples out there.
I think we need to wait for some update finish in spark and then update here before this can be merged right? |
core/dbt/tests/fixtures/project.py
Outdated
# The main functional test fixture is the 'project' fixture, which combines | ||
# other fixtures to write out a dbt_project in a temporary directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The main functional test fixture is the 'project' fixture, which combines | |
# other fixtures to write out a dbt_project in a temporary directory. | |
# The main functional test fixture is the 'project' fixture, which combines | |
# other fixtures, write out a temp dbt_project in a directory, create a temp | |
# schema in the testing database, and return a `TestProjInfo` object that | |
# contains information about that temp dbt_project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the comment.
resolves #4778
Description
Remove TableComparison and the dbt.test.tables file. Replace existing calls to TableComparison to functions in dbt.tests.util which were converted from dbt-adapter-tests. Add some comments.
I fixed a commented out assert in the dbt_run function, and had to update a bunch of tests because of that.
Checklist