-
Notifications
You must be signed in to change notification settings - Fork 138
Allow use of schema argument for project and dataset #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow use of schema argument for project and dataset #63
Conversation
| # If neither dataset nor project are the default | ||
| sample_table_1 = Table('natality', schema='bigquery-public-data.samples') | ||
| # If just dataset is not the default | ||
| sample_table_2 = Table('natality', schema='bigquery-public-data') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in #53, I believe it's more in the spirit of SQLAlchemy and common SQL database notions of 'schema name' to specify the location, rather than encouraging people to specify it in the name (in BigQuery-speak, the project and dataset).
As per previous discussion, the code will continue to accept dataset and project specified in the table name to avoid breaking existing users, and there are regression tests added for that usage.
| elif len(table_name_split) == 3: | ||
| project, dataset, table_name = table_name_split | ||
| else: | ||
| raise ValueError("Did not understand table_name: {}".format(full_table_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used .format() instead of f-strings since I assume you want to support older Python versions. The underlying BigQuery libraries seemed to use ValueError for SQL parsing issues, so I assumed this was a good exception type.
| table_ref = TableReference.from_string("{}.{}.{}".format( | ||
| project_id, dataset_id, table_id | ||
| )) | ||
| return table_ref |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I broke out this function so it could be easily unit tested with the various valid combinations of specifying project/dataset/table in the schema and table arguments. I test drove this, and every line was added due to a corresponding test case.
| with pytest.raises(ValueError): | ||
| dialect._table_reference(provided_schema_name, | ||
| provided_table_name, | ||
| client_project) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above are the test cases! Let me know if you think of any other scenarios we need to cover.
|
Hi @tswast! Let me know what you think. |
|
Thanks for the contribution! It'll probably be another week or two before I have time to review this. Thanks for your patience. |
tswast
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful! Thanks so much for your contribution.
This is a follow-on PR to #53 to implement the desired expanded project and dataset parsing behavior discussed during its review.
Tests have been updated and pass.