-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix problem with row access policy doesnt return bytes proccessed. #48
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: igor martynetz.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @imartynetz! Are you able to sign the CLA?
I left one comment on the implementation.
dbt/adapters/bigquery/connections.py
Outdated
@@ -380,7 +380,7 @@ def execute( | |||
query_table = client.get_table(query_job.destination) | |||
code = 'CREATE TABLE' | |||
num_rows = query_table.num_rows | |||
bytes_processed = query_job.total_bytes_processed | |||
bytes_processed = query_job.total_bytes_processed or 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make more sense for total_bytes_processed
to be None
here, rather than 0
. Per its type, it can be None
:
bytes_processed: Optional[int] = None |
A few points:
format_bytes
+format_rows_number
are only used indbt-bigquery
, so it would make sense to move those methods out ofdbt-core
(core/dbt/utils.py
) and into this package/repo- We should update the logic here, or in those methods, to return
None
ifbytes_processed
isNone
, rather than trying to wrap it inabs()
and raising an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @jtcohen6 I sign the CLA first with my personal account, than i figure out that my local github is linked with the company email, than I sign again with other email.
I make the change to be 0
to avoid change too much in the code and avoid to break in other part of the code, and it`s a safe approuch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change to this?
def format_bytes(num_bytes):
if num_bytes:
for unit in ['Bytes', 'KB', 'MB', 'GB', 'TB', 'PB']:
if abs(num_bytes) < 1024.0:
return f"{num_bytes:3.1f} {unit}"
num_bytes /= 1024.0
num_bytes *= 1024.0
return f"{num_bytes:3.1f} {unit}"
else:
return num_bytes
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, how we do? I make the changes in this repo, but I`m unable to remove from core, because I just fork the bigquery repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a new change to add this methods to bigquery/connections
so just need to be deleted from core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me. It's not a huge deal if that method remains unused in core/utils.py
for the time being.
…o bigquery/connections
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: igor martynetz.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@imartynetz Do you think you'll be able to sign the CLA? Unfortunately, I won't be able to merge these changes otherwise
@jtcohen6 CLA is that google forms right? I sign twice, with both email that my github is linked. Don't know whats happening. |
@cla-bot check |
The cla-bot has been summoned, and re-checked this pull request! |
Hooray! CLA check worked. I'm going to close and reopen just to trigger the integration tests, now that I've added the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only test failures are unrelated (schema version bumps). This change LGTM! Thanks so much @imartynetz!
You're welcome, always nice to help. |
Hello @jtcohen6. Just to check, this hotfix will be commit to dbt-bigquery 0.21.0 to? |
@imartynetz This fix will be included in v1.0.0 final. (It's already been included in v1.0.0rc1.) Unfortunately, we split out the adapter plugins into their own repos between v0.21 and v1.0, so it would require extra manual, error-prone work to backport fixes into the Final release is planned for early December. That's not too long to wait, I hope! |
Nice to know, i will wait for this new version. |
…bt-labs#48) * FIX: Fix problem with row access policy doesnt return bytes proccessed. * UPDATE: Update changelog with fix. * UPDATE: Change methods format_row_number and format_bytes from core to bigquery/connections * Update changelog Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
resolves #47
Description
Solution for the bug I found with row policy access
I found the problem and a solution for it.
Because of some security in row access policy, when the user that is include in row access policy run a query with the table, he receive None as how much bytes is proccessed
https://cloud.google.com/bigquery/docs/best-practices-row-level-security
So with that dbt the
bytes_processed
return None value and break theabs(bytes_processed)
Checklist
CHANGELOG.md
and added information about my change to the "dbt-bigquery next" section.