Fix problem with row access policy doesnt return bytes proccessed. #48

Merged
merged 5 commits into from
Nov 5, 2021

Conversation

imartynetz
Contributor

resolves #47

Description

Solution for the bug I found with row access policies.

I found the problem and a solution for it.
For security reasons, when a user who is included in a row access policy runs a query against the table, BigQuery returns None for the number of bytes processed:
https://cloud.google.com/bigquery/docs/best-practices-row-level-security
As a result, in dbt bytes_processed is None, which breaks the abs(bytes_processed) call.
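
A minimal sketch of the failure described above, assuming the formatting helper calls abs() on whatever value it receives. The names format_bytes_unsafe and try_format are hypothetical and exist only for this illustration; they are not taken from dbt-core or dbt-bigquery.

```python
def format_bytes_unsafe(num_bytes):
    # Hypothetical stand-in for a formatter that assumes a numeric input.
    for unit in ["Bytes", "KB", "MB", "GB", "TB"]:
        if abs(num_bytes) < 1024.0:  # abs(None) raises TypeError here
            return f"{num_bytes:3.1f} {unit}"
        num_bytes /= 1024.0
    return f"{num_bytes:3.1f} PB"


def try_format(num_bytes):
    # Return the formatted string, or the error text if formatting fails.
    try:
        return format_bytes_unsafe(num_bytes)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_format(2048))  # a normal job reports an integer byte count
print(try_format(None))  # the row access policy case reports None
```

With an integer the helper formats normally; with None the abs() call fails before any formatting happens, which matches the error reported in the linked issue.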

Checklist

  • [ x] I have signed the CLA
  • [ x] I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • [ x] I have updated the CHANGELOG.md and added information about my change to the "dbt-bigquery next" section.

@cla-bot

cla-bot bot commented Oct 29, 2021

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: igor martynetz.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check if your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Contributor

@jtcohen6 jtcohen6 left a comment

Thanks for the PR @imartynetz! Are you able to sign the CLA?

I left one comment on the implementation.

@@ -380,7 +380,7 @@ def execute(
 query_table = client.get_table(query_job.destination)
 code = 'CREATE TABLE'
 num_rows = query_table.num_rows
-bytes_processed = query_job.total_bytes_processed
+bytes_processed = query_job.total_bytes_processed or 0
Contributor

I think it would make more sense for total_bytes_processed to be None here, rather than 0. Per its type, it can be None:

bytes_processed: Optional[int] = None

A few points:

  • format_bytes + format_rows_number are only used in dbt-bigquery, so it would make sense to move those methods out of dbt-core (core/dbt/utils.py) and into this package/repo
  • We should update the logic here, or in those methods, to return None if bytes_processed is None, rather than trying to wrap it in abs() and raising an error
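
The second point can be sketched as follows. This is only an illustration of keeping None distinct from a real zero; status_message and its wording are hypothetical and are not the message format dbt-bigquery actually emits.

```python
from typing import Optional


def status_message(code: str, num_rows: int, bytes_processed: Optional[int]) -> str:
    # None means the byte count was not reported, which is different from
    # a job that genuinely processed 0 bytes, so the two cases are kept apart.
    if bytes_processed is None:
        return f"{code} ({num_rows} rows, bytes processed unavailable)"
    return f"{code} ({num_rows} rows, {bytes_processed} bytes processed)"


print(status_message("CREATE TABLE", 10, None))
print(status_message("CREATE TABLE", 10, 0))
```

Coercing with `or 0` would make both calls print the same text, which is the loss of information the comment above is pointing at.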

Contributor Author

Hey @jtcohen6, I signed the CLA first with my personal account, then I figured out that my local GitHub configuration is linked to my company email, so I signed again with the other email.

I made the change to use 0 to avoid changing too much code and to avoid breaking other parts of the code; it's a safe approach.

Contributor Author

@imartynetz imartynetz Nov 3, 2021

Maybe change to this?

def format_bytes(num_bytes):
    if num_bytes:
        for unit in ['Bytes', 'KB', 'MB', 'GB', 'TB', 'PB']:
            if abs(num_bytes) < 1024.0:
                return f"{num_bytes:3.1f} {unit}"
            num_bytes /= 1024.0

        num_bytes *= 1024.0
        return f"{num_bytes:3.1f} {unit}"
    
    else:
        return num_bytes

What do you think?
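
One detail in the snippet above: the check `if num_bytes:` is also false for 0, so a job that processed zero bytes would get back the bare integer 0 instead of a formatted string. A minimal sketch of a variant with an explicit None check is shown below. The name format_bytes_none_safe is hypothetical, and this is offered as an illustration only, not as the code merged in this PR.

```python
from typing import Optional, Union


def format_bytes_none_safe(num_bytes: Optional[Union[int, float]]) -> Optional[str]:
    # Only a missing value is passed through; 0 is formatted like any number.
    if num_bytes is None:
        return None
    for unit in ["Bytes", "KB", "MB", "GB", "TB"]:
        if abs(num_bytes) < 1024.0:
            return f"{num_bytes:3.1f} {unit}"
        num_bytes /= 1024.0
    # Anything that is still >= 1024 TB is reported in PB.
    return f"{num_bytes:3.1f} PB"


print(format_bytes_none_safe(None))
print(format_bytes_none_safe(0))
print(format_bytes_none_safe(1536))
```

Whether returning "0.0 Bytes" for zero is preferable to returning 0 depends on how callers render the value, so this is a design choice rather than a required change.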

Contributor

That makes sense to me!

Contributor Author

So, how should we proceed? I can make the changes in this repo, but I'm unable to remove the methods from core, because I only forked the bigquery repo.

Contributor Author

I made a new change that adds these methods to bigquery/connections, so they just need to be deleted from core.

Contributor

That sounds good to me. It's not a huge deal if that method remains unused in core/utils.py for the time being.

@cla-bot

cla-bot bot commented Nov 4, 2021

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: igor martynetz.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check if your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Contributor

@jtcohen6 jtcohen6 left a comment

@imartynetz Do you think you'll be able to sign the CLA? Unfortunately, I won't be able to merge these changes otherwise.

@imartynetz
Contributor Author

@jtcohen6 The CLA is that Google Form, right? I signed twice, with both emails that my GitHub account is linked to. I don't know what's happening.

@jtcohen6
Contributor

jtcohen6 commented Nov 5, 2021

@cla-bot check

@cla-bot cla-bot bot added the cla:yes label Nov 5, 2021
@cla-bot

cla-bot bot commented Nov 5, 2021

The cla-bot has been summoned, and re-checked this pull request!

@jtcohen6
Contributor

jtcohen6 commented Nov 5, 2021

Hooray! The CLA check worked. I'm going to close and reopen just to trigger the integration tests, now that I've added the "ok to test" label.

@jtcohen6 jtcohen6 closed this Nov 5, 2021
@jtcohen6 jtcohen6 reopened this Nov 5, 2021
Contributor

@jtcohen6 jtcohen6 left a comment

The only test failures are unrelated (schema version bumps). This change LGTM! Thanks so much @imartynetz!

@jtcohen6 jtcohen6 merged commit 559bec2 into dbt-labs:main Nov 5, 2021
@imartynetz
Contributor Author

> The only test failures are unrelated (schema version bumps). This change LGTM! Thanks so much @imartynetz!

You're welcome, always nice to help.

@imartynetz imartynetz deleted the fix_row_access_policy branch November 5, 2021 14:17
@imartynetz imartynetz restored the fix_row_access_policy branch November 5, 2021 14:46
@imartynetz
Contributor Author

Hello @jtcohen6. Just to check, will this hotfix be committed to dbt-bigquery 0.21.0 too?

@jtcohen6
Contributor

@imartynetz This fix will be included in v1.0.0 final. (It's already been included in v1.0.0rc1.) Unfortunately, we split out the adapter plugins into their own repos between v0.21 and v1.0, so it would require extra manual, error-prone work to backport fixes into the dbt-core repo for earlier versions.

Final release is planned for early December. That's not too long to wait, I hope!

@imartynetz
Contributor Author

> @imartynetz This fix will be included in v1.0.0 final. (It's already been included in v1.0.0rc1.) Unfortunately, we split out the adapter plugins into their own repos between v0.21 and v1.0, so it would require extra manual, error-prone work to backport fixes into the dbt-core repo for earlier versions.
>
> Final release is planned for early December. That's not too long to wait, I hope!

Good to know, I will wait for the new version.

siephen pushed a commit to AgencyPMG/dbt-bigquery that referenced this pull request May 16, 2022
…bt-labs#48)

* FIX: Fix problem with row access policy doesnt return bytes proccessed.

* UPDATE: Update changelog with fix.

* UPDATE: Change methods format_row_number and format_bytes from core to bigquery/connections

* Update changelog

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Development

Successfully merging this pull request may close these issues.

[BUG] Problem create table with row access policy
2 participants