Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drop the grants of authorized view when it's full refresh #1189

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

salimmoulouel
Copy link
Contributor

@salimmoulouel salimmoulouel commented Apr 22, 2024

resolves #1175

Problem

When you delete a view in BigQuery, all associated dataset grant access are lost because they are tied to the view's ID rather than its name. Consequently, during a full refresh that involves dropping the view, the old grants are not automatically removed and new ones must be created. This oversight can lead to various bugs and inconsistencies.

Solution

Delete the grant to the dataset when we are on full refresh

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot cla-bot bot added the cla:yes label Apr 22, 2024
@salimmoulouel salimmoulouel force-pushed the bug/updating_of_grant_access_on_full_refresh branch 11 times, most recently from 1905626 to 8b46b6c Compare April 22, 2024 15:55
@salimmoulouel salimmoulouel force-pushed the bug/updating_of_grant_access_on_full_refresh branch from 8b46b6c to 9ae626d Compare April 22, 2024 16:10
@mikealfare mikealfare added the community This PR is from a community member label Apr 24, 2024
dataset (Dataset): the dataset to be updated
access_entry (AccessEntry): the access entry to be removed from the dataset
"""
access_entries = list(dataset.access_entries)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to coerce this to be a list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a precautionary measure, and you're correct, it turned out to be unnecessary.

dataset = add_access_entry_to_dataset(dataset, access_entry)
dataset = client.update_dataset(dataset, ["access_entries"])
full_dataset_id = f"{dataset.project}.{dataset.dataset_id}"
print(f"allowed dataset access for '{access_entry.entity_id}' to ' dataset '{full_dataset_id}.'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove print

) # Make an API request.

full_dataset_id = f"{dataset.project}.{dataset.dataset_id}"
print(f"Revoked dataset access for '{access_entry.entity_id}' to ' dataset '{full_dataset_id}.'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swap print for logger.info

@colin-rogers-dbt
Copy link
Contributor

Mind adding another functional test?
Should be able to just copy TestAccessGrantSucceeds -> TestAccessGrantSucceedsWithFullRefresh and add some validation the expected grants are/not present:
https://github.com/dbt-labs/dbt-bigquery/blob/main/tests/functional/adapter/test_grant_access_to.py#L80

@salimmoulouel salimmoulouel force-pushed the bug/updating_of_grant_access_on_full_refresh branch 2 times, most recently from 18e55ce to 6f9632a Compare May 3, 2024 11:30
@salimmoulouel salimmoulouel force-pushed the bug/updating_of_grant_access_on_full_refresh branch from 6f9632a to b3ef011 Compare May 4, 2024 02:05
@salimmoulouel
Copy link
Contributor Author

Thank you for your feedback, Colin. I tried to address all the points you raised in your review.

@salimmoulouel salimmoulouel force-pushed the bug/updating_of_grant_access_on_full_refresh branch from 1744e65 to adeefeb Compare May 7, 2024 08:17
Copy link
Contributor

@colin-rogers-dbt colin-rogers-dbt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question on the use of sleep

# Need to run twice to validate idempotency
results = run_dbt(["run"])
assert len(results) == 2
time.sleep(10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the sleep call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know exactly why they used it, I found the sleep call already existing in the not full-refresh call of the function, I reproduce it by precaution.

dbt/adapters/bigquery/impl.py Outdated Show resolved Hide resolved
@salimmoulouel salimmoulouel requested a review from a team as a code owner May 9, 2024 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes community This PR is from a community member ok to test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Keep grants for authorized views in sync when using the grant_access_to config
3 participants