-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for BigQuery authorized views #2517
Added support for BigQuery authorized views #2517
Conversation
Realized I probably should be passing in a relation into the macro instead of a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realized I probably should be passing in a relation into the macro instead of a dataset_id however I'm running into an error (Object of type dict is not JSON serializable) when attempting to pass in a relation in my config.
Should dataset_id support specifying the project, as well? In that case you'll want to create a new hologram dataclass like this:
@dataclass
class GrantTarget(JsonSchemaMixin):
dataset_id: str
project_id: str
Then you'd make your grant_access_to
definition look like:
@dataclass
class BigqueryConfig(AdapterConfig):
...
grant_access_to: Optional[List[GrantTarget]]
And then users can supply the field like:
models:
this_project:
this_model:
+grant_access_to:
project_id: share-project-id
dataset_id: share-dataset-id
I think that would work, at least
{% macro bigquery__grant_access_to(entity, entity_type, role, dataset_id) -%} | ||
{% do adapter.grant_access_to(entity, entity_type, role, dataset_id) %} | ||
{% endmacro %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to provide a macro interface for this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DataClass
class BigqueryConfig(AdapterConfig):
...
grant_access_to: Optional[List[GrantTarget]]
Managed to incorporate all your suggested changes @beckjake except for this one, I wasn't able to figure out how to create a GrantTarget
object from a model config or when passing in --args
when using run-operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think this all looks good to me! Is it feasible to write a test for this? I wouldn't want to break this in the future, though I admit I am not sure how to write a useful test of this behavior. Would we need a second user with different permissions that we grant access to, and then make sure it can run another dbt job using the model in question as a source?
@beckjake conceptually that makes sense to me. Is that something that is possible right now with the testing framework? An alternative would be to to run the grant_access and then inside the test retrieve the |
@azhard It's possible, but I think we'd have to set up a second bigquery user on our end. We're making some changes to our test env in the immediate future and maybe we can sneak this in (cc @jtcohen6) I think your suggestion about having the test retrieve the access_entries is a great alternative for this PR. |
We could definitely make a second user within our new BigQuery test env. From skimming the BQ docs linked above, it sounds like that second user needs to be a User on the project and a DataViewer on the dataset containing the secure view, but should not have any permissions on the source dataset. Is that right? We haven't yet switched over to the new test env, but we could very soon for BQ, since all the tests are running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thanks for adding the test and getting this across the line @azhard! I'm going to kick off tests and then merge this into 0.18.0 once they pass.
That azure test failure is from the test issue I fixed in the 0.17.1 branch but haven't merged up into dev/marian-anderson yet, we can ignore it for now (if I'm wrong and this is somehow broken on Windows after that, I'll fix it!) |
Any docs for this? It seems like I can call the macro directly when needed or provide it as a config option? It's not clear from the test how to use this in a model but maybe I'm just a n00b :) |
Hey @preston-hf, I'd imagine docs will get added with the release of 1.18.0 but nothing as of yet. What you'll want to do to use this in your models directly is add
|
Awesome! I appreciate the contribution and the prompt reply. |
Hi @azhard I am wondering how I can use this to implement dataset access to a particular user. I am using dbt version 0.17.2 For example I have a user called "abc@gmail.com" and I want to give them access to a particular dataset in bq, can I use I tried using the following without any luck 😢 models: Thank you 😄 |
@hamzaalikhan This is not in 0.17.2. It's going to be released in 0.18.0 and should be available in 0.18.0b2 if you want to try it out. |
Thanks @beckjake for you prompt reply 😃 I upgraded dbt to 0.18.0b2 and tried this out, it works nicely for auth view! Great work 😀 👍 But what if I wanna use it to grant access to a dataset for a particular user? If we don't use auth views? For granting access to a dataset (no auth views), I tried the following syntax in the dbt_project.yml but it didn't work 😭 Any leads on this please? models: |
Hey @hamzaalikhan, try giving it a shot using this syntax:
|
hey @azhard thanks for your prompt reply! :) I tried the syntax you recommended and unfortunately it did not work for me, I don't see any logs also when I run models: ^ This is the full syntax I am using, I am the owner on the project so there are no permissions issue on the bq side |
I will probably just wait until 0.18.0 comes out, cheers for your help guys! 😄 |
Sorry for the confusion @hamzaalikhan, if you're still interested the actual way to run it for this case is to run it as a macro from the CLI like this:
In the case where you want to authorize views to a dataset you can update your dbt_project config like this:
|
Hey @azhard, using When using the dbt run-operation grant_access_to, is it supposed to give any logs like |
I don't believe I added any logs to the macro / function execution, but glad to hear that it worked! |
Hello! When we implemented this feature in the
After a little digging, I'm pretty sure this error was due to one dataset PATCH call not have the most recent etag of another dataset PATCH call being executed against the same dataset. In other words, that this functionality isn't thread-safe. We worked around the issue by rerunning the affected models in a single threaded run. I'm not familiar enough with the |
Hey @glqstrauss, I think that's right, authorized view operations are not thread-safe. That's something other community members have run into. Really, I view the current implementation of authorized views as necessarily limited, until such time as BigQuery releases the DDL to manage them, at which point we can switch to a more robust implementation. (They've been releasing a lot of SQL syntax parity of late, and I understand that In the meantime, I'd be supportive of adding a note to the docs site that mentions the single-threaded limitation. Is that something you'd be up to add? |
Hey @jtcohen6 I cut a change here: dbt-labs/docs.getdbt.com#634 |
resolves #1718
Description
This PR enables BigQuery dbt users to create authorized views and or as well as add read, write, and owner permissions for datasets in BigQuery.
https://cloud.google.com/bigquery/docs/share-access-views
https://cloud.google.com/bigquery/docs/dataset-access-controls
Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.