Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3088] [Feature] Automatically grant usage on schema #8586

Open
3 tasks done
aBBDnGus opened this issue Sep 7, 2023 · 10 comments
Open
3 tasks done

[CT-3088] [Feature] Automatically grant usage on schema #8586

aBBDnGus opened this issue Sep 7, 2023 · 10 comments
Assignees
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day

Comments

@aBBDnGus
Copy link

aBBDnGus commented Sep 7, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

In dbt, one can specify which user has access to a model via grants = {'select': ['user']}. But the user can only access a view, if it is also granted usage to the schema where the model is located.
Currently, one has to do this via a post-hook, e.g. grant usage on schema {{ model.schema }} to {{ model.config.grants.select|join(', ') }}.
Unfortunately, this makes it impossible to parallelize the processes with threads, because the post-hook is applied concurrently on the same schema and the error tuple concurrently updated appears.

Describe alternatives you've considered

Who will this benefit?

More complex model structures, including multiple schemas and users.

Are you interested in contributing this feature?

No response

Anything else?

No response

@aBBDnGus aBBDnGus added enhancement New feature or request triage labels Sep 7, 2023
@github-actions github-actions bot changed the title [Feature] Automatically grant usage on schema [CT-3088] [Feature] Automatically grant usage on schema Sep 7, 2023
@dataders
Copy link
Contributor

@aBBDnGus are you using Snowflake? Is so, I feel that this is already covered in dbt-labs/dbt-snowflake#715. I do agree that we should address this!

My suggestion is that we first solve this within the context Snowflake, then upstream the solution into something more general.

One question I have (perhaps you can help me here)

  1. given that grants shipped 9 months ago as part of 1.3.0, why have we not heard more users asking for this?
  2. what should the best practice permission model be for granting access to end users? Why not grant USAGE on all databases where end users are given SELECT access to particular tables?

For 1, my hunch is that this GRANT USAGE only happens within a minory of data teams, for reasons like:

  • the "production" Snowflake account to which they grant select access has an evolving set of databases and schema so that this problem isn't something that's addressed once on a single database,' then never again.
  • they strictly adhere to the principle of least privilege (users have only exactly what they need to do what they need to do, and no more)

For 2, I'd love to get some input from users as to how to use grants in combination with principle of least privilege.

@aBBDnGus
Copy link
Author

I'm using PostgreSQL.

For 1: I think, this is not a major problem to many users. It arises only when using multiple schemas and rights management (i.e. maintain application specific tool accounts which have access to specific models). Even then, one can circumvent this by manually granting the usage when a new schema is created.

For 2: In my opinion, it is the best solution to grant usage on a schema to all users which have access to one of the contained models. The dbt process creates schemas automatically, but it does not grant access.

@dataders dataders added the paper_cut A small change that impacts lots of users in their day-to-day label Sep 20, 2023
@jtcohen6
Copy link
Contributor

I think some of the limitation here is that dbt doesn't have a first-class construct for defining & configuring schemas. Some related conversation in this previous discussion:

The proposal in this issue is actually closer to dbt's current behavior: If a model needs to land in schema_x, and schema_x doesn't yet exist, just go ahead and create the schema. Similarly, if a model in schema_x needs to grant select to a specific role, just go ahead and grant usage to schema_x, too.

That's more or less what Doug & I cooked up for the over-clever approach here: https://docs.getdbt.com/blog/configuring-grants#option-b-too-clever-by-half

@jtcohen6 jtcohen6 removed the triage label Sep 22, 2023
@dataders
Copy link
Contributor

adding @chwiese's take from an internal Slack thread

I do not think dbt should automatically set usage (or any other grants) on schemata. The key reason is that dbt can not (out of the box) manage schema RBAC in its entirety so teams should already have a solution for that in place and dbt should not interfere with it. These alternative solutions are likely also the reason why this has not frequently come up yet. IMHO the situation would be different if dbt offered more explicit options for managing schemata like @jtcohen6 mentions in #5781. That being said, for Snowflake we would then have the same situation one level further up with databases.

I agree that implementation depends on whether or not dbt formally manages schemas and database objects, or if they remain as they are today as effectively attributes of models

@aBBDnGus
Copy link
Author

We have solved the original problem in the meantime with setting

+post-hook:
  - sql: "select pg_advisory_xact_lock(1); grant usage on schema {{ model.schema }} to {{ model.config.grants.select|join(', ') }}"
    transaction: false

The advisory lock prevents that the command is executed on the same schema for multiple times and the error tuple concurrently updated appears.

@ustulation
Copy link

I ran into this recently while using AWS Quicksight which needed to access tables from Redshift. Quicksight user needs to be granted usage on the schema where the tables (or views) reside. These schemas are created by dbt (due to the +schema field under the appropriate section in models in dbt_project.yml) but unfortunately it seems to lack the ability to grant usage to a user on it. So did it with a post-hook on a randomly chosen model in the schema because if that model has run successfully it means that the schema creation has already been done. However it feels a bit of a hack.

@kdazzle
Copy link

kdazzle commented Mar 11, 2024

+1
Setting grants in the schema config like a model seems like it would make the most sense. Given that dbt creates schemas, it becomes harder to say that it isn't dbt's responsibility.

models:
  my_dbt:
    marts:
      vehicles:
        +schema: vehicles
          config:
            grants:
              select: ["analyst"]

@eirikmag
Copy link

+1

2 similar comments
@tejas-esto
Copy link

+1

@stolomeo911
Copy link

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day
Projects
None yet
Development

No branches or pull requests

8 participants