Ct-581 grants as configs #5230

gshank · 2022-05-11T00:52:34Z

resolves #5189

Description

Add "grants" attribute to NodeConfig. Fix up all the tests.

This also contains a fix for a partial parsing error that was encountered when writing the test.

Checklist

I have read the contributing guide and understand what's expected of me
I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
I have run changie new to create a changelog entry

gshank · 2022-05-11T00:57:34Z

@jtcohen6 We weren't really clear on whether we want to limit the keys that can be contained in the 'grants' dictionary. I went forward assuming that people might want to create and support their own particular keys, and also that using something really specific like "select" would feel wrong on some warehouses. Or would you prefer to limit the legal keys in the grant dictionary?

I used "my_select" in the test instead of "select" because the test would probably break when we actually implement a "select" grant macro.

gshank · 2022-05-11T00:59:42Z

I also noticed that the jsonschema artifact test doesn't fail when we add a new attribute to NodeConfig, probably because of the really clever things we do to allow random keys in configs. I would have updated the jsonschemas, but I'm not sure whether we want to update to a v6 jsonschema now (and keep updating as things change) or do later.

gshank · 2022-05-11T01:05:48Z

The partial parsing error: if a SQL file containing config has already been parsed and you then add a new schema file that also has config for the same model, the config from the SQL file gets lost, because I wasn't saving the 'config_call_dict' in the serialized manifest. I switched to saving that dictionary, and deleting config_call_dict only for manifest.json instead. Updates to an existing schema file already work correctly, and any later update to either file would correct the error.

This feels like an edge case and possibly not worth doing as a backport, but if you'd like it backported let me know and I can separate it out.

jtcohen6 · 2022-05-11T10:56:06Z

tests/functional/configs/test_grant_configs.py

+        model_config = model.config
+        assert hasattr(model_config, "grants")
+
+        expected = {"my_select": ["reporter", "bi", "other_user"]}


There needs to be some mechanism to remove grants that have been applied at a broader level of the configuration hierarchy.

So: I think we need other_user to clobber the existing values of the my_select key (["reporter", "bi"]), rather than append to the list of recipients for the my_select privilege.

The ideal case is one in which users have some control over this merging behavior. Is now a good time to revisit the discussion in #4108? (Even if we don't / can't actually enable that now)

Interesting. I'm not sure how we could do the one with the '+=' and '-=' syntax. Might need our own jinja parser?

One alternative would be to encode the behavior in the key, like '+select'. Of course we already heavily use that for a different purpose... We could also do something more complicated with having the value of the grant key ("select") be either a list (clobbering) or a dictionary like: "select": { "override": ["bi"]} or "select": {"merge": ["bi"]}

I can switch to using MergeBehavior.Update, which would replace selects with the more specific versions.

I'm into the + behavior! We'll need testing for its various permutations, but I like how it does the sensible thing by default (clobber—fewer grants is safer), while preserving flexibility for the more-advanced end user.

I think by far the hardest part here might be writing the docs for this in a way that doesn't totally confuse people who are already stymied by + in dbt_project.yml: https://docs.getdbt.com/reference/resource-configs/plus-prefix

I left some more "where could we go from here" thoughts about this in #4108 (comment), which are out of scope for the changes in this PR. The only thing worth considering now is whether we should support + on the grants dict key, or the values of those keys, or both.

jtcohen6 · 2022-05-11T11:06:44Z

I went forward assuming that people might want to create and support their own particular keys, and also that using something really specific like "select" would feel wrong on some warehouses. Or would you prefer to limit the legal keys in the grant dictionary?

I agree with your working assumption. I wouldn't want to have to update this every time a warehouse released a new privilege type. It's also infeasible on BigQuery, which supports tons of privileges, and custom privileges.

I would have updated the jsonschemas, but I'm not sure whether we want to update to a v6 jsonschema now (and keep updating as things change) or do later.

Worth thinking about:

Pairing this with the short-term salve for users of Slim CI, proposed in [CT-599] [Bug] Upgrading from 1.0 to 1.1 breaks defer/state #5213 (comment), for artifact updates that actually have backwards-compatible changes
Whether we actually need to update the artifact version for non-breaking changes, such as an additional default value in node.config (which can already contain any additional properties)

Nice catch on the partial parsing error! I haven't heard of someone running into this yet, so I'm happy to keep it as a forthcoming bug fix for now, without the backport.

gshank · 2022-05-11T15:20:44Z

I changed to default of clobbering the grants with the ability of extending them if the grant key starts with '+'. Take a look and see what you think.
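
To make the behavior described above concrete, here is a minimal sketch of "clobber by default, extend on a '+' key". It is not the code from this PR: `merge_grants` and `_as_list` are hypothetical names, and the sketch assumes the broader level (e.g. the project) has already been resolved into a plain dict.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """Copy an existing list, or wrap a scalar (including a str) in a list."""
    return list(value) if isinstance(value, list) else [value]


def merge_grants(broader: Dict[str, Any], narrower: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Merge a more specific grants dict over a broader one.

    Hypothetical helper: a plain key replaces (clobbers) the broader value,
    while a key starting with '+' extends it.
    """
    merged = {key: _as_list(value) for key, value in broader.items()}
    for key, value in narrower.items():
        if key.startswith("+"):
            bare = key[1:]
            merged[bare] = merged.get(bare, []) + _as_list(value)
        else:
            merged[key] = _as_list(value)
    return merged


project_level = {"my_select": ["reporter", "bi"]}
clobbered = merge_grants(project_level, {"my_select": ["other_user"]})
extended = merge_grants(project_level, {"+my_select": ["other_user"]})
print(clobbered)  # {'my_select': ['other_user']}
print(extended)   # {'my_select': ['reporter', 'bi', 'other_user']}
```

Defaulting to replacement means a narrower config can only widen the set of recipients when the author asks for that explicitly, which matches the "fewer grants is safer" reasoning in the review thread.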

jtcohen6

Coming along swimmingly!

jtcohen6 · 2022-05-12T11:50:36Z

core/dbt/contracts/graph/model_config.py

+    grants: Dict[str, Any] = field(
+        default_factory=dict, metadata=MergeBehavior.DictKeyAppend.meta()
+    )


Related tech debt that we discussed yesterday, which is out of scope for this PR, opened as a new issue: #5236

jtcohen6 · 2022-05-12T12:10:29Z

core/dbt/contracts/graph/model_config.py

+            raise InternalException(f"expected dict, got {other_value}")
+        new_dict = {}
+        for key in self_value.keys():
+            new_dict[key] = _listify(self_value[key])


I think _listify might make sense in all cases. Consider:

{{ config(grants = {'select': 'model_level'}) }}

Shows up as expected: "grants": {"select": ["model_level"]}

Whereas:

{{ config(grants = {'+select': 'model_level'}) }}

Shows up as: "grants": {"select": ["project_level", "m", "o", "d", "e", "l", "_", "l", "e", "v", "e", "l"]}, when I would expect "grants": {"select": ["model_level", "project_level"]}, i.e. the same as what I get if I just wrap it in [].

I really wish Python hadn't chosen to treat strings as sequences... So yeah, listify in all cases would be better.
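
The character-splitting above is standard Python behavior: `list.extend` iterates over its argument, and a string iterates as single characters. A small standalone illustration follows; `listify` here is a hypothetical stand-in written for this example, not necessarily identical to the `_listify` helper in the diff.

```python
from typing import Any, List


def listify(value: Any) -> List[Any]:
    """Return lists unchanged and wrap anything else (including str) in a list."""
    return value if isinstance(value, list) else [value]


project_grants = ["project_level"]

# Extending with a bare string iterates over its characters.
broken = list(project_grants)
broken.extend("model_level")

# Listifying first keeps the string as a single recipient.
fixed = list(project_grants)
fixed.extend(listify("model_level"))

print(broken)  # ['project_level', 'm', 'o', 'd', 'e', 'l', '_', 'l', 'e', 'v', 'e', 'l']
print(fixed)   # ['project_level', 'model_level']
```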

ChenyuLInx

This looks good to me! Left some clarifying questions.
To sum up, this PR

added the grant option in config
added the merge logic for dict with list as values
moved where we remove the config_call_dict (I asked some questions about why we moved it)

Is that summary correct?

ChenyuLInx · 2022-05-18T22:49:00Z

core/dbt/context/context_config.py

-                config_call_dict[k].extend(v)
-            elif k in config_call_dict and isinstance(config_call_dict[k], dict):
-                config_call_dict[k].update(v)
+                if k in config_call_dict and isinstance(config_call_dict[k], list):


Why do we need the second part of the check? It looks like the current behavior is that if it is not a dictionary, we just overwrite config_call_dict[k]. Could this cause mysterious behavior in the future?

I don't think it would cause any problems, but you're right that it's unnecessary. I've removed the isinstance check.

ChenyuLInx · 2022-05-18T22:49:52Z

core/dbt/context/context_config.py

+            elif k in BaseConfig.mergebehavior["dict_key_append"]:
+                if not isinstance(v, dict):
+                    raise InternalException(f"expected dict, got {v}")
+                if k in config_call_dict and isinstance(config_call_dict[k], dict):


Same with this

I've removed the isinstance check. It ought to be unnecessary.
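
The reasoning in the two threads above can be illustrated with a simplified sketch. It is not the actual `_add_config_call`: the key sets and the function name are hypothetical, `TypeError` stands in for `InternalException`, and the '+' handling and listifying of grant values are left out. The point it shows is that once a key is tied to one merge behavior, the stored value always has the matching type, so a second `isinstance` check on the stored value adds nothing.

```python
from typing import Any, Dict, Set

# Hypothetical stand-ins for the config keys registered under each merge behavior.
APPEND_KEYS: Set[str] = {"tags"}
DICT_KEY_APPEND_KEYS: Set[str] = {"grants"}


def add_config_call(config_call_dict: Dict[str, Any], opts: Dict[str, Any]) -> None:
    """Fold one config() call into the accumulated dict, in place."""
    for key, value in opts.items():
        if key in APPEND_KEYS:
            values = value if isinstance(value, list) else [value]
            # An existing entry was stored as a list by an earlier call.
            config_call_dict.setdefault(key, []).extend(values)
        elif key in DICT_KEY_APPEND_KEYS:
            if not isinstance(value, dict):
                raise TypeError(f"expected dict, got {value!r}")
            # An existing entry was validated as a dict by an earlier call.
            config_call_dict.setdefault(key, {}).update(value)
        else:
            # Every other key is simply overwritten by the latest call.
            config_call_dict[key] = value


calls: Dict[str, Any] = {}
add_config_call(calls, {"tags": "nightly", "grants": {"select": ["bi"]}})
add_config_call(calls, {"tags": ["finance"], "grants": {"insert": ["etl"]}, "materialized": "table"})
print(calls)
```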

ChenyuLInx · 2022-05-18T22:52:45Z

core/dbt/contracts/graph/manifest.py

@@ -1135,6 +1135,12 @@ class WritableManifest(ArtifactMixin):
        )
    )

+    def __post_serialize__(self, dct):


What's the reason we move this logic from parsed.py to here?

The problem we were running into with the partial parsing bug was that the config_call_dict wasn't being saved with the partial parsing manifest. So when a new schema file was added the config was being applied to the existing model node, but without the config_call_dict we lose the config from the SQL file.

There were two options to fix it, 1) save the config_call_dict or 2) every time we add a new schema file reload all of the nodes that might be affected. The second option would complicate partial parsing substantially and felt more inconsistent. The config_call_dict wasn't being saved to start with only because my imagination did not think about this case. I clean it up in the WritableManifest post_serialize method because we don't really want or need it in the manifest artifact.
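
As a rough illustration of option 1, the sketch below keeps `config_call_dict` in the form saved for partial parsing and strips it only from the dict that would be written out as the artifact. The function name and the dict layout are assumptions made for this example; this is not the body of `__post_serialize__` from the PR.

```python
import copy
from typing import Any, Dict


def strip_config_call_dict(manifest_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a serialized manifest without per-node config_call_dict."""
    cleaned = copy.deepcopy(manifest_dict)
    for node in cleaned.get("nodes", {}).values():
        node.pop("config_call_dict", None)
    return cleaned


# Hypothetical serialized form kept for partial parsing.
saved_for_partial_parsing = {
    "nodes": {
        "model.my_project.my_model": {
            "config": {"materialized": "table"},
            "config_call_dict": {"materialized": "table"},
        }
    }
}

# The artifact copy drops the key; the saved form still carries it.
artifact = strip_config_call_dict(saved_for_partial_parsing)
print(artifact)
```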

gshank · 2022-05-19T13:42:10Z

@ChenyuLInx Your sum up is correct.

ChenyuLInx

LGTM!

* Handle 'grants' in NodeConfig, with correct merge behavior
* Fix a bunch of tests
* Add changie
* Actually add the test
* Change to default replace of grants with '+' extending them
* Additional tests, fix config_call_dict handling
* Tweak _add_config_call to remove unnecessary isinstance checks

gshank added 3 commits May 10, 2022 16:54

Handle 'grants' in NodeConfig, with correct merge behavior

4b46fb6

Fix a bunch of tests

9b788a4

Add changie

6d18915

gshank requested review from a team as code owners May 11, 2022 00:52

gshank requested review from nathaniel-may and ChenyuLInx May 11, 2022 00:52

cla-bot bot added the cla:yes label May 11, 2022

gshank marked this pull request as draft May 11, 2022 00:52

gshank requested a review from jtcohen6 May 11, 2022 00:53

Actually add the test

31c0e9f

gshank force-pushed the ct-581-grants_as_configs branch from fde7052 to 31c0e9f Compare May 11, 2022 01:08

jtcohen6 reviewed May 11, 2022

Change to default replace of grants with '+' extending them

649c9c5

This was referenced May 12, 2022

[CT-634] Node configs: tech debt #5236

Open

[CT-651] Custom merge behavior for configs #4108

Closed

jtcohen6 reviewed May 12, 2022

Additional tests, fix config_call_dict handling

7459608

gshank marked this pull request as ready for review May 16, 2022 21:41

gshank requested a review from emmyoop May 16, 2022 21:41

ChenyuLInx reviewed May 18, 2022

Tweak _add_config_call to remove unnecessary isinstance checks

266b469

gshank requested a review from ChenyuLInx May 19, 2022 16:49

ChenyuLInx approved these changes May 19, 2022

gshank merged commit e50678c into main May 19, 2022

gshank deleted the ct-581-grants_as_configs branch May 19, 2022 18:50
