Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Component and group commands #47

Merged
merged 18 commits into from
Jun 8, 2023
Merged

Conversation

dave-connors-3
Copy link
Collaborator

This PR attempts to unify the actions of group creation and contract creation into one group command, and adds a operation subgroup for the existing commands.

This required a couple small logic adjustments to the grouper logic to handle cases where selection syntax includes non-model nodes. I slotted it in a couple places, but likely could push it further down into the group logic.

@nicholasyager I am a little stuck on the click stuff -- this draft seems to successfully pass args and run the create_group command when the group command is called, but doesn't seem to do the same for the contract operations. Would love some guidance on this! Once we make that edit, I can add real tests to the test file.

@nicholasyager
Copy link
Collaborator

nicholasyager commented May 26, 2023

I am a little stuck on the click stuff -- this draft seems to successfully pass args and run the create_group command when the group command is called, but doesn't seem to do the same for the contract operations. Would love some guidance on this! Once we make that edit, I can add real tests to the test file.

@dave-connors-3, I did some poking around, and I believe the proximal issue is that resource selection for add-verson is returning no results. I added some logging in a local branch, and that showed that the method being called is being executed.

The root cause, however, is a little deeper. In the group function's invocation of add-version, it is using the following node selector: group:{{ group }},config.access:public.
The crux of the issue is that according to official docs for groups, access is a model property and not a config

According to spec

models:
 - name: example
   group: foo
   access: public

Not necessarily correct

models:
 - name: example
   group: foo
   config:
     access: public

As a result, using the config.access:public selector will yield no results!

Having that been said, it looks like access is an unutilized property of a model config within the dbt-core schema, so we can define access in the config and it will evaluate. Having this been said, the lack of an access NodeSelection method seems like an oversight, since it would be required if folks are following the official docs.

So to recap, we can likely work around this by writing access in a model's config instead of as a property. More long term, there's likely a dbt-core issue to be resolved.

CC @graciegoheen

@dave-connors-3
Copy link
Collaborator Author

alright @nicholasyager @graciegoheen

After some more tweaking, i think we're in decent shape:

New functionality:

  1. added a parameter to the add-contract function that filters down to public only models. I did not add CLI flags for this option since we agreed this should be supported in core in the future.

After some testing, found a few more places where logic needed updating:

  1. the test project I added from winnie's template had an instance where a column was in the yml file that the model in fact did not produce! I added a condition to the contract yml editor method to ignore these cases. Might be worth revisiting in the future.
  2. Somewhat Major Change Alert: after creating the group in my test case, I parsed the project again to make sure that the models I expected to be public and under contract were -- this raised an immediate parsing issue arising from our model access logic. Our default in classify_resource_access is to make a model private if it's not a leaf node has a cross group dependency. Private access prevents you from reffing that model at all, even within the group. As such, I opted to update the logic to default to protected to avoid this parsing issue.
  3. As discussed, added a filter to the add_group yml method to not attempt to update the access of any non-model resource
  4. Encountered another compilation issue: incremental models that have a contract added to them must have +on_schema_change: append_new_columns -- this was not the case in the test project. I updated the global config there, but I am going to open a new issue for us to do this as a part of the contract logic in this package. This could result in some wonkiness if the config is set within the model body itself as opposed to the model yml.

@dave-connors-3 dave-connors-3 marked this pull request as ready for review May 30, 2023 20:08
@nicholasyager
Copy link
Collaborator

2. Somewhat Major Change Alert: after creating the group in my test case, I parsed the project again to make sure that the models I expected to be public and under contract were -- this raised an immediate parsing issue arising from our model access logic. Our default in classify_resource_access is to make a model private if it's not a leaf node has a cross group dependency. Private access prevents you from reffing that model at all, even within the group. As such, I opted to update the logic to default to protected to avoid this parsing issue.

@dave-connors-3 Can you please demonstrate a reproducible case of this occurring? I'm not able to replicate using the create-group command in main, so I'm wondering if there's an odd interaction happening here.

@dave-connors-3
Copy link
Collaborator Author

@dave-connors-3 Can you please demonstrate a reproducible case of this occurring? I'm not able to replicate using the create-group command in main, so I'm wondering if there's an odd interaction happening here.

I think the issue is not with the command itself -- it runs properly and assigns the access as expected. The issue comes on the next dbt invocation, where the project fails to compile because we've made nodes that are connected in the DAG private.

@nicholasyager
Copy link
Collaborator

@dave-connors-3 Can you please demonstrate a reproducible case of this occurring? I'm not able to replicate using the create-group command in main, so I'm wondering if there's an odd interaction happening here.

I think the issue is not with the command itself -- it runs properly and assigns the access as expected. The issue comes on the next dbt invocation, where the project fails to compile because we've made nodes that are connected in the DAG private.

Very interesting and very odd! The docs would leave me to believe that referencing private models is allowed, so long as the referring model is in the same group.

For example, This should be fine

flowchart LR

  subgraph group
    private_model --> other_model
  end

For example, This should raise an error

flowchart LR

  subgraph group
    private_model
  end
  private_model --> other_model

@dave-connors-3
Copy link
Collaborator Author

@nicholasyager this may very well be a me problem! I'll see if i can reliably replicate the issue I was having

@dave-connors-3
Copy link
Collaborator Author

@nicholasyager I have not yet resolved the conflicts here (which may help!) but i flipped the AccessType assigned in grouper.py, created a group with this command, and immediately saw the same error i referenced before:
image

Checking the manifest, it looks like this falls into your diagram above -- stg_orders is in dave and is private, orders is in dave and is public, so should be able to select from the same group.

Can you confirm whether this set of commands creates the same error on your side? If it does, I think this might be a core issue?

@nicholasyager
Copy link
Collaborator

@dave-connors-3 Thanks for the test case! I was able to replicate the issue, and it stems from the interaction of two defects.

Defect 1: The add_group_and_access_to_model_yml method is setting model groups within a model config. This "works", but is not up to spec. Line 80 should be updated to set the group property instead of a config a la

model_yml.update({"access": access_type.value, "group": group.name})
models[model_name] = process_model_yml(model_yml)

Defect 2: The add_model_contract_to_yml method overwrites a model's defined config properties.

model_yml.update({"config": {"contract": {"enforced": True}}})

Instead, this should access the config itself and then update that sub-dictionary.

The combination of the two means that orders was getting set to the dave group, and then that group was being overwritten. Fixing the two defects above will resolve the issue.

@dave-connors-3
Copy link
Collaborator Author

@nicholasyager I just also discovered this! shout out to ruamel for making the yaml edits 10000% easier to read! Will update!

@dave-connors-3
Copy link
Collaborator Author

@nicholasyager should be ready to rock and roll!

@dave-connors-3 dave-connors-3 merged commit 3bec517 into main Jun 8, 2023
1 check passed
@nicholasyager nicholasyager deleted the component-and-group-commands branch July 25, 2023 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants