Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configs in schema files #3616

Merged
merged 1 commit into from
Aug 17, 2021
Merged

Configs in schema files #3616

merged 1 commit into from
Aug 17, 2021

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Jul 23, 2021

resolves #2401

Description

This adds the ability to set configs in schema files for models, seeds, snapshots, and analyses. It does not address configs in macros, sources, or tests, which will be addressed separately.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jul 23, 2021
@gshank gshank temporarily deployed to Bigquery July 23, 2021 19:45 Inactive
@gshank gshank temporarily deployed to Bigquery July 23, 2021 19:45 Inactive
@gshank gshank temporarily deployed to Snowflake July 23, 2021 19:45 Inactive
@gshank gshank temporarily deployed to Redshift July 23, 2021 19:45 Inactive
@gshank gshank temporarily deployed to Redshift July 23, 2021 19:45 Inactive
@gshank
Copy link
Contributor Author

gshank commented Jul 23, 2021

Notes on implementation:
I moved the add_patch, add_macro_patch, and add_source_patch code into the schema parser file because they were more appropriate there.
I changed the 'render_test_update' code for unique and not_null to make better use of the 'update_parsed_node_config' method.
I added a 'meta' attribute to NodeConfig because there were not very many code implications. The node level meta and the config meta should now be the same.
I also removed the 'vars' config attribute because it was deprecated quite a while back.

@gshank gshank temporarily deployed to Postgres July 27, 2021 13:08 Inactive
@gshank gshank temporarily deployed to Bigquery July 27, 2021 13:08 Inactive
@gshank gshank temporarily deployed to Snowflake July 27, 2021 13:08 Inactive
@gshank gshank temporarily deployed to Snowflake July 27, 2021 13:08 Inactive
@gshank gshank temporarily deployed to Redshift July 27, 2021 13:08 Inactive
@gshank gshank temporarily deployed to Redshift July 27, 2021 13:08 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work! I had a lot of fun taking this for a spin. We'll want to significantly revise our documentation on nodes vs. properties, and that's a good thing :)

I'm going to spend some more time playing around with this. For the moment, my biggest comment is that we should merge node.config.meta into node.meta. It doesn't make sense to leave these as separate, unrelated values. This might have some confusing behavior, so let's find ways to do reasonable things. I'm thinking we should treat the meta key in the models dict as being at the "same level as" the config.meta key, and raise an error if both are set. For example:

models:
  - name: my_model
    meta:
      something: blabla
    config:
      meta:
        something: lala

We should raise an error here, since we don't know whether to prefer blabla or lala. We should do it even if those meta dictionaries contain different keys. Any in-file config(meta = ...) should take precedence.

I'm also going to open some follow-on issues that are outside the scope of this specific PR, related to the larger reconciliation effort proposed in #2401 and kicked off here.

  • Support config: on tests. That might get us "for free" the ability to set the meta property on tests (Add support for meta in tests #3629)
  • Support config: on sources, and then turn existing node-level properties into configs that can be set in dbt_project.yml + config blocks. When I say this, I'm especially thinking about source properties like database, loaded_at_field, etc. (I think it would make sense to reserve some properties—description, columns, tests—rather than try to make those work as configs... though there are some valid reasons to want those, too!)
  • Support the same config: block-style syntax in dbt_project.yml, as a substitute for the + prefix that everyone finds confusing :)

core/dbt/parser/schemas.py Show resolved Hide resolved
@jtcohen6
Copy link
Contributor

Update on my comment above, with links to the promised issues:

@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 6, 2021

As discussed, there are three outstanding pieces for this PR:

  • Make sure configs are being rendered (using standard .yml file context). Ideally, we'd capture the un-rendered versions in unrendered_config, as we do for config set in dbt_project.yml.
  • Merge node.config.meta --> node.meta
  • Support a config: dict nested under tests. Support grabbing test configs from config: dict or top-level keys. Raise an exception if the same test config is set in both the config: dict and as a top-level key.

Those are all things we want for feature completeness in v0.21. They don't have to be blockers to merging this specific PR; I'd be happy to open follow-on issue(s), if they'd require nontrivial work and we'd rather get this PR merged now.

@gshank gshank temporarily deployed to Postgres August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Redshift August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Redshift August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Bigquery August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Bigquery August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Snowflake August 12, 2021 19:24 Inactive
@gshank gshank temporarily deployed to Snowflake August 12, 2021 19:24 Inactive
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this a thumbs-up from an end-user functional perspective. There are likely to be a handful of small things that we want to touch up as people start using this. I'm excited to get it into beta.

🚀 🚀

CHANGELOG.md Outdated Show resolved Hide resolved
@gshank gshank temporarily deployed to Bigquery August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Bigquery August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Snowflake August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Snowflake August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Postgres August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Redshift August 17, 2021 15:50 Inactive
@gshank gshank temporarily deployed to Redshift August 17, 2021 15:50 Inactive
@gshank gshank merged commit f204d24 into develop Aug 17, 2021
@gshank gshank deleted the config_in_schema_files branch August 17, 2021 16:18
@brad-meru
Copy link

These aren't being captured in dbt Docs. The only way that seems to work is to config in the model.

It would also be great if column tags made it into Docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set configs in schema.yml files
5 participants