Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BLOCKING] Define what values end up in the rendering context, and where #1255

Open
beckjake opened this issue Jan 23, 2019 · 6 comments

Comments

@beckjake
Copy link
Contributor

commented Jan 23, 2019

Feature

Currently, rendering in dbt is pretty ad-hoc. We should define what is rendered, where, and with what context very explicitly. It might end up being complex or layered, but at least having it defined and somewhat consistent should be possible.

Feature description

The status quo:

  • description fields in schema.yml are rendered in a context that only supports doc
  • The dbt_project.yml rendering supports var and env_var, but does not render any hooks or vars fields (they're used elsewhere)
  • Fields in the sources section of schema.yml are rendered with var and env_var available.
  • Fields in the models section of schema.yml are generally not rendered at all, except descriptions as described above.
  • Models/analyses get a full context for rendering/execution

We should define some rules for these things and formally document them. I think a good start would be:

  • var and env_var are availble in all rendering contexts
  • doc is additionally available in description field rendering
  • models gets treated the same as sources in schema.yml
  • target is available in schema.yml - in dbt_project.yml it's kind of a dependency ordering issue, since we potentially use information in there to pick the target!

@drewbanin any feedback on other variables you'd expect to be available in schema.yml? It would be nice if we had some sort of user-facing documentation on what you can use and where. I've implemented a lot of this stuff and I find it pretty confusing myself...

Who will this benefit?

developers, end users, anyone who wants to understand what dbt is doing with their config.

@beckjake beckjake added this to the Stephen Girard milestone Jan 24, 2019

@cmcarthur

This comment has been minimized.

Copy link
Member

commented Jan 24, 2019

var and env_var are availble in all rendering contexts

I'm on board, but we need to document how var precedence and scoping works.

@beckjake

This comment has been minimized.

Copy link
Contributor Author

commented Jan 24, 2019

Yeah! A great example is: should vars defined or overridden in dbt_project.yml's models section influence the behavior of var statements in schema.yml's models section? What about for tests defiend in that models section? On the one hand, that could be really useful and has obvious applications. On the other hand, I think the results will be extremely unintuitive: Tests would not get vars from their target models, but description fields, etc would.

I would prefer to only support cli-supplied vars in most rendering contexts, and allow scoped var declarations in models and such only.

@drewbanin

This comment has been minimized.

Copy link
Contributor

commented Jan 24, 2019

@beckjake strong agree that we should document these contexts more clearly. I'm all for documenting the current state of things (even if it's dumb, unintuitive) but I don't think we should actually mutate any contexts right now.

If anything, I want to bias in favor of restricting the available context as much as possible (for now). We ran into this with var() usage in generate_schema_name(). That's one are where there was indeed a var variable in the context, but it was pretty useless! I suspect the same is true in other places throughout the codebase.

When we do document this, I want to document only the set of context variables that we're sure we can support well going forwards. That might mean things like not documenting ref inside of macros, which works today, but might not in the future. vars really throw me for a loop - their scoping makes no sense for nonstandard applications (what does var('abc') resolve to in an Analysis or a Custom Data Test?).

I'm very on board with documenting the craziness, but let's produce two sorts of descriptions:

  1. for dbt developers, indicating the state of the world as-is right now
  2. for dbt users, indicating how they should write code in their dbt projects

The differences between the two will point out where things are inconsistent/poorly defined. We should then queue up work to resolve these inconsistencies. @beckjake if you can do the first part, I can do the second part :)

@beckjake

This comment has been minimized.

Copy link
Contributor Author

commented Jan 24, 2019

That might mean things like not documenting ref inside of macros, which works today, but might not in the future.

refs work in macros because it's pure text substitution, not because we interpret them. I don't imagine that changing!

vars really throw me for a loop - their scoping makes no sense for nonstandard applications (what does var('abc') resolve to in an Analysis or a Custom Data Test?).

Without checking, pretty sure that if you've defined abc on the cli it will work and otherwise it'll complain that the variable doesn't exist (which is why I think a global-scoped vars in dbt_project.yml would not be a bad idea).

But I agree, that's why I think we should only support cli-defined vars in most of these places (it is super useful for tests, at a minimum) and only do the SourceConfig stuff in rendered models that support {{ config(...) }} calls.

I'll do my best to document the status quo over the next few days. As you indicate, var scoping is arcane at best!

@cmcarthur cmcarthur changed the title Define what values end up in the rendering context, and where [BLOCKING] Define what values end up in the rendering context, and where Feb 6, 2019

@drewbanin

This comment has been minimized.

Copy link
Contributor

commented Mar 3, 2019

I just kicked this into Wilt Chamberlain. We should do it, but I just need to find some time to write everything down (and figure out a sensible format, since it's a lot of data!)

Would it be reasonable to unit test the context? I'm thinking we could write a contract, then use that contract to generate (even manually!) the docs. Otherwise, it's kind of a mess to keep everything straight everywhere.

This doesn't really handle things like the difference in var scoping across dbt_project.yml and models, but I think we can figure something out there...

@beckjake

This comment has been minimized.

Copy link
Contributor Author

commented Mar 4, 2019

Would it be reasonable to unit test the context? I'm thinking we could write a contract, then use that contract to generate (even manually!) the docs. Otherwise, it's kind of a mess to keep everything straight everywhere.

Yes, I think that's a fantastic idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.