Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace deprecated adapter methods with relations #133

Merged
merged 4 commits into from
May 8, 2019

Conversation

jtcohen6
Copy link
Contributor

@jtcohen6 jtcohen6 requested a review from drewbanin April 27, 2019 00:38
Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great! Some small comments here in the interest of

  1. minimizing end-user confusion as we make these breaking changes
  2. pulling dbt-utils in line with the database-aware functionality provided by dbt

Using relations here is a really good decision - this will cause things like union_tables to automatically become database-aware (with your change). This is especially good because the source() function returns a Relation, and there are lots of really cool use cases you can imagine here!

So, great work and thanks for opening this PR! Happy to discuss any of my comments in here :)

README.md Outdated Show resolved Hide resolved
macros/sql/get_tables_by_prefix.sql Outdated Show resolved Hide resolved
macros/sql/get_tables_by_prefix_sql.sql Show resolved Hide resolved
{%- set include_cols = [] %}
{%- set cols = adapter.get_columns_in_table(schema_name, table_name) -%}
{%- set cols = adapter.get_columns_in_relation(from) -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be merit to adding a check here that returns an error if a non-relation is provided to this macro (and others). This would have the added benefit of raising an error if someone passes an ephemeral model into star.

Maybe:

{% if from.get('name') is none %}
  {% do exceptions.raise_compiler_error("Macro star() expected a Relation but received the value: " ~ from) %}
{% endif %}

or something. We could wrap this up in its own little macro as a helper :)

Copy link
Contributor Author

@jtcohen6 jtcohen6 Apr 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! We might add this error to all of the macros that now expect relation arguments in a break-y sort of way. Two q's:

  • Is there a better way to infer the argument type, or is arg.get('name') != none the best way to confirm that arg is a Relation?
  • Is there a Jinja method you know of that lets us pass the name of the current macro? Relevant if we were to make error_if_not_relation(arg) its own macro, and wanted to pretty print a dynamic compiler error "Macro " ~ my_macro ~ " expected a Relation for argument " ~ arg"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are really good questions!

I don't think we have a great mechanism in place to assert the types of objects from the jinja context. We could conceivably add a context function or a jinja test (like abc is Relation) but that doesn't exist today :/. I think our best option is going to be to look for a name attribute. Here's some example code I played around with:

{% set things = [
    'abc.def',
    api.Relation.create(schema='ghi', identifier='jkl'),
] %}


{% for thing in things %}
    {{ log(thing ~ " is relation? " ~ (thing is mapping and thing.name is defined), info=true) }}
{% endfor %}

Let's be sure to test this with dbt-labs/dbt-core#1416

We also don't have a way to get the macro name from the macro context. I think we should just anticipate making a macro to handle raising the exception, then pass in a string to use as the macro name from the caller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right on. Is an is_relation() macro something that I should add to utils now, or do you anticipate creating it within the context of dbt proper?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - let's add it to dbt-utils! Maybe name it something like _is_relation() since I wouldn't be comfortable with folks using this externally (yet). If we had some way of locally scoping the macro I'd suggest that, but we can just indicate its private-ness with a naming convention for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly related to the PR you posted: from.name does return a value, but from.get('name') does not. The macro fails the Relation test as written above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, we'll probably want to call from.name then. @beckjake off the top of your head, can you think of a better way to tell if an object is a Relation in the jinja context than obj is mapping and obj.name is not none?

Copy link
Contributor

@beckjake beckjake Apr 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, .name is a property that actually resolves to .path.get('identifier').

I think I'd do obj is mapping and 'type' in obj.get('metadata', {})). We should probably add a jinja test for this, or something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ew, maybe you have to even check the type string:

obj is mapping and obj.get('metadata', {}).get('type', '').endswith('Relation')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented the most rigorous logic in a macro _is_incremental(), which returns the following exception:

Compilation Error
  Macro star expected a Relation but received the value: hello

  > in macro _is_relation (macros/cross_db_utils/_is_relation.sql)
  > called by macro star (macros/sql/star.sql)
  > called by <Unknown>

README.md Outdated Show resolved Hide resolved
@drewbanin
Copy link
Contributor

@jtcohen6 is this ready for a final review?

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented May 8, 2019

@drewbanin Yessir!

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this LGTM! Very nice work here @jtcohen6 - thanks so much for digging way into this! Merge when ready :)

@@ -1,25 +1,25 @@
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='') %}
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', database=target.database) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will work great, but there's an important thing to know about python (and therefore jinja, I assume?) argument defaults. Check out this article on mutable default arguments.

This code is A-OK because target.database is a string, which is immutable. It would only be a problem if the code read:

{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', databases=[]) %}

or similar.

table_schema || '.' || table_name as ref
from information_schema.tables
table_schema as "table_schema", table_name as "table_name"
from {{database}}.information_schema.tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spent a lot of time getting this working right with quoting in dbt-core. I think this code is fine for now, but in general, we'll probably want to create some sort of object (analogous to Relations) which encapsulates a database + schema and any applicable quoting configs.

This code won't work if a database is named my-database on eg. Snowflake. While a user could pass in their database string as "my-database", that would cause problems when we create the Relation object in get_tables_by_prefix. I'm ok with shipping this as-is, but let's keep examples like this in mind as we improve dbt's understanding of databases and quoting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agreed. I had tried using the adapter.quote() method (necessary in the BQ version below, given our BQ test project name), but it caused errors on Snowflake. Noted for our future selves.

@@ -0,0 +1,5 @@
{% macro _is_relation(obj, macro) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the intended behavior, but it's important to note: This will raise if the ref of an ephemeral model is passed in here. The ref() function returns a string for ephemeral models, vs. a Relation for any non-ephemeral model (or source). I believe all of the macros here that use _is_relation are expecting a table/view, so that is A-OK!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0.13.0 compatibility
3 participants