Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Fetch Macro #160

Merged
merged 27 commits into from
Dec 2, 2019
Merged

Feature: Fetch Macro #160

merged 27 commits into from
Dec 2, 2019

Conversation

ChrisLawliss
Copy link
Contributor

@ChrisLawliss ChrisLawliss commented Sep 13, 2019

This PR adds fetch macro which returns a dictionary from a sql query.

Todo:

  • check failing test

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of comments around documentation + tests, but the core fetch macro looks great! Thanks for splitting this out from #123 - this looks great

README.md Outdated
Usage:
```
-- Returns a dictionary of the users table where the state is California
{% set ___ = dbt_utils.fetch("select * from ref{{'users'}} where state = 'CA' ") %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea to make usage docs like this directly usable. Can you replace ___ here with users? I also think this is a syntax error - it should be something like:

{% set users = dbt_utils.fetch("select * from " ~ ref('users') ~ " where state = 'CA' ") %}

} %}

{% set actual_dictionary=dbt_utils.fetch(
"select * from {{ ref('data_fetch') }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing here - I don't think you want to use curly brackets inside of a set definition. Try:

"select * from " ~ ref('data_fetch')

"select * from {{ ref('data_fetch') }}"
) %}

{% if actual_dictionary | lower == expected_dictionary | lower %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you can call lower on a dictionary - I'd recommend plucking out the fields that you care about, so:

{% if actual_dictionary['col_1'] | lower != expected_dictionary['col_1'] | lower %}
  select 1
{% elif actual_dictionary['col_2'] | lower != expected_dictionary['col_2'] | lower %}
  select 1
{% elif ... %}

{% else %}
  select 1 limit 0
{% endif %}

The limit 0 approach tends to work better than where false, as I think there's a database out there (postgres? bigquery? I forget) which doesn't let you use a where predicate if you don't also have a from clause in the query.

Overall though, this is a slick looking test!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually something I'm having troubles/need help with. The fetch macro returns numbers in the form decimal('1'), decimal('2'), decimal('3') rather than 1,2,3 . I'm not sure how this would (or should) be converted.

Copy link

@ghost ghost Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this decimal('1') been fixed? I'm having similar issues with dates which come as datetime.date(1970, 1, 1)

@@ -0,0 +1,21 @@
{% macro fetch(sql) %}
{# This macro returns a dictionary of the form {column_name: (tuple_of_results)} #}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3 this

@drewbanin
Copy link
Contributor

Oops - didn't mean to approve this quite yet! Let me know about my comments here - happy to look at the tests together if that would be helpful :)

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(see above)

@ChrisLawliss
Copy link
Contributor Author

ChrisLawliss commented Sep 14, 2019

@drewbanin I made the adjustments, the tests are passing. Can you take a look at assert_fetch_objects_equal.sql? I switched the brackets for square brackets in setting the expected dictionary. I applied the list filter to that for consistency, but it is less simple.

@drewbanin
Copy link
Contributor

@ChrisLawliss it's just the Snowflake tests which are failing here, which to me always shouts "CAPITALIZATION!"

I bet fields like col_1, col_2, etc are being returned at COL_1, COL_2. Can you try re-adding that call to lower in here? I recall you providing that filter in an earlier version of this code.

@ChrisLawliss
Copy link
Contributor Author

I think that the new changes could be made a bit more compact

@clrcrl
Copy link
Contributor

clrcrl commented Oct 14, 2019

Yo late to the party here.

Wondering if fetch is the best name for this macro? Maybe get_query_results_as_dict? Just a thought!

@clrcrl
Copy link
Contributor

clrcrl commented Oct 23, 2019

Hi, me again 👋

So I was playing with the in a macro I was writing, and when I used fetch, it seemed to not close off the connection?

$ cat macros/redshift_maintenance.sql
{% macro get_vacuumable_tables() %}
    {% set vacuumable_tables_sql %}
        select
            '"' || table_schema || '"."' || table_name || '"' as table_name
        from information_schema.tables
        order by table_schema, table_name
    {% endset %}

    {% set vacuumable_tables_dict=dbt_utils.fetch(vacuumable_tables_sql) %}
    {{ return(vacuumable_tables_dict['table_name']) }}

{% endmacro %}

{% macro redshift_maintenance() %}

    {% for table in get_vacuumable_tables() %}
        {{ dbt_utils.log_info("Vacuuming " ~ table) }}
        {% set results=run_query("vacuum " ~ table) %}
        {{ log(results, info=True) }}
        {{ dbt_utils.log_info("Analyzing") ~ table }}
        {% do run_query("analyze " ~ table) %}
        {{ dbt_utils.log_info("Finished maintenance on " ~ table) }}
    {% endfor %}


{% endmacro %}

$ dbt run-operation redshift_maintenance
Running with dbt=0.14.2
05:54:49 + Vacuuming "analytics"."customer_orders"
Encountered an error while running operation: Database Error
  VACUUM cannot run inside a transaction block

When I instead used run_query, it worked fine:

 $ cat macros/redshift_maintenance.sql
{% macro get_vacuumable_tables() %}
    {% set vacuumable_tables_sql %}
        select
            '"' || table_schema || '"."' || table_name || '"' as table_name
        from information_schema.tables
        order by table_schema, table_name
        limit 2
    {% endset %}
    {% set vacuumable_tables=run_query(vacuumable_tables_sql) %}
    {{ return(vacuumable_tables.columns[0].values()) }}


{% endmacro %}

{% macro redshift_maintenance() %}

    {% for table in get_vacuumable_tables() %}
        {{ dbt_utils.log_info("Vacuuming " ~ table) }}
        {% set results=run_query("vacuum " ~ table) %}
        {{ dbt_utils.log_info("Analyzing") ~ table }}
        {% do run_query("analyze " ~ table) %}
        {{ dbt_utils.log_info("Finished maintenance on " ~ table) }}
    {% endfor %}


{% endmacro %}
$ dbt run-operation redshift_maintenance
Running with dbt=0.14.2
06:03:05 + Vacuuming "analytics"."customer_orders"
06:03:06 + Analyzing
06:03:06 + Finished maintenance on "analytics"."customer_orders"
06:03:06 + Vacuuming "analytics"."customer_payments"
06:03:06 + Analyzing
06:03:06 + Finished maintenance on "analytics"."customer_payments"

Is it because the call to statement hasn't defined auto_begin, so it's defaulting to True? (here
Whereas run_query sets auto_begin=False (here)

Should we use run_query here instead? To be clear, I still think this is a useful macro, I like that it abstracts away the Agate part of things! (vacuumable_tables.columns[0].values() is somewhat 🤢 )

Copy link
Contributor

@clrcrl clrcrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up spending a bit of time today on this so we can get it merged. I think this macro is super valuable and I'm excited to get it into utils!

So annoying that the testing took so much longer than writing the macro, but that's how writing code goes sometimes!

@clrcrl clrcrl merged commit c506a36 into master Dec 2, 2019
@clrcrl clrcrl deleted the feature/fetch branch December 2, 2019 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants