Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds width_bucket macro #139

Closed
wants to merge 29 commits into from
Closed

Adds width_bucket macro #139

wants to merge 29 commits into from

Conversation

clausherther
Copy link
Contributor

@clausherther clausherther commented Jun 2, 2019

Adds width_bucket macro per #138 .

Not tested on BigQuery yet, but confirms it at least runs on sample data on Redshift and Snowflake (and matched there to the built-in function).

Other possible improvements: wrap mod in a cross-db macro as well.

Needs README documentation, also possibly data type coercion of num_buckets to float or other code to make sure integer division works properly on all platforms.

@jthandy jthandy self-requested a review June 3, 2019 00:08
@jthandy
Copy link
Member

jthandy commented Jun 10, 2019

hey @clausherther! love what you've done here--thanks for the contribution :)

have you made a PR against dbt utils in the recent past? we set up a testing process that uses static seed data plus dbt test plus CircleCI such that we can assert the validity of all macros across all supported databases. there is some light documentation on how this works here:

https://github.com/fishtown-analytics/dbt-utils/tree/master/integration_tests

I'd love to get this code a coming release if you're able to push it across the line and get tests built!

@jthandy jthandy removed their request for review June 10, 2019 01:47
@clausherther
Copy link
Contributor Author

Hi @jthandy! Sure thing, I'll add the integration test this week, I'll use the test data from the Snowflake docs. Thanks!

@clausherther
Copy link
Contributor Author

Btw, if I add a separate macro to handle mod, should that be a new PR, or can we roll that into this one? I think I know the answer.

@jthandy
Copy link
Member

jthandy commented Jun 10, 2019

I'd recommend putting it in a separate PR, but we can work on merging that one first if you'd like to use that functionality in here...

@clausherther
Copy link
Contributor Author

I'm opting to just use mod on all platforms, rather than writing a cross-db macro, since all 4 supported platforms use mod natively.

@clausherther clausherther changed the title First stab at adding width_bucket macro Adds width_bucket macro Jun 10, 2019
@jthandy
Copy link
Member

jthandy commented Jun 12, 2019

@drewbanin this is failing in CircleCI for reasons unrelated to the PR itself. any ideas on what we need to do to get the tests to run?

@clausherther this all looks solid to me. we need to figure out what's going on with the tests before I feel fully confident though.

@drewbanin
Copy link
Contributor

Hey @clausherther - just fixed these tests in #141

Can you try merging master, then pushing to this PR again? I think that should work

@clausherther
Copy link
Contributor Author

@drewbanin oh shit git, ok I think I did the right thing here, but let me know!

@clausherther
Copy link
Contributor Author

Ah blah, Postgres mod fail. Let me look into that. I checked that PG had a mod function, but I guess didn't check the arguments properly. We might still want a cross-db mod macro before we can merge this.

@clausherther
Copy link
Contributor Author

Postgres doesn't like floats in the mod function, so I'm casting the inputs to it as a numeric, which I think should work across dbs, but let me know if you want to handle this a different way. I confirmed on PG and also on Snowflake using the PG output.

@drewbanin
Copy link
Contributor

@clausherther is this ready to roll when the tests are passing?

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped some comments in here - this looks really close! Happy to discuss - lmk!

macros/cross_db_utils/width_bucket.sql Outdated Show resolved Hide resolved
macros/cross_db_utils/width_bucket.sql Outdated Show resolved Hide resolved
Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple more comments after giving this another look! Thanks for revisiting it! I'm hoping to cut a dbt-utils release later today. Would love to get this in there if you have the time to look at it, otherwise, very happy to release it for a subsequent version. Just lmk what you prefer!

@@ -0,0 +1,16 @@
{% macro intersect() %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, how did this one get in here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmh no idea, did this get added when I rebased? Says last updated 9 days ago by Claire, and I opened this PR before then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - i think that's right. I'd try rebasing against master again - that should make this commit disappear i think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, I think this didn't do it. Now I have a lot more files from the rebase in this commit.
I fetched from upstream, merged upstream/master into my master, then rebased this branch from my local master. Does that sound right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, yeah, that sounds like the right workflow to me! Not sure how you ended up with so many of these commits floating around. Lmk how I can help get it sorted - happy to chat on slack if that's easier :)

)
{%- endmacro %}

{% macro snowflake__width_bucket(expr, min_value, max_value, num_buckets) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this macro? I think Snowflake should just be able to use the default implementation, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had understood this as: on Snowflake we pass the parameters on to the built-in function. Is that now what you think this is doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh!! I so didn't catch that this was how it worked - shout out to Snowflake for having all the good functions :)

Cool, this is perfect, ignore me :)

@clausherther
Copy link
Contributor Author

Closing this in favor of #145 because of hot-git-mess

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants