enablement due to dbt-utils' "dispatchifiction" #17

dataders · 2021-01-07T05:43:56Z

With the two PRs below, we can now do the following two things:

support as many macros as we can.

as of now, all 39 of dbt-utils's macros are fully working except for the 15 listed below:

sql/groupby.sql (will never be supported)
insert_by_period materialization
date_spine
test_mutually_exclusive_ranges() (see "Port test_mutually_exclusive_ranges() to TSQL" #18)
all the url macros (get_url_host(), get_url_path()) (see "port the get_url_*() macros" #21)
generate_series()
get_relations_by_pattern()
get_relations_by_prefix_and_union()
union_relations()

Additionally, the following macros work on Azure SQL and SQL Server, but are disabled because a Synapse bug makes their integration tests fail.

dateadd()
datediff()
split_part()
last_day()
test-hash()

`dbt-utils`'s integration testing

The goal of getting this repo working with dbt-utils' integration testing was to make it very easy to:

test that our shimmed macros work, and
keep said macros up-to-date as dbt-utils changes.

One challenge with using dbt-utils' integration testing is that it is a bit of a RepRap: it relies on its own tests (schema and data) to evaluate whether the other macros are working. So in some cases a macro works in TSQL with only a small change, but we still had to disable the model corresponding to that macro in dbt_project.yml because the tests applied to it weren't compatible with TSQL.

dependent, now-merged PRs:

macros/dbt_utils/schema_tests/mutually_exclusive_ranges.sql

dataders · 2021-01-07T09:58:09Z

@jtcohen6 any idea as to how we can tackle these tests? Maybe wrap the majority of logic into standalone macros that we can write our own versions of?
https://github.com/swanderz/dbt-utils/pull/1/files

jtcohen6 · 2021-01-07T10:03:54Z

> @jtcohen6 any idea as to how we can tackle these tests? Maybe wrap the majority of logic into standalone macros that we can write our own versions of?
> https://github.com/swanderz/dbt-utils/pull/1/files

@swanderz We could abstract those in {{ select_one() }} {{ limit_zero() }} macros, but your proposed changes also seem like reasonable rewrites to me that wouldn't break any mainline behavior. Want to add them to dbt-labs/dbt-utils#310?
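To illustrate the idea of abstracting dialect-specific SQL behind a dispatched macro, here is a minimal Python sketch of the lookup pattern: try an adapter-prefixed implementation first, then fall back to the default. The `MACROS` registry, the `dispatch` helper, and the `where 0 = 1` stand-in are all hypothetical names and values chosen for illustration; they are not dbt's API and not necessarily what this PR ships.

```python
# Hypothetical registry: implementations keyed by "<prefix>__<macro name>".
MACROS = {
    "default__limit_zero": lambda: "limit 0",
    # Assumed T-SQL stand-in for illustration only; the PR's actual
    # replacement for 'limit 0' is not shown in this thread.
    "sqlserver__limit_zero": lambda: "where 0 = 1",
}


def dispatch(name, adapter_type, registry=MACROS):
    """Return the adapter-specific implementation if present, else the default."""
    for prefix in (adapter_type, "default"):
        impl = registry.get(f"{prefix}__{name}")
        if impl is not None:
            return impl
    raise KeyError(f"no implementation found for macro {name!r}")


print(dispatch("limit_zero", "sqlserver")())  # adapter-specific version
print(dispatch("limit_zero", "postgres")())   # falls back to the default
```

With this shape, a downstream package only needs to register its own prefixed implementation; callers of the macro stay unchanged.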

dataders · 2021-01-07T10:10:00Z

> Want to add them to fishtown-analytics/dbt-utils#310?

sure, though i'm leery of the size of that PR, so maybe in a follow-on 🤷 ?

jtcohen6 · 2021-01-07T11:41:45Z

> > Want to add them to fishtown-analytics/dbt-utils#310?
>
> sure, though i'm leery of the size of that PR, so maybe in a follow-on 🤷 ?

Feels closely related, and that PR isn't too big (talking about "t-sql hacks that shouldn't break mainline behavior," not "dispatch-ify ALL THE MACROS!")

integration_tests/dbt_utils/dbt_project.yml

alittlesliceoftom · 2021-01-12T06:38:22Z

Could I be really boring and request:

1. Slightly more description of the PR - it seems we're (a) enabling a bunch more things due to the other PRs and (b) also adding some new functionality e.g. time datatypes. Presumably these are fixes in response to tests we can now run due to (a)?
2. resolve the merge conflict


dataders · 2021-01-12T07:30:41Z

> Could I be really boring and request:
>
> 1. Slightly more description of the PR - it seems we're (a) enabling a bunch more things due to the other PRs and (b) also adding some new functionality e.g. time datatypes. Presumably these are fixes in response to tests we can now run due to (a)?
> 2. resolve the merge conflict

done. and done.

dataders · 2021-01-13T07:02:08Z

@mikaelene what do you think?

alittlesliceoftom · 2021-01-13T07:14:58Z

macros/dbt_utils/cross_db_utils/hash.sql

@@ -1,5 +1,5 @@
 {% macro sqlserver__hash(field) %}
-    hashbytes('md5', {{field}})
+    convert(varchar(50), hashbytes('md5', {{field}}), 2)


@swanderz just noting that this will limit hashable field size to 50 (bytes not chars I think). Is this intended behaviour? Could use string?

mikaelene · 2021-01-13

I looked at this as well. I had the same in the old sqlserver-utils package. Maybe we can have varchar(max)? Most cases will be hashing one column, but if you want to look for diffs, like watching for changes, you may include a lot of columns, making the hashes longer than 50.
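As a side note on the length question, the following is a minimal Python sketch (not T-SQL) showing that a hex-encoded MD5 digest has the same length no matter how large the input is. Assuming `convert(..., 2)` emits two hex characters per digest byte with no `0x` prefix, this suggests that `varchar(50)` bounds the 32-character output rather than the size of the hashed field. The `md5_hex` helper is a hypothetical name used only for this illustration.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Hex-encoded MD5 digest of a string (16 digest bytes, two hex chars each)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


short_hash = md5_hex("a")
long_hash = md5_hex("a" * 10_000)  # e.g. many concatenated columns

# Both lengths are identical: the digest size does not grow with the input.
print(len(short_hash), len(long_hash))
```

Whether the input passed to `hashbytes` is itself truncated is a separate question that depends on the type of `{{field}}`, which this sketch does not cover.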

alittlesliceoftom · 2021-01-13T07:16:37Z

.gitmodules

@@ -1,3 +1,4 @@
 [submodule "dbt-utils"]
 	path = dbt-utils
 	url = https://github.com/fishtown-analytics/dbt-utils
+	branch = master


Just noting this may result in changes in dbt-utils breaking things on our side without an explicit commit as not pegged to a certain version.

alittlesliceoftom · 2021-01-13T07:23:10Z

Hey @swanderz ,

This all looks really good. I haven't marked last 4 (test files) as 'viewed' as I didn't fully understand them as I haven't worked with those macros. But I definitely think this is all good to pull in.

You might consider adding some docstrings to the tests to help others (and yourself when you forget what you did) to understand them as I think they're quite complex to reason about if looking at from fresh

mikaelene · 2021-01-13T07:25:53Z

> @mikaelene what do you think?

I think this looks great on an overall level. Don't have time to go through all but really great work!

dataders · 2021-01-14T01:13:39Z

> Hey @swanderz ,
> This all looks really good. I haven't marked last 4 (test files) as 'viewed' as I didn't fully understand them as I haven't worked with those macros. But I definitely think this is all good to pull in.
>
> You might consider adding some docstrings to the tests to help others (and yourself when you forget what you did) to understand them as I think they're quite complex to reason about if looking at from fresh

Some context about the bottom 3 schema_test/ macros you refer to. One of the coolest things about dbt-utils is its integration tests, which are really unit tests. For many macros, there are two seed models: an input and an output. The testing process is that the macro is run on the input model, and then the result is compared to the output seed.
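The input-seed / output-seed process described above can be sketched in a few lines of Python. This is only an analogy for how the comparison works, not how dbt executes it; `check_macro_against_seeds` and the `split_part` stand-in are hypothetical names, and the seed rows are made up for the example.

```python
def check_macro_against_seeds(macro, input_rows, expected_rows):
    """Apply `macro` to each input row and return the rows whose result
    differs from the expected output (an empty list means the test passes)."""
    if len(input_rows) != len(expected_rows):
        raise ValueError("input and expected seeds must have the same row count")
    failures = []
    for row, expected in zip(input_rows, expected_rows):
        actual = macro(row)
        if actual != expected:
            failures.append({"input": row, "actual": actual, "expected": expected})
    return failures


# Hypothetical stand-in for a macro under test: 1-based part lookup.
def split_part(row):
    return row["text"].split(row["delimiter"])[row["part"] - 1]


input_seed = [
    {"text": "a|b|c", "delimiter": "|", "part": 1},
    {"text": "a|b|c", "delimiter": "|", "part": 3},
]
expected_seed = ["a", "c"]

print(check_macro_against_seeds(split_part, input_seed, expected_seed))
```

The point of the pattern is that the expected output is fixed data, so a ported macro passes only if it reproduces the same results as the original on every seeded row.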

The challenge is that some arguments defined in schema tests for some testing models in dbt-utils/integration_tests/models/schema_tests/schema.yml use things that are not supported by TSQL, such as <> for not equal, or something = false (TSQL doesn't have true or false values). So rather than making a PR to dbt-utils and changing the args, it's better to just capture them inside of our dispatched macros and swap them for the TSQL equivalent.

TL;DR the changes are just to make the integration testing work, and shouldn't ever affect end users unless I'm missing something
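As a rough illustration of the "capture and swap" approach described in this comment, here is a Python sketch that rewrites boolean-literal comparisons into the 0/1 form that a T-SQL `bit` column accepts. The function name `to_tsql_condition` is hypothetical, and the actual dispatched macros in this PR may handle the arguments differently.

```python
import re

# Boolean literals mapped to the values a T-SQL bit column uses.
_BOOL_LITERALS = {"true": "1", "false": "0"}
_BOOL_COMPARISON = re.compile(r"=\s*(true|false)\b", flags=re.IGNORECASE)


def to_tsql_condition(condition: str) -> str:
    """Rewrite `= true` / `= false` comparisons as `= 1` / `= 0`."""
    return _BOOL_COMPARISON.sub(
        lambda match: "= " + _BOOL_LITERALS[match.group(1).lower()],
        condition,
    )


print(to_tsql_condition("is_active = false"))
```

Doing the swap inside the adapter-specific macro keeps the upstream schema.yml untouched, which matches the goal of not affecting other dbt-utils users.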

dataders added 2 commits January 6, 2021 03:37

dbt-utils macros now support dispatch

e68b95c

fix this test as well

41547e2

dataders mentioned this pull request Jan 7, 2021

SCRATCH: really for dbt-msft/tsql-utils#17 dataders/dbt-utils#1

Closed

dataders added 2 commits January 6, 2021 23:15

TEMP move to theoretical branch

4cfc406

shorten list of broken macros

cb788d7

dataders added 8 commits January 7, 2021 00:08

must be string for comparison

6f205e3

synapse does not yet support

96e29db

fixed by dbt-util dispatchification

f47fb18

synapse does not support

31151a0

clean up

1aeb356

downstream fix

72f422f

update submodule

a37e6c8

WHY PYTHON WHY!!!

47711de

inform users why

9c43f6b

dataders mentioned this pull request Jan 8, 2021

communicte what macros are not supported by tsql-utils #19

Open

dataders added 3 commits January 7, 2021 21:51

disable until #18 is finished

4230be2

unsupported

008ba9c

synapse does not support timestamp

270e953

dataders mentioned this pull request Jan 8, 2021

empty strings in seeds should be inserted as NULLs microsoft/dbt-synapse#36

Closed

dataders added 2 commits January 8, 2021 01:48

replacement for 'limit 0'

37449a9

hail mary

50811e8

dataders added 2 commits January 8, 2021 11:38

add limit zero to dispatch namespace

88a4beb

disabled due to other issue

80ca9b7

dataders added 6 commits January 10, 2021 11:58

correct macro directory

fff18d3

wrong section

74f9d84

changes are in dbt-utils now!

6eb5af3

pull from master w/ new PRs merged

a274666

perhaps these work now

04c9234

still unported

f46b793

dataders marked this pull request as ready for review January 11, 2021 16:56

dataders requested a review from alittlesliceoftom January 11, 2021 16:56

dataders added 2 commits January 11, 2021 23:05

Merge branch 'master' of https://github.com/dbt-msft/tsql-utils into post-dispatchify-PR

f92d4b5

dbt-utils default arg is now TSQL compatible

0217691

alittlesliceoftom approved these changes Jan 13, 2021

dataders added 2 commits January 13, 2021 16:19

more context

49c41a3

pin to latest release

9449055

dataders merged commit cede755 into master Jan 14, 2021

dataders deleted the post-dispatchify-PR branch November 12, 2021 06:24
