Array macros #5823

dbeatty10 · 2022-09-13T04:11:08Z

related to #5520

Description

All the related pull requests:

Checklist

I have read the contributing guide and understand what's expected of me
I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
- Update list of cross-database macros docs.getdbt.com#2049
I have run changie new to create a changelog entry

dbeatty10 · 2022-09-22T01:24:41Z

This is ready for review.

After it is approved and merged, I will revert the changes to dev-requirements.txt in each of the adapter PRs before merging each of them:

colin-rogers-dbt · 2022-09-22T22:50:17Z

tests/adapter/dbt/tests/adapter/utils/fixture_array_append.py

+models__array_append_expected_sql = """
+select 1 as id, {{ array_construct([1,2,3,4]) }} as array_col union all
+select 2 as id, {{ array_construct([4]) }} as array_col
+""".lstrip()


what is the .lstrip() for?

Good question! In this particular case, it isn't needed, and I can remove it :)

More detail:
.lstrip() strips off the leading whitespace so that there isn't an empty newline at the beginning of this "file" (string that ends up being materialized as a file for pytest'ing purposes).

It's useful for seed CSV "files" in particular because it allows the text to be more readable. You can see examples with git grep "lstrip" ./.
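For anyone skimming later, here is a minimal sketch of the effect described above. The fixture names and CSV contents are made up for illustration and are not taken from this PR.

```python
# Hypothetical fixture strings, for illustration only.
seed_csv_raw = """
id,value
1,a
2,b
"""

# Same text, but with the leading newline removed so the "file" starts
# directly with the header row.
seed_csv_stripped = """
id,value
1,a
2,b
""".lstrip()

# Without lstrip(), the first line of the materialized file is empty.
print(repr(seed_csv_raw.splitlines()[0]))       # ''
print(repr(seed_csv_stripped.splitlines()[0]))  # 'id,value'
```

Starting the content on the line after the opening triple quote keeps the rows aligned and readable in the Python source, and lstrip() removes the resulting empty first line.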

thanks for the explanation!

colin-rogers-dbt · 2022-09-22T22:50:58Z

tests/adapter/dbt/tests/adapter/utils/base_array_utils.py

+        # check types equal
+        expected_cols = get_relation_columns(project.adapter, "expected")
+        actual_cols = get_relation_columns(project.adapter, "actual")
+        print(f"Expected: {expected_cols}")


do we need to print these?

Not strictly necessary. Copied from here.

Having this printing in here is particularly handy if there is a test failure in the future. Definitely came in handy for me during development too.

In general, leaving print statements in can fill debug logs with extraneous output that you then have to sift through, so I'd recommend removing it (or just commenting it out, so anyone who wants it for debugging can simply uncomment it).

I think print() statements in pytest tests only go to standard output (stdout) in two cases:

when there is a test failure

if you disable output capturing with the -s command-line option (shorthand for --capture=no, like pytest ... -s)

Otherwise, pytest "captures" all output so that you don't see anything other than the test summary.
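A small sketch of the capture behavior, assuming a stand-in helper (fake_relation_columns is hypothetical and only replaces get_relation_columns(project.adapter, ...) so the example is self-contained):

```python
def fake_relation_columns(relation_name):
    # Hypothetical stand-in for get_relation_columns(project.adapter, ...).
    return [("id", "integer"), ("array_col", "integer[]")]


def test_types_equal():
    expected_cols = fake_relation_columns("expected")
    actual_cols = fake_relation_columns("actual")
    # Captured by pytest by default; shown only if the test fails
    # or if capturing is disabled with -s.
    print(f"Expected: {expected_cols}")
    print(f"Actual: {actual_cols}")
    assert expected_cols == actual_cols
```

Saved in a test file and run with plain pytest, a passing run prints nothing from these two lines. Running with pytest -s shows them immediately, and a failing run includes them in the "Captured stdout call" section of the failure report.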

colin-rogers-dbt · 2022-09-22T22:52:46Z

core/dbt/include/global_project/macros/utils/array_construct.sql

+{# all inputs must be the same data type to match postgres functionality #}
+{% macro default__array_construct(inputs, data_type) -%}
+    {% if inputs|length > 0 %}
+    array[ {{ inputs|join(' , ') }} ]


style nit: do we generally indent within an if block?

I don't know whether we have a style guide for internal Jinja or not. This particular implementation was copy-pasted from dbt-utils as-is.

colin-rogers-dbt · 2022-09-22T22:53:44Z

core/dbt/include/global_project/macros/utils/array_construct.sql

+    {% if inputs|length > 0 %}
+    array[ {{ inputs|join(' , ') }} ]
+    {% else %}
+    array[]::{{data_type}}[]


looks like the adapter-specific PRs don't have the functionality to return an empty array?

This dbt-core PR includes multiple instances of array_construct([]), namely within these tests:

tests/adapter/dbt/tests/adapter/utils/fixture_array_construct.py

tests/adapter/dbt/tests/adapter/utils/fixture_array_concat.py

tests/adapter/dbt/tests/adapter/utils/test_array_append.py

I was hoping that these are sufficient to establish that the adapters handle the base case of an empty array. But if you take a look at those and it doesn't seem sufficient, then let's discuss it further to update the tests accordingly!

as long as we're validating via tests that we don't need to implement the if condition for each adapter, that's good with me
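To make the two branches concrete, here is a rough Python mirror of the branching in default__array_construct shown in the diff above. render_array_construct is a hypothetical helper written for illustration; it is not part of dbt, and it does not reproduce Jinja's exact whitespace.

```python
def render_array_construct(inputs, data_type):
    """Hypothetical Python mirror of the default__array_construct branches."""
    if len(inputs) > 0:
        # Non-empty case: array[ 1 , 2 , 3 ]
        return "array[ " + " , ".join(str(i) for i in inputs) + " ]"
    # Empty case: the element type cannot be inferred, so cast explicitly.
    return f"array[]::{data_type}[]"


print(render_array_construct([1, 2, 3, 4], "integer"))  # array[ 1 , 2 , 3 , 4 ]
print(render_array_construct([], "integer"))            # array[]::integer[]
```

The empty case is the one the array_construct([]) fixtures exercise: an adapter whose override has no explicit branch still has to produce a valid empty array of the requested type.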

colin-rogers-dbt · 2022-09-22T22:55:21Z

core/dbt/include/global_project/macros/utils/array_concat.sql

+{%- endmacro %}
+
+{% macro default__array_concat(array_1, array_2) -%}
+    array_cat({{ array_1 }}, {{ array_2 }})


if array_concat is more common should we use that as the default instead of postgres syntax?

Between Postgres, Snowflake, BigQuery, and Redshift, it's six of one, half a dozen of the other.

array_cat - Postgres

array_cat - Snowflake

array_concat - BigQuery

array_concat - Redshift

Spark/Databricks refuses to break the tie by using a unique function name for array concatenation:

concat - Spark

I don't think this function name is covered in the SQL standard either.

Original implementations in dbt-utils

In an alternate universe:

array_cat

such is life (and sql standards)!
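The split listed above can be summarized as a lookup table. This is only an illustrative sketch: the dictionary and render_array_concat are hypothetical names, and the real implementation relies on per-adapter macro overrides rather than a Python mapping.

```python
# Function names per adapter, as listed in this thread.
ARRAY_CONCAT_FUNCTION = {
    "postgres": "array_cat",
    "snowflake": "array_cat",
    "bigquery": "array_concat",
    "redshift": "array_concat",
    "spark": "concat",
}


def render_array_concat(adapter, array_1, array_2):
    """Hypothetical helper: pick the adapter's function, else the default."""
    # Unknown adapters fall back to array_cat, matching default__array_concat.
    function_name = ARRAY_CONCAT_FUNCTION.get(adapter, "array_cat")
    return f"{function_name}({array_1}, {array_2})"


print(render_array_concat("bigquery", "a", "b"))  # array_concat(a, b)
print(render_array_concat("duckdb", "a", "b"))    # array_cat(a, b)
```

Whichever name is chosen as the default, roughly half of these adapters need an override, which is why the choice is close to arbitrary.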

colin-rogers-dbt · 2022-09-22T22:58:52Z

tests/adapter/dbt/tests/adapter/utils/fixture_array_construct.py

@@ -0,0 +1,12 @@
+# array_construct
+
+models__array_construct_expected_sql = """


not sure how other folks feel, but the Python standard for constants is upper case (per PEP 8). As I understand it, the previous convention was based on emulating how we match macros in Jinja, but since we're not doing that here, it's not necessary.

The lowercase capitalization you see here is consistent with the tests I'm familiar with. You can see a sample of those existing tests via git grep '_sql = """' ./.

In terms of broader compliance with the PEP 8 style guide, we do use flake8 checks in CI and in pre-commit checks. I haven't read the nitty gritty details of what flake8 does/doesn't cover though. You can see our flake8 config here.

colin-rogers-dbt

style/testing nits but overall LGTM

dbeatty10 added 2 commits September 12, 2022 20:30

Helper macro to cast from array to string

0e1e476

Default implementations and tests for array macros

a3521bb

dbeatty10 added the Skip Changelog Skips GHA to check for changelog file label Sep 13, 2022

cla-bot bot added the cla:yes label Sep 13, 2022

dbeatty10 added 2 commits September 12, 2022 22:18

Trim Trailing Whitespace

79f6381

Changelog entry

8995a31

dbeatty10 removed the Skip Changelog Skips GHA to check for changelog file label Sep 13, 2022

jtcohen6 added the Team:Adapters Issues designated for the adapter area of the code label Sep 13, 2022

Merge branch 'main' into dbeatty/lift-shift-array-macros

28b1039

dbeatty10 mentioned this pull request Sep 14, 2022

[CT-1110] [Feature] Cross-database macro for type_boolean() #5739

Closed

3 tasks

dbeatty10 added 4 commits September 21, 2022 15:41

Remove dependence upon cast_array_to_string macro

cef387a

pre-commit fixes

94ea3eb

Remove cast_array_to_string macro

549c49f

pre-commit fix

e0a2ee4

This was referenced Sep 21, 2022

Update list of cross-database macros dbt-labs/docs.getdbt.com#2049

Closed

Array macros dbt-labs/dbt-redshift#182

Merged

Array macros dbt-labs/dbt-snowflake#257

Merged

Array macros dbt-labs/dbt-bigquery#308

Merged

Array macros dbt-labs/dbt-spark#454

Merged

Trivial direct test; array_concat/append test non-trivially indirectly

2148abf

dbeatty10 marked this pull request as ready for review September 22, 2022 01:22

dbeatty10 requested review from a team as code owners September 22, 2022 01:22

dbeatty10 requested review from VersusFacit and stu-k September 22, 2022 01:22

dbeatty10 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Sep 22, 2022

colin-rogers-dbt reviewed Sep 22, 2022

dbeatty10 added 2 commits September 23, 2022 06:35

Remove vestigial lstrip

eb12105

Merge branch 'main' into dbeatty/lift-shift-array-macros

71a3a2e

dbeatty10 requested a review from colin-rogers-dbt September 23, 2022 17:14

colin-rogers-dbt approved these changes Sep 23, 2022

dbeatty10 merged commit 0487b96 into main Sep 26, 2022

dbeatty10 deleted the dbeatty/lift-shift-array-macros branch September 26, 2022 19:40
