Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement type_boolean macro #5875

Merged
merged 2 commits into from Sep 21, 2022
Merged

implement type_boolean macro #5875

merged 2 commits into from Sep 21, 2022

Conversation

jpmmcneill
Copy link
Contributor

@jpmmcneill jpmmcneill commented Sep 18, 2022

resolves (partially) #5739

Description

dbt-core component required for #5739.

Other PRs

Checklist

@jpmmcneill jpmmcneill requested a review from a team as a code owner September 18, 2022 21:31
@cla-bot cla-bot bot added the cla:yes label Sep 18, 2022
@@ -59,7 +59,7 @@ The TIMESTAMP_* variation associated with TIMESTAMP is specified by the TIMESTAM
{{ return(api.Column.translate_type("float")) }}
{% endmacro %}

{# numeric ------------------------------------------------ #}
{# numeric ------------------------------------------------- #}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every other comment had this many dashes 🙃

@dbeatty10
Copy link
Contributor

@jpmmcneill this is looking good to me! 🤩

Here's the next steps for this PR:

  1. Run changie new per these instructions and commit and push the resulting file.
  2. Open up pull requests for dbt-redshift, dbt-snowflake, and dbt-spark that add your new test to their test suites.
    • Unfortunately, new tests aren't inherited and executed automatically -- they need to be added explicitly.

I'll add some further instructions on your PR for dbt-bigquery on how to point at this PR branch for the purposes of automated CI tests in GitHub. (We'll want the adapters to be using your PR branch rather than the main branch.)

Once all of the PRs are passing the tests in GitHub Actions, we will merge dbt-core first. Then we will run CI once more for each adapter (pointing back at main) before merging them.

@jtcohen6 jtcohen6 added Team:Adapters Issues designated for the adapter area of the code ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering labels Sep 19, 2022
@jpmmcneill jpmmcneill marked this pull request as ready for review September 19, 2022 23:09
@jpmmcneill jpmmcneill requested a review from a team as a code owner September 19, 2022 23:09
Comment on lines +4 to +10
seeds__expected_csv = """boolean_col
True
""".lstrip()

models__actual_sql = """
select cast('True' as {{ type_boolean() }}) as boolean_col
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to cover both boolean options at the very least.

Suggested change
seeds__expected_csv = """boolean_col
True
""".lstrip()
models__actual_sql = """
select cast('True' as {{ type_boolean() }}) as boolean_col
"""
seeds__expected_csv = """boolean_col
True
False
""".lstrip()
models__actual_sql = """
select cast('True' as {{ type_boolean() }}) as boolean_col
union all
select cast('False' as {{ type_boolean() }}) as boolean_col
"""

Even better

But test cases covering the truth tables for conjunction, disjunction, and negation would be even better. That way, we are verifying that everything is acting like booleans.

image image image

Something like the following untested code:

seeds__boolean_permutations_csv = """
x,y
False,False
True,False
False,True
True,True
""".lstrip()

seeds__expected_csv = """
x,y,conjunction,disjunction,negation_x
False,False,False,False,True
True,False,False,True,False
False,True,False,True,True
True,True,True,True,False
""".lstrip()

models__actual_sql = """
select
    x,
    y,
    x and y as conjunction,
    x or y as disjunction,
    not x as negation_x
from {{ ref("boolean_permutations" }}

Best? 🤷

Even though BOOLEAN is a data type described in the SQL standard, some databases don't have it (looking at you, SQL Server!).

If we are feeling extra magnanimous, we could change all the True/False values in the seeds to be 1/0 instead.

I'm hoping it wouldn't be necessary, but we could update the models__actual_sql definition so that x/y values are replaced with the following instead:

  • cast(x as {{ type_boolean() }})
  • cast(y as {{ type_boolean() }})

Copy link
Contributor

@dbeatty10 dbeatty10 Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update:

The "Even better" example I gave above didn't actually test the type_boolean() macro at all! 😅

Something like this should fix that situation:

models__actual_sql = """
select
    cast(x as {{ type_boolean() }}) as x_bool,
    cast(y as {{ type_boolean() }}) as y_bool,
    cast(x as {{ type_boolean() }}) and cast(y as {{ type_boolean() }}) as conjunction,
    cast(x as {{ type_boolean() }}) or cast(y as {{ type_boolean() }}) as disjunction,
    not cast(x as {{ type_boolean() }}) as negation_x
from {{ ref("boolean_permutations" }}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers @dbeatty10. I'll take a second pass at this later today :)

Copy link
Contributor

@dbeatty10 dbeatty10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpmmcneill I just realized that your simple pytest for type_boolean() was exactly in line with the rest of the type_x macros.

So I'm approving as-is.

After this PR is merged, then the next steps will be to restore the original dev-requirements.txt files in each of your adapter PRs.

@jpmmcneill
Copy link
Contributor Author

@jpmmcneill I just realized that your simple pytest for type_boolean() was exactly in line with the rest of the type_x macros.

So I'm approving as-is.

After this PR is merged, then the next steps will be to restore the original dev-requirements.txt files in each of your adapter PRs.

Hey @dbeatty10 - thanks. I personally completely agree with the sentiment around the current level of testing :).

Do you agree with me that an issue that basically scopes "improve the current test coverage for data types" would be welcome?

@jpmmcneill jpmmcneill deleted the jpmmcneill/type-boolean branch September 22, 2022 12:26
@dbeatty10
Copy link
Contributor

Do you agree with me that an issue that basically scopes "improve the current test coverage for data types" would be welcome?

Formally logging where expectations differs from reality is an important mechanism for us. A new issue describing the current level of testing and how that compares to what your expectations were as a contributor (here) and dbt package maintainer (here) would be great!

From there, someone from dbt Labs (maybe me!) will triage the issue submission and give feedback, determine urgency, compare to current roadmap and capacity, etc.

@jpmmcneill
Copy link
Contributor Author

Brill, will do. Thanks @dbeatty10 🐐

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering Team:Adapters Issues designated for the adapter area of the code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants