Fix postgres__regexp_instr not validating regex #249 #250

Merged

Conversation

lookslikeitsnot
Contributor

What does this PR do?

Fix regex match succeeding for any pattern in Postgres (#249)
Add expected-to-fail tests for regex (#207)

Change description

Coalesce array_length with 0 so the expression is not null when no match exists. The null made the WHERE condition in the tests' error check always falsy, so no row was ever reported as a failure.
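
A minimal sketch of the null behavior described above, using Python's built-in sqlite3 as a stand-in for Postgres. The table `t` and the column `match_len` are hypothetical; `match_len` plays the role of the array_length result, which is null when there is no match. This is not the macro's actual SQL, only an illustration of the three-valued logic involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: match_len stands in for array_length(...),
# which is NULL when the regex finds no match.
conn.execute("create table t (email text, match_len integer)")
conn.executemany(
    "insert into t values (?, ?)",
    [("a@b.com", 1), ("no-match", None)],
)

# Without coalesce: NULL > 0 is NULL, and not(NULL) is still NULL,
# so the non-matching row is never selected as a failure.
failing_without = conn.execute(
    "select email from t where not (match_len > 0)"
).fetchall()

# With coalesce: 0 > 0 is false, not(false) is true,
# so the non-matching row is reported.
failing_with = conn.execute(
    "select email from t where not (coalesce(match_len, 0) > 0)"
).fetchall()

print(failing_without)
print(failing_with)
```

Only the coalesced query returns the row that has no match.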

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Where has this been tested?

  • OS: Manjaro Linux 22.0.2
  • Python: 3.10.9
  • dbt: 1.4.5
  • dbt-expectations: 0.8.3
  • PostgreSQL: 14.6

@clausherther
Contributor

Thanks for the PR!
I'll look at this in more detail tomorrow, but it looks like a test failed on BigQuery:
https://app.circleci.com/pipelines/github/calogica/dbt-expectations/229/workflows/09eac6bd-ccaf-46b2-af4a-e2fe299a5fad/jobs/143?invite=true#step-106-354

@clausherther
Contributor

I think this (negative) test fails on BigQuery

- dbt_expectations.expect_column_values_to_not_match_regex_list:
    regex_list: ["[A-Z]", "[0-9]"]
    flags: i
    match_on: all
    config:
      error_if: "=0"
      warn_if: "<4"

because this expression returns true for all rows

select 
  regexp_instr(email_address, '[A-Z]', 1, 1) = 0
  and 
  regexp_instr(email_address, '[0-9]', 1, 1) = 0
  as expression
from dbt_expectations_integration_tests.data_text
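
A rough illustration of why that expression is true for every row when the flag is ignored, using Python's re module rather than BigQuery. The sample values in `emails` and the helper `not_match_all` are hypothetical; the real contents of data_text are not shown in this thread.

```python
import re

# Hypothetical sample values: lowercase, no digits.
emails = ["ab@gmail.com", "cd@yahoo.com"]
patterns = ["[A-Z]", "[0-9]"]

def not_match_all(value, patterns, flags=0):
    # Mirrors "regexp_instr(...) = 0 and regexp_instr(...) = 0":
    # true only if none of the patterns match.
    return all(re.search(p, value, flags) is None for p in patterns)

# Flag ignored (case-sensitive): [A-Z] never matches lowercase text.
case_sensitive = [not_match_all(e, patterns) for e in emails]

# Flag honored (case-insensitive): [A-Z] matches the lowercase letters.
case_insensitive = [not_match_all(e, patterns, re.IGNORECASE) for e in emails]

print(case_sensitive)
print(case_insensitive)
```

With case-sensitive matching no row fails, so a negative test that requires at least one failure (error_if: "=0") errors out.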

@lookslikeitsnot
Contributor Author

lookslikeitsnot commented Mar 27, 2023

Sorry about that, I missed the "The flags option is not supported for BigQuery and is being ignored." comment and forgot to add enabled: "{{ target.type in ['postgres', 'snowflake', 'redshift' ] }}" to the new tests that use flags.

This leads me to another question (out of scope of this ticket but might be nice to have): since BigQuery uses re2 and re2 supports flags by prepending the regex with (?flags:re), could we implement that?
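
A sketch of what the prefixing idea could look like, assuming a hypothetical helper named `with_inline_flags`. Python's re module is used here only to check the wrapped pattern; it accepts the same scoped `(?flags:re)` form for a flag like `i`, but it is not RE2, and nothing below is part of dbt-expectations.

```python
import re

def with_inline_flags(pattern: str, flags: str) -> str:
    """Hypothetical helper: wrap a pattern in a scoped flag group."""
    if not flags:
        return pattern
    return f"(?{flags}:{pattern})"

wrapped = with_inline_flags("[A-Z]", "i")
print(wrapped)

# The wrapped pattern matches lowercase input; the bare one does not.
print(re.search(wrapped, "abc") is not None)
print(re.search("[A-Z]", "abc") is not None)
```

In a macro, the same string concatenation would presumably happen before the pattern is passed to the BigQuery regex function.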

@clausherther
Contributor

This leads me to another question (out of scope of this ticket but might be nice to have): since BigQuery uses re2 and re2 supports flags by prepending the regex with (?flags:re), could we implement that?

Sounds good! I'm not super hip re: Regex options and flags, but if that'd be useful, I'd say go ahead and open an issue for it and we'll go from there. Thanks!

@clausherther clausherther self-requested a review March 27, 2023 22:19