GE can't parse a row condition in a conditional expectation #6408

Closed
Vitalis-Tk opened this issue Nov 21, 2022 · 7 comments · Fixed by #7313
Labels
community devrel This item is being addressed by the Developer Relations Team triage Used by the GE core team to flag issues that were not yet triaged

Comments

@Vitalis-Tk

Vitalis-Tk commented Nov 21, 2022

Describe the bug
GE can't parse a row condition in a conditional expectation when the condition compares a column against a string literal and that string contains whitespace.

To Reproduce
First Case: (ExpectationConfiguration in json format)
{
"expectation_type": "expect_column_values_to_not_be_null",
"column": "column_A",
"condition_parser": "great_expectations__experimental__",
"mostly": 1.0,
"row_condition": "col(\"communication_type_desc_orig\") == \"Email\""
}
Result: the expectation executes without any issues.

Second Case: (ExpectationConfiguration in json format)
{
"expectation_type": "expect_column_values_to_not_be_null",
"column": "column_B",
"condition_parser": "great_expectations__experimental__",
"mostly": 1.0,
"row_condition": "col(\"communication_type_desc_orig\")==\"Secure Message\""
}

Result: The expectation fails to execute. Error: unable to parse condition: col("communication_type_desc_orig")=="Secure Message"
Stacktrace:
File "/usr/local/lib/python3.9/site-packages/great_expectations/expectations/row_conditions.py", line 111, in _parse_great_expectations_condition
return condition.parseString(row_condition)
File "/usr/local/lib/python3.9/site-packages/pyparsing/core.py", line 1141, in parse_string
raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected {{Combine:({Suppress:('col(\\\"') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('\\\")')}) '.NOTNULL()'} ^ {Combine:({Suppress:('col(\\\"') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('\\\")')}) {'>' ^ '<' ^ '>=' ^ '<=' ^ '=='} {Re:('[+-]?\\\\d+(?:\\\\.\\\\d*)?(?:[eE][+-]?\\\\d+)?') ^ {Suppress:('\\\"') W:(.0-9A-Z_a-z) Suppress:('\\\"')} ^ {Suppress:(\\\"'\\\") W:(.0-9A-Z_a-z) Suppress:(\\\"'\\\")}}}}, found 'Message' (at char 45), (line:1, col:46)
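
The grammar printed in the ParseException above lists the characters accepted inside a quoted value as `.0-9A-Z_a-z`, with no space. The following is a minimal sketch that approximates that printed grammar with the standard-library `re` module, to show why "Email" is accepted and "Secure Message" is not. It is an illustration only, not Great Expectations' actual parser; the names `CONDITION` and `parses` are hypothetical.

```python
import re

# Character classes transcribed from the ParseException text above.
# This regex only approximates the printed grammar for illustration.
COLUMN = r'col\("[A-Za-z][A-Za-z0-9_.]*"\)'      # col("name"), no spaces in name
OPERATOR = r"(?:>=|<=|==|>|<)"                    # the five operators listed
NUMBER = r"[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?"   # numeric literal
WORD = r"[A-Za-z0-9_.]+"                          # quoted value body, no spaces
VALUE = "(?:" + NUMBER + '|"' + WORD + '"|' + "'" + WORD + "'" + ")"

CONDITION = re.compile(
    r"\s*" + COLUMN + r"\s*(?:\.NOTNULL\(\)|" + OPERATOR + r"\s*" + VALUE + r")\s*"
)


def parses(row_condition: str) -> bool:
    """Return True if the condition fits the approximated grammar."""
    return CONDITION.fullmatch(row_condition) is not None


print(parses('col("communication_type_desc_orig") == "Email"'))
print(parses('col("communication_type_desc_orig")=="Secure Message"'))
```

Under this approximation the first condition matches and the second does not, because the space in "Secure Message" falls outside the value character class.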

Expected behavior
The expectation executes without any errors.

Environment (please complete the following information):

  • Operating System: Linux
  • Great Expectations Version: 0.15.31
  • Execution Engine: SqlAlchemyExecutionEngine
@zvpanchal

zvpanchal commented Nov 28, 2022

Hi,
I am having a similar issue. I am trying the same approach against a Redshift database, using row_condition with condition_parser as shown below, but I keep getting an "unable to parse condition" error:

expectation

    {
      "expectation_type": "expect_column_values_to_be_between",
      "kwargs": {
        "column": "current lvr",
        "max_value": 100,
        "min_value": 1,
        "row_condition": "col(\"number of days in advance\")<=0",
        "condition_parser": "great_expectations__experimental__"
      },
      "meta": {}
    }

Error:

"exception_message": "unable to parse condition: col(\"number of days in advance\")<=0",
"exception_traceback": "Traceback (most recent call last):
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/expectations/row_conditions.py\", line 111, in _parse_great_expectations_condition
return condition.parseString(row_condition)
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pyparsing/core.py\", line 1141, in parse_string
raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected {{Combine:({Suppress:('col(\"') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('\")')}) '.NOTNULL()'} ^ {Combine:({Suppress:('col(\"') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('\")')}) {'>' ^ '<' ^ '>=' ^ '<=' ^ '=='} {Re:('[+-]?\\d+(?:\\.\\d*)?(?:[eE][+-]?\\d+)?') ^ {Suppress:('\"') W:(.0-9A-Z_a-z) Suppress:('\"')} ^ {Suppress:(\"'\") W:(.0-9A-Z_a-z) Suppress:(\"'\")}}}}, found ' '  (at char 11), (line:1, col:12)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/execution_engine/execution_engine.py\", line 421, in resolve_metrics
] = self.resolve_metric_bundle(metric_fn_bundle)
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/execution_engine/sqlalchemy_execution_engine.py\", line 996, in resolve_metric_bundle
selectable: Any = self.get_domain_records(
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/execution_engine/sqlalchemy_execution_engine.py\", line 616, in get_domain_records
parsed_condition = parse_condition_to_sqlalchemy(
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/expectations/row_conditions.py\", line 152, in parse_condition_to_sqlalchemy
parsed = _parse_great_expectations_condition(row_condition)
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/expectations/row_conditions.py\", line 113, in _parse_great_expectations_condition
raise ConditionParserError(f\"unable to parse condition: {row_condition}\")
great_expectations.expectations.row_conditions.ConditionParserError: unable to parse condition: col(\"number of days in advance\")<=0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/validator/validation_graph.py\", line 177, in resolve_validation_graph
self._execution_engine.resolve_metrics(
File \"/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/great_expectations/execution_engine/execution_engine.py\", line 424, in resolve_metrics
raise ge_exceptions.MetricResolutionError(
great_expectations.exceptions.exceptions.MetricResolutionError: unable to parse condition: col(\"number of days in advance\")<=0
",

The issue occurs when the column name or the value being compared contains a space. A column name without a space (e.g. current_id) works fine, but a column name containing a space fails with the error above.

Could someone please help me with this issue?
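
Going by the character classes printed in the exception above (letters, digits, `_` and `.`), a rough pre-flight check could flag column names or string values that the parser would reject before a validation run. This is a sketch under that assumption; `unsupported_characters` is a hypothetical helper, not part of Great Expectations.

```python
import string
from typing import List

# Characters the printed grammar accepts inside col("...") and quoted values.
# Assumption: taken from the exception text, not from the library source.
ALLOWED = set(string.ascii_letters + string.digits + "_.")


def unsupported_characters(token: str) -> List[str]:
    """Return the distinct characters in token that fall outside ALLOWED."""
    return sorted({ch for ch in token if ch not in ALLOWED})


for token in ("number of days in advance", "current_id", "Secure Message"):
    print(repr(token), unsupported_characters(token))
```

An empty list means the token stays within the printed character classes; a non-empty list names the characters that would trigger the parse error.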

@austiezr austiezr added triage Used by the GE core team to flag issues that were not yet triaged community devrel This item is being addressed by the Developer Relations Team labels Dec 1, 2022
@austiezr
Contributor

austiezr commented Dec 1, 2022

Hey @Vitalis-Tk & @zvpanchal ! Thanks for reaching out. I've been able to reproduce the behavior you're experiencing here, and will be discussing this functionality internally with the team in the coming week.

@zvpanchal

zvpanchal commented Feb 6, 2023

Hi @austiezr - Is there any update on this issue?

Using the <> operator in row_condition throws a parsing error. Is there a workaround for using a not-equal operator in row_condition?

        "row_condition": "col(\"account_state\")<>'CLOSED'",
        "condition_parser": "great_expectations__experimental__"

Error that I get:

File "/.pyenv/versions/3.11.0/lib/python3.11/site-packages/pyparsing/core.py", line 1141, in parse_string
raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected {{Combine:({Suppress:('col("') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('")')}) '.NOTNULL()'} ^ {Combine:({Suppress:('col("') W:(A-Za-z, .0-9A-Z_a-z) Suppress:('")')}) {'>' ^ '<' ^ '>=' ^ '<=' ^ '=='} {Re:('[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?') ^ {Suppress:('"') W:(.0-9A-Z_a-z) Suppress:('"')} ^ {Suppress:("'") W:(.0-9A-Z_a-z) Suppress:("'")}}}}, found '>' (at char 21), (line:1, col:22)

I am using great_expectations v0.15.43 (v3 API)
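
The operator set printed in the exception above is `>`, `<`, `>=`, `<=` and `==`, so no not-equal form appears in it. That matches the error position: the parser accepts `<` and then finds `>` where it expects a value. A minimal sketch of a check against that printed set follows; `find_operator` is a hypothetical helper, not a Great Expectations API.

```python
import re
from typing import Optional

# Operators transcribed from the exception text for v0.15.43 (an assumption
# about the grammar, based only on that message).
SUPPORTED_OPERATORS = (">=", "<=", "==", ">", "<")


def find_operator(row_condition: str) -> Optional[str]:
    """Return the operator after col("...") if it is in the printed set."""
    match = re.match(r'\s*col\("[^"]*"\)\s*([<>=!]+)', row_condition)
    if match is None:
        return None
    operator = match.group(1)
    return operator if operator in SUPPORTED_OPERATORS else None


print(find_operator("col(\"account_state\")<>'CLOSED'"))
print(find_operator("col(\"account_state\")=='CLOSED'"))
```

The helper returns `None` for `<>` and `!=`, and returns the operator string for any of the five listed forms.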

@Shinnnyshinshin
Contributor

Shinnnyshinshin commented Mar 6, 2023

Hi @zvpanchal and @Vitalis-Tk, I wanted to provide an update: the Core Engineering team has picked up this issue. More updates to come very soon. Thank you for your patience and for your follow-up :)

@Shinnnyshinshin
Contributor

@zvpanchal and @Vitalis-Tk, PR #7313 should address the issue you were having and should be included in tomorrow's release :)

@zvpanchal

Hi @Shinnnyshinshin - This issue still occurs when either the column name or the value contains a space. Please refer to the issue I raised on Discourse. Could anyone please help?
