[FEATURE] ID/PK - `unexpected_index_list` updated to include actual unexpected value, and EVR to include `unexpected_index_column_names` #6586
`unexpected_index_list` includes actual unexpected value
`unexpected_index_list` updated to include actual unexpected value
if unexpected_index_column_names is not None:
    return_obj["result"].update(
        {"unexpected_index_column_names": unexpected_index_column_names}
    )
Ensuring `unexpected_index_column_names` is part of the Validation result object.
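As a rough illustration of the change discussed in this comment, the sketch below attaches the key to the result only when column names were supplied. The function name `build_result` and the sample values are hypothetical; only the `return_obj["result"].update(...)` pattern is taken from the diff.

```python
from typing import Any, Dict, List, Optional


def build_result(
    success: bool,
    unexpected_index_list: Optional[List[Dict[str, Any]]] = None,
    unexpected_index_column_names: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the formatting step shown in the diff."""
    return_obj: Dict[str, Any] = {"success": success, "result": {}}
    if unexpected_index_list is not None:
        return_obj["result"]["unexpected_index_list"] = unexpected_index_list
    # Mirrors the diff: add the key only when column names were provided.
    if unexpected_index_column_names is not None:
        return_obj["result"].update(
            {"unexpected_index_column_names": unexpected_index_column_names}
        )
    return return_obj


# Illustrative calls with made-up inputs.
with_names = build_result(False, [{"pk_1": 3, "animals": "giraffe"}], ["pk_1"])
without_names = build_result(True)
```

With this shape, a consumer of the result can check for the key's presence rather than handling a `None` value.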
* develop:
  - [FEATURE] Ensure `result_format` accessed is through Checkpoint, and warns users if `Expectation` or `Validator`-level (#6562)
  - [BUGFIX] Remove rendered header from Cloud-rendering tests (#6597)
  - [MAINTENANCE] Refactor `BaseDataContext` and `DataContext` into factory functions (#6531)
  - [MAINTENANCE] Utilize a `StrEnum` for `ConfigPeer` modes (#6596)
  - [BUGFIX] Use v3.3.6 or higher of google-cloud-bigquery (with shapely bugfix) (#6590)
  - [MAINTENANCE] Add docs snippet checker to `dev` CI (#6594)
  - [MAINTENANCE] Leverage `RendererConfiguration` in existing prescriptive templates (3 of 3) (#6530)
  - [BUGFIX] Support non-string `datetime` evaluation parameters (#6571)
  - [RELEASE] 0.15.41 (#6593)
  - [FEATURE] ZEP - PG `BatchSorter` loading + dumping (#6580)
  - [MAINTENANCE] Leverage `RendererConfiguration` in existing prescriptive templates (2 of 3) (#6488)
  - [MAINTENANCE] Remove `ExplorerDataContext` (#6592)
  - [MAINTENANCE] Small refactor of ExecutionEngine.resolve_metrics() for better code readability (and miscellaneous additional clean up) (#6587)
  - [MAINTENANCE] `mypy` config update (#6589)
  - [BUGFIX] convert_to_json_serializable does not accept numpy datetime (#6553)
  - [BUGFIX] Return unique list of batch_definitions (#6579)
  - [MAINTENANCE] typo in method name (#6585)
@@ -3164,6 +3172,7 @@ def _format_map_output(
     unexpected_list: Optional[List[Any]] = None,
     unexpected_index_list: Optional[List[int]] = None,
     unexpected_index_query: Optional[str] = None,
+    unexpected_index_column_names: Optional[Union[int, str, List[str]]] = None,
This is not entirely accurate, but is necessary to keep the typechecker happy (it's actually `Optional[List[str]]`).
I'm assuming yes, but have you tried removing the `int` and `str` from the union in this definition and all others? I'm surprised that doesn't work. If it doesn't, maybe a good idea is to put this note in a code comment.
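One way the suggested code comment could look, as a sketch only. The function `describe_index_columns` is hypothetical and not part of the codebase; the annotation is copied from the diff, and the runtime claim in the comment comes from the author's note in this thread.

```python
from typing import List, Optional, Union


def describe_index_columns(
    # NOTE: at runtime this is expected to be Optional[List[str]]; the wider
    # Union is kept only to satisfy the type checker (per the review thread).
    unexpected_index_column_names: Optional[Union[int, str, List[str]]] = None,
) -> List[str]:
    """Hypothetical helper: narrow the wide annotation to a list of names."""
    if unexpected_index_column_names is None:
        return []
    if isinstance(unexpected_index_column_names, list):
        return unexpected_index_column_names
    # Defensive fallback for the int/str members of the Union.
    return [str(unexpected_index_column_names)]
```

Keeping the note next to the parameter means anyone who later tries to tighten the annotation sees why it was left wide.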
Looking good! Just a few questions I'd like to address before approving.
@@ -2465,9 +2472,11 @@ def _sqlalchemy_map_condition_index(

     column_selector: List[sa.Column] = []
     all_table_columns: List[str] = metrics.get("table.columns")
-    unexpected_index_column_names: List[str] = result_format.get(

+    unexpected_index_column_names: List[Union[str, None]] = result_format.get(
I'm thinking this type hint might instead be `Union[List[str], None]`, i.e. either a list of strings or None (not a list that can contain None values). What do you think?
Ha you're absolutely right. My mistake
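For readers following this thread, a small sketch of the distinction being made. `Union[List[str], None]` is the same type as `Optional[List[str]]`. The `result_format` dictionary below is a made-up sample, not the real configuration object.

```python
from typing import Any, Dict, List, Optional

# A list that may itself contain None entries.
list_of_optional: List[Optional[str]] = ["pk_1", None]

# Either a list of strings, or no list at all.
optional_list: Optional[List[str]] = None

# dict.get returns None when the key is absent, which matches the second form.
result_format: Dict[str, Any] = {"result_format": "COMPLETE"}  # hypothetical sample
names: Optional[List[str]] = result_format.get("unexpected_index_column_names")
```

Since `.get()` on a missing key yields `None` rather than a list holding `None`, the second annotation describes the value more accurately.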
     evrs: List[ExpectationSuiteValidationResult] = result.list_validation_results()
-    first_result_full_list = evrs[0]["results"][0]["result"]["unexpected_index_list"]
-    assert first_result_full_list == [{"pk_1": 3}, {"pk_1": 4}, {"pk_1": 5}]
-    first_result_partial_list = evrs[0]["results"][0]["result"][
+    index_column_names: List[str] = evrs[0]["results"][0]["result"][
+        "unexpected_index_column_names"
+    ]
+    assert index_column_names == ["pk_1"]
+
+    first_result_full_list: List[Dict[str, Any]] = evrs[0]["results"][0]["result"][
+        "unexpected_index_list"
+    ]
+    assert first_result_full_list == [
+        {"pk_1": 3, "animals": "giraffe"},
+        {"pk_1": 4, "animals": "lion"},
+        {"pk_1": 5, "animals": "zebra"},
+    ]
+    first_result_partial_list: List[Dict[str, Any]] = evrs[0]["results"][0]["result"][
         "partial_unexpected_index_list"
     ]
-    assert first_result_partial_list == [{"pk_1": 3}, {"pk_1": 4}, {"pk_1": 5}]
+    assert first_result_partial_list == [
+        {"pk_1": 3, "animals": "giraffe"},
+        {"pk_1": 4, "animals": "lion"},
+        {"pk_1": 5, "animals": "zebra"},
+    ]
Not a blocking suggestion, but you might be able to refactor out a lot of this repeated assertion logic into a private module level helper method (like you did with `_add_expectations_and_checkpoint()`).
Might open the door to parametrizing all these tests. But your call - only if you think that helps readability.
Great suggestion :) If you're ok with it, I'll move the expected values into a separate fixture in this case.
Thank you for making those changes!
Changes proposed in this pull request:
- `EVR` now contains `unexpected_index_column_names`, which was only in `ExpectationConfiguration` before (changes to `expectation.py` to enable)
- `unexpected_index_list` in `EVR` will now contain the actual unexpected value. Change has been made to the `pandas` and `sql` implementations
- `Checkpoint` tests and `Metrics` tests updated correspondingly

Definition of Done