Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] ID/PK - unexpected_index_list updated to include actual unexpected value, and EVR to include unexpected_index_column_names #6586

Merged
merged 34 commits into from Dec 20, 2022

Conversation

Shinnnyshinshin
Copy link
Contributor

@Shinnnyshinshin Shinnnyshinshin commented Dec 14, 2022

Changes proposed in this pull request:

  • EVR now contain unexpected_index_column_names which were only in ExpectatinConfiguration before (changes to expectation.py to enable)
  • unexpected_index_list in EVR will now contain actual unexpected value. Change has been made to pandas and sql implementation
  • Update Checkpoint tests and Metrics tests correspondingly

Definition of Done

  • My code follows the Great Expectations style guide
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added unit tests where applicable and made sure that new and existing tests are passing.
  • I have run any local integration tests and made sure that nothing is broken.

@netlify
Copy link

netlify bot commented Dec 14, 2022

Deploy Preview for niobium-lead-7998 ready!

Name Link
🔨 Latest commit ccc6b3b
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/63a0f0ac605fae0008450aee
😎 Deploy Preview https://deploy-preview-6586--niobium-lead-7998.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@ghost
Copy link

ghost commented Dec 14, 2022

👇 Click on the image for a new way to code review
  • Make big changes easier — review code in small groups of related files

  • Know where to start — see the whole change at a glance

  • Take a code tour — explore the change with an interactive tour

  • Make comments and review — all fully sync’ed with github

    Try it now!

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map Legend

@Shinnnyshinshin Shinnnyshinshin changed the title [FEATURE] ID/PK - unexpected_index_list includes actual unexpected value [FEATURE] ID/PK - unexpected_index_list updated to include actual unexpected value Dec 14, 2022
@Shinnnyshinshin Shinnnyshinshin changed the base branch from develop to b/dx-105/warnings-for-expectation-and-validator-level December 14, 2022 23:58
Comment on lines 3249 to 3252
if unexpected_index_column_names is not None:
return_obj["result"].update(
{"unexpected_index_column_names": unexpected_index_column_names}
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensuring unexpected_index_column_names is part of the Validation result object.

@Shinnnyshinshin Shinnnyshinshin marked this pull request as ready for review December 16, 2022 20:51
Base automatically changed from b/dx-105/warnings-for-expectation-and-validator-level to develop December 16, 2022 21:34
Will Shin added 3 commits December 16, 2022 14:09
* develop:
  [FEATURE] Ensure `result_format` accessed is through Checkpoint, and warns users if `Expectation` or `Validator`-level (#6562)
  [BUGFIX] Remove rendered header from Cloud-rendering tests (#6597)
  [MAINTENANCE] Refactor `BaseDataContext` and `DataContext` into factory functions (#6531)
  [MAINTENANCE] Utilize a `StrEnum` for `ConfigPeer` modes (#6596)
  [BUGFIX] Use v3.3.6 or higher of google-cloud-bigquery (with shapely bugfix) (#6590)
  [MAINTENANCE] Add docs snippet checker to `dev` CI (#6594)
  [MAINTENANCE] Leverage `RendererConfiguration` in existing prescriptive templates (3 of 3) (#6530)
  [BUGFIX] Support non-string `datetime` evaluation parameters (#6571)
  [RELEASE] 0.15.41 (#6593)
  [FEATURE] ZEP - PG `BatchSorter` loading + dumping (#6580)
  [MAINTENANCE] Leverage `RendererConfiguration` in existing prescriptive templates (2 of 3) (#6488)
  [MAINTENANCE] Remove `ExplorerDataContext` (#6592)
  [MAINTENANCE] Small refactor of ExecutionEngine.resolve_metrics() for better code readability (and miscellaneous additional clean up) (#6587)
  [MAINTENANCE] `mypy` config update (#6589)
  [BUGFIX] convert_to_json_serializable does not accept numpy datetime (#6553)
  [BUGFIX] Return unique list of batch_definitions (#6579)
  [MAINTENANCE] typo in method name (#6585)
@Shinnnyshinshin Shinnnyshinshin requested a review from a team December 16, 2022 23:17
@@ -3164,6 +3172,7 @@ def _format_map_output(
unexpected_list: Optional[List[Any]] = None,
unexpected_index_list: Optional[List[int]] = None,
unexpected_index_query: Optional[str] = None,
unexpected_index_column_names: Optional[Union[int, str, List[str]]] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not entirely accurate, but is necessary to keep the typechecker happy (it's actually Optional[List[str]])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming yes, but have you tried removing the int and str from the union in this definition and all others? I'm surprised that doesn't work. If it doesn't maybe a good idea is to put this note in a code comment.

@Shinnnyshinshin Shinnnyshinshin changed the title [FEATURE] ID/PK - unexpected_index_list updated to include actual unexpected value [FEATURE] ID/PK - unexpected_index_list updated to include actual unexpected value, and EVR to include unexpected_index_column_names Dec 16, 2022
Copy link
Member

@anthonyburdi anthonyburdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a few questions I'd like to address before approving.

@@ -3164,6 +3172,7 @@ def _format_map_output(
unexpected_list: Optional[List[Any]] = None,
unexpected_index_list: Optional[List[int]] = None,
unexpected_index_query: Optional[str] = None,
unexpected_index_column_names: Optional[Union[int, str, List[str]]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming yes, but have you tried removing the int and str from the union in this definition and all others? I'm surprised that doesn't work. If it doesn't maybe a good idea is to put this note in a code comment.

@@ -2465,9 +2472,11 @@ def _sqlalchemy_map_condition_index(

column_selector: List[sa.Column] = []
all_table_columns: List[str] = metrics.get("table.columns")
unexpected_index_column_names: List[str] = result_format.get(

unexpected_index_column_names: List[Union[str, None]] = result_format.get(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking this type hint might instead be Union[List[str], None] i.e. either a list of strings or None (not a list that can contain None values), what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha you're absolutely right. My mistake

Comment on lines 195 to 216
evrs: List[ExpectationSuiteValidationResult] = result.list_validation_results()
first_result_full_list = evrs[0]["results"][0]["result"]["unexpected_index_list"]
assert first_result_full_list == [{"pk_1": 3}, {"pk_1": 4}, {"pk_1": 5}]
first_result_partial_list = evrs[0]["results"][0]["result"][
index_column_names: List[str] = evrs[0]["results"][0]["result"][
"unexpected_index_column_names"
]
assert index_column_names == ["pk_1"]

first_result_full_list: List[Dict[str, Any]] = evrs[0]["results"][0]["result"][
"unexpected_index_list"
]
assert first_result_full_list == [
{"pk_1": 3, "animals": "giraffe"},
{"pk_1": 4, "animals": "lion"},
{"pk_1": 5, "animals": "zebra"},
]
first_result_partial_list: List[Dict[str, Any]] = evrs[0]["results"][0]["result"][
"partial_unexpected_index_list"
]
assert first_result_partial_list == [{"pk_1": 3}, {"pk_1": 4}, {"pk_1": 5}]
assert first_result_partial_list == [
{"pk_1": 3, "animals": "giraffe"},
{"pk_1": 4, "animals": "lion"},
{"pk_1": 5, "animals": "zebra"},
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocking suggestion, but you might be able to refactor out a lot of this repeated assertion logic into a private module level helper method (like you did with _add_expectations_and_checkpoint()).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might open the door to parametrizing all these tests. But your call - only if you think that helps readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion :) If you're ok with it, I'll move the expected values into a separate fixture in this case.

Copy link
Member

@anthonyburdi anthonyburdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making those changes!

@Shinnnyshinshin Shinnnyshinshin enabled auto-merge (squash) December 19, 2022 22:20
@Shinnnyshinshin Shinnnyshinshin merged commit 0805405 into develop Dec 20, 2022
@Shinnnyshinshin Shinnnyshinshin deleted the f/DX-103/update-id-pk-metric branch December 20, 2022 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants