Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MAINTENANCE] Refactor validations in Checkpoint to use CheckpointValidationConfig #8225

Merged
merged 20 commits into from Jul 7, 2023

Conversation

cdkini
Copy link
Member

@cdkini cdkini commented Jun 30, 2023

Instead of black-box dicts, we should use the rich CheckpointValidationConfig type. Our APIs should support both but we should convert to the rich type under the hood so we have better type safety throughout checkpoint flows.

  • Description of PR changes above includes a link to an existing GitHub issue
  • PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
  • Code is linted - run invoke lint (uses black + ruff)
  • Appropriate tests and docs have been updated

For more details, see our Contribution Checklist, Coding style guide, and Documentation style guide.

After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!

@netlify
Copy link

netlify bot commented Jun 30, 2023

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit 3bc62e3
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/64a818c4d6db05000832c86f

@github-actions github-actions bot added the core label Jun 30, 2023
@ghost
Copy link

ghost commented Jun 30, 2023

👇 Click on the image for a new way to code review

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map legend

@cdkini cdkini self-assigned this Jul 5, 2023
Comment on lines -65 to -78
assert (
checkpoint_config_repr
== """{
"action_list": [
{
"name": "store_validation_result",
"action": {
"class_name": "StoreValidationResultAction"
}
},
{
"name": "store_evaluation_params",
"action": {
"class_name": "StoreEvaluationParametersAction"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repr has differing outputs due to the variability of the CheckpointValidationConfig's ordering - I figured it was easier to just test some top level keys but lmk if you disagree.

Copy link
Member

@Kilo59 Kilo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple non-blocking comments

@@ -450,39 +450,48 @@ def get_updated_action_list(


def does_batch_request_in_validations_contain_batch_data(
validations: Optional[List[dict]] = None,
validations: List[CheckpointValidationConfig],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing CheckpointValidationConfig is just a DictDot? Or does it actually declare fields and types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup its an AbstractConfig (which inherits from SerializableDictDot)

Comment on lines 633 to 637
def name(self) -> str | None:
try:
return self.config.name
except AttributeError:
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is name really optional/None-able?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm no I don't think so - updating now

Comment on lines +1128 to +1145
config_version: int | float | None = 1.0,
template_name: str | None = None,
run_name_template: str | None = None,
expectation_suite_name: str | None = None,
batch_request: BatchRequestBase | FluentBatchRequest | dict | None = None,
validator: Validator | None = None,
action_list: Sequence[ActionDict] | None = None,
evaluation_parameters: dict | None = None,
runtime_configuration: dict | None = None,
validations: list[CheckpointValidationConfig] | list[dict] | None = None,
profilers: list[dict] | None = None,
ge_cloud_id: str | None = None,
# the following four arguments are used by SimpleCheckpointConfigurator
site_names: Union[str, List[str]] = "all",
slack_webhook: Optional[str] = None,
site_names: str | list[str] = "all",
slack_webhook: str | None = None,
notify_on: str = "all",
notify_with: Union[str, List[str]] = "all",
expectation_suite_ge_cloud_id: Optional[str] = None,
notify_with: str | list[str] = "all",
expectation_suite_ge_cloud_id: str | None = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking
How many overloads would it take to cover this __init__ method?
If it's small could you define them in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I'm afraid it is quite cumbersome - I think it isn't worth the effort (but we can apply this to our updated checkpoint in experimental)

@@ -16,6 +17,9 @@ def __init__(self, id: Optional[str] = None, name: Optional[str] = None) -> None
self.name = name
super().__init__()

def __repr__(self) -> str:
return json.dumps(self.to_dict(), indent=2)
Copy link
Member

@Kilo59 Kilo59 Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we just using json.dumps() for the pretty formatting?
If so I would recommend using pprint.pformat() instead.
https://docs.python.org/3/library/pprint.html#pprint.pformat

from pprint import pformat as pf

...

    def __repr__(self) -> str:
        return pf(self.to_dict(), indent=2, sort_dicts=True)

Can use sort_dicts if desired.

Also, you can take advantage of nice formatting features like depth which will cause the formatter to only show structures up to x depth, and uses ... for everything else.

>>> pf({"a": {"1": 1}, "b": {"2": 2}, "c": [1, 2, 3]}, depth=1)
... {'a': {...}, 'b': {...}, 'c': [...]}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love it!

@cdkini cdkini enabled auto-merge (squash) July 7, 2023 13:39
@cdkini cdkini merged commit f5078f6 into develop Jul 7, 2023
50 checks passed
@cdkini cdkini deleted the m/_/refactor_checkpoint_validations branch July 7, 2023 14:56
Shinnnyshinshin added a commit that referenced this pull request Jul 10, 2023
* develop:
  [RELEASE] 0.17.4 (#8261)
  [MAINTENANCE] Remove unreferenced docs integration tests (#8228)
  [MAINTENANCE] Disable usage statistics when in Cloud-backed environments (#8248)
  [MAINTENANCE] Replace dynamic datasource deletion with single `delete` method (#8189)
  [MAINTENANCE] Support individual connection args for Snowflake FDS (#8183)
  [RELEASE] 0.17.3 (#8256)
  [MAINTENANCE] Change Pydantic models to utilize `by_alias=True` (#8252)
  [MAINTENANCE] Protect `develop` with `no-commit-to-branch` pre-commit hook (#8254)
  [MAINTENANCE] Bump scipy from 1.5.2 to 1.10.0 in docs_rtd (#8249)
  Revert "feat: add name property"
  feat: add name property
  [MAINTENANCE] Refactor validations in Checkpoint to use `CheckpointValidationConfig` (#8225)
  [MAINTENANCE] enable typechecking in validator.py (#8204)
  [MAINTENANCE] Update `test_deprecation.py` in advance of 0.17.3 release (#8251)
  [MAINTENANCE] Temporarily upper bound Click due to `mypy` typing issues (#8247)
  [DOCS] minor updates to the readme files (#8245)
  [MAINTENANCE] Filter `jsonschema.RefResolver`, `ErrorTree` warnings in tests (#8246)
  [MAINTENANCE] Filter altair/jsonschema Deprecation warning (#8244)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants