New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] V1 Validation scaffolding #9508
Conversation
✅ Deploy Preview for niobium-lead-7998 canceled.
|
|
||
# TODO: Add analytics as needed | ||
class ValidationFactory(Factory[Validation]): | ||
def __init__(self, store: ValidationsStore) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current store won't work so we'll need to update accordingly
|
||
def __init__( | ||
self, | ||
name: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not part of the spec but we need an identifier if we want persistence
def __init__(self, store) -> None: | ||
# TODO: Update type hints when new ValidationStore is implemented | ||
self._store = store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think I'm blocked on this. We have an existing ValidationsStore
but those are for results?
I think we need to do the following:
- Rename the old store to
ValidationResultsStore
- Create a new
ValidationsStore
(orValidationStore
our inconsistent plurality should be resolved) - Plug in the new one here and have CRUD start working
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on the store now: #9515
…_expectations into f/v-176/validation
from great_expectations.datasource.fluent.interfaces import DataAsset | ||
|
||
|
||
class Validation: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be a Pydantic model? Went back and forth on it but figured its a pretty lightweight object so we can just add it if we need (probably when persistence comes into question?)
name: str | ||
data: DataAsset | BatchConfig | ||
suite: ExpectationSuite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I make a change to a suite, how can I ensure that changes cascade through our persistence layer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a user has access to both a validation and a suite. They update the suite, how do we guarantee they see that in the validation?
Would be make use of a property and have it do an external call each time someone gets this property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned that a deletion or material change to a suite would result in downstream errors if the validations store didn't stay in sync. Would we need to check every time a suite or batch config was updated/deleted?
""" | ||
|
||
name: str | ||
data: DataAsset | BatchConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've talked about this some, but did we land on this being a union with DataAsset? I know it can be done with discriminated unions, but do we currently have a good story around serialization there? I'm a bit concerned about the ergonomics of accessing validation.data
and having to do isinstance checks. Did we rule out a BatchConfig that represents the whole DataAsset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make this a BatchConfig and use pydantic coercion to allow a user to instantiate one with an asset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've decided to define a protocol and allow this to be a union.
""" | ||
|
||
name: str | ||
data: BatchConfig # TODO: Should support a union of Asset | BatchConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only supporting batch config for the initial implementation - will follow up with asset support but keeping scope limited to get something working
@@ -18,23 +18,25 @@ | |||
from .id_dict import IDDict | |||
from .run_identifier import RunIdentifier, RunIdentifierSchema | |||
from .urn import ge_urn | |||
from .validation_config import ValidationConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name is temporary and will be updated based on team consensus in the near future
invoke lint
(usesblack
+ruff
)For more information about contributing, see Contribute.
After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!