-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Implementing Fluent Datasources Support for Checkpoint #7697
Conversation
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_19-4
…ess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_19-4
…heckpoint_operation_with_fluent_datasource_batch_request-2023_04_19-4' into maintenance/DX-469/DX-470/alexsherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-5
…sherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_19-4' into maintenance/DX-469/DX-470/alexsherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-5
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-6
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-6
…heckpoint_operation_with_fluent_datasource_batch_request-2023_04_20-6' into maintenance/DX-469/DX-470/alexsherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
✅ Deploy Preview for niobium-lead-7998 canceled.
|
@@ -349,7 +349,7 @@ def _convert_fluent_datasources_loaded_from_yaml_to_internal_object_representati | |||
config: Dict[str, Any], _allow_empty: bool = False | |||
) -> Dict[str, Any]: | |||
if _FLUENT_DATASOURCES_KEY in config: | |||
fluent_datasources: dict = config[_FLUENT_DATASOURCES_KEY] | |||
fluent_datasources: dict = config[_FLUENT_DATASOURCES_KEY] or {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kilo59 This is needed, because great_expectations.yml
may come with an empty fluent_datasources
section.
@@ -15,6 +15,7 @@ | |||
BATCH_REQUEST_OPTIONAL_TOP_LEVEL_KEYS: Set[str] = { | |||
"data_connector_query", | |||
"runtime_parameters", | |||
"options", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows Usage Statistics
to work with Checkpoints using Fluent Datasources out of the box.
@@ -59,6 +59,7 @@ class _FilePathDataAsset(DataAsset): | |||
"batching_regex", # file_path argument | |||
"kwargs", # kwargs need to be unpacked and passed separately | |||
"batch_metadata", | |||
"connect_options", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kilo59 This is needed -- otherwise, instantiating an ExecutionEngine says that "options is not recognized by init().
@@ -25,6 +25,36 @@ def titanic_pandas_data_context_stats_enabled_and_expectation_suite_with_one_exp | |||
return context | |||
|
|||
|
|||
@pytest.fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might embellish this fixture later (e.g., if additional Fluent Datasources and/or DataAssets are needed).
tests/checkpoint/test_checkpoint.py
Outdated
@@ -127,7 +127,7 @@ def test_basic_checkpoint_config_validation( | |||
caplog, | |||
capsys, | |||
): | |||
context: FileDataContext = empty_data_context_stats_enabled | |||
data_context: FileDataContext = empty_data_context_stats_enabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are only beautification focused and correcting typographical errors. No logic has been changed.
@@ -0,0 +1,1164 @@ | |||
import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add tests for Checkpoint and SimpleCheckpoint utilizing Spark and SQL style Fluent Datasources.
…ess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…ess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…sherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7' into maintenance/DX-469/DX-470/alexsherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…sherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7' into maintenance/DX-469/DX-470/alexsherstinsky/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
…/link/assess_checkpoint_operation_with_fluent_datasource_batch_request-2023_04_20-7
4ce73ee
to
22c4eee
Compare
# Next two fields are for LegacyCheckpoint configuration | ||
"validation_operator_name": validation_operator_name, | ||
"batches": batches, | ||
# the following four keys are used by SimpleCheckpoint | ||
"site_names": site_names, | ||
"slack_webhook": slack_webhook, | ||
"notify_on": notify_on, | ||
"notify_with": notify_with, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've subclassed a SimpleCheckpoint
and this change has broken inheritance
@lukedyer-peak Thank you very much for reporting this. Is there any chance for you to update your class without too much difficulty? If so, that would be of great help; otherwise, please do let me know as soon as possible, and I will try to figure out a solution on our side. Again, apologies that this change broke your workflow, and please let me know. Thanks again. |
Note: conversation moved to #7790 |
Scope
Next TODO
Please annotate your PR title to describe what the PR does, then give a brief bulleted description of your PR below. PR titles should begin with [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], or [CONTRIB]. If a new feature introduces breaking changes for the Great Expectations API or configuration files, please also add [BREAKING]. You can read about the tags in our contributor checklist.
Changes proposed in this pull request:
After submitting your PR, CI checks will run and @cla-bot will check for your CLA signature.
For a PR with nontrivial changes, we review with both design-centric and code-centric lenses.
In a design review, we aim to ensure that the PR is consistent with our relationship to the open source community, with our software architecture and abstractions, and with our users' needs and expectations. That review often starts well before a PR, for example in GitHub issues or Slack, so please link to relevant conversations in notes below to help reviewers understand and approve your PR more quickly (e.g.
closes #123
).Previous Design Review notes:
Definition of Done
Please delete options that are not relevant.
Thank you for submitting!