[MAINTENANCE] Improve get validator functionality #4661

abegong · 2022-04-04T14:15:42Z

Changes proposed in this pull request:

Makes it possible to use context.get_validator without specifying an ExpectationSuite (by name, id, or otherwise). This behavior was already supported by Validator.__init__; it just hadn't been enabled for DataContext.get_validator.
Makes it possible to pass a single Batch to context.get_validator.

Both of these additions will improve usability of the top-level API in DataContext.

Each is tested with a single unit test.

netlify · 2022-04-04T14:15:46Z

✅ Deploy Preview for niobium-lead-7998 ready!

Name	Link
🔨 Latest commit	`69e1fd7`
🔍 Latest deploy log	https://app.netlify.com/sites/niobium-lead-7998/deploys/624d0a89ea88d60009394e12
😎 Deploy Preview	https://deploy-preview-4661--niobium-lead-7998.netlify.app

cdkini · 2022-04-04T15:45:33Z

tests/data_context/test_data_context.py

+    assert my_validator.expectation_suite_name == "default"
+
+def test_get_validator_with_batch(
+    empty_data_context, tmp_path_factory


tmp_path_factory is session-scoped, which can cause state to bleed into other tests when the whole suite is run. I think we've been making a move to tmp_path for that reason.
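
To make the scoping point concrete, here is a minimal sketch that assumes pytest's built-in `tmp_path` fixture, which hands each test function its own fresh `pathlib.Path` directory. The helper `write_sample_csv` and the test name are hypothetical and are not taken from this PR.

```python
from pathlib import Path


def write_sample_csv(directory: Path) -> Path:
    """Write a tiny CSV into `directory` and return its path (hypothetical helper)."""
    csv_path = directory / "some_file.csv"
    csv_path.write_text("a,b\n1,2\n3,4\n")
    return csv_path


def test_sample_csv_is_isolated(tmp_path: Path) -> None:
    # pytest injects a new, empty directory for every test function,
    # so files written here are not visible to other tests.
    assert list(tmp_path.iterdir()) == []
    csv_path = write_sample_csv(tmp_path)
    assert csv_path.read_text().splitlines() == ["a,b", "1,2", "3,4"]
```

With `tmp_path_factory`, the factory object itself lives for the whole session, so directories created from it can be shared by accident; requesting `tmp_path` avoids that by construction.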

cdkini · 2022-04-04T15:47:47Z

tests/data_context/test_data_context.py

@@ -1788,6 +1788,112 @@ def test_get_validator_with_attach_expectation_suite(
    )
    assert my_validator.expectation_suite_name == "A_expectation_suite"

+def test_get_validator_without_expectation_suite(
+    empty_data_context_stats_enabled, tmp_path_factory


Do we want stats enabled here? I think we should avoid this fixture, since this test doesn't exercise analytics.

cdkini · 2022-04-04T16:00:11Z

tests/data_context/test_data_context.py

+def test_get_validator_without_expectation_suite(
+    empty_data_context_stats_enabled, tmp_path_factory
+):
+    context: DataContext = empty_data_context_stats_enabled


Instead of using a context and updating the filesystem, could we instantiate a BaseDataContext like we do here: https://github.com/great-expectations/great_expectations/blob/develop/tests/expectations/test_expectation_arguments.py#L39-L86

An approach like this would remove the need for I/O and improve performance. We might need to do some mocking, though. Happy to dig in further as needed.

cdkini · 2022-04-04T16:01:20Z

tests/data_context/test_data_context.py

+            "alphanumeric": "some_file",
+        },
+    )
+    assert type(my_validator.get_expectation_suite()) == ExpectationSuite


Nitpick - Could we use isinstance just to be consistent? This method accounts for things like inheritance hierarchies and is a bit safer.
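
A small sketch of the difference, using hypothetical stand-in classes (the `ExpectationSuite` below is not the great_expectations class, and `CustomSuite` is invented for illustration):

```python
class ExpectationSuite:
    """Hypothetical stand-in, not the great_expectations class."""


class CustomSuite(ExpectationSuite):
    """Hypothetical subclass used only to show the difference."""


suite = CustomSuite()

# Exact type comparison rejects subclasses.
exact_match = type(suite) == ExpectationSuite

# isinstance accepts the class and any of its subclasses.
instance_match = isinstance(suite, ExpectationSuite)
```

Here `exact_match` is `False` while `instance_match` is `True`, which is why an `isinstance` assertion keeps passing if the returned object later becomes a subclass.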

cdkini · 2022-04-04T16:01:55Z

tests/data_context/test_data_context.py

+def test_get_validator_with_batch(
+    empty_data_context, tmp_path_factory
+):
+    context = empty_data_context


Same general note about context - I think we could probably use BaseDataContext and mock any I/O.

alexsherstinsky · 2022-04-04T17:05:28Z

great_expectations/data_context/data_context.py

@@ -1759,6 +1759,7 @@ def get_validator(
        data_connector_name: Optional[str] = None,
        data_asset_name: Optional[str] = None,
        *,
+        batch: Optional[Batch] = None,


@abegong I believe that this should be batch_list: List[Batch] -- do you agree? At least there should be some way to accept a list of Batch objects, even if we must also support passing a single Batch object. Thanks

abegong · 2022-04-04

For usability's sake, I think it's important to take a single Batch as an argument.

We could also add the option to pass a batch_list.

alexsherstinsky · 2022-04-04

@abegong I agree -- and, if possible, would prefer for both to be supported in the same pull request. Thanks!

abegong · 2022-04-06

I've made this change.
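
One way an API can accept either a single `Batch` or a `batch_list` is to normalize both into a list up front. The following is a minimal sketch of that idea under stated assumptions: `Batch` is a stand-in dataclass, `resolve_batches` is a hypothetical helper rather than a great_expectations function, and the "not both" rule is assumed for illustration. It is not the code merged in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Batch:
    """Hypothetical stand-in for a batch of data."""

    name: str


def resolve_batches(
    batch: Optional[Batch] = None,
    batch_list: Optional[List[Batch]] = None,
) -> List[Batch]:
    """Return a list of batches from either argument (hypothetical helper)."""
    # Assumed rule for this sketch: the two arguments are mutually exclusive.
    if batch is not None and batch_list is not None:
        raise ValueError("Pass either batch or batch_list, not both.")
    if batch is not None:
        # Wrap the single batch so downstream code only handles lists.
        return [batch]
    if batch_list:
        # Copy so the caller's list is not mutated later.
        return list(batch_list)
    # A real API would likely fall back to other arguments at this point.
    raise ValueError("No batch or batch_list was provided.")
```

Normalizing early keeps the single-`Batch` convenience at the call site while the rest of the method deals with one shape of input.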

alexsherstinsky

@abegong Small change requested (happy to discuss) -- I really like what this PR will provide (I believe that we even had a JIRA item for it in the Backlog)! Thanks!

abegong · 2022-04-06T02:28:32Z

@cdkini, @alexsherstinsky, I've responded to all comments.

alexsherstinsky

LGTM!

abegong added 2 commits April 3, 2022 16:39

Make it optional to include an ExpectationSuite or parameters define an ExpectationSuite when calling DataContext.get_validator

93f5ff6

Enable DataContext.get_validator to accept a single batch as an input argument

0390b74

abegong added 2 commits April 4, 2022 08:16

Merge branch 'develop' into maintenance/improve-get-validator-functionality

e94e3d8

Merge branch 'develop' into maintenance/improve-get-validator-functionality

808fedc

cdkini reviewed Apr 4, 2022

alexsherstinsky reviewed Apr 4, 2022

alexsherstinsky suggested changes Apr 4, 2022

abegong and others added 4 commits April 5, 2022 09:27

Merge branch 'develop' into maintenance/improve-get-validator-functionality

5bc53c8

Move in_memory_runtime_context to conftest.py

c7c3ee1

Add batch_list to Context.get_validator

3a5bb87

Simplify new tests

31ab354

abegong commented Apr 6, 2022

abegong added 2 commits April 5, 2022 22:27

Update tests/data_context/test_data_context.py

8adb87d

Merge branch 'develop' into maintenance/improve-get-validator-functionality

df20475

abegong added 4 commits April 5, 2022 22:38

lint

bb25837

Merge branch 'maintenance/improve-get-validator-functionality' of github.com:great-expectations/great_expectations into maintenance/improve-get-validator-functionality

6ab6a3b

Lint again

38938dd

Resolve warnings and errors

69e1fd7

alexsherstinsky approved these changes Apr 6, 2022

abegong merged commit c1c0dc0 into develop Apr 6, 2022

abegong deleted the maintenance/improve-get-validator-functionality branch April 6, 2022 11:52

fjork3 changed the title from "Maintenance/improve get validator functionality" to "[MAINTENANCE] Improve get validator functionality" Apr 7, 2022
