[MAINTENANCE] FDS Documentation - Creating ExpectationSuite with Domain Knowledge #7852
Conversation
* develop: (79 commits)
  * [DOCS] Removing datasource centric test_yaml_config doc (#7836)
  * [FEATURE] Splitters work with Spark Fluent Datasources (#7832)
  * [MAINTENANCE] New PR template (#7710)
  * [DOCS] Update how to create a checkpoint with Test YAML config (#7835)
  * [DOCS] Updating Checkpoint terms page (#7722)
  * [BUGFIX] Adding support for Fluent Batch Requests to context.get_validator (#7808)
  * [FEATURE] Plumbing of validation_result_url from cloud response (#7809)
  * [MAINTENANCE] Enable S3/Spark Connecting To Your Data tests (#7828)
  * [BUGFIX] Handle "persist" directive in "SparkDFExecutionEngine" properly. (#7830)
  * [DOCS] Update docs for how_to_initialize_a_filesystem_data_context_in_python (#7831)
  * [MAINTENANCE] Clean up: Remove duplicated fixture and utilize deeper filtering mechanism for configuration assertions. (#7825)
  * [MAINTENANCE] Enable Spark-S3 Integration tests on Azure CI/CD (#7819)
  * [MAINTENANCE] FDS - Datasources can rebuild their own asset data_connectors (#7826)
  * [DOCS] Prerequisites Cleanup (#7811)
  * [BUGFIX] Azure Package Presence/Absence Tests Strengthening (#7818)
  * [RELEASE] 0.16.11 (#7824)
  * [MAINTENANCE] Fix pin count. (#7823)
  * [BUGFIX] Upper bound `pyathena` due to breaking API in V3 (#7821)
  * [MAINTENANCE] Fix linting error. (#7820)
  * [BUGFIX] Cloud - Fix FDS Asset has no attribute `_data_connector` (#7813)
  * ...
* develop:
  * [MAINTENANCE] Prevent TCH001 warnings for pydantic model annotations (#7846)
  * [BUGFIX] FDS - Deletes not immediately reflected in `great_expectations.yml` (#7843)
  * [MAINTENANCE] Boto import pattern established (#7796)
  * [DOCS] Creating a Checkpoint from an In-Memory Dataframe (#7701)
  * [MAINTENANCE] ruff `.0.262 -> .0.265` (#7829)
…ithub.com/great-expectations/great_expectations into d/dx-441/new_guide_edit_expectation_suite

* 'd/dx-441/new_guide_edit_expectation_suite' of https://github.com/great-expectations/great_expectations:
  * [MAINTENANCE] Clean up version checker message formatting (#7838)
  * [BUGFIX] `batching_regex` tags are now correctly rendered in docs (#7845)
  * [MAINTENANCE] Adding docs link checker to invoke (#7841)
  * [MAINTENANCE] Pin altair (#7849)
Just a few small items to address.
...o_create_and_edit_expectations_based_on_domain_knowledge_without_inspecting_data_directly.md
### 2. Use an existing Data Asset to create a Batch Request

Add the following method to retrieve a previously configured Data Asset from the Data Context you initialized, and create a Batch Request to identify the Batch of data that you'll use to validate your Expectations:

```python name="tests/integration/docusaurus/expectations/how_to_create_and_edit_an_expectationsuite_domain_knowledge.py get_data_asset_and_build_batch_request"
```

:::info Limit the Batches returned by a Batch Request

You can provide a dictionary as the `options` parameter of `build_batch_request()` to limit the Batches returned by a Batch Request. If you leave the `options` parameter empty, your Batch Request will include all the Batches configured in the corresponding Data Asset. For more information about Batch Requests, see [How to request data from a Data Asset](/docs/guides/connecting_to_your_data/fluent/batch_requests/how_to_request_data_from_a_data_asset).

:::
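To make the filtering behavior concrete, here is a small pure-Python sketch (this is not Great Expectations code; the batch metadata keys and values are invented for illustration) of how an `options`-style dictionary acts as a filter over the Batches a Data Asset exposes:

```python
# Illustrative sketch only: models how an options dict narrows a set of Batches.
# The batch metadata below (year/month keys) is invented for this example.

def filter_batches(batches, options=None):
    """Return batches whose metadata matches every key/value pair in options.

    An empty or missing options dict matches all batches, mirroring how
    build_batch_request() with no options includes every configured Batch.
    """
    options = options or {}
    return [
        batch for batch in batches
        if all(batch.get(key) == value for key, value in options.items())
    ]

# Hypothetical batch metadata, e.g. one Batch per month of a CSV asset.
batches = [
    {"year": "2019", "month": "01"},
    {"year": "2019", "month": "02"},
    {"year": "2020", "month": "01"},
]

print(len(filter_batches(batches)))                    # no options: all batches
print(len(filter_batches(batches, {"year": "2019"})))  # narrowed to 2019 batches
```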
Since we are not using the batch to create an expectation suite, would it make sense to remove this section (and re-number the rest)?
Ohhhhh I understand now. Yes, you're right. I think I had modeled this a bit "too" closely with the validator-based guide :)
```python
context.sources.add_pandas(name="my_datasource").add_csv_asset(
    name="my_data_asset", filepath_or_buffer="./data/yellow_tripdata_sample_2019-01.csv"
)

# <snippet name="tests/integration/docusaurus/expectations/how_to_create_and_edit_an_expectationsuite_domain_knowledge.py get_data_asset_and_build_batch_request">
data_asset = context.get_datasource("my_datasource").get_asset("my_data_asset")
batch_request = data_asset.build_batch_request()
# </snippet>
```
Since we aren't using a batch, can we get rid of this section?
yup, you're absolutely right
```python
)
suite.add_expectation(expectation_configuration=expectation_configuration)
# </snippet>
```
How about an assertion about the content of the expectation suite after adding these expectations? I think that would be valuable to make sure that the suite contains all the expectations we added.
great idea. Do you mean in the script only, or in the guide too?
Good question! Just in the script so it's not just a smoke test. With this assert we will know that as long as this test passes that someone following the guide will get the end result we expect.
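A sketch of what such an assertion could look like, using plain dictionaries in place of real `ExpectationConfiguration` objects (the expectation types shown are standard Great Expectations names, but the specific columns and bounds are invented for illustration):

```python
# Sketch only: verify a suite contains exactly the expectations the guide adds.
# Plain dicts stand in for ExpectationConfiguration objects; column names and
# kwargs here are hypothetical.

suite_expectations = [
    {
        "expectation_type": "expect_column_values_to_not_be_null",
        "kwargs": {"column": "passenger_count"},
    },
    {
        "expectation_type": "expect_column_values_to_be_between",
        "kwargs": {"column": "passenger_count", "min_value": 1, "max_value": 6},
    },
]

expected_types = {
    "expect_column_values_to_not_be_null",
    "expect_column_values_to_be_between",
}

# Assert the suite holds every expectation we added, so the integration test
# fails loudly if the guide's steps stop producing the expected end state.
actual_types = {e["expectation_type"] for e in suite_expectations}
assert actual_types == expected_types, (
    f"Suite is missing expectations: {expected_types - actual_types}"
)
print("Suite contains all expected expectations.")
```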
Thank you for the review @anthonyburdi. The PR has been updated and is ready for re-review.
LGTM! Thanks for making those changes 🙇
* develop:
  * [FEATURE] Add Spark DeltaAsset type (#7872)
  * [DOCS] CLI Edits (#7865)
  * [FEATURE] Spark directory asset types (#7873)
  * [MAINTENANCE] FDS Documentation - Creating ExpectationSuite with Domain Knowledge (#7852)
  * [DOCS] Update "How to use Great Expectations with Databricks" (#7762)
  * [DOCS] removes remaining Block-config Datasource guides (#7870)
  * [FEATURE] Spark file reader support for fluent datasources (#7844)
What does this PR propose?
Updates "How to create and edit Expectations based on domain knowledge, without inspecting data directly" to be in line with FDS changes, removing references to the CLI.
Checklist

- Description of PR changes above includes a link to an existing GitHub issue
- PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
- Code is linted
- Appropriate tests and docs have been updated
For more details, see our Contribution Checklist, Coding style guide, and Documentation style guide.
After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!