
[MAINTENANCE] Update NotImported mechanism to use scoped compatibility modules #7635

Conversation

alexsherstinsky
Contributor

@alexsherstinsky alexsherstinsky commented Apr 15, 2023

Scope

  • This change moves all imports into scoped modules (sqlalchemy, pyspark, azure, google, etc.).
  • Some code cleanup, facilitated by this change.
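The idea of a scoped compatibility module can be illustrated with a minimal sketch. This is not the project's actual implementation: the `NotImported` class below is a simplified stand-in, and `optional_import`, `json_module`, and `missing_module` are hypothetical names used only for illustration. The assumption is that a missing optional dependency is replaced by a falsy placeholder that raises a helpful error when used.

```python
import importlib


class NotImported:
    """Hypothetical placeholder standing in for a missing optional dependency."""

    def __init__(self, message: str) -> None:
        self._message = message

    def __getattr__(self, attr: str):
        # Only called when normal lookup fails, i.e. for any attribute the
        # placeholder does not really have; surface the install hint instead.
        raise ModuleNotFoundError(self._message)

    def __bool__(self) -> bool:
        # Falsy, so `if not some_module:` detects a missing dependency.
        return False


def optional_import(module_name: str, install_hint: str):
    """Return the module if it is installed, else a falsy NotImported placeholder."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return NotImported(install_hint)


# A scoped compatibility module would perform each optional import once,
# so the rest of the codebase gets the name from a single place.
json_module = optional_import("json", "json ships with Python")
missing_module = optional_import(
    "surely_not_an_installed_package_xyz", "pip install the missing package"
)
```

Callers can then branch on `if not missing_module:` instead of wrapping every import site in its own `try`/`except`.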

Next To Be Done

Update imports to use the new NotImported style for the packages AWS S3, Trino, AWS Athena, Google BigQuery, Snowflake, and AWS Redshift (and any others as needed).


Changes proposed in this pull request:

  • JIRA: DX-1/DX-440


Previous Design Review notes:

Definition of Done

Please delete options that are not relevant.

  • My code follows the Great Expectations style guide
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added unit tests where applicable and made sure that new and existing tests are passing.
  • I have run any local integration tests and made sure that nothing is broken.


@netlify

netlify bot commented Apr 15, 2023

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit 85eb9e3
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/64400aae0c16860008186dc0

Alex Sherstinsky added 12 commits April 15, 2023 01:12
…/DX-440/alexsherstinsky/link/update_not_imported_mechanism_to_use_scoped_compatibility_modules_instead-2023_04_14-2
@alexsherstinsky alexsherstinsky marked this pull request as ready for review April 17, 2023 17:09
@alexsherstinsky alexsherstinsky enabled auto-merge (squash) April 17, 2023 17:10
@alexsherstinsky alexsherstinsky requested review from a team April 17, 2023 17:10
Alex Sherstinsky added 5 commits April 18, 2023 16:52
…ink/update_not_imported_mechanism_to_use_scoped_compatibility_modules_instead-2023_04_14-2
@alexsherstinsky alexsherstinsky enabled auto-merge (squash) April 19, 2023 00:05
Alex Sherstinsky added 15 commits April 18, 2023 17:24
Member

@anthonyburdi anthonyburdi left a comment


Thank you for making all these changes! I have a few small requests and a question, but I'll approve and leave it to your judgement. The question: why are there type ignore statements in some compatibility/* files but not others? For example, they appear in great_expectations/compatibility/google.py but not in sqlalchemy.py. Can we remove more of them?

great_expectations/core/batch.py
@@ -460,15 +456,19 @@ def prepare_dump(self, data, **kwargs):
 This method calls the schema's jsonValue() method, which translates the object into a json
 """
 # check whether spark exists
-if StructType is None:
+if (not pyspark.types) or (pyspark.types.StructType is None):
Member


StructType won't be None, so I think you can remove the right-hand side of the `or`.

@@ -933,15 +933,19 @@ def prepare_dump(self, data, **kwargs):
 This method calls the schema's jsonValue() method, which translates the object into a json
 """
 # check whether spark exists
-if StructType is None:
+if (not pyspark.types) or (pyspark.types.StructType is None):
Member


Same comment: we have to be careful when comparing to None, since we are now using the NotImported type.

Contributor Author


@anthonyburdi I see what you mean. I fixed this to rely on the boolean value of NotImported and to guard against AttributeError. Thanks!
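The point raised in this thread can be sketched as follows. This is an illustration under stated assumptions, not the code that was merged: `NotImported` is a simplified stand-in for the project's placeholder type, `struct_type_available` is a hypothetical helper, and `dict` merely stands in for a real class such as `StructType`. The sketch shows why an `is None` comparison misses a placeholder, and how a truthiness check combined with an `AttributeError` guard covers the remaining cases.

```python
from types import SimpleNamespace


class NotImported:
    """Hypothetical falsy placeholder for a symbol whose package is not installed."""

    def __init__(self, message: str) -> None:
        self._message = message

    def __bool__(self) -> bool:
        return False


def struct_type_available(spark_types) -> bool:
    """Return True only if `spark_types.StructType` is a real, usable object."""
    if not spark_types:
        # The whole namespace is a falsy placeholder (or None).
        return False
    try:
        struct_type = spark_types.StructType
    except AttributeError:
        # The namespace exists but does not expose StructType.
        return False
    # A placeholder in place of the class is falsy; a real class is truthy.
    return bool(struct_type)


placeholder = NotImported("pyspark is not installed")

# The old-style check does not detect the placeholder, because it is not None.
old_check_detects_missing = placeholder is None

with_real_type = SimpleNamespace(StructType=dict)  # `dict` stands in for a real class
with_placeholder_type = SimpleNamespace(StructType=placeholder)
without_attribute = SimpleNamespace()
```

With these definitions, only `with_real_type` passes the check; the placeholder itself, a namespace holding a placeholder, and a namespace lacking the attribute are all treated as "Spark not available".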

Alex Sherstinsky added 2 commits April 19, 2023 08:27
…ink/update_not_imported_mechanism_to_use_scoped_compatibility_modules_instead-2023_04_14-2
@alexsherstinsky alexsherstinsky force-pushed the maintenance/DX-1/DX-440/alexsherstinsky/link/update_not_imported_mechanism_to_use_scoped_compatibility_modules_instead-2023_04_14-2 branch from 94694f3 to 5354641 Compare April 19, 2023 15:28
@alexsherstinsky
Contributor Author

alexsherstinsky commented Apr 19, 2023

> Thank you for making all these changes! I have a few small requests and a question, but I'll approve and leave it to your judgement. The question: why are there type ignore statements in some compatibility/* files but not others? For example, they appear in great_expectations/compatibility/google.py but not in sqlalchemy.py. Can we remove more of them?

@anthonyburdi It is not entirely clear why this inconsistency occurs. (It was observed even before the large refactoring effort to make the best use of NotImported got under way.) One possibility is that the internals of the SQLAlchemy package behave differently from those of PySpark and others, making them more compatible with the _NOT_IMPORTED constants of the NotImported object type, so that these ignore statements are not required. Thanks.

@alexsherstinsky alexsherstinsky merged commit d3de4c5 into develop Apr 19, 2023
49 checks passed
@alexsherstinsky alexsherstinsky deleted the maintenance/DX-1/DX-440/alexsherstinsky/link/update_not_imported_mechanism_to_use_scoped_compatibility_modules_instead-2023_04_14-2 branch April 19, 2023 16:29

2 participants