New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUGFIX] Allows GE to work when installed in a zip file (PEP 273). Fixes issue #3772 #3798
[BUGFIX] Allows GE to work when installed in a zip file (PEP 273). Fixes issue #3772 #3798
Conversation
This commit adds a Github Action that checks automatically if issue 3772 ('NotADirectoryError occurs when GE is installed in a zip file and build_data_docs() is called') happens. The issue itself will be fixed in a another commit
…PEP 273) This commit fixes issue 3772 ('NotADirectoryError occurs when GE is installed in a zip file and build_data_docs() is called')
👷 Deploy request for niobium-lead-7998 pending review. 🔨 Explore the source changes: fc64df1 |
Hey @joseignaciorc thanks so much for contributing! Let me confer internally to determine how best to navigate running this in CI/CD. I'll bring it up during our next engineering meeting and get back to you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments but the core logic looks great! Still conferring internally about best practices around testing.
# If `static_assets_source_absdir` contains the string ".zip/" (unix) or ".zip\" (win) | ||
# then we check if Great Expectations has been installed inside a zip file (see PEP 273) | ||
# If so, we need to extract all the static files and run this function again | ||
static_assets_source_absdir = os.path.abspath(static_assets_source_dir) | ||
zip_re = re.match( | ||
f"(.+[.]zip){re.escape(os.sep)}(.+)", | ||
static_assets_source_absdir, | ||
flags=re.IGNORECASE, | ||
) | ||
|
||
if zip_re: | ||
zip_filename = zip_re.groups()[0] # e.g.: /home/joe/libs/my_python_libs.zip | ||
path_in_zip = zip_re.groups()[1] # great_expectations/render/view/static | ||
unzip_destination = tempfile.mkdtemp() | ||
|
||
if is_zipfile(zip_filename): | ||
with ZipFile(zip_filename) as zipfile: | ||
static_files_to_extract = [ | ||
file | ||
for file in zipfile.namelist() | ||
if file.startswith(path_in_zip) | ||
] | ||
zipfile.extractall(unzip_destination, static_files_to_extract) | ||
return self.copy_static_assets(unzip_destination) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this logic is very specific to zip-files, could we extract the logic into a separate method? Maybe something like this:
def _handle_zipped_static_assets(self) -> bool:
# If `static_assets_source_absdir` contains the string ".zip/" (unix) or ".zip\" (win)
# then we check if Great Expectations has been installed inside a zip file (see PEP 273)
# If so, we need to extract all the static files and run this function again
static_assets_source_absdir = os.path.abspath(static_assets_source_dir)
zip_re = re.match(
f"(.+[.]zip){re.escape(os.sep)}(.+)",
static_assets_source_absdir,
flags=re.IGNORECASE,
)
if zip_re:
zip_filename = zip_re.groups()[0] # e.g.: /home/joe/libs/my_python_libs.zip
path_in_zip = zip_re.groups()[1] # great_expectations/render/view/static
unzip_destination = tempfile.mkdtemp()
if is_zipfile(zip_filename):
with ZipFile(zip_filename) as zipfile:
static_files_to_extract = [
file
for file in zipfile.namelist()
if file.startswith(path_in_zip)
]
zipfile.extractall(unzip_destination, static_files_to_extract)
self.copy_static_assets(unzip_destination)
return True
return False
That way, we can isolate the zip-specific code and simply invoke it in copy_static_assets
unzipped = self._handle_zipped_static_files()
if unzipped:
return
Thoughts?
) | ||
|
||
print(f"Great Expectations location: {ge.__file__}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a quick blurb that explains the purpose of this test?
"""
What does this test and why?
Ensures compatibility with PEP-273: https://www.python.org/dev/peps/pep-0273/
Covers the use-case where GE is installed in a zip file and build_data_docs() is called.
Any other relevant details you may deem relevant!
"""
I just added your test to our integration test matrix; I think that'll probably be the way to go here just so we don't test on each change! |
* A new function `_unzip_assets()` is created, to handle the unzipping * The test in `simple_build_data_docs.py` is now documented
aae6cc0
to
4e0d62f
Compare
I have added a new commit with the changes you commented, @cdkini Extracting the zip is now done in its own function ( |
About this PR
This PR closes the issue #3772.
Changes proposed in this pull request:
html_site_store.py
has been modified. Now when GE is installed in a zip file, the static files are extracted (i.e. unziped) before they are used, so that the context.build_data_docs() function works as expected instead of failing with a FileNotFoundError or NotADirectoryErrortest-pep273-compatibility.yml
has been created. A GA (Github Action) that verifies that the context.build_data_docs() function works when GE is installed in a directory and also in a zip. This verification is done by trying to run thesimple_build_data_docs.py
(which also created in this PR)How these changes were tested:
IMPORTANT:
The tests on test-pep273-compatibility are set to run on every push to the repo. The test is lightweight, but I understand that a Github Action may or may not the best way to test this change, so I am open to discuss changes in the GA and also in the code.
Definition of Done
Please delete options that are not relevant.