-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using zipfile.Path with several files prematurely closes zip #84744
Comments
Given a .zip file with more than one inner file, when reading those inner files using zipfile.Path the zip module closes the .zip file prematurely, causing an error. Given the following code (example zipfile is attached, but any should work). with zipfile.ZipFile('foo.zip') as file:
for name in ['file-1.txt', 'file-2.txt']:
p = zipfile.Path(file, name)
with p.open() as inner:
print(list(inner)) # ValueError: seek of closed file But the following code does not fail: with zipfile.ZipFile('foo.zip') as file:
for name in ['file-1.txt', 'file-2.txt']:
with file.open(name) as inner:
print(list(inner)) # Works! Python 3.8.2 macOS 10.15.4. |
I've attempted to replicate the issue in the zipp test suite with this test: diff --git a/test_zipp.py b/test_zipp.py
index a6fbf39..dc7c8aa 100644
--- a/test_zipp.py
+++ b/test_zipp.py
@@ -259,3 +259,10 @@ class TestPath(unittest.TestCase):
def test_implied_dirs_performance(self):
data = ['/'.join(string.ascii_lowercase + str(n)) for n in range(10000)]
zipp.CompleteDirs._implied_dirs(data)
+
+ def test_read_does_not_close(self):
+ for alpharep in self.zipfile_alpharep():
+ for rep in range(2):
+ p_ = zipp.Path(alpharep, 'a.txt')
+ with p_.open() as inner:
+ print(list(inner)) But the test passes. |
I am able to replicate the failure using the ondisk fixture: diff --git a/test_zipp.py b/test_zipp.py
index a6fbf39..539d0a9 100644
--- a/test_zipp.py
+++ b/test_zipp.py
@@ -259,3 +259,11 @@ class TestPath(unittest.TestCase):
def test_implied_dirs_performance(self):
data = ['/'.join(string.ascii_lowercase + str(n)) for n in range(10000)]
zipp.CompleteDirs._implied_dirs(data)
+
+ def test_read_does_not_close(self):
+ for alpharep in self.zipfile_ondisk():
+ with zipfile.ZipFile(alpharep) as file:
+ for rep in range(2):
+ p_ = zipp.Path(file, 'a.txt')
+ with p_.open() as inner:
+ print(list(inner)) |
I suspect the issue lies in how the CompleteDirs.make replaces one instance with another in order to provide a complete list of implied directories and to memoize the names lookup. Because constructing a zipfile.Path object around a zipfile.ZipFile copies the underlying state, closing one will have the affect of closing the other. I believe this issue is the same root issue as bpo-41350. |
I see a few options here:
|
Implementing that last option:
Does appear to address the issue. I'm not super happy about the implementation, though. |
In jaraco/zipp, I've implemented three of the options above: |
I've also created this alternative to option 2: This alternative approach uses a mix-in rather than subclasses, creating a new class on-demand. I was hoping this approach would enable just augmenting the instance rather than affecting This approach does have an advantage over option 2 in that it would support other ZipFile subclasses for source. It has the disadvantage in that it creates a new subclass for every instance created. I've thought about it a lot and while I'm not particularly happy with any of the approaches, I'm leaning toward option 2. |
I've released zipp 3.2.0 that includes a fix for this issue (option 2). Please test it out. Because this change affects the user's expectation about the effect of what's passed in, I'm hesitant to backport it to 3.8 and 3.9. It's too late to go into 3.9.0, so the earliest it can appear is in 3.9.1. Please advise - does the undesirable behavior warrant a bugfix backport, or is it sufficient to direct users impacted by this issue prior to Python 3.10 to use the |
Fix merged into master. Please use zipp 3.2.0 on Python 3.9 and earlier for the improved behavior or report back here if a backport is needed (and why). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: