-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove FileObject Wrapper #38
Conversation
@@ -84,6 +108,7 @@ def test_makedirs(self): | |||
self.assertRaises(io.UnsupportedOperation, | |||
handler.makedirs, "test/test") | |||
|
|||
@pytest.mark.timeout(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this timeout is introduced? Introducing non-determinism makes tests very flaky thus must be carefully designed and intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I stated in the comment message, I want to make sure the pickle problem is solved in pytest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout value need to be chosen again probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to another PR
chainerio/io.py
Outdated
@@ -31,7 +29,26 @@ def __init__(self, io_profiler: Optional[IOProfiler] = None, | |||
self.io_profiler = io_profiler | |||
self.type = "BASEIO" | |||
self.root = root | |||
self.fileobj_class = FileObject | |||
|
|||
def _fileobject_returner(self, file_obj: Type['IOBase'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_wrap_fileobject
would be more suitable name?
chainerio/io.py
Outdated
replace the file object returned by underly system. | ||
Derived class overrides this function in order to | ||
add functionalities when needed or match the behaviour. | ||
In the default case, just forward the given file object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The term 'forward' is not common IMO.
chainerio/containers/zip.py
Outdated
if self.zip_file_obj is None: | ||
zip_file = self.base_handler.open(self.base, "rb") | ||
if isinstance(self.base_handler, ZipContainer) \ | ||
and sys.version_info < (3, 7, ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinion was not like this; to reopen B.zip and replace the file object right before opening A in B.zip in C.zip, not when opening B.zip in C.zip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I am thinking in a wrong way.
to reopen B.zip and replace the file object right before opening A in B.zip in C.zip, not when opening B.zip in C.zip.
I don't think this is possible because zip file module doesn't do lazy open, hence they check the seekable upon creating the zipfile.ZipFile object. If the file object passed to the zipfile.ZipFile is not seekable, an exception is thrown at that point. So we need to make the file object seekable at open. But since we are using lazy open for now, the actual open is delayed until user open A in B.zip in C.zip. so the behavior to user matches your expectation for now I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, we will move this issue to another PR
0a3482a
to
1f9d02d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this should be merged after the CI fix.
chainerio/containers/zip.py
Outdated
if self.zip_file_obj is None: | ||
zip_file = self.base_handler.open(self.base, "rb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why this line is moved above. Could you explain your intention?
chainerio/io.py
Outdated
replace the file object returned by underly system. | ||
Derived class overrides this function in order to | ||
add functionalities when needed or match the behaviour. | ||
In the default case, just return the given file object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe "In the default case, it just returns the given file object."
pfnci, test this please |
Successfully created a job for commit ccd5661: |
0d60355
to
eef39df
Compare
This commit implements the redesigned open wrapper, and replace the file object with a `fileobject_returner` to wrap the file object only when necessary. For more details, please refer to (Issue #37) A support for nested zip for Python < 3.7 is added in zipfile.
The readline test in hdfs `test_read_bytes` is removed. The original purpose of this test was to support pickle, while such functionality is already tested in `test_pickle`. Having `readline` when opening with 'rb' is weird.
eef39df
to
30caf4a
Compare
This PR implements the redesigned open wrapper discussed in #37,
and replace the file object with a
fileobject_returner
to wrap the file object only when necessary.
For more details, please refer to (Issue #37)
This solves problem 1 and partially for problem 2 (except for Python <= 3.6) in #37