
[gridfs] more closely implement io.IOBase #387


Closed


terencehonles
Contributor

@terencehonles terencehonles commented Nov 30, 2018

PYTHON-1695

I'm testing a move of our system to Python 3.7. In one place we use ZipFile on a GridOut instance. Because seekable() is defined as part of the IO API, the ZipFile implementation assumes the method exists and does not handle the AttributeError raised when accessing it on GridOut.

This appears to be a change in the ZipFile implementation between Python 3.6 and Python 3.7.
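The failure mode described above can be sketched with the standard library alone. `MinimalReader` and `SeekableReader` are hypothetical stand-ins for a GridOut-like object (not PyMongo classes), and the exact point where `zipfile` touches `seekable` is an implementation detail that may differ between Python versions, so the sketch records the outcome instead of assuming it.

```python
import io
import zipfile


class MinimalReader:
    """Hypothetical file-like object exposing only read/seek/tell."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)

    def seek(self, pos, whence=0):
        return self._buf.seek(pos, whence)

    def tell(self):
        return self._buf.tell()


class SeekableReader(MinimalReader):
    """Same object with the one extra method that zipfile looks up."""

    def seekable(self):
        return True


def make_zip_bytes():
    # Build a small archive in memory to read back.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", b"hello gridfs")
    return buf.getvalue()


def read_member(reader):
    with zipfile.ZipFile(reader) as zf:
        return zf.read("hello.txt")


ZIP_BYTES = make_zip_bytes()

# Without seekable(), reading a member may raise AttributeError
# (the behavior reported in this PR for Python 3.7).
try:
    read_member(MinimalReader(ZIP_BYTES))
    minimal_ok = True
except AttributeError:
    minimal_ok = False

# With seekable() defined, the member is read normally.
content = read_member(SeekableReader(ZIP_BYTES))
print(minimal_ok, content)
```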

The GridOut implementation is relatively close to implementing io.IOBase, so I figured I would close the gap, have it subclass io.IOBase, and also note where it diverges from the io.IOBase behavior.

I don't use ZipFile against a GridIn instance, and GridIn diverges from io.IOBase in that its write method doesn't return the number of bytes written, so I refrained from making it a subclass of io.IOBase.

@terencehonles
Contributor Author

Not sure how I should follow up on this...

Anyone? @behackett ?

@ShaneHarvey
Member

@terencehonles we are interested in supporting ZipFile with GridOut/GridIn, but I'm not sure we want to claim full support for io.IOBase yet. Can you add a test in test/test_grid_file.py that uses ZipFile with GridOut and only add the methods required for it to work? For example, I don't think we should add the readlines method or inherit from io.IOBase.

> I don't use ZipFile against a GridIn instance and it diverges from io.IOBase in that it doesn't return the number of bytes written so I refrained from making it a subclass of io.IOBase

That's interesting... perhaps we can just start returning the number of bytes written? The docstring says "Write data to the file. There is no return value." but I can't find any explanation for why we don't return anything. I'll open a new ticket to investigate this problem separately.
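For reference, the convention being discussed is that standard `io` writers return the number of bytes written. The sketch below contrasts that with a writer that returns nothing. `SilentSink` and `CountingWriter` are hypothetical names introduced only for illustration; they are not part of PyMongo, and the adapter is just one possible way to satisfy callers that rely on the return value.

```python
import io


class SilentSink:
    """Hypothetical writer whose write() has no return value."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))


class CountingWriter:
    """Hypothetical adapter that reports the number of bytes written."""

    def __init__(self, target):
        self._target = target

    def write(self, data):
        self._target.write(data)
        return len(data)


# Standard-library writers return the byte count.
stdlib_count = io.BytesIO().write(b"abc")

# The adapter adds the same return value on top of a silent writer.
sink = SilentSink()
adapted_count = CountingWriter(sink).write(b"abcde")
print(stdlib_count, adapted_count)
```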

@ShaneHarvey ShaneHarvey self-requested a review March 12, 2019 20:19
@terencehonles
Contributor Author

@ShaneHarvey I can add the test and remove io.IOBase as a parent, but is there a strong reason I shouldn't add readlines?

You're basically asking me to stay close to io.IOBase's interface but also diverge from it. I can look at what ZipFile uses and make sure only those methods are implemented, but if something changes in the future, or if someone else expects GridOut to behave like a file, this will likely result in another PR pushing GridOut closer to io.IOBase. I therefore suggest we give GridOut roughly the same interface as io.IOBase but not inherit from it at this time, because GridOut.__next__ has an implementation incompatible with io.IOBase (which is the only incompatibility).

I'd go further and suggest we add a deprecation warning and a "future flag" that lets users opt in to a "future" GridOut.__next__ implementation matching io.IOBase, so that GridOut can eventually subclass io.IOBase.
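The thread does not spell out what the `__next__` incompatibility is. The sketch below assumes it is that iterating a GridOut yields chunk-sized pieces of data, whereas iterating an `io.IOBase` object yields lines. `iter_chunks` is a hypothetical helper standing in for chunk-based iteration, and the chunk size of 8 is arbitrary.

```python
import io


def iter_chunks(stream, chunk_size):
    """Hypothetical chunk-based iteration: fixed-size pieces, ignoring newlines."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


data = b"first line\nsecond line\n"

# io.IOBase iteration: one item per line.
lines = list(io.BytesIO(data))

# Chunk-based iteration: items may start or end mid-line.
chunks = list(iter_chunks(io.BytesIO(data), 8))

print(lines)
print(chunks)
```

Code written against one behavior, for example a loop that expects each item to end with a newline, would misbehave under the other, which is why an opt-in transition is suggested above.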

@ShaneHarvey
Member

> You're basically asking me to stay close to io.IOBase's interface but also diverge from it. I can look at what ZipFile uses and make sure only those methods are implemented, but if something changes in the future, or if someone else expects GridOut to behave like a file, this will likely result in another PR pushing GridOut closer to io.IOBase.

Yes, that's what I'm asking. We don't want to claim we support io.IOBase yet because there are differences in behavior. We can, however, test that GridOut works with ZipFile; that's a much smaller change and much easier to test and review :). It also gives us a good starting point for discovering what other changes are needed to fully support io.IOBase in the future. Does that make sense?

@ShaneHarvey
Member

> is there a strong reason I shouldn't add readlines?

We have not had any requests to add readlines and ZipFile does not use it. Do you have an actual use-case for it? Also, I believe IOBase already has an implementation of readlines so if we inherit from IOBase we get it for free.

@terencehonles
Contributor Author

We don't actually use the hint argument of readlines, so it's sufficient for us to call .read() and split the result ourselves. And if GridOut does eventually subclass io.IOBase, it will be easier not to have code that could diverge before that happens.
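The "read and split ourselves" approach might look like the sketch below. `read_lines` is a hypothetical helper name, and `io.BytesIO` stands in for the file object. Note that `bytes.splitlines` also breaks on `\r` and `\r\n`, so the result only matches `readlines()` on a binary stream when the data uses `\n` line endings.

```python
import io


def read_lines(stream):
    """Read everything, then split into lines, keeping the line endings."""
    return stream.read().splitlines(keepends=True)


data = b"a\nb\nc"
split_result = read_lines(io.BytesIO(data))
readlines_result = io.BytesIO(data).readlines()
print(split_result == readlines_result)
```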

I left a Sphinx comment in __iter__ explaining the difference (so it should only show up in the source).

ShaneHarvey pushed a commit that referenced this pull request Mar 28, 2019
Allows GridOut to be wrapped with zipfile.ZipFile from the stdlib.
ShaneHarvey pushed a commit that referenced this pull request Mar 28, 2019
Allows GridOut to be wrapped with zipfile.ZipFile from the stdlib.

(cherry picked from commit 481600b)
Member

@ShaneHarvey ShaneHarvey left a comment


Thanks! Merged in 481600b and 8cecd8e.
