Fix DTML upload for Python 3. #265
src/OFS/DTMLMethod.py
Outdated
@@ -266,10 +266,12 @@ def manage_upload(self, file='', REQUEST=None):
         if self.wl_isLocked():
             raise ResourceLockedError('This DTML Method is locked.')

-        if not isinstance(file, binary_type):
+        if not isinstance(file, basestring):
basestring is not defined in Python 3.
You are right in general, but it is defined inside the module:
Lines 45 to 46 in bc180d5
if sys.version_info >= (3, ):
    basestring = str
That if sys.version_info check could use the PY3 symbol imported above.
Although it is not completely the same it is close enough, as we will drop Python 2 support before Python 4 comes out. :)
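For reference, a minimal sketch of the two version checks being compared here. It assumes the PY3 symbol is the flag from six, which is defined as sys.version_info[0] == 3; the names at_least_py3 and is_text_like are illustrative and not taken from the module.

```python
import sys

# The check used in the module at the time of this review.
at_least_py3 = sys.version_info >= (3,)

# Assumption: PY3 mirrors six.PY3, i.e. "major version is exactly 3".
PY3 = sys.version_info[0] == 3

if PY3:
    # Python 3 has no basestring builtin; alias it to str so that
    # isinstance checks written for Python 2 keep working.
    basestring = str


def is_text_like(value):
    """Return True for values the aliased basestring check accepts."""
    return isinstance(value, basestring)
```

The two checks agree on every Python 2 and Python 3 release and would only diverge on a hypothetical major version 4. Note that with this alias a bytes value is not a basestring on Python 3.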
src/OFS/DTMLMethod.py
Outdated
if REQUEST and not file:
    raise ValueError('No file specified')
file = file.read()
if PY3:
Shouldn't the conversion binary_type->text_type happen for Python 2 as well? I would assume the file type is always binary_type (in which case the method's default should also be changed to file=b'').
Plus, if the desired output type is text_type, then you might also want to decode inputs that are non-file-like (such as file=b'fnord').
@rbu I think you are generally right. But there are databases out in the world which have stored the file contents as bytes (aka Python 2 str). Can we provide a way to convert their file contents to text, so we do not have to live forever with file contents which might be bytes?
Ok, I see what you're saying. In the old code, Python 2 would always have str byte strings and byte file objects. Those were read, resulting in bytes being saved to ZODB. As I understand it, per the WSGI spec, file will usually be a bytes file object, so the result of read()ing from it needs to be decoded. So the goal of this code is to always save byte strings in PY2 and always save text strings in PY3.
To make that consistent, though, the default method parameter needs to be decoded as well. What do you think of this:
    if not isinstance(file, basestring):
        if REQUEST and not file:
            raise ValueError('No file specified')
        file = file.read()
    if PY3 and isinstance(file, binary_type):
        file = file.decode('utf-8')
You are right: if manage_upload is called with a file parameter which is bytes, it should be decoded, too.
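A self-contained sketch of the behaviour agreed on in this thread: text, bytes, and binary file-like inputs all end up as a native string. The helper name read_upload and the UTF-8 assumption are illustrative only; this is not the code that was merged. It also checks for bytes explicitly, because with basestring aliased to str a bytes argument would otherwise reach the read() call on Python 3.

```python
import io
import sys

PY3 = sys.version_info[0] == 3
binary_type = bytes
text_type = type(u'')  # str on Python 3, unicode on Python 2


def read_upload(file=b'', request=None):
    """Hypothetical helper: return the uploaded content as a native str."""
    if not isinstance(file, (binary_type, text_type)):
        # Mirrors the check in the diff: a form submit without a file fails.
        if request and not file:
            raise ValueError('No file specified')
        file = file.read()
    if PY3 and isinstance(file, binary_type):
        # Assumption: uploaded content is UTF-8 encoded.
        file = file.decode('utf-8')
    return file
```

For example, read_upload(io.BytesIO(b'<dtml-var title>')) and read_upload(b'<dtml-var title>') both return the same text on Python 3, while a str argument is passed through unchanged.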
FYI: I've rebased this on top of master
I changed the behaviour of
@icemac it sounds reasonable that both
src/OFS/DTMLMethod.py
Outdated
@@ -42,7 +44,7 @@
 from OFS.SimpleItem import Item_w__name__
 from ZPublisher.Iterators import IStreamIterator

-if sys.version_info >= (3, ):
+if PY3:
     basestring = str
In the current state, you don't need the basestring symbol anymore.
Good catch.
Overall, this looks good to me
DTML documents need to be stored as native strings ("str" in both Py2 and Py3), which was implemented in #265 for the manage_upload case (create a document, then upload the content). This commit introduces the conditional type conversion for the "add" case as well (create a document with initial uploaded content).
Without this change, uploading a file to a DTMLMethod via the ZMI breaks because self.munge(file) tries to use a string regex on a bytes object (the file variable).
What do you think, is this patch the way to go?
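The failure described above can be reproduced in isolation on Python 3. This is a minimal sketch; the pattern is a hypothetical stand-in, since the actual regex used by munge is not shown in this conversation.

```python
import re

# Hypothetical text pattern standing in for the one munge() applies.
pattern = re.compile(r'<dtml-var\s+(\w+)>')

# What file.read() returns for a binary upload.
data = b'<dtml-var title>'

try:
    pattern.search(data)
    error = None
except TypeError as exc:
    # Python 3 refuses to apply a str pattern to a bytes object.
    error = exc

# Decoding first (assuming UTF-8) makes the same pattern usable.
match = pattern.search(data.decode('utf-8'))
```

Decoding once at the upload boundary keeps the rest of the code working on native strings only.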
Open tasks:
See zopefoundation/Products.SiteErrorLog#11, which uses this method in tests and fails with AttributeError: 'str' object has no attribute 'read' in OFS/DTMLMethod.py:272.