Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uploading a file b0rken under Python3 #148

Closed
tseaver opened this issue Jun 6, 2017 · 10 comments
Closed

Uploading a file b0rken under Python3 #148

tseaver opened this issue Jun 6, 2017 · 10 comments
Labels
Milestone

Comments

@tseaver
Copy link
Member

tseaver commented Jun 6, 2017

  1. From the ZMI "Add" menu, select "File".
  2. Leave id and title blank, and select a file using the file browser.
  3. Click "Add".

Traceback:

 Traceback (most recent call last):
  File "/tmp/z2py36/lib/python3.6/site-packages/waitress/channel.py", line 338, in service
    task.service()
  File "/tmp/z2py36/lib/python3.6/site-packages/waitress/task.py", line 169, in service
    self.execute()
  File "/tmp/z2py36/lib/python3.6/site-packages/waitress/task.py", line 399, in execute
    app_iter = self.channel.server.application(env, start_response)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/httpexceptions.py", line 33, in __call__
    return self.application(environ, start_response)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/WSGIPublisher.py", line 238, in publish_module
    request, response, module_info, _publish=_publish)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/WSGIPublisher.py", line 211, in _publish_response
    reraise(exc.__class__, exc, sys.exc_info()[2])
  File "/tmp/z2py36/lib/python3.6/site-packages/six.py", line 686, in reraise
    raise value
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/WSGIPublisher.py", line 172, in _publish_response
    response = _publish(request, module_info)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/WSGIPublisher.py", line 162, in publish
    bind=1)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/mapply.py", line 85, in mapply
    return debug(object, args, context)
  File "/tmp/z2py36/lib/python3.6/site-packages/ZPublisher/WSGIPublisher.py", line 57, in call_object
    return obj(*args)
  File "/tmp/z2py36/lib/python3.6/site-packages/OFS/Image.py", line 83, in manage_addFile
    newFile.manage_upload(file)
  File "/tmp/z2py36/lib/python3.6/site-packages/OFS/Image.py", line 506, in manage_upload
    data, size = self._read_data(file)
  File "/tmp/z2py36/lib/python3.6/site-packages/OFS/Image.py", line 554, in _read_data
    seek(0, 2)
  File "/tmp/z2py36/lib/python3.6/tempfile.py", line 483, in func_wrapper
    return func(*args, **kwargs)
ValueError: seek of closed file
@tseaver
Copy link
Member Author

tseaver commented Jun 6, 2017

Unsurprisingly, the same error occurs if I first add the file with id and title set, but no upload, and then attempt the upload from its "Edit" tab.

@tseaver
Copy link
Member Author

tseaver commented Jun 6, 2017

Also unsurprisingly, Image is b0rken the same way.

@tseaver
Copy link
Member Author

tseaver commented Jun 6, 2017

Likewise PageTemplate.

@hannosch
Copy link
Contributor

hannosch commented Jun 6, 2017

Btw. if you haven't seen it, using runwsgi -dv etc/zope.ini - the -d enables Zope debug mode and disables the catch all part of the httpexceptions middleware, so you get useful tracebacks on the console.

@tseaver
Copy link
Member Author

tseaver commented Jun 6, 2017

Thanks, that is actually what I was using to get the traceback above.

@stephan-hof
Copy link
Member

Let me try to summarize what leads to this error:

  • HTTPRequest.processInputs starts and creates an instance of ZopeFieldStorage
  • This field storage parses the request and stores the data to upload in a temporary file. This temporary file object is also stored in self.file of the field storage
  • processInputs adds later a FileUpload object into the request form. This FileUpload object references now the same temporary file object as ZopeFieldStorage
  • When processInputs ends the field storage goes out of scope and its __del__ method is called. This method closes now the temporary file object, which is still referenced by the FileUpload object in the request form as well.
  • ZPublisher goes ahead and calls Image.manage_addFile which then tries to seek on the closed file object.

Why did this not happen before ?

  • cgi.FieldStorage has no __del__ in python2. It was added in python3.4 python/cpython@f79126f
  • We added a __del__ as well to close the file object. See here 743581a

For a quick verification of my theory I did the following change.

diff --git a/src/ZPublisher/HTTPRequest.py b/src/ZPublisher/HTTPRequest.py
index ab14701db..7d9712cfb 100644
--- a/src/ZPublisher/HTTPRequest.py
+++ b/src/ZPublisher/HTTPRequest.py
@@ -1666,6 +1666,7 @@ class ZopeFieldStorage(FieldStorage):
         return TemporaryFileWrapper(os.fdopen(handle, 'w+b'), name)
 
     def __del__(self):
+        return
         # Only call close on file object, cStringIO/BytesIO objects
         # would be closed too early.
         if isinstance(self.file, TemporaryFileWrapper):

And yes, the error above went away and the image was added successfully.
I'm a bit puzzled why we have to call the close on the TemporaryFileWrapper at all.
When an instance of this is not referenced any more close is called automatically.

@tseaver
Copy link
Member Author

tseaver commented Jul 10, 2017

@hannosch Can you recall why we thought we needed ZopeFieldStorage.__del__ to close the TFW?

@hannosch
Copy link
Contributor

hannosch commented Jul 10, 2017

Can you recall why we thought we needed ZopeFieldStorage.__del__ to close the TFW?

I actually added the ZopeFieldStorage.__del__ to customize the logic. Otherwise we inherit the cgi.FieldStorage version under Python 3.4+

Without overriding __del__ we got even more errors, in the more common case where self.file is a BytesIO instance and not a temporary file.

I didn't go as far as disabling the __del__ completely, as I trusted the Python 3 code to have a good reason to add it. But it looks like that's not really the case.

I wonder if we do get a ResoureWarning for an unclosed file if we disable ZopeFieldStorage.__del__. At least avoiding one of those is the reason in the Python commit message.

@stephan-hof
Copy link
Member

I don't get it why they added __del__ to FieldStorage.
As far as I can see the objects which are assigned to self.file implement __del__ as well to cleanup resources.

So in both cases the garbage collector has the task to identify that they have to be closed. The only difference I can see is that FieldStorage does not emit this ResourceWarning, but the objects assigned to self.file do.

Here is a link to the original issue http://bugs.python.org/issue18394 and the only reason I can see there is "getting rid of the warning"

If you read some of the comments other web frameworks do it like Zope.
Which is, using FieldStorage.file longer than FieldStorage itself. The answer to this was to improve the documentation. python/cpython@c089f70#diff-96bd1b7d95b46f628d77589d74002860 and a context manager.

So right now I see the following solutions:

  • Do it like my diff above. Overriding __del__ of FieldStorage to do nothing. Then rely on the __del__ of self.file to close it.
  • Keep a reference to the FieldStorage for the lifetime of the request. For example:
--- a/src/ZPublisher/HTTPRequest.py
+++ b/src/ZPublisher/HTTPRequest.py
@@ -511,7 +511,9 @@ class HTTPRequest(BaseRequest):
             environ['QUERY_STRING'] = ''
 
         meth = None
-        fs = ZopeFieldStorage(fp=fp, environ=environ, keep_blank_values=1)
+        # Keep a reference to the FieldStorage. Otherwise it's __del__ method is called
+        # early and closing FieldStorage.file
+        fs = self._field_storage = ZopeFieldStorage(fp=fp, environ=environ, keep_blank_values=1)
         if not hasattr(fs, 'list') or fs.list is None:
             if 'HTTP_SOAPACTION' in environ:
                 # Stash XML request for interpretation by a SOAP-aware view
@@ -1665,13 +1667,6 @@ class ZopeFieldStorage(FieldStorage):
         handle, name = mkstemp()
         return TemporaryFileWrapper(os.fdopen(handle, 'w+b'), name)
 
-    def __del__(self):
-        # Only call close on file object, cStringIO/BytesIO objects
-        # would be closed too early.
-        if isinstance(self.file, TemporaryFileWrapper):
-            self.file.close()
-
-

I'm somehow in favour of the second option, because it is closer to what the python people suggest. The first option is like "I know better".
If nobody objects I'll prepare a pull request with the second option.

@hannosch
Copy link
Contributor

I'm somehow in favour of the second option, because it is closer to what the python people suggest. The first option is like "I know better".
If nobody objects I'll prepare a pull request with the second option.

+1, that sounds like the better idea to me. Bottle did the same and keeps the fieldstorage instance around longer (bottlepy/bottle#567).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

No branches or pull requests

4 participants