POST a Multipart-Encoded File with streaming #1584

Closed
robi-wan opened this issue Sep 9, 2013 · 12 comments

Comments

@robi-wan

robi-wan commented Sep 9, 2013

We have a web application which accepts (large) files together with some metadata.
We built a form for uploading these kinds of files, and then added a REST interface to this form.
I built an upload task in our fabfile which essentially does this:

    with open(filename, 'rb') as f:
        response = requests.post(url, data=values, files={'file': f})

It seems that streaming does not work for multipart-encoded POST requests, because I get this error now that our files exceed 128 MB in size:

Upload artifact 'artifact.tar.gz' (group, SNAPSHOT) running on http://<ip> on behalf of user 'fabric'.
2aeaa77d0ac917a28e15ec73fe92e060 *artifact.tar.gz
Traceback (most recent call last):
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\fabric\main.py", line 743, in main
    *args, **kwargs
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\fabric\tasks.py", line 405, in execute
    results['<local-only>'] = task.run(*args, **new_kwargs)
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\fabric\tasks.py", line 171, in run
    return self.wrapped(*args, **kwargs)
  File "C:\development\work\avatar_herbsting_2013\fabfile.py", line 480, in upload
    response = upload_artifact(**data)
  File "C:\development\work\avatar_herbsting_2013\scripts\fabfile\deploy.py", line 106, in upload_artifact
    response = requests.post(url, data=values, files={'file': f})
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\requests\api.py", line 88, in post
    return request('post', url, data=data, **kwargs)
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\requests\api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\requests\sessions.py", line 335, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\requests\sessions.py", line 438, in send
    r = adapter.send(request, **kwargs)
  File "C:\Documents and Settings\Administrator\.virtualenvs\avatar\lib\site-packages\requests\adapters.py", line 327, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='<ip>', port=80): Max retries exceeded with url: /deliver/upload/ (Caused by <class 'socket.error'>: [Errno 10055] An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full)

Happens on Windows XP SP3 32bit with 2 GB RAM and Windows Server 2003 R2 SP2 with 3 GB RAM.
Python 2.7.5 32bit.
requests 1.2.3

Full code (filename contains the path to the large file I want to upload):

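import urlparse  # Python 2 module (urllib.parse in Python 3)

import requests

# md5sum() is a helper defined elsewhere (not shown here).


def upload_artifact(address, filename, version, group, username):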
    """Upload an artifact.
    """
    path = 'deliver/upload/'
    url = urlparse.urljoin(address, path)

    # get id for group
    url_group_id = urlparse.urljoin(address, 'deliver/groupidbyname/?groupname={}'.format(group))
    response = requests.get(url_group_id)
    group_id = response.text

    # upload file
    values = {'md5hash': md5sum(filename),
              'group': group_id,
              'version': version,
              'username': username,
              }

    with open(filename, 'rb') as f:
        response = requests.post(url, data=values, files={'file': f})

    return response

Is there a way to enable streaming of the large file for this case?

@Lukasa
Member

Lukasa commented Sep 9, 2013

Hi @robi-wan, thanks for raising this issue!

Firstly, I'll address the question you actually asked. Currently Requests does not support streaming multipart-encoded files. If you want to do this you'll need to provide a file-like object wrapper for your file that does the multipart encoding itself, and then pass that wrapper as the data argument as described here.
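As a rough illustration of what "doing the multipart encoding yourself" can look like, here is a minimal standard-library sketch. It is not code from this thread or from Requests: `multipart_stream` and its parameters are hypothetical names, and it swaps the file-like wrapper for a generator. When a generator is passed as `data`, Requests sends the body with chunked transfer encoding, so this only helps if the server accepts chunked uploads. Field names and the filename are not escaped, so the sketch assumes they contain no quotes or line breaks.

```python
import uuid


def multipart_stream(fields, file_field, fileobj, filename,
                     chunk_size=64 * 1024, boundary=None):
    """Return (content_type, generator) for a multipart/form-data body.

    The file is read chunk_size bytes at a time, so the whole body is
    never held in memory at once.
    """
    boundary = boundary or uuid.uuid4().hex
    content_type = "multipart/form-data; boundary=" + boundary

    def generate():
        # One part per plain form field.
        for name, value in fields.items():
            yield (
                "--{0}\r\n"
                'Content-Disposition: form-data; name="{1}"\r\n'
                "\r\n"
                "{2}\r\n"
            ).format(boundary, name, value).encode("utf-8")

        # Headers of the file part, followed by the raw file bytes.
        yield (
            "--{0}\r\n"
            'Content-Disposition: form-data; name="{1}"; filename="{2}"\r\n'
            "Content-Type: application/octet-stream\r\n"
            "\r\n"
        ).format(boundary, file_field, filename).encode("utf-8")
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk

        # Closing boundary.
        yield "\r\n--{0}--\r\n".format(boundary).encode("utf-8")

    return content_type, generate()


# Possible usage with Requests (not executed here):
#
#     with open(filename, "rb") as f:
#         content_type, body = multipart_stream(values, "file", f, "artifact.tar.gz")
#         requests.post(url, data=body, headers={"Content-Type": content_type})
```

Because the generator has no known length, no Content-Length header is sent; a server that insists on one would need a file-like wrapper that can report its size instead.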

Secondly, I'll address your actual problem. Your specific error is the Winsock error WSAENOBUFS. It should not be easy to hit this error in Requests because we use blocking sockets, which ought to block until there is sufficient buffer space available. You don't appear to be running out of memory in your process, so I don't think the file size itself has anything to do with this problem.

I'm going to take an educated guess and say that you're running out of ephemeral ports. By default, these versions of Windows only hand out ephemeral ports up to port 5000 (the range 1025 to 5000, so fewer than 4000 ports): sufficiently many long-running uploads could exhaust the supply and cause this error. Does that sound possible in your case? If so, take a look here.
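To make the ephemeral-port point concrete, the following standard-library sketch (not from the thread; `open_connections` is a hypothetical helper) opens several simultaneous connections to a local listener and collects the client-side port of each. Every connection that is open at the same time holds its own local port, which is why enough concurrent or lingering connections can use up a small port range.

```python
import socket


def open_connections(n):
    """Open n simultaneous TCP connections to a local listener.

    Returns the client-side (ephemeral) port used by each connection.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(n)
        for _ in range(n):
            client = socket.create_connection(server.getsockname())
            clients.append(client)
            conn, _addr = server.accept()
            accepted.append(conn)
        return [client.getsockname()[1] for client in clients]
    finally:
        # Close everything so the ports are released again.
        for sock in clients + accepted + [server]:
            sock.close()


if __name__ == "__main__":
    ports = open_connections(5)
    print(ports)                  # five different port numbers
    print(len(set(ports)) == 5)   # each connection needed its own port
```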

@avallen

avallen commented Sep 10, 2013

Hi @Lukasa and @robi-wan, I'm delighted to see this question and to find it already answered by you. I just happened to hit the same issue with the missing streaming functionality for multipart file uploads.

I suggest you look at the poster module. I used it to implement the upload in a script that otherwise uses requests. I implemented the whole upload request with poster and urllib2, but it might also be possible to use it to prepare the file-like data argument for requests.

Support for this kind of streaming upload was a prime reason for going with requests, so I was disappointed to hit the NotImplementedError raised by PreparedRequest.prepare_body when files is provided together with a streamed data argument. This could be made clearer in the documentation.

@Lukasa
Member

Lukasa commented Sep 10, 2013

I agree that we could better document this behaviour. =)

I'm also wondering whether it's worth having a semi-official Requests-y way of better handling complex file behaviours.

@robi-wan
Author

Hi @Lukasa thanks for your quick response. I read the Microsoft Knowledge Base Article and tried the suggested solution without success.

@avallen In the meantime I found poster and used it to solve this problem:

import urllib2
import urlparse

import poster.encode
import poster.streaminghttp
import requests


def upload_artifact(address, filename, version, group, username):
    """Upload an artifact.
    """
    path = 'deliver/upload/'
    url = urlparse.urljoin(address, path)

    # get id for group
    url_group_id = urlparse.urljoin(address, 'deliver/groupidbyname/?groupname={}'.format(group))
    response = requests.get(url_group_id)
    group_id = response.text

    # Register the streaming http handlers with urllib2
    poster.streaminghttp.register_openers()

    resp = None
    # Keep the file open only for the duration of the upload
    with open(filename, 'rb') as f:
        # upload file
        values = {'md5hash': md5sum(filename),
                  'group': group_id,
                  'version': version,
                  'username': username,
                  'file': f,
                  }

        # Start the multipart/form-data encoding of the file filename.
        # headers contains the necessary Content-Type and Content-Length
        # datagen is a generator object that yields the encoded parameters
        datagen, headers = poster.encode.multipart_encode(values)
        # Create the Request object
        request = urllib2.Request(url, datagen, headers)
        try:
            # Actually do the request, and get the response
            resp = urllib2.urlopen(request)
        except urllib2.HTTPError as error:
            print(error)
            print(error.fp.read())

    return resp

This works... but I would like to use requests for tasks like this.

@Lukasa
Member

Lukasa commented Oct 14, 2013

It's difficult to answer that question because we don't know what you're trying to do. What are you hoping to achieve?

@sigmavirus24
Contributor

@bernardolima how is your question relevant to this issue? Your question should be asked on StackOverflow. I have a strong suspicion as to why your POST is not working. I'll answer you either on StackOverflow or by email (if you choose to email me privately).

@4n0n1mo

4n0n1mo commented Oct 15, 2013

@sigmavirus24 sorry, you're right, I will email you, if you don't mind.
Thank you very much.

@sigmavirus24
Contributor

I don't mind. That's why I suggested it. 😉

@Lukasa
Member

Lukasa commented Feb 3, 2014

I think the semi-official way to do this is to use sigmavirus24/requests-toolbelt, so I'm going to close this now. =)
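For readers landing here, a streaming multipart upload with requests-toolbelt might look roughly like the sketch below. It assumes the toolbelt's `MultipartEncoder` class and reuses the field name `file` from the original post; `build_fields` and `upload_artifact_streaming` are hypothetical helper names, not part of either library. Plain form values are converted to strings because the encoder expects string or tuple field values.

```python
import os


def build_fields(values, filename, fileobj):
    """Combine plain form values and an open file into encoder fields."""
    fields = {name: str(value) for name, value in values.items()}
    # (filename, file object, content type) describes the file part.
    fields["file"] = (os.path.basename(filename), fileobj,
                      "application/octet-stream")
    return fields


def upload_artifact_streaming(url, values, filename):
    """POST values plus the file at filename without loading it into memory."""
    # Third-party imports are kept local so build_fields has no dependencies.
    import requests
    from requests_toolbelt import MultipartEncoder

    with open(filename, "rb") as f:
        encoder = MultipartEncoder(fields=build_fields(values, filename, f))
        # The encoder is read in pieces by Requests while sending; its
        # content_type carries the multipart boundary.
        return requests.post(url, data=encoder,
                             headers={"Content-Type": encoder.content_type})
```

A call such as `upload_artifact_streaming(url, values, filename)` would then stand in for the `requests.post(url, data=values, files={'file': f})` line from the original report.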

@Lukasa Lukasa closed this as completed Feb 3, 2014
@sigmavirus24
Contributor

"semi-official" is very accurate.

@Cyxapic

This comment has been minimized.

@firestalk

@Cyxapic wrong repo =)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 7, 2021