New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to override httplib's default 8192 read size #75

Closed
sigmavirus24 opened this Issue Mar 14, 2015 · 7 comments

Comments

Projects
None yet
3 participants
@sigmavirus24
Member

sigmavirus24 commented Mar 14, 2015

No description provided.

@sigmavirus24

This comment has been minimized.

Member

sigmavirus24 commented Apr 14, 2015

So I didn't add any details to this, but I'm glad I titled this issue the way I did.

Originally, I thought I'd only add this to the MultipartEncoder class that we have but now I'm certain I want to make it a generic thing that can be used with all objects that have a read method.

Here's the idea:

A user has a large file, e.g., 1GB that they want to stream (even 30MB is slow) and they are on a system that isn't resource constrained so they can feel confident increasing the read-size performed by httplib. They would do something like:

import requests
from requests_toolbelt.utils import HttpFileWrapper  # We need a much better name

MB = 1024 ** 2

my_large_file = open('some_large_file', 'rb')
file_obj = HttpFileWrapper(my_large_file, read_size=10 * MB)
requests.post(url, data=file_obj, headers={'Content-Type': 'application/octet-stream'})

This makes the usage of it perfect for the MultipartEncoder and the MultipartEncoderMonitor.

The object will proxy all attribute access to the wrapped file object except the read method. The read method will then call the wrapped file object's read method with the read_size and return that. There's no need for us to perform validation on the return from that read or anything else.

@sigmavirus24 sigmavirus24 self-assigned this Apr 14, 2015

@sigmavirus24 sigmavirus24 added this to the 0.5 milestone Apr 14, 2015

sigmavirus24 added a commit that referenced this issue May 2, 2015

Add HttpFileWrapper as a separate module
This just provides a basic way to wrap a file object so you can force a
specific amount of data to be returned at a minimum.

So far, basic tests at the interactive console have shown it to work as
expected. This still needs real tests, however.

Closes #75
@sigmavirus24

This comment has been minimized.

Member

sigmavirus24 commented Jul 16, 2015

I'm not so enthusiastic about this anymore. I wasn't really enthusiastic in the first place, but I'm especially not ready to encourage this use. As such, I'm closing this as "Wontfix"

@sigmavirus24 sigmavirus24 removed the ready label Jul 16, 2015

@valhallasw

This comment has been minimized.

valhallasw commented Aug 3, 2016

The documentation still links to this issue requesting feedback. I'm not sure if you're still looking for any, but I'll pitch in my 2 cents anyway:

  • the 8192 byte read size of httplib limits my upload speed to 500-600 kB/s
  • forcing more data into httplib seems to work without too many issues; I used the following monkey-patch:
    m = encoder.MultipartEncoderMonitor(e, callback)
    m._read = m.read
    m.read = lambda size: m._read(1024*1024)

which increased the upload rate to ~12MB/s (which is the limit on this connection).

However, I'm not entirely sure how the intermediate layers mess with passing an iterator. In any case, that seems to be a much cleaner option than forcing .read(size) to ignore size.

@sigmavirus24

This comment has been minimized.

Member

sigmavirus24 commented Aug 3, 2016

The intermediate layers do mess with iterators though. That's why this is designed the way it is.

@sigmavirus24

This comment has been minimized.

Member

sigmavirus24 commented Aug 13, 2017

@fjhaveri no, not at all. That's why this is closed. It's not something I want to encourage folks doing, even though I understand why they might want to do it.

@tzickel

This comment has been minimized.

Contributor

tzickel commented Jan 29, 2018

An option to do it in python 2:
https://gist.github.com/tzickel/18f1d6aedf71c82b5f81095c590c529b

>>> import patch_const
>>> import httplib
>>> patch_const.patch_const(httplib.HTTPConnection.send.im_func, 3, 2**16, 8192)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment