Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HttpFileWrapper as a separate module #84

Closed
wants to merge 1 commit into from
Closed

Conversation

sigmavirus24
Copy link
Collaborator

This just provides a basic way to wrap a file object so you can force a
specific amount of data to be returned at a minimum.

So far, basic tests at the interactive console have shown it to work as
expected. This still needs real tests, however.

Closes #75

This just provides a basic way to wrap a file object so you can force a
specific amount of data to be returned at a minimum.

So far, basic tests at the interactive console have shown it to work as
expected. This still needs real tests, however.

Closes #75
"""
force_size = self._force_read_size
read = self._file_object.read
if size == -1 or size == 0 or size > force_size:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the if size in (-1, 0) or size > force_size: idiom here.

@Lukasa
Copy link
Member

Lukasa commented May 2, 2015

Proposed alternative names:

[18:38:57] lukasa_home: May as well rename it the 'ScrewYouIWontDoWhatYouTellMe' wrapper
[18:39:25] sigmavirus24:    ScrewYouIAmAnAdult
[18:39:29] sigmavirus24:    IThrewItOnTheGround
[18:39:57] lukasa_home: HttplibSucksWrapper
[18:40:04] sigmavirus24:    BreakAllExpectationsWrapper
[18:40:09] lukasa_home: YoloWrapper
[18:40:12] sigmavirus24:    WhyYouDoThisWrapper
[18:40:23] lukasa_home: ItsMoreASetOfGuidelinesWrapper
[18:40:57] sigmavirus24:    NeedForSpeedWrapper
[18:41:26] sigmavirus24:    FastererererWrapper
[18:41:54] lukasa_home: ThisIsWhatHappensWhenYouDontConfirmYourConstraintsWrapper
[18:42:46] sigmavirus24:    GiantFileUploadFastererWrapper
[18:43:05] sigmavirus24:    TakeLessThanAnHourToUploadSomethingRidiculouslyLargeWrapper
[18:43:17] lukasa_home: BoyIHopeYou'reUsingTheSocketAPIProperlyWrapper

@sigmavirus24
Copy link
Collaborator Author

Or we could just make this into a documentation chapter instead of bothering to write it for someone and distribute it.

@sigmavirus24
Copy link
Collaborator Author

More names

12:40.57  sigmavirus>NeedForSpeedWrapper
12:41.25  sigmavirus>FastererererWrapper
12:41.55  lukasa_hom ThisIsWhatHappensWhenYouDontConfirmYourConstraintsWrapper
12:42.46  sigmavirus>GiantFileUploadFastererWrapper
12:43.05  sigmavirus>TakeLessThanAnHourToUploadSomethingRidiculouslyLargeWrapper
12:43.17  lukasa_hom BoyIHopeYou'reUsingTheSocketAPIProperlyWrapper

@lifeofdave
Copy link

Hey sigmavirus, thanks for pointing this out. Had a quick test with this but it seemed to make performance a bit to a lot worse, depending on the chunk size. Would I need any changes on the django backend to take advantage of this?

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

@lifeofdave What chunk sizes were you using?

@lifeofdave
Copy link

@Lukasa After retesting today it turns out I was incorrect, the File Wrapper doesn't make transfer speeds worse (don't know why that seemed to be the case yesterday), but also it doesn't improve them.

I've been testing with a 5MB file, and code looks like this:

chunk_size = 16384
m = MultipartEncoder(fields={'myfile': (fname, HttpFileWrapper(open(fname, 'rb'),chunk_size))})
fup = self.session.post(self.SITE_URL + 'api/uploads/', data=m, headers={'Content-Type':m.content_type},)

Results:
Not using HttpFileWrapper:
http: 3.9 seconds
https: 10.8 seconds

Using HttpFileWrapper:
chunk_size = 8192
http: 4 seconds
https: 10.4 seconds

chunk_size = 16384
http: 3.8 seconds
https: 10.7 seconds

chunk_size = 32768
http: 3.8 seconds
https: 10.9 seconds

chunk_size = 65536
http: 4 seconds
https: 10.4

Using curl
http: ~4 seconds
https: ~4 seconds

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

Given that our speeds are about the same as curl's with HTTP, the problem is unlikely to be in the read() logic in httplib. Instead, the problem seems to be in the interface with OpenSSL. What platform are you on?

@lifeofdave
Copy link

Windows 7, python 2.7.10 64 bit

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

So the first thing to note is that your curl probably isn't using OpenSSL at all: what's the output of curl --version?

@lifeofdave
Copy link

$ curl --version
curl 7.47.1 (x86_64-unknown-cygwin) libcurl/7.47.1 OpenSSL/1.0.2f zlib/1.2.8 libidn/1.29 libpsl/0.12.0 (+libidn/1.29) libssh2/1.5.0 nghttp2/1.6.0
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: Debug IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets Metalink PSL

Thanks

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

Ohh, but it's a cygwin thing, so it is linked against OpenSSL. That's tough.
Can you also show me the output of pip freeze please?

@lifeofdave
Copy link

pip freeze outputs nothing

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

Uh..where are you getting your copy of requests from?

@lifeofdave
Copy link

I'm only using cygwin for curl, I've installed requests using the pycharm package manager, which I assume is the same as pip install requests. Requests version is 2.9.1, requests-toolbelt is 0.6.0. Sorry if that wasn't clear!

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

Hmmm. Well, in that case you're using the standard library's SSL module, which is presumably the source of the majority of your slowdown. If you'd like to use something like cprofile to get a feel for where you're spending most of your time, that's probably a good idea.

@lifeofdave
Copy link

Here's the openssl version (if it helps):

import ssl
print ssl.OPENSSL_VERSION
OpenSSL 1.0.2a 19 Mar 2015

Is it simple to use a different SSL module?
I'll check out cprofile

@Lukasa
Copy link
Member

Lukasa commented Mar 2, 2016

You can try using PyOpenSSL instead: if you install pyopenssl, ndg-httpsclient and pyasn1 requests will automatically switch to using that.

@lifeofdave
Copy link

Same results with PyOpenSSL, I'll have a look into cprofile. Thanks so much for your help..

@sigmavirus24 sigmavirus24 deleted the bug/75 branch September 2, 2016 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow user to override httplib's default 8192 read size
3 participants