Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factor out HTTP(S)Connection -> ConnectionCls, and cleanup. #254

Merged
merged 6 commits into from
Oct 16, 2013

Conversation

shazow
Copy link
Member

@shazow shazow commented Oct 4, 2013

Also removed the need to None-ify global ssl module in tests.

This should address #253 for now. @honzakral, thoughts?

Also, @kevinburke, can you comment on the lines I commented out in test/with_dummyserver/test_https.py? Ctrl+f "kevinburke". :)

Also removed the need to None-ify global `ssl` module in tests.
@honzakral
Copy link

Thanks for the super quick reaction!

This will definitely make my life easier, just wondering how people using PoolManager (also users of requests) would use this - they would either have to pass in two different (http and https) connection classes or have some other mechanism; or just say that this functionality is only available at the lower level.

Since this functionality is mainly going to be used to modify the socket options it might be better served by more specific code (hook to configure socket/nodelay flag) built on top of this code but still part of urllib3.

Thanks again and let me know if there is anything I can help with.

@shazow
Copy link
Member Author

shazow commented Oct 4, 2013

While I agree that in this scenario a hook registry would be the most convenient, I'm reluctant to introduce hooks within urllib3 for two reasons:

  1. We don't use hooks anywhere else. I'd rather not introduce a new API and design pattern just for a single edge case. If we decide to got that route, I'd rather redesign most of the scenarios to work around hooks as well.
  2. I have a feeling that introducing a hooks registry into urllib3 will veer the design of the library into something that it isn't (namely, more into the land of requests). urllib3 is meant to be a low-ish level http library, specializing in being a foundation for other libraries (rather than focusing on end-user application scenarios). With that in mind, I think it's ideal for the parent library to supply its own Connection types rather than registering hooks which feels much more end-user runtime-y.

And then the other consideration is making those socket opts being default. Do you have any argument for why that should be done in urllib3 but not in the Python stdlib?

@honzakral
Copy link

You are absolutely right that hook registry makes little sense. I was just thinking out loud, not trying to push for any particular solution, because I feel that this should be a common usecase (httplib2 does this as default) and should be either on by default or very easy to just enable everywhere.

Also this is a very low-level feature that would be very hard to do without explicit support from urllib3 - this change is definitely a step in the right direction. I'd just personally like to see more complete solution - for example have not just the option of passing in your own connection class but also have a connection class implementation ready that does enable nodelay and maybe even a flag that would use this class.

That's my personal opinion however and I am not pushing for it too hard, I will be happy with any support you can give me. After all I chose urllib3 because I liked the design so I obivously trust your opinion on these things.

My only reason for trying to get it to urllib3 instead of stdlib is time - I do believe this would make sense in stdlib (in httplib at least) but that would mean a huge delay before it's usable.

@shazow
Copy link
Member Author

shazow commented Oct 5, 2013

You are absolutely right that hook registry makes little sense. I was just thinking out loud, not trying to push for any particular solution, because I feel that this should be a common usecase (httplib2 does this as default) and should be either on by default or very easy to just enable everywhere.

Hrm the fact that httplib2 already hardcodes these options and you believe that it would be reasonable for the stdlib to use them as default is fairly compelling.

Can you think of any scenarios that we need to be aware of where having these options as default would have an undesired effect?

Also this is a very low-level feature that would be very hard to do without explicit support from urllib3 - this change is definitely a step in the right direction. I'd just personally like to see more complete solution - for example have not just the option of passing in your own connection class but also have a connection class implementation ready that does enable nodelay and maybe even a flag that would use this class.

I completely agree. My long-term desire is to completely drop our dependency on the httplib stdlib by including our own http spec state machine (that can be swapped for any other protocol) and working with sockets directly within our more-generic connection pools. This will allow for much more configurability. While I'm convinced that we can do this in a fraction of the size of the current httplib, it's still a fairly big undertaking that no one has volunteered for. :)

Btw when do you need this merged? I wouldn't mind waiting for @kevinburke to comment on the tests part in a couple of days, but I can merge earlier if you need this urgently.

@honzakral
Copy link

According to W3C:

When the segments happens to lign up with an even set of segments, the
presence of Nagle's algorithm doesn't make any difference in the traces.
However, when there is an odd, short final segment, then the last ACK from the
client is delayed up to 200ms.

There seems to be no adverse effects to disabling Nagle's algorithm and the positive effect can be quite profound (200ms is a long time).

I am in no hurry to implement and merge this in urllib3, I can always just ship the ugly work around, it's only in one place for me so it shouldn't hurt. It can definitely wait a couple days no problem, especially if I can get TCP_NODELAY directly from urllib3 :)

@shazow
Copy link
Member Author

shazow commented Oct 5, 2013

@t-8ch Any insight on whether making TCP_NODELAY default may be a bad idea? :D

@t-8ch
Copy link
Contributor

t-8ch commented Oct 5, 2013

No idea. Looks harmless. As we are sending data in larger chunks this sounds reasonable.

if not self.ConnectionCls or self.ConnectionCls is DummyConnection:
# Platform-specific: Python without ssl
raise SSLError("Can't connect to HTTPS URL because the SSL "
"module is not available.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was the error message originally, but is there any resource we can/would want to point the user to if they hit this error?

@kevinburke
Copy link
Contributor

This looks good to me. Sorry @shazow for the delay in getting back to you.

@shazow
Copy link
Member Author

shazow commented Oct 16, 2013

I always regret rebasing PR's... dunno why I do it.

shazow added a commit that referenced this pull request Oct 16, 2013
Factor out HTTP(S)Connection -> ConnectionCls, and cleanup.
@shazow shazow merged commit 2948607 into master Oct 16, 2013
@shazow
Copy link
Member Author

shazow commented Oct 16, 2013

I'm actually convinced that disabling Nagle by default is a Good Idea. If anyone wants to attempt it, I started a branch at #259 but I don't think I'll have time to finish it. Feel free to take a crack at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants