New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Timeout class #231

Merged
merged 49 commits into from Sep 11, 2013

Conversation

Projects
None yet
7 participants
@kevinburke
Contributor

kevinburke commented Aug 18, 2013

You can now specify Timeout values with

from urllib3 import util
urlopen(timeout=util.Timeout(connect=5, read=20))

in addition to the usual float/int ways of specifying it.

Also adds a test case for the old and the new way of specifying a timeout.

Kevin Burke added some commits Aug 18, 2013

Kevin Burke
Add rudimentary Timeout class
This is just to get the tests passing, it doesn't change behavior

You can now specify Timeout values with urlopen(timeout=util.Timeout(request=20))
in addition to the usual float/int ways of specifying it.

Also add a test case for the old way of specifying a timeout
Show outdated Hide outdated urllib3/util.py
Kevin Burke
Add an EnhancedHTTPConnection class
We need to override the default connect() method so we can set our own connect
timeout. Also adds two new error classes, ConnectTimeoutError and
RequestTimeoutError

Adds a bunch of tests that total, request and connect timeouts do what they are
expected to do.

Kevin Burke added some commits Aug 18, 2013

Kevin Burke
Update PR with code review feedback
Timeouts are initialized to a sentinel value to allow user to specify
None as the timeout value
Kevin Burke
Add more tests around util.Timeout
Also discard negative values
Kevin Burke
Add better test coverage
I am unsure how to test _tunnel(). Fixes some bugs
uncovered by more robust test coverage
@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 18, 2013

Contributor

I found that timeouts on SSL connections came in a few forms. In the latest pip download the error that matches on a timeout is:

            raise ProxyError('Cannot connect to proxy. '
                             'Socket error: %s.' % e)

In this branch if the read operation times out, a RequestTimeoutError gets raised. However if the read operation starts and does not complete it's a little more ambiguous - was that caused by the timeout on the socket or did either end just decide to close the connection? I'm not sure of the best way to handle it.

Contributor

kevinburke commented Aug 18, 2013

I found that timeouts on SSL connections came in a few forms. In the latest pip download the error that matches on a timeout is:

            raise ProxyError('Cannot connect to proxy. '
                             'Socket error: %s.' % e)

In this branch if the read operation times out, a RequestTimeoutError gets raised. However if the read operation starts and does not complete it's a little more ambiguous - was that caused by the timeout on the socket or did either end just decide to close the connection? I'm not sure of the best way to handle it.

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 18, 2013

Contributor

I'm happy to merge this in with the existing PR, it's up to you which interface we'd like to have for the code.

Contributor

kevinburke commented Aug 18, 2013

I'm happy to merge this in with the existing PR, it's up to you which interface we'd like to have for the code.

log = logging.getLogger('urllib3.connectionpool')
log.setLevel(logging.NOTSET)
log.addHandler(logging.StreamHandler(sys.stdout))
# We need a host that will not immediately close the connection with a TCP
# Reset. SO suggests this hostname
TARPIT_HOST = '10.255.255.1'

This comment has been minimized.

@shazow

shazow Aug 20, 2013

Collaborator

Tests must run offline, please avoid external hosts.

@shazow

shazow Aug 20, 2013

Collaborator

Tests must run offline, please avoid external hosts.

This comment has been minimized.

@pasha-r

pasha-r Aug 20, 2013

Connection timeout must be tested somehow and the best way is to use some non-existent IP.
I ran into this problem myself and solved it like this:
pasha-r@4a3b45e#L2R79
Which is essentially the same solution.

@pasha-r

pasha-r Aug 20, 2013

Connection timeout must be tested somehow and the best way is to use some non-existent IP.
I ran into this problem myself and solved it like this:
pasha-r@4a3b45e#L2R79
Which is essentially the same solution.

This comment has been minimized.

@kevinburke

kevinburke Aug 20, 2013

Contributor

When you connect to a port on loopback that's not open, the operating system immediately sends a TCP RST packet. Connecting to a 192.168 or a 10.* host does not yield the same reset.

I will look into a mock TCP server that never ACK's. Alternately we could mock out the call to connect() and have the call just sleep, and verify that the connect timeout does what it is supposed to do.

Sample tcpdump traffic on lo0 from a host that's not open (notice the [R.] packet response)

09:36:15.921174 IP6 (hlim 64, next-header TCP (6) payload length: 44) fe80::1.55621 > fe80::1.avt-profile-2: Flags [S], cksum 0xfd35 (incorrect -> 0x1e2f), seq 533239964, win 65535, options [mss 16324,nop,wscale 4,nop,nop,TS val 643765797 ecr 0,sackOK,eol], length 0
09:36:15.921191 IP6 (hlim 64, next-header TCP (6) payload length: 20) fe80::1.avt-profile-2 > fe80::1.55621: Flags [R.], cksum 0xfd1d (incorrect -> 0x0d95), seq 0, ack 533239965, win 0, length 0 

Sample tcpdump from a 192.168.* host:

09:41:33.888484 IP (tos 0x10, ttl 64, id 20873, offset 0, flags [DF], proto TCP (6), length 64)
kburke.55711 > 192.168.5.6.avt-profile-2: Flags [S], cksum 0xa14e (correct), seq 2132708649, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 644083291 ecr 0,sackOK,eol], length 0
@kevinburke

kevinburke Aug 20, 2013

Contributor

When you connect to a port on loopback that's not open, the operating system immediately sends a TCP RST packet. Connecting to a 192.168 or a 10.* host does not yield the same reset.

I will look into a mock TCP server that never ACK's. Alternately we could mock out the call to connect() and have the call just sleep, and verify that the connect timeout does what it is supposed to do.

Sample tcpdump traffic on lo0 from a host that's not open (notice the [R.] packet response)

09:36:15.921174 IP6 (hlim 64, next-header TCP (6) payload length: 44) fe80::1.55621 > fe80::1.avt-profile-2: Flags [S], cksum 0xfd35 (incorrect -> 0x1e2f), seq 533239964, win 65535, options [mss 16324,nop,wscale 4,nop,nop,TS val 643765797 ecr 0,sackOK,eol], length 0
09:36:15.921191 IP6 (hlim 64, next-header TCP (6) payload length: 20) fe80::1.avt-profile-2 > fe80::1.55621: Flags [R.], cksum 0xfd1d (incorrect -> 0x0d95), seq 0, ack 533239965, win 0, length 0 

Sample tcpdump from a 192.168.* host:

09:41:33.888484 IP (tos 0x10, ttl 64, id 20873, offset 0, flags [DF], proto TCP (6), length 64)
kburke.55711 > 192.168.5.6.avt-profile-2: Flags [S], cksum 0xa14e (correct), seq 2132708649, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 644083291 ecr 0,sackOK,eol], length 0

This comment has been minimized.

@shazow

shazow Aug 20, 2013

Collaborator

TIL that `10.255.255.1' is a special non-routable address. I presume this will work while you're offline as well, so that's fine with me. :)

@shazow

shazow Aug 20, 2013

Collaborator

TIL that `10.255.255.1' is a special non-routable address. I presume this will work while you're offline as well, so that's fine with me. :)

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Aug 20, 2013

Collaborator

I do like where you're going with this. I think this is the "next step" of what @pasha-r was working on. Might be worth leapfrogging straight to the next step if you think we can cover all the same functionality.

Ideally I'd love to have both of you collaborate on this, but I'm not sure how much time @pasha-r has (can you weigh in?).

Would be feasible for you to send your changes against the #145 pull request? I think you have a lot of overlapping stuff that could be reused (such as tests).

Collaborator

shazow commented Aug 20, 2013

I do like where you're going with this. I think this is the "next step" of what @pasha-r was working on. Might be worth leapfrogging straight to the next step if you think we can cover all the same functionality.

Ideally I'd love to have both of you collaborate on this, but I'm not sure how much time @pasha-r has (can you weigh in?).

Would be feasible for you to send your changes against the #145 pull request? I think you have a lot of overlapping stuff that could be reused (such as tests).

@pasha-r

This comment has been minimized.

Show comment
Hide comment
@pasha-r

pasha-r Aug 20, 2013

I like the idea of Timeout class.
I'd suggest this: move HTTP(S)Connection to a separate file, like in my last commit:
pasha-r@518d0c3

P.S.: it also solves stylistic problem of renaming Connection classes over and over :)

pasha-r commented Aug 20, 2013

I like the idea of Timeout class.
I'd suggest this: move HTTP(S)Connection to a separate file, like in my last commit:
pasha-r@518d0c3

P.S.: it also solves stylistic problem of renaming Connection classes over and over :)

@pasha-r

This comment has been minimized.

Show comment
Hide comment
@pasha-r

pasha-r Aug 20, 2013

I've commited fixes and changes to the #145 PR, so it does actually work now :)

But once I saw your approach with Timeout class, I like it way better.
And you @kevinburke write more comments - always a plus :)

So you may consider some pieces of my PR if you'd like but I'd suggest this PR as generally preferable.

pasha-r commented Aug 20, 2013

I've commited fixes and changes to the #145 PR, so it does actually work now :)

But once I saw your approach with Timeout class, I like it way better.
And you @kevinburke write more comments - always a plus :)

So you may consider some pieces of my PR if you'd like but I'd suggest this PR as generally preferable.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Aug 20, 2013

Collaborator

Fair enough, we can treat this PR as the primary. Would be nice to merge some of the pieces/tests of the other PR where it makes sense. :)

Collaborator

shazow commented Aug 20, 2013

Fair enough, we can treat this PR as the primary. Would be nice to merge some of the pieces/tests of the other PR where it makes sense. :)

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 20, 2013

Contributor

Will do!


Kevin Burke | 415-723-4116 | www.twilio.com

On Tue, Aug 20, 2013 at 10:54 AM, Andrey Petrov notifications@github.comwrote:

Fair enough, we can treat this PR as the primary. Would be nice to merge
some of the pieces/tests of the other PR where it makes sense. :)


Reply to this email directly or view it on GitHubhttps://github.com/shazow/urllib3/pull/231#issuecomment-22964123
.

Contributor

kevinburke commented Aug 20, 2013

Will do!


Kevin Burke | 415-723-4116 | www.twilio.com

On Tue, Aug 20, 2013 at 10:54 AM, Andrey Petrov notifications@github.comwrote:

Fair enough, we can treat this PR as the primary. Would be nice to merge
some of the pieces/tests of the other PR where it makes sense. :)


Reply to this email directly or view it on GitHubhttps://github.com/shazow/urllib3/pull/231#issuecomment-22964123
.

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 21, 2013

Contributor

Ok... I think I addressed the issues you've raised in this PR. @pasha-r, I added you as a contributor to my repo so you should be able to push commits to this branch.

For some reason the unit tests and travis tests are failing with the message "No handlers could be found for logger "nose.plugins.cover"". Tomorrow I will investigate/fix this, merge the tests from @pasha-r's branch, and add more tests to ensure the Timeout.total is doing what we expect it should do.

Contributor

kevinburke commented Aug 21, 2013

Ok... I think I addressed the issues you've raised in this PR. @pasha-r, I added you as a contributor to my repo so you should be able to push commits to this branch.

For some reason the unit tests and travis tests are failing with the message "No handlers could be found for logger "nose.plugins.cover"". Tomorrow I will investigate/fix this, merge the tests from @pasha-r's branch, and add more tests to ensure the Timeout.total is doing what we expect it should do.

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 21, 2013

Contributor

@shazow, I added you to my repo as well

Contributor

kevinburke commented Aug 21, 2013

@shazow, I added you to my repo as well

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Aug 21, 2013

Collaborator

The logging thing is just a warning. The failure is that there's <100% coverage. :)

Collaborator

shazow commented Aug 21, 2013

The logging thing is just a warning. The failure is that there's <100% coverage. :)

Kevin Burke added some commits Aug 21, 2013

Kevin Burke
Add better test coverage
Add better validation in util.py
Convert get_request_timeout and get_connect_timeout to properties,
per the Url class in the same file
Kevin Burke
Timeout.total is a number or None
This simplifies the comparison logic.

Connect timeouts now follow the below table:

connect = int, total = None: connect is int
connect = int, total = int: connect is minimum

connect = None, total = int: connect is total
connect = None, total = None: connect is None

connect = object, total = int: connect is total
connect = object, total = None: connect is object
Kevin Burke
Set the legacy timeout on the request and the connect
It looks like the timeout is set on the socket which means it applies equally to
the request and the connect calls, make the code reflect this behavior
Kevin Burke
Refactor connection objects into connection.py
Update the tests and objects to point to the new place
@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 22, 2013

Contributor

Okay... I made the following changes

  • source_address is Python 2.6 compatible, thanks @pasha-r
  • There's now 100% test coverage
  • Now that get_request_timeout doesn't take arguments, moved it to be a @Property value, like the Url class.
  • Timeout.total defaults to None - it doesn't really make sense to have it be the global timeout as this is covered by the request and connect timeouts. Documentation updated to reflect this
  • Legacy timeouts (eg a float or an int) are turned into Timeout(connect=old_value, request=old_value) instead of just Timeout(request=old_value). This reflects what the old code is doing. There is one additional socket call in httplib.py besides the connect() and the recv(), socket.sendall, but this is nonblocking which means it returns very quickly.
  • Moved the connection classes into their own file per @pasha-r's suggestion
  • Fixed test coverage for 2.6 and 3x
Contributor

kevinburke commented Aug 22, 2013

Okay... I made the following changes

  • source_address is Python 2.6 compatible, thanks @pasha-r
  • There's now 100% test coverage
  • Now that get_request_timeout doesn't take arguments, moved it to be a @Property value, like the Url class.
  • Timeout.total defaults to None - it doesn't really make sense to have it be the global timeout as this is covered by the request and connect timeouts. Documentation updated to reflect this
  • Legacy timeouts (eg a float or an int) are turned into Timeout(connect=old_value, request=old_value) instead of just Timeout(request=old_value). This reflects what the old code is doing. There is one additional socket call in httplib.py besides the connect() and the recv(), socket.sendall, but this is nonblocking which means it returns very quickly.
  • Moved the connection classes into their own file per @pasha-r's suggestion
  • Fixed test coverage for 2.6 and 3x
# This timeout error does not have a URL attached and needs to inherit from the
# base HTTPError

This comment has been minimized.

@shazow

shazow Aug 22, 2013

Collaborator

Maybe it would be better if TimeoutError didn't inherit from RequestError, and then RequestTimeoutError can inherit from both?

@shazow

shazow Aug 22, 2013

Collaborator

Maybe it would be better if TimeoutError didn't inherit from RequestError, and then RequestTimeoutError can inherit from both?

This comment has been minimized.

@kevinburke

kevinburke Aug 23, 2013

Contributor

Fixed in 829ad99

@kevinburke

kevinburke Aug 23, 2013

Contributor

Fixed in 829ad99

Show outdated Hide outdated urllib3/util.py
@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Aug 31, 2013

Contributor

I think this is ready for another look. @wolever, I destroyed the stop/start interface - I'm not sure it makes much sense as the connect timeout start time is the only piece of information needed to compute the total timeout.

Contributor

kevinburke commented Aug 31, 2013

I think this is ready for another look. @wolever, I destroyed the stop/start interface - I'm not sure it makes much sense as the connect timeout start time is the only piece of information needed to compute the total timeout.

@kevinburke kevinburke referenced this pull request Sep 4, 2013

Closed

Consider 2.0 #1459

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Sep 5, 2013

Contributor

Anything I can do to move this forward?

Contributor

kevinburke commented Sep 5, 2013

Anything I can do to move this forward?

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Sep 5, 2013

Collaborator

@kevinburke Could you do a rebase or pull the latest master.

@wolever Will you have a chance to take a last look at it? If not, I can do it.

Collaborator

shazow commented Sep 5, 2013

@kevinburke Could you do a rebase or pull the latest master.

@wolever Will you have a chance to take a last look at it? If not, I can do it.

@Lukasa Lukasa referenced this pull request Sep 5, 2013

Closed

Requests not timing out #1577

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Sep 5, 2013

Contributor

@shazow Can I ask you to ping me when this gets merged? We'll want to plumb this through from Requests I think. =)

Contributor

Lukasa commented Sep 5, 2013

@shazow Can I ask you to ping me when this gets merged? We'll want to plumb this through from Requests I think. =)

Kevin Burke added some commits Sep 5, 2013

Kevin Burke
Merge master into branch.
Resolve strict python3 merge conflicts
Kevin Burke
Document the behavior of read/total timeouts
They are not the timeout for a complete response, they are
the timeout in between reads on the socket.
@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Sep 5, 2013

Contributor

@shazow The rebase got a little messy, I just merged master into the branch if it's ok.

I also fixed up the documentation so it actually looks readable and added the exceptions module to the docs.

Contributor

kevinburke commented Sep 5, 2013

@shazow The rebase got a little messy, I just merged master into the branch if it's ok.

I also fixed up the documentation so it actually looks readable and added the exceptions module to the docs.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Sep 9, 2013

Collaborator

@wolever says (but forgot to leave a comment):
The only thing I would specifically change.
The note about DNS:
Since IMO "high load" is fundamentally different from "Python's dns resolver doesn't honour the socket timeout"

Collaborator

shazow commented Sep 9, 2013

@wolever says (but forgot to leave a comment):
The only thing I would specifically change.
The note about DNS:
Since IMO "high load" is fundamentally different from "Python's dns resolver doesn't honour the socket timeout"

# Reset the timeout for the recv() on the socket
read_timeout = timeout_obj.read_timeout
log.debug("Setting read timeout to %s" % read_timeout)

This comment has been minimized.

@shazow

shazow Sep 9, 2013

Collaborator

I think it's safe to remove this guy.

@shazow

shazow Sep 9, 2013

Collaborator

I think it's safe to remove this guy.

This comment has been minimized.

@kevinburke

kevinburke Sep 10, 2013

Contributor

Yeah I was unsure here as this would be the only way to determine after the fact what the connect() and read() were actually set to..

@kevinburke

kevinburke Sep 10, 2013

Contributor

Yeah I was unsure here as this would be the only way to determine after the fact what the connect() and read() were actually set to..

httplib_response = conn.getresponse()
except SocketTimeout:
err = ReadTimeoutError(self, url,
"Read timed out. (read timeout=%s)" %

This comment has been minimized.

@shazow

shazow Sep 9, 2013

Collaborator

Can we add a it more info into this exception? Like the host at least? (Something similar to Line 369)

@shazow

shazow Sep 9, 2013

Collaborator

Can we add a it more info into this exception? Like the host at least? (Something similar to Line 369)

This comment has been minimized.

@shazow

shazow Sep 9, 2013

Collaborator

Oh nevermind, the URL is part of the exception.

@shazow

shazow Sep 9, 2013

Collaborator

Oh nevermind, the URL is part of the exception.

Show outdated Hide outdated urllib3/util.py
Show outdated Hide outdated urllib3/util.py
@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Sep 9, 2013

Collaborator

Bunch of nitpicks. Try to fix as many of them as you can, then let me know when it's ready and I'll merge. :)

Thanks again!

Collaborator

shazow commented Sep 9, 2013

Bunch of nitpicks. Try to fix as many of them as you can, then let me know when it's ready and I'll merge. :)

Thanks again!

@Anorov

This comment has been minimized.

Show comment
Hide comment
@Anorov

Anorov Sep 10, 2013

Will this new timeout API be reflected in requests at all, @kevinburke? Or is it going to remain strictly at the urllib3 level?

Anorov commented Sep 10, 2013

Will this new timeout API be reflected in requests at all, @kevinburke? Or is it going to remain strictly at the urllib3 level?

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Sep 10, 2013

Contributor

@Anorov I would like to reflect it in requests, though I'm not sure in what form that will be just yet.

Contributor

Lukasa commented Sep 10, 2013

@Anorov I would like to reflect it in requests, though I'm not sure in what form that will be just yet.

@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Sep 11, 2013

Collaborator

I think this is good to go. Want to add yourself to the CONTRIBUTORS.txt before I merge? :)

Collaborator

shazow commented Sep 11, 2013

I think this is good to go. Want to add yourself to the CONTRIBUTORS.txt before I merge? :)

@pasha-r

This comment has been minimized.

Show comment
Hide comment
@pasha-r

pasha-r Sep 11, 2013

Vanity request!
@kevinburke can you add me as well in contributors to this PR?
Only if you don't mind, of course :)

pasha-r commented Sep 11, 2013

Vanity request!
@kevinburke can you add me as well in contributors to this PR?
Only if you don't mind, of course :)

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Sep 11, 2013

Contributor

Vanity metrics are the best kind of metrics!

Contributor

Lukasa commented Sep 11, 2013

Vanity metrics are the best kind of metrics!

@kevinburke

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Sep 11, 2013

Contributor

Fixed in e679ff7

Contributor

kevinburke commented Sep 11, 2013

Fixed in e679ff7

shazow added a commit that referenced this pull request Sep 11, 2013

@shazow shazow merged commit f545dcf into urllib3:master Sep 11, 2013

1 check failed

default The Travis CI build failed
Details
@shazow

This comment has been minimized.

Show comment
Hide comment
@shazow

shazow Oct 4, 2013

Collaborator

@kevinburke Could you clarify what this test is testing? I'm confused why ssl is being None'd to test the new Timeout behaviour.

Collaborator

shazow commented on test/with_dummyserver/test_https.py in b4e5a8f Oct 4, 2013

@kevinburke Could you clarify what this test is testing? I'm confused why ssl is being None'd to test the new Timeout behaviour.

This comment has been minimized.

Show comment
Hide comment
@kevinburke

kevinburke Oct 4, 2013

Contributor
Contributor

kevinburke replied Oct 4, 2013

@kevinburke kevinburke deleted the kevinburke:connect_timeout branch Aug 25, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment