Connect timeout #1801
Conversation
added some commits on Nov 18, 2013
sigmavirus24
Dec 16, 2013
Member
I'm generally +1 on the idea and the API. I haven't combed over the diff though and a quick skim makes me think there's more there than needs to be. Also, did I see "TimeoutSauce"? Is there any reason to not simply call it "Timeout"?
Lukasa
Dec 16, 2013
Member
In principle I'm +0.5 on this. It's a nice extension. However, as memory serves, @kennethreitz considered this approach and ended up not taking it. It might be that now that someone has written the code he'll be happy with it, but equally, it might be that he doesn't like the approach.
@sigmavirus24 Yeah, TimeoutSauce is used for the urllib3 Timeout object, because we have our own Timeout object (an exception).
Lukasa
referenced this pull request
Dec 16, 2013
Closed
Timeouts do not occur when stream == True. #1803
sigmavirus24
Dec 16, 2013
Member
@Lukasa As I understood it @kennethreitz was more concerned with the addition (and requirement) of the Timeout class from urllib3. And thanks for clearing up the naming, I still think there has to be a better name for it. (I'm shaving a yak, I know)
kevinburke
@sigmavirus24 That was my understanding from IRC as well.
sigmavirus24
Dec 16, 2013
Member
@kevinburke I discussed it with you on IRC so the likelihood is that you came to that conclusion through me.
kennethreitz
Dec 18, 2013
Member
I'd rather us have our current interface and just have it support the intended use case, personally. You don't have to tweak with a bunch of timeouts when you're using a web browser :)
kennethreitz
Dec 18, 2013
Member
The main reason for this is because we have the tuple interface in another spot (e.g. an expansion of the file upload api) and it's my least favorite part of the API.
kennethreitz
Dec 18, 2013
Member
I'm not against the class either, I just need to brew about it for a bit, basically.
kennethreitz
Dec 18, 2013
Member
I was under the impression that your concern was for the streaming API, not for more standard uses like GET?
kevinburke
Dec 18, 2013
Contributor
Actually I care more about the connect timeout than the streaming case, though we use requests with both at Twilio and would like both changes to go through.
The use case is that if I want to wait 30 seconds for a request to complete (for a server to finish processing it and return), that shouldn't also imply that I need to wait 30 seconds for requests to tell me it can't establish a connection to the host, which is currently true for remote hosts that are dead (ones that don't immediately send a TCP Reset, like http://google.com:81 or 10.255.255.1).
There are other possible interfaces - adding a connect_timeout parameter, for example, or similar.
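The connect-vs-read split described above can be sketched with the tuple interface this PR proposes. Treat this as a hedged illustration rather than the PR's exact code: the exception classes shown are the ones that eventually shipped in requests 2.4+, and the dead-host URL is a placeholder.

```python
import requests

# Fail fast (1 s) when the host is unreachable, but allow a connected
# server up to 30 s to produce a response. 10.255.255.1 is a
# non-routable address used here as a stand-in for a dead host.
try:
    r = requests.get("http://10.255.255.1/", timeout=(1, 30))
except requests.exceptions.ConnectTimeout:
    print("no connection within 1 second")
except requests.exceptions.ConnectionError:
    print("connection failed outright")
except requests.exceptions.ReadTimeout:
    print("connected, but no response within 30 seconds")
```

Note the ordering of the except clauses: ConnectTimeout is a subclass of ConnectionError, so it must be caught first.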
kennethreitz
Dec 20, 2013
Member
Perhaps we can make this an option on the connection adapter itself. That was the intention for this interface.
Lukasa
Dec 20, 2013
Member
That's not unreasonable; the HTTPAdapter could happily take these as parameters/properties.
kevinburke
Dec 28, 2013
Contributor
Yeah, the HTTPAdapter seems like a good place for this. To try and avoid more go-rounds, what should the interface look like on the Adapter? So far we've proposed:
- a Timeout class - seems the least popular
- a tuple (connect, read), which is also not implemented in very many other places in the library
- separate parameters: a timeout parameter would apply to both, a connect_timeout param to the connect, and a read_timeout param to the read
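As a hypothetical sketch of the "option on the connection adapter" direction (class name, default, and mechanism invented here for illustration, not taken from this PR), a subclassed HTTPAdapter could carry the timeout itself:

```python
import requests
from requests.adapters import HTTPAdapter

class TimeoutHTTPAdapter(HTTPAdapter):
    """Hypothetical adapter that supplies a default timeout when the
    caller does not pass one; not part of this PR."""
    def __init__(self, *args, timeout=5, **kwargs):
        self._timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # HTTPAdapter.send accepts a `timeout` keyword; fill it in
        # only when the caller left it unset.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self._timeout
        return super().send(request, **kwargs)

session = requests.Session()
session.mount("http://", TimeoutHTTPAdapter(timeout=5))
session.mount("https://", TimeoutHTTPAdapter(timeout=5))
```

Every request made through this session would then get the adapter's timeout unless the caller overrides it per call.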
Ping :)
Lukasa
Jan 6, 2014
Member
Hmm. I'd rather not do a Timeout class, I'd prefer the optional tuple I think. But hold off until @kennethreitz gets another chance to look at this.
colons and others added some commits on Feb 16, 2014
p2
Apr 4, 2014
I found this because my Flask app currently tries to connect to a dead host for which I've set a timeout of 5 seconds but it takes forever to actually time out. What is the status on this one?
Lukasa
Apr 5, 2014
Member
Right now we do apply the timeout to connections, we just don't have it configurable. Are you using stream=True?
p2
Apr 5, 2014
Hadn't been using stream=True; I guess I should use it if only to get the timeout? Just did a quick test with the following script (the host in the test, http://pillbox.nlm.nih.gov, is still down), and with stream=True it does time out after 5 seconds; without it, it runs anywhere from 20 to 120 seconds, which is not what I would expect.
import requests
url = 'http://pillbox.nlm.nih.gov/assets/large/abc.jpg'
requests.get(url, timeout=5)
Using requests 2.2.1 with Python 3.3.3
Lukasa
Apr 5, 2014
Member
That's weird, it works fine on Python 2.7. Seems like a Python 3 bug, because I can reproduce your problem in 3.4. @kevinburke, are you aware of any timeout bugs in urllib3?
kevinburke
Apr 5, 2014
Contributor
I believe connection timeouts would be retried 3 times, as they represent a failure to connect to the server, but I'm not at my computer at the moment.
Lukasa
Apr 5, 2014
Member
Weird, I'm not seeing that happen in Python 2.7 then. We'll need to investigate this inconsistency.
kirelagin
May 10, 2014
Hm, we definitely need a sane way of configuring both timeouts, and I think the tuple approach is nice. requests is for humans, and humans don't really like all those extra classes and the additional keyword-argument mess.
By the way, speaking about hosts that are down, I just did a test with
import requests
requests.get('http://google.com:81/', timeout=5)
and I get 35 seconds both on Python 2.7 and 3.3 (requests 2.2.1). That's not what I would expect from timeout=5… And with timeout=1 I get 7 seconds. I mean, we really need a sane interface…
Lukasa
May 10, 2014
Member
@kirelagin I see the same behaviour as you, but I believe this to be a socket-level problem. Dumping Wireshark shows that my OS X box makes five independent attempts to connect. Each of those connection attempts only retransmits for five seconds.
I suspect this behaviour comes down to the fact that httplib uses socket.create_connection, not socket.connect(). Python's socket module documentation has this to say (emphasis mine):
This is a higher-level function than socket.connect(): if host is a non-numeric hostname, it will try to resolve it for both AF_INET and AF_INET6, and then try to connect to all possible addresses in turn until a connection succeeds.
Closer examination of the Wireshark trace shows that we are definitely hitting different addresses: five of them.
If we wanted to 'fix' this behaviour (more on those scare quotes in a minute) it would be incredibly complicated: we'd end up needing to either circumvent httplib's connection logic or use signals or some form of interrupt-processing to return control to us after a given timeout.
More generally, I don't know what 'fix' would mean here. This behaviour is naively surprising (you said we'd time out after 5 seconds but it took 30!), but makes sense once you understand what's happening. I believe that this is an 'expert friendly' interface (thanks to Nick Coghlan for the term), so I'm prepared to believe that we should change it. With that said, if we do change it, how do we give the 'I want to wait 5 seconds for each possible target' behaviour to expert users?
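The multi-address behaviour described above can be made concrete by reimplementing the core loop of socket.create_connection. This is a simplified approximation written for illustration, not httplib's actual code, but it shows why a timeout of 5 seconds can cost 5 seconds per resolved address:

```python
import socket

def create_connection_sketch(host, port, timeout):
    """Simplified version of socket.create_connection's loop: the
    timeout applies to EACH address getaddrinfo returns, so the
    worst case is len(addresses) * timeout seconds of wall time."""
    last_err = None
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)  # per-attempt, not total
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_err = err
            sock.close()
    raise last_err or OSError("getaddrinfo returned no addresses")
```

For a name with six A records and one AAAA record, seven attempts at 5 seconds each gives exactly the 35 seconds reported earlier in the thread.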
kevinburke
May 10, 2014
Contributor
There's a retries branch of urllib3 that would let you specify exactly this.
Lukasa
Uh, is there? Where? I can't see it...
kirelagin
May 10, 2014
Ah, yeah, that makes sense.

> five independent attempts
> different addresses: five of them.

Wait, google.com has six A records (plus one AAAA; that's why I see 35 = 5*7 seconds, I guess).
Anyway, I just checked Firefox and it tries exactly one IPv6 and exactly one IPv4 address. I believe multiple DNS records are mostly used for load balancing, not fault tolerance, so attempting only the first address by default makes the most sense. Having an option to control this is useful, of course.
p2
May 10, 2014
Is there a need to be able to specify both timeouts independently? When I specify the timeout, I'm thinking of it as "I don't want this line of code to run longer than x seconds"; I don't care which part of the connection takes how long.
It seems to me this would be a true "human" interpretation and could be implemented, without having to rely on urllib3, by an internal timer that kills the request if it hasn't returned within the timeout.
kevinburke
May 10, 2014
Contributor
@Lukasa urllib3/urllib3#326 ; though now that I read it more carefully, if the OS itself is trying each DNS record in turn then there's not much that can be done. That pull request lets you specify the number of times you would like to retry a connection failure, whether a timeout or an error.
kevinburke
May 10, 2014
Contributor
@p2 Sadly, computing the wall-clock time for an HTTP request remains incredibly difficult. Mostly, our tools for determining the amount of time spent in system calls like DNS resolution, establishing a socket, and sending data (and then passing that information to the necessary places and subtracting it from a total budget) remain limited.
My suggestion would be to run your HTTP request in a separate thread, then use a timer to cancel the thread if it takes longer than a stated amount of time to return a value. gevent can be used for this: http://www.gevent.org/gevent.html#timeouts
Sorry I can't be more helpful :(
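The "separate thread plus timer" workaround can also be sketched with the standard library instead of gevent. A hedged illustration; note that the abandoned worker thread keeps running to completion, it is merely no longer waited on:

```python
import concurrent.futures

def call_with_deadline(fn, deadline, *args, **kwargs):
    """Run fn in a worker thread and stop waiting after `deadline`
    seconds of wall-clock time. Raises concurrent.futures.TimeoutError
    on expiry; the worker itself cannot be killed and is abandoned."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(fn, *args, **kwargs)
        return future.result(timeout=deadline)
    finally:
        pool.shutdown(wait=False)  # do not block on the worker

# e.g. call_with_deadline(requests.get, 5.0, "http://pillbox.nlm.nih.gov/")
```

This gives the total-wall-clock semantics p2 asked for, at the cost of a thread per request and no way to actually interrupt the underlying socket operations.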
p2
May 10, 2014
@kevinburke That is kind of what I do now; I was just wondering whether this would make sense as a default approach for requests as well. I personally don't have a need to specify individual timeouts, but that assumption may be too naïve.
kirelagin
May 10, 2014
@p2 I agree that the "human" interpretation is to have a total timeout. And, by the way, that's how requests works right now. But there might still be cases where you want more fine-grained control.
kirelagin
May 10, 2014
Also, as a human when I'm telling a line of code “go, try to connect to Google with this timeout” I'm not thinking about multiple DNS A-records. I'm thinking of Google as a single entity. So there are, naturally, two sane options:
- Do not attempt multiple IPs. If some library code does this, I consider this code broken. If some OS does this, I consider the OS poor.
- Do whatever the library/OS wants, but have the timeout guard the total execution time of all the queries. That's totally weird, but somewhat reasonable and definitely more natural than what's happening now…
Lukasa
May 11, 2014
Member
Ok, a lot of conversation happened here; let me deal with it in turn.

> Anyway, I just checked Firefox and it is trying exactly one IPv6 and exactly one IPv4 address. I believe that multiple DNS records are mostly used for load balancing, not fault tolerance, so attempting only the first address by default makes most sense.

You checked Firefox against the actual google.com then, not against an incorrect port. Browsers will also fall back to the other addresses if the first doesn't respond. This makes sense. Having multiple A records means "this host is available at these addresses". If I can't contact that host at one of those addresses, it's nonsensical to say "welp, clearly the host is down" when I know several other addresses I might be able to contact it at.
This feature of 'multiple addresses' is widely used for both balancing load and fault tolerance. In fact, if you really want to balance load then DNS SRV is the mechanism to use, not A/AAAA, as it provides better control over how the load is spread.

> Is there a need to be able to specify both timeouts independently? When I specify the timeout, I'm thinking of it as "I don't want this line of code to run longer than x seconds", I don't care which part of the connection takes how long.

The short answer is 'yes', because of the stream keyword argument. If you've set stream=True and use iter_content() or iter_lines(), it's useful to be able to set a timeout for how long those calls can block.

> It seems to me this would be a true "human" interpretation and could be implemented without having to rely on urllib3 by an internal timer that kills the request if it hasn't returned within the timeout.

As @kevinburke points out, this isn't as easy as it seems. More importantly, it also leaves us exposed to implementation details. 'Until the request returns' is not a well-defined notion. What does it mean for the request to return? Do I have to download the whole body? Just the headers? Just the status line? Whatever we choose is going to be utterly arbitrary.

> Also, as a human, when I'm telling a line of code "go, try to connect to Google with this timeout" I'm not thinking about multiple DNS A records. I'm thinking of Google as a single entity.

Agreed.

> Do not attempt multiple IPs. If some library code does this, I consider this code broken. If some OS does this, I consider the OS poor.

Woah, now you go off the rails. If you are thinking of Google as a single entity, then you would expect us to connect to it if it's up. If one time in seven we fail to connect, even though you always connect fine in your browser, you're going to assume requests is bugged as hell.
If a host is up, we must be able to connect you to it.
The ideal fix, from my position, would be to take over the logic used in socket.create_connection(). This allows us to have fine-grained control over timeouts. Unfortunately, it also complicates the timeout logic further, as you'd now have per-host connection attempt timeouts, a total connection attempt timeout, and a read timeout. That's beginning to become hugely complicated, and to expose people to the complexities of TCP and IP in a way that I'm not delighted about.
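The stream=True case mentioned above, where a read timeout bounds each blocking read inside iter_content() rather than the whole transfer, might look like the sketch below. This assumes a modern requests (tuple timeouts and Response-as-context-manager both arrived after this thread); the chunk size is a placeholder:

```python
import requests

def download(url, dest, connect_timeout=3, read_timeout=10):
    # With stream=True, read_timeout bounds each blocking socket read
    # inside iter_content(), not the total download time.
    with requests.get(url, stream=True,
                      timeout=(connect_timeout, read_timeout)) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
```

A slow-but-alive server can therefore stream for minutes without tripping the timeout, as long as no single read stalls longer than read_timeout.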
Plans have been made. Expect something soon ;)
added a commit to fedora-infra/supybot-fedora that referenced this pull request on Jul 19, 2014
@kevinburke can you rebase?
Closing this one, I re-added it in #2176
kevinburke commented Dec 16, 2013
Per discussion with @sigmavirus24 in IRC, this PR kills the Timeout class and adds support for connect timeouts via a simple tuple interface.
Sometimes you try to connect to a dead host (this happens to us all the time, because of AWS) and you would like to fail fairly quickly in this case; the request has no chance of succeeding. However, once the connection succeeds, you'd like to give the server a fairly long time to process the request and return a response.
The attached tests and documentation have more information. Hope this is okay!