My program uses SQS heavily over the course of many days. I have noticed that it will drop to 0% CPU usage and stay stuck there for a long time. When the program is stuck, I can attach a debugger and obtain the following stack trace. It appears that the socket isn't timing out properly. I think the best solution is to expose httplib's timeout value on Boto's connection object so that I can lower it. What does the Boto community think?
File "/usr/lib/pymodules/python2.7/boto/sqs/queue.py", line 245, in get_messages
File "/usr/lib/pymodules/python2.7/boto/sqs/connection.py", line 157, in receive_message
File "/usr/lib/pymodules/python2.7/boto/connection.py", line 605, in get_list
response = self.make_request(action, params, path, verb)
File "/usr/lib/pymodules/python2.7/boto/connection.py", line 592, in make_request
File "/usr/lib/pymodules/python2.7/boto/connection.py", line 461, in make_request
return self._mexe(method, path, data, headers, host, sender)
File "/usr/lib/pymodules/python2.7/boto/connection.py", line 395, in _mexe
response = connection.getresponse()
File "/usr/lib/python2.7/httplib.py", line 1027, in getresponse
File "/usr/lib/python2.7/httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
line = self.fp.readline()
File "/usr/lib/python2.7/socket.py", line 430, in readline
data = recv(1)
I believe the socket timeout is global, so worst case, you may be able to set it via the socket module. Failing that, we're working on a refactor to replace a bunch of the plumbing, which may resolve this.
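For anyone following along, here is a minimal sketch of the socket-module approach mentioned above. It only uses the standard library; the 5-second value is an arbitrary assumption, and the change affects every socket the process creates afterwards, not just boto's.

```python
import socket

# Remember the current process-wide default so it can be restored later.
# None means "block forever", which is Python's out-of-the-box value.
previous = socket.getdefaulttimeout()

# Every socket created after this call inherits a 5-second timeout
# (5.0 is an arbitrary value chosen for illustration).
socket.setdefaulttimeout(5.0)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
new_socket_timeout = sock.gettimeout()  # inherited from the global default
sock.close()

# Restore the earlier default so unrelated code is not affected.
socket.setdefaulttimeout(previous)
```

A blocked `recv` on a socket with a timeout raises `socket.timeout` instead of waiting indefinitely, so the caller still needs to handle that exception.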
Well, if you had bought my handy "Python and AWS Cookbook" 8^) 8^) you would know that the socket timeout can be configured in your boto config file like this:
http_socket_timeout = 5
The timeout is specified in seconds. @gtaylor is correct that this is a global setting that is actually buried down in the socket module. This just gives you a convenient way to control it.
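As a rough illustration of what such a config option amounts to, the sketch below parses an INI-style fragment and applies the value as the global socket default. It is not boto's actual code: the `[Boto]` section name and the helper `timeout_from_config` are assumptions for this example, and it uses Python 3's `configparser` (the module is named `ConfigParser` on Python 2).

```python
import socket
from configparser import ConfigParser

# Example config text; the [Boto] section name is an assumption here.
CONFIG_TEXT = """
[Boto]
http_socket_timeout = 5
"""


def timeout_from_config(config_text):
    """Return the configured timeout in seconds, or None if it is not set.

    Hypothetical helper for illustration only.
    """
    parser = ConfigParser()
    parser.read_string(config_text)
    if parser.has_option("Boto", "http_socket_timeout"):
        return parser.getfloat("Boto", "http_socket_timeout")
    return None


timeout = timeout_from_config(CONFIG_TEXT)
# Applies to every socket created afterwards in this process.
socket.setdefaulttimeout(timeout)
```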
I'm going to close this issue. Please re-open if you feel some action should be taken in boto to address it.
@garnaat Yes, I think boto needs to take action to address this issue. All connection objects should have a timeout. This isn't just for boto: from now on, every time you write a connection object it should have a timeout. Requiring us to go through every damn socket in our entire application just to change the timeout is a fucking nightmare. It was less code for us to modify boto, because that is where the mistake is.
@gtaylor Asking us to adjust the timeout for every other socket in the entire application is a really horrible suggestion. We hacked up boto, because every connection object needs a timeout.
@MikeBrooks Blame Python, not me. The global socket timeout is nothing new: http://docs.python.org/library/socket.html#socket.socket.settimeout
@gtaylor Every connection object needs a timeout value. Python adheres to this important design: httplib's connection objects have a timeout value that can be passed in to their constructor and read or changed afterwards.
socket.getdefaulttimeout() defaults to None, so boto hangs at 0% CPU indefinitely. Yes, this is a very serious problem.
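For reference, a minimal sketch of the httplib convention described above. The host name is a placeholder, and no network connection is opened until a request is actually sent.

```python
try:
    import httplib  # Python 2.6+
except ImportError:
    import http.client as httplib  # Python 3 name for the same module

# The timeout (in seconds) is passed to the constructor and applies
# only to this connection, not to other sockets in the process.
conn = httplib.HTTPConnection("example.com", timeout=5)

# The value is stored as a plain attribute on the connection object.
configured_timeout = conn.timeout
```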
@MikeBrooks - Your anger is misdirected.
You can override it in the config file. We can add an optional parameter to Connection objects. I would rather make the change once in the config than have to remember every time I create a Connection, but it's not a difficult thing to add if it helps. Note that the timeout parameter to httplib connections was not added until Python 2.6.
@garnaat This doesn't have to break compatibility for anyone. Have a default value of timeout=None (or check whether it's None and set it to socket.getdefaulttimeout()); the user can override it if they want, but no one is forced to. This is exactly the calling convention for httplib. Having a getter/setter for this value would also be very nice.
(I didn't even know there was a config file.)
As I said, I'm fine adding that. The only reason it's not there is that no one has asked for it before. A previous discussion:
resulted in a pull request that added the timeout to the boto config file.
I think that both systems would work well together if the default was the config file value.
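To make the proposal concrete, here is a hedged sketch of how a per-connection timeout could fall back to the config value and then to the global socket default. `TimedConnection` and `CONFIG_HTTP_SOCKET_TIMEOUT` are hypothetical names invented for this example; this is not boto's actual class or API.

```python
import socket

# Stand-in for a value parsed from the boto config file (hypothetical name).
CONFIG_HTTP_SOCKET_TIMEOUT = 5.0


class TimedConnection(object):
    """Hypothetical connection wrapper; not boto's actual class."""

    def __init__(self, host, timeout=None):
        self.host = host
        self._timeout = None
        self.timeout = timeout  # goes through the setter below

    @property
    def timeout(self):
        return self._timeout

    @timeout.setter
    def timeout(self, value):
        # An explicit value wins; otherwise use the config file value,
        # and finally the process-wide socket default.
        if value is None:
            value = CONFIG_HTTP_SOCKET_TIMEOUT
        if value is None:
            value = socket.getdefaulttimeout()
        self._timeout = value


default_conn = TimedConnection("queue.example.com")
custom_conn = TimedConnection("queue.example.com", timeout=1.5)
```

With this shape, existing callers that never pass `timeout` keep whatever the config file says, and callers who care can override it per connection.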
Cool, sorry for being such a dick and a regularity-nazi. Thank you for hearing me out, we both want Boto to be a strong project.
No problem, it sounded like it was all pretty fresh in your mind, so you undoubtedly just got burned by it. Totally understand. I will re-open.