socket.timeout exception #1787

Closed
ThiefMaster opened this Issue Dec 10, 2013 · 31 comments

9 participants

@ThiefMaster

I think that exception should be handled/wrapped internally and be reraised as a RequestException-based exception.

Traceback (most recent call last):
  File "X:\dev\rotaryboard\daemon\testdaemon.py", line 117, in <module>
    res = foobar_cmd()
  File "X:\dev\rotaryboard\daemon\testdaemon.py", line 53, in foobar_cmd
    return requests.get(url, params=params, auth=('foobar', 'meow'), timeout=0.1).json()
  File "F:\Python27\lib\site-packages\requests\api.py", line 55, in get
    return request('get', url, **kwargs)
  File "F:\Python27\lib\site-packages\requests\api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "F:\Python27\lib\site-packages\requests\sessions.py", line 382, in request
    resp = self.send(prep, **send_kwargs)
  File "F:\Python27\lib\site-packages\requests\sessions.py", line 485, in send
    r = adapter.send(request, **kwargs)
  File "F:\Python27\lib\site-packages\requests\adapters.py", line 388, in send
    r.content
  File "F:\Python27\lib\site-packages\requests\models.py", line 676, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "F:\Python27\lib\site-packages\requests\models.py", line 615, in generate
    decode_content=True):
  File "F:\Python27\lib\site-packages\requests\packages\urllib3\response.py", line 236, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "F:\Python27\lib\site-packages\requests\packages\urllib3\response.py", line 183, in read
    data = self._fp.read(amt)
  File "F:\Python27\lib\httplib.py", line 552, in read
    s = self.fp.read(amt)
  File "F:\Python27\lib\httplib.py", line 1288, in read
    return s + self._file.read(amt - len(s))
  File "F:\Python27\lib\socket.py", line 378, in read
    data = self._sock.recv(left)
socket.timeout: timed out
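
Until the wrapping lands, caller code has to catch the raw socket.timeout itself alongside the requests exceptions. A minimal sketch of that stopgap, with a placeholder URL and credentials standing in for the real endpoint:

import socket

import requests

def fetch(url, params=None):
    """Placeholder wrapper: translate the raw socket.timeout into the
    RequestException-based error this issue is asking for."""
    try:
        return requests.get(url, params=params, timeout=0.1).json()
    except socket.timeout:
        # Currently leaks out of r.content / iter_content() unwrapped.
        raise requests.exceptions.Timeout('read timed out')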
@ThiefMaster

Ok, seems to be a duplicate of #1236... still annoying.

@adaschevici

With the latest version this no longer seems to occur. The socket exception is now wrapped in a requests.exceptions.Timeout.
I think it is fair to say this can be closed?

@ThiefMaster

It happened with requests 2.1.0. So if that's the "latest version" it might not be wrapped in all cases.

@Lukasa
Collaborator

That's actually a urllib3 bug: socket.timeout should be caught in urllib3 and rethrown as a urllib3 exception. Mind opening this issue over there?

@adaschevici

@ThiefMaster, hi, can you still reproduce this using the requests package built from the sources here?
I dug down into urllib3 and it seems I can't reproduce this issue.
Can you provide the server code, or give me some hints or a description of the setup, to help me diagnose the problem better?
Thanks.

@ThiefMaster

When it happened the server was not running. And even then it just happened sporadically (the script was trying to GET a page every ~0.5s all night and at some point failed with the error above).

@adamsc

@adaschevici I've been experiencing this bug sporadically too. It's been occurring under similar circumstances to @ThiefMaster. I'm just running a script which makes a bunch of GET requests against 3rd party servers. About 1 in every 100 requests that time out throws the raw exception.

I've been attempting to reproduce it consistently, but haven't had any luck.

edit: This seems to be the exact same issue as: kennethreitz/requests#1236. I'm also using 2.1.0

@joepie91

Okay, so what I can tell from a cursory look at the code:

  • This socket.timeout exception will only happen when the connection was successful, but it times out while reading the response from the server, i.e. the server went away between the original request and finishing the response retrieval. HTTP requests are usually fairly short-lived, so this would be an uncommon scenario, which would explain why it's hard to reproduce.
  • The issue manifests itself in urllib3's response.py.
  • httplib does not wrap this in a custom kind of exception, and just lets it propagate as the raw socket.timeout. It is unclear to me whether this is intended behaviour or not.
  • To fix this, urllib3 would have to catch socket exceptions on the .read()/.recv() calls of the socket (via httplib), and re-raise them as a urllib3 exception of some sort (I'm unfamiliar with the internals of urllib3); a rough sketch of that wrapping follows below.

I will also cross-post this to shazow/urllib3#297.
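
A rough sketch of the wrapping described in the last bullet, not the actual urllib3 patch (the helper function and its arguments are invented; ReadTimeoutError does already exist in urllib3.exceptions):

from socket import timeout as SocketTimeout

from urllib3.exceptions import ReadTimeoutError


def read_wrapped(fp, amt, pool=None, url=None):
    """Read `amt` bytes from `fp`, translating socket.timeout into
    ReadTimeoutError so callers only ever see urllib3 exceptions."""
    try:
        return fp.read(amt)
    except SocketTimeout:
        raise ReadTimeoutError(pool, url, "Read timed out.")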

@adaschevici

@ThiefMaster, thanks. I will try to reproduce this. Based on the description provided, it seems it will occur when the server is overloaded.
When I tried to overload it on Ubuntu, what I got was a broken-pipe error. I've been trying it on Windows, but unfortunately the tests are failing before this one.
Might we be looking at a platform-dependent test?
Or maybe a patched test to provide the coverage?

CrossRef: shazow/urllib3#297

@Lukasa
Collaborator

Hang on, it looks to me like urllib3 doesn't catch socket.timeout in Response.read(). That feels clearly and obviously wrong to me. @shazow, (and @kevinburke since you implemented this), am I missing something here?

@shazow
Collaborator

Makes sense, if we can get a test that triggers a socket.timeout in Response.read().

@Lukasa
Collaborator

@shazow Pfft, as a socket-level test that's easy =D. I'll take a swing at it at some point today if someone doesn't beat me to it.

@shazow
Collaborator

👍

@adaschevici

@Lukasa, where do you mean to catch the exception? In HTTPResponse.read(), yes?
I have been working on it, trying to get the exception thrown from there, but I've had no successful attempts so far.

Anyway, I will retrace my steps and see where I went wrong; maybe I will come up with the proper test.

@adaschevici

On further investigation, this no longer seems to show, at least under Ubuntu, when using urllib3 directly.
I built an HTTP server that returned a file on a GET request, made the GET request run multiple times, and then closed the server.
All I am getting for a stack trace is this:
File "/home/hackawaye/PycharmProjects/test_urllib3/test_urllib3.py", line 15, in
r = pool.request('GET', '/', timeout=20)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/request.py", line 75, in request
*urlopen_kw)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/request.py", line 88, in request_encode_url
return self.urlopen(method, url, *
urlopen_kw)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/connectionpool.py", line 544, in urlopen
release_conn=release_conn, *response_kw)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/connectionpool.py", line 544, in urlopen
release_conn=release_conn, *
response_kw)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/connectionpool.py", line 544, in urlopen
release_conn=release_conn, **response_kw)
File "/home/hackawaye/virtual_envs/urllib3_virtual/local/lib/python2.7/site-packages/urllib3-dev-py2.7.egg/urllib3/connectionpool.py", line 528, in urlopen
raise MaxRetryError(self, url, e)
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8888): Max retries exceeded with url: / (Caused by : [Errno 111] Connection refused)

@adaschevici

In requests (Ubuntu):

Traceback (most recent call last):
  File "/home/hackawaye/PycharmProjects/test_urllib3/test_urllib3.py", line 9, in <module>
    r = requests.get('http://localhost', timeout=0.5)
  File "/home/hackawaye/virtual_envs/requests_virtual/local/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/home/hackawaye/virtual_envs/requests_virtual/local/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/hackawaye/virtual_envs/requests_virtual/local/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/sessions.py", line 382, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/hackawaye/virtual_envs/requests_virtual/local/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/sessions.py", line 485, in send
    r = adapter.send(request, **kwargs)
  File "/home/hackawaye/virtual_envs/requests_virtual/local/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/adapters.py", line 372, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: / (Caused by : [Errno 104] Connection reset by peer)

Both stack traces were taken with the code commented out of the read method in urllib3, so I was not attempting to catch the timeout exception at all. I will test it on Windows later on to see if it shows.

@Lukasa
Collaborator

The way I'd test this is to use a urllib3 socket-level test, like this one, but rewritten to close the socket midway through the data. Something like this:

# At the top of the file, some more imports
from urllib3 import Timeout

# Down with the test I linked above
    def test_timeout_midway_through_read(self):
        def socket_handler(listener):
            sock = listener.accept()[0]

            buf = b''
            while not buf.endswith(b'\r\n\r\n'):
                buf += sock.recv(65536)

            body = 'Test Data'
            sock.send(('HTTP/1.1 200 OK\r\n'
                      'Content-Type: text/plain\r\n'
                      'Content-Length: %d\r\n'
                      '\r\n' % len(body)).encode('utf-8'))

            # Wait for the read timeout.
            time.sleep(0.002)

            sock.send(body.encode('utf-8'))
            sock.close()

        self._start_server(socket_handler)
        pool = HTTPConnectionPool(self.host, self.port)

        response = pool.urlopen('GET', '/', timeout=Timeout(connect=1, read=0.001))
        response.read() # Should throw our exception.

Confirm that this test is OK: I wrote it in this text box and haven't run it, so there could be all kinds of crazy problems here.

@adaschevici

Looks OK to me. This is the output I am getting from the test. It's similar to what I was getting before; it seems the exception has been wrapped properly.

Traceback (most recent call last):
  File "/home/hackawaye/virtual_envs/urllib3/test/with_dummyserver/test_socketlevel.py", line 357, in test_timeout_midway_through_read
    response = pool.urlopen('GET', '/', timeout=Timeout(connect=1, read=0.001))
  File "/home/hackawaye/virtual_envs/urllib3/urllib3/connectionpool.py", line 544, in urlopen
    release_conn=release_conn, **response_kw)
  File "/home/hackawaye/virtual_envs/urllib3/urllib3/connectionpool.py", line 544, in urlopen
    release_conn=release_conn, **response_kw)
  File "/home/hackawaye/virtual_envs/urllib3/urllib3/connectionpool.py", line 544, in urlopen
    release_conn=release_conn, **response_kw)
  File "/home/hackawaye/virtual_envs/urllib3/urllib3/connectionpool.py", line 528, in urlopen
    raise MaxRetryError(self, url, e)
MaxRetryError: HTTPConnectionPool(host='localhost', port=35642): Max retries exceeded with url: / (Caused by : [Errno 111] Connection refused)
-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: INFO: Starting new HTTP connection (1): localhost
urllib3.connectionpool: DEBUG: "GET / HTTP/1.1" 200 9
urllib3.connectionpool: WARNING: Retrying (3 attempts remain) after connection broken by 'timeout('timed out',)': /
urllib3.connectionpool: INFO: Starting new HTTP connection (2): localhost
urllib3.connectionpool: WARNING: Retrying (2 attempts remain) after connection broken by 'error(104, 'Connection reset by peer')': /
urllib3.connectionpool: INFO: Starting new HTTP connection (3): localhost
urllib3.connectionpool: WARNING: Retrying (1 attempts remain) after connection broken by 'error(111, 'Connection refused')': /
urllib3.connectionpool: INFO: Starting new HTTP connection (4): localhost
--------------------- >> end captured logging << ---------------------

@Lukasa
Collaborator

Arg, sorry, you need to turn the retries off. Pass retries=0 into urlopen.

@adaschevici

Yes, I did that; same result.
My guess is that it is being wrapped by the HTTPConnectionPool:
Traceback (most recent call last):
  File "/home/hackawaye/virtual_envs/urllib3/test/with_dummyserver/test_socketlevel.py", line 357, in test_timeout_midway_through_read
    response = pool.urlopen('GET', '/', retries=0, timeout=Timeout(connect=1, read=0.001))
  File "/home/hackawaye/virtual_envs/urllib3/urllib3/connectionpool.py", line 528, in urlopen
    raise MaxRetryError(self, url, e)
MaxRetryError: HTTPConnectionPool(host='localhost', port=34819): Max retries exceeded with url: / (Caused by : timed out)

I'm doing this on Ubuntu; I will try it on Windows tonight.

@Lukasa
Collaborator

See, we're failing in the wrong place here. I'm expecting the exception to be thrown in read(), not in urlopen(). I'll see if I can work out why.

@Lukasa
Collaborator

Uh, I know why. Also pass preload_content=False to urlopen().
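
Putting the two suggestions together, the tail of the test would then look roughly like this (reusing pool and Timeout from the sketch above):

# retries=0 so the failure isn't swallowed by the retry loop, and
# preload_content=False so the body is read lazily and the timeout
# fires inside response.read() rather than inside urlopen().
response = pool.urlopen('GET', '/', retries=0, preload_content=False,
                        timeout=Timeout(connect=1, read=0.001))
response.read()  # Should now raise the wrapped read-timeout exception.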

@adaschevici

Good call, it throws it now. I wasn't aware of that preload_content parameter.
I will update the fix and get the test in as well. Thanks for the help.

@Leandros

I'm going to sneak into the thread, because I've experienced the very same bug.

The fix isn't live yet, am I right? I'm using the latest version of requests and it still occurs for me.
Is there a way to prevent these exceptions from occurring at all?

My code which raises the exception:

try:
    if session:
        r = session.get(url, stream=True)
    else:
        r = requests.get(url, stream=True)
except requests.exceptions.ConnectionError as e:
    xbmc.log("Saving Error: " + e, xbmc.LOGDEBUG)
    return
with open(file_path, "wb") as file:
    for chunk in r.iter_content(1024):
        if not chunk:
            break
        file.write(chunk)

The exception itself:

18:34:29 T:5812   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <class 'socket.timeout'>
                                            Error Contents: timed out
                                            Traceback (most recent call last):
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\steambmc\default.py", line 211, in <module>
                                                steamuser.getOwnedGames(prog_callback=progress, artupdate=True)
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\steambmc\steamapi.py", line 270, in getOwnedGames
                                                game.scrapePromo(s, artupdate)
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\steambmc\steamapi.py", line 165, in scrapePromo
                                                file_path = self.__downloadFile(url, session, "promo")
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\steambmc\steamapi.py", line 198, in __downloadFile
                                                for chunk in r.iter_content():
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\script.module.requests2\lib\requests\models.py", line 616, in generate
                                                decode_content=True):
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\script.module.requests2\lib\requests\packages\urllib3\response.py", line 236, in stream
                                                data = self.read(amt=amt, decode_content=decode_content)
                                              File "C:\Users\Leandros\AppData\Roaming\XBMC\addons\script.module.requests2\lib\requests\packages\urllib3\response.py", line 183, in read
                                                data = self._fp.read(amt)
                                              File "C:\Program Files (x86)\XBMC\system\python\Lib\httplib.py", line 542, in read
                                                s = self.fp.read(amt)
                                              File "C:\Program Files (x86)\XBMC\system\python\Lib\socket.py", line 377, in read
                                                data = self._sock.recv(left)
                                            timeout: timed out
                                            -->End of Python script error report<--
@Lukasa
Collaborator

Correct, the fix isn't live yet. This exception is raised when a server doesn't respond with data within our timeout. To prevent these, you can try extending the timeout by passing a timeout parameter to requests.get() or session.get(). However, if the server has for some reason stopped sending data, that bug will manifest in this way.
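
A minimal sketch of that suggestion, with a placeholder URL; until the fix ships, the raw socket.timeout still has to be caught alongside the requests exception:

import socket

import requests

url = 'http://example.com/file'  # placeholder
try:
    # A more generous read timeout gives a slow server longer before giving up.
    r = requests.get(url, stream=True, timeout=60)
except (requests.exceptions.Timeout, socket.timeout):
    # The raw socket.timeout can still escape until the wrapping is released.
    r = None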

@Leandros

OK, I did test it with a 30-second timeout, but I'm going to use a somewhat higher one and try to repeat the request if it fails.

Thanks!

@darshahlu

Is this bug fixed yet?

I am getting this same "socket.timeout: timed out" exception when using requests with a particular embedded device.

The exception occurs even when I use a Session with the max_retries argument: session.mount('http://', HTTPAdapter(max_retries=10)). I'm guessing that once the socket.timeout exception is translated to a requests-based exception, max_retries will catch it and retry.

In the meantime, any advice for getting around this socket.timeout exception? I suppose I could catch and retry at a higher level; a sketch of that follows below.
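
For illustration, a hedged sketch of that mount-plus-retry approach (the URL, timeout, and attempt counts are placeholders; as noted further down, max_retries only covers connection-phase errors, so read timeouts need their own retry loop):

import socket

import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# max_retries only retries connection-phase failures, not read timeouts.
session.mount('http://', HTTPAdapter(max_retries=10))

def get_from_device(url, attempts=5):
    """Placeholder higher-level retry for reads that time out."""
    for attempt in range(attempts):
        try:
            return session.get(url, timeout=5)
        except (requests.exceptions.Timeout, socket.timeout):
            if attempt == attempts - 1:
                raise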

@Lukasa
Collaborator

@darshahlu To be clear, are you actually getting a socket.timeout exception, or one that has been wrapped? Can I see the full exception text?

@darshahlu

@Lukasa Sure, here is the full exception (starting at requests code):

File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 395, in get
return self.request('GET', url, *kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 383, in request
resp = self.send(prep, *
send_kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 486, in send
r = adapter.send(request, **kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\adapters.py", line 394, in send
r.content
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\models.py", line 679, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\models.py", line 616, in generate
decode_content=True):
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\packages\urllib3\response.py", line 236, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\packages\urllib3\response.py", line 183, in read
data = self._fp.read(amt)
File "C:\Python27\Lib\httplib.py", line 567, in read
s = self.fp.read(amt)
File "C:\Python27\Lib\socket.py", line 380, in read
data = self._sock.recv(left)
socket.timeout: timed out

When I merged in the fixed response.py (https://raw.githubusercontent.com/adaschevici/urllib3/296-exception-not-properly-wrapped/urllib3/response.py), the exception changed to the following (the retry did not occur as I had hoped):

File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 395, in get
return self.request('GET', url, *kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 383, in request
resp = self.send(prep, *
send_kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\sessions.py", line 486, in send
r = adapter.send(request, **kwargs)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\adapters.py", line 394, in send
r.content
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\models.py", line 679, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\models.py", line 616, in generate
decode_content=True):
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\packages\urllib3\response.py", line 245, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "C:\dev\virtualenvs\py27mft\lib\site-packages\requests\packages\urllib3\response.py", line 223, in read
"Remote connection closed. Read timed out.")
urllib3.exceptions.ReadTimeoutError: : Remote connection closed. Read timed out.

By the way, this issue is easily reproduced on this particular embedded device; it occurs within seconds. I took a packet trace and I can see a difference between the no-issue and issue cases, but I am not sure whether it is the client (requests) or the server (my embedded device) that is misbehaving.

Thanks!
Darhsan

@Lukasa
Collaborator

Ok, so it looks like there's a fix for the exact issue in the GH issue (unwrapped socket error), so that's something.

I don't think max_retries will help you here. As far as I recall, max_retries only retries if errors occur during connection, which is not happening here. Here, we're timing out on our attempt to read the response.

You say you took a packet trace and can see a difference: do you mind posting the traces?

@darshahlu

Absolutely. Thanks for taking a look!

There are two traces--one with the issue and one without.

Issue trace:
https://drive.google.com/file/d/0B7V4ToYxdpLwVjFDRmVIUXF4b28/edit?usp=sharing

No Issue trace:
https://drive.google.com/file/d/0B7V4ToYxdpLwSlNKY2NZZW1FSVk/edit?usp=sharing

In the issue trace, you can see that the server continually retransmits the FIN,ACK--as if it never received the client ACK. Shouldn't the client reply with the ACK the server is waiting to see?

@Lukasa Lukasa closed this Jan 19, 2015