Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get request hangs "in readinto return self._sock.recv_into(b)" without raising exception #3353

Closed
FlxVctr opened this issue Jun 18, 2016 · 20 comments

Comments

@FlxVctr
Copy link

FlxVctr commented Jun 18, 2016

Hi,

I'm pretty desperately trying to nail down the reason that an url unshortener script is freezing from time to time.

I'm using requests with several head and get-requests, such as

with requests.Session as s:
    r = s.get(url, allow_redirects=True, timeout=self.timeout, headers=self.headers, verify=False)

where self.timeout are 10 seconds and the headers just include a user-agent.

The request is called by a worker in a multiprocessing Pool with 10 workers that get's constantly feeded via Pool.imap() with a list of urls.

When keyboard interrupting I get from some of the workers the following Traceback:

Traceback (most recent call last):
  File "---/dataprocessor.py", line 98, in try_unshorten_expanded_urls
    result = unshorten_expanded_urls(tweet_dataframe_tuple)
  File "---/dataprocessor.py", line 84, in unshorten_expanded_urls
    'unshortened_url': u.unshorten(expanded_urls[i]),
  File "---/unshorten.py", line 162, in unshorten
    verify=False)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 480, in get
    return self.request('GET', url, **kwargs)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 608, in send
    r.content
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 737, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 660, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 344, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 301, in read
    data = self._fp.read(amt)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 433, in read
    n = self.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 473, in readinto
    n = self.fp.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt

If the function would raise an exception, I've made sure that it would be catched within the function so that it would not block the pool. But as it seems, readinto is just hanging here (for hours). I don't know whether this is related to #3066, as I am not familiar enough with the protocols or requests itself, so please excuse a duplicate if this should be one or if the error is my bad.

The freeze happened several times with the same batch of urls now.
I tried to find the exact url causing this freeze but did not succeed as the batch contains several thousand, though I've already excluded everything that exceeds a certain content length or does not have text as content type via a head request before the get.

I am running requests 2.10.0 with Python 3.5.1 in a miniconda environment installed via pip on an Ubuntu 14.04.

However, great library, hope this helps to make it better.

Cheers!

UPDATE:

Here the relevant code for the traceback (line 162 in unshorten.py):

def check_size_and_type(request_object, max_content_length):
    try:
        length = int(request_object.headers['content-length'])
    except:
        length = None
    try:
        type = str(request_object.headers['content-type'])
    except:
        type = ''
    if ((length is None or length < max_content_length) and
        (type.startswith('text/html') or
         type.startswith('application/xhtml') or
         type.startswith('text/xml'))):
        return True
    else:
        return False

with requests.Session() as s:
    r = s.head(url, allow_redirects=True,
                     timeout=self.timeout,
                     headers=self.headers,
                     verify=False)

if check_size_and_type(r, self.max_content_length):
    r = s.get(url, allow_redirects=True,
    timeout=self.timeout,
    headers=self.headers,
    verify=False)
@Lukasa
Copy link
Member

Lukasa commented Jun 18, 2016

Thanks for this report!

What worries me is that we're getting stuck even with a timeout. The timeout should prevent any socket call lasting longer than that number of seconds. This suggests two possibilities to me that aren't bugs, as well as the possibility of a bug (though it's unlikely).

Firstly, can you use a print statement to confirm your timeout value at the call site?

Secondly, it's possible that data is slowly being fed to the socket so that each individual socket call takes less than 10s. Can you try waiting to see if this problem resolves itself?

Thirdly is the risk of bugs. For that I ideally need a reproducible script I can run in my own environment. Do you think you can produce one?

@FlxVctr
Copy link
Author

FlxVctr commented Jun 18, 2016

Thanks for the quick answer!

I can confirm via PyCharm Debugger and print statements that the timeout is 10 seconds for every call.

It could be that data is slowly being fed, but it hanged for more than 8 hours the last time and max_content_length is 500kB. Waiting for such a long time (especially if it happens over the weekend) leads to quite a bunch of data piling up to be processed afterwards and is not available for live analysis.

This happened already for several datasets we're collecting.

I will try to write something, but to make it reproducible I'd have to narrow down a smaller batch of urls causing it. This could take a while.

@Lukasa
Copy link
Member

Lukasa commented Jun 19, 2016

Yeah, that's fine, there's no rush, but I'm unlikely to be able to discover this without a good repro scenario: it's likely that this is a weird interaction around sockets, forking, and the standard library.

@FlxVctr
Copy link
Author

FlxVctr commented Jun 19, 2016

Ok, I think I found the url. But it's no problem with requests then. Rather a misconfigured webserver ...

>>> r = requests.head('http://bluepoint-radio.de:9080/')
>>> r.headers
{'Content-Type': 'text/html'}

😡

Thanks for your help though.

@FlxVctr FlxVctr closed this as completed Jun 19, 2016
@FlxVctr
Copy link
Author

FlxVctr commented Jun 19, 2016

Yes, reproduces the error:

>>> r = requests.get('http://bluepoint-radio.de:9080/', timeout=5)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/api.py", line 67, in get
    :rtype: requests.Response
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/api.py", line 53, in request
    # By using the 'with' statement we are sure the session is closed, thus we
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 468, in request

  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 608, in send
    # Shuffle things around if there's history.
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 737, in content

  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 660, in generate
    def generate():
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 344, in stream
    :param decode_content:
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 301, in read
    data = None
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 433, in read
    n = self.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 473, in readinto
    n = self.fp.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt
>>> 

@FlxVctr
Copy link
Author

FlxVctr commented Jun 19, 2016

Just for reference if somebody is looking this up:

Use body content workflow instead of head requests to make content retrieval conditional. Headers from head requests seem to be unreliable.

@wsw70
Copy link

wsw70 commented Oct 25, 2016

I have a very similar problem (if not the same): an application is randomly hanging (every ~10 to 20 days, roughly) and I connected gdb to a hung session today. py-list shows

(gdb) py-list
 366            self._checkReadable()
 367            if self._timeout_occurred:
 368                raise OSError("cannot read from timed out object")
 369            while True:
 370                try:
>371                    return self._sock.recv_into(b)
 372                except timeout:
 373                    self._timeout_occurred = True
 374                    raise
 375                except InterruptedError:
 376                    continue

it just hangs here, apparently (I have no experience with gdb but when I disconnect and reconnect, py-list brings me to the same place above).

@FlxVctr : thanks for the tip but I do not use head requests (nor body content workflow)

@Lukasa
Copy link
Member

Lukasa commented Oct 25, 2016

That hang point seems to be in the standard library. Do you have a full Python traceback?

@wsw70
Copy link

wsw70 commented Oct 25, 2016

@Lukasa : no, my program hung (without any traceback) and I was wondering what could have caused the hanging. I just did a gdb python3 <pid of my script> and the only valuable thing I got was the pitput of py-list. I just rebooted my machine (it is a RPi) because I needed the script to run but if there is something i can do with the code to make it more talkative please let me know.

@Lukasa
Copy link
Member

Lukasa commented Oct 25, 2016

gdb should be capable of printing a Python traceback.

@wsw70
Copy link

wsw70 commented Oct 25, 2016

Ah, I still have what I tried on the terminal. Yes, there is indeed a traceback:

(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.4/socket.py", line 371, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.4/wsgiref/simple_server.py", line 118, in handle
    self.raw_requestline = self.rfile.readline(65537)
  File "/usr/lib/python3.4/socketserver.py", line 669, in __init__
    self.handle()
  File "/usr/lib/python3.4/socketserver.py", line 344, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python3.4/socketserver.py", line 331, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.4/socketserver.py", line 305, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.4/socketserver.py", line 238, in serve_forever
    self._handle_request_noblock()
  File "/usr/lib/python3/dist-packages/bottle.py", line 2769, in run
    srv.serve_forever()
  File "/usr/lib/python3/dist-packages/bottle.py", line 3114, in run
    server.run(app)
  File "webserver.py", line 510, in __init__
  File "webserver.py", line 547, in <module>

@Lukasa
Copy link
Member

Lukasa commented Oct 25, 2016

@wsw70 I should note that requests is not in that stack. 😉

@wsw70
Copy link

wsw70 commented Oct 25, 2016

Yes, I googled the question and saw references to urllib*, socket etc. - all of them were quite cryptic. I just mentioned in this thread that I have the same issue and that it is has indeed something to do with the standard lib.

Did I mention that requests is awesome? (and I really mean it, it is an opportunity to voice this out since the core dev is listening :) )

@Lukasa
Copy link
Member

Lukasa commented Oct 25, 2016

Yup. =P So that hang seems to be related to a TCP connection falling over. You'll have to take that up with the bottle devs. =D And I'm glad you like Requests!

@p0ooo0ya
Copy link

p0ooo0ya commented Nov 18, 2017

Having the same issue here. with timeout without it doesn't make any difference. It is hanging in the socket module self._sock.recv_into(b)

python3 "/home/devilfromir/code/src/utilities/deckdl/ddtokenizer.py"
^CTraceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/requests/packages/urllib3/connectionpool.py
", line 379, in _make_request
httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/devilfromir/code/src/utilities/deckdl/ddtokenizer.py", line 63, in
ip = anon.IP()
File "/home/devilfromir/code/src/utilities/deckdl/anon.py", line 26, in init
self.check(withTor=True)
File "/home/devilfromir/code/src/utilities/deckdl/anon.py", line 37, in check
res = req.get('http://www.icanhazip.com', headers=headerGen(''))
File "/home/devilfromir/code/src/utilities/deckdl/anon.py", line 87, in torReq
r = getattr(requests, attrName)(self.url, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 609, in send r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 423, in send
timeout=timeout
File "/usr/local/lib/python3.6/dist-packages/requests/packages/urllib3/connectionpool.py
", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.6/dist-packages/requests/packages/urllib3/connectionpool.py
", line 382, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
response.begin()
File "/usr/lib/python3.6/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.6/http/client.py", line 258, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
KeyboardInterrupt

This happens when i tunnel through tor using privoxy as http proxy

@cpys
Copy link

cpys commented Oct 24, 2018

遇到了同样的问题,进程在return self._sock.recv_into(b)这里卡住了

@aaronhua123
Copy link

I had the same problem in windows,hangs in self._sock.recv_into(b) .Maybe once in a dozen days.I use multiprocessing and requests.But when I switch to threading, It don't hangs in the socket.This may be a complicated problem.

@JackNova
Copy link

JackNova commented May 1, 2020

I'm having a similar issue.
My application hangs in self._sock.recv_into(b) without throwing any exception. I have configured pool_manager.connection_pool_kw["timeout"] with a small timeout.

I tried to trace the problem by including the hanging_threads package in the module where the hang happens. The hang happens in a client stub generated by swagger codegen from an openapi schema.

This is how I managed to get the traceback:

from hanging_threads import start_monitoring monitoring_thread = start_monitoring()

The issue seems related to the fact that I was issuing an http request to the same resource from whom I was called, in the same thread and before issuing a response to the caller.
Moving the call to another thread solves the problem. It probably has something to do with the underlying socket but I need to investigate further.

The following is the traceback of the hang

--------Thread 139917216446208 "Thread-12" hangs ----------
File "/p37/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/p37/threading.py", line 926, in _bootstrap_inner
self.run()
File "/p37/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/p37/socketserver.py", line 650, in process_request_thread
self.finish_request(request, client_address)
File "/p37/socketserver.py", line 360, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/p37/socketserver.py", line 720, in init
self.handle()
File "/p37/site-packages/werkzeug/serving.py", line 345, in handle
BaseHTTPRequestHandler.handle(self)
File "/p37/http/server.py", line 426, in handle
self.handle_one_request()
File "/p37/site-packages/werkzeug/serving.py", line 379, in handle_one_request
return self.run_wsgi()
File "/p37/site-packages/werkzeug/serving.py", line 323, in run_wsgi
execute(self.server.app)
File "/p37/site-packages/werkzeug/serving.py", line 312, in execute
application_iter = app(environ, start_response)
File "/p37/site-packages/flask/cli.py", line 337, in call
return self._app(environ, start_response)
File "/p37/site-packages/flask/app.py", line 2464, in call
return self.wsgi_app(environ, start_response)
File "/p37/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/p37/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/p37/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functionsrule.endpoint
File "/p37/site-packages/my_app/init.py", line 49, in ast
STORYBOARD_FSM.ast_request()
File "/p37/site-packages/my_app/my_app.py", line 123, in ast_request
self._state.ast()
File "/p37/site-packages/my_app/my_app.py", line 231, in ast
self.context.transition_to(Reading())
File "/p37/site-packages/my_app/my_app.py", line 116, in transition_to
self._state.do()
File "/p37/site-packages/my_app/my_app.py", line 184, in do
self.context.transition_to(RunningLookAt())
File "/p37/site-packages/my_app/my_app.py", line 116, in transition_to
self._state.do()
File "/p37/site-packages/my_app/my_app.py", line 245, in do
embs.call_look_at_api(point)
File "/p37/site-packages/my_app/my_service.py", line 62, in call_look_at_api
api_instance.look_at_post(body)
File "/swagger_client/api/idle_api.py", line 148, in look_at_post
(data) = self.look_at_post_with_http_info(body, **kwargs) # noqa: E501
File "/swagger_client/api/idle_api.py", line 221, in look_at_post_with_http_info
collection_formats=collection_formats)
File "/swagger_client/api_client.py", line 321, in call_api
_preload_content, _request_timeout)
File "/swagger_client/api_client.py", line 152, in __call_api
_request_timeout=_request_timeout)
File "/swagger_client/api_client.py", line 364, in request
body=body)
File "/swagger_client/rest.py", line 274, in POST
body=body)
File "/swagger_client/rest.py", line 166, in request
headers=headers)
File "/p37/site-packages/urllib3/request.py", line 80, in request
method, url, fields=fields, headers=headers, **urlopen_kw
File "/p37/site-packages/urllib3/request.py", line 171, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/p37/site-packages/urllib3/poolmanager.py", line 336, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/p37/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/p37/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/p37/http/client.py", line 1344, in getresponse
response.begin()
File "/p37/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/p37/http/client.py", line 267, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/p37/socket.py", line 589, in readinto
return self._sock.recv_into(b)
--------Thread 139917208053504 "Thread-11" died ----------

@sonu3009
Copy link

sonu3009 commented Jun 7, 2020

I am also facing similar issue, using Python3.7

Code:
import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
print("Hello World 1")
audio = r.listen(source)
voice_data = r.recognize_google(audio)
print(voice_data)

Error:-
voice_data = r.recognize_google(audio)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\speech_recognition_init_.py", line 840, in recognize_google
response = urlopen(request, timeout=self.operation_timeout)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 1345, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 1320, in do_open
r = h.getresponse()
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 1336, in getresponse
response.begin()
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 306, in begin
version, status, reason = self._read_status()
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 267, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\socket.py", line 589, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

@guangningyu
Copy link

I'm having the same issue and it turns out that it is the socket getting disconnected. It works fine once the TCP keep alive options are set explicitly:

# Set TCP keep alive options to avoid HTTP requests hanging issue
# Reference: https://stackoverflow.com/a/14855726/2360527
import platform
import socket
import urllib3.connection

platform_name = platform.system()
orig_connect = urllib3.connection.HTTPConnection.connect
def patch_connect(self):
    orig_connect(self)
    if platform_name == "Linux" or platform_name == "Windows":
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1),
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 3),
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5),
    elif platform_name == "Darwin":
        TCP_KEEPALIVE = 0x10
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        self.sock.setsockopt(socket.IPPROTO_TCP, TCP_KEEPALIVE, 3)
urllib3.connection.HTTPConnection.connect = patch_connect

import requests
requests.get(...)

Requests version: requests==2.24.0

Useful links:

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 29, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants