Connection pool exhausted when connection failures occur, should refill with empty connections #644
Hello again! I think we need to be sure that we close connections before we return them to the pool. Otherwise we run the risk of attempting to re-use a live connection that has timed out, which will end extremely poorly for us.
We should be putting a `None` back in the pool.
@shazow's idea is even better than mine.
@jlatherfold Is this something you'd be interested in working on? Producing a failing test would be the first step. Make a ConnectionPool with a small size (1-2), trigger some connection exceptions, see if the pool gets exhausted.
Hi, I've already done this, hence this bug report. Locally I've fixed it by putting the [closed] connection back on to the pool in the exception handlers (which do close the connections). Those Nones or closed connections are discarded and a new one is returned in subsequent calls to _get_conn. Sadly I don't know when I will get time to do any more on this right now.
@jlatherfold Would be handy if you can share your code for reproducing the issue. :) Also if you can confirm that putting `None` back on the pool works.
I can confirm that `_put_conn(None)` works as expected. Here's some basic code which reproduces this error. Setting the read timeout to a low value will cause read-timeout exceptions. You'll need to hit a URL with some large content (it doesn't have to be massive, 500 KB or so, although you can probably replicate this with much less):

```python
import requests

def get(req, url):
    # the read timeout is set low to force timeout exceptions
    r = req.get(url, stream=True, timeout=(0.5, 0.1))
    r.raise_for_status()
    return r.iter_content(chunk_size=4096)

if __name__ == '__main__':
    # if you need to auth the session, do so here
    session = requests.Session()
    # a pool size of 1 with 1 retry will make this program hang on the
    # first retry attempt (failed connection or read timeout).
    # Set pool_maxsize=4 and add trace statements in connectionpool.py
    # that print the queue size in _get_conn and _put_conn, and watch it
    # slowly decrease as we hit timeouts and retries.
    adapter = requests.adapters.HTTPAdapter(pool_connections=1, pool_maxsize=1,
                                            pool_block=True, max_retries=1)
    session.mount('http://', adapter)
    while True:
        try:
            print('request.get......')
            content = get(session, '<url to large content>')
            # consume the response
            for chunk in content:
                pass
        except Exception:
            print('oops.....')
        # after you've hit a few timeouts you're not going to get here
        print('request.done......')
```
Sorry - didn't know how to format for code - but you get the idea.... |
Updated code formatting. |
Bonus points if somebody wants to translate that to plain-urllib3 and double-bonus if you make it into a test. :) |
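A rough plain-urllib3 translation of the reproduction above might look like the sketch below. This is an untested illustration, not a verified test case: `reproduce`, the host, and the path are placeholder names, and `maxsize=1` with `block=True` mirrors the `pool_maxsize=1` / `pool_block=True` settings in the requests version.

```python
import urllib3

def reproduce(host, path):
    # maxsize=1 + block=True mirrors pool_maxsize=1 / pool_block=True above;
    # the short read timeout forces a read timeout on any sizeable body.
    pool = urllib3.HTTPConnectionPool(
        host, maxsize=1, block=True, retries=1,
        timeout=urllib3.Timeout(connect=0.5, read=0.1))
    while True:
        try:
            r = pool.urlopen('GET', path, preload_content=False)
            for _ in r.stream(4096):  # consume the response
                pass
            r.release_conn()
        except Exception:
            print('timeout.....')
        # once enough timeouts have leaked connections, urlopen blocks here
        print('request.done......')

# reproduce('example.com', '/large-file')  # point at a host serving ~500 KB
```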
Ok, I'll make the patch myself. Please let me know what procedure I need to follow or point me in the right direction where I can get that info as I have not made any code contributions before. |
@jlatherfold Thanks! Fork the project, make a branch, start with a test that shows the failure (we have full test coverage; have a look at other tests for examples), then send a PR for feedback. We'll review the code and give you more pointers. If you can think of a way to reproduce the scenario without doing a full end-to-end request, then test/test_connectionpool.py would be a good place to put it. If you need a server, take a look at the with_dummyserver subdirectory of the test suite. We lately prefer the test_socketlevel.py style of testing with a dummyserver when possible. Might be wise to first write a simple example of the scenario reproducing in plain urllib3, to make sure it's not requests' fault.
Hi, I now have a fix for this with unit test and am ready to submit a PR. The work has been done on a local branch. Would you like me to merge it to master or push the branch to my fork and use that? |
@jlatherfold Whatever you prefer, as long as it ends up as a Pull Request on our end. :) |
When a ConnectionError occurs in python-requests <= 2.8.0 and/or urllib3 <= 1.11, connections are not returned to the pool. This change adds a CustomHTTPAdapter to work around the issue: it enforces preloading of the response, and with preloading enabled urllib3 releases the connection back to the pool immediately after use, so the bug is never triggered. Enforcing preloading breaks some requests features (like stream) that we didn't use in our GnocchiClient.

Upstream bugs:

* urllib3/urllib3#659
* urllib3/urllib3#651
* urllib3/urllib3#644
* https://github.com/kennethreitz/requests/issues/2687

We can remove this once requests 2.8.0 is released.

Closes-bug: #1498823
Change-Id: I7d40ade927b834909e230613777cba1f7537c0ec
Project: openstack/ceilometer 75491297e34a359fbc1faff7bba484693a0b32af "Workaround requests/urllib connection leaks" (same commit message as above).
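The ceilometer commit above works around the leak by forcing the response body to be consumed so the connection is released immediately. A minimal sketch of that idea is below; `PreloadHTTPAdapter` is a hypothetical name (not the actual CustomHTTPAdapter from the commit), and fully reading the body in `build_response` is one way to get the "preload" effect at the cost of breaking `stream=True`.

```python
import requests
from requests.adapters import HTTPAdapter

class PreloadHTTPAdapter(HTTPAdapter):
    """Hypothetical adapter: consume the body up front so urllib3
    releases the connection to the pool right away (defeats stream=True)."""
    def build_response(self, req, resp):
        response = super().build_response(req, resp)
        response.content  # fully read the body, releasing the connection
        return response

# session = requests.Session()
# session.mount('http://', PreloadHTTPAdapter())
```

Mounting this on a session trades streaming for pool safety, which is exactly the trade-off the commit message describes.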
…b3#644 Pull-request for branch users/faustkun/VHSUP-8556 ref:fe62c165ed60688fc43330aae53b43b889f132ce
I think we're experiencing an issue related to this when communicating over HTTPS. Using urllib3 version 1.25.3, I used @jlatherfold's proof of concept and https://beeceptor.com/ to create an HTTPS endpoint with an artificial 10 second timeout (passed into the example here as
If preload_content is not specified on a request, connections are implicitly returned to the pool after the response is read. However, when a timeout error occurs or the connection is dropped, the connection is closed but not returned to the pool (since the response is None). This problem is compounded when retries are turned on, since the next attempt will grab a new connection from the pool, depleting it further. With non-blocking pools this is not a problem, because we create a new connection whenever the pool is empty; but when the pool is blocking we have found that the pool size eventually reaches zero (after as many timeout errors as there are connections), which causes the calling application to hang on its next request.

This can be fixed in ConnectionPool.urlopen's exception handlers by explicitly returning the closed connection to the pool via _put_conn (if release_conn is false), since subsequent calls to _get_conn check whether the connection has been dropped and return a new one.
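The refill behaviour described above can be modelled with a plain `LifoQueue`. The sketch below is a toy simplification, not urllib3's actual code: it shows why losing a slot on every error eventually empties a blocking pool, and why putting a `None` placeholder back keeps it topped up.

```python
import queue

class MiniPool:
    """Toy model of a blocking connection pool (not urllib3's real code)."""
    def __init__(self, maxsize):
        self.pool = queue.LifoQueue(maxsize)
        for _ in range(maxsize):
            self.pool.put(None)  # empty slot: "make a new connection"

    def _get_conn(self):
        conn = self.pool.get(block=True, timeout=1)  # blocks when exhausted
        if conn is None:
            conn = object()  # stand-in for a fresh HTTPConnection
        return conn

    def _put_conn(self, conn):
        self.pool.put(conn, block=False)

# Buggy behaviour: on error the connection is closed but never returned,
# so each failure permanently shrinks the pool.
buggy = MiniPool(2)
buggy._get_conn()          # slot lost to a "timeout"
buggy._get_conn()          # slot lost again; pool is now empty
print(buggy.pool.qsize())  # 0 -> the next _get_conn would block forever

# Fixed behaviour: the exception handler does _put_conn(None),
# refilling the pool with an empty slot.
fixed = MiniPool(2)
conn = fixed._get_conn()   # a "timeout" happens while using conn
fixed._put_conn(None)      # return an empty slot instead of dropping it
print(fixed.pool.qsize())  # 2
```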