Question: What is your recommended approach to MultiThreading? #158
Hi @zurferr, According to this python-requests GitHub thread, the Session object is only "almost" thread-safe. The recommended approach is to use a separate session per thread, or no session at all, depending on your use case. So unfortunately yes, it is sub-optimal, and I don't see an easy way around it. There are solutions like requests-futures available, but python-arango is not equipped to use them (yet). What you could do is find out why Session objects are not considered thread-safe. If the reasons do not apply to you, you might be able to get away with simply increasing the connection pool size like this:

import requests

session = requests.Session()
# Raise the pool limits so many threads can hold connections at once.
adapter = requests.adapters.HTTPAdapter(
    pool_connections=100,
    pool_maxsize=100)
session.mount('http://', adapter)

See here for customizing your HTTP client. |
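The "separate session per thread" approach mentioned above can be sketched with `threading.local`. This is only an illustration, not something python-arango ships: `PerThreadSession` and its `factory` argument are hypothetical names, and the sketch assumes you would pass `requests.Session` (or any other zero-argument callable) as the factory.

```python
import threading


class PerThreadSession:
    """Hands each thread its own lazily created session object.

    Hypothetical helper: `factory` is any zero-argument callable,
    for example `requests.Session`.
    """

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def get(self):
        # Attributes on threading.local() are only visible to the thread
        # that set them, so every thread ends up with its own session.
        session = getattr(self._local, "session", None)
        if session is None:
            session = self._factory()
            self._local.session = session
        return session
```

A custom HTTP client could then call `sessions.get()` inside `send_request` instead of sharing one session, at the cost of one connection pool per thread.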
Hi @joowani, Along the way I found a small typo in https://docs.python-arango.com/en/main/http.html:

http_adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount('https://', adapter)  # should be http_adapter
session.mount('http://', adapter) |
Alright, I now have a custom client that uses only urllib3 and so should be thread-safe. Right now, it only works with basic auth. This way I avoid a session but can still reuse the TCP connection pool handled by urllib3.

import urllib3
from urllib3.response import HTTPResponse

from arango.http import HTTPClient
from arango.response import Response


class ThreadSafeHTTPClient(HTTPClient):
    """HTTP client that relies only on urllib3, which is thread-safe."""

    def create_session(self, host):
        # Not a requests.Session: just a urllib3 connection pool manager.
        http = urllib3.PoolManager(num_pools=50)
        return http

    def send_request(self,
                     session: urllib3.poolmanager.PoolManager,
                     method,
                     url,
                     params=None,
                     data=None,
                     headers=None,
                     auth=None):
        # Join headers with basic auth; auth must be (username, password).
        headers = {**(headers or {}),
                   **urllib3.make_headers(basic_auth=':'.join(auth))}
        # Send a request. Note: urllib3 encodes `fields` into the URL only
        # for GET, HEAD, DELETE and OPTIONS requests.
        response: HTTPResponse = session.request(
            method=method,
            url=url,
            fields=params,
            body=data,
            headers=headers,
            timeout=urllib3.Timeout(connect=10.0, read=60.0)
        )
        # Return an instance of arango.response.Response.
        return Response(
            method=method,
            url=response.geturl(),
            headers=response.headers,
            status_code=response.status,
            status_text=response.reason,
            raw_body=response.data,
        ) |
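The header-joining step in the client above fails when `headers` or `auth` is `None`. As a rough sketch of a more forgiving variant, the helper below builds the same kind of `Basic` authorization header with the standard library only. `build_headers` is a hypothetical name, and the lowercase `authorization` key is an assumption meant to mirror roughly what `urllib3.make_headers(basic_auth=...)` produces.

```python
import base64


def build_headers(headers=None, auth=None):
    """Merge request headers with an optional basic-auth header.

    Hypothetical helper: `auth` is assumed to be a (username, password)
    tuple or None.
    """
    merged = dict(headers or {})  # copy, so the caller's dict is untouched
    if auth is not None:
        username, password = auth
        credentials = f"{username}:{password}".encode("latin-1")
        token = base64.b64encode(credentials).decode("ascii")
        merged["authorization"] = f"Basic {token}"
    return merged
```

With a helper like this, requests without credentials simply go out without an authorization header instead of raising a `TypeError`.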
Hi @zurferr, This is great! When I find some time I'll play around with this myself. If it passes all the tests I'll also put it in the threading documentation page. |
Oh and I'll also fix the typo. Thanks for pointing it out. |
Hi @joowani, I am currently moving apartments and probably should not publish code right now. Here is a version that should work for all sorts of requests.

from urllib.parse import urlencode

import urllib3
from urllib3.response import HTTPResponse

from arango.http import HTTPClient
from arango.response import Response


class ThreadSafeHTTPClient(HTTPClient):
    """My custom HTTP client with cool features."""

    def create_session(self, host):
        """Not a real session, only a connection pool manager."""
        # Allow up to 100 concurrent connections per host pool;
        # further requests block until a connection is free.
        http = urllib3.PoolManager(maxsize=100, block=True)
        return http

    def send_request(self,
                     session: urllib3.poolmanager.PoolManager,
                     method,
                     url,
                     params=None,
                     data=None,
                     headers=None,
                     auth=None):
        # Join headers with basic auth; auth must be (username, password).
        headers = {**(headers or {}),
                   **urllib3.make_headers(basic_auth=':'.join(auth))}
        # Encode the query parameters into the URL ourselves, so they are
        # handled the same way for every HTTP method.
        if params:
            url = url + '?' + urlencode(params)
        # Send a request.
        response: HTTPResponse = session.request(
            method=method,
            url=url,
            body=data,
            headers=headers,
            timeout=urllib3.Timeout(connect=10.0, read=60.0)
        )
        # Return an instance of arango.response.Response.
        return Response(
            method=method,
            url=response.geturl(),
            headers=response.headers,
            status_code=response.status,
            status_text=response.reason,
            raw_body=response.data,
        ) |
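The query-string step in the client above assumes the URL has no query part yet. A slightly more defensive variant might look like the sketch below. `append_query` is a hypothetical helper, not part of python-arango or urllib3, and it assumes `params` is a flat mapping (or `None`).

```python
from urllib.parse import urlencode


def append_query(url, params=None):
    """Append URL-encoded params to url, reusing an existing query part.

    Hypothetical helper; `params` is assumed to be a flat mapping or None.
    """
    if not params:
        return url  # nothing to add, so avoid a dangling "?"
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)
```

Inside `send_request`, `url = append_query(url, params)` would then replace the manual concatenation.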
Closing this out. Feel free to reopen if you have any more questions. Thanks. |
Hi,
I want to use python-arango in a multithreaded environment (the web server spawns a thread for each request). Hundreds of concurrent requests are possible.
After reading the documentation, I understand that I might run into problems with the Session and indeed I had occasional 'Connection Refused' errors.
https://docs.python-arango.com/en/main/threading.html
I found the following issue, where a custom NoSessionHttpClient was used:
#92 (comment)
Is this the recommended solution for dealing with multithreading?
It seems sub-optimal because no connection pooling is used at all, so performance, especially latency, will suffer.
Do you know of better solutions? Thanks for reading. :)