Improve connection towards Azure #121
The one thing we need to make sure of when reusing connections is that they are re-created when they fail because the IP address changes. Over longer periods (reusing connections in long-running processes), the connection must be re-created if the DNS record for the blob store changes. I assume this is handled by retries on the methods themselves, but since we use PUT and POST against blobs, and POST is not idempotent, I have to read up on the details to be sure.
datareservoirio/storage/storage.py
@@ -15,6 +15,14 @@

log = logging.getLogger(__name__)

_AZURE_SESSION = requests.Session()
Is this really a blobstorage or blobstorageaccount session? All of our resources are in Azure.
Do you mean we should call it BLOBSTORAGE_SESSION to be more precise? :)
Just keep the underscore prefix, and call it whatever you like :)
renamed to BLOBSTORAGE_SESSION now
Regarding retries over DNS failures: from what I have read, this should be handled, though I haven't tested it.
All low-level network handling is done by urllib3, so I am almost certain that the fundamentals are covered.
Theoretically, we run the risk of leaving a session (and all the related low-level resources) open and allocated. But since it's one session per Python kernel, I'm sure it won't have an impact overall.
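If the retry behaviour discussed above needs to be explicit rather than assumed, one option is to mount an adapter with a urllib3 `Retry` policy on the shared session. This is only a sketch, not what the PR does: the function name `make_retrying_session`, the retry counts, the backoff factor and the status codes are all assumptions chosen for illustration.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total_retries=3, backoff_factor=0.5):
    """Hypothetical helper: build a session with an explicit retry policy."""
    retry = Retry(
        total=total_retries,
        connect=total_retries,  # failures while establishing the connection
        read=total_retries,     # failures after the request was sent
        backoff_factor=backoff_factor,
        status_forcelist=(500, 502, 503, 504),  # assumed set of retryable statuses
        # The allowed HTTP methods are left at the urllib3 default, which
        # covers idempotent methods such as GET and PUT but not POST.
    )
    session = requests.Session()
    # Apply the policy to every HTTPS request made through this session.
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session
```

With this setup, a dropped connection should be re-established on retry, and opening a new connection resolves the host name again. Whether that is sufficient for POST requests against blobs would still have to be checked, since read errors and status-based retries are not applied to POST by default.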
LGTM
This PR is related to user story ESS-XXXX
Description
Establish and reuse a requests.Session object for connections to Azure. In this way, the underlying TCP connection is reused, which can potentially result in a more performant and stable network connection. (See https://docs.python-requests.org/en/latest/user/advanced/#session-objects.) The effect is marginal and barely observable in practice (normal office conditions), since there are other bottlenecks in the process. Nonetheless, it is an improvement in code quality with respect to network connection performance.