-
Notifications
You must be signed in to change notification settings - Fork 817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Session for HTTP requests #738
Conversation
Include a default requests.Session as part of settings that is used for HTTP requests to OSM APIs, and can be overridden using the utils.config. This should improve performance by re-using the underlying TCP connection when sending multiple requests to the same host, and allows a user of osmnx to override the session to provide client/server certificates when making request to any OSM API that uses 2-way SSL verification.
Correct formatting for imports.
Update downloader.py to fix out-of-scope variable.
Re-format code using black (for line length)
…/osmnx into use-requests-session
Switched to Ubuntu dev machine to properly run lint_test.sh, after introducing formatting issue through Github merge conflict interface.
Codecov Report
@@ Coverage Diff @@
## main #738 +/- ##
==========================================
- Coverage 96.77% 96.65% -0.13%
==========================================
Files 24 24
Lines 2422 2423 +1
==========================================
- Hits 2344 2342 -2
- Misses 78 81 +3
Continue to review full report at Codecov.
|
Thanks @hokieg3n1us let me think about this. Two questions for you:
|
Updated docstring to better describe how to use session. Since Session creates a persistent TCP connection that is used when requests are sent to the same host/port, remove redundant fix from #699. Unable to duplicate network error from tests after running multiple times on Python 3.6, 3.7, 3.8, and 3.9.
|
Diagnosing the intermittent test failures, determined that the root cause was the server closing the connection due to inactivity. Refactored the settings to expose a dictionary of attributes that can be configured for a requests.Session that is instantiated when a call is made to _osm_network_download, _osm_geometries_download, or nominatum_request. Less elegant, but should allow users to provide authentication and 2-way SSL configuration, while also improving performance during repeated calls to the overpass API.
I diagnosed the intermittent failures (inactivity on the underlying TCP connection causing the server to close), and refactored significantly to still gain the benefits of using a requests Session, while providing some of the more important configuration options to a user. More of the underlying Session parameters can be exposed as necessary. |
Properly check if key exists in session_config, so a user can only provide keys as necessary for their environment.
moving files used for local testing of different python versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left just a few comments. My main overall question would be: have you tested this to confirm consistent IP address resolution in line with what #699 provided? If so, could you document those test methods/results here? Thanks!
utils.log(f"Pausing {this_pause} seconds before making HTTP POST request") | ||
time.sleep(this_pause) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need to be indented back out of the if
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That produces an out of scope variable reference to this_pause similar to the recently fixed issue for the response status code.
and verify as keys. auth is a tuple of (username, password). cert is the | ||
path to a single file (containing the private key and certificate) or a tuple | ||
of both files' paths. verify can be the path to a CA_BUNDLE file, or directory | ||
of trusted CAs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier and more future-proof to just point the user toward the requests documentation on requests.Session.auth, cert, and path since we're just passing these values through to those requests.Session attributes.
@@ -275,7 +278,7 @@ def _get_pause(base_endpoint, recursive_delay=5, default_duration=60): | |||
|
|||
try: | |||
url = base_endpoint.rstrip("/") + "/status" | |||
response = settings.session.get(url, headers=_get_http_headers()) | |||
response = session.get(url, headers=_get_http_headers()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_create_session
is not called in the _get_pause
function, but it is called in the other functions that make HTTP requests to the Overpass server?
@@ -451,6 +454,8 @@ def _osm_network_download(polygon, network_type, custom_filter): | |||
else: | |||
osm_filter = _get_osm_filter(network_type) | |||
|
|||
session = _create_session() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that _create_session
is called 3 times (i.e. separately in the different "download" functions). Is there a reason to do it this way, rather than creating a session one time, for example whenever ox.config
is called, and then just using that session throughout?
I am closing this pull request because I cannot reproduce the intermittent issues causing the remote server to close the connection, and that's difficult to debug properly without more information about OSM's service layer. It's also difficult to elegantly construct a session, without refactoring the downloader.py to be instantiated classes that live/die during graph construction since you can't create a single Session to be used for the entirety of osmnx, because the client can't hold open an inactive TCP connection. I plan to identify a route of cleanly passing cert, auth, and verify directly to the requests.post/requests.get. |
New PR for development at #745. |
This proposed change would resolve any issues similar to issue #17 .
Description: Include a dictionary of configuration for a requests Session that is instantiated by the downloader, including keys for auth, cert, and verify. This allows a user to modify that session by providing a tuple of username and password for basic authentication, or provide client and server certificates. It should also improve performance when making repeated requests to the same host by re-using the underlying urllib3 connection.
Example Usage: