Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure session timeout is used, add default #68

Merged
merged 7 commits into from Mar 19, 2021
Merged

Ensure session timeout is used, add default #68

merged 7 commits into from Mar 19, 2021

Conversation

lion-sz
Copy link
Contributor

@lion-sz lion-sz commented Mar 15, 2021

The bug described in #66, therefore the session now respects the default ratelimit, if no other ratelimit was provided when calling WaybackSession.request.

Default timeout of 60 seconds.

Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, for now there is no way to actually provide a timeout value to the requests function without hacking the code.

I’m not sure I’m following. Do you have an example of what you mean? It seems like someone could still do:

session = wayback.WaybackSession()
# Custom timeout:
session.request('http://something.com/', timeout=5)
# or no timeout:
session.request('http://something.com/', timeout=None)

wayback/_client.py Show resolved Hide resolved
@Mr0grog
Copy link
Member

Mr0grog commented Mar 15, 2021

Also, could you please add tests for this?

@lion-sz
Copy link
Contributor Author

lion-sz commented Mar 15, 2021

Also, could you please add tests for this?

Sure. I've not yet worked with pytest so give me a couple days.

@lion-sz
Copy link
Contributor Author

lion-sz commented Mar 17, 2021

So, I've added the the tests.
It's somewhat hacky, I've had no idea how to properly check if the timeouts were applied. So I patched the requests.Session.send method to return the provided timeout value back to the caller. If you know of a better solution, let me know and I'll implement it.

I've forgotten to add the mock library to the dev dependencies, that is what the third commit was for.

Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I noted a few things inline. The big one is just that we shouldn’t need to install mock, which is just a backport of the builtin unittest.mock, which we should be able to use instead.

@@ -2,6 +2,8 @@
from pathlib import Path
import pytest
import vcr
import requests
import mock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this project requires Python 3.6+, you can just use the built-in unittest.mock library rather than adding a new dependency. mock is a backport of the built-in for older versions of Python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I didn't know that mock came build in with unittest.
I've changed that (and the typos below).

@@ -474,6 +476,14 @@ def test_get_memento_follow_redirects_does_not_follow_historical_redirects():
assert len(memento.debug_history) == 1


def return_timeout(self, *args, **kwargs) -> requests.Response:
"""Overwirte the requests.Session.send to return a response with the provied timeout value as content"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo:

Suggested change
"""Overwirte the requests.Session.send to return a response with the provied timeout value as content"""
"""Overwrite requests.Session.send to return a response with the provided timeout value as content."""

Comment on lines +528 to +538
@mock.patch('requests.Session.send', side_effect=return_timeout)
def test_timeout_applied_session(self, mock_class):
# Is the timeout applied through the WaybackSession
session = WaybackSession(timeout=1)
res = session.request('GET', 'http://test.com')
assert res.text == '1'
# Overwriting the default in the requests method
res = session.request('GET', 'http://test.com', timeout=None)
assert res.text == 'None'
res = session.request('GET', 'http://test.com', timeout=2)
assert res.text == '2'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hacky, but I think it works fine for now! I’m fine merging this. 👍

If you want to do something fancier, you could check out this timeout test in web-monitoring-diff: https://github.com/edgi-govdata-archiving/web-monitoring-diff/blob/2fe3088b04bb2a58fdb5fec1e92d4fed181233f3/web_monitoring_diff/tests/test_server_exc_handling.py#L192-L202

(It tests that a proxy server correctly times out while contacting the “upstream” server with the actual content. The patch decorator makes the proxy being tested have a short timeout, and responder and SimpleHttpServer create a quick little HTTP server to use as the upstream, which waits longer than the expected timeout before responding. That’s way more complex than what you’re doing here, though. It would be more complete, but not really necessary. Also note that package uses async/await, so the solution here couldn’t be exactly the same.)

@Mr0grog
Copy link
Member

Mr0grog commented Mar 18, 2021

Is this good to go? If so, I’ll merge later tonight and backport to the v0.2.x release line. (And then publish v0.2.6.)

@lion-sz
Copy link
Contributor Author

lion-sz commented Mar 18, 2021

From my side it is good to go.

@Mr0grog
Copy link
Member

Mr0grog commented Mar 19, 2021

@LionSzl I should have asked this before: would you like to be added to the contributors list in the README?

@Mr0grog Mr0grog changed the title Default timeout Ensure session timeout is used, add default Mar 19, 2021
@Mr0grog Mr0grog merged commit 763bb0f into edgi-govdata-archiving:main Mar 19, 2021
Mr0grog added a commit that referenced this pull request Mar 19, 2021
@Mr0grog Mr0grog mentioned this pull request Mar 19, 2021
Mr0grog added a commit that referenced this pull request Mar 19, 2021
Mr0grog added a commit that referenced this pull request Mar 19, 2021
This fixes a major bug where a WaybackSession's `timeout` would never actually be applied (part of #66).

Co-authored-by: LionSzl <46633199+LionSzl@users.noreply.github.com>
Mr0grog added a commit that referenced this pull request Mar 19, 2021
In #68, we forgot to update the documentation to specify the new default value.
@lion-sz
Copy link
Contributor Author

lion-sz commented Mar 19, 2021

@LionSzl I should have asked this before: would you like to be added to the contributors list in the README?

Sure, if one bugfix qualifies me as contributor... My full name is Lion Szlagowski.
Thanks!

@Mr0grog
Copy link
Member

Mr0grog commented Mar 19, 2021

Of course it does! And you are now in the README. 😄

@Mr0grog
Copy link
Member

Mr0grog commented Mar 19, 2021

I’ve also backported the bugfix part of this and published it as v0.2.6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants