Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cached response info & initialization into a separate class #189

Merged
merged 3 commits into from Mar 22, 2021
Merged

Conversation

JWCook
Copy link
Member

@JWCook JWCook commented Mar 18, 2021

Closes #99, #148, #186 , #187, #188, and does some of the work for #169 and #184.

Summary

The goal of this PR is to reduce code complexity, and hopefully make things a little easier for others who want to contribute. Some of the logic around getting and saving responses had gotten a bit complicated as more features have been added, mainly in:

  • CachedSession.request()
  • CachedSession.send()
  • BaseCache.set_response() (and associated helper methods)
  • BaseCache.get_response() (and associated helper methods)

I've consolidated most of the logic around expiration, serializiation compatibility, and other response object handling into a separate class, CachedResponse. This simplifies the above methods, removes most of the code duplication, and fixes a few other issues. According to radon, all classes/modules/methods except two have an 'A' maintainability rating (CC <= 5).

Changes

Short version: Most of the important bits are here: requests_cache/response.py

Serialization/Deserialization

Expiration

  • Add optional expire_after param to CachedSession.remove_old_responses()
  • Wrap temporary _request_expire_after in a contextmanager
  • Remove expires_before param from remove_old_entries, and always use the current time
  • Remove relative_to param from CachedSession._determine_expiration_datetime for unit testing and patch datetime.now in unit tests instead
  • Rename response.expire_after and response.cache_date to expires and created_at, respectively, based on browser caching

Docs

  • Add type annotations to public methods in CachedSession, CachedResponse, and BaseCache
  • Add some more docstrings and code comments
  • Add intersphinx links for urllib classes & methods
  • Update user guide for using requests-cache with requests_mock.Adapter

Tests

  • Add an update tests for all new and changed code, and add coverage for additional edge cases
  • Add fixture for combining requests-cache with requests-mock, using a temporary SQLite db in /tmp
  • Refactor all tests in test_cache as pytest-style functions instead of unittest.TestCase methods
  • Update most tests to use requests-mock instead of using httpbin.org (see Use mock requests in unit tests #169)

Misc

  • Split some of the largest functions into multiple smaller functions
  • Make use of dict ordering from python3.6+ in _normalize_parameters()
  • Fix linting issues raised by flake8

Backwards-compatibility

These changes don't break compatibility with code using requests-cache <= 0.5.2, but they aren't compatible with previously cached data due to the different serialization format. Retrieving a previously cached response will fail quietly and simply fetch and cache a new response. Since the docs already warn about this potentially happening with new releases, I don't think this is a problem.

* Pass per-request expiration in request params instead of setting as a temporary instance variable
* Make use of dict ordering from python3.6+ in _normalize_parameters()
* Add some more type annotations to `CachedSession` methods
* Remove `expires_before` param from remove_old_entries, and always use the current time
* Remove `relative_to` param from `CachedSession._determine_expiration_datetime` and use mock values
  in unit tests instead
@JWCook JWCook added docs Documentation and examples enhancement labels Mar 18, 2021
@JWCook JWCook added this to the v0.6 milestone Mar 18, 2021
@JWCook
Copy link
Member Author

JWCook commented Mar 18, 2021

@shoeffner To make this work with the changes you added in PR #177, I needed to change a few things:

  • Cache revalidation: I'd like both per-request and per-session expire_after to mostly behave the same way, so I would prefer to not do cache revalidation every time you pass a different expire_after to request(). In browser cache-land, I think that would be similar to passing the Cache-Control: must-revalidate header; just passing a value for Cache-Control: max-age (or our equivalent expire_after) should only apply to a new response added to the cache.
  • Since this is a useful feature, though, I moved this behavior to CachedSession.remove_old_responses(), so you can optionally pass expire_after to revalidate the cache with the new expiration.
  • expire_after = None, 'default', or -1: After taking a closer look at this, it makes things a lot simpler if we can just say, "A per-request expire_after value that's not None will override the per-session expire_after". So now None instead of 'default' means default behavior (i.e., don't override per-session expiration), and -1 sets no expiration.

I hope you're okay with those changes. Let me know if you have any suggestions, or if one of those changes is inconvenient for the use cases you had in mind.

@shoeffner
Copy link
Contributor

I am completely fine with those changes, the 'default' etc. solution was just if someone would like to have some more custom control over when to cache what and when to reset it. But allowing None for that purpose and using the simpler semantics is definitely a plus.

tests/test_cache.py Outdated Show resolved Hide resolved
tests/test_cache.py Outdated Show resolved Hide resolved
@JWCook JWCook changed the title Refactoring to reduce code complexity Refactor cached response info & initialization into a separate class Mar 20, 2021
@JWCook JWCook marked this pull request as ready for review March 20, 2021 17:27
tests/test_cache.py Outdated Show resolved Hide resolved
tests/test_cache.py Outdated Show resolved Hide resolved
tests/test_cache.py Outdated Show resolved Hide resolved
requests_cache/core.py Outdated Show resolved Hide resolved
requests_cache/core.py Outdated Show resolved Hide resolved
@shoeffner
Copy link
Contributor

Overall this is a very good rewrite of multiple bits and pieces. It is a massive PR so there might be things I didn't see, but except for a few minor issues listed above I think this is really good.

While going through the PR I found that the documentation on install_cache is a little bit short (no mention of the kwargs or how the default backend is chosen), but that is something for another PR.

@JWCook
Copy link
Member Author

JWCook commented Mar 21, 2021

@shoeffner Thanks for reviewing the tests! That's super helpful.

While going through the PR I found that the documentation on install_cache is a little bit short

Yeah, and after going through the backlog of issues, I've noticed several other things that should be explained more in the docs. Differences between using CachedSession vs install_cache() was a common one. I'm working on updating the docs in a separate issue: #190

… logic into CachedResponse class:

* Replace `_RawStore`  with `CachedHTTPResponse` class to wrap raw responses
    * Maintain support for streaming requests (#68)
    * Improve handling for generator usage
    * Add support for use with `pandas.read_csv()` and similar readers (#148)
    * Add support for use as a context manager (#148)
    * Add support for `decode_content` arg
* Fix streaming requests when used with memory backend (#188)
* Verified that `PreparedRequest.body` is always encoded in utf-8, so no need to detect encoding (Re: TODO note)
* Response creation time and expiration time are stored as CachedResponse, so the `(response, timestamp)` tuple is no longer necessary
* Rename `response.expire_after` and `response.cache_date` to `expires` and `created_at`, respectively, based on browser cache directives
* Add optional `expire_after` param to `CachedSession.remove_old_responses()`
* Make `CachedSession` members `allowable_codes, allowable_methods, filter_fn, old_data_on_error`
  public, since they can safely be modified after initialization
* More type annotations and docstring updates
* Move main cache documentation from `CacheMixin` to CachedSession`, since that's probably where a user would look first
* Wrap temporary `_request_expire_after` in a contextmanager
* Add intersphinx links for `urllib` classes & methods
* Fix linting issues raised by flake8
* Start adding some unit tests using requests-mock

tmp
@JWCook
Copy link
Member Author

JWCook commented Mar 22, 2021

I'm going to go ahead and merge now this so I can start working on my next PR. If anyone has more comments or suggestions on this one, I can address them in a follow-up PR.

@JWCook JWCook merged commit b374cd3 into requests-cache:master Mar 22, 2021
@JWCook JWCook linked an issue Mar 22, 2021 that may be closed by this pull request
@JWCook JWCook deleted the refactor branch March 22, 2021 23:49
@JWCook JWCook added the serialization Features or changes related to response serialization label Sep 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation and examples enhancement serialization Features or changes related to response serialization
Projects
None yet
2 participants