Streaming responses for file backend fails if not already in the cache #68

davidmegginson · 2016-07-28T02:49:58Z

If a remote resource is not already in the cache for the file backend, the following will return no content:

response = requests.get('http://ourairports.com/countries/CA/PE/airports.csv', stream=True)

Once it's already cached (and not expired), it returns the proper content. Here's the full test script (on first run, it will print empty content; on second run, within 60 seconds, it will dump the full remote content).

import requests, requests_cache

# Use a file backend with a 60-second expiration
requests_cache.install_cache(
    '/tmp/test-requests',
    expire_after=60
)

# Get a streaming response, and try reading the raw stream
response = requests.get('http://ourairports.com/countries/CA/PE/airports.csv', stream=True)

# Will be empty if not already in the cache
print(response.raw.read(-1))

(Python 3.5.1, requests-cache 0.4.12)


davidmegginson · 2016-07-28T14:50:47Z

Confirmed in Python 2.7.12 as well.

reclosedev · 2016-07-30T13:49:45Z

Thanks! Fixed.

davidmegginson · 2016-08-02T18:23:05Z

Thank you. Confirmed working in Python 2.7.12 and Python 3.5.2.

Refactor cached response info & initialization into a separate class

Closes #99, #148, #186, #187, #188, and does some of the work for #169 and #184.

## Summary

The goal of this PR is to reduce code complexity, and hopefully make things a little easier for others who want to contribute. Some of the logic around getting and saving responses had gotten a bit complicated as more features have been added, mainly in:

* `CachedSession.request()`
* `CachedSession.send()`
* `BaseCache.set_response()` (and associated helper methods)
* `BaseCache.get_response()` (and associated helper methods)

I've consolidated most of the logic around expiration, serialization compatibility, and other response object handling into a separate class, `CachedResponse`. This simplifies the above methods, removes most of the code duplication, and fixes a few other issues. According to [radon](https://radon.readthedocs.io/), all classes/modules/methods except two have an 'A' maintainability rating (CC <= 5).

## Changes

**Short version:** Most of the important bits are here: [requests_cache/response.py](https://github.com/reclosedev/requests-cache/blob/b0f2c132fc320c9b1e32e839add63240b2805bbb/requests_cache/response.py)

### Serialization/Deserialization

* Response creation time and expiration time are stored in `CachedResponse`, so the timestamp from the `(response, timestamp)` tuple is no longer needed
* `CachedResponse.is_expired` can be used to indicate if an old response was returned as the result of an error with `old_data_on_error=True` (#99)
* Replace `_RawStore` with `CachedHTTPResponse` class to wrap raw responses, and:
    * Maintain support for streaming requests (#68)
    * Improve handling for generator usage
    * Add support for use with `pandas.read_csv()` and similar readers (#148)
    * Add support for use as a context manager (#148)
    * Add support for `decode_content` param
    * Fix streaming requests when used with memory backend (#188)
* Verified that `PreparedRequest.body` is always encoded in utf-8, so no need to detect encoding
    * Re: [todo note here](https://github.com/reclosedev/requests-cache/blob/08886efe1841aae9c6a5e133008b69296d23b8b9/requests_cache/backends/base.py#L238); see [requests.PreparedRequest.prepare_body()](https://github.com/psf/requests/blob/54336568789e0ec41f70c800a96384e4fac58dd9/requests/models.py#L455)

### Expiration

* Add optional `expire_after` param to `CachedSession.remove_old_responses()`
* Wrap temporary `_request_expire_after` in a contextmanager
* Remove `expires_before` param from `remove_old_entries`, and always use the current time
* Remove `relative_to` param from `CachedSession._determine_expiration_datetime` for unit testing and patch `datetime.now` in unit tests instead
* Rename `response.expire_after` and `response.cache_date` to `expires` and `created_at`, respectively, based on browser caching

### Docs

* Add type annotations to public methods in `CachedSession`, `CachedResponse`, and `BaseCache`
* Add some more docstrings and code comments
* Add intersphinx links for `urllib` classes & methods
* Update user guide for using requests-cache with `requests_mock.Adapter`

### Tests

* Add and update tests for all new and changed code, and add coverage for additional edge cases
* Add fixture for combining requests-cache with requests-mock, using a temporary SQLite db in `/tmp`
* Refactor all tests in `test_cache` as pytest-style functions instead of `unittest.TestCase` methods
* Update most tests to use requests-mock instead of using httpbin.org (see #169)

### Misc

* Split some of the largest functions into multiple smaller functions
* Make use of dict ordering from Python 3.6+ in `_normalize_parameters()`
* Fix linting issues raised by flake8

## Backwards-compatibility

These changes don't break compatibility with code using requests-cache <= 0.5.2, but they aren't compatible with previously cached data due to the different serialization format. Retrieving a previously cached response will fail quietly and simply fetch and cache a new response. Since the docs already warn about this potentially happening with new releases, I don't think this is a problem.
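
The PR description says creation and expiration times now live on the response object itself, exposed as `created_at`, `expires`, and `is_expired`. A minimal sketch of that idea follows. It is not the actual `requests_cache.CachedResponse`: the class name `CachedResponseSketch`, the `expire_after` field, and the timezone handling are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


def _utcnow() -> datetime:
    # Timezone-aware "now"; an illustrative choice, not taken from the PR
    return datetime.now(timezone.utc)


@dataclass
class CachedResponseSketch:
    """Hypothetical stand-in that carries its own cache timestamps."""

    content: bytes
    expire_after: Optional[timedelta] = None  # None means "never expires"
    created_at: datetime = field(default_factory=_utcnow)

    @property
    def expires(self) -> Optional[datetime]:
        # Absolute expiration time, derived from creation time
        if self.expire_after is None:
            return None
        return self.created_at + self.expire_after

    @property
    def is_expired(self) -> bool:
        # A response with no expiration time is never considered expired
        return self.expires is not None and _utcnow() > self.expires


fresh = CachedResponseSketch(b"data", expire_after=timedelta(seconds=60))
stale = CachedResponseSketch(
    b"data",
    expire_after=timedelta(seconds=60),
    created_at=_utcnow() - timedelta(seconds=120),
)
never = CachedResponseSketch(b"data")
```

Keeping the timestamps on the object means a caller that receives a stale response (for example, one returned because of `old_data_on_error=True`) can check `is_expired` directly instead of unpacking a separate `(response, timestamp)` tuple.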

davidmegginson changed the title ~~Streaming responses for file backend if not already in the cache~~ Streaming responses for file backend fails if not already in the cache Jul 28, 2016

reclosedev added a commit that referenced this issue Jul 30, 2016

Emulate raw stream is not consumed after content prefetch #68

900b048
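
The commit title suggests the empty result came from the raw stream being consumed when the content was prefetched for caching. The sketch below illustrates that failure mode and one way to emulate an unconsumed stream, using only the standard library. `FakeRawStream` and `prefetch_and_restore` are hypothetical names invented for this example; this is not the code from the commit.

```python
import io


class FakeRawStream:
    """Hypothetical one-shot stream standing in for a network response body."""

    def __init__(self, payload: bytes):
        self._buffer = io.BytesIO(payload)

    def read(self, amt: int = -1) -> bytes:
        # Once the data has been read, later calls return b""
        return self._buffer.read(amt)


def prefetch_and_restore(raw):
    """Read the full body (as a cache would), then return a replayable copy."""
    body = raw.read()              # this consumes the original stream
    return body, io.BytesIO(body)  # in-memory stream that can be read again


raw = FakeRawStream(b"id,name\n1,Charlottetown\n")
body, replay = prefetch_and_restore(raw)

leftover = raw.read()     # the original stream is exhausted after prefetch
replayed = replay.read()  # the replacement stream still yields the full body
```

If the caller is handed the exhausted original after prefetching, `response.raw.read(-1)` sees nothing, which matches the first-run behaviour reported above. Handing back the in-memory copy avoids that.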

reclosedev closed this as completed Jul 30, 2016

This was referenced Mar 18, 2021

Fix streaming requests used with memory backend #188

Closed

Refactor cached response info & initialization into a separate class #189

Merged

JWCook added bug enhancement labels Sep 7, 2021
