
Implementing caching of chunked/streamed responses #106

Closed · wants to merge 4 commits from rmcgibbo:stream into ionrock:master

Conversation

@rmcgibbo (Contributor)

Resolves #105, #81. This takes a different approach from #82 in that it doesn't get rid of CallbackFileWrapper, so there is still the same delay actually caching the response until the calling application itself reads the data.

On the other hand, this requires dynamically monkeypatching urllib3, so ¯\_(ツ)_/¯.
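The delayed-caching behavior of CallbackFileWrapper mentioned above can be sketched roughly like this (a minimal illustration of the idea only, not cachecontrol's actual implementation):

```python
import io

class CallbackFileWrapper:
    """Buffer everything read through a file-like body and fire a
    callback with the complete buffer once the stream is exhausted.
    Minimal sketch of the idea only -- not cachecontrol's actual
    CallbackFileWrapper."""

    def __init__(self, fp, callback):
        self._fp = fp
        self._callback = callback
        self._buffer = io.BytesIO()

    def read(self, amt=None):
        data = self._fp.read(amt)
        if data:
            self._buffer.write(data)
        elif self._callback is not None:
            # EOF: the caller has consumed the body, so it is now
            # safe to hand the buffered bytes to the cache.
            self._callback(self._buffer.getvalue())
            self._callback = None  # fire at most once
        return data

# Demo: nothing reaches the "cache" until the consumer drains the stream.
cached = []
wrapper = CallbackFileWrapper(io.BytesIO(b"hello"), cached.append)
first = wrapper.read(3)   # partial read -- callback not fired yet
rest = wrapper.read()     # drains the remaining bytes
wrapper.read()            # EOF read triggers the callback
```

This is why caching is deferred until the calling application itself reads the data: the wrapper only sees bytes as they pass through.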

@landscape-bot

Code Health
Repository health decreased by 0.49% when pulling 6d42027 on rmcgibbo:stream into bc89ecc on ionrock:master.

@landscape-bot

Code Health
Repository health decreased by 0.63% when pulling d466ead on rmcgibbo:stream into bc89ecc on ionrock:master.

@rmcgibbo rmcgibbo force-pushed the stream branch 2 times, most recently from 97be551 to 0750e56 Compare November 24, 2015 07:04
@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling 0750e56 on rmcgibbo:stream into bc89ecc on ionrock:master.

@landscape-bot

Code Health
Repository health decreased by 0.64% when pulling 81b21fd on rmcgibbo:stream into bc89ecc on ionrock:master.

@rmcgibbo (Contributor, Author)

Fixes #95 (comment)

@rmcgibbo rmcgibbo changed the title Cache streamed responses Implementing caching of chunked/streamed responses Nov 24, 2015
    content_1 = resp_1.content

    resp_2 = self.sess.get(url + 'stream')
    content_2  = resp_1.content
Review comment (Contributor):

There are two spaces before the = here.

@landscape-bot

Code Health
Repository health decreased by 0.64% when pulling e60d0e4 on rmcgibbo:stream into bc89ecc on ionrock:master.

@ionrock (Contributor)

ionrock commented Nov 26, 2015

@rmcgibbo In trying this out, I added a test:

def test_stream_is_not_cached_when_content_is_not_read(self, url):
    self.sess.get(url + 'stream')
    resp = self.sess.get(url + 'stream')
    assert not resp.from_cache

This fails when (I think) it should pass. I'm thinking a chunked response is essentially the same as if you did something like sess.get(url, stream=True), which means you need to consume the file handle in order to cache it.

Let me know your thoughts and if you have ideas for fixing it. At the moment, I think it would be most helpful if urllib3 provided a "consumed" method on the HTTPResponse that helped solve this type of problem where CacheControl has to guess that the file handle has been read.
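The "consumed" flag wished for above could look roughly like this. This is a hypothetical sketch -- urllib3's HTTPResponse exposes no such attribute, and `TrackedResponse` and `maybe_cache` are illustrative names, not real APIs:

```python
import io

class TrackedResponse:
    """File-like body that records whether it has been fully drained.
    Hypothetical sketch of the proposed 'consumed' flag -- urllib3's
    HTTPResponse has no such attribute."""

    def __init__(self, raw):
        self._fp = io.BytesIO(raw)
        self.consumed = False

    def read(self, amt=None):
        data = self._fp.read(amt)
        if not data:
            # EOF: the caller has drained the handle.
            self.consumed = True
        return data

def maybe_cache(cache, key, resp, body):
    # Only store a body the caller has actually drained; otherwise a
    # stream=True request could cache a partial (or empty) body.
    if resp.consumed:
        cache[key] = body

cache = {}
resp = TrackedResponse(b"chunk data")
body = resp.read()   # caller reads everything...
resp.read()          # ...and hits EOF, marking it consumed
maybe_cache(cache, "http://example.com/stream", resp, body)
```

With such a flag, CacheControl would not have to guess whether the file handle had been read before deciding to cache.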

@rmcgibbo (Contributor, Author)

Is that a fresh session or in the TestStream?

@ionrock (Contributor)

ionrock commented Nov 26, 2015

@rmcgibbo That is the same session in the TestStream.

@rmcgibbo (Contributor, Author)

Then it depends on the order in which the two tests get executed, no? If your test is executed afterwards, then it should be cached.

@rmcgibbo (Contributor, Author)

Or, maybe not. I just added your test in a way that I thought should pass, but no. I'll take a look.

@landscape-bot

Code Health
Repository health decreased by 0.62% when pulling dee246b on rmcgibbo:stream into bc89ecc on ionrock:master.

@rmcgibbo (Contributor, Author)

> Let me know your thoughts and if you have ideas for fixing it. At the moment, I think it would be most helpful if urllib3 provided a "consumed" method on the HTTPResponse that helped solve this type of problem where CacheControl has to guess that the file handle has been read.

I'm taking a look now. A consumed method would definitely be cleaner.

@rmcgibbo (Contributor, Author)

@ionrock: this test is failing, even without anything to do with streamed or chunked responses:

class TestNonchunked(object):
    def test_not_cached_when_content_is_not_read(self, url):
        sess = CacheControl(requests.Session())
        sess.get(url)
        resp = sess.get(url)

        assert not resp.from_cache

https://travis-ci.org/ionrock/cachecontrol/jobs/93289665#L153-L154

@landscape-bot

Code Health
Repository health decreased by 0.57% when pulling 1dd1e51 on rmcgibbo:stream into 8d2e6e8 on ionrock:master.

@landscape-bot

Code Health
Repository health decreased by 0.60% when pulling 50778f3 on rmcgibbo:stream into 8d2e6e8 on ionrock:master.

@scollinson

Any movement on this PR? Would be nice to use this lib for services with chunked responses.

@rmcgibbo (Contributor, Author)

@scollinson I don't really remember. I haven't looked at this in a long time. But I think this PR is good to go. IIRC, the apparent bug that @ionrock noted above is not a bug in this PR, but actually a bug somewhere else that exists even in the current version of cachecontrol without this PR (see the travis output in #107).

    headers.pop('transfer-encoding')

    cached['response']['headers'] = headers

Review comment (Contributor):

Is there a good reason for this? It seems like it doesn't hurt to know the original response was chunked.

Reply (Contributor, Author):

I don't really remember this PR, but I think that when the initial response is transferred with chunks, one of the first steps is to strip off the chunk markers, so the data that actually enters the cache lacks them. When it's read back out of the cache, it doesn't look like a chunked response.
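The explanation above can be illustrated with a rough sketch of what dechunking does (illustrative only, not urllib3's code, which performs this decoding itself before the bytes reach the cache). Because the stored body has no chunk framing, replaying a Transfer-Encoding: chunked header alongside it would misdescribe the data:

```python
def dechunk(raw: bytes) -> bytes:
    """Decode a chunked transfer-encoded body into the plain payload.
    Sketch of why the cached entry drops the chunk markers."""
    body = b""
    pos = 0
    while True:
        # Each chunk starts with a hex size line terminated by CRLF.
        idx = raw.index(b"\r\n", pos)
        size = int(raw[pos:idx], 16)
        if size == 0:
            # A zero-size chunk terminates the body.
            break
        start = idx + 2
        body += raw[start:start + size]
        pos = start + size + 2  # skip the chunk's trailing CRLF
    return body

chunked = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
plain = dechunk(chunked)
```

The cached bytes are the plain payload on the last line, with no sizes or CRLF framing left, which is why the stored headers drop transfer-encoding.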

Reply (Contributor):

Ah ok, that is what I suspected. Thanks!

I've finally gotten around to getting this merged with some changes. Thanks for your patience!

@ionrock (Contributor)

ionrock commented Mar 23, 2016

Made some changes and merged this locally. Unfortunately, it seems I blew it and it didn't pick up this PR as merged :(

Closing it now.
