Implementing caching of chunked/streamed responses #106
Fixes #95 (comment)
content_1 = resp_1.content

resp_2 = self.sess.get(url + 'stream')
content_2 = resp_1.content
There are two spaces before the = here.
@rmcgibbo In trying this out, I added a test:
This fails when (I think) it should pass. I'm thinking a chunked response is essentially the same as if you did something like
Let me know your thoughts and whether you have ideas for fixing it. At the moment, I think it would be most helpful if urllib3 provided a "consumed" method on the HTTPResponse. That would help with this type of problem, where CacheControl has to guess whether the file handle has been fully read.
Is that a fresh session or in the TestStream?
@rmcgibbo That is the same session in the TestStream.
Then it depends on the order in which the two tests get executed, no? If your test is executed afterwards, then the response should be cached.
Or, maybe not. I just added your test in a way that I thought should pass, but no. I'll take a look.
I'm taking a look now. A consumed method would definitely be cleaner.
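To make the "guessing" concrete, here is a minimal sketch of the kind of heuristic a caching layer might fall back on when there is no "consumed" method. The function name `looks_consumed` is hypothetical, and the checks assume that an `http.client`-style response sets its inner `fp` attribute to `None` once the body has been read, and that other file-like objects expose `closed`. It is not urllib3 or CacheControl API.

```python
import io

_MISSING = object()  # sentinel: distinguishes "no fp attribute" from "fp is None"


def looks_consumed(fp):
    """Guess whether a file-like response body has been fully read.

    Hypothetical helper for illustration only.
    """
    # Assumption: http.client-style responses drop their inner socket
    # file (fp.fp becomes None) once the body is exhausted.
    inner = getattr(fp, "fp", _MISSING)
    if inner is not _MISSING:
        return inner is None
    # Otherwise fall back to the generic file-object flag.
    return bool(getattr(fp, "closed", False))


body = io.BytesIO(b"chunk-1chunk-2")
body.read()
print(looks_consumed(body))  # the handle is still open, so the guess is "not consumed"
body.close()
print(looks_consumed(body))
```

Note that a fully read but still open `io.BytesIO` is reported as not consumed, which is exactly the ambiguity an explicit "consumed" method would remove.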
@ionrock: this test is failing, even without anything to do with streamed or chunked responses:
https://travis-ci.org/ionrock/cachecontrol/jobs/93289665#L153-L154
Any movement on this PR? Would be nice to use this lib for services with chunked responses.
@scollinson I don't really remember. I haven't looked at this in a long time. But I think this PR is good to go. IIRC, the apparent bug that @ionrock noted above is not a bug in this PR, but actually a bug somewhere else that exists even in the current version of cachecontrol without this PR (see the Travis output in #107).
headers.pop('transfer-encoding')

cached['response']['headers'] = headers
Is there a good reason for this? It seems like it doesn't hurt to know the original response was chunked.
I don't really remember this PR. I think perhaps when the initial request is transferred with chunks, one of the first steps is to strip off the chunk markers, so that the data that actually enters the cache lacks the chunk markers. When it's read back out of the cache, then, it doesn't look like a chunked request.
Ah ok, that is what I suspected. Thanks!
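The reasoning in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the code that was merged: `headers_for_cache` is a hypothetical helper, and it assumes the cached body has already had its chunk framing removed, so a `Transfer-Encoding: chunked` header would no longer describe the stored bytes.

```python
def headers_for_cache(headers):
    """Return a copy of `headers` without a chunked Transfer-Encoding entry.

    Hypothetical helper: the stored body is assumed to be de-chunked already,
    so the header is dropped before the response is written to the cache.
    """
    cleaned = dict(headers)  # copy, so the caller's mapping is not mutated
    for name in list(cleaned):
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "transfer-encoding" and "chunked" in cleaned[name].lower():
            del cleaned[name]
    return cleaned


original = {"Transfer-Encoding": "chunked", "ETag": '"abc"'}
print(headers_for_cache(original))
```

Unlike a bare `headers.pop('transfer-encoding')`, this version does not raise when the header is absent and leaves the original mapping untouched.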
I've finally gotten around to getting this merged with some changes. Thanks for your patience!
Made some changes and merged this locally. Unfortunately, it seems I blew it and GitHub didn't pick up this PR as merged :( Closing it now.
Resolves #105, #81. This takes a different approach from #82 in that it doesn't get rid of CallbackFileWrapper, so there is still the same delay actually caching the response until the calling application itself reads the data. On the other hand, this requires dynamically monkeypatching urllib3, so ¯\_(ツ)_/¯.
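The delay described here comes from the general pattern of wrapping the response's file handle and only caching once the caller has read everything. The sketch below shows that pattern in isolation. `CachingReader` and `on_complete` are hypothetical names, and this is not CacheControl's actual CallbackFileWrapper, only a minimal stand-in for the idea.

```python
import io


class CachingReader:
    """Wrap a file-like object and hand the full body to a callback at EOF.

    Hypothetical illustration of the callback-wrapper pattern.
    """

    def __init__(self, fp, on_complete):
        self._fp = fp
        self._buf = io.BytesIO()      # accumulates everything the caller reads
        self._on_complete = on_complete
        self._done = False

    def read(self, amt=-1):
        data = self._fp.read(amt)
        self._buf.write(data)
        # EOF is known when an unbounded read returns, or when a bounded
        # read of a positive size comes back empty.
        exhausted = amt is None or amt < 0 or (amt > 0 and not data)
        if exhausted and not self._done:
            self._done = True
            self._on_complete(self._buf.getvalue())
        return data


cache = []
reader = CachingReader(io.BytesIO(b"hello world"), cache.append)
reader.read(5)       # partial read: nothing is cached yet
reader.read(100)     # rest of the body, but EOF has not been observed
reader.read(100)     # empty read signals EOF and triggers the callback
print(cache)
```

Nothing reaches the cache until the application drains the stream, which is the trade-off the description mentions.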