-
-
Notifications
You must be signed in to change notification settings - Fork 369
Fix static file cached 304 response #1692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1692 +/- ##
==========================================
- Coverage 80.83% 80.29% -0.54%
==========================================
Files 105 104 -1
Lines 13591 13552 -39
==========================================
- Hits 10986 10882 -104
- Misses 2605 2670 +65 |
Thanks! I'll try to review this ASAP. But if not, it's okay to ping me. |
cherrypy/_cprequest.py
Outdated
# self.body is a generator object that must be yielded | ||
# for _caching.tee_output() to finish it's job and | ||
# remove orphaned threading._Event object from cache | ||
for i in self.body: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be replaced by self.collapse_body()
, but I'm still trying to figure out a fix on a different level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change seems to have elicited a new failure in the Windows AppVeyor tests. I've restarted the tests for this PR twice and they fail consistently. We'll have to devise a fix for that failed test before we can accept this.
@@ -929,6 +939,7 @@ def finalize(self): | |||
# and 304 (not modified) responses MUST NOT | |||
# include a message-body." | |||
dict.pop(headers, 'Content-Length', None) | |||
self._flush_body() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaraco do you fully agree with such approach? I'd prefer not putting the body there over consuming it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still uneasy with this PR. I don't fully understand its implications and as much as I've tried to document the changes, I don't know what to write to explain the behavior.
Furthermore, this sort of change really needs a test to capture the desired behavior. The associated ticket describes a scenario which should trigger the failure, so what needs to be done is codify that scenario in a test. I'm out of time to work on this today, but I welcome others to implement the test.
@jaraco it feels like you got into my head and wrote down every single thought I had regarding this PR :) |
It took me more than a month to understand the time-outs we experienced (and that included working my way into the dungeons of cherrypy). I'm glad it takes you guys a while as well ;) The issue I linked fully describes the bug and condition, and it's hard to explain it better. It does help to exactly follow the code as I explain, but it also helps to fully grok the CachingTool. The problem arises around a race condition that is caused by expiring a previous normally cached static file. As the MemoryCache.expire_cache daemon thread only deletes the variant and leaves the uri in the store dictionary, MemoryCache.get() will now follow variant = uricache.wait() codepath on a next request, and AntiStampedeCache will guard the cache by inserting a threading._Event object in the cache (because it didn't find a result). Now, if this next request happens to be an 'if-modified-since' request, the static.serve_fileobj() path will except (caused by cptools.validate_since()) before the body is filled with the fileObject and set headers to 304 not-modified. Response.finalize() on it's turn used to skip the tee_output generator on this and leave the threading._Event behind, holding off the next requests untill the threading._Event timed out. Causing our observed time-outs. However, only consuming tee_output on this 304 case would fill the cache with an empty body and repsonse headers 304. This is nonsense for someone that would request the same uri/variant without an if-modified-since header, because that client would not have the file in cache, but would be served with an empty body, containing a header (not-modified) it never asked for! I do doubt the sanity of the coarse empty body case cache deletion in tee_output. It certainly helps to solve the observed case and doesn't harm others. The only problem is, it interferes with the caching of potentially expensive empty body results, but I have no clue how to correctly address that and well, we had to move on with our project ;) By the way: This bug also raises my question why first requests to non-existing uri's in caching.store won't receive the AntiStampedeCache guard treatment. Of all circumstances, a freshly restarted server that has empty (memory)cache should be protected against a request stampede? Of course, I see why it happens, but I think it could use some attention as well? |
So, did my second attempt at explaining the race help? |
I didn't have time to check this attentively just yet. |
I suddenly realised why I had so much trouble understanding the CachingTool in the first place and suggest another fix for that: the _caching.get() function isn't a getter, so it's name is confusing. It loads the server.response if cache is available and returns true or false on success/failure to do that. I think it should be renamed load(). |
Still no time to review the PR? |
Hi everyone, I think this is fine as it is now. It is a proper fix, even if it is not the best possible solution. As this does not affect any public API or method signature, I think this should be merged. If you think there is room for improvement, lets open an issue and link it to this PR with a reference to the merge commit and work on it when time allows - but until that is done we can have the fix. What do you think? |
@c00kiemon5ter Thanks for your opinion! The main blocker for me as a maintainer is being able to understand the implementation and what are the consequences for other parts of code. I'm really sorry that I didn't find time to dig into it, but to me it feels irresponsible to blindly merge changes. I hope you understand this. It's still in my long backlog. As a community member, you can try helping @jaraco and me to verify things and comment on this more, so that it will make it easier for us to do reviews and would take much less time. P.S. @mrvanes could you please rebase this PR on top of current master? |
…r. Use 'iter' to make iterator from iterable.
…I still don't understand why deleting the cache is necessary. The previous comment said 'an empty response will be cached', but that seems inconsistent with the 'if body' block.
…I still don't understand why deleting the cache is necessary. The previous comment said 'an empty response will be cached', but that seems inconsistent with the 'if body' block.
// Side note:
it looks like the list should also include |
// Also side note: https://tools.ietf.org/html/rfc7231#section-4.3.7
|
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Bugfix
What is the related issue number (starting with
#
)AntiStampede Cache leaves orphaned threading.Event object on 304 Not Modified response #1690
What is the current behavior? (You can also link to an open issue here)
AntiStampede Cache leaves orphaned threading.Event object on 304 Not Modified response #1690
What is the new behavior (if this is a feature change)?
No timeouts