Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Invalid cache entries on aborted requests #1081

Closed
jeffkaufman opened this issue May 14, 2015 · 13 comments
Closed

Invalid cache entries on aborted requests #1081

jeffkaufman opened this issue May 14, 2015 · 13 comments

Comments

@jeffkaufman
Copy link
Contributor

In debugging #1048 we found a case where we weren't handling IPRO recording properly. To reproduce:

  • Set up a heavily rate limited origin server
  • Set up a local apache proxying that origin
  • Set up PageSpeed on that apache with IPRO
  • Request a file for the first time, and cancel the download partway through
  • Observe that the partially downloaded file is stored in the cache
@jeffkaufman
Copy link
Contributor Author

We need to check r->connection->aborted before saving contents to the cache. I'll do this.

This may also affect nginx; I'll check.

@jeffkaufman
Copy link
Contributor Author

I checked: this doesn't affect nginx. In nginx we check last_buf which isn't set if the user aborts the load. I checked both the code and manual loading.

@jeffkaufman
Copy link
Contributor Author

The fix is very simple:

--- a/pagespeed/apache/mod_instaweb.cc
+++ b/pagespeed/apache/mod_instaweb.cc
@@ -823,6 +823,13 @@ apr_status_t instaweb_in_place_check_headers_filter(ap_filter_t* filter,
       response_headers.SetDate(timer.NowMs());
       response_headers.ComputeCaching();

+      if (request->connection->aborted) {
+        // Browser cancelled download, request may be incomplete, don't record
+        // to the cache.
+        // See https://github.com/pagespeed/mod_pagespeed/issues/1081
+        recorder->Fail();
+      }
+
       // We now have the final headers.  If they don't let us cache then we'll
       // abort even though we've already buffered up the whole resource.
       InstawebHandler::AboutToBeDoneWithRecorder(request, recorder);

@jeffkaufman
Copy link
Contributor Author

Testing this is really annoying. I have a manual test set up, which works fine, but to test this properly we need a rate-limiting webserver and to do this on apache 2.2 we need to install other modules. I tried writing a shell script:

function send_slow_test_response() {
   local first_half="document.write('";
   local second_half="hello world');";
   local length=$(echo "$first_half $second_half" | wc -c);

   echo -e "HTTP/1.1 200 OK\\r";
   echo -e "Content-Type: text/javascript\\r";
   echo -e "Content-Length: $length\\r";
   echo -e "Cache-Control: public,max-age=700\\r";
   echo -e "Connection: close\\r";
   echo -e "Date: $(date)\\r";
   echo -e "\\r";
   echo "$first_half";
   sleep 1;
   echo "$second_half";
}
send_slow_test_response  | stdbuf -i0 -o0 -e0 nc -l -p 8765 &
curl -D- localhost:8080/simple-proxy/ ... Ctrl+C

but couldn't get it to work reliably. Not sure why.

@oschaaf
Copy link
Member

oschaaf commented May 14, 2015

@jeffkaufman

re: simple fix:
Would it be possible to somehow DCHECK when a port gets the specifics of this wrong?

re: testing rate limiting
with apache/incubator-pagespeed-ngx#936 PSOL gets the capability to forward flushes from php, which might be useful here? (I needed that for testing purposes as well).

@oschaaf
Copy link
Member

oschaaf commented May 14, 2015

Sorry - I meant #1066 instead of the ngx_pagespeed part of that

@jeffkaufman
Copy link
Contributor Author

@oschaaf yes, #1066 would make this easier to test. I tried doing it with PHP and couldn't pass flushes through; that seems to be why. @morlovich are you going to get a chance to look at #1066? You'd said you were interested in looking at it, but if not I can.

Would it be possible to somehow DCHECK when a port gets the specifics of this wrong?

I don't think we can DCHECK, because we don't know how to tell that the download was aborted except in a server-specific way (apache: r->connection->aborted, nginx: don't receive last_buf).

@oschaaf
Copy link
Member

oschaaf commented May 14, 2015

@jeffkaufman: "I don't think we can DCHECK, because we don't know how to tell that the download was aborted except in a server-specific way (apache: r->connection->aborted, nginx: don't receive last_buf)."

perhaps somehow assuming failure in the recorder until success is explicitly indicated via an api call would help here? Though that might require an api change which might be too much.

@jeffkaufman
Copy link
Contributor Author

perhaps somehow assuming failure in the recorder until success is explicitly indicated via an api call
would help here?

Maybe adding:

InPlaceResourceRecorder::RequestAborted(bool aborted)

?

@jeffkaufman
Copy link
Contributor Author

So anyone working with InPlaceResourceRecorder has to think about how to know if the request was aborted?

@oschaaf
Copy link
Member

oschaaf commented May 14, 2015

@jeffkaufman yes, that would make it explicit, and would be easy maintenance-wise, which sounds great to me.

@jeffkaufman
Copy link
Contributor Author

Actually, what if I just do this as another (required) argument to DoneAndSetHeaders:

   // Because of Apache's quirky filter order, we cannot get both the
   // uncompressed final contents and the complete headers at the same time.
   //
+  // Set entire_response_received to true if you know that the response data fed
+  // into Write() is complete.  For example, if the browser cancelled the
+  // download and so this is a partial response, set entire_response_received to
+  // false so we know not to cache it.
+  //
   // Does not take ownership of response_headers.
   //
   // Deletes itself. Do not use object after calling DoneAndSetHeaders().
-  void DoneAndSetHeaders(ResponseHeaders* response_headers);
+  void DoneAndSetHeaders(ResponseHeaders* response_headers,
+                         bool entire_response_received);

@oschaaf
Copy link
Member

oschaaf commented May 14, 2015

@jeffkaufman I think that would be even better. That would follow the Done(success) interface of the other classes more closely, which seems more natural.

jeffkaufman added a commit that referenced this issue May 15, 2015
…note this and not record the truncated resource to cache.

Fixes #1081

Nginx side of the change: apache/incubator-pagespeed-ngx#966
jeffkaufman added a commit that referenced this issue May 27, 2015
…note this and not record the truncated resource to cache.

Fixes #1081

Nginx side of the change: apache/incubator-pagespeed-ngx#966
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants