Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

pagespeed resources served with chunked encoding #712

Closed
jeffkaufman opened this issue May 28, 2014 · 21 comments
Closed

pagespeed resources served with chunked encoding #712

jeffkaufman opened this issue May 28, 2014 · 21 comments

Comments

@jeffkaufman
Copy link
Contributor

If I install 1.8.31.2 and fetch http://localhost:8050/mod_pagespeed_example/styles/A.rewrite_css_images.css.pagespeed.cf.HASH.css I will get a Content-Length header and the expected Cache-Control: private. If instead I fetch it with the correct hash [1] I get Transfer-Encoding: chunked and the expected Cache-Control: public .... We have a test that's supposed to be making sure we always send Content-Length, but it's not catching this because it uses a fake hash.

[1] get this with curl -s 'http://localhost:8050/mod_pagespeed_example/rewrite_css_images.html?PageSpeedFilters=rewrite_css' | grep rewrite_css_images | awk -F'"' '{print $(NF-1)}'

@jeffkaufman
Copy link
Contributor Author

working on this now

@oschaaf
Copy link
Member

oschaaf commented May 29, 2014

@jeffkaufman I just saw you are looking at this as well - I'll pick something else.
I'm was able to reproduce, and following a suspicion commenting the line that calls ngx_http_clean_header() seems to a least leave a CL header in there. I'm not sure that that is the correct fix, that would need further analysis.
By the way, I'm having trouble passing the system tests, which PSOL revision should I take to test 1.8.31.2?

@jeffkaufman
Copy link
Contributor Author

This is also a problem in trunk, so I'm debugging it there. Currently writing a test to reproduce, though I see it with manual testing already.

ngx_pagespeed trunk-tracking is currently one commit ahead of PSOL trunk because @xqyin accidentally committed #709 before pushing to PSOL svn, so to build trunk-tracking right now you need to roll back off 96f185d. This should be fixed soon.

But if you do want to look at 1.8.31.2 the psol version to use is the one tagged for the release: /tags/1.8.31.2. (For pagespeed_automatic.a you can just use the released debug binaries, no need to build it.)

@jeffkaufman
Copy link
Contributor Author

Here's a test to reproduce it:

start_test PageSpeed resources should have a content length.
CONTAINING_URL="$EXAMPLE_ROOT/rewrite_css_images.html"
CONTAINING_URL+="?PageSpeedFilters=rewrite_css"
fetch_until -save "$CONTAINING_URL" \
  "fgrep -c rewrite_css_images.css.pagespeed.cf" 1
# Pull the rewritten resource name out so we get the hash.                       
CSS_FILE="$(grep rewrite_css_images.css.pagespeed $FETCH_UNTIL_OUTFILE |\        
            awk -F\" '{print $(NF-1)}')"
echo $CSS_FILE
OUT=$($WGET_DUMP "$EXAMPLE_ROOT/$CSS_FILE")
check_from "$OUT" grep "^Content-Length:" # this line fails
check_not_from "$OUT" grep "^Transfer-Encoding: chunked"
check_not_from "$OUT" grep "^Connection: close"

@jeffkaufman
Copy link
Contributor Author

When serving http://localhost:8050/mod_pag_example/styles/A.rewrite_css_images.css.pagespeed.cf.JWIi61uu6v.css, NgxBaseFetch::CollectHeaders sees content_length_known() as false.

@xqyin
Copy link
Contributor

xqyin commented May 29, 2014

Sorry about that, but I pushed to PSOL svn yesterday, I think.
The trunk-tracking cannot be built without rolling back now?

On Thursday, May 29, 2014, Jeff Kaufman notifications@github.com wrote:

This is also a problem in trunk, so I'm debugging it there. Currently
writing a test to reproduce, though I see it with manual testing already.

ngx_pagespeed trunk-tracking is currently one commit ahead of PSOL trunk
because @xqyin https://github.com/xqyin accidentally committed #709https://github.com/pagespeed/ngx_pagespeed/pull/709before pushing to PSOL svn, so to build trunk-tracking right now you need
to roll back off 96f185dhttps://github.com/pagespeed/ngx_pagespeed/commit/96f185d472bbcad9039683216656245cafbdc388.
This should be fixed soon.

But if you do want to look at 1.8.31.2 the psol version to use is the one
tagged for the release: /tags/1.8.31.2. (For pagespeed_automatic.a you
can just use the released debug binaries, no need to build it.)


Reply to this email directly or view it on GitHubhttps://github.com//issues/712#issuecomment-44528394
.

@jeffkaufman
Copy link
Contributor Author

In this case we currently call on the base fetch HandleHeadersComplete() and then HandleWrite(). When the base fetch gets HandleHeadersComplete() it asks nginx to send out the headers.

Either PSOL needs to be setting the content length, which it's not doing so, or ngx_pagespeed needs to wait after it gets HandleHeadersComplete until HandleDone before it can send out the headers. Waiting until HandleDone definitely doesn't work in the general case, so I'll look at getting PSOL to set a content length.

@jeffkaufman
Copy link
Contributor Author

@xqyin you're right, you did. Sorry! Everything should be fine now, and trunk-tracking should build properly.

@jeffkaufman
Copy link
Contributor Author

(@oschaaf ignore my comment about rolling back 96f185d; @xqyin already fixed things up.)

@jeffkaufman
Copy link
Contributor Author

For some pagespeed resources the content length is set properly: http://localhost:8050/mod_pagespeed_example/images/Puzzle.jpg.pagespeed.ce.91_WewrLtP.jpg

In these cases NgxBaseFetch::CollectHeaders sees content_length_known() because SharedAsyncFetch::HandleHeadersComplete() ran base_fetch->set_content_length(). So the question is: why does a SharedAsyncFetch only sometimes know it's content length?

@oschaaf
Copy link
Member

oschaaf commented May 29, 2014

@jeffkaufman Your analysis makes sense to me, setting the CL early when possible sounds good.
However, I'm still wondering why a content-length header actually is emitted when an incorrect hash is used.

@jeffkaufman
Copy link
Contributor Author

I'm still wondering why a content-length header actually is emitted when an incorrect hash is used.

I'm not sure I totally understand, but the incorrect hash triggers fallback rewriting, and in fallback rewriting we weren't properly clearing the content type before r3996. 1.8.31.2 doesn't have that change, though we're including it in 1.8.31.3.

@jeffkaufman
Copy link
Contributor Author

On the cache extension path we don't store the rewritten resource (only the input resource) while on all the other paths we do. Perhaps the content length isn't populated when we load from a rewritten resource?

@jeffkaufman
Copy link
Contributor Author

This code (ProxyFetch etc) is very different from the apache code, and this bug doesn't seem to apply in apache. But it's almost identical to what PSS runs, and I do see Transfer-Encoding: chunked there: http://1-ps.googleusercontent.com/h/www.pssdemos.com/filter/scripts/example.js+example2.js.pagespeed.jc.49tRvQkrsh.js

@jeffkaufman
Copy link
Contributor Author

RewriteContext::FetchDone has:

      if (output_resource_->hash() == requested_hash_) {
        response_headers->CopyFrom(*(
            output_resource_->response_headers()));
        // Use the most conservative Cache-Control considering all inputs.
        ApplyInputCacheControl(response_headers);
        AddMetadataHeaderIfNecessary(response_headers);
        StringPiece contents = output_resource_->contents();
        async_fetch_->set_content_length(contents.size());
        async_fetch_->HeadersComplete();
        ok = async_fetch_->Write(contents, handler_);
      } else {
        // Our rewrite produced a different hash than what was requested;
        // we better not give it an ultra-long TTL.
        FetchFallbackDone(output_resource_->contents(),
                          output_resource_->response_headers());

        return;
      }

Calling set_content_length() and then HeadersComplete() in that order should be passing along the content length properly. Hmm.

@jeffkaufman
Copy link
Contributor Author

RewriteContext::FetchDone seems to be called when we handle most resources for the first time, and cache extended resources every time. When we take this path we set the content-length header properly. So what path do we take on other requests?

@jeffkaufman
Copy link
Contributor Author

And if I'd read a little more I would have seen it. That's not RewriteContext::FetchDone, it's RewriteContext::FetchContext::FetchDone which has the comment:

// This class encodes a few data members used for responding to                  
// resource-requests when the output_resource is not in cache.   

Which is exactly what we're seeing: no content length in cases where the output_resource is not in cache.

@jeffkaufman
Copy link
Contributor Author

Current best guess: RewriteDriver::CacheCallback::Done() doesn't set the content length before calling Write(), which is the load-from-cache path.

@jeffkaufman
Copy link
Contributor Author

rewriter/rewrite_driver.cc CacheCallback::Done():

       if (success) {
         output_resource_->Link(value, handler_);
         output_resource_->SetWritten(true);
+        async_fetch_->set_content_length(content.size());
+        async_fetch_->HeadersComplete();
         success = async_fetch_->Write(content, handler_);
       }

This fixes it. I'll make the PSOL change.

@jeffkaufman
Copy link
Contributor Author

Fixed with PSOL r4021.

@jeffkaufman jeffkaufman changed the title chunked encoding on css resources pagespeed resources served with chunked encoding Aug 4, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants