New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pagespeed resources served with chunked encoding #712

Closed
jeffkaufman opened this Issue May 28, 2014 · 21 comments

Comments

Projects
None yet
3 participants
@jeffkaufman
Copy link
Contributor

jeffkaufman commented May 28, 2014

If I install 1.8.31.2 and fetch http://localhost:8050/mod_pagespeed_example/styles/A.rewrite_css_images.css.pagespeed.cf.HASH.css I will get a Content-Length header and the expected Cache-Control: private. If instead I fetch it with the correct hash [1] I get Transfer-Encoding: chunked and the expected Cache-Control: public .... We have a test that's supposed to be making sure we always send Content-Length, but it's not catching this because it uses a fake hash.

[1] get this with curl -s 'http://localhost:8050/mod_pagespeed_example/rewrite_css_images.html?PageSpeedFilters=rewrite_css' | grep rewrite_css_images | awk -F'"' '{print $(NF-1)}'

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

working on this now

@oschaaf

This comment has been minimized.

Copy link
Member

oschaaf commented May 29, 2014

@jeffkaufman I just saw you are looking at this as well - I'll pick something else.
I'm was able to reproduce, and following a suspicion commenting the line that calls ngx_http_clean_header() seems to a least leave a CL header in there. I'm not sure that that is the correct fix, that would need further analysis.
By the way, I'm having trouble passing the system tests, which PSOL revision should I take to test 1.8.31.2?

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

This is also a problem in trunk, so I'm debugging it there. Currently writing a test to reproduce, though I see it with manual testing already.

ngx_pagespeed trunk-tracking is currently one commit ahead of PSOL trunk because @xqyin accidentally committed #709 before pushing to PSOL svn, so to build trunk-tracking right now you need to roll back off 96f185d. This should be fixed soon.

But if you do want to look at 1.8.31.2 the psol version to use is the one tagged for the release: /tags/1.8.31.2. (For pagespeed_automatic.a you can just use the released debug binaries, no need to build it.)

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

Here's a test to reproduce it:

start_test PageSpeed resources should have a content length.
CONTAINING_URL="$EXAMPLE_ROOT/rewrite_css_images.html"
CONTAINING_URL+="?PageSpeedFilters=rewrite_css"
fetch_until -save "$CONTAINING_URL" \
  "fgrep -c rewrite_css_images.css.pagespeed.cf" 1
# Pull the rewritten resource name out so we get the hash.                       
CSS_FILE="$(grep rewrite_css_images.css.pagespeed $FETCH_UNTIL_OUTFILE |\        
            awk -F\" '{print $(NF-1)}')"
echo $CSS_FILE
OUT=$($WGET_DUMP "$EXAMPLE_ROOT/$CSS_FILE")
check_from "$OUT" grep "^Content-Length:" # this line fails
check_not_from "$OUT" grep "^Transfer-Encoding: chunked"
check_not_from "$OUT" grep "^Connection: close"
@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

When serving http://localhost:8050/mod_pag_example/styles/A.rewrite_css_images.css.pagespeed.cf.JWIi61uu6v.css, NgxBaseFetch::CollectHeaders sees content_length_known() as false.

@xqyin

This comment has been minimized.

Copy link
Contributor

xqyin commented May 29, 2014

Sorry about that, but I pushed to PSOL svn yesterday, I think.
The trunk-tracking cannot be built without rolling back now?

On Thursday, May 29, 2014, Jeff Kaufman notifications@github.com wrote:

This is also a problem in trunk, so I'm debugging it there. Currently
writing a test to reproduce, though I see it with manual testing already.

ngx_pagespeed trunk-tracking is currently one commit ahead of PSOL trunk
because @xqyin https://github.com/xqyin accidentally committed #709https://github.com/pagespeed/ngx_pagespeed/pull/709before pushing to PSOL svn, so to build trunk-tracking right now you need
to roll back off 96f185dhttps://github.com/pagespeed/ngx_pagespeed/commit/96f185d472bbcad9039683216656245cafbdc388.
This should be fixed soon.

But if you do want to look at 1.8.31.2 the psol version to use is the one
tagged for the release: /tags/1.8.31.2. (For pagespeed_automatic.a you
can just use the released debug binaries, no need to build it.)


Reply to this email directly or view it on GitHubhttps://github.com//issues/712#issuecomment-44528394
.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

In this case we currently call on the base fetch HandleHeadersComplete() and then HandleWrite(). When the base fetch gets HandleHeadersComplete() it asks nginx to send out the headers.

Either PSOL needs to be setting the content length, which it's not doing so, or ngx_pagespeed needs to wait after it gets HandleHeadersComplete until HandleDone before it can send out the headers. Waiting until HandleDone definitely doesn't work in the general case, so I'll look at getting PSOL to set a content length.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

@xqyin you're right, you did. Sorry! Everything should be fine now, and trunk-tracking should build properly.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

(@oschaaf ignore my comment about rolling back 96f185d; @xqyin already fixed things up.)

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

For some pagespeed resources the content length is set properly: http://localhost:8050/mod_pagespeed_example/images/Puzzle.jpg.pagespeed.ce.91_WewrLtP.jpg

In these cases NgxBaseFetch::CollectHeaders sees content_length_known() because SharedAsyncFetch::HandleHeadersComplete() ran base_fetch->set_content_length(). So the question is: why does a SharedAsyncFetch only sometimes know it's content length?

@oschaaf

This comment has been minimized.

Copy link
Member

oschaaf commented May 29, 2014

@jeffkaufman Your analysis makes sense to me, setting the CL early when possible sounds good.
However, I'm still wondering why a content-length header actually is emitted when an incorrect hash is used.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

I'm still wondering why a content-length header actually is emitted when an incorrect hash is used.

I'm not sure I totally understand, but the incorrect hash triggers fallback rewriting, and in fallback rewriting we weren't properly clearing the content type before r3996. 1.8.31.2 doesn't have that change, though we're including it in 1.8.31.3.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

On the cache extension path we don't store the rewritten resource (only the input resource) while on all the other paths we do. Perhaps the content length isn't populated when we load from a rewritten resource?

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

This code (ProxyFetch etc) is very different from the apache code, and this bug doesn't seem to apply in apache. But it's almost identical to what PSS runs, and I do see Transfer-Encoding: chunked there: http://1-ps.googleusercontent.com/h/www.pssdemos.com/filter/scripts/example.js+example2.js.pagespeed.jc.49tRvQkrsh.js

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

RewriteContext::FetchDone has:

      if (output_resource_->hash() == requested_hash_) {
        response_headers->CopyFrom(*(
            output_resource_->response_headers()));
        // Use the most conservative Cache-Control considering all inputs.
        ApplyInputCacheControl(response_headers);
        AddMetadataHeaderIfNecessary(response_headers);
        StringPiece contents = output_resource_->contents();
        async_fetch_->set_content_length(contents.size());
        async_fetch_->HeadersComplete();
        ok = async_fetch_->Write(contents, handler_);
      } else {
        // Our rewrite produced a different hash than what was requested;
        // we better not give it an ultra-long TTL.
        FetchFallbackDone(output_resource_->contents(),
                          output_resource_->response_headers());

        return;
      }

Calling set_content_length() and then HeadersComplete() in that order should be passing along the content length properly. Hmm.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

RewriteContext::FetchDone seems to be called when we handle most resources for the first time, and cache extended resources every time. When we take this path we set the content-length header properly. So what path do we take on other requests?

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

And if I'd read a little more I would have seen it. That's not RewriteContext::FetchDone, it's RewriteContext::FetchContext::FetchDone which has the comment:

// This class encodes a few data members used for responding to                  
// resource-requests when the output_resource is not in cache.   

Which is exactly what we're seeing: no content length in cases where the output_resource is not in cache.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

Current best guess: RewriteDriver::CacheCallback::Done() doesn't set the content length before calling Write(), which is the load-from-cache path.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented May 29, 2014

rewriter/rewrite_driver.cc CacheCallback::Done():

       if (success) {
         output_resource_->Link(value, handler_);
         output_resource_->SetWritten(true);
+        async_fetch_->set_content_length(content.size());
+        async_fetch_->HeadersComplete();
         success = async_fetch_->Write(content, handler_);
       }

This fixes it. I'll make the PSOL change.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jun 3, 2014

Fixed with PSOL r4021.

@jeffkaufman jeffkaufman closed this Jun 3, 2014

@jeffkaufman jeffkaufman changed the title chunked encoding on css resources pagespeed resources served with chunked encoding Aug 4, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment