This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Added In-place resource optimization #450

Closed · wants to merge 12 commits
Conversation

chaizhenhua (Contributor)

No description provided.

@chaizhenhua (Contributor, Author)

Now the following tests fail with the pagespeed InPlaceResourceOptimization on; option:

Failing Tests:
  In-place resource optimization
  In-place resource optimization
  In-place resource optimization
  dedup_inlined_images,inline_images
  Non-local access to statistics fails.
  Non-local access to statistics fails.
  Non-local access to statistics fails.
  Non-local access to statistics fails.
  Blocking rewrite enabled.
  Blocking rewrite enabled.
FAIL.
With the serf fetcher set up.

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua I will have a go at figuring out why the IPRO tests still fail

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua
/cc @sligocki

The 'Blocking rewrite enabled' test fails because of this line in system_test.sh:

WGET_ARGS="--header='X-PSA-Blocking-Rewrite:psatest'" [1]

nginx marks the resulting request header as invalid because it gets sent wrapped in single quotes:
2013/07/22 14:28:24 [info] 4548#0: *14 client sent invalid header line: "'X-PSA-Blocking-Rewrite: psatest'" while reading client request headers

[1] https://code.google.com/p/modpagespeed/source/browse/trunk/src/install/system_test.sh#99
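The quoting problem can be seen without nginx at all; a minimal sketch (variable names are illustrative, not from system_test.sh):

```shell
# The single quotes sit inside double quotes, so they are literal
# characters in the value and travel all the way to the server as
# part of the header name:
BAD="--header='X-PSA-Blocking-Rewrite:psatest'"

# Without the inner quotes, wget would send a well-formed header:
GOOD="--header=X-PSA-Blocking-Rewrite:psatest"

echo "$BAD"   # --header='X-PSA-Blocking-Rewrite:psatest'
echo "$GOOD"  # --header=X-PSA-Blocking-Rewrite:psatest
```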

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua
/cc @sligocki

Testing again - this doesn't fix the failing test for us (though removing the single quotes does stop nginx's complaints about invalid headers)

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua
One reason it is failing is that the image used in the IPRO tests isn't served as cacheable, which makes IPRO decline to optimize it. I will make a pull request with a fix for that.

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua
I made a little bit of progress on this:
The recorded resource gets marked as non-cacheable here:
https://code.google.com/p/modpagespeed/source/browse/trunk/src/net/instaweb/apache/in_place_resource_recorder.cc#95
It seems that response_headers->IsProxyCacheableGivenRequest(*request_headers_); is returning false. This pull didn't seem to populate those response headers yet before calling DoneAndSetHeaders. I have fixed that, but it still isn't working.

In a few hours, I will look some more at it, and hopefully figure out why IsProxyCacheableGivenRequest is returning false.

@oschaaf (Member) commented Jul 22, 2013

response_headers->IsProxyCacheableGivenRequest(*request_headers_); is returning false because somehow the Date response header is not passed in.

After adding a little hack that substitutes the current date, it now fails on something else.

@oschaaf (Member) commented Jul 22, 2013

@chaizhenhua
I saved state on my work on this:
chaizhenhua-ipro...oschaaf-ipro

It passes 2/3 of the ipro tests (and all other tests), but doesn't fix the missing Date header properly yet.
I suspect the remaining failing test needs InPlaceResourceOptimization on; at another point in the test configuration - I will fix that tomorrow.

@oschaaf (Member) commented Jul 23, 2013

@chaizhenhua
https://github.com/pagespeed/ngx_pagespeed/tree/oschaaf-ipro is now passing all tests - I will clean up and make a pull to your ipro branch.

@chaizhenhua (Contributor, Author)

@oschaaf that's great! Is the config file mod_pagespeed/src/install/debug.conf.template used for the Apache tests? In that file InPlaceResourceOptimization is set in the global config, while here we only set InPlaceResourceOptimization on in /mod_pagespeed_test/ipro/. Is that a problem?
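For reference, the two placements being discussed would look roughly like this in an nginx test config (a sketch; the surrounding server block and exact paths are assumptions, not taken from the actual test config):

```nginx
# Variant 1: enabled globally, as debug.conf.template does for Apache
pagespeed InPlaceResourceOptimization on;

server {
    listen 8080;

    # Variant 2: enabled only for the IPRO test location, as in this branch
    location /mod_pagespeed_test/ipro/ {
        pagespeed InPlaceResourceOptimization on;
    }
}
```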

@oschaaf (Member) commented Jul 23, 2013

@chaizhenhua Yes - I think that is indeed a remaining problem; looking at it now. When I re-enable InPlaceResourceOptimization globally, we seem to break other tests, mostly due to incorrect header handling - the first test that breaks fails because of a redirect response without a Location header.

@oschaaf (Member) commented Jul 23, 2013

@chaizhenhua Updated to work with InPlaceResourceOptimization enabled globally.
Pull: #452

oschaaf and others added 2 commits July 23, 2013 12:05
- Enable InPlaceResourceOptimization only in the required location
- Hand over the response headers we received to our
  InPlaceResourceRecorder
- Invent and pass in a Date response header set to the current date
  to the InPlaceResourceRecorder, if none is given.
- Set the appropriate expire configuration in the test configuration
  as required to enable ipro and pass the tests
- Port handle_error_ from ApacheProxyFetch
- Update the 'blocking rewrite enabled' tests expectations for IPRO
@jeffkaufman (Contributor)

Before this will compile against the 1.6.29.3 distribution of PSOL we need to start shipping in_place_resource_recorder.cc. Figuring that out now.

jeffkaufman added a commit that referenced this pull request Jul 24, 2013
Squash-merge of chaizhenhua's work over many pull requests, especially #450.

The copy of in_place_resource_recorder.cc is temporary and can be removed after
the next PSOL release.
@jeffkaufman (Contributor)

Merged as ad6429d

jeffkaufman reopened this Jul 24, 2013
@jeffkaufman (Contributor)

In testing this on jefftk.com it looks like there's a problem with duplicate gzip headers. Most of the time pages load normally and have headers like:

HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 15:09:50 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip
X-Cache: HIT
Cache-Control: no-cache, max-age=0

But sometimes I'll get:

HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 15:09:34 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Encoding: gzip
X-Cache: HIT
Cache-Control: no-cache, max-age=0

And Chrome will report ERR_CONTENT_DECODING_FAILED. Screenshot: http://www.jefftk.com/err-content-decoding-failed.png

Where is the second gzip header coming from?

@jeffkaufman (Contributor)

(Manually removing the second gzip header and testing with netcat as the server fixes the problem. So the question is: why are there two headers?)
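The decoding failure is easy to reproduce without nginx: the body is gzipped once, but a client honoring both Content-Encoding: gzip headers will attempt a second decode and fail. A minimal sketch:

```shell
# Compress a body once, as the server does:
printf 'hello world' | gzip > /tmp/body.gz

# First decode pass (first Content-Encoding: gzip header) succeeds:
gunzip -c /tmp/body.gz
# -> hello world

# Second decode pass (duplicate header) fails: the output of the
# first pass is plain text, not a gzip stream.
gunzip -c /tmp/body.gz | gunzip -c 2>/dev/null || echo 'second decode fails'
# -> second decode fails
```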

@jeffkaufman (Contributor)

I've now set up http://www.jefftk.com:8091 with IPRO enabled for testing.

@yaoweibin (Contributor)

Hi Jeff, you can print the debug log to check when the duplicate gzip header is inserted.

@jeffkaufman (Contributor)

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 15:35:54 GMT
Content-Type: text/html
Last-Modified: Tue, 23 Jul 2013 04:08:54 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: max-age=600
Content-Encoding: gzip

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
ETag: "51ee01d6-23d9"
Cache-Control: max-age=600
Content-Encoding: gzip
Date: Wed, 24 Jul 2013 15:35:54 GMT
Content-Encoding: gzip

The reason it's being served as cacheable is that I have ModifyCachingHeaders off in the config as part of the downstream caching setup.

The duplicate gzip headers are consistently showing up on the second time I load a page:

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 15:38:48 GMT
Content-Type: text/html
Last-Modified: Tue, 23 Jul 2013 04:08:55 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: max-age=600
Content-Encoding: gzip

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
ETag: "51ee01d7-18971"
Cache-Control: max-age=600
Content-Encoding: gzip
Date: Wed, 24 Jul 2013 15:38:48 GMT
Content-Encoding: gzip

And:

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/2013-06-10
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 15:39:43 GMT
Content-Type: text/html
Last-Modified: Tue, 23 Jul 2013 04:08:54 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: max-age=600
Content-Encoding: gzip

$ headers -H'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/2013-06-10
HTTP/1.1 200 OK
Server: nginx/1.4.1
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
ETag: "51ee01d6-3c2a"
Cache-Control: max-age=600
Content-Encoding: gzip
Date: Wed, 24 Jul 2013 15:39:43 GMT
Content-Encoding: gzip

And they keep showing up once they've appeared once.

@oschaaf (Member) commented Jul 24, 2013

@jeffkaufman
I'm not behind a PC right now, but my wild guess is that the problem spot is around here:

Are we copying over the gzip header there?

@chaizhenhua (Contributor, Author)

In testing this on jefftk.com it looks like there's a problem with duplicate gzip headers. Most of the time pages load normally and have headers like:

@jeffkaufman This is because copy_response_headers_to_ngx did not handle Content-Encoding correctly. To fix this we should set headers_out.content_encoding to the corresponding ngx_table_elt_t in copy_response_headers_to_ngx, and disable ngx_http_gzip_filter in ps_inplace_check_header_filter.

@jeffkaufman (Contributor)

@chaizhenhua setting headers_out->content_encoding in copy_response_headers_to_ngx seems to do it:

diff --git a/src/ngx_pagespeed.cc b/src/ngx_pagespeed.cc
index ce461b9..4f14b66 100644
--- a/src/ngx_pagespeed.cc
+++ b/src/ngx_pagespeed.cc
@@ -346,6 +346,8 @@ ngx_int_t copy_response_headers_to_ngx(
       headers_out->location = header;
     } else if (STR_EQ_LITERAL(name, "Server")) {
       headers_out->server = header;
+    } else if (STR_EQ_LITERAL(name, "Content-Encoding")) {
+      headers_out->content_encoding = header;
     } else if (STR_EQ_LITERAL(name, "Content-Length")) {
       int64 len;
       CHECK(pagespeed_headers.FindContentLength(&len));
$ headers -H 'Accept-Encoding: gzip' www.jefftk.com:8091/news/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 17:04:34 GMT
Content-Type: text/html
Last-Modified: Tue, 23 Jul 2013 04:08:55 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: max-age=600
Content-Encoding: gzip

$ headers -H 'Accept-Encoding: gzip' www.jefftk.com:8091/news/index
HTTP/1.1 200 OK
Server: nginx/1.4.1
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
ETag: "51ee01d7-18971"
Cache-Control: max-age=600
Content-Encoding: gzip
Date: Wed, 24 Jul 2013 17:04:34 GMT

(I'm still not sure why there's not an etag on the first load but there is on later loads.)

You had also suggested disabling ngx_http_gzip_filter in ps_inplace_check_header_filter: why?

@jeffkaufman (Contributor)

You had also suggested disabling ngx_http_gzip_filter in ps_inplace_check_header_filter: why?

I think I see why: my patch above makes it so we don't have a duplicate gzip header, but it means the output claims to be gzip encoded when it's actually plain.

@jeffkaufman (Contributor)

$ curl -D- -s -H 'User-Agent: Chrome/100' -H 'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/2011-10-04
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Jul 2013 17:11:12 GMT
Content-Type: text/html
Last-Modified: Tue, 23 Jul 2013 04:08:55 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: max-age=600
Content-Encoding: gzip

[gzip-encoded html]

$ curl -D- -s -H 'User-Agent: Chrome/100' -H 'Accept-Encoding: gzip' http://www.jefftk.com:8091/news/2011-10-04
HTTP/1.1 200 OK
Server: nginx/1.4.1
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
ETag: "51ee01d7-31b6"
Cache-Control: max-age=600
Content-Encoding: gzip
Date: Wed, 24 Jul 2013 17:11:12 GMT

<html>
...

@jeffkaufman (Contributor)

How do we disable ngx_http_gzip_filter?

@jeffkaufman (Contributor)

We should move the discussion off this closed pull request. I've created a bug: #456

return NGX_ERROR;
}

ctx->base_fetch->SetRequestHeadersTakingOwnership(request_headers.release());
Contributor

When I make a debug build I get a check failure here. The code in question is in async_fetch.cc:set_request_headers:

DCHECK(!owns_request_headers_);
if (owns_request_headers_) {
  delete request_headers_;
}
request_headers_ = headers;
owns_request_headers_ = false;

(SetRequestHeadersTakingOwnership first calls set_request_headers and then sets owns_request_headers_ = true.)

It looks like ctx->base_fetch already has request headers.

Contributor

When a BaseFetch is constructed it calls PopulateRequestHeaders which calls copy_request_headers_from_ngx. This used to make sense but doesn't anymore because now in the only place where we create a BaseFetch we immediately set request headers in a different way. I'll fix it.

Contributor Author

@jeffkaufman #461 already fixes this.

Contributor

It's also fixed in #484 (merged)

jeffkaufman deleted the chaizhenhua-ipro branch December 6, 2016