New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mod_pagespeed strips custom HTML headers #367

Closed
GoogleCodeExporter opened this Issue Apr 6, 2015 · 69 comments

Comments

Projects
None yet
1 participant
@GoogleCodeExporter
Copy link

GoogleCodeExporter commented Apr 6, 2015

A user has this in his .conf file:
   Header set ServerID  "XXXX"
he reports that the ServerID is stripped by mod_pagespeed on serving.  Looking 
at the code I think this is correct.

in apache/mod_instaweb.cc, function rewrite_html, we have this block of code:

  if (!context->sent_headers()) {
    ResponseHeaders* headers = context->response_headers();
    apr_table_clear(request->headers_out);
    AddResponseHeadersToRequest(*headers, request);
    headers->Clear();
    context->set_sent_headers(true);
  }

That call to apr_table_clear is the culprit.  I think instead we need to 
rewrite the existing headers, replacing only the ones that we need to override.

Original issue reported on code.google.com by jmara...@google.com on 4 Jan 2012 at 3:57

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Original comment by jmara...@google.com on 4 Jan 2012 at 3:57

  • Added labels: Type-Defect
  • Removed labels: Type-Enhancement
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

On Wed, Jan 4, 2012 at 11:06 AM, Jimy Johny <jimyjohny@gmail.com> wrote:
  The wordpress or htaccess hasn't got any header information. Its directly
  added to the virtualhost configuration. I was also in the assumption that
  mod_headers runs after mod_pagespeed.  In fact it works for images and css
  while it fails for html.

  Is this a simple issue which can be fixed very soon ?  or is their any way
  where I could change the order of apache module processing? 

You can't change the order except by source-code changes.  If you are willing 
to edit C++ code and build yourself, you can do something right now as 
suggested in the thread & I could help you.

If you can update from trunk we could probably fix it this week, assuming we 
can reproduce it.

If you need a binary release it will take longer, as we don't have a release 
scheduled right now.

Original comment by jmara...@google.com on 4 Jan 2012 at 4:25

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I can update from trunk and test it if its available by next week. At the same 
time, I love to edit the code and try on myside if you could provide the help. 
Just let me know what change needs to be applied.

Original comment by jimyjo...@gmail.com on 4 Jan 2012 at 4:33

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Please try commenting out the call to apr_table_clear in 
net/instaweb/apache/mod_instaweb.cc, function rewrite_html().

Original comment by jmara...@google.com on 4 Jan 2012 at 4:41

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I tried this, but didn't fixed the issue. Duplicate headers were added.

Date    Wed, 04 Jan 2012 17:13:31 GMT
Server  Apache
X-Powered-By    PHP/5.2.17, PHP/5.2.17
X-Pingback  http://www.xxxx.com/xmlrpc.php, http://www.xxxx.com/xmlrpc.php
Content-Type    text/html; charset=UTF-8
X-Mod-Pagespeed 0.10.19.5-1298, 0.10.19.5-1298
Cache-Control   max-age=0, no-cache, no-store, max-age=0, no-cache, no-store
Vary    Accept-Encoding
Content-Encoding    gzip
Keep-Alive  timeout=1, max=100
Connection  Keep-Alive
Transfer-Encoding   chunked

Original comment by jimyjo...@gmail.com on 4 Jan 2012 at 5:17

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Thanks for giving this a try.  I can certainly understand why we are doubling 
some of the resources, but I can't think of why we are losing the ServerID.  We 
certainly don't look for that particular header.

I have another thing for you to try.  Go ahead & replace the apr_table_clear.  
I don't think that was the problem after all.  Instead try commenting out the 
entire contents of the function DisableDownstreamHeaderFilters in 
net/instaweb/apache/header_util.cc.

This is more likely the culprit.  The reason that we have this call is that we 
are trying to automatically set the cache lifetimes of mod_pagespeed-rewritten 
HTML & resources and we were running into trouble with configuration files 
overriding our settings, eliminating some of the benefit of using mod_pagespeed.

I'm not entirely sure how to work around this in a way that allows 
non-caching-related headers to be added to HTML resources from the .conf file.  
We might need to do this using a mod_pagespeed option to leave HTML headers 
alone.

We happen to be working on such an option already.  But in the meantime please 
give this a try.

Original comment by jmara...@google.com on 4 Jan 2012 at 6:07

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Issue Fixed by commenting out DisableDownstreamHeaderFilters . Is this going to 
reduce the performance ?

Original comment by jimyjo...@gmail.com on 5 Jan 2012 at 4:08

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

The benefit provided by DisableDownstreamHeaderFilters is that it allows 
mod_pagespeed to determine caching headers for rewritten files, overriding the 
settings added using "Header Set" and "Expires".  Those settings are needed to 
allow the site owner to control the cache lifetime of origin resources, but we 
don't want them to apply to rewritten resources.

So when you comment out this function, mod_pagespeed's caching control might 
not work.


The proper fix for this is, I think, for mod_pagespeed to offer a 
pagespeed.conf setting which will apply only to HTML files.  If you turn this 
new feature on, mod_pagespeed will not enforce any particular caching policy 
for HTML, and the responsibility for setting appropriate HTML caching headers 
will lie with the site owner.  Because of the way mod_pagespeed extends cache 
lifetime for resources using a content hash in a resource URL, a long HTML 
cache lifetime could result in stale resources (typically CSS files).  This is 
why mod_pagespeed currently takes a conservative approach and disables caching 
of HTML -- to avoid stale cache-extended resources.

Hope that made sense!

Original comment by jmara...@google.com on 5 Jan 2012 at 4:51

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

That sounds good. maybe I should wait for the proper fix.

Original comment by jimyjo...@gmail.com on 5 Jan 2012 at 5:41

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I have another thought about a proper fix that might work without requiring a 
new mod_pagespeed option.

For HTML, rather than removing the mod_headers and mod_expires filters, we 
should install a very late fixup handler to put in the caching semantics we've 
computed for HTML.

For resources, we can leave the existing flow as is because we are capturing 
any custom headers when we fetch the origin resource.

Original comment by jmara...@google.com on 6 Jan 2012 at 2:52

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Original comment by jmara...@google.com on 6 Jan 2012 at 6:07

  • Changed state: Accepted
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

OK that last idea worked, and is now committed to trunk.  If you prefer to work 
from a release branch, you can try to patch in the changes attached to the bug 
report.

Original comment by jmara...@google.com on 6 Jan 2012 at 9:48

  • Changed state: Fixed
  • Added labels: release-note

Attachments:

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Thank you for the fix.

I'll try this. In the meantime, I tried something different. After disabling 
the function, I noticed that its able to add Cache-Control headers also. As it 
worked like that, I tried to host the html via CDN which will cache for 5 min. 
So after 5 min, it will come back again to my server and can complete the 
pagespeed processing. Is it possible that I get same functionality with 
Cache-Control header after I apply this patch ?

Original comment by jimyjo...@gmail.com on 7 Jan 2012 at 2:39

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

The latest release (0.10.21.2) added a new directive 
ModPagespeedModifyCachingHeaders.
http://code.google.com/speed/page-speed/docs/install.html#ModifyCachingHeaders

Would this address your requirements?

Original comment by matterb...@google.com on 2 Mar 2012 at 3:37

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

It provides solution to an extend. I'm using a proxy as frontend. So it seems I 
have to set the cache header in all backend servers when the new option is 
enabled. The fix which applied earlier by editing the code allowed me to add 
the header in frontend server where pagespeed is installed.

Original comment by jimyjo...@gmail.com on 2 Mar 2012 at 4:52

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

One final note:  There is a new option in the most recent binary release that 
provides the behavior you want: explicitly control of your HTML caching headers:
   ModPagespeedModifyCachingHeaders off
http://code.google.com/speed/page-speed/docs/install.html#ModifyCachingHeaders

Original comment by jmara...@google.com on 2 Mar 2012 at 10:21

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Original comment by jmara...@google.com on 22 May 2012 at 7:59

  • Added labels: Milestone-v21
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Seems like this issue is back again with the latest release (0.10.22.4-1648). I 
tried to insert a custom header via proxy, but didn't worked. This was working  
when disabling DisableDownstreamHeaderFilters in older releases ( 
0.10.21.2-1527 ). But, in the new release, this fix isn't working at all.

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 4:31

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Hi -- and sorry for the re-occurrence of this.  Actually I think we never 
really fixed it for you.  Your fix to comment out 
DisableDownstreamHeaderFilters was not incorporated in the new option 
"ModPagespeedModifyCachingHeaders".  Is that what you tried with 0.10.22.4?

Does it still work for you if you again comment out 
DisableDownstreamHeaderFilters?

Original comment by jmara...@google.com on 28 Aug 2012 at 5:19

  • Changed state: Started
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Also, could you please try the latest trunk version? It's @1815 at the moment.

Original comment by matterb...@google.com on 28 Aug 2012 at 5:21

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I'm testing that now. But when I tried to recompile I got the following error 
message. "make: *** No rule to make target `linux_package_rpm'.  Stop." . any 
idea what could have stopped me from compiling ?

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 5:25

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Seems like svn checkout didn't completed and that lead to the compilation issue.

Syncing projects:  79% (43/54) 
________ running 'svn checkout 
http://code.opencv.org/svn/opencv/tags/2.3.1/opencv/include@head 
/root/mod_pagespeed/src/third_party/opencv/src/opencv/include --revision head 
--force --ignore-externals' in '/root/mod_pagespeed'
svn: Could not open the requested SVN filesystem
Error: Command svn checkout 
http://code.opencv.org/svn/opencv/tags/2.3.1/opencv/include@head 
/root/mod_pagespeed/src/third_party/opencv/src/opencv/include --revision head 
--force --ignore-externals returned non-zero exit status 1 in 
/root/mod_pagespeed

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 6:09

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Please use branches/22 rather than tags/2.3.1

Original comment by matterb...@google.com on 28 Aug 2012 at 6:22

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

:( ... opencv svn repository not available any more as per 
http://code.opencv.org. They migrate to git. So it looks like, pagespeed 
version 0.10.22.4 and lower cannot be used from the source code any more. I'm 
going to try the latest trunk.

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 6:23

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I've had some trouble reproducing this bug but nevertheless we think we know 
where the bug is.  The problem is that we capture the headers very early in the 
life of a request, copying them into our own structure.  We modify our own 
structure, and later on, we clear out the apache request headers and replace 
them with our own copy.

This is done here: 
http://code.google.com/p/modpagespeed/source/browse/trunk/src/net/instaweb/apach
e/mod_instaweb.cc#329

I think we should try to reproduce the problem in a test, but we could in any 
case change the strategy to incrementally update the headers with the values in 
our own copy, rather than clearing and replacing.

Original comment by jmara...@google.com on 30 Oct 2012 at 7:22

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We've also been burned by this, and are fairly certain the issue in our case 
that any GET which has a Cache-Control header set will inspire mod_pagespeed to 
disable later header fixups (i.e. explicit Header directives in the apache 
configuration). This is done in the DisableDownstreamHeaderFilters function. 
Our apache server is proxying for a JBoss/Tomcat application and we're setting 
"Cache-Control: no-cache" there. Fundamentally I object to the idea that 
mod_pagespeed "knows better" what headers should be set than I do, especially 
when I'm explicitly setting them in the server configuration. I'm prepared to 
create a patch to introduce a configuration option to, basically, stop 
DisableDownstreamHeaderFilters from doing what it's doing. Would such a patch 
be accepted? Is there a better approach?

Original comment by m...@favoritemo.com on 15 Nov 2012 at 8:29

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I should add that this bug is so severe from our perspective, that we have 
essentially rejected ModPageSpeed's promotion for use in our production 
environment and we have disabled it in our QA environment for the foreseeable 
future. I will be following this ticket and may opt for reconsideration of our 
use of ModPageSpeed if this issue is reliably resolved.

Original comment by alexel...@gmail.com on 15 Nov 2012 at 8:41

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We'd really like to fix this bug but were unable to reproduce it.  Can someone 
publish enough detail on what their setup is that we can see it happening 
ourselves?

In general MPS needs to make caching of HTML more pessimistic than sites have 
originally because we have made caching of resources more aggressive, if that 
makes any sense.

Matt&Alexelman: you are both running 1.1.23.1, is that right, and still seeing 
this?  You both have specified
   ModPagespeedModifyCachingHeaders off
and this still is affecting you?

Original comment by jmara...@google.com on 15 Nov 2012 at 11:00

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Yes, I have specified ModPagespeedModifyCachingHeaders off as mentioned 
previously. This is the version we are using:

mod-pagespeed-stable-1.0.22.7-2005

Original comment by alexel...@gmail.com on 15 Nov 2012 at 11:11

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Can you try 1.1.23.1?  One theory was that we had trouble reproducing because 
we'd fixed the problem in that release already.

Original comment by jmara...@google.com on 15 Nov 2012 at 11:14

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We have experimented with several versions, including the latest as of a
few weeks ago. After we observed that the header behavior was different
depending on whether or not your request is a POST or a GET, we decided to
just look at the source. And if you are getting caught in the Cache-Control
trap like we are, I am absolutely certain that behavior has not changed as
of revision 2191.

I think I'll go ahead and patch it and see if that fixes anyone's problem.
I'm unsure what motivated the creation of DisableDownstreamHeaderFilters --
to me it seems like something you would never want; if users were messing
up pagespeed headers with downstream filters, well, they should just stop
doing that (via configuration) rather than having pagespeed ruin _all_
attempts to set output headers on pages it's rewriting. So if it would make
more sense to remove it, I'm happy to do that instead.

-m

Original comment by m...@favoritemo.com on 15 Nov 2012 at 11:45

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Matt,

I think the fear is that, if a user of MPS has an aggressive caching policy, 
this would essentially render many of the features of MPS useless. The 
inspiration behind DDHF is to circumvent these aggressive caching measures and 
for MPS to institute something more useful.

That being said, DDHF seems to be overly assertive in removing headers for some 
reason that is unclear to me. I have not tried 1.1.23.1 but it seems like it 
might be effective as long as we use ModPagespeedModifyCachingHeaders off

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:26

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I don't have a great repro yet (which is really essential to me delivering a 
fix with confidence).

But I think I have an idea of what the problem might be; it might be some 
convergence in the helper functions used to enforce 1-year TTL for rewritten 
resources (which we definitely want) and the code to enforce cc:nocache for 
HTML (which should be configurable).

Do all the people suffering from this issue using JBoss/Tomcat?  I don't know 
much about either of those.  If detailed configuration information could be 
provided maybe I could repro.

Original comment by jmara...@google.com on 16 Nov 2012 at 12:36

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We use a both Tomcat 5 and Tomcat 7. We see the same issue on applications 
using both 5 and 7.

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:54

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Ah to clarify we see issues on applications running on tomcat 5 as well as apps 
running on tomcat 7

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:54

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I have never used Tomcat.  Could you possibly post the Apache config needed to 
connect to Tomcat and whatever files you need to repro the problem in Tomcat.  
Finally (seems like a long list I know...) some kind of 'wget' or 'curl' based 
query to the server showing the response-headers you are complaining about?

Naturally we do have tests that make us think 'ModifyCachingHeaders off' works, 
but it evidently doesn't prove that it works for *your* configuration.

Thanks!

Original comment by jmara...@google.com on 16 Nov 2012 at 1:23

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I gave 1.1.23.1 a try and MPS seems to preserve all headers!

MPS on and ModifyCachingHeaders on: curl shows that filters work on html and 
headers remain intact
MPS on and ModifyCachingHeaders off: curl shows that filters do NOT work on 
html and 
 headers remain intact
MPS off and ModifyCachingHeaders on: curl shows filters do not work and headers 
remain intact
MPS off and ModifyCachingHeaders on: curl shows filters do not work and headers 
remain intact

Tomorrow I will do some further testing but so far, it seems that the headers 
always remain intact.

Original comment by alexel...@gmail.com on 16 Nov 2012 at 3:51

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

matt@favoritemo.com it would be great if you could try 1.1.23.2 a try (a minor 
patch (.2) was just released today, mainly to quiet down some noisy logging).

matterbury@ made a specific change made since the previous functional release 
(http://code.google.com/p/modpagespeed/source/detail?r=1816), that addressed a 
related issue (Issue 385).  1.1.23.1 was the first release incorporating that 
fix.

We were attempting to get to the bottom of this bug before 1.1.23.1 was frozen, 
but could not repro so we're still looking for more detail.  If you can repro 
on that release we'd love to hear more detail about the setup (conf files, 
other modules, etc).

Original comment by jmara...@google.com on 16 Nov 2012 at 4:14

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

First some background on our setup: we have apache sitting in front of 
JBoss/Tomcat servers, proxying requests to them. Our intention is to have 
apache both do MPS magic as well as set content security policy (CSP) headers 
on responses. The headers in question I've been losing are the CSP ones, 
defined very simply and statically in the apache configuration to be set 
always...except if MPS gets a hold of a proxied GET request, they didn't appear 
there.

Anyway, I think I've found something interesting. I tried building from source 
and rigging up a test case whereby apache was MPS-processing requests to itself 
(using either ProxyPass or rewrite rules) and having no luck reproducing the 
problem -- all of my post-MPS response headers were coming back intact. I then 
took that built module and installed it on one of our test servers; no joy, 
same behavior wherein apache refuses to set response headers after MPS is 
triggered. There was only one remaining difference: instead of proxying over 
HTTP, we are configured to proxy over AJP.

Once I configured the test server to proxy to Tomcat via HTTP instead of AJP, 
the apache-set response headers started to appear.

I don't know enough about the apache proxying code, the behavior of AJP, or 
whatever other subtle differences exist between AJP responses and HTTP 
responses to know what to do next, exactly, but I did at least figure out the 
cause of what I've been seeing. If anyone has some pointers in the right 
direction, I'd love to hear them.

Original comment by m...@favoritemo.com on 20 Nov 2012 at 11:47

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I also just now tried commenting out the call to 
DisableDownstreamHeaderFilters, just to see if it made a difference -- it did 
not. Whatever's stopping those headers from being set is not there.

Original comment by m...@favoritemo.com on 21 Nov 2012 at 12:20

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Thanks for sticking with this Matt.  Once we can repro we can easily fix.  
There are a few different theories on what might be broken, one of which you 
have just eliminated with your experiment; I even have some code to avoid 
calling DisableDownstreamHeaderFilters in the HTML flow, but it does not break 
or fix any testcase I have so I haven't proposed submitting that.  Commenting 
out DisableDownstreamHeaderFilters of course definitely breaks some of our 
tests that make sure we cache .pagespeed. resources for a year.

Another theory is that the timing of the arrival of the header is causing us 
trouble.  That was the problem with Issue 385 -- we had captured the original 
headers in our own data structure net_instaweb::ResponseHeaders before they 
were set by PHP, so when we wrote the final headers back with our caching 
updates we lost the ones PHP had tried to set.  This was fixed by delaying the 
strobe for the upstream headers.  An alternative fix might be a little more 
robust, which is to avoid copying over the headers, and instead merge them back.

Yet another theory is that some of the upstream headers might be arriving in 
request->headers_out, while others are arriving in request->err_headers_out.  
Our system might be interfering with the Apache logic.

Do you know if the JbBoss/Tomcat interface is open-source?  If so we can look 
at the source & try to figure out how it's transmitting headers, especially via 
AJP (those initials mean nothing to me by the way...)

Original comment by jmara...@google.com on 21 Nov 2012 at 2:58

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

The AJP interface is definitely open source. Some lazy googling yields

https://bitbucket.org/mirror/apache-http-server/src/eb7db2fa8607/modules/proxy/m
od_proxy_ajp.c?at=trunk

Which has a possibly interesting comment about the protocol's lack of a 
specific flush message requires some shenanigans with polling. I had a theory 
that the issue might have something to do with the buckets coming back from the 
AJP proxy in a way that was violating some assumption that MPS was making, but 
haven't investigated that myself.

This is close to an academic problem for us, as we can probably just switch to 
proxying via HTTP instead (among many other ways to get a header added to the 
response post-MPS), but I'm willing to help you guys chase it down.

Original comment by m...@favoritemo.com on 21 Nov 2012 at 7:44

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Now, after having looked at it for a while longer, I think I might have not 
been correct about DisableDownstreamHeaderFilters not fixing anything. I've 
added a bunch of logging to try to figure out what's going on. While proxying 
via HTTP and via AJP, I see that DDHF is being called due to Cache-Control 
being set by the application server, but via HTTP the filters it ends up 
iterating through look like this

looking at filter mod_pagespeed_output_filter
looking at filter deflate
looking at filter content_length
looking at filter http_outerror
looking at filter chunk
looking at filter log_input_output
looking at filter core

While using AJP it looks like this:

looking at filter mod_pagespeed_output_filter
looking at filter fixup_headers_out
looking at filter deflate
looking at filter deflate
looking at filter mod_pagespeed_fix_headers_filter
looking at filter byterange
looking at filter content_length
looking at filter http_header
looking at filter http_outerror
looking at filter log_input_output
looking at filter core

Strangely, of course, fixup_headers_out appears to have already happened (?!) 
in the HTTP proxying setup (which is why the custom headers end up appearing in 
those responses) but not yet in the AJP proxying setup (and since DDHF is 
happening they don't appear). Bizarre, not to mention I can't explain why 
"deflate" is in there twice and several others only appear in the AJP proxying 
setup. The only difference between the two is that in HTTP mode I have the line 
"ProxyPass /app-path http://localhost:8080/app-path" and in AJP mode "ProxyPass 
/app-path ajp://localhost:8009/app-path".

Really confusing.

Original comment by m...@favoritemo.com on 23 Nov 2012 at 11:20

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Original comment by jkar...@google.com on 26 Nov 2012 at 2:22

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

Bouncing back to Matt

Original comment by jmara...@google.com on 28 Nov 2012 at 7:13

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

AJP vs. HTTP weirdness aside, I still stand by my original assertion: if I want 
to explicitly set headers downstream from MPS, I should be allowed to do so. 
Even if they happen to screw up headers that MPS is itself trying to set, but 
especially if they don't. DisableDownstreamHeaderFilters, as currently 
implemented and invoked, is a poor solution to a "doctor, it hurts when I do 
this" problem.

Original comment by m...@favoritemo.com on 29 Nov 2012 at 7:29

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We agree with you 100% with your desire to set your own headers, and we have 
unit tests that -- at least for our setup -- prove that you can.

DisableDownstreamHeaderFilters is probably applied too aggressively, and in 
fact I have a change-list that tries to narrow its application to the case we 
need, which is to avoid having the origin Expires/Cache-Control headers applied 
to resources that we are cache-extending.

However I have been unable to come up with a testcase that this change fixes.  
If you are able to build from source I could attach a patch-file for you to try 
to let me know if it fixes your problem.  Would that be feasible?

Original comment by jmara...@google.com on 29 Nov 2012 at 8:34

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I'm not sure how to help you construct a test case that doesn't also depend on 
the...unusual ordering of the filters as described in comment 58. The key to 
triggering the DisableDownstreamHeaderFilters behavior is having whatever's 
upstream of MPS set Cache-Control to something. It appears that anything doing 
HTTP proxying won't reproduce it because fixup_headers_out is happening before 
MPS, but if the test harness has another way to produce a resource for MPS to 
process that doesn't also lead to fixup_headers_out happening before MPS gets 
its chance then that would be a promising angle.

I can definitely build from source and would love a patch. In fact I have 
patched out the DDHF behavior myself, but I don't want to maintain a fork...

Original comment by m...@favoritemo.com on 29 Nov 2012 at 11:58

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I was going to send you one today but then... 
http://bostinno.com/2012/11/29/cambridge-power-outage-mbta-signal-problems/#ss__
265464_1_0__ss

Just to confirm, you said commenting out DDHF did not help in comment 55 but 
then you found in comment 58 it did.  So if I prevent that function from being 
called in the HTML path (where I think we don't need it) that should solve the 
problem for you while still allowing us to extend cache-lifetime of resources.

Also to confirm you are talking purely about HTML and do not have a problem 
with our handling of resources.  Right?

I'll try to get a patch to you as soon as practical (e.g. somebody starts 
injecting enough voltage into my disk drive to make it spin).

Original comment by jmara...@google.com on 30 Nov 2012 at 12:12

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

OK here's a candidate patch, taken from trunk @ r2230

It seems to pass basic unit & system tests.  Full suite isn't done yet. 

Matt please let me know if this resolves the issue.

Original comment by jmara...@google.com on 30 Nov 2012 at 4:05

Attachments:

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

We are about to freeze a new release so if you can confirm this fixes the 
problem, I'll put it in.

Original comment by jmara...@google.com on 30 Nov 2012 at 2:52

  • Added labels: Milestone-v24
  • Removed labels: Milestone-v21
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

I haven't had a chance to build this and run through my tests today, but
looking at the diff reveals promise. It's definitely in the right area of
the code and looks like it has the right spirit. I didn't do any particular
analysis of not-HTML payloads (which is what I assume you are referring to
as "resources" here). I might have time to kick the tires on this tomorrow.

-m

Original comment by m...@favoritemo.com on 1 Dec 2012 at 12:32

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

That patch does the trick, assuming ModPagespeedModifyCachingHeaders is set to 
"off". Without it, my AJP-proxied GETs have no mod_headers-set output headers; 
with it, DDHF is not called so the headers are set correctly.

Original comment by m...@favoritemo.com on 3 Dec 2012 at 6:00

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Apr 6, 2015

This issue was closed by revision r2254.

Original comment by jmara...@google.com on 4 Dec 2012 at 4:09

  • Changed state: Fixed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment