This repository was archived by the owner on Apr 10, 2025. It is now read-only.
Retain case of all tag and attribute names by default. #206
Closed
Description
Currently mod_pagespeed lower-cases all tag and attribute names. This is the
correct thing to do for HTML, but is the wrong thing to do for XML.
Unfortunately, XML is frequently served with Content-type: text/html and even a
.html extension.
An example of a site that breaks due to mod_pagespeed lower-casing XML files is
http://www.ynet.co.il/Ext/Comp/Ticker/Dhtml_Flash_Ticker/0,12114,L-184-244-132,00.html
When run without mod_pagespeed you see a scrolling list of headlines. When run
with mod_pagespeed you see nothing -- no errors either.
The issue is that the site contains a call to AC_FL_RunContent with:
newsxml=http://www.ynet.co.il/Ext/Comp/Ticker/Flash_Ticker_Data/0,12115,L-184,00.html?timestamp=42652030
When I run this from the command line, I can see that the content is really XML
(not HTML), but the Content-Type is specified as text/html:
% wget -O - -q --save-headers 'http://www.ynet.co.il/Ext/Comp/Ticker/Flash_Ticker_Data/0,12115,L-184,00.html?timestamp=42652030' | head -20
HTTP/1.0 200 OK
Server: Microsoft-IIS/5.0
Content-Length: 2019
Content-Type: text/html
Cache-Control: max-age=814
Date: Wed, 02 Feb 2011 18:35:34 GMT
Connection: keep-alive
<!-- Vignette V6 Wed Feb 02 20:23:36 2011 -->
<TickerItems>
<item>
<date>02/02/2011 20:19</date>
<link>/Ext/Comp/CdaNewsFlash/0,2297,L-4023126_184,00.html</link>
<message><![CDATA[ : , " ]]></message>
</item>
<item>
<date>02/02/2011 20:12</date>
<link>/Ext/Comp/CdaNewsFlash/0,2297,L-4023121_184,00.html</link>
The problem is that the response served by a mod_pagespeed-enabled server has an
.html extension, a Content-Type of text/html, and a body that begins with "<".
These are all signals to mod_pagespeed that the content is really HTML, which is
not case-sensitive. mod_pagespeed, following best practices, therefore
lower-cases all the tag names, which probably breaks the site, since XML names
are case-sensitive.
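To illustrate the failure mode, here is a minimal Python sketch, not mod_pagespeed's actual code. The helper `lowercase_tag_names` is a hypothetical stand-in for HTML-style case folding (tag names only), and `count_items` is an assumed case-sensitive consumer standing in for whatever reads the ticker feed. The sample markup is a trimmed version of the response shown above.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed sample modeled on the ticker response above.
SAMPLE = (
    "<TickerItems>"
    "<item><date>02/02/2011 20:19</date></item>"
    "<item><date>02/02/2011 20:12</date></item>"
    "</TickerItems>"
)


def lowercase_tag_names(markup: str) -> str:
    """Hypothetical stand-in for HTML-style folding: lower-case tag names only."""
    return re.sub(
        r"<(/?)([A-Za-z][A-Za-z0-9]*)",
        lambda m: "<" + m.group(1) + m.group(2).lower(),
        markup,
    )


def count_items(markup: str) -> int:
    """Assumed case-sensitive consumer: requires a root element named 'TickerItems'."""
    root = ET.fromstring(markup)
    if root.tag != "TickerItems":
        return 0  # silently yields nothing, like the empty ticker
    return len(root.findall("item"))


original_count = count_items(SAMPLE)
folded_count = count_items(lowercase_tag_names(SAMPLE))
print(original_count, folded_count)
```

The folded document is still well-formed XML, so nothing raises an error; the consumer simply no longer finds the element names it expects, which matches the "you see nothing, no errors either" symptom.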
The flip side is that web best practices suggest lower-casing HTML keywords to
improve the effectiveness of gzip. On balance, though, I think it's better not
to break sites, so we should make case-folding a filter that's off by default.
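A rough Python sketch of what "off by default" could look like is below. The names `RewriteOptions`, `lowercase_names`, and `rewrite` are hypothetical and do not come from mod_pagespeed's configuration; the sketch only folds tag names, not attribute names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RewriteOptions:
    # Hypothetical option name. Off by default so XML served as text/html
    # passes through with its original case.
    lowercase_names: bool = False


_TAG_NAME = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")


def rewrite(markup: str, options: Optional[RewriteOptions] = None) -> str:
    """Return markup unchanged unless case folding was explicitly enabled."""
    options = options or RewriteOptions()
    if not options.lowercase_names:
        return markup
    # Opt-in path: lower-case tag names only (attributes are left alone here).
    return _TAG_NAME.sub(
        lambda m: "<" + m.group(1) + m.group(2).lower(), markup
    )


doc = "<TickerItems><item>x</item></TickerItems>"
default_output = rewrite(doc)
opted_in_output = rewrite(doc, RewriteOptions(lowercase_names=True))
print(default_output)
print(opted_in_output)
```

With this shape, sites that know their content is real HTML can opt in for the gzip benefit, while mislabeled XML is left alone unless someone asks for folding.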
Original issue reported on code.google.com by jmara...@google.com on 2 Feb 2011 at 6:44