Retain case of all tag and attribute names by default. #206

Closed
GoogleCodeExporter opened this Issue Apr 6, 2015 · 2 comments

GoogleCodeExporter commented Apr 6, 2015

Currently mod_pagespeed lower-cases all tag and attribute names.  This is the 
correct thing to do for HTML, but is the wrong thing to do for XML.  
Unfortunately, XML is frequently served with Content-type: text/html and even a 
.html extension.

An example of a site that breaks due to mod_pagespeed lower-casing XML files is

http://www.ynet.co.il/Ext/Comp/Ticker/Dhtml_Flash_Ticker/0,12114,L-184-244-132,00.html

When run without mod_pagespeed you see a scrolling list of headlines.  When run 
with mod_pagespeed you see nothing -- no errors either.

The issue is that the site contains a call to AC_FL_RunContent with:

newsxml=http://www.ynet.co.il/Ext/Comp/Ticker/Flash_Ticker_Data/0,12115,L-184,00.html?timestamp=42652030

When I run this from the command line, I can see that the content is really XML 
(not HTML), but the Content-Type is specified as text/html:

% wget -O - -q --save-headers 'http://www.ynet.co.il/Ext/Comp/Ticker/Flash_Ticker_Data/0,12115,L-184,00.html?timestamp=42652030' | head -20
HTTP/1.0 200 OK
Server: Microsoft-IIS/5.0
Content-Length: 2019
Content-Type: text/html
Cache-Control: max-age=814
Date: Wed, 02 Feb 2011 18:35:34 GMT
Connection: keep-alive

<!-- Vignette V6 Wed Feb 02 20:23:36 2011 -->


<TickerItems>
<item>
        <date>02/02/2011 20:19</date>
        <link>/Ext/Comp/CdaNewsFlash/0,2297,L-4023126_184,00.html</link>
        <message><![CDATA[           :               ,      "     ]]></message>
    </item>
<item>
        <date>02/02/2011 20:12</date>
        <link>/Ext/Comp/CdaNewsFlash/0,2297,L-4023121_184,00.html</link>

The problem is that when a mod_pagespeed-enabled server serves that response, 
the URL has an .html extension and the Content-Type is text/html.  The body 
also begins with a "<".  These are all signals to mod_pagespeed that this is 
really HTML, which is not case sensitive.  mod_pagespeed, following best 
practices, will lower-case all the tag names, which probably breaks the site.
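
To illustrate the failure mode, here is a minimal Python sketch, not mod_pagespeed's actual rewriter. It assumes a naive regex-based lower-casing pass and a consumer that looks up the root element by its exact name. The helper names `lowercase_tag_names` and `root_is_ticker_items` are hypothetical, and `SAMPLE` is a trimmed, manually closed version of the response shown above.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed version of the ticker response from the report (closing tags added).
SAMPLE = """<TickerItems>
<item>
    <date>02/02/2011 20:19</date>
    <link>/Ext/Comp/CdaNewsFlash/0,2297,L-4023126_184,00.html</link>
</item>
</TickerItems>"""


def lowercase_tag_names(markup: str) -> str:
    """Lower-case the name in every opening and closing tag.

    A naive stand-in for an HTML-oriented rewriter; it leaves text
    content and attribute values untouched.
    """
    return re.sub(
        r"<(/?)([A-Za-z][\w.-]*)",
        lambda m: "<" + m.group(1) + m.group(2).lower(),
        markup,
    )


def root_is_ticker_items(markup: str) -> bool:
    """Mimic a case-sensitive XML consumer that expects <TickerItems>."""
    return ET.fromstring(markup).tag == "TickerItems"
```

With the original markup `root_is_ticker_items(SAMPLE)` holds. After `lowercase_tag_names(SAMPLE)` the document still parses as well-formed XML, but the root is now `tickeritems`, so a case-sensitive lookup finds nothing. That would be consistent with the ticker showing no headlines and no errors.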



The flip side of this is that web best practices suggest that we should 
lower-case HTML tag and attribute names to improve the effectiveness of gzip.  
But I think on balance it's better not to break sites, so we should make the 
case-folding a filter that's off by default.

Original issue reported on code.google.com by jmara...@google.com on 2 Feb 2011 at 6:44

GoogleCodeExporter commented Apr 6, 2015

This is fixed in the trunk.  Tag and attribute names now keep their original 
case by default.  There is a new option
   ModPagespeedLowercaseHtmlNames on
which can be used to case-fold them again.

Original comment by jmara...@google.com on 14 Feb 2011 at 1:56

  • Changed state: Fixed

GoogleCodeExporter commented Apr 6, 2015

Original comment by sligocki@google.com on 24 Feb 2011 at 9:45

  • Added labels: release-note