This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Newline in Content-Type meta tag breaks the page #1083

Closed
sachinjsk opened this issue May 14, 2015 · 5 comments
Comments

@sachinjsk

What steps will reproduce the problem?

  1. Make a new webpage with the following meta tag:
    <meta http-equiv="Content-Type" content="text/html;

charset=UTF-8">
  2. Upload the file to your server with the standard mod_pagespeed configuration loaded.
  3. Open the website in the Chrome browser.

What is the expected output?
You should see the webpage you've just created.

What do you see instead?
This webpage is not available

ERR_CONTENT_DECODING_FAILED

What version of the product are you using (please check X-Mod-Pagespeed
header)?
X-Mod-Pagespeed: 1.9.32.3-4448

On what operating system?
Redhat 5

Which version of Apache?
Apache version 2.2.3-91

Please provide any additional information below, especially a URL or an
HTML file that exhibits the problem.
Here's a demo page - http://www.allesamerika.sbi1b.sitesell.com/cultuur-amerika.html

@jmarantz
Contributor

Out of curiosity, how do you manage to get newlines embedded in the
meta-tag string?

@jmarantz
Contributor

Because I was curious, I checked the spec. It is not valid HTML to embed newlines in quoted attribute values, at least by my interpretation of http://www.w3.org/TR/html-markup/syntax.html#syntax-attribute-value :

4.5. Text and character data

Text in element contents <http://www.w3.org/TR/html-markup/syntax.html#contents> (including in comments <http://www.w3.org/TR/html-markup/syntax.html#syntax-comments>) and attribute values <http://www.w3.org/TR/html-markup/syntax.html#syntax-attribute-value> must <http://www.w3.org/TR/html-markup/terminology.html#must-requirement> consist of Unicode characters, with the following restrictions:

However I agree that the bug is valid, and we should not generate an
invalid HTTP attribute just because we were given an invalid HTML attribute.

-Josh

@sachinjsk
Author

@jmarantz It does look like a newline is invalid. That site belongs to one of my clients, so I don't know how the newline got there in the first place. I assume one of the HTML editors did it.

For now I've disabled 'convert_meta_tags' filter, and that fixed the issue.

@oschaaf oschaaf self-assigned this Oct 27, 2015
@oschaaf
Member

oschaaf commented Oct 27, 2015

I tried to reproduce this on master, but had no luck.
Perhaps this specific example has since been fixed, either directly or as a side effect of an unrelated change. Tracing the code, it parses the mimetype/character set as declared within the attribute.

Looking closer, I found that the following example still reproduces the problem on master for me:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;                                                                                                                                                         

charset=UT                                                                                                                                                                                                  


F-8">
</head>
<body>
foo
</body>
</html>
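
To see what a tolerant HTML parser hands back for a tag like the one above, here is a minimal sketch using Python's standard `html.parser`. It is not how mod_pagespeed parses HTML; the class name `MetaContentTypeFinder` is made up for illustration, and the trailing spaces from the example are left out. It only shows that the newlines survive inside the attribute value, so whatever copies that value into a response header has to deal with them.

```python
from html.parser import HTMLParser


class MetaContentTypeFinder(HTMLParser):
    """Hypothetical helper: collects the content attribute of
    <meta http-equiv="Content-Type"> tags."""

    def __init__(self):
        super().__init__()
        self.content_values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute names arrive lowercased; values may be None when absent.
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            self.content_values.append(attributes.get("content") or "")


# Same shape as the example above: newlines inside the quoted value.
page = (
    "<html>\n<head>\n"
    '<meta http-equiv="Content-Type" content="text/html;\n\ncharset=UT\n\n\nF-8">\n'
    "</head>\n<body>\nfoo\n</body>\n</html>\n"
)

finder = MetaContentTypeFinder()
finder.feed(page)
finder.close()

value = finder.content_values[0]
print(repr(value))  # the repr makes the embedded newlines visible
```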

@oschaaf
Member

oschaaf commented Oct 27, 2015

For completeness, sample response headers resulting from similar html input:

2015/10/27 19:30:20 [debug] 103935#0: *2 HTTP/1.1 200 OK
Server: nginx/1.9.6
Content-Type: text/html; charset=zwa


hili
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Date: Tue, 27 Oct 2015 18:30:20 GMT
X-Page-Speed: 1.10.0.0-7430
Cache-Control: max-age=0, no-cache
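
A rough sketch of why headers like these confuse a client: an HTTP/1.x header section ends at the first empty line, so a blank line inside a header value cuts the section short. The code below is an assumption-laden illustration, not any browser's real parser. It assumes the newlines inside the value went out as bare LF and that the reader treats bare LF as a line end; the thread does not show the exact bytes on the wire, and `split_head_and_body` is a made-up name.

```python
def split_head_and_body(raw):
    """Split a response at the first empty line, treating bare LF as a line end."""
    lines = raw.replace("\r\n", "\n").split("\n")
    for index, line in enumerate(lines):
        if line == "":
            return lines[:index], lines[index + 1:]
    return lines, []


# Shortened version of the response above; the embedded newlines in the
# Content-Type value are assumed to be bare LF characters.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: nginx/1.9.6\r\n"
    "Content-Type: text/html; charset=zwa\n\n\nhili\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
    "body\r\n"
)

head, rest = split_head_and_body(raw_response)
print(head)  # header lines seen before the first empty line
print(rest)  # everything else, including the remaining "headers"
```

Under these assumptions the header section stops right after the Content-Type line, and Transfer-Encoding ends up in what the reader takes to be the body.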


oschaaf added a commit that referenced this issue Nov 30, 2015
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083
oschaaf added a commit that referenced this issue Dec 1, 2015
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083
jeffkaufman added a commit that referenced this issue Dec 17, 2015
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083

This is Otto's work from #1196
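
The idea in the commit message (reject Content-Type values containing unprintable characters) can be sketched as follows. This is not the actual C++ code of `ResponseHeaders::MergeContentType`; the Python function names, the dict-based headers, and the exact character range (space through `~`) are assumptions made for illustration.

```python
def is_printable_ascii(value):
    """True when every character is a space or a visible ASCII character."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in value)


def merge_content_type(headers, meta_content):
    """Hypothetical helper: copy a meta content value into headers only if safe.

    Returns True when the header was set, False when the value was rejected.
    """
    if not is_printable_ascii(meta_content):
        # Newlines, carriage returns and other control characters end up here.
        return False
    headers["Content-Type"] = meta_content
    return True


headers = {}
accepted = merge_content_type(headers, "text/html; charset=UTF-8")
rejected = merge_content_type({}, "text/html;\n\ncharset=UT\n\n\nF-8")
print(accepted, rejected)
```

With this check, the value from the failing example is dropped and the original response headers are left untouched. Note that this sketch also rejects tabs, which a real implementation might choose to allow.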
@pono pono unassigned oschaaf Jan 8, 2018