Newline in Content-Type meta tag breaks the page #1083

Closed
sachinjsk opened this Issue May 14, 2015 · 5 comments

sachinjsk commented May 14, 2015

What steps will reproduce the problem?

  1. Make a new webpage with the following meta tag:
    <meta http-equiv="Content-Type" content="text/html;

charset=UTF-8">
2. Upload the file to your server with the standard mod_pagespeed
configuration loaded.
3. Open the website in the Chrome browser.

What is the expected output?
You should see the webpage you've just created.

What do you see instead?
This webpage is not available

ERR_CONTENT_DECODING_FAILED

What version of the product are you using (please check X-Mod-Pagespeed
header)?
X-Mod-Pagespeed: 1.9.32.3-4448

On what operating system?
Redhat 5

Which version of Apache?
Apache version 2.2.3-91

Please provide any additional information below, especially a URL or an
HTML file that exhibits the problem.
Here's a demo page - http://www.allesamerika.sbi1b.sitesell.com/cultuur-amerika.html

Contributor

jmarantz commented May 14, 2015

Out of curiosity, how do you manage to get newlines embedded in the
meta-tag string?


Contributor

jmarantz commented May 15, 2015

Because I was curious, I checked the spec. It is not valid HTML to embed
newlines in quoted attribute values, at least by my interpretation of
section 4.5, "Text and character data"
(http://www.w3.org/TR/html-markup/syntax.html#syntax-attribute-value):

"Text in element contents (including in comments) and attribute values must
consist of Unicode characters, with the following restrictions:"

However, I agree that the bug is valid, and we should not generate an
invalid HTTP header just because we were given an invalid HTML attribute.

-Josh



sachinjsk commented May 15, 2015

@jmarantz It does look like the newline is invalid. That site belongs to one of my clients, so I don't know how the newline got there in the first place. I assume one of the HTML editors inserted it.

For now I've disabled the 'convert_meta_tags' filter, and that fixed the issue.

@oschaaf oschaaf self-assigned this Oct 27, 2015

Member

oschaaf commented Oct 27, 2015

I tried to reproduce this on master, but no luck.
Perhaps there has been a fix for this specific example, or it was fixed as a side-effect of an unrelated change. Tracing the code, it parses the mimetype/character set as declared within the attribute.

Looking closer, I found that the following example still breaks; it reproduces the problem on master for me:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;                                                                                                                                                         

charset=UT                                                                                                                                                                                                  


F-8">
</head>
<body>
foo
</body>
</html>
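
To illustrate why an input like the one above is risky, here is a minimal Python sketch (not mod_pagespeed code) that extracts the `content` attribute of the `Content-Type` meta tag with the standard-library `html.parser`. The class name `MetaContentType` and the `PAGE` constant are made up for this illustration, and the trailing spaces of the original example are omitted. The point is only that the attribute value keeps its embedded newlines, so copying it verbatim into a response header is unsafe.

```python
from html.parser import HTMLParser


class MetaContentType(HTMLParser):
    """Records the content attribute of <meta http-equiv="Content-Type">."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-type":
            self.content = attributes.get("content")


# Approximation of the example page above (trailing spaces dropped).
PAGE = (
    "<html>\n<head>\n"
    '<meta http-equiv="Content-Type" content="text/html;\n\n'
    "charset=UT\n\n\nF-8\">\n"
    "</head>\n<body>\nfoo\n</body>\n</html>\n"
)

parser = MetaContentType()
parser.feed(PAGE)
# repr() makes the embedded newlines visible.
print(repr(parser.content))
```

Any filter that promotes this value to an HTTP header needs to validate or normalize it first.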

Member

oschaaf commented Oct 27, 2015

For completeness, sample response headers resulting from similar html input:

2015/10/27 19:30:20 [debug] 103935#0: *2 HTTP/1.1 200 OK
Server: nginx/1.9.6
Content-Type: text/html; charset=zwa


hili
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Date: Tue, 27 Oct 2015 18:30:20 GMT
X-Page-Speed: 1.10.0.0-7430
Cache-Control: max-age=0, no-cache
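
One plausible reading of the headers above is that the blank lines inside the `Content-Type` value end the header block early, so the headers that follow are treated as part of the body. The Python sketch below models that under simplifying assumptions: `naive_response` and `split_response` are hypothetical helpers, not nginx or mod_pagespeed functions, and real clients parse headers more carefully than a split on the first blank line.

```python
def naive_response(content_type):
    """Build a raw response by pasting the value into the header unchecked."""
    return (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: " + content_type + "\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
        "<html>...</html>"
    )


def split_response(raw):
    """Split at the first blank line: (header lines, everything after)."""
    normalized = raw.replace("\r\n", "\n")
    head, _, body = normalized.partition("\n\n")
    return head.split("\n"), body


good_headers, good_body = split_response(naive_response("text/html; charset=UTF-8"))
bad_headers, bad_body = split_response(naive_response("text/html; charset=zwa\n\n\nhili"))

print(good_headers)
print(bad_headers)
print(repr(bad_body))
```

In the second case the header list stops at `Content-Type`, and `Transfer-Encoding: chunked` ends up in what this simplified model considers the body.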


oschaaf added a commit that referenced this issue Nov 30, 2015

convert-meta-tags: don't allow newlines when converting meta tags.
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083
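
The idea described in the commit message can be sketched in Python as follows. This is not the actual `ResponseHeaders::MergeContentType` implementation, which is C++ and applies further rules not modelled here; `is_printable_ascii` and `merge_content_type` are hypothetical names, and treating "unprintable" as anything outside ASCII 0x20 to 0x7E is an assumption made for this example.

```python
def is_printable_ascii(value):
    """True if every character is in the printable ASCII range 0x20-0x7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in value)


def merge_content_type(headers, candidate):
    """Set Content-Type from a meta tag value unless it looks unsafe.

    Returns True if the header was updated, False if the value was rejected.
    """
    if not candidate or not is_printable_ascii(candidate):
        return False  # reject newlines, tabs and other control characters
    headers["Content-Type"] = candidate
    return True


headers = {}
accepted = merge_content_type(headers, "text/html; charset=UTF-8")
rejected = merge_content_type(headers, "text/html;\n\ncharset=UT\n\n\nF-8")
print(accepted, rejected, headers)
```

Rejecting the value outright, rather than stripping the newlines, leaves the original response headers untouched when the meta tag is malformed.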

oschaaf added a commit that referenced this issue Dec 1, 2015

convert-meta-tags: don't allow newlines when converting meta tags.
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083

jeffkaufman added a commit that referenced this issue Dec 17, 2015

convert-meta-tags: don't allow newlines when converting meta tags.
This change makes ResponseHeaders::MergeContentType reject values
containing unprintable characters.

Fixes #1083

This is Otto's work from #1196

@pono pono unassigned oschaaf Jan 8, 2018
