Unexpected tag <head/> inside pages #1405

Closed
michelep opened this Issue Sep 27, 2016 · 16 comments

michelep commented Sep 27, 2016

PageSpeed, current stable release, i386.

On a provisioning system that echoes an XML file, once mod_pagespeed was activated I get a <head/> tag inserted right after the XML declaration:

<?xml version="1.0" encoding="utf-8" ?><head/><settings>

This invalidates the XML file and causes unexpected errors in the system. Disabling mod_pagespeed solves the issue.

Member

oschaaf commented Sep 27, 2016

@michelep what do the response headers look like for the XML response? Can you post them here?

Contributor

jmarantz commented Sep 27, 2016

Also: what does the config file look like? For example, in our default configuration template, we have:

# Direct Apache to send all HTML output to the mod_pagespeed
# output handler.
AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html

# If you want mod_pagespeed process XHTML as well, please uncomment this
# line.
# AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER application/xhtml+xml

So mod_pagespeed should not be running on XML at all, only HTML and maybe XHTML.

michelep commented Sep 27, 2016

Head with mod_pagespeed enabled:

<?xml version="1.0" encoding="utf-8" ?><head/><settings>

Head without mod_pagespeed:

<?xml version="1.0" encoding="utf-8" ?><settings>

However, I'll try @jmarantz's suggestion for the config file and disable the module for XML documents.

Contributor

jmarantz commented Sep 27, 2016

I think Otto was asking for the HTTP response headers, rather than the first few bytes of the response body.

For example, you can type:

wget -S http://yoursite

michelep commented Sep 27, 2016

Here they are:

  HTTP/1.1 200 OK
  Date: Tue, 27 Sep 2016 13:41:39 GMT
  Server: Apache/2.4.10 (Debian)
  Expires: Thu, 19 Nov 1981 08:52:00 GMT
  Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
  Pragma: no-cache
  Vary: Accept-Encoding
  Content-Length: 4565
  Content-Type: text/html; charset=UTF-8
  Set-Cookie: php-console-server=5; path=/
  Set-Cookie: PHPSESSID=xxxxxxxxxxxxxxxxxx; path=/
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive

Contributor

jmarantz commented Sep 27, 2016

You might want to take a quick look at your Apache configuration, and set up an XML content-type when serving XML. That way PageSpeed will know not to treat it as HTML.
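
As a rough illustration of the point above (this is not mod_pagespeed code), the Python sketch below pulls the media type out of a raw header block like the one posted earlier in this thread and checks it against the types the default template sends to the output filter. The function names and the `HTML_TYPES` set are hypothetical; the set assumes only `text/html` is enabled, as in the quoted template.

```python
# Hypothetical helper names; a sketch for reasoning about the headers,
# not part of mod_pagespeed or Apache.

# Assumption: only text/html is routed to the output filter, matching the
# default template quoted earlier (the XHTML line is commented out there).
HTML_TYPES = {"text/html"}


def content_type_of(raw_headers: str) -> str:
    """Return the media type (parameters stripped) from a raw header block."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-type":
            # Drop parameters such as "; charset=UTF-8".
            return value.split(";", 1)[0].strip().lower()
    return ""


def would_be_rewritten(raw_headers: str) -> bool:
    """True if the response is labelled with an HTML content type."""
    return content_type_of(raw_headers) in HTML_TYPES
```

With the headers from this thread (`Content-Type: text/html; charset=UTF-8`) the check returns true, which is consistent with the XML body being treated as HTML; a response labelled `application/xml` would not match.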

Contributor

jeffkaufman commented Sep 29, 2016

Is this a real XML file, processed by something that's expecting XML? If so, serving it with the proper XML content type as @jmarantz says is the right way to do it.

My one caution: if this is actually XHTML intended to be processed by a browser, switching from an HTML content type to an XML one can turn minor XML violations that the browser would quietly ignore into errors that block the page from loading.

Contributor

jmarantz commented Oct 3, 2016

@michelep : did adjusting the content-type served for your XML files solve the problem?

michelep commented Oct 3, 2016

@jmarantz yes, and it works perfectly!

teolaz commented Nov 10, 2016

@jeffkaufman Same problem for me...
This morning I spent 4 hours on Stack Overflow trying to find out WHY my wp_ajax actions (AJAX calls in WordPress to the URL wp-admin/admin-ajax.php) had this strange <head/> tag at the beginning of the HTML response... Then I remembered that in production I had installed the Core Rules of PageSpeed, and well... I disabled the add_head filter and everything went back to working.

I believe the "Risks" section at https://developers.google.com/speed/pagespeed/module/filter-head-add needs to be updated :D

I'm wondering if I can match all these requests in pagespeed.conf and disable the mod_pagespeed add_head filter only there.

Contributor

jmarantz commented Nov 10, 2016

I don't think disabling add_head is the best way to go, because other filters rely on that. Just make sure you are sending an XML content-type when you are sending XML, rather than using content-type:text/html for your XML.

jmarantz closed this Nov 10, 2016

teolaz commented Nov 10, 2016

Eh?
Maybe you didn't read what I wrote, @jmarantz... I had the same problem with AJAX responses, not XML specifically. I would need to remove this filter for all wp-admin/admin-ajax.php requests. Note that WordPress plugins rely on different techniques for returning AJAX responses; the problem seems to happen when the AJAX response is a bare block of HTML (without head and body). JSON responses are not touched.

Contributor

jmarantz commented Nov 10, 2016

I see. You are not the first to find this problem. Can you try this as a workaround:

    ModPagespeedDisallow */wp-admin/*

At one point I think we had that disabled in our default settings but felt it was no longer needed, and now I think maybe it is.
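
To get a feel for which URLs a pattern like `*/wp-admin/*` would cover, here is a small Python approximation using `fnmatch` from the standard library. It is only a stand-in: PageSpeed's own wildcard matcher is not used here, and `fnmatch` also understands `[...]` character sets, which is an extra that should not be assumed of `ModPagespeedDisallow`. The helper name `is_disallowed` and the example URLs are made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# The pattern from the workaround above.
PATTERNS = ["*/wp-admin/*"]


def is_disallowed(url: str, patterns: Iterable[str]) -> bool:
    """Approximate wildcard check: '*' matches any run of characters."""
    # Assumption: fnmatch-style matching is close enough to illustrate
    # which URLs the Disallow pattern would exclude from rewriting.
    return any(fnmatchcase(url, pattern) for pattern in patterns)
```

Under this approximation, `http://example.com/wp-admin/admin-ajax.php?action=foo` matches the pattern while `http://example.com/wp-content/themes/x/style.css` does not, so front-end assets would still be optimized.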

jmarantz reopened this Nov 10, 2016

Contributor

jmarantz commented Nov 10, 2016

A possible fix for this problem in the code is to disable all structure-modifying filters in an ajax response, but allow the filters that modify URLs and minify css/js/html.

Currently we have this (in rewrite_query.cc):

  if (request_headers != NULL && request_headers->IsXmlHttpRequest()) {
    if (options_.get() == NULL) {
      options_.reset(factory->NewRewriteOptionsForQuery());
    }
    options_->DisableFiltersRequiringScriptExecution();
    options_->DisableFilter(RewriteOptions::kPrioritizeCriticalCss);
  }

We could add another broad disable to that if-clause.
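
The idea can be modelled in a few lines of Python. This is a toy model of the proposal, not the C++ change itself: the `STRUCTURE_MODIFYING` grouping is an assumption made for illustration (only `add_head` is known from this thread to alter structure), and detecting AJAX via the `X-Requested-With: XMLHttpRequest` request header is likewise assumed rather than taken from the source.

```python
# Assumption: illustrative grouping, not mod_pagespeed's real filter taxonomy.
STRUCTURE_MODIFYING = {"add_head"}


def is_xml_http_request(request_headers: dict) -> bool:
    """Assumed AJAX check: X-Requested-With equals XMLHttpRequest."""
    for name, value in request_headers.items():
        if name.lower() == "x-requested-with":
            return value.strip().lower() == "xmlhttprequest"
    return False


def effective_filters(enabled: set, request_headers: dict) -> set:
    """Drop structure-modifying filters for AJAX requests, keep the rest."""
    if is_xml_http_request(request_headers):
        return enabled - STRUCTURE_MODIFYING
    return set(enabled)
```

In this model, an AJAX request with `add_head`, `rewrite_css` and `rewrite_javascript` enabled keeps only the two minifying filters, while a normal page request keeps all three.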

teolaz commented Nov 10, 2016

@jmarantz I like the first solution... I can try that, but I suggest you ship a new version with that rule set in it... In the end, wp-admin (except for AJAX requests) is entirely the WordPress backend, and I don't think there's any need to have JS, CSS and images optimized there...

I can't vouch for the correctness of the second fix, as I'd have to read the source code to understand what you've done :)

Contributor

jmarantz commented Nov 10, 2016

I agree. I'm definitely going to put the 'Disallow' back.

jmarantz self-assigned this Nov 10, 2016

jmarantz closed this in c6f01f2 Nov 14, 2016

ashishk-1 added a commit to ashishk-1/mod_pagespeed that referenced this issue Apr 13, 2017