This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Unexpected tag <head/> inside pages #1405

Closed
michelep opened this issue Sep 27, 2016 · 16 comments

@michelep

michelep commented Sep 27, 2016

PageSpeed current stable, i386

On a provisioning system that echoes an XML file, once mod_pagespeed was activated a <head/> tag was inserted right after the XML declaration:

<?xml version="1.0" encoding="utf-8" ?><head/><settings>

This invalidated the XML file and caused unexpected errors in the system. Disabling mod_pagespeed solves the issue.

@oschaaf
Member

oschaaf commented Sep 27, 2016

@michelep what do the response headers look like for the xml response? Can you post them here?

@jmarantz
Contributor

jmarantz commented Sep 27, 2016

Also: what does the config file look like? For example, in our default configuration template, we have:

# Direct Apache to send all HTML output to the mod_pagespeed
# output handler.
AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html

# If you want mod_pagespeed process XHTML as well, please uncomment this
# line.
# AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER application/xhtml+xml

So mod_pagespeed should not be running on XML at all, only HTML and maybe XHTML.

@michelep
Author

Head with mod_pagespeed enabled:

<?xml version="1.0" encoding="utf-8" ?><head/><settings>

Head without mod_pagespeed:

<?xml version="1.0" encoding="utf-8" ?><settings>

However, I'll try jmarantz's suggestion for the config file, disabling the module for XML documents.

@jmarantz
Contributor

I think Otto was asking for the HTTP response headers, rather than the first few bytes of the response body.

For example, you can type:

wget -S http://yoursite

@michelep
Author

michelep commented Sep 27, 2016

Here they are:

  HTTP/1.1 200 OK
  Date: Tue, 27 Sep 2016 13:41:39 GMT
  Server: Apache/2.4.10 (Debian)
  Expires: Thu, 19 Nov 1981 08:52:00 GMT
  Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
  Pragma: no-cache
  Vary: Accept-Encoding
  Content-Length: 4565
  Content-Type: text/html; charset=UTF-8
  Set-Cookie: php-console-server=5; path=/
  Set-Cookie: PHPSESSID=xxxxxxxxxxxxxxxxxx; path=/
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive

@jmarantz
Contributor

jmarantz commented Sep 27, 2016

You might want to take a quick look at your Apache configuration, and set up an XML content-type when serving XML. That way PageSpeed will know not to treat it as HTML.
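
As a rough illustration of that suggestion, here is a minimal Apache sketch. It assumes the XML is served as static files with a .xml extension, which the thread does not confirm. The Set-Cookie: PHPSESSID header in the response above suggests the output may instead come from a script, in which case the script itself would have to send the Content-Type header.

```apache
# Assumption: XML documents are static files ending in .xml.
# mod_mime then labels them application/xml instead of text/html.
AddType application/xml .xml

# With the default template quoted earlier, the PageSpeed output filter
# is only attached to text/html responses, so these files are skipped.
AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html
```

Running wget -S against the URL again should then show the new Content-Type in the response headers.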

@jeffkaufman
Contributor

Is this a real xml file, processed by something that's expecting xml? If so, serving it with the proper xml content type as @jmarantz says is the right way to do it.

My one caution is that if this is actually XHTML intended to be processed by a browser, switching from an HTML content type to an XML one can turn minor XML violations that the browser would quietly ignore into errors that block the page load.

@jmarantz
Contributor

jmarantz commented Oct 3, 2016

@michelep : did adjusting the content-type served for your XML files solve the problem?

@michelep
Author

michelep commented Oct 3, 2016

@jmarantz yes, and it works perfectly!

@teolaz

teolaz commented Nov 10, 2016

@jeffkaufman Same problem for me...
This morning I spent 4 hours on Stack Overflow trying to find out WHY my wp_ajax actions (AJAX calls in WordPress to the URL wp-admin/admin-ajax.php) had this strange <head/> tag at the beginning of the HTML response... Then I remembered that in production I had installed the Core Rules of PageSpeed, and well... I disabled the add_head filter and everything went back to working properly.

I believe the "Risks" section at https://developers.google.com/speed/pagespeed/module/filter-head-add needs to be updated :D

I'm wondering if I can match all these requests in pagespeed.conf and disable the mod_pagespeed add_head filter only there.

@jmarantz
Contributor

I don't think disabling add_head is the best way to go, because other filters rely on that. Just make sure you are sending an XML content-type when you are sending XML, rather than using content-type:text/html for your XML.

@teolaz

teolaz commented Nov 10, 2016

Eh?
Maybe you didn't read what I wrote, @jmarantz... I had the same problem with AJAX responses, not XML specifically... I would need to remove this filter for all wp-admin/admin-ajax.php requests. Note that WordPress plugins rely on different techniques for returning AJAX responses; the problem seems to happen when the AJAX response is a bare block of HTML (without head and body)... JSON responses are not touched...

@jmarantz
Contributor

jmarantz commented Nov 10, 2016

I see. You are not the first to find this problem. Can you try this workaround:

    ModPagespeedDisallow */wp-admin/*

At one point I think we had that path disallowed in our default settings but felt it was no longer needed, and now I think maybe it is.
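
For context, a sketch of where that directive might sit in pagespeed.conf. The surrounding IfModule block and the narrower, commented-out pattern are assumptions for illustration, not something confirmed in this thread.

```apache
<IfModule pagespeed_module>
    # Workaround quoted above: do not rewrite anything under wp-admin.
    ModPagespeedDisallow "*/wp-admin/*"

    # Hypothetical narrower alternative: skip only the AJAX endpoint
    # and keep rewriting the rest of the WordPress backend.
    # ModPagespeedDisallow "*/wp-admin/admin-ajax.php*"
</IfModule>
```

Disallowing the URL turns off every filter for matching responses, not just add_head, so filters that depend on add_head are not left half-enabled.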

@jmarantz jmarantz reopened this Nov 10, 2016
@jmarantz
Contributor

A possible fix for this problem in the code is to disable all structure-modifying filters in an AJAX response, but allow the filters that modify URLs and minify CSS/JS/HTML.

Currently we have this (in rewrite_query.cc):

  if (request_headers != NULL && request_headers->IsXmlHttpRequest()) {
    if (options_.get() == NULL) {
      options_.reset(factory->NewRewriteOptionsForQuery());
    }
    options_->DisableFiltersRequiringScriptExecution();
    options_->DisableFilter(RewriteOptions::kPrioritizeCriticalCss);
  }

We could add another broad disable to that if-clause.

@teolaz

teolaz commented Nov 10, 2016

@jmarantz I like the first solution... I can try that, but I suggest you ship a new release with that rule set by default... In the end wp-admin (except for AJAX requests) is entirely the WordPress backend, and I don't think there's any need to have JS, CSS and images optimized there...

I can't vouch for the correctness of the second fix, as I'd have to read the source code to understand what you've done :)

@jmarantz
Contributor

I agree. I'm definitely going to put the 'Disallow' back.
