Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problematic XML Sitemap URL - WP_Http says: Operation timed out after 5001 milliseconds with 0 bytes received #643

Closed
raamdev opened this issue Dec 26, 2015 · 12 comments

Comments

@raamdev
Copy link
Contributor

@raamdev raamdev commented Dec 26, 2015

I've seen several reports of WP_Http timing out (after the default timeout value of 5 seconds) and triggering the following Dashboard notice about the Auto-Cache Engine being unable to reach the XML Sitemap:

2015-12-26_08-55-10

At the time I reproduced this on my own server (running ZenCache Pro v151220), the server was under heavier load than usual as it was doing the nightly backup. On the very next page load the error went away.

I've also seen reports of a similar timeout error inside the Auto-Cache Engine log file, related to auto-caching each page.


ZenCache uses the default timeout when calling wp_remote_head() while checking to make sure the XML Sitemap is valid. I suggest we change the default timeout there (and here) to 10 seconds.

Additionally, I suggest that we create a new option that allows a site owner to set the timeout value for WP_Http requests made as part of the Auto-Cache Engine (including those made when caching each page) so that site owners can tweak this depending on their specific server environment.

I would put that new option right underneath the ZenCache → Plugin Options → Auto-Cache Engine → Auto-Cache Delay Timer (in Milliseconds):

2015-12-26_09-01-32

I also suggest we update the Dashboard error message to reference a new KB Article that explains this new section of the Auto-Cache Engine and suggests that site owners try adjusting the timeout value if they see this message and they know their XML Sitemap is valid.


While this is not really a bug, I'm labeling it as such because from a site owners' perspective it will look like a bug (there's no problem with the sitemap, yet ZenCache reports that there is).

@raamdev raamdev added this to the Next Release (Pro) milestone Dec 26, 2015
@raamdev raamdev removed the needs research label Dec 26, 2015
@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 26, 2015

Temporary Workaround

You can use an MU-Plugin to change the default value of all WP_Http requests.

Create the following directory and file:
wp-content/mu-plugins/increase_http_timeout.php

<?php
add_filter( 'http_request_timeout', function() { return 10; /* seconds */ });
@jaswrks
Copy link

@jaswrks jaswrks commented Dec 29, 2015

Additionally, I suggest that we create a new option that allows a site owner to set the timeout value for WP_Http requests made as part of the Auto-Cache Engine (including those made when caching each page) so that site owners can tweak this depending on their specific server environment.

The ACE makes a non-blocking request. So for there to be timeout, that would require the timeout to occur in the connection phase, not in the response phase. For that reason, I don't see the need for a configurable timeout in this particular case.

If it takes longer than 5 seconds for your site to connect to itself, there is a problem.

At the time I reproduced this on my own server (running ZenCache Pro v151220), the server was under heavier load than usual as it was doing the nightly backup. On the very next page load the error went away.

Interesting. I have seen that error also. My feeling is that it's a WP core bug, but I haven't been able to put my finger on it yet. Why it would ever timeout at all is still a mystery to me.

@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 29, 2015

If it takes longer than 5 seconds for your site to connect to itself, there is a problem.

Well then how can we detect such a problem so that the site owner doesn't think that the Auto-Cache Engine is broken? Because that's how it looks right now.

@jaswrks
Copy link

@jaswrks jaswrks commented Dec 29, 2015

Agree. I'm not sure what causes this timeout yet. If it occurred only very rarely then I'd just chalk that up to it being a connection timeout during a time when there was a network connectivity problem. However, from what you are describing and from what I have seen, this occurs more often than I would ever expect it to. We might need to work at debugging the WP_Http class in order to determine why a non-blocking request ever hits a timeout at all under normal conditions.

What should happen is that a connection should be opened, then immediately closed. On the other end, the process was spawned on the server and it will run until caching is done. However long that takes, it takes. ZenCache will have already closed the connection; i.e., it was non-blocking.

@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 30, 2015

@jaswsinc I wonder... could this have something to do with how I have allow_url_fopen=0 on my server...? Could it be that the WP_Http class is using a transport that doesn't support non-blocking requests??

raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
- Optimize autoCacheMaybeClearPhpIniError() with cacheKey()
- Remove `$force` option from autoCacheMaybeClearPhpIniError()
- Remove `$force` option from autoCacheMaybeClearPrimaryXmlSitemapError()
- Simplify how we dismiss `xml_sitemap_missing` and `allow_url_fopen_disabled`

See wpsharks/comet-cache#643
@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 30, 2015

@jaswsinc I just enabled WP_Http API debugging on my raam.org site and traced all requests to raam.org/sitemap.xml. Interestingly, every time I saved the ZenCache plugin options there were 4 calls:

[30-Dec-2015 01:51:33 UTC] Request URL: 'https://raam.org/sitemap.xml'
[30-Dec-2015 01:51:33 UTC] Request URL: 'http://raam.org/sitemap.xml'
[30-Dec-2015 01:51:35 UTC] Request URL: 'https://raam.org/sitemap.xml'
[30-Dec-2015 01:51:35 UTC] Request URL: 'http://raam.org/sitemap.xml'

That was repeated every time I saved the ZenCache plugin options.

Here are the full details of those requests: http://pastebin.com/2chSjQG9

While reviewing the details of the requests, I noticed this:

  'blocking' => true,

Why is blocking true?? Oh, right, because wp_remote_head() isn't being told to use a non-blocking request when checking if the Sitemap is accessible, so it defaults to 'blocking' => true.

And that's probably a good thing, right? We want that specific check to be blocking because we're trying to make sure the sitemap.xml file exists before we proceed with using it.

So, maybe we just need to increase the timeout when checking if the Sitemap is accessible? Increase the timeout there to 30 seconds maybe?


This also makes it clear that we're checking the XML Sitemap too many times (which is likely increasing the probability that we'll see a timeout when the server load is high). (Edit: Fixed in wpsharks/comet-cache-pro@de2736e and wpsharks/comet-cache-pro@da6136d.)

Also, I noticed that since I'm using https:// in my Site URL it appears that multiple requests are happening--one for the https:// version and once for the http:// version. I believe that is caused by this line specifically requesting the non-HTTPS version of the home URL. Why is that? Why can't we just leave the scheme parameter out of that call to network_home_url()?

@jaswrks
Copy link

@jaswrks jaswrks commented Dec 30, 2015

Why is blocking true?? Oh, right, because wp_remote_head() isn't being told to use a non-blocking request when checking if the Sitemap is accessible, so it defaults to 'blocking' => true.

And that's probably a good thing, right? We want that specific check to be blocking because we're trying to make sure the sitemap.xml file exists before we proceed with using it.

Exactly, yes. Blocking is the default methodology, so unless it is changed specifically to use a non-blocking request, any WP_Http wrapper will use a blocking request.

So, maybe we just need to increase the timeout when checking if the Sitemap is accessible? Increase the timeout there to 30 seconds maybe?

30 seconds seems too long. Even 5 seconds is pushing it IMO. In this specific case we are dealing with a blocking request that is performed on admin_init. So forcing a site owner to wait for up to 30 seconds in order to tell them that their sitemap has failed, and then continuing to do that again and again, will become a problem. If the connection were being made asynchronously then I'd agree, but since the connection is blocking and it prevents the page from loading, 5 seconds is plenty IMO.

This also makes it clear that we're checking the XML Sitemap too many times (which is likely increasing the probability that we'll see a timeout when the server load is high). (Edit: Fixed in wpsharks/comet-cache-pro@de2736e and wpsharks/comet-cache-pro@da6136d.)

Cool. Great catch there.

Also, I noticed that since I'm using https:// in my Site URL it appears that multiple requests are happening--one for the https:// version and once for the http:// version. I believe that is caused by this line specifically requesting the non-HTTPS version of the home URL. Why is that? Why can't we just leave the scheme parameter out of that call to network_home_url()?

I agree. The reason I forced http there was because that runs on admin_init. I thought network_home_url() (and home_url()) would default to the current scheme, and that it did not include a check if is_admin(). I just reviewed the source code and I see that it does. So I was mistaken. Looks like we can safely exclude the hard-coded scheme there. Great!

@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 30, 2015

30 seconds seems too long. Even 5 seconds is pushing it IMO. In this specific case we are dealing with a blocking request that is performed on admin_init. So forcing a site owner to wait for up to 30 seconds in order to tell them that their sitemap has failed, and then continuing to do that again and again, will become a problem. If the connection were being made asynchronously then I'd agree, but since the connection is blocking and it prevents the page from loading, 5 seconds is plenty IMO.

I agree. I think what we need to do is add a note to the bottom of the error message when timeouts occur with the Auto-Cache Engine that suggests refreshing the page, and if the error persists to see a related KB article that explains why they might be seeing that error. That article can also suggest temporarily increasing the timeout using a filter. That would give us a way to address scenarios where a site owners server is bogged down enough that they're just passing the default 5 second timeout; they might be able to resolve things by increasing the timeout to 6 or 7 seconds.

raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
There's no need for a ZenCache-specific filter here; we can use the http_request_timeout filter provided by WordPress.

See wpsharks/comet-cache#643
raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 30, 2015

Added note regarding timeout errors when the error message mentions timeout issues:

2015-12-30_11-51-42

raamdev added a commit to wpsharks/comet-cache-pro that referenced this issue Dec 30, 2015
  - If XML Sitemap URL is valid, it is not checked again for at least 1 hour, or until the next time the Auto-Cache Engine runs (every 15 minutes by default).
  - If XML Sitemap URL is invalid, it is checked repeatedly (every `admin_init`) until it becomes valid.
  - If XML Sitemap URL changes, it is checked immediately.
  - If the Auto-Cache Engine encounters an error with the primary XML Sitemap URL while it's running, the error notice is displayed and it is checked repeatedly (every `admin_init`).
  - When saving the plugin options, the XML Sitemap URL is always validated once (assuming the Auto-Cache Engine is enabled and there is an XML Sitemap URL to validate).

See wpsharks/comet-cache#643
@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 30, 2015

@jaswsinc Commit wpsharks/comet-cache-pro@0655c41 reduces repeated attempts to validate XML Sitemap URL once successful by storing the last checked time as a WP Transient:

  • If XML Sitemap URL is valid, it is not checked again for at least 1 hour, or until the next time the Auto-Cache Engine runs (every 15 minutes by default).
  • If XML Sitemap URL is invalid, it is checked repeatedly (every admin_init) until it becomes valid.
  • If XML Sitemap URL changes, it is checked immediately.
  • If the Auto-Cache Engine encounters an error with the primary XML Sitemap URL while it's running, the error notice is displayed and it is checked repeatedly (every admin_init).
  • When saving the plugin options, the XML Sitemap URL is always validated once (assuming the Auto-Cache Engine is enabled and there is an XML Sitemap URL to validate).

I tested the above and it's working great. When the XML Sitemap URL is found to be valid, we don't unnecessarily check it over and over, and when there's an issue with the XML Sitemap URL (such as when an XML Sitemap plugin is not installed), we do check it on every admin_init, until the problem is fixed.

If you have a valid XML Sitemap, there are only two occasions when it will be checked:

  • When saving the plugin options
  • When the Auto-Cache Engine is running (every 15 minutes)

The only scenario where your XML Sitemap could be invalid and you won't be told immediately by ZenCache is when you had a working XML Sitemap and then something happen to it, e.g., you disabled your XML Sitemap plugin. In that scenario, it would take at least 15 minutes (the next Auto-Cache Engine run) for the error to appear (unless you save your ZenCache plugin options; then it would be checked immediately).

My feeling is that this is a good balance between checking the XML Sitemap URL excessively to ensure that we remove the error as soon as the problem is fixed (more important, IMO) and not checking it excessively when it has been proven to be valid (worst case scenario: the site owner doesn't know about the problem for 15 minutes).

@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Dec 31, 2015

Next Pro Release Changelog:

  • Bug Fix: Fixed an Auto-Cache Engine bug that was producing false-positive Dashboard errors related to timeouts: "Problematic XML Sitemap URL - WP_Http says: Operation timed out after 5001 milliseconds with 0 bytes received." Due to the way ZenCache was checking the XML Sitemap URL more than necessary, timeouts were more likely to occur. We now only check the URL repeatedly when a failure has been encountered. If the URL is confirmed as working, we don't check the URL again until the Auto-Cache Engine runs (every 15 minutes by default) or until the Plugin Options are saved. If you are still seeing timeout errors after this update, please see this article. See Issue #643.
  • Bug Fix: Fixed an Auto-Cache Engine bug where ZenCache would would check both the non-SSL (http://) and the SSL (https://) version of the XML Sitemap URL when the Site Address was configured to use SSL. See Issue #643.
@raamdev raamdev closed this Dec 31, 2015
@wpsharks wpsharks locked and limited conversation to collaborators Jan 3, 2016
@raamdev
Copy link
Contributor Author

@raamdev raamdev commented Jan 3, 2016

ZenCache Pro v160103 has been released and includes changes from this GitHub Issue. See the v160103 announcement for further details.


This issue will now be locked to further updates. If you have something to add related to this GitHub Issue, please open a new GitHub Issue and reference this one (#643).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants