Blank startpage, somehow caused by cache #3111

Closed
mindeffects opened this issue Dec 17, 2010 · 61 comments
Labels
area-core bug The issue in the code or project, which should be addressed.

Comments

@mindeffects
Contributor

mindeffects created Redmine issue ID 3111

I have a strange problem with two different websites running MODx Revo 2.0.4-pl2 and 2.0.5-pl: The start page (id=1) "forgets" the value of "[ [publishedon] ]" from time to time. The field is just empty. I have not checked the actual MySQL value, because the problem does not occur that often (but often enough). This results in the disappearance of the whole website, since MODx thinks that the page has not yet been published and does not create any output. Just a blank white page. And this is funny, because ALL other pages work fine. It's just the start pages (id=1) that are affected.

The bad thing is: I am not alone! Look here: [[http://modxcms.com/forums/index.php/topic,58532.0.html]]

Clearing the cache usually helps, but only if you know that the page is "down" and revive it manually. Since there is nothing, not even a menu, to be found on the blank page, the visitor has no chance to go to one of the other existing pages and leaves, tries again tomorrow, only to get the white page again, and NEVER comes back. And that sucks a lot.

Update: The "white ghost" hit again! This time I got the chance to dig a little in the database and found: nothing! The date fields had the usual entries (and this time a value was visible in the field "publishedon", damn). The only thing I saw was that "created by" had the value "0" (not "1"), which obviously means that this page was created by MODx and not by a user. Anyway, changing this value did not help. In fact, none of the DB manipulations helped.

But: I opened the resource (the white start page), changed something (inserted a blank and removed it again) and hit "save". Since "cacheable" and "empty cache" were set to "on", the cache was refreshed and bam, the site was back to normal!

So, I dare to make a statement here: The problem is the cache!

Some workarounds:

  1. disable the cache for that page (and see if it helps)
  2. disable the system cache (big thing for an error only on one page)
  3. change the "Expiration time for default cache" to get a clean cache after some time, e.g. 12 hours or so.

I went for 3 and will see what happens. On the other website I will do 1.

Man, this is a really annoying bug! Since there does not seem to be any rule, I cannot understand the circumstances under which the error occurs. This is driving me nuts (and the massive snow outside ;-)

I would really like to supply you guys with all the information there is. But "MODX System Info" kinda sucks, because I cannot copy/paste the text because of that massive div "flood". It would be a nice "make bug tracking easier" feature for 2.0.6 just to use text or - now comes the bad word - a table. The law: Only use tables where they make sense! Like in this case...

OK, how can I get you the information?

Browser: Does not matter. Blank page on all Browsers, all OS, all Lifeforms.

Rest: I attached a file with the "system infos" (at last ;-). Maybe this helps.

THANKS A LOT FOR MODx AND YOUR SUPER GREAT WORK!
Oliver

@mindeffects
Contributor Author

mindeffects submitted:

I also have a full DB dump but cannot post it here. If the MODx developers want to have a look at it, just contact me.
Oliver

@opengeek
Member

opengeek submitted:

You are going to have to describe what cache file is getting corrupted. Do you have any SymLinks pointing to the Resource that is coming up blank? Is that Resource also being used as the error_page or unauthorized_page?

@mindeffects
Contributor Author

mindeffects submitted:

Thanks for the reply. The websites have no symlinks, but you are right about the other two things: the start page is also the error_page and the unauthorized_page. Both settings point to id=1 as the resource to call. Can that cause the trouble?

Best
Oliver

@mindeffects
Contributor Author

mindeffects submitted:

Was this bug solved in 2.0.6-pl? I could not find anything in the readme so I guess not.

How can I prevent the websites from going "blind"? Any cache-tweaking possibilities?

cya
Oliver

@mindeffects
Contributor Author

mindeffects submitted:

SORRY, BUT THIS BUG REALLY NEEDS SOME ATTENTION!!!

Today I got a call that the website is "white" again. No sourcecode, just nothing, a complete blank page!

@jason Coward: You were so nice to ask for cached files etc. I now have a full TGZ-Archive including database and filesystem. Is there any way that you could have a look at it? Or do you have an idea how to solve that problem? That would be so AWESOME!!!

Merry Christmas
Oliver

@mindeffects
Contributor Author

mindeffects submitted:

OK, fresh start in 2011.

I am still trying to figure out what causes this catastrophe and I am getting nowhere. Somehow the cache for the "site_start" is affected and delivers a blank page (NOTHING in it, not a single character). Only clearing the cache helps (edit the resource and save it with "Empty Cache" activated). The error does not seem to follow a scheme, or at least none I am aware of.

Is there any information that I can deliver to you that would help? I now have a full TGZ-Archive including database and filesystem. Is there any way one of you could have a look at it? Or do you have an idea how to solve that problem? That would be so AWESOME!!!

THANKS IN ADVANCE!
Oliver

@modxbot
Contributor

modxbot commented Jan 6, 2011

rethrash submitted:

Oliver, did you update to 2.0.6 yet and are able to reproduce? There were many under the hood tweaks that could have silently squashed this one in 2.0.6 wrt how it deals with forwarding/redirects in general.

@mindeffects
Contributor Author

mindeffects submitted:

Yes, I did! I was sooo hoping that 2.0.6-pl would somehow help but it did not.

Are there any PHP settings that could affect the caching? The provider those two sites are hosted at lets me change some of the settings:

Settings                             Status
PHP-Errors                           Inside browser
PHP-RegisterGlobals                  Server default
PHP-mbstring.func_overload           Server default
PHP-allow_url_fopen                  Off
PHP-allow_url_include                Off
PHP-Magic-Quotes-GPC                 Server default
PHP-Zend-ZE1 compatibility           Server default
PHP-Register-Long-Arrays             Server default
PHP-Session-Use-Trans-SID            Off
PHP-Allow-Call-Time-Pass-Reference   Server default
PHP-MySQL-Secure-Login               Server default
PHP Suhosin Executor Allow Symlink   Server default
PHP Suhosin Session Encryption       Server default
PHP Suhosin Mail Protection          1
PHP Suhosin RPG Max Vars             Server default
Set PHP5 extensions                  phtml php5 php4 php3 php
Set CGI extensions                   cgi pl py sh rb
Set SSI extensions
Set directory index                  index.html index.htm index.shtml index.php index.php5 index.wml index.xml

I had to turn off "PHP-Session-Use-Trans-SID" because Google indexed the pages with the session appended to the filename. Could this perhaps conflict with the caching?

@modxbot
Contributor

modxbot commented Jan 7, 2011

dj13 submitted:

Oliver Haase wrote:

OK, fresh start in 2011.

I am still trying to figure out what causes this catastrophe and I am getting nowhere. Somehow the cache for the "site_start" is affected and delivers a blank page (NOTHING in it, not a single character). Only clearing the cache helps (edit the resource and save it with "Empty Cache" activated). The error does not seem to follow a scheme, or at least none I am aware of.

Is there any information that I can deliver to you that would help? I now have a full TGZ-Archive including database and filesystem. Is there any way one of you could have a look at it? Or do you have an idea how to solve that problem? That would be so AWESOME!!!

THANKS IN ADVANCE!
Oliver

Same BUG for me. From time to time just a blank home page. Absolutely nothing. Clearing the cache helps, but that is not a solution.

@mindeffects
Contributor Author

mindeffects submitted:

Same BUG for me. From time to time just a blank home page. Absolutely nothing. Clearing the cache helps, but that is not a solution.

Hey, another hit! HOORAY! But of course I also feel sorry for your trouble.

One inconvenient workaround that slows things down quite a lot: disable the cache just for the start page. You will have to wait some extra seconds (3-4 on a shared hosting server) but at least the site stays reachable. The other pages can still have their caches turned on.

Hope some brain can help us with this issue.

@mindeffects
Contributor Author

mindeffects submitted:

Jason Coward wrote:

You are going to have to describe what cache file is getting corrupted. Do you have any SymLinks pointing to the Resource that is coming up blank? Is that Resource also being used as the error_page or unauthorized_page?

Hey, Jason! Do you perhaps have some more detailed clues about what could be happening to the cache? I am stuck, and others too:
[[http://modxcms.com/forums/index.php/topic,58532.msg334103.html#msg334103]]
[[http://bugs.modx.com/issues/3111#note-9]]

An excerpt of my error log:
[[http://modxcms.com/forums/index.php/topic,59414.msg338265.html#msg338265]]

This bug is nasty.

@mindeffects
Contributor Author

mindeffects submitted:

HELLO? Anybody? I now have 3 websites that produce a white start page! With 2 different providers! So now it's clear that MODx is the problem. I will have to STOP ALL MY PROJECTS if this bug does not get investigated!

Dear developers, please tell me what you need to fix this! I have, again, a full backup of the files and the DB!

I NEED A RESPONSE! PLEASE!!!

THANKS!
Oliver

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

Which webhosts are you able to reproduce this on? Is there a consistent way to make it trigger this behavior?

Also it will be quite helpful to have your full version information for Apache, PHP and MySQL here as well. Possibly some other things that Jason or others will request.

@opengeek
Member

opengeek submitted:

I cannot reproduce this—you will have to provide more detailed information so I can reproduce the problem, starting with environment information as Ryan suggests.

@mindeffects
Contributor Author

mindeffects submitted:

Here is the full phpinfo(): [[http://ellen.pandorafilm.com/phpinfo.php.html]]

You can have a look at the "problem" site at:

START (blank) [[http://ellen.pandorafilm.com/]]

Subpage (OK) [[http://www.ellen.pandorafilm.com/press-dossier.html]]

Do you have any ideas what's causing this?

THANK YOU SOOOOO MUCH IN ADVANCE!
Oliver

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

I saw this previously on a dedicated server with an older version of Revo. When we updated to 2.0.5 and updated our PHP version, this issue has not reoccurred. Revo is a sophisticated application that takes advantage of much that PHP has to offer, and gets more efficient with each release. Since Revo uses much of the available methods in PHP, more than many other projects, it may be that newer releases are addressing the bug that results in this behavior.

Seeing that you're running 5.2.14, and support for the 5.2 branch has ended (http://www.php.net/archive/2010.php#id2010-12-16-1), and there's a major security vulnerability with versions lower than 5.2.17 and 5.3.5, I would highly suggest moving to 5.3.5.

Also you might consider increasing your memory limit above 40MB, say to 128MB and see if it reoccurs.

I'm not claiming it will fix it but it's worth a shot.

@opengeek
Member

opengeek submitted:

This has nothing to do with PHP versions AFAIK; more likely something to do with this www.ellen. vs. ellen. subdomain thing you linked to there. You can't have a different hostname for the start page than a subpage in the site. ???

@mindeffects
Contributor Author

mindeffects submitted:

The error occurs also on sites without subdomains. I attached some system settings of "ELLEN". I did not set "http_host" and "site_url". Can this cause the problems?

On a multidomain installation I use 3 subdomains and there I don't have the cache trouble. I attached the setting of "HOWL", which is one of those subdomains.

Thanks
Oliver

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

You might try setting the unavailable/error/unauthorized pages to independent pages rather than site start for grins as well. Again, don't know if it will have an effect but worth a shot.

How long does it typically take to reproduce the blank page failure?

@mindeffects
Contributor Author

mindeffects submitted:

I have absolutely no idea when the error occurs. There just is no "typical" trigger that I have detected yet. There must be one, but those sites are so different. No clue. I cannot force the error to happen. That's too bad. Would be great for testing.

One thing they had in common is the shared "unavailable/error/unauthorized" page. This is always page id=1. So I changed this for one of the sites by creating one "error" page for those 3 (unavailable/error/unauthorized), which is a weblink to page id=1. Would that be OK? Since I turned off the cache for the start page (which makes the "first contact" really slow!) I cannot say whether this helped. I cannot risk another customer's call about a "blank page".

best
Oliver

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

If it can't be reproduced, it can't be fixed. We're not able to reproduce the issue, and you've disabled caching on site start so it won't resurface. We'll have to defer this until a demo environment can be set up that makes this issue reproducible.

@mindeffects
Contributor Author

mindeffects submitted:

Demo environment? I have one here with a frozen start page [[http://ellen.pandorafilm.com/]] and a working subpage [[http://ellen.pandorafilm.com/press-dossier.html]].

You want full access? Manager, ftp, database? No problem. Just tell me how to get the passcodes to you. You shall have anything I can offer to catch this bug! You find my contact details on [[http://www.mindeffects.de]]

Hope to hear from you soon
Oliver

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

Oliver, please open a support ticket by emailing help@modx.com with the confidential details to access things and we'll see what we can do.

@mindeffects
Contributor Author

mindeffects submitted:

I just saw the site is up again. Were you able to find anything that leads to something like a reason for this start page blanking?

@mindeffects
Contributor Author

mindeffects submitted:

Maybe this is a good approach: [[http://modxcms.com/forums/index.php/topic,58532.msg339546.html#msg339546]]

"Tried http://www.ellen.pandorafilm.com/index.php?id=1.0 and it worked instantly"

"tried id 2,3,4 all working ,tried some arbitrary numbers and got blank pages just like id 1, that's when I suspect 1 as an integer might somehow being mishandled by routing in situations and tried 1.0 instead."

There are pages with the ids 2,3,4! Maybe that's what is happening: the "1" might get misinterpreted under some special circumstances?

@modxbot
Contributor

modxbot commented Jan 12, 2011

rethrash submitted:

Oliver we've not started digging extensively. We will need to create a test script that can reproduce the blanking by hammering on the server with a series of valid and invalid requests, and testing the standard home page for content. First we have to reproduce...
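For what it's worth, a test script of the kind described here could look roughly like the Python sketch below. It is only a sketch under assumptions: the names `http_fetch` and `hammer`, the choice of paths, and the byte threshold for "the home page has content" are all made up for illustration and are not part of MODx or of any existing test suite.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(host, path, timeout=10):
    """Fetch http://host/path and return (status, body); 4xx/5xx do not raise."""
    try:
        with urllib.request.urlopen(f"http://{host}{path}", timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; status and body are still useful here.
        return err.code, err.read()


def hammer(hosts, paths, rounds, min_home_bytes, fetch=http_fetch):
    """Fire all host/path combinations in parallel, then check the home page.

    Returns the 1-based round in which some host first served a home page
    shorter than min_home_bytes (an assumed threshold), or None otherwise.
    """
    jobs = [(host, path) for host in hosts for path in paths]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        for round_no in range(1, rounds + 1):
            # Simultaneous requests, valid and invalid, against every hostname.
            list(pool.map(lambda job: fetch(*job), jobs))
            for host in hosts:
                _status, body = fetch(host, "/")
                if len(body) < min_home_bytes:
                    return round_no
    return None
```

A run against a test site might be `hammer(["example.test", "www.example.test"], ["/robots.txt", "/no-such-page.html"], rounds=500, min_home_bytes=1000)`, where the host names are placeholders. The `fetch` parameter exists so the loop can be exercised with a fake fetcher before pointing it at a real server.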

@mindeffects
Contributor Author

mindeffects submitted:

YEEEEEEEEESSSSSSSSSSSSS!!! I have found a pattern! And: A quick work around which is SOOOOOO simple!!!

First, let's have a look at the web server's log files (I deleted some columns):

IP               DATE / TIME            REQUEST                   ERROR  SIZE  DOMAIN
95.108.151.244   [18/Nov/2010:03:53:14  GET /robots.txt HTTP/1.1  404    5700  my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:14  GET /robots.txt HTTP/1.1  404    5704  www.my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:16  GET / HTTP/1.1            200     435  my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:16  GET / HTTP/1.1            200     435  www.my-modx-domain.de

or here

                                                                          
IP               DATE / TIME            REQUEST                   ERROR  SIZE  DOMAIN
95.108.150.235   [08/Dec/2010:06:42:15  GET /robots.txt HTTP/1.1  404    5703  www.my-modx-domain.de
95.108.150.235   [08/Dec/2010:06:42:15  GET /robots.txt HTTP/1.1  404    5699  my-modx-domain.de
95.108.150.235   [08/Dec/2010:06:42:19  GET / HTTP/1.1            200     435  my-modx-domain.de
95.108.150.235   [08/Dec/2010:06:42:20  GET / HTTP/1.1            200     435  www.my-modx-domain.de

or here

                                                                          
IP               DATE / TIME            REQUEST                   ERROR  SIZE  DOMAIN
208.115.111.250  [29/Jan/2011:19:11:31  GET /robots.txt HTTP/1.1  404    9782  my-modx-domain2.com
208.115.111.250  [29/Jan/2011:19:11:31  GET /robots.txt HTTP/1.1  404    9786  www.my-modx-domain2.com
64.34.218.178    [29/Jan/2011:19:21:04  GET / HTTP/1.1            200     399  my-modx-domain2.com
64.34.218.178    [29/Jan/2011:19:21:05  GET / HTTP/1.1            200     399  www.my-modx-domain2.com

and many more times.

Always with the same pattern:
An external website (a crawler) tries to get the "/robots.txt" which is not part of the standard MODx installation and gets an error 404. This happens at the SAME TIME for the domain WITH and WITHOUT the WWW. Since the error page is by default the id 1 the crawler is redirected to the start page where something in the cache goes wrong. The size of the resulting data drops from about 5600 bytes (or about 9700) to about 400 bytes: The request delivers only the header without any data. From now on every visitor of the startpage gets the "blank white screen of death".
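To make it easier to look for this pattern in other logs, here is a small Python sketch. It assumes the trimmed column layout shown above (IP, date/time, request, status, size, domain); a full Apache log would need a different parser. The function names and the 1000-byte cut-off for "looks blank" are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    ip: str
    timestamp: str
    path: str
    status: int
    size: int
    domain: str


def parse_line(line):
    """Parse one line of the trimmed log format; return None if it does not fit."""
    parts = line.split()
    if len(parts) != 8 or not parts[5].isdigit() or not parts[6].isdigit():
        return None
    ip, stamp, _method, path, _proto, status, size, domain = parts
    return Hit(ip, stamp.lstrip("["), path, int(status), int(size), domain)


def blank_home_after_robots_404(lines, max_blank_bytes=1000):
    """Return home page hits that look blank and follow a robots.txt 404 on the same domain."""
    robots_404_domains = set()
    suspicious = []
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue  # header rows, blank lines, anything unparseable
        if hit.path == "/robots.txt" and hit.status == 404:
            robots_404_domains.add(hit.domain)
        elif (hit.path == "/" and hit.status == 200
              and hit.size < max_blank_bytes
              and hit.domain in robots_404_domains):
            suspicious.append(hit)
    return suspicious


# The first excerpt from above, used as sample input.
SAMPLE_LOG = """\
IP               DATE / TIME            REQUEST                   ERROR  SIZE  DOMAIN
95.108.151.244   [18/Nov/2010:03:53:14  GET /robots.txt HTTP/1.1  404    5700  my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:14  GET /robots.txt HTTP/1.1  404    5704  www.my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:16  GET / HTTP/1.1            200     435  my-modx-domain.de
95.108.151.244   [18/Nov/2010:03:53:16  GET / HTTP/1.1            200     435  www.my-modx-domain.de
"""

if __name__ == "__main__":
    for hit in blank_home_after_robots_404(SAMPLE_LOG.splitlines()):
        print(hit.timestamp, hit.domain, hit.size)
```

On the sample excerpt this flags both 435-byte responses for "/". The check matches by domain rather than by IP because in the third excerpt the robots.txt request and the blank home page come from different addresses.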

A first step to avoid that problem would be the creation of the "/robots.txt". Crawler finds file, crawler is happy, everybody is happy! (Perhaps "/robots.txt" could be included in the future MODx releases?!)

Second, using a different error page than the start page should also work, but I don't know whether the error page then gets blanked if "/robots.txt" is requested at the same time in two places (with and without the "www"). Still, the default id 1 for all those "special" pages can cause trouble in some cases (as seen above).

Finally: Perhaps the MODx caching algorithm can be modified to be prepared for these rare (crawler) situations. That would make MODx even more robust, and the user can decide whether he wants a "/robots.txt" or whether his error page is identical to the start page. Many of my customers like that "error redirection" A LOT since they make error-free products and do not want to have errors on their website.

  • MODX is a content management platform that lets you fit the tool to your website, not the other way around.
  • MODX gives you total creative freedom. Build any site you imagine—without compromise.

We all LOVE MODx!!!

@mindeffects
Contributor Author

mindeffects submitted:

Hi Jason,

OK, Mr Euphoria is a little sad again. Today yet another white screen happened BUT: I thought about it, and "www.domain.com" AND "domain.com" leading to the same page may also be something the cache does not like!?

This might help inside the ".htaccess", redirecting every not "www." to the page with the "www.":

RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

For a subdomain it would be stripping the "www":

RewriteCond %{HTTP_HOST} !^subdomain\.domain\.com$
RewriteRule ^(.*)$ http://subdomain.domain.com/$1 [L,R=301]
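To double-check which requests such a rule would send where, the logic can be modelled in a few lines of Python. This is only a rough model of the second rule (every host that is not the canonical one gets a 301), not an Apache emulation; the function name `canonical_redirect` is made up, and the case-insensitive host comparison is an assumption of the sketch.

```python
def canonical_redirect(host, path, canonical_host="www.domain.com"):
    """Return the 301 target for a request, or None if the host is already canonical.

    Rough model of the rewrite rule: any host other than canonical_host is
    redirected to the same path on canonical_host. Hosts are compared
    case-insensitively here, which is an assumption of this sketch.
    """
    if host.lower() == canonical_host.lower():
        return None
    return f"http://{canonical_host}/{path.lstrip('/')}"
```

For example, `canonical_redirect("domain.com", "/press-dossier.html")` gives `http://www.domain.com/press-dossier.html`, while a request that already uses `www.domain.com` is left alone, so every page ends up with exactly one hostname.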

Do I limit the features of MODx in some way by that?

If not these lines SHOULD BE INCLUDED into the standard "ht.access" at least as a comment (starting with "#") to avoid the trouble I have with my clients.

A quote from my client : "If this does not fix the white screen problem ultimately, we will drop MODx!"
This would have severe consequences for me, hopefully not including a lawsuit. They would kick my ass!

Any reply from you is very welcome!

'till soon
Oliver

@mindeffects
Contributor Author

mindeffects submitted:

Ryan Thrash wrote:

Oliver we've not started digging extensively. We will need to create a test script that can reproduce the blanking by hammering on the server with a series of valid and invalid requests, and testing the standard home page for content. First we have to reproduce...

OK, Ryan, do you NOW have enough material to start the investigation? I found a pattern, some other flaws and would now love to hear ANYTHING from the MODx people.

I don't want to sound too harsh but this "white screen" thing is REALLY SERIOUS! After my last phone call with my client I am one step away from being sued for damages. I need a solution NOW! And I am not the only one with this cache problem. PLEASE HELP ME!!! Read my postings and comment on the "robots.txt" and the ".htaccess" ideas. PLEASE!

@mindeffects
Contributor Author

mindeffects submitted:

Jason Coward wrote:

This has nothing to do with PHP versions AFAIK; more likely something to do with this www.ellen. vs. ellen. subdomain thing you linked to there. You can't have a different hostname for the start page than a subpage in the site. ???

Maybe I overlooked your post. Your idea sounds good! Looking at post #28: do you think that would help?

@opengeek
Member

opengeek commented Feb 8, 2011

opengeek submitted:

Oliver Haase wrote:

Jason Coward wrote:

This has nothing to do with PHP versions AFAIK; more likely something to do with this www.ellen. vs. ellen. subdomain thing you linked to there. You can't have a different hostname for the start page than a subpage in the site. ???

Maybe I overlooked your post. Your idea sounds good! Looking at post #28: do you think that would help?

I definitely don't leave my sites accessible from multiple domains since the same user could access it from different domains and have completely different sessions—plus I just prefer consistency. And it is in the default ht.access that is shipped with the product. I'm assuming you have created a robots.txt to avoid the 404 issue? And made sure that you have assigned the error_page and unauthorized_page to Resources other than the site_start?

Definitely need to know more about the hosting provider (shared hosting? dedicated? virtual private?) at this point if all of those things have been done and you are still experiencing a problem. This sure smacks of intermittent resource shortages if not related to setting the error/unauthorized_page the same as the site_start. Also might want to start investigating every Snippet and/or Plugin associated with the site_start Resource to rule those out as the culprit.

@opengeek
Member

opengeek commented Feb 8, 2011

opengeek submitted:

Also, have you tried making the site_start non-cacheable for now to avoid the issue? If you still get blank views occasionally that way, then it's definitely either a Snippet/Plugin or some kind of temporary resource shortage from the host server.

@modxbot
Contributor

modxbot commented Feb 9, 2011

rethrash submitted:

Which hosting plan is this at your webhost?

@modxbot
Contributor

modxbot commented Feb 9, 2011

rethrash submitted:

Being that it appears to be caused by aggressive search spiders, the choice of hosting plans is very important, given that shared hosts sometimes are over-provisioned and this can lead to databases timing out, php not being available, and apache connections dying randomly. In comment http://bugs.modx.com/issues/3111#note-10 above you seem to indicate this is on a shared hosting plan ...

@mindeffects
Contributor Author

mindeffects submitted:

Thanks for answering so fast! Cool! And THANKS FOR YOUR HELP!!!

Jason Coward wrote:

I definitely don't leave my sites accessible from multiple domains since the same user could access it from different domains and have completely different sessions—plus I just prefer consistency. And it is in the default ht.access that is shipped with the product.

OK, I found it. Slightly different syntax. I now use the version of the htaccess rules included with MODx. Just to be sure.

I'm assuming you have created a robots.txt to avoid the 404 issue? And made sure that you have assigned the error_page and unauthorized_page to Resources other than the site_start?

robots.txt: YES
ErrorPg: Yes
UnAuthPg: Yes (Can this be the same as the ErrorPg?)

Definitely need to know more about the hosting provider (shared hosting? dedicated? virtual private?) at this point if all of those things have been done and you are still experiencing a problem. This sure smacks of intermittent resource shortages if not related to setting the error/unauthorized_page the same as the site_start.

Shared hosting @ http://www.hosteurop.de/ "WebPack L" (sorry, no English version)

Also might want to start investigating every Snippet and/or Plugin associated with the site_start Resource to rule those out as the culprit.

Nothing too special:

[[!setlocaleDe?]] = 
[[!If? &subject=`[[*slideshow]]` &operator=`==` &operand=`on` &then=`[[$slideshow-nav]]` ]]
[[!If? &subject=`[[*slideshow]]` &operator=`!=` &operand=`on` &then=`

[[*longtitle]]

` &else=`

[[*longtitle]]

`]] [[getResourceField? &id=`22` &field=`longtitle` &processTV=`0`]] [[getResourceField? &id=`22` &field=`teaser` &processTV=`1`]] [[getResourceField? &id=`24` &field=`longtitle` &processTV=`0`]] [[getResourceField? &id=`24` &field=`teaser` &processTV=`1`]] [[getResourceField? &id=`25` &field=`longtitle` &processTV=`0`]] [[getResourceField? &id=`25` &field=`teaser` &processTV=`1`]] [[Wayfinder? &startId=`3` &selfClass=`self`]] [[!$MODxSpeed]]

Also, have you tried making the site_start non-cacheable for now to avoid the issue? If you still get blank views occasionally that way, then it's definitely either a Snippet/Plugin or some kind of temporary resource shortage from the host server.

Tried that but since it is a shared hoster, it was kinda slow (2-3sec) during the main business hours. I turned the startpage cache off for now.

Do you have an idea why (when the page has blanked) "index?id=1.0" works and "index?id=1" fails? Also, only the start page is affected; none of the other pages is.

@opengeek
Member

opengeek commented Feb 9, 2011

opengeek submitted:

Can I ask why all the separate getResourceField calls instead of one getResources call that collects that data from those three resources and outputs it in a simple chunk tpl?

@mindeffects
Contributor Author

mindeffects submitted:

Jason Coward wrote:

Can I ask why all the separate getResourceField calls instead of one getResources call that collects that data from those three resources and outputs it in a simple chunk tpl?

Because I did not know that this was possible: "getResourceField is a simple snippet which can be used to display a single field, including template variables, of a different resource for MODx Revolution."

If this is possible for multiple fields and different resources at the same time, that would be awesome (and would speed things up)! How could that be done? http://rtfm.modx.com/display/ADDON/getResourceField does not tell too much.

@mindeffects
Contributor Author

mindeffects submitted:

Jason Coward wrote:

Can I ask why all the separate getResourceField calls instead of one getResources call that collects that data from those three resources and outputs it in a simple chunk tpl?

OK, stupid me. You wrote "getResources" NOT "getResourceField". Good point. I changed that and gained a little more speed. And elegance. Thanks! Still learning.

@mindeffects
Contributor Author

mindeffects submitted:

Ryan Thrash wrote:

Being that it appears to be caused by aggressive search spiders, the choice of hosting plans is very important, given that shared hosts sometimes are over-provisioned and this can lead to databases timing out, php not being available, and apache connections dying randomly. In comment http://bugs.modx.com/issues/3111#note-10 above you seem to indicate this is on a shared hosting plan ...

Yes, a shared hosting plan: HostEurope.de with a "Webpack L". Works like a charm.

And again, yes, there is always a bot near the blanking of the start page. Stupid bots damaging my website! But why is only the start page affected, not any of the subpages?

@modxbot
Contributor

modxbot commented Feb 9, 2011

rethrash submitted:

You almost certainly would not experience this issue with a more robust server that is not sporadically, and unpredictably, resource starved. A VPS with a guaranteed minimum amount of resources would be an infinitely better choice for Revo.

Shared servers by their nature can have dozens, hundreds or more(!) websites running on the same piece of hardware, using the same disk, network, processor and memory. Usually it doesn't matter, but sometimes a hog of a site is added to a server and things start falling apart. Things could have been smooth sailing for months or years and then they go horribly awry. It's why years ago I ceased using shared accounts for anything but testing and non-critical sites: it's not a matter of if there will be an outage but when and how frequently, even with the best of shared hosts. I'm not claiming that's what's happening in this case but it would not surprise me at all.

You'll wind up spending dozens of (expensive) hours trying to trace down the source of the problem that a few extra dollars a month would make a non-issue. I don't think you'll ever be able to pinpoint the cause. In fact it wouldn't surprise me if the search spiders were hitting many sites on the same shared server at once and starving the site in general. Microsoft's Bing spiders were banned for a while from the MODX website itself on a very robust 16GB RAM/8-processor dedicated server for similar reasons.

The most likely reason it's affecting just the home page is because that's where spiders start. Once the starved resources strike, and the page blanks, the spiders have no links to follow so they simply go away.

I really don't think there's much more that we can do right now until the caching overhaul is complete (no ETA yet), and I'm not sure that would help in this situation. I'm 99% sure it's simply an occasionally overloaded server.

@modxbot
Contributor

modxbot commented Feb 9, 2011

danny_kay1710 submitted:

I also experience this bug... My start page, error page and unauthorised page are different. I also have, and have always had, a robots.txt.

I just had to clear the cache this morning to get it to display the home page.

We use a local SEO/hosting firm which has its own dedicated server. The server we are on hosts 39 domains with only 12 active websites, which in terms of shared hosting isn't too bad. All 12 sites (except ours) are basically managed by the company, with client access to the CMS set up at most.

@mindeffects
Contributor Author

mindeffects submitted:

Ryan Thrash wrote:

You almost certainly would not experience this issue with a more robust server that is not sporadically, and unpredictably, resource starved. A VPS with a guaranteed minimum amount of resources would be an infinitely better choice for Revo.

Yes, but NONE of my customers has their own web server for their CMS because they did not need one (until now?). Shall I tell them "Well, you could use this super cool CMS MODx instead of your Joomla crap, but you will have to migrate to your own server machine for that."? These guys are not Amazon or EBAY. They run small companies, make some culture things, do private stuff or whatever. They are the majority of users all over the world!

Daniel (http://bugs.modx.com/issues/3111#note-44) seems to have a pretty decent system facing the same "white screen" problem.

The most likely reason it's affecting just the home page is because that's where spiders start. Once the starved resources strike, and the page blanks, the spiders have no links to follow, so they simply go away.

And why does MODx remain in a "blanked state"? If the spider treats the system so badly that it quits, why doesn't it get up on its feet after they are gone? The server is running, everything is working fine. "Only" the start page died over a cache hiccup. It's not that the whole system was beaten up and lies around crying for help. It's only the start page! The manager is up and running, all subpages are happy to serve every visitor, even on a crappy shared hoster.

I really don't think there's much more that we can do right now until the caching overhaul is complete (no ETA yet), and I'm not sure that would help in this situation. I'm 99% sure it's simply an occasionally overloaded server.

"occasionally overloaded server" is OK. But not his staying in memento, thinking of the hard times he had when the spiders came by!

Perhaps a quick cache self-diagnosis for MODx would be possible as a "fix for the moment". Perhaps Michael (lo9on) can help with his killer-script to narrow things down.

@modxbot
Contributor

modxbot commented Feb 9, 2011

danny_kay1710 submitted:

I know clearing the cache fixes the problem and temporarily disabling the cache on the page may help.

But I highly doubt it's due to spiders. Our site is consistently having spiders crawl the site and it is only every now and again that the page goes blank.

A spider requires links to continue its crawl. If the page is blank it has no links to continue to crawl - but this is assuming the page is broken before the spider gets to the home page. If it breaks the home page but actually manages to get itself a normal result first then it should continue without any problems, and then the question is: why doesn't it break the other pages?

If all spiders broke your home page and went away it would be almost impossible to rank in any search engines.... however this is simply not the case. Does your log include a user agent string so we can identify which bot it is?

EDIT: Sorry should have re-read through everything before asking that - I have seen the user agent string now...

@mindeffects
Contributor Author

mindeffects submitted:

Daniel Kay wrote:

EDIT: Sorry should have re-read through everything before asking that - I have seen the user agent string now...

@daniel: There is some more stuff to be found in the forum (starting at page 3): [[http://modxcms.com/forums/index.php/topic,58532.40.html]]

@modxbot
Contributor

modxbot commented Feb 9, 2011

rethrash submitted:

Daniel Kay wrote:

We use a local SEO/hosting firm which has its own dedicated server. This server we are on hosts 39 domains with only 12 active websites which in terms of shared hosting isn't too bad. All 12 sites (except ours) are basically managed by the company with client access to the CMS set up at most really.

I do not have any additional information about what version of software you're running or other information about the server configuration; we had occasional blank pages on a server running 2.0.x, which ceased being an issue once it was upgraded to 2.0.5 along with the latest version of PHP. Further, I've got an old P4 dedicated server lying around with just 2GB RAM that 12 sites getting hit by an overly zealous spider would bring to its knees.

Oliver Haase wrote:

Yes, but NONE of my customers has their own web server for their CMS because they did not need one (until now?). Shall I tell them "Well, you could use this super cool CMS MODx instead of your Joomla crap, but you will have to migrate to your own server machine for that."? These guys are not Amazon or EBAY. They run small companies, make some culture things, do private stuff or whatever. They are the majority of users all over the world!

No, I'm saying someone should, even if that someone is you. Share it amongst several clients if cost is the concern. This would then put you at a known starting point, without other sites over which you have no control being involved. And yes, Revo may require more resources, but if it makes sense in that it provides more control over the output and lets you build exactly what you want, I would hope it's worth a $10/month premium in hosting plans. Not to mention that for many custom marketing sites you can likely build out the site faster and save hundreds of dollars in up-front investment.

To really diagnose what's going on on a shared server, you'd have to have a lot of knowledge about the entire OS stack, and access to everything on the box. You'll never get that access. Upgrading from your shared L package to a managed VPS would cost you 10-20 more Euros a month. Would that be worth it not to have to deal with this type of headache? Since you're not a server admin as described above this makes a ton of sense. (Don't run email on the box, either ... let Google or a hosted Exchange provider handle the load.)

@modxbot
Contributor

modxbot commented Feb 9, 2011

danny_kay1710 submitted:

Ryan Thrash wrote:

I do not have any additional information about what version of software you're running or other information about the server configuration; we had occasional blank pages on a server running 2.0.x, which ceased being an issue once it was upgraded to 2.0.5 along with the latest version of PHP. Further, I've got an old P4 dedicated server lying around with just 2GB RAM that 12 sites getting hit by an overly zealous spider would bring to its knees.

That is fair enough, I will enquire as to the exact specification of the server.
I am running MySQL 5.2.16 with the latest version of ModX (more details to come when I get them). If it is of any relevance, I started with the first release version of ModX and it has been upgraded to every version since then. The installation has been moved once... is there anything I could have missed in this process that could be causing it?

Also I never noticed it whilst in development or live until I added Google Analytics... is there a potential link here?

However, I wouldn't expect an overloaded web server to be completely breaking pages and/or its cache systems, regardless of how many times the load spiked.

Sites that are essentially under a denial of service attack after traffic spikes from the likes of Digg or Slashdot don't suddenly have their home page hidden by their application framework until an admin manually logs in and clears a cache, and a spider being a little overzealous is nothing in comparison.

@modxbot
Contributor

modxbot commented Feb 9, 2011

dmhufford submitted:

I would just like to report that as of late I have been experiencing this issue as well. I have experienced it regardless of version. I believe it started with 2.0.5-pl and I'm now up to 2.0.7-pl and it just happened again today.

I'm currently on a shared host that is running PHP 5.2.9, so given the information here that PHP should be 5.3.5 or up, I'm hoping that I can get that updated. If not I might need to jump up to a VPS package.

Clearing the cache fixes this every time. I have a robots.txt so that shouldn't be the issue.

Temporarily I've turned off caching for the start page, and am going to see if I can get my PHP version updated somehow or another.

I noticed Daniel just now mentioned Google Analytics. I recently started using that as well on this site, and I don't recall the issue before then.

MODx version: 2.0.7-pl
Database version: 5.0.77
PHP Version: 5.2.9
System OS: Linux (kernel 2.6.18-164)
PHP Memory Limit: 128MB

@modxbot
Contributor

modxbot commented Feb 10, 2011

danny_kay1710 submitted:

The server my site is running on is an 8 Core Xeon@2.00GHz with 16GB RAM. It is more than capable of the load it is running.

Again the big question surrounding all of this is why just the start page. The error pages are all directed elsewhere and have been tested to ensure the setting is actually applying.

Surely if it was a bug in that version of PHP then it would occur on all pages, not just one? A bug in PHP that at an unforeseen time completely prevents only a certain page in a single application framework from working seems just a little far-fetched to me.

I am happy to provide any more information that you require. Please just ask.

@cyclissmo

cyclissmo submitted:

@ryan: Did you get my forum PM? I posted a technique to trigger the white screen. I didn't think it would be prudent to have a script in the open that could take some Revo sites offline. Let me know how I can help.

@mindeffects
Contributor Author

mindeffects submitted:

Mike Zeballos wrote:

@ryan: Did you get my forum PM? I posted a technique to trigger the white screen. I didn't think it would be prudent to have a script in the open that could take some Revo sites offline. Let me know how I can help.

@mike: I would also love to have your script, since my own script did not manage to trigger the white screen :-( I just have to make sure, that this one client of mine will not get blanked again and that I did all to prevent it!
You find my e-mail contact at www.mindeffects.de. Thanks in advance!
Oliver

@Greex

Greex commented Feb 13, 2011

greex submitted:

Hello from another German user,

First: After working with Wordpress, Drupal, Typo3, Contao ... ModX is the BEST CMS I ever worked with and I already infected some other website workers with the ModX virus. Even my customers are very happy with the manager ... but this error here is able to ruin all expectations.

I think I have the same thing here. I have to say that I already changed the whole server because of this error. The new server has a totally different setup than the first one, but the same error occurs:

Without any visible reason, the start page went from >6KB to 462 bytes and threw a 500 server error. All other pages and the manager are fine. After deleting the cache, the start page is back again.
A few days ago, I disabled caching for the start page. But the next day my customer called me to say that another site was gone. He was able to delete the cache himself, but he was not happy :/
So I enabled the cache again and set up an "Is-Alive" script for the start page.
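For anyone wanting something similar, here is a minimal Python sketch of such an "Is-Alive" check. It is not the script mentioned above (that one was not posted); the function names and the 1000-byte threshold are assumptions, picked only because the broken responses reported in this ticket were a few hundred bytes while the healthy start page was around 6 KB.

```python
import urllib.error
import urllib.request


def looks_alive(status: int, body: bytes, min_bytes: int = 1000) -> bool:
    """Return True when the response looks like a real page.

    min_bytes is an assumed threshold, not a MODX setting: the blank or
    error responses in this ticket were well under 1 KB.
    """
    return status == 200 and len(body) >= min_bytes


def check_start_page(url: str, timeout: float = 10.0) -> bool:
    """Fetch the start page and report whether it looks healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return looks_alive(resp.status, resp.read())
    except (urllib.error.URLError, OSError):
        # HTTP 5xx responses raise HTTPError, a subclass of URLError;
        # timeouts and connection failures also land here.
        return False
```

Run from cron every few minutes, a False result could trigger a mail or a cache clear. What to do on failure is left out on purpose, since that depends on the host.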

To say something more about the setup:

  • Domain: http://www.sparkassen-muensterland-giro.de/
  • ModX Revo 2.0.7-pl // Same problem was from 2.0.5
  • Ubuntu 10.04, apache2, Plesk 10.01 // Old Server had another linux distribution, no Plesk
  • PHP Version 5.3.2-1ubuntu4.7 // Old Server had php 5.26
  • Memory Limit 128 MB
  • PDO Driver MySQL 5.1.41
  • No APC or other extra caches
  • phpinfo for the new server here: http://radreisen.elektrodampf.de/infome.php (another testdomain, but the same server)
  • phpinfo for the old server here: http://radreisen.rad-net.de/infome.php
  • Custom 404-page with caching disabled
  • Only www.domain.de is possible, not domain.de without www
  • robots.txt available
  • There are no warnings, errors or notices in other logs. Only in the access log I can see what happened.
  • The only packages that I use in the startpage are wayfinder, getresources and getpage.
  • There is no time, when the error "normally" occurs. Luckily often in the night so that I can fix it without making my customer too angry, but there is no time-pattern I can see.

This website is the only one on this "new" server. It's a dedicated root server with power for a lot more sites than this small one.
Something more: The site moved from a very old CMS after 6 years and there was a decision not to move all files, URLs and images to the new site. So I have a lot of 404/301 responses, especially from bot visits.

So the access_log looks like this:

38.99.96.89 - - [13/Feb/2011:00:35:53 +0100] "GET /index.php?ref=nf&pgID_Newsticker=1&menuid=366 HTTP/1.1" 200 6114 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
38.99.96.89 - - [13/Feb/2011:00:36:01 +0100] "GET /index.php?ref=nf&pgID_Newsticker=2&menuid=366 HTTP/1.1" 200 6111 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
66.249.72.105 - - [13/Feb/2011:00:36:01 +0100] "GET /?newsid=331&rss=1&menuid=366&page=97 HTTP/1.1" 200 5845 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
38.99.96.89 - - [13/Feb/2011:00:36:09 +0100] "GET /index.php?ref=nf&pgID_Newsticker=3&menuid=366 HTTP/1.1" 500 462 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"

The requests to /index.php?xyz are a heritage from the old CMS (Super-SEO work ... :/ )
First request: Fine
Second request: Fine
Third request: Fine
Fourth request, from the same bot as the 1st and 2nd: Boom!
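To find these moments in a larger log without reading it by eye, a small Python sketch along these lines may help. It assumes the Apache combined log format shown in the lines above; the regular expression and the function name are illustrative only and were not part of the original report.

```python
import re

# Matches the Apache "combined" log format used in the excerpts above.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def first_server_error(lines):
    """Return (previous_entry, error_entry) for the first 5xx response.

    previous_entry is the last parsable request before the failure (or
    None if the failure is the first entry). Returns None when no 5xx
    response is found. Lines that do not match the format are skipped.
    """
    previous = None
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        entry = match.groupdict()
        if entry["status"].startswith("5"):
            return previous, entry
        previous = entry
    return None
```

Called as `first_server_error(open("access_log"))`, it gives the failing request together with the request just before it, which is the pair of interest when looking for concurrent hits on the same page.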

Here another example from two days earlier:

217.231.92.66 - - [11/Feb/2011:11:30:46 +0100] "GET / HTTP/1.1" 200 6185 "http://www.cycling-cup.de/index.php?id=24" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C)"
66.249.68.169 - - [11/Feb/2011:11:30:46 +0100] "GET /?home=www.dee...d3.txt%3F&pgID_Newsticker=2&page=49 HTTP/1.1" 200 6099 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[... a lot of images out of a newsletter ...]
94.216.238.117 - - [11/Feb/2011:11:32:46 +0100] "GET / HTTP/1.0" 500 411 "-" "-" <-- a "is alive" call from a small script that i wrote to react as quickly as possible
80.153.229.243 - - [11/Feb/2011:11:33:51 +0100] "GET / HTTP/1.1" 500 365 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.5; .NET CLR 1.1.4322)"

I am not sure, but in my logs the problem only seems to occur after a bot visit. This may just be because the site does not have much traffic at the moment and bots come by more often than real users, but perhaps it's an idea.

I would be so glad if you could look at this. I have a good relationship with my customer and his happiness with the manager is still high enough that he gives me some time to fix this problem. But I'm not sure how long this will be OK, and I have not been getting much sleep these last few days.

Sorry for my poor english.

Greetings from Germany,

Sebastian

@opengeek
Member

opengeek submitted:

There are some significant cache refactorings now in the develop branch for the 2.1.0 release, and more on the way, that should help avoid any potential conflicts involving file locking and reduce the chances that these blank pages are being caused by the caching system itself. I'll create a build in the next few hours with these latest changes for testing if anyone wants to see if it resolves their issues, and will be working on documentation for best practices in developing caching strategies/configurations based on various deployment profiles.

@esnyder

esnyder commented Feb 19, 2011

esnyder submitted:

I have a better workaround than making the homepage uncacheable, but it only works if your homepage can be served equivalently as a static HTML file.

Simply copy the HTML source code for the homepage to a text file, and call it index-static.html.

Then add this line to your .htaccess right before the friendly URL rewrite rule:

RewriteRule ^$ index-static.html

This rewrites (note that it's a rewrite not a redirect, that's important) requests for root to the static file, while still allowing MODx to serve requests for all other pages. Requests for root will load fast, and the rest of the site can be served fast from the cache.

Keep in mind that you'll need to update index-static.html whenever you make changes that affect the content of the homepage.
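If updating the file by hand gets tedious, the refresh could be scripted. Below is a minimal Python sketch, assuming you already have the freshly rendered homepage HTML as bytes; the function name and the 1000-byte sanity threshold are my own assumptions. It refuses to overwrite the static copy with a suspiciously small page, and it writes to a temporary file and renames it so a request never sees a half-written file.

```python
import os
import tempfile


def update_static_copy(html: bytes, dest: str, min_bytes: int = 1000) -> bool:
    """Atomically replace dest with html; return False if html looks broken.

    min_bytes is an assumed sanity threshold so that a blank or error
    page never replaces a good static copy.
    """
    if len(html) < min_bytes:
        return False
    directory = os.path.dirname(os.path.abspath(dest))
    # The temp file must live in the same directory so that the rename
    # stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(html)
        # mkstemp creates the file as 0600; make it readable for the
        # web server before it goes live.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

The HTML could come from a fetch of the homepage through a URL that bypasses the rewrite rule, for example `index.php?id=1`, though that depends on how your site is set up.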

By the way, I'm not actually seeing this bug. I implemented the above workaround because I can't afford to have my homepage go down even for a few minutes, and its content very rarely changes anyway.

@opengeek
Member

opengeek submitted:

Alright, I believe this mystery has now been solved. The problem turns out to be that the process of caching Resources is not properly checking to make sure the Resource wasn't loaded from the cache before caching it. This means the cache file was being re-written unnecessarily even when it was successfully read from the cache. This easily triggers race conditions since the file is being written by almost every request for a specific Resource, especially the site_start, and especially if used as the error_page as well.

You can see commit details for the fix applied to 2.0.7-pl at 82ad456 and a 2.0.8-pl will be released shortly to address this critical production bug.

If you are experiencing this bug and can apply this fix manually to confirm it does resolve the problem, please do and report back in this ticket.
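To make the failure mode described above concrete, here is a small Python sketch of the corrected pattern. It is not the MODX code (that is PHP; see the commit referenced above for the real fix), and the class and function names are made up for illustration. The point is simply that a cache hit returns without writing anything, so the cache file is only written on a miss, and that write goes through a temporary file and a rename.

```python
import os
import tempfile


class FileCache:
    """Tiny file-backed cache, used only to illustrate the pattern."""

    def __init__(self, directory: str):
        self.directory = directory
        self.writes = 0  # counts cache-file writes, for demonstration

    def _path(self, key: str) -> str:
        return os.path.join(self.directory, key + ".cache")

    def read(self, key: str):
        try:
            with open(self._path(key), "rb") as fh:
                data = fh.read()
        except FileNotFoundError:
            return None
        # An empty file is treated as a miss rather than as a blank page.
        return data or None

    def write(self, key: str, data: bytes) -> None:
        # Write to a temp file, then rename, so readers never see a
        # partially written cache file.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp_path, self._path(key))
        self.writes += 1


def load_resource(cache: FileCache, key: str, render) -> bytes:
    """Return the cached output, rendering and caching only on a miss."""
    cached = cache.read(key)
    if cached is not None:
        return cached  # cache hit: do NOT write the file again
    fresh = render()
    cache.write(key, fresh)
    return fresh
```

With the buggy behaviour, every request for the start page would have ended in `cache.write(...)` whether or not the read succeeded, so concurrent requests were constantly rewriting the same file.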

@opengeek
Member

opengeek submitted:

Marking this resolved; this is addressed in 2.0.8-pl, and I have not gotten any feedback on the problem still occurring. Will not close until 2.1.0-rc-1 is released, however.

@meezyart

meezyart commented Mar 6, 2012

meezyart submitted:

I would just like to report that as of late I have been experiencing this issue as well. I'm running the newest version of MODX and the client is at his wits' end. Is there a firm solution for this before we lose a client?

MODX Revolution 2.2.0-pl2 (traditional)
php version: 5.2.17

White screen on the start page and occasionally the other pages too.

@kenquad

kenquad commented Mar 7, 2012

kenquad submitted:

Have you tried disabling cache sitewide as a stopgap fix?

@opengeek
Member

opengeek commented Mar 7, 2012

opengeek submitted:

Please, DO NOT update the target version on CLOSED tickets. This bug was resolved in MODX 2.1. If you have a bug that you think exhibits similar behavior to this ticket, enter a new ticket and reference the closed ticket.

This issue was closed.