Page cache refactor full implementation #302
Comments
From carnav on February 25, 2012 10:02:50 (BTW I said "calling getPagesXmlValues()" without thinking much, :) I'm not sure it's the correct way, I haven't checked...) |
From m...@digimute.com on February 25, 2012 14:53:32 |
From tablatronics on June 26, 2012 15:39:26 |
I'll take this one. |
The following functions need to be updated to use the caching rather than reading all the files in: theme_ functions, template_ functions |
With list_pages_json fixed, the page editor will load faster. [edit: I said get_available_pages() but it's list_pages_json for the page editor] |
Yeah, there are a lot of things this will improve. |
Just updated all the remaining functions to use the caching functions. Can ye guys double-check that everything is working fine? All looks good to me. It's a straight swap of the file reading for the global $pagesArray. |
Looks ok to me, no errors. |
After this change, I found an issue I haven't thought about, with plugins that call A possible fix would be calling Just after: global $pagesArray; insert this: if (!$pagesArray) getPagesXmlValues(); both in |
Wait, why is it not populated? It's a cache, it should be prepopulated before anything might use it, and again after anything changes it. So it's never initialized? It should be globalized in common.php and initialized as soon as possible afterwards. |
Caching started as a plugin, and still works more or less like that. I suppose this also needs some refactoring. Anyway, why should $pagesArray always be available/populated? If a backend page (GS or plugin) is not going to need it, isn't it better to leave out this extra file read, etc.? |
Looks like a typo in the caching functions init code. Firing on the header was not actually doing anything. It now loads $pagesArray , if file doesn't exist it will create it. Can you check again that everything is working fine. |
I see that now, so we have core plugins ? Also we need to look at its performance, say with a site with 500-2000 pages. I do not see a problem with it being loaded all the time, we could probably use it in core for some stuff. Are we not ? pages display ? how we doing that ? |
Caching started off as a plugin but is now fully integrated. It just uses hooks rather than hard-coding the calls. Initial testing of the plugin was done on a site with 2000 pages, each one set as a menu item to ensure each file was read in. As far as I remember, there was about a 70-80% speed increase when using caching compared with reading in the 2000 files each time, and sometimes, depending on the template/functions, those 2000 files were read in multiple times per page refresh. I have an old plugin I'll dig out to create large test sites for this type of testing. Caching should now be used in internal core routines which previously read in the files individually. $pagesArray should now be available and populated for front/backend use at all times. |
Ok so... // load pages cache for frontend
add_action('index-pretemplate','getPagesXmlValues',array('false')); // make $pagesArray available to the theme
// load pages cache for backend
add_action('header', 'getPagesXmlValues',array('false')); Perhaps future speak on Once we have some stats I think this will be the most efficient way to optimize the core. I honestly do not know how it works. Did you write it ? |
Yeah, I wrote it, can't you tell from the coding style... 8) I think loading it straight after the include would work best. It's just legacy that we did it that way, as it was an easy way to add plugins to the core without having to rewrite anything. (Just delete the plugin init code and it should work.) |
So this should be a really good speed improvement now, I am not sure how often those functions were called, but it seems like it could have been significant. excellent. |
Why load pages.xml in admin pages like settings, plugins, files, etc.? IMHO it would be better loading + populating if necessary. if (!$pagesArray) getPagesXmlValues(); after every I'd even suggest to not to load in the frontend by default (changing |
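A minimal sketch of the load-on-demand guard suggested in this thread. The `getPagesXmlValues()` below is a stand-in that fills a hard-coded array and counts its calls (it is not GetSimple's real function), and `page_title_sketch()` is a hypothetical helper used only for illustration:

```php
<?php
// Stand-in for GetSimple's getPagesXmlValues(): ASSUMED behaviour, fills the
// global cache and counts how often the (expensive) load actually runs.
$pagesArray = array();
$loadCount  = 0;

function getPagesXmlValues() {
    global $pagesArray, $loadCount;
    $loadCount++;
    $pagesArray = array(
        'index' => array('url' => 'index', 'title' => 'Home', 'parent' => ''),
        'test'  => array('url' => 'test',  'title' => 'Test', 'parent' => ''),
    );
}

// Hypothetical helper: only loads the cache if nothing has populated it yet.
function page_title_sketch($slug) {
    global $pagesArray;
    if (!$pagesArray) getPagesXmlValues(); // an empty array is falsy in PHP
    return isset($pagesArray[$slug]) ? $pagesArray[$slug]['title'] : null;
}

echo page_title_sketch('test'), "\n";
echo page_title_sketch('index'), "\n";
echo "loads: ", $loadCount, "\n"; // the second call reuses the populated cache
```

With this shape, a backend page that never asks for page data never triggers the read, which is the trade-off being argued for here.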
As for loading caching functions in a hook: a plugin that runs at 'header' and wants to use getPageField, etc. may stop with an error, depending on the order the server uses. (I think this has happened to me with some experiments, but I forgot to report). |
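The ordering concern can be illustrated with a toy hook registry. Everything here is an assumption for illustration: `add_action_sketch()` and `exec_action_sketch()` are simplified stand-ins, not GetSimple's real `add_action()`/`exec_action()`, and the real failure mode may differ (for example an undefined function rather than an empty cache):

```php
<?php
// Toy hook registry: callbacks on the same hook run in registration order.
$hooks = array();

function add_action_sketch($hook, $function, $args = array()) {
    global $hooks;
    $hooks[] = array('hook' => $hook, 'function' => $function, 'args' => $args);
}

function exec_action_sketch($hook) {
    global $hooks;
    foreach ($hooks as $h) {
        if ($h['hook'] === $hook) {
            call_user_func_array($h['function'], $h['args']);
        }
    }
}

$cache = array(); // stands in for $pagesArray
$seen  = array(); // records whether each plugin found the cache populated

function load_cache_sketch() {
    global $cache;
    $cache = array('test' => array('title' => 'Test'));
}

function plugin_reads_cache_sketch($label) {
    global $cache, $seen;
    $seen[$label] = isset($cache['test']);
}

// A plugin registered before the cache loader runs before it on 'header'.
add_action_sketch('header', 'plugin_reads_cache_sketch', array('early'));
add_action_sketch('header', 'load_cache_sketch');
add_action_sketch('header', 'plugin_reads_cache_sketch', array('late'));
exec_action_sketch('header');

var_dump($seen['early']); // bool(false): cache not loaded yet
var_dump($seen['late']);  // bool(true)
```

If plugin load order depends on the server, whether a given plugin ends up "early" or "late" is not something the plugin author controls.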
Yeah well when we do it we should do it right. |
cnb, remember we have a lot of refactoring going on, so there is no reason to make major changes like that when we will probably be refactoring the cache anyway. Can we use pagecache to eliminate some directory reads? For example, updateSlugs does a readdir and then loads the filenames into an array (which it then filters again for some reason, after it was already filtered for .xml files), then it loops over them, looks for parents that changed and updates them. Seems like this could all be done with a cache loop. updateslugs is called
I will add more as I find them |
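As a rough sketch of the "cache loop" idea: instead of scanning the pages directory, children of a renamed page could be found by looping over the cached array. The array layout (slug keys with a `parent` field) and the function name `children_of_sketch()` are assumptions for illustration, not the actual updateSlugs code:

```php
<?php
// ASSUMED cache layout: slug => page fields, including 'parent'.
$pagesArray = array(
    'test'  => array('url' => 'test',  'parent' => ''),
    'test2' => array('url' => 'test2', 'parent' => 'test'),
    'about' => array('url' => 'about', 'parent' => ''),
    'team'  => array('url' => 'team',  'parent' => 'about'),
);

// Hypothetical helper: returns the slugs whose parent is $parentSlug,
// using only the in-memory cache (no readdir, no per-file XML load).
function children_of_sketch(array $pages, $parentSlug) {
    $children = array();
    foreach ($pages as $slug => $page) {
        if ($page['parent'] === $parentSlug) {
            $children[] = $slug;
        }
    }
    return $children;
}

print_r(children_of_sketch($pagesArray, 'test'));
```

Only the pages returned by the loop would then need their files rewritten with the new parent slug.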
One problem I am seeing with this is the same issue we had with plugins: you are loading the cache file after saving the cache file. Some more enhancements we could possibly make: XML objects are navigable using XPath, so when changing something we can modify the XML per node and resave, instead of recreating the entire node tree from an array. But I guess these would require modifying the createpages function to create the XML save and the pages array at the same time. So they are not instant things. |
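A small sketch of the per-node XPath update, using PHP's SimpleXML. The `<channel><item>` layout is an assumption about what pages.xml looks like, and `set_parent_sketch()` is a hypothetical helper; the slug is interpolated into the XPath expression, so it is assumed to be an already-sanitized slug:

```php
<?php
// ASSUMED pages.xml layout, inlined here so the example is self-contained.
$xml = simplexml_load_string(
    '<channel>' .
    '<item><url>test</url><parent></parent></item>' .
    '<item><url>test2</url><parent></parent></item>' .
    '</channel>'
);

// Hypothetical helper: update one node in place instead of rebuilding the tree.
// Returns true if a matching <item> was found and changed.
function set_parent_sketch(SimpleXMLElement $xml, $slug, $parent) {
    $nodes = $xml->xpath('//item[url="' . $slug . '"]');
    if (!$nodes) {
        return false;
    }
    $nodes[0]->parent = $parent; // nodes from xpath() belong to the same document
    return true;
}

$updated = set_parent_sketch($xml, 'test2', 'test');
echo $xml->asXML();
```

After the change, `$xml->asXML($file)` could write the document back without regenerating it from the array.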
Looks like I'll need to restore that hook call at the start of the caching functions for the moment, until I can sort out the flow of things, as it's not working as it should. Test: first create a new plugin called caching_info.php (I use this for looking at the cache):
<?php
/****************************************************
*
* @File: caching_info.php
* @Package: GetSimple
* @Action: Plugin to view caching information
*
*****************************************************/
# get correct id for plugin
$thisfile=basename(__FILE__, ".php");
# register plugin
register_plugin(
$thisfile,
'Caching Info',
'2.3',
'Mike Swan',
'http://www.digimute.com/',
'Pages Caching Info',
'plugins',
'showDebugInfo'
);
add_action('plugins-sidebar','createSideMenu',array($thisfile,'Caching Info')); // add menu entry
function showDebugInfo(){
getPagesXmlValues();
global $pagesArray;
echo '
<style type="text/css">
#load pre code {
display:block;font-size:11px;width:560px;line-height:13px;
white-space: pre-wrap; /* css-3 */
white-space: -moz-pre-wrap !important; /* Mozilla, since 1999 */
white-space: -pre-wrap; /* Opera 4-6 */
white-space: -o-pre-wrap; /* Opera 7 */
word-wrap: break-word; /* Internet Explorer 5.5+ */}
</style>
';
echo "<h3>Caching Info</h3>";
echo "<pre><code>";
print_r($pagesArray);
echo "</code></pre>";
}
?>
then create 2 pages, test & test2
make test2 a child of test.
check the plugin info, parent is not updated.
restore caching_function.php hook to
add_action('header', 'create_pagesxml',array('false'));
and all works as it should . |
Sweet I was thinking we need a unit tester for this |
Probably because there is no hook to catch changedata.php with. You're just recreating the pagesxml every time a backend page is loaded. So it probably catches everything, and is probably definitely wrong. A 'changedata-aftersave' should fix that for now. |
Just a comment: I'm not sure if it's still this way in current dev version. |
What is the purpose of |
I edited it above, that fixes it. |
This same code exists in edit.php same loop. REFACTOR 🔴 |
Ok, great. Another comment, have you seen the 3rd line in that piece of code you've pasted? $parentdata = getXML(GSDATAPAGESPATH . $page['parent'] .'.xml'); 'tis reading the parent data from disk... Should it get it from the global $pagesArray? |
Yup, I forgot to actually search for getxml() as a file read function for refactoring. And again in edit.php. |
As a side note, the $pagesArray variable name was used in GS before caching was implemented. |
ahh, ok. |
BTW the other day I did some tests by patching getXML : function getXML($file) {
$xml = file_get_contents($file);
$data = simplexml_load_string($xml, 'SimpleXMLExtended', LIBXML_NOCDATA);
echo '[read:',$file,']<br />'; // <-- ADDED FOR TESTING
return $data;
} I used this instead of debugLog to also test in the frontend. |
Yeah, my goal is to have all file reads and writes use core functions or classes, so we can profile this stuff more easily. Also, we should be able to modify the cache on demand instead of rescanning all pages every time to check for changes. We will still need a basic check in case people meddle with the files, same as the issues with plugins. I found those issues btw. lol |
refs #345 |
Some commits cherry picked for #603 |
Cherry-picked into v3.3.0 |
Make the page cache protected somehow. Obviously we are already using a global array, so it's a bit late to move to a class. It is very important that the cache is available as an associative array so that it can be cross-indexed by key to sorting arrays and menu arrays. I would like to see the cache always accessed via a wrapper function; this wrapper will perform the following.
|
(previously posted here http://get-simple.info/forums/showthread.php?tid=5520&pid=42315#pid42315) I'd rather like that these functions (getPageContent, getPageField, etc., available since GS 3.1) were not camelCase, but lowercase with underscores, to be consistent with the other template tags. If it wasn't because get_page_content, etc. already have an optional parameter ($echo), I would suggest that they had it but for the slug of other page. An alternative could be an equivalent, like get_other_page_content, get_a_page_content or something (that actually would be an alias for getPageContent) |
(I said)
Another option would be using the first parameter for $echo OR $slug |
Yeah, I was thinking that also. Booleans can be evaluated with ===, can't they? |
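A sketch of the "first parameter is $echo OR $slug" idea, using strict comparison to tell a boolean from a slug string. The function name, the sample data, and the choice to return (not echo) another page's content are all assumptions for illustration:

```php
<?php
// Hypothetical data standing in for the page cache and the current page.
$pages = array('index' => '<p>Home</p>', 'about' => '<p>About</p>');
$currentSlug = 'index';

// Hypothetical tag: a strict boolean means "$echo for the current page",
// anything else is treated as the slug of another page.
function get_page_content_sketch($echoOrSlug = true) {
    global $pages, $currentSlug;
    if ($echoOrSlug === true || $echoOrSlug === false) {
        $content = $pages[$currentSlug];
        if ($echoOrSlug) {
            echo $content;
            return null;
        }
        return $content;
    }
    // ASSUMPTION: content of another page is returned rather than echoed.
    return isset($pages[$echoOrSlug]) ? $pages[$echoOrSlug] : null;
}

$current = get_page_content_sketch(false);   // current page, returned
$other   = get_page_content_sketch('about'); // other page, by slug
```

The strict `===` matters because a loose `==` would treat any non-empty slug string as equal to `true`.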
More refactoring done. All legacy functions are aliases for the most part, aside from the getters. getPagesXmlValues now returns pagecache, so you can avoid using global in functions unless you need to modify it. It is not passed by reference; perhaps this is a good way to prevent altering it when using that as a wrapper, e.g. read-only.
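Why returning by value behaves as read-only: PHP arrays are copied on assignment, so changes to the returned array do not reach the global. A minimal sketch, where `get_page_cache_sketch()` is a hypothetical stand-in for the getter described here:

```php
<?php
$pagesArray = array('test' => array('title' => 'Test'));

// Hypothetical getter: returns the cache by value, not by reference.
function get_page_cache_sketch() {
    global $pagesArray;
    return $pagesArray;
}

$copy = get_page_cache_sketch();
$copy['test']['title'] = 'Changed'; // modifies only the caller's copy

echo $pagesArray['test']['title'], "\n"; // still the original title
```

Code that genuinely needs to modify the cache would still have to use `global $pagesArray` or a dedicated setter.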
|
|
as mentioned above, possible solutions
It should allow individual page updates, updates from objects (to avoid file reads), and update only pages that have changed, using timestamps or something similar (only if newer). Timestamps are the best optimization, as they cover all these cases, but might cause issues with plugins and filters. |
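One way the "only if newer" timestamp check could look, as a sketch. The cache layout (an `mtime` stored per slug) and the function name `changed_slugs_sketch()` are assumptions; the demo builds its own temporary directory so it does not touch a real site:

```php
<?php
// Hypothetical check: which page files are newer than (or missing from) the cache?
// ASSUMED cache layout: slug => array('mtime' => int, ...).
function changed_slugs_sketch($pagesDir, array $cache) {
    clearstatcache(); // make sure filemtime() is not served from PHP's stat cache
    $files = glob($pagesDir . '/*.xml');
    if ($files === false) {
        $files = array();
    }
    $changed = array();
    foreach ($files as $file) {
        $slug  = basename($file, '.xml');
        $mtime = filemtime($file);
        if (!isset($cache[$slug]) || $mtime > $cache[$slug]['mtime']) {
            $changed[] = $slug;
        }
    }
    sort($changed);
    return $changed;
}

// Demo in a throwaway directory.
$dir = sys_get_temp_dir() . '/pagecache_sketch_' . uniqid();
mkdir($dir);
$now = time();
file_put_contents($dir . '/test.xml', '<item/>');
file_put_contents($dir . '/test2.xml', '<item/>');
touch($dir . '/test.xml', $now - 100); // unchanged since it was cached
touch($dir . '/test2.xml', $now);      // edited after it was cached

$cache = array(
    'test'  => array('mtime' => $now - 100),
    'test2' => array('mtime' => $now - 50),
);
$changed = changed_slugs_sketch($dir, $cache);
print_r($changed);

// Clean up the throwaway files.
unlink($dir . '/test.xml');
unlink($dir . '/test2.xml');
rmdir($dir);
```

Only the slugs returned would need to be re-read from disk; pages changed by plugins or filters without touching the file would not be detected, which is the caveat raised here.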
|
Original author: carnav (February 24, 2012 18:36:58)
I just noticed that in template_functions.php, function generate_sitemap() is populating $pagesArray 'the old way' (lines 1039 and 1044-1059) instead of simply calling getPagesXmlValues()
Original issue: http://code.google.com/p/get-simple-cms/issues/detail?id=302