Caching support for UnIndex #9
Conversation
```diff
@@ -35,6 +35,7 @@
 from Products.PluginIndexes.common import safe_callable
 from Products.PluginIndexes.common.UnIndex import UnIndex
+from Products.PluginIndexes.common.UnIndex import RequestCache
```
Unused import.
Thanks for the patch! I'm not sure I've entirely grasped how the cache interacts with
Just to refresh my memory, the way the date range index cache worked was:
So this cache can already be wrong in a number of cases when data changes during the same request: first do a query, then change some items in the catalog, then query again for the same term, and the cache might not have been invalidated.

The extra "divide by 10" used to construct the cache key is also a bit odd. I think it was supposed to catch queries in a request that takes a couple of seconds, starting at second 59 and running over into second 01. If something does repeated queries for "now", taking the exact minute as the key would result in cache misses here. It would probably be better to use a relative difference instead and consider cache entries valid if the difference between the cached value and the search term is below some threshold, like 60 seconds.

In order to generalize this concept, it might be better if the getCounter logic were attached to each index, to actually catch all changes to it, not just those done via the ZCatalog API. It might be best to implement such a change counter first in one PR and then let the cache generalization PR build on top of it.
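The relative-difference idea above could be sketched roughly like this (all names here, such as lookup, store, and TOLERANCE, are illustrative, not code from this PR): instead of truncating timestamps to build the cache key, keep entries keyed by their exact search term and accept any cached entry whose term is within a tolerance of the current one.

```python
# Sketch of a threshold-based cache lookup, as suggested above.
# A truncated-key scheme misses when a query straddles a minute
# boundary; comparing relative differences avoids that.

TOLERANCE = 60  # seconds; accept cached entries this close to the term

_cache = {}  # maps exact numeric search term -> cached result

def store(term, result):
    """Remember a query result under its exact search term."""
    _cache[term] = result

def lookup(term):
    """Return a cached result whose term lies within TOLERANCE of
    `term`, or None on a miss."""
    for cached_term, result in _cache.items():
        if abs(cached_term - term) < TOLERANCE:
            return result
    return None
```

A linear scan is fine for a per-request cache with a handful of entries; a sorted structure would be the obvious refinement if the cache grew.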
I have trouble following what is happening in this code (not the pull request per se, but the catalog code in general), but the direction seems good. I agree with the detailed code-quality remarks by Tres and the notes by Hanno.
Thanks for the comments. I agree with Hanno's idea to implement the getCounter logic first in order to get rid of the dependency on Plone. If you agree, I'll create one PR for the getCounter logic and a separate PR for the generalized caching logic.
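A minimal sketch of such a per-index change counter (hypothetical class and method names, not the eventual implementation): every mutation bumps the counter, so a cache can store the counter value alongside a result and treat the entry as stale once the counter has moved on.

```python
class CountingIndex:
    """Illustrative index that tracks every mutation with a counter.

    A request cache can record getCounter()'s value next to a cached
    query result and invalidate the entry when the current counter
    differs, catching all changes to the index, not just those made
    via the ZCatalog API.
    """

    def __init__(self):
        self._counter = 0
        self._data = {}

    def getCounter(self):
        return self._counter

    def index_object(self, docid, value):
        self._data[docid] = value
        self._counter += 1  # any write invalidates cached results

    def unindex_object(self, docid):
        self._data.pop(docid, None)
        self._counter += 1
```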
```diff
@@ -288,7 +275,7 @@ def _apply_index(self, request, resultset=None):
         until = multiunion(self._until.values(term))

         # Total result is bound by resultset
-        if REQUEST is None:
+        if cache is None:
             until = intersection(resultset, until)
```
This is not related to the pull request, but while reviewing it I stumbled over this: we are in a block where it is guaranteed that resultset is None, so why do we call intersection(None, until)?
I noticed the same thing. That block can just be removed.
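For context on why the block is removable: the BTrees set operations used by the catalog treat None as "no filter" and return the other argument unchanged (per the IMerge interface), so intersection(None, until) is simply until. A stdlib-only mimic of that convention, for illustration:

```python
def intersection(c1, c2):
    """Mimics the None convention of the BTrees set operations used
    in the catalog: if either argument is None it stands for "no
    filter" and the other argument is returned unchanged; otherwise
    the common elements are returned."""
    if c1 is None:
        return c2
    if c2 is None:
        return c1
    return set(c1) & set(c2)
```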
PR #12 now implements the generalized caching logic, based on the getCounter logic in the master branch (including the code-quality corrections).
Hi,
I have adopted the caching mechanism of DateRangeIndex in order to improve the performance of catalog queries that repeatedly hit indexes inheriting from UnIndex (e.g. FieldIndex, KeywordIndex, etc.). The performance gain becomes especially noticeable on big sites. In Plone, for example, the indexes review_state, portal_type, allowedRolesAndUsers, etc. are queried repeatedly within a single request.
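The idea can be illustrated with a small request-scoped memoization sketch (hypothetical names; the real implementation works on ZCatalog's index data structures): within one request, repeated queries for the same term hit the cache instead of re-reading the index.

```python
class RequestCache:
    """Illustrative request-scoped cache with hit/miss counters."""

    def __init__(self):
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        return None

    def set(self, key, value):
        self._entries[key] = value

def query(index_data, term, cache):
    """Look up `term` in a mapping of term -> docid set, memoizing
    the result for the rest of the request."""
    key = ("query", term)
    result = cache.get(key)
    if result is None:
        result = index_data.get(term, frozenset())
        cache.set(key, result)
    return result
```

The second and later lookups of a hot term such as review_state then cost a dictionary access instead of an index traversal, which is where the gain on big sites comes from.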
Kind regards
Andreas