CompositeIndex: Enhancement for Field- and KeywordIndexes (#15)
Add CompositeIndex
andbag authored and hannosch committed Aug 25, 2016
1 parent d76ce90 commit f7e8853
Showing 13 changed files with 1,224 additions and 8 deletions.
512 changes: 512 additions & 0 deletions src/Products/PluginIndexes/CompositeIndex/CompositeIndex.py

Large diffs are not rendered by default.

38 changes: 38 additions & 0 deletions src/Products/PluginIndexes/CompositeIndex/README.txt
@@ -0,0 +1,38 @@
CompositeIndex README

Overview

CompositeIndex is a plugin index for the ZCatalog. An index that
covers more than one attribute of an object is called a “composite
index”. Create such an index if you expect to run queries that
combine multiple attributes in the search phrase, and the
attributes combined yield significantly fewer hits than any of
the attributes alone. The key of a composite index is called a
“composite key” and is composed of two or more attributes of an
object.

Catalog queries containing attributes managed by CompositeIndex
are transparently caught and transformed into a CompositeIndex
query. Large sites that combine several additional indexes
(FieldIndex, KeywordIndex, BooleanIndex) with lots of content
(>100k) benefit most. The expected speed-up for combined index
queries is a factor of roughly 2-3 or more.
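
A query transformation of this kind can be sketched in a few lines
of Python. This is an illustration only, not the code in
CompositeIndex.py: the function name rewrite_query, the composite
index id 'comp01', and the rule that a query is rewritten only when
every component attribute is present are all assumptions.

```python
def rewrite_query(query, component_names, composite_id="comp01"):
    """Collapse the component attributes of a query into one composite key.

    Hypothetical helper: `composite_id` and the all-or-nothing rule are
    assumptions made for this sketch.
    """
    # Leave the query untouched unless every component attribute is present.
    if not all(name in query for name in component_names):
        return dict(query)

    # Keep unrelated parameters (sorting, other indexes) as they are.
    rewritten = {k: v for k, v in query.items() if k not in component_names}

    # The composite key is the tuple of component values in a fixed order.
    rewritten[composite_id] = tuple(query[name] for name in component_names)
    return rewritten


components = ("portal_type", "review_state")
full = {"portal_type": "Document", "review_state": "published",
        "sort_on": "modified"}
partial = {"portal_type": "Document", "sort_on": "modified"}

print(rewrite_query(full, components))
print(rewrite_query(partial, components))
```

The caller keeps writing ordinary attribute queries; only the
rewritten form mentions the composite index.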

For example, many catalog queries in Plone combine the indexed
attributes 'Language', 'review_state', 'portal_type' and
'allowedRolesAndUsers'. Normally, the ZCatalog queries each
corresponding atomic index in sequence and calculates the
intersection of the results. On large sites in particular, this
strategy decreases the performance of the catalog and
simultaneously increases the volatility of ZODB’s object cache,
because each index individually has a high number of hits,
whereas the intersection of the index results has a low number
of hits.

CompositeIndex overcomes this difficulty because it already
contains a pre-calculated intersection by means of its
composite keys. Loading large sets and then computing their
expensive intersection is therefore no longer necessary.
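
The difference between the two strategies can be shown with a
minimal pure-Python sketch. It does not use the ZCatalog API; the
toy documents, the helper names and the use of plain tuples as
composite keys are assumptions made for illustration, and only
single-valued attributes are handled.

```python
from collections import defaultdict

# Toy documents: doc id -> indexed attributes (values are made up).
DOCS = {
    1: {"portal_type": "Document", "review_state": "published"},
    2: {"portal_type": "Document", "review_state": "private"},
    3: {"portal_type": "Event", "review_state": "published"},
    4: {"portal_type": "Document", "review_state": "published"},
}
ATTRS = ("portal_type", "review_state")


def build_atomic_indexes(docs, attrs):
    """One index per attribute: value -> set of doc ids."""
    indexes = {attr: defaultdict(set) for attr in attrs}
    for doc_id, values in docs.items():
        for attr in attrs:
            indexes[attr][values[attr]].add(doc_id)
    return indexes


def build_composite_index(docs, attrs):
    """One index for all attributes: tuple of values -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, values in docs.items():
        index[tuple(values[attr] for attr in attrs)].add(doc_id)
    return index


def query_atomic(indexes, query):
    """Load each per-attribute result set, then intersect them."""
    result = None
    for attr, value in query.items():
        hits = indexes[attr].get(value, set())
        result = hits if result is None else result & hits
    return result or set()


def query_composite(index, attrs, query):
    """Single lookup; the intersection is already encoded in the key."""
    return index.get(tuple(query[attr] for attr in attrs), set())


atomic = build_atomic_indexes(DOCS, ATTRS)
composite = build_composite_index(DOCS, ATTRS)
query = {"portal_type": "Document", "review_state": "published"}
assert query_atomic(atomic, query) == query_composite(composite, ATTRS, query)
```

In the atomic case each intermediate set ('Document' has three
hits, 'published' has three) must be loaded before the
intersection shrinks it to two; the composite lookup touches only
the final set.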

IMPORTANT: CompositeIndex can only be used as an add-on, not as
a replacement for field, keyword and boolean indexes.
1 change: 1 addition & 0 deletions src/Products/PluginIndexes/CompositeIndex/__init__.py
@@ -0,0 +1 @@
# empty comment for winzip and friends
@@ -0,0 +1,56 @@
<dtml-var manage_page_header>

<dtml-var "manage_form_title(this(), _,
form_title='Add CompositeIndex',
)">


<p class="form-help">
<strong>Composite Indexes</strong>. An index
that covers more than one attribute of an object is called a
"composite index". Create such an index if you expect to run
queries that combine multiple attributes in the search phrase, and
the attributes combined yield significantly fewer hits than any
of the attributes alone. The key of a composite index is called a
"composite key" and is composed of two or more attributes of an
object.
</p>


<form action="manage_addCompositeIndex" method="post" enctype="multipart/form-data">
<table cellspacing="0" cellpadding="2" border="0">
<tr>
<td align="left" valign="top">
<div class="form-label">
Id
</div>
</td>
<td align="left" valign="top">
<input type="text" name="id" size="10" />
</td>
</tr>

<tr>
<td align="left" valign="top">
<div class="form-optional">
Type
</div>
</td>
<td align="left" valign="top">
Composite Index
</td>
</tr>
<tr>
<td align="left" valign="top">
</td>
<td align="left" valign="top">
<div class="form-element">
<input class="form-element" type="submit" name="submit"
value=" Add " />
</div>
</td>
</tr>
</table>
</form>

<dtml-var manage_page_footer>
64 changes: 64 additions & 0 deletions src/Products/PluginIndexes/CompositeIndex/dtml/browseIndex.dtml
@@ -0,0 +1,64 @@
<dtml-var manage_page_header>
<dtml-var manage_tabs>

<dtml-call "REQUEST.RESPONSE.setHeader('Content-Type', 'text/html; charset=UTF-8')" >

<p class="form-text">
The index "&dtml-getId;" contains <dtml-var items fmt=collection-length thousands_commas> distinct values
</p>

<dtml-let size="20"> <!-- batch size -->

<div class="form-text">
<dtml-in items previous size=size start=query_start >
<a href="&dtml-URL;?query_start=&dtml-previous-sequence-start-number;">
[Previous <dtml-var previous-sequence-size> entries]
</a>
</dtml-in>
<dtml-in items next size=size start=query_start >
<a href="&dtml-URL;?query_start=&dtml-next-sequence-start-number;">
[Next <dtml-var next-sequence-size> entries]
</a>
</dtml-in>
</div>

<table border="1" align="center" width="100%" class="form-help">

<tr><th>composite key (internally managed by integer hash)</th><th>object path</th></tr>
<dtml-in items start=query_start size=size>
<tr>
<td>
<dtml-if "meta_type in ('DateIndex',)">
<dtml-comment><!--
DateIndexes store dates packed into an integer, unpack
into year, month, day, hour and minute, no seconds and UTC.
--></dtml-comment>
<dtml-var "DateTime(((_['sequence-key'] - 44640) / 535680),
((_['sequence-key'] - 1440) / 44640 ) % 12 or 12,
(_['sequence-key'] / 1440 ) % 31 or 31,
(_['sequence-key'] / 60 ) % 24,
(_['sequence-key'] ) % 60,
0, 'UTC')">
<dtml-else>
&dtml-sequence-key;
</dtml-if>
</td>
<td>
<ul>
<dtml-let v="_['sequence-item']">
<dtml-if "isinstance(v, int)">
<li><a href="<dtml-var "getpath(v)">"><dtml-var "getpath(v)"></a></li>
<dtml-else>
<dtml-in "v.keys()">
<li> <a href="<dtml-var "getpath(_['sequence-item'])">"><dtml-var "getpath(_['sequence-item'])"></a></li>
</dtml-in>
</dtml-if>
</dtml-let>
</ul>
</td>
</tr>
</dtml-in>
</table>
</dtml-let>

<dtml-var manage_page_footer>
@@ -0,0 +1,105 @@
<dtml-var manage_page_header>
<dtml-var manage_tabs>

<p class="form-help">
Objects indexed: <dtml-var numObjects>
<br>
Distinct values: <dtml-var indexSize>
</p>

<!-- form action="manage_addComponent" method="post" enctype="multipart/form-data" -->
<form action="&dtml-URL1;/" name="adminComponents" method="post">

<table cellspacing="0" cellpadding="2" border="1">
<tr><td>&nbsp;</td><td>Component id</td><td>Type</td><td>Indexed attributes (attribute1,attribute2,... or leave empty)</td></tr>
<dtml-if expr="indexSize() == 0">
<dtml-in expr="getIndexComponents()">
<tr>
<td align="left" valign="top">
<input name="del_ids:list" value="<dtml-var "_['sequence-item'].id" >" type="checkbox">
</td>
<td align="left" valign="top">
<input type="hidden" name="components.old_id:records:string" value="<dtml-var "_['sequence-item'].id" >" />
<input type="text" name="components.id:records:string" size="20" value="<dtml-var "_['sequence-item'].id" >" />
</td>
<td align="left" valign="top">
<select name="components.meta_type:records:string">
<option <dtml-if "_['sequence-item'].meta_type=='FieldIndex'">SELECTED</dtml-if> value="FieldIndex">FieldIndex</option>
<option <dtml-if "_['sequence-item'].meta_type=='KeywordIndex'">SELECTED</dtml-if> value="KeywordIndex">KeywordIndex</option>
<option <dtml-if "_['sequence-item'].meta_type=='BooleanIndex'">SELECTED</dtml-if> value="BooleanIndex">BooleanIndex</option>
</select>
</td>
<td align="left" valign="top">
<input type="text" name="components.attributes:records:string" size="60" value="&dtml-rawAttributes;"/><br/>
</td>
</tr>
</dtml-in>
<tr>
<td>&nbsp;</td>
<td colspan="4">
<div class="form-element">
<input class="form-element" type="submit" name="manage_saveComponents:method" value=" Save " />
<input class="form-element" type="submit" name="manage_delComponents:method" value=" Delete " />
</div>
</td>
</tr>
<tr>
<td colspan="5">&nbsp;</td>
</tr>
<tr>
<td>&nbsp;</td>
<td align="left" valign="top">
<input type="text" name="c_id:string" size="20" value="" />
</td>
<td align="left" valign="top">
<select name="c_meta_type:string">
<option>FieldIndex</option>
<option>KeywordIndex</option>
<option>BooleanIndex</option>
</select>
</td>
<td align="left" valign="top">
<input type="text" name="c_attributes:string" size="60" value=""/><br/>
</td>
</tr>
<tr>
<td>&nbsp;</td>
<td colspan="4">
<div class="form-element">
<input class="form-element" type="submit" name="manage_addComponent:method" value=" Add " />
</div>
</td>
</tr>
<dtml-else>

<dtml-in expr="getIndexComponents()">
<tr>
<td align="left" valign="top">
&nbsp;
</td>
<td align="left" valign="top">
&dtml-id;
</td>
<td align="left" valign="top">
&dtml-meta_type;
</td>
<td align="left" valign="top">
&dtml-rawAttributes; &nbsp;
</td>
</tr>
</dtml-in>
</dtml-if>


</table>
</form>
<form action="&dtml-URL1;/" name="fastBuild" method="post">
<p class="form-help">
Build the composite index directly from catalog brains and the corresponding attribute values of the matching field and keyword indexes.
</p>
<input class="form-element" type="submit" name="manage_fastBuild:method" value=" fast build " />

</form>


<dtml-var manage_page_footer>
15 changes: 15 additions & 0 deletions src/Products/PluginIndexes/CompositeIndex/tests/__init__.py
@@ -0,0 +1,15 @@
##############################################################################
#
# Copyright (c) 2003 Zope Foundation and Contributors.
# All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE.
#
##############################################################################

# This file is needed to make this a package.