Very unstable at the moment. Working on the autoincrement option for models.

commit 381862ec76381f428c33e06fe015c5b682c1a786 (parent cde03b9)
quantmind authored
2  .coveragerc
@@ -1,3 +1,3 @@
[run]
source = stdnet
-omit = stdnet/apps/searchengine/processors/*,stdnet/utils/fallbacks/*
+omit = stdnet/apps/searchengine/processors/metaphone.py,stdnet/apps/searchengine/processors/porter.py,stdnet/utils/fallbacks/*
1  docs/source/examples/index.rst
@@ -25,6 +25,7 @@ available in :mod:`stdnet`.
tutorial
query
+ sorting
tutorial2
performance
timeseries
75 docs/source/examples/sorting.rst
@@ -0,0 +1,75 @@
+
+.. _sorting:
+
+=======================
+Sorting and Ordering
+=======================
+Stdnet can sort instances of a model in three different ways:
+
+* :ref:`Explicit sorting <explicit-sorting>` using the
+ :attr:`stdnet.orm.Query.sort_by` method.
+* :ref:`Implicit sorting <implicit-sorting>` via the
+ :attr:`stdnet.orm.Metaclass.ordering` attribute of the model metaclass.
+* :ref:`Incremental sorting <incremental-sorting>`, a variant of implicit
+ sorting for models which need to keep track of how many times instances
+ with the same id are created.
+
+
+.. _explicit-sorting:
+
+Explicit Sorting
+=======================
+
+Sorting is usually achieved by using the :meth:`stdnet.orm.query.QuerySet.sort_by`
+method with a field name as parameter. Let's consider the following model::
+
+ class SportActivity(orm.StdModel):
+ person = orm.SymbolField()
+ activity = orm.SymbolField()
+ dt = orm.DateTimeField()
+
+
+To obtain a sorted query on dates for a given person::
+
+ SportActivity.objects.filter(person='pippo').sort_by('-dt')
+
+The negative sign in front of ``dt`` indicates descending order.
+
+
+.. _implicit-sorting:
+
+Implicit Sorting
+===================
+
+Implicit sorting is achieved by setting the ``ordering`` attribute in the model Meta class.
+Let's consider the following Log model example::
+
+ class Log(orm.StdModel):
+ '''A database log entry'''
+ timestamp = orm.DateTimeField(default=datetime.now)
+ level = orm.SymbolField()
+ msg = orm.CharField()
+ source = orm.CharField()
+ host = orm.CharField()
+ user = orm.SymbolField(required=False)
+ client = orm.CharField()
+
+ class Meta:
+ ordering = '-timestamp'
+
+It makes sense for log entries to always be sorted in descending
+order with respect to the ``timestamp`` field.
+With implicit ordering, querysets are always returned in this order, without
+the need to call the ``sort_by`` method.
+
+.. note:: Implicit sorting is much faster than explicit sorting, since no
+ sorting step (an ``O(N log N)`` operation) is involved. Instead, the order
+ is maintained by using sorted sets as indices rather than sets.
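
Once the ``ordering`` attribute is set, ordinary queries come back already
sorted. A minimal sketch, assuming the ``Log`` model above is registered with
a backend::

    # entries come back newest first, no call to sort_by needed
    recent_errors = Log.objects.filter(level='error')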
+
+
+.. _incremental-sorting:
+
+Incremental Sorting
+========================
+
76 docs/source/examples/tutorial.rst
@@ -91,6 +91,16 @@ The only difference is the ``prices`` :class:`stdnet.orm.ListField`
in the ``Instrument`` model which is
not available in a traditional relational database.
+The metaclass
+~~~~~~~~~~~~~~~~~~~~~~~
+The ``Position`` model specifies a ``Meta`` class with an ``ordering`` attribute.
+When provided, as in this case, the Meta class fields are used by the ``orm``
+to customize the construction of the :class:`stdnet.orm.Metaclass` for the model.
+Here we instruct the mapper to manage the ``Position`` model
+as ordered with respect to the :class:`stdnet.orm.DateField` ``dt``
+in descending order. Check the :ref:`sorting <sorting>`
+documentation for more details on ordering and sorting.
+
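A minimal sketch of that declaration, assuming only the ``dt`` field mentioned
above (the remaining ``Position`` fields are omitted)::

    class Position(orm.StdModel):
        # ... other fields of the tutorial model ...
        dt = orm.DateField()

        class Meta:
            ordering = '-dt'
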
Registering Models
================================
@@ -149,69 +159,3 @@ Here's an example::
'EUR'
>>> b.description
''
-
-
-
-.. _sorting:
-
-Sorting
-==================
-Since version 0.6.0, stdnet provides sorting using two different ways:
-
-* Explicit sorting using the :attr:`stdnet.orm.query.QuerySet.sort_by` attribute
- of a queryset.
-* Implicit sorting via the :attr:`stdnet.orm.Meta.ordering` attribute of
- the model metaclass.
-
-
-Explicit Sorting
-~~~~~~~~~~~~~~~~~~~~
-
-Sorting is usually achieved by using the :meth:`stdnet.orm.query.QuerySet.sort_by`
-method with a field name as parameter. Lets consider the following model::
-
- class SportActivity(orm.StdNet):
- person = orm.SymbolField()
- activity = orm.SymbolField()
- dt = orm.DateTimeField()
-
-
-To obtained a sorted query on dates for a given person::
-
- SportActivity.objects.filter(person='pippo').sort_by('-dt')
-
-The negative sign in front of ``dt`` indicates descending order.
-
-.. _implicit-sorting:
-
-Implicit Sorting
-~~~~~~~~~~~~~~~~~~~~
-
-Implicit sorting is achieved by setting the ``ordering`` attribute in the model Meta class.
-Let's consider the following Log model example::
-
- class Log(orm.StdModel):
- '''A database log entry'''
- timestamp = orm.DateTimeField(default=datetime.now)
- level = orm.SymbolField()
- msg = orm.CharField()
- source = orm.CharField()
- host = orm.CharField()
- user = orm.SymbolField(required=False)
- client = orm.CharField()
-
- class Meta:
- ordering = '-timestamp'
-
-It makes lots of sense to have the log entries always sorted in a descending
-order with respect to the ``timestamp`` field.
-This solution always returns querysets in this order, without the need to
-call ``sort_by`` method.
-
-.. note:: Implicit sorting is a much faster solution than explicit sorting,
- since there is no sorting step involved (which is a ``N log(N)``
- time complexity algorithm). Instead, the order is maintained by using
- sorted sets as indices rather than sets.
-
-
-.. _django: http://www.djangoproject.com/
8 docs/source/model/models.rst
@@ -54,6 +54,14 @@ Query
.. automethod:: __init__
+autoincrement
+~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: autoincrement
+ :members:
+ :member-order: bysource
+
+
.. _model-structures:
Data Structures
63 stdnet/apps/searchengine/__init__.py
@@ -42,54 +42,7 @@
from stdnet.utils import to_string, iteritems
from .models import Word, WordItem, Tag
-from .ignore import STOP_WORDS, PUNCTUATION_CHARS
-from .processors.metaphone import dm as double_metaphone
-from .processors.porter import PorterStemmer
-
-
-class stopwords:
-
- def __init__(self, stp):
- self.stp = stp
-
- def __call__(self, words):
- stp = self.stp
- for word in words:
- if word not in stp:
- yield word
-
-
-def metaphone_processor(words):
- '''Double metaphone word processor.'''
- for word in words:
- for w in double_metaphone(word):
- if w:
- w = w.strip()
- if w:
- yield w
-
-
-def tolerant_metaphone_processor(words):
- '''Double metaphone word processor slightly modified so that when no
-words are returned by the algorithm, the original word is returned.'''
- for word in words:
- r = 0
- for w in double_metaphone(word):
- if w:
- w = w.strip()
- if w:
- r += 1
- yield w
- if not r:
- yield word
-
-
-def stemming_processor(words):
- '''Double metaphone word processor'''
- stem = PorterStemmer().stem
- for word in words:
- word = stem(word, 0, len(word)-1)
- yield word
+from . import processors
class SearchEngine(orm.SearchEngine):
@@ -131,19 +84,19 @@ def __init__(self, min_word_length = 3, stop_words = None,
splitters = None):
super(SearchEngine,self).__init__()
self.MIN_WORD_LENGTH = min_word_length
- stop_words = stop_words if stop_words is not None else STOP_WORDS
- splitters = splitters if splitters is not None else PUNCTUATION_CHARS
+ splitters = splitters if splitters is not None else\
+ processors.PUNCTUATION_CHARS
if splitters:
self.punctuation_regex = re.compile(\
r"[%s]" % re.escape(splitters))
else:
self.punctuation_regex = None
- if stop_words:
- self.add_word_middleware(stopwords(stop_words),False)
+ # The stop words middleware is only used for the indexing part
+ self.add_word_middleware(processors.stopwords(stop_words), False)
if stemming:
- self.add_word_middleware(stemming_processor)
+ self.add_word_middleware(processors.stemming_processor)
if metaphone:
- self.add_word_middleware(tolerant_metaphone_processor)
+ self.add_word_middleware(processors.tolerant_metaphone_processor)
def split_text(self, text):
if self.punctuation_regex:
@@ -159,7 +112,7 @@ def flush(self, full = False):
if full:
Word.objects.flush()
- def _index_item(self, item, words, session):
+ def add_item(self, item, words, session):
link = self._link_item_and_word
for word,count in iteritems(words):
session.add(link(item, word, count))
23 stdnet/apps/searchengine/models.py
@@ -23,27 +23,12 @@ class Word(orm.StdModel):
'''Model which hold a word as primary key'''
id = orm.SymbolField(primary_key = True)
tag = orm.BooleanField(default = False)
- # denormalised fields for frequency
- frequency = orm.IntegerField(default = 0)
- model_frequency = orm.HashField()
def __unicode__(self):
return self.id
- def update_frequency(self):
- f = 0
- mf = {}
- for item in self.items().all():
- m = item.model_type
- if m in mf:
- mf[m] += 1
- else:
- mf[m] = 1
- f += 1
- self.model_frequency.delete()
- self.model_frequency.update(mf)
- self.frequency = f
- self.save()
+ class Meta:
+ ordering = -orm.autoincrement()
class WordItem(orm.StdModel):
@@ -62,6 +47,10 @@ def __unicode__(self):
objects = WordItemManager()
+ class Meta:
+ ordering = -orm.autoincrement()
+ unique_together = ('word', 'model_type', 'object_id')
+
@property
def object(self):
'''Instance of :attr:`model_type` with id :attr:`object_id`.'''
49 stdnet/apps/searchengine/processors/__init__.py
@@ -1 +1,48 @@
-__test__ = False
+from .ignore import STOP_WORDS, PUNCTUATION_CHARS
+from .metaphone import dm as double_metaphone
+from .porter import PorterStemmer
+
+
+class stopwords:
+
+ def __init__(self, stp = None):
+ self.stp = stp if stp is not None else STOP_WORDS
+
+ def __call__(self, words):
+ stp = self.stp
+ for word in words:
+ if word not in stp:
+ yield word
+
+
+def metaphone_processor(words):
+ '''Double metaphone word processor.'''
+ for word in words:
+ for w in double_metaphone(word):
+ if w:
+ w = w.strip()
+ if w:
+ yield w
+
+
+def tolerant_metaphone_processor(words):
+ '''Double metaphone word processor slightly modified so that when no
+words are returned by the algorithm, the original word is returned.'''
+ for word in words:
+ r = 0
+ for w in double_metaphone(word):
+ if w:
+ w = w.strip()
+ if w:
+ r += 1
+ yield w
+ if not r:
+ yield word
+
+
+def stemming_processor(words):
+ '''Porter Stemmer word processor'''
+ stem = PorterStemmer().stem
+ for word in words:
+ word = stem(word, 0, len(word)-1)
+ yield word
0  stdnet/apps/searchengine/ignore.py → ...et/apps/searchengine/processors/ignore.py
File renamed without changes
10 stdnet/backends/redisb.py
@@ -838,9 +838,13 @@ def execute_session(self, session, callback):
json.dumps(instance._dbdata['errors']))
score = MIN_FLOAT
if meta.ordering:
- v = getattr(instance,meta.ordering.name,None)
- if v is not None:
- score = meta.ordering.field.scorefun(v)
+ if meta.ordering.auto:
+ score = 'auto {0}'.format(\
+ meta.ordering.name.incrby)
+ else:
+ v = getattr(instance,meta.ordering.name,None)
+ if v is not None:
+ score = meta.ordering.field.scorefun(v)
data = instance._dbdata['cleaned_data']
if state.persistent:
action = 'o' if instance.has_all_data else 'c'
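
With an :class:`autoincrement` ordering the backend no longer computes a
numeric score in Python; it hands the Lua script a marker string instead. A
small sketch of that hand-off, assuming the ``'auto '`` prefix format used
above::

    # what redisb.py sends when meta.ordering.auto is true
    incrby = 1                              # taken from autoincrement.incrby
    score = 'auto {0}'.format(incrby)       # -> 'auto 1'
    # the Lua script detects the prefix (score:find(' ') == 5, 1-based) and
    # passes the remainder to ZINCRBY, so duplicate saves accumulate the score
    assert score.find(' ') == 4             # Python's find is 0-based
    value = float(score.split(' ', 1)[1])   # 1.0, the ZINCRBY increment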
99 stdnet/lib/redis/lua/session.lua
@@ -21,21 +21,60 @@ function table_slice (values,i1,i2)
return res
end
+results = {}
+local i = 0
+local bk = KEYS[1]
+local s = ARGV[i+1] -- 's' for sorted sets, 'z' for zsets
+local num_instances = ARGV[i+2] + 0
+local length_indices = ARGV[i+3] + 0
+local idx1 = i+3
+i = idx1 + 2*length_indices
+local indices = table_slice(ARGV,idx1+1,idx1+length_indices)
+local uniques = table_slice(ARGV,idx1+length_indices+1,i)
+local idset = bk .. ':id'
+local j = 0
+local result = {}
+
+function fidkey(id)
+ return bk .. ':obj:' .. id
+end
+
+function delete(id, created_id)
+ redis.call('del', fidkey(id))
+ redis.call(s .. 'rem', idset, id)
+ if created_id then
+ redis.call('decr', bk .. ':ids')
+ return ''
+ else
+ return id
+ end
+end
+
-- Add or remove indices for an instance.
-- Return nothing if the update was successful, otherwise return the
-- error message (constraints were violated)
-function update_indices(s, score, bk, id, idkey, indices, uniques, add)
- errors = {}
- for i,name in pairs(indices) do
- value = redis.call('hget', idkey, name)
+function update_indices(score, id, autoincr, created_id, add)
+ local errors = {}
+ local result = {id,errors}
+ local idkey = fidkey(id)
+ local objects = {}
+ -- Loop over indices names
+ for i, name in pairs(indices) do
+ local value = redis.call('hget', idkey, name)
if uniques[i] == '1' then
idxkey = bk .. ':uni:' .. name
if add then
if redis.call('hsetnx', idxkey, value, id) + 0 == 0 then
- -- remove field `name` from the instance hashtable so that
- -- the next call to update_indices won't delete the index
- redis.call('hdel', idkey, name)
- table.insert(errors, 'Unique constraint "' .. name .. '" violated.')
+ if not autoincr then
+ -- remove field `name` from the instance hashtable so that
+ -- the next call to update_indices won't delete the index
+ redis.call('hdel', idkey, name)
+ table.insert(errors, 'Unique constraint "' .. name .. '" violated.')
+ elseif created_id then
+ -- if autoincrementing
+ local previd = redis.call('hget', idxkey, value)
+ table.insert(objects,previd)
+ end
end
else
redis.call('hdel', idxkey, value)
@@ -60,28 +99,19 @@ function update_indices(s, score, bk, id, idkey, indices, uniques, add)
end
-- LOOP OVER INSTANCES TO ADD/CHANGE
-results = {}
-local i = 0
-local bk = KEYS[1]
-local s = ARGV[i+1] -- 's' for sorted sets, 'z' for zsets
-local num_instances = ARGV[i+2] + 0
-local length_indices = ARGV[i+3] + 0
-local idx1 = i+3
-i = idx1 + 2*length_indices
-local indices = table_slice(ARGV,idx1+1,idx1+length_indices)
-local uniques = table_slice(ARGV,idx1+length_indices+1,i)
-local idset = bk .. ':id'
-local j = 0
-local result = {}
-
while j < num_instances do
local action = ARGV[i+1]
local id = ARGV[i+2]
- local score = ARGV[i+3]
+ local score = ARGV[i+3] .. ''
local idx0 = i+4
local length_data = ARGV[idx0] + 0
local data = table_slice(ARGV,idx0+1,idx0+length_data)
local created_id = false
+ local autoincr = false
+ if score:find(' ') == 5 then
+ score = score:sub(6) + 0
+ autoincr = true
+ end
i = idx0 + length_data
-- ID NOT AVAILABLE. CREATE ONE
@@ -89,17 +119,21 @@ while j < num_instances do
created_id = true
id = redis.call('incr', bk .. ':ids')
end
- local idkey = bk .. ':obj:' .. id
local original_values = {}
- if action == 'o' or action == 'c' then -- override or change
+ -- override or change we remove the indices
+ if action == 'o' or action == 'c' then
original_values = redis.call('hgetall', idkey)
- update_indices(s, score, bk, id, idkey, indices, uniques, false)
+ update_indices(score, id, autoincr, created_id, false)
+ -- When overriding we delete the key. This means we restart from a
+ -- bright new hash table
if action == 'o' then
redis.call('del', idkey)
end
end
if s == 's' then
redis.call('sadd', idset, id)
+ elseif autoincr then
+ score = redis.call('zincrby', idset, score, id) + 0
else
redis.call('zadd', idset, score, id)
end
@@ -107,22 +141,17 @@ while j < num_instances do
redis.call('hmset', idkey, unpack(data))
end
j = j + 1
- local error = update_indices(s, score, bk, id, idkey, indices, uniques, true)
+ local error = update_indices(score, id, autoincr, created_id, true)
-- An error has occured. Rollback changes.
if # error > 0 then
-- Remove indices
error = error[1]
- update_indices(s, score, bk, id, idkey, indices, uniques, false)
+ update_indices(score, id, autoincr, created_id, false)
if action == 'a' then
- redis.call('del', idkey)
- redis.call(s .. 'rem', idset, id)
- if created_id then
- redis.call('decr', bk .. ':ids')
- id = ''
- end
+ id = delete(id, created_id)
elseif # original_values > 0 then
redis.call('hmset', idkey, unpack(original_values))
- update_indices(s, score, bk, id, idkey, indices, uniques, true)
+ update_indices(score, id, autoincr, created_id, true)
end
result[j] = {id, error}
else
64 stdnet/orm/base.py
@@ -1,6 +1,6 @@
'''Defines Metaclasses and Base classes for stdnet Models.'''
import sys
-import copy
+from copy import copy, deepcopy
import hashlib
import weakref
@@ -17,6 +17,7 @@
__all__ = ['Metaclass',
'Model',
'ModelBase',
+ 'autoincrement',
'ModelType', # Metaclass for all stdnet ModelBase classes
'StdNetType', # derived from ModelType, metaclass fro StdModel
'from_uuid']
@@ -26,7 +27,7 @@ def get_fields(bases, attrs):
fields = {}
for base in bases:
if hasattr(base, '_meta'):
- fields.update(copy.deepcopy(base._meta.dfields))
+ fields.update(deepcopy(base._meta.dfields))
for name,field in list(attrs.items()):
if isinstance(field,Field):
@@ -140,8 +141,8 @@ class Meta:
.. attribute:: ordering
Optional name of a :class:`stdnet.orm.Field` in the :attr:`model`.
- If provided, indeces will be sorted with respect the value of the
- field specidied.
+ If provided, model indices will be sorted with respect to the value of the
+ specified field. It can also be a :class:`autoincrement` instance.
Check the :ref:`sorting <sorting>` documentation for more details.
Default: ``None``.
@@ -252,16 +253,19 @@ def is_valid(self, instance):
def get_sorting(self, sortby, errorClass = None):
s = None
desc = False
- if sortby.startswith('-'):
+ if isinstance(sortby,autoincrement):
+ f = self.pk
+ return orderinginfo(sortby, f, desc, self.model, None, True)
+ elif sortby.startswith('-'):
desc = True
sortby = sortby[1:]
if sortby == 'id':
f = self.pk
- return orderinginfo(f.attname, f, desc, self.model, None)
+ return orderinginfo(f.attname, f, desc, self.model, None, False)
else:
if sortby in self.dfields:
f = self.dfields[sortby]
- return orderinginfo(f.attname, f, desc, self.model, None)
+ return orderinginfo(f.attname, f, desc, self.model, None, False)
sortbys = sortby.split(JSPLITTER)
s0 = sortbys[0]
if len(sortbys) > 1 and s0 in self.dfields:
@@ -269,7 +273,7 @@ def get_sorting(self, sortby, errorClass = None):
nested = f.get_sorting(JSPLITTER.join(sortbys[1:]),errorClass)
if nested:
sortby = f.attname
- return orderinginfo(sortby, f, desc, self.model, nested)
+ return orderinginfo(sortby, f, desc, self.model, nested, False)
errorClass = errorClass or ValueError
raise errorClass('Cannot Order by attribute "{0}".\
It is not a scalar field.'.format(sortby))
@@ -307,7 +311,51 @@ def multifields_ids_todelete(self, instance):
if field.todelete())
return [fid for fid in gen if fid]
+
+class autoincrement(object):
+ '''An :class:`autoincrement` is used in a :class:`StdModel` Meta
+class to specify a model with :ref:`incremental sorting <incremental-sorting>`.
+
+.. attribute:: incrby
+
+ The amount to increment the score by when a duplicate element is saved.
+
+ Default: 1.
+
+For example, the :class:`stdnet.apps.searchengine.Word` model is defined as::
+
+ class Word(orm.StdModel):
+ id = orm.SymbolField(primary_key = True)
+
+ class Meta:
+ ordering = -autoincrement()
+
+This means that every time we save a new instance of Word whose id is
+already present, the score of that word is incremented by
+:attr:`incrby`.
+
+'''
+ def __init__(self, incrby = 1, desc = False):
+ self.incrby = incrby
+ self._asce = -1 if desc else 1
+
+ def __neg__(self):
+ c = copy(self)
+ c._asce *= -1
+ return c
+
+ @property
+ def desc(self):
+ return True if self._asce == -1 else False
+ def __repr__(self):
+ return ('' if self._asce == 1 else '-') + '{0}({1})'\
+ .format(self.__class__.__name__,self.incrby)
+
+ def __str__(self):
+ return self.__repr__()
+
+
class ModelType(type):
'''StdModel python metaclass'''
is_base_class = True
2  stdnet/orm/fields.py
@@ -18,7 +18,7 @@
from .globals import get_model_from_hash, JSPLITTER
-orderinginfo = namedtuple('orderinginfo','name field desc model nested')
+orderinginfo = namedtuple('orderinginfo','name field desc model nested auto')
logger = logging.getLogger('stdnet.orm')
86 stdnet/orm/search.py
@@ -13,8 +13,8 @@ class SearchEngine(object):
Stdnet also provides a :ref:`python implementation <apps-searchengine>`
of this interface.
-The main methods to be implemented are :meth:`_index_item`
-and meth:`remove_index`.
+The main methods to be implemented are :meth:`add_item`,
+:meth:`remove_item` and :meth:`search_model`.
.. attribute:: word_middleware
@@ -29,12 +29,17 @@ class SearchEngine(object):
se = SearchEngine()
- def stopwords(words):
- for word in words:
- if word not in ('this','that','and'):
- yield word
+ class stopwords(object):
+
+ def __init__(self, *swords):
+ self.swords = set(swords)
+
+ def __call__(self, words):
+ for word in words:
+ if word not in self.swords:
+ yield word
- se.add_word_middleware(stopwords)
+ se.add_word_middleware(stopwords('and','or','this','that',...))
"""
REGISTERED_MODELS = {}
ITEM_PROCESSORS = []
@@ -72,8 +77,16 @@ def words_from_text(self, text, for_search = False):
to process the text.
:parameter text: string from which to extract words.
+:parameter for_search: flag indicating if the words will be used for searching
+ or for indexing the database. This flag is used in conjunction with the
+ middleware flag *for_search*. If this flag is ``True`` (i.e. we need to
+ search the database for the words in *text*), only the
+ middleware functions in :attr:`word_middleware` enabled for searching are
+ used.
+
+ Default: ``False``.
-return a list of cleaned words.
+return a *list* of cleaned words.
'''
if not text:
return []
@@ -100,7 +113,13 @@ def add_processor(self, processor):
self.ITEM_PROCESSORS.append(processor)
def add_word_middleware(self, middleware, for_search = True):
- '''Add a *middleware* function for preprocessing words to be indexed.'''
+ '''Add a *middleware* function to the list of :attr:`word_middleware`,
+for preprocessing words to be indexed.
+
+:parameter middleware: a callable receiving an iterable over words.
+:parameter for_search: flag indicating if the *middleware* can be used for the
+ text to search. Default: ``True``.
+'''
if hasattr(middleware,'__call__'):
self.word_middleware.append((middleware,for_search))
@@ -122,26 +141,8 @@ def index_item(self, item, session = None):
wc[word] = 1
session = session or item.session
- self._index_item(item, wc, session)
+ self.add_item(item, wc, session)
return session
-
- def flush(self, full = False):
- '''Clean the search engine'''
- raise NotImplementedError
-
- def remove_item(self, item_or_model, ids = None, session = None):
- '''Remove an item from the search indices'''
- raise NotImplementedError
-
- def search_model(self, model, text):
- '''Search *text* in *model* instances. This is the functions
-needing implementation by custom serach engines.
-
-:parameter model: a :class:`stdnet.orm.StdModel` class.
-:parameter text: text to search
-:rtype: A :class:`stdnet.orm.query.QuerySet` of model instances containing the
- text to search.'''
- raise NotImplementedError
def reindex(self, full = True):
'''Reindex models by removing items in
@@ -159,10 +160,35 @@ def reindex(self, full = True):
self.index_item(obj)
return n
- # ABSTRACT INTERNAL FUNCTIONS
+ # ABSTRACT FUNCTIONS
################################################################
- def _index_item(self, item, words, session):
+ def remove_item(self, item_or_model, ids = None, session = None):
+ '''Remove an item from the search indices'''
+ raise NotImplementedError
+
+ def add_item(self, item, words, session):
+ '''Create indices for *item* and each word in *words*.
+
+:parameter item: a *model* instance to be indexed. It does not need to be
+ a :class:`stdnet.orm.StdModel`.
+:parameter words: iterable over words. This iterable has been obtained from the
+ text in *item* via the :attr:`word_middleware`.
+'''
+ raise NotImplementedError
+
+ def search_model(self, model, text):
+ '''Search *text* in *model* instances. This is one of the functions
+needing implementation by custom search engines.
+
+:parameter model: a :class:`stdnet.orm.StdModel` class.
+:parameter text: text to search
+:rtype: A :class:`stdnet.orm.query.QuerySet` of model instances containing the
+ text to search.'''
+ raise NotImplementedError
+
+ def flush(self, full = False):
+ '''Clean the search engine'''
raise NotImplementedError
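
The concrete engine in :mod:`stdnet.apps.searchengine` implements this
interface on top of the ``Word`` and ``WordItem`` models. A short usage
sketch, assuming the constructor keywords and the registration call that
appear elsewhere in this commit::

    from stdnet.apps.searchengine import SearchEngine
    from examples.wordsearch.models import Item   # model used in the test suite

    engine = SearchEngine(metaphone=True, stemming=True)
    engine.register(Item, ('related',))   # register Item, as the tests below do

    # once registered, saving Item instances indexes them automatically;
    # searching goes through the query API:
    results = Item.objects.query().search('python')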
66 stdnet/utils/populate.py
@@ -20,43 +20,43 @@ def populate(datatype = 'string', size = 10,
Useful for populating database with data for fuzzy testing.
Supported data-types
- * *string*
- For example::
-
- populate('string',100, min_len=3, max_len=10)
+* *string*
+ For example::
- create a 100 elements list with random strings
- with random length between 3 and 10
-
- * *date*
- For example::
-
- from datetime import date
- populate('date',200, start = date(1997,1,1), end = date.today())
-
- create a 200 elements list with random datetime.date objects
- between *start* and *end*
-
- * *integer*
- For example::
-
- populate('integer',200, start = 0, end = 1000)
+ populate('string',100, min_len=3, max_len=10)
+
+ create a 100-element list of random strings
+ with random length between 3 and 10
+
+* *date*
+ For example::
- create a 200 elements list with random int between *start* and *end*
+ from datetime import date
+ populate('date',200, start = date(1997,1,1), end = date.today())
+
+ create a 200-element list of random datetime.date objects
+ between *start* and *end*
+
+* *integer*
+ For example::
- * *float*
- For example::
-
- populate('float', 200, start = 0, end = 10)
+ populate('integer',200, start = 0, end = 1000)
+
+ create a 200-element list of random integers between *start* and *end*
+
+* *float*
+ For example::
- create a 200 elements list with random floats between *start* and *end*
+ populate('float', 200, start = 0, end = 10)
+
+ create a 200-element list of random floats between *start* and *end*
- * *choice* (elements of an iterable)
- For example::
-
- populate('choice', 200, choice_from = ['pippo','pluto','blob'])
+* *choice* (elements of an iterable)
+ For example::
- create a 200 elements list with random elements from *choice_from*
+ populate('choice', 200, choice_from = ['pippo','pluto','blob'])
+
+ create a 200-element list of random elements from *choice_from*.
'''
data = []
converter = converter or def_converter
@@ -76,9 +76,9 @@ def populate(datatype = 'string', size = 10,
end = end or 10
for s in range(size):
data.append(converter(uniform(start,end)))
- elif datatype == 'choice' or choice_from:
+ elif datatype == 'choice' and choice_from:
for s in range(size):
- data.append(choice(choice_from))
+ data.append(choice(list(choice_from)))
else:
for s in range(size):
data.append(converter(random_string(**kwargs)))
27 tests/regression/autoincrement.py
@@ -0,0 +1,27 @@
+from stdnet import test, orm
+from stdnet.apps.searchengine.models import Word, WordItem
+
+
+class TestCase(test.TestCase):
+ models = (Word, WordItem)
+
+ def setUp(self):
+ self.register()
+
+ def testAutoIncrement(self):
+ a = orm.autoincrement()
+ self.assertEqual(a.incrby,1)
+ self.assertEqual(a.desc,False)
+ self.assertEqual(str(a),'autoincrement(1)')
+ a = orm.autoincrement(3)
+ self.assertEqual(a.incrby,3)
+ self.assertEqual(a.desc,False)
+ self.assertEqual(str(a),'autoincrement(3)')
+ b = -a
+ self.assertEqual(str(a),'autoincrement(3)')
+ self.assertEqual(b.desc,True)
+ self.assertEqual(str(b),'-autoincrement(3)')
+
+ def testSimple(self):
+ w = Word(id='ciao').save()
+ self.assertEqual(w.id,'ciao')
124 tests/regression/search.py → tests/regression/searchengine.py
@@ -3,10 +3,9 @@
from datetime import date
from stdnet import test
-from stdnet.utils import to_string, range
-from stdnet.apps.searchengine import SearchEngine, double_metaphone
+from stdnet.utils import to_string, range, populate
+from stdnet.apps.searchengine import SearchEngine, processors
from stdnet.apps.searchengine.models import Word, WordItem
-from stdnet.utils import populate
from examples.wordsearch.basicwords import basic_english_words
from examples.wordsearch.models import Item, RelatedItem
@@ -57,8 +56,8 @@
WORDS_GROUPS = lambda size : (' '.join(populate('choice', NUM_WORDS,\
choice_from = basic_english_words))\
for i in range(size))
-
-
+
+
class TestCase(test.TestCase):
'''Mixin for testing the search engine. No tests implemented here,
just registration and some utility functions. All search-engine tests
@@ -75,34 +74,47 @@ def setUp(self):
def make_item(self,name='python',counter=10,content=None,related=None):
session = self.session()
+ content = content if content is not None else python_content
with session.begin():
- item = session.add(Item(name=name, counter = counter,
- content=content if content is not None else python_content,
- related = related))
+ item = session.add(Item(name=name,
+ counter = counter,
+ content=content,
+ related = related))
return item
def make_items(self, num = 30, content = False, related = None):
+ '''Bulk creation of Item for testing search engine. Return a set
+of words which have been included in the Items.'''
names = populate('choice', num, choice_from=basic_english_words)
session = self.session()
+ words = set()
if content:
contents = WORDS_GROUPS(num)
else:
contents = ['']*num
with session.begin():
- for name,co in zip(names,contents):
+ for name, content in zip(names,contents):
if len(name) > 3:
+ words.add(name)
+ if content:
+ words.update(content.split())
session.add(Item(name=name,
counter=randint(0,10),
- content = co,
- related = related))
+ content=content,
+ related=related))
+ wis = WordItem.objects.for_model(Item)
+ self.assertTrue(wis.count())
+ return words
def simpleadd(self, name = 'python', counter = 10, content = None,
related = None):
- item = self.make_item(name,counter,content,related)
- self.assertEqual(item.last_indexed.date(),date.today())
- wi = WordItem.objects.for_model(item)
- self.assertTrue(wi.count())
- return item, wi
+ item = self.make_item(name, counter, content, related)
+ self.assertEqual(item.last_indexed.date(), date.today())
+ wis = WordItem.objects.for_model(item).all()
+ self.assertTrue(wis)
+ for wi in wis:
+ self.assertEqual(wi.object, item)
+ return item, wis
def sometags(self, num = 10, minlen = 3):
def _():
@@ -126,10 +138,20 @@ def testSplitting(self):
self.assertEqual(list(eg.words_from_text('bla bla____bla')),\
['bla','bla','bla'])
+ def testSplitters(self):
+ eg = SearchEngine(splitters = False)
+ self.assertEqual(eg.punctuation_regex, None)
+ words = list(eg.split_text('pippo:pluto'))
+ self.assertEqual(len(words),1)
+ self.assertEqual(words[0],'pippo:pluto')
+ words = list(eg.split_text('pippo: pluto'))
+ self.assertEqual(len(words),2)
+ self.assertEqual(words[0],'pippo:')
+
def testMetaphone(self):
'''Test metaphone algorithm'''
for name in NAMES:
- d = double_metaphone(name)
+ d = processors.double_metaphone(name)
self.assertEqual(d,NAMES[name])
def testRegistered(self):
@@ -209,6 +231,35 @@ def testRelatedModel(self):
self.assertEqual(qc.keyword,'intersect')
self.assertEqual(qs.count(),1)
+ def testBigSearch(self):
+ words = self.make_items(num = 30, content = True)
+ sw = ' '.join(populate('choice', 1, choice_from = words))
+ qs = Item.objects.query().search(sw)
+ self.assertTrue(qs)
+
+ def testFlush(self):
+ self.make_items()
+ self.engine.flush()
+ self.assertFalse(WordItem.objects.query())
+ self.assertTrue(Word.objects.query())
+
+ def testFlushFull(self):
+ self.make_items()
+ self.engine.flush(full=True)
+ self.assertFalse(WordItem.objects.query())
+ self.assertFalse(Word.objects.query())
+
+ def testDelete(self):
+ item = self.make_item()
+ words = list(Word.objects.query())
+ item.delete()
+ wis = WordItem.objects.filter(model_type = item.__class__)
+ self.assertFalse(wis.count(),0)
+ self.assertEqual(len(words),len(Word.objects.query()))
+
+
+class TestTags(TestCase):
+
def _testAddTag(self):
item = self.make_item()
engine = self.engine
@@ -227,27 +278,28 @@ def _testAddTags(self):
self.assertTrue(engine.add_tag(item,self.sometags()))
tags = self.engine.alltags()
self.assertTrue(tags)
-
-
-class TestSearchEngineWithRegistration(TestCase):
- def make_item(self,**kwargs):
- item = super(TestSearchEngineWithRegistration,self).make_item(**kwargs)
- wis = WordItem.objects.filter(model_type = item.__class__)
- self.assertTrue(wis)
- for wi in wis:
- self.assertEqual(wi.object,item)
- return item
+
+class TestCoverage(TestCase):
+
+ def setUp(self):
+ eg = SearchEngine(metaphone = False)
+ eg.add_word_middleware(processors.metaphone_processor)
+ self.register()
+ self.engine = eg
+ self.engine.register(Item,('related',))
def testAdd(self):
- self.make_item()
+ item, wi = self.simpleadd('pink',
+ content='the dark side of the moon 10y')
+ wi = set((str(w.word) for w in wi))
+ self.assertEqual(len(wi),4)
+ self.assertFalse('10y' in wi)
- def testDelete(self):
- item = self.make_item()
- words = list(Word.objects.query())
- item.delete()
- wis = WordItem.objects.filter(model_type = item.__class__)
- self.assertFalse(wis.count(),0)
- self.assertEqual(len(words),len(Word.objects.query()))
-
+ def testRepr(self):
+ item, wi = self.simpleadd('pink',
+ content='the dark side of the moon 10y')
+ for w in wi:
+ self.assertEqual(str(w),str(w.word))
+
1  tests/regression/sorting.py
@@ -162,3 +162,4 @@ def testExclude(self):
class TestOrderingModelDesc(TestOrderingModel):
model = SportAtDate2
desc = True
+