
very unstable at the moment. Working on autoincrement option for model.

1 parent cde03b9 commit 381862ec76381f428c33e06fe015c5b682c1a786 quantmind committed Mar 16, 2012
@@ -1,3 +1,3 @@
[run]
source = stdnet
-omit = stdnet/apps/searchengine/processors/*,stdnet/utils/fallbacks/*
+omit = stdnet/apps/searchengine/processors/metaphone.py,stdnet/apps/searchengine/processors/porter.py,stdnet/utils/fallbacks/*
@@ -25,6 +25,7 @@ available in :mod:`stdnet`.
tutorial
query
+ sorting
tutorial2
performance
timeseries
@@ -0,0 +1,75 @@
+
+.. _sorting:
+
+=======================
+Sorting and Ordering
+=======================
+Stdnet can sort instances of a model in three different ways:
+
+* :ref:`Explicit sorting <explicit-sorting>` using the
+  :meth:`stdnet.orm.Query.sort_by` method.
+* :ref:`Implicit sorting <implicit-sorting>` via the
+  :attr:`stdnet.orm.Metaclass.ordering` attribute of the model metaclass.
+* :ref:`Incremental sorting <incremental-sorting>`, a variant of
+  implicit sorting for models that need to keep track of how many
+  times instances with the same id are created.
+
+
+.. _explicit-sorting:
+
+Explicit Sorting
+=======================
+
+Sorting is usually achieved by using the :meth:`stdnet.orm.query.QuerySet.sort_by`
+method with a field name as parameter. Let's consider the following model::
+
+    class SportActivity(orm.StdModel):
+        person = orm.SymbolField()
+        activity = orm.SymbolField()
+        dt = orm.DateTimeField()
+
+
+To obtain a sorted query on dates for a given person::
+
+    SportActivity.objects.filter(person='pippo').sort_by('-dt')
+
+The negative sign in front of ``dt`` indicates descending order.
+
+
+.. _implicit-sorting:
+
+Implicit Sorting
+===================
+
+Implicit sorting is achieved by setting the ``ordering`` attribute in the model Meta class.
+Let's consider the following Log model example::
+
+    class Log(orm.StdModel):
+        '''A database log entry'''
+        timestamp = orm.DateTimeField(default=datetime.now)
+        level = orm.SymbolField()
+        msg = orm.CharField()
+        source = orm.CharField()
+        host = orm.CharField()
+        user = orm.SymbolField(required=False)
+        client = orm.CharField()
+
+        class Meta:
+            ordering = '-timestamp'
+
+It makes sense to keep the log entries always sorted in descending
+order with respect to the ``timestamp`` field.
+With this setting, querysets are always returned in this order, without the need to
+call the ``sort_by`` method.
+
+.. note:: Implicit sorting is much faster than explicit sorting,
+    since no sorting step (which has ``N log(N)`` time complexity)
+    is performed at query time. Instead, the order is maintained by using
+    sorted sets as indices rather than sets.
+
+
+.. _incremental-sorting:
+
+Incremental Sorting
+========================
+
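The note in the Implicit Sorting section of this new file contrasts sorting at query time with keeping an ordered index. A minimal sketch of that difference in plain Python, not using stdnet at all: the `entries` data, the `score` helper and the use of a list with `bisect` as a stand-in for a sorted set are all assumptions made for illustration.

```python
import bisect
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def score(ts):
    '''Hypothetical score function: seconds since the epoch, negated so
    that ascending score order corresponds to descending time order.'''
    return -(ts - EPOCH).total_seconds()


# (timestamp, message) pairs arriving in arbitrary order.
entries = [
    (datetime(2012, 3, 16, 10, 0), 'second'),
    (datetime(2012, 3, 16, 9, 0), 'first'),
    (datetime(2012, 3, 16, 11, 0), 'third'),
]

# Explicit sorting: the collection is unordered and every query sorts it.
explicit = sorted(entries, key=lambda e: e[0], reverse=True)

# Implicit sorting: each insertion places the entry at its ordered
# position, so a query only reads the index back.
ordered_index = []
for ts, msg in entries:
    bisect.insort(ordered_index, (score(ts), msg))
implicit = [msg for _, msg in ordered_index]
```

Both approaches yield the messages newest first. The list only imitates the behaviour: `bisect` finds the position with a binary search, but inserting into a Python list still shifts elements, which a real sorted set avoids.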
@@ -91,6 +91,16 @@ The only difference is the ``prices`` :class:`stdnet.orm.ListField`
in the ``Instrument`` model which is
not available in a traditional relational database.
+The metaclass
+~~~~~~~~~~~~~~~~~~~~~~~
+The ``Position`` model specifies a ``Meta`` class with an ``ordering`` attribute.
+When provided, as in this case, the ``Meta`` class fields are used by the ``orm``
+to customize the build of the :class:`stdnet.orm.Metaclass` for the model.
+In this case we instruct the mapper to manage the ``Position`` model
+as ordered with respect to the :class:`stdnet.orm.DateField` ``dt``
+in descending order. Check the :ref:`sorting <sorting>`
+documentation for more details on ordering and sorting.
+
Registering Models
================================
@@ -149,69 +159,3 @@ Here's an example::
'EUR'
>>> b.description
''
-
-
-
-.. _sorting:
-
-Sorting
-==================
-Since version 0.6.0, stdnet provides sorting using two different ways:
-
-* Explicit sorting using the :attr:`stdnet.orm.query.QuerySet.sort_by` attribute
- of a queryset.
-* Implicit sorting via the :attr:`stdnet.orm.Meta.ordering` attribute of
- the model metaclass.
-
-
-Explicit Sorting
-~~~~~~~~~~~~~~~~~~~~
-
-Sorting is usually achieved by using the :meth:`stdnet.orm.query.QuerySet.sort_by`
-method with a field name as parameter. Lets consider the following model::
-
- class SportActivity(orm.StdNet):
- person = orm.SymbolField()
- activity = orm.SymbolField()
- dt = orm.DateTimeField()
-
-
-To obtained a sorted query on dates for a given person::
-
- SportActivity.objects.filter(person='pippo').sort_by('-dt')
-
-The negative sign in front of ``dt`` indicates descending order.
-
-.. _implicit-sorting:
-
-Implicit Sorting
-~~~~~~~~~~~~~~~~~~~~
-
-Implicit sorting is achieved by setting the ``ordering`` attribute in the model Meta class.
-Let's consider the following Log model example::
-
- class Log(orm.StdModel):
- '''A database log entry'''
- timestamp = orm.DateTimeField(default=datetime.now)
- level = orm.SymbolField()
- msg = orm.CharField()
- source = orm.CharField()
- host = orm.CharField()
- user = orm.SymbolField(required=False)
- client = orm.CharField()
-
- class Meta:
- ordering = '-timestamp'
-
-It makes lots of sense to have the log entries always sorted in a descending
-order with respect to the ``timestamp`` field.
-This solution always returns querysets in this order, without the need to
-call ``sort_by`` method.
-
-.. note:: Implicit sorting is a much faster solution than explicit sorting,
- since there is no sorting step involved (which is a ``N log(N)``
- time complexity algorithm). Instead, the order is maintained by using
- sorted sets as indices rather than sets.
-
-
-.. _django: http://www.djangoproject.com/
@@ -54,6 +54,14 @@ Query
.. automethod:: __init__
+autoincrement
+~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: autoincrement
+   :members:
+   :member-order: bysource
+
+
.. _model-structures:
Data Structures
@@ -42,54 +42,7 @@
from stdnet.utils import to_string, iteritems
from .models import Word, WordItem, Tag
-from .ignore import STOP_WORDS, PUNCTUATION_CHARS
-from .processors.metaphone import dm as double_metaphone
-from .processors.porter import PorterStemmer
-
-
-class stopwords:
-
- def __init__(self, stp):
- self.stp = stp
-
- def __call__(self, words):
- stp = self.stp
- for word in words:
- if word not in stp:
- yield word
-
-
-def metaphone_processor(words):
- '''Double metaphone word processor.'''
- for word in words:
- for w in double_metaphone(word):
- if w:
- w = w.strip()
- if w:
- yield w
-
-
-def tolerant_metaphone_processor(words):
- '''Double metaphone word processor slightly modified so that when no
-words are returned by the algorithm, the original word is returned.'''
- for word in words:
- r = 0
- for w in double_metaphone(word):
- if w:
- w = w.strip()
- if w:
- r += 1
- yield w
- if not r:
- yield word
-
-
-def stemming_processor(words):
- '''Double metaphone word processor'''
- stem = PorterStemmer().stem
- for word in words:
- word = stem(word, 0, len(word)-1)
- yield word
+from . import processors
class SearchEngine(orm.SearchEngine):
@@ -131,19 +84,19 @@ def __init__(self, min_word_length = 3, stop_words = None,
splitters = None):
         super(SearchEngine,self).__init__()
         self.MIN_WORD_LENGTH = min_word_length
-        stop_words = stop_words if stop_words is not None else STOP_WORDS
-        splitters = splitters if splitters is not None else PUNCTUATION_CHARS
+        splitters = splitters if splitters is not None else\
+                                processors.PUNCTUATION_CHARS
         if splitters:
             self.punctuation_regex = re.compile(\
                                     r"[%s]" % re.escape(splitters))
         else:
             self.punctuation_regex = None
-        if stop_words:
-            self.add_word_middleware(stopwords(stop_words),False)
+        # The stop words middleware is only used for the indexing part
+        self.add_word_middleware(processors.stopwords(stop_words), False)
         if stemming:
-            self.add_word_middleware(stemming_processor)
+            self.add_word_middleware(processors.stemming_processor)
         if metaphone:
-            self.add_word_middleware(tolerant_metaphone_processor)
+            self.add_word_middleware(processors.tolerant_metaphone_processor)
def split_text(self, text):
if self.punctuation_regex:
@@ -159,7 +112,7 @@ def flush(self, full = False):
         if full:
             Word.objects.flush()
-    def _index_item(self, item, words, session):
+    def add_item(self, item, words, session):
         link = self._link_item_and_word
         for word,count in iteritems(words):
             session.add(link(item, word, count))
@@ -23,27 +23,12 @@ class Word(orm.StdModel):
'''Model which hold a word as primary key'''
id = orm.SymbolField(primary_key = True)
tag = orm.BooleanField(default = False)
- # denormalised fields for frequency
- frequency = orm.IntegerField(default = 0)
- model_frequency = orm.HashField()
def __unicode__(self):
return self.id
- def update_frequency(self):
- f = 0
- mf = {}
- for item in self.items().all():
- m = item.model_type
- if m in mf:
- mf[m] += 1
- else:
- mf[m] = 1
- f += 1
- self.model_frequency.delete()
- self.model_frequency.update(mf)
- self.frequency = f
- self.save()
+    class Meta:
+        ordering = -orm.autoincrement()
class WordItem(orm.StdModel):
@@ -62,6 +47,10 @@ def __unicode__(self):
objects = WordItemManager()
+    class Meta:
+        ordering = -orm.autoincrement()
+        unique_together = ('word', 'model_type', 'object_id')
+
@property
def object(self):
'''Instance of :attr:`model_type` with id :attr:`object_id`.'''
@@ -1 +1,48 @@
-__test__ = False
+from .ignore import STOP_WORDS, PUNCTUATION_CHARS
+from .metaphone import dm as double_metaphone
+from .porter import PorterStemmer
+
+
+class stopwords:
+
+    def __init__(self, stp = None):
+        self.stp = stp if stp is not None else STOP_WORDS
+
+    def __call__(self, words):
+        stp = self.stp
+        for word in words:
+            if word not in stp:
+                yield word
+
+
+def metaphone_processor(words):
+    '''Double metaphone word processor.'''
+    for word in words:
+        for w in double_metaphone(word):
+            if w:
+                w = w.strip()
+                if w:
+                    yield w
+
+
+def tolerant_metaphone_processor(words):
+    '''Double metaphone word processor slightly modified so that when no
+words are returned by the algorithm, the original word is returned.'''
+    for word in words:
+        r = 0
+        for w in double_metaphone(word):
+            if w:
+                w = w.strip()
+                if w:
+                    r += 1
+                    yield w
+        if not r:
+            yield word
+
+
+def stemming_processor(words):
+    '''Porter Stemmer word processor'''
+    stem = PorterStemmer().stem
+    for word in words:
+        word = stem(word, 0, len(word)-1)
+        yield word
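Each processor in this new module is a callable that takes an iterable of words and yields processed words, so processors can be chained one after another. A minimal sketch of such a chain, under stated assumptions: `SAMPLE_STOP_WORDS`, `fake_phonetic` and `run_pipeline` are hypothetical stand-ins (for `STOP_WORDS`, `double_metaphone` and whatever mechanism `add_word_middleware` uses internally) and are not part of stdnet.

```python
# Hypothetical stand-ins, used only to keep the sketch self-contained.
SAMPLE_STOP_WORDS = frozenset(('the', 'a', 'of'))


def fake_phonetic(word):
    '''Return a pair of codes, as a double metaphone function would.
    The codes are empty when the word contains no consonants.'''
    code = ''.join(c for c in word.upper()
                   if c.isalpha() and c not in 'AEIOU')
    return (code, '')


class stopwords:
    '''Same shape as the ``stopwords`` processor in the diff.'''

    def __init__(self, stp=None):
        self.stp = stp if stp is not None else SAMPLE_STOP_WORDS

    def __call__(self, words):
        stp = self.stp
        for word in words:
            if word not in stp:
                yield word


def tolerant_processor(words):
    '''Yield every non-empty code and fall back to the original word
    when the phonetic function produces nothing.'''
    for word in words:
        r = 0
        for w in fake_phonetic(word):
            if w:
                w = w.strip()
                if w:
                    r += 1
                    yield w
        if not r:
            yield word


def run_pipeline(words, processors):
    '''Feed the output of each processor into the next one.'''
    for processor in processors:
        words = processor(words)
    return list(words)


result = run_pipeline(['the', 'ring', 'of', '42'],
                      [stopwords(), tolerant_processor])
```

Here the stop words are dropped first, `'ring'` is replaced by its stand-in code, and `'42'` passes through unchanged because the phonetic step yields nothing for it, which is the behaviour the "tolerant" variant is meant to provide.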
@@ -838,9 +838,13 @@ def execute_session(self, session, callback):
json.dumps(instance._dbdata['errors']))
score = MIN_FLOAT
if meta.ordering:
- v = getattr(instance,meta.ordering.name,None)
- if v is not None:
- score = meta.ordering.field.scorefun(v)
+ if meta.ordering.auto:
+ score = 'auto {0}'.format(\
+ meta.ordering.name.incrby)
+ else:
+ v = getattr(instance,meta.ordering.name,None)
+ if v is not None:
+ score = meta.ordering.field.scorefun(v)
data = instance._dbdata['cleaned_data']
if state.persistent:
action = 'o' if instance.has_all_data else 'c'