updated documentation
alexgarel committed Sep 22, 2020
1 parent 3395bde commit 65ed443
Showing 6 changed files with 108 additions and 17 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.rst
@@ -16,6 +16,8 @@ Added
- support for parsing Regular expressions like `/foo/` (no transformation to Elasticsearch DSL yet)
- basic support for head and tail of expressions (the separators)
and for their position (pos and size) in original text
- added `auto_head_tail` util
(use it if you build your tree programmatically and want a printable representation)
- tree items now support a `clone_item` method and a setter for children.
This should make transformation patterns easier to write.
- New `visitor.TreeVisitor` and `visitor.TreeTransformer` classes to help in processing trees
@@ -26,7 +28,7 @@ Changed
-------

- support for python 3.8 added, support for python 3.4 and 3.5 dropped
- better printing of Proximity and Fuzzy items (preserve implicit caracter of degree)
- better printing of Proximity and Fuzzy items (preserve implicit nature of degree)

Fixed
-----
14 changes: 14 additions & 0 deletions docs/source/api.rst
@@ -44,13 +44,27 @@ Utilities
==========


luqum.visitor: Manipulating trees
----------------------------------

.. automodule:: luqum.visitor
:members:
:member-order: bysource


luqum.naming: Naming query parts
---------------------------------

.. automodule:: luqum.naming
:members:
:member-order: bysource

luqum.auto_head_tail: Automatic addition of spaces
--------------------------------------------------

.. automodule:: luqum.auto_head_tail
:members:

luqum.pretty: Pretty printing
------------------------------

63 changes: 54 additions & 9 deletions docs/source/quick_start.rst
@@ -63,20 +63,22 @@ In the previous request we can modify ``"foo bar"`` for ``"lazy dog"``::
>>> print(str(tree))
(title:"lazy dog" AND body:"quick fox") OR title:fox

That was a bit tedious. Of course, normally you will use recursion and visitor pattern
to do such things.
That was a bit tedious and also very specific to this tree.
Usually you will use recursion and the visitor pattern to do such things.


.. py:currentmodule:: luqum.utils
.. py:currentmodule:: luqum.visitor
Luqum does provide some helpers like :py:class:`TreeTransformer` for this::

>>> from luqum.utils import TreeTransformer
>>> from luqum.visitor import TreeTransformer
>>> class MyTransformer(TreeTransformer):
... def visit_search_field(self, node, parents):
... def visit_search_field(self, node, context):
... if node.expr.value == '"lazy dog"':
... node.expr.value = '"back to foo bar"'
... return node
... new_node = node.clone_item()
... new_node.expr = node.expr.clone_item(value='"back to foo bar"')
... yield new_node
... else:
... yield from self.generic_visit(node, context)
...
>>> transformer = MyTransformer()
>>> new_tree = transformer.visit(tree)
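
The updated example above clones the matching node instead of mutating it in place. The idea can be sketched without luqum. In the block below, `Node` and `transform` are hypothetical stand-ins, not luqum classes or functions; they only illustrate why yielding modified copies leaves the original tree untouched.

```python
from dataclasses import dataclass, field, replace
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a tree item (not luqum's Item class)."""
    value: str
    children: List["Node"] = field(default_factory=list)

    def clone_item(self, **kwargs) -> "Node":
        # Copy this node only; children are deliberately left out,
        # in the spirit of "clone an item, but not its children".
        return replace(self, children=[], **kwargs)


def transform(node: Node) -> Iterator[Node]:
    """Yield a rewritten copy of `node` without mutating the input."""
    if node.value == '"lazy dog"':
        # Matching node: yield a modified clone.
        yield node.clone_item(value='"back to foo bar"')
    else:
        # Other nodes: clone, then rebuild children recursively.
        new_node = node.clone_item()
        new_node.children = [
            new_child
            for child in node.children
            for new_child in transform(child)
        ]
        yield new_node


tree = Node("AND", [Node('"lazy dog"'), Node('"quick fox"')])
(new_tree,) = transform(tree)
print(new_tree.children[0].value)  # "back to foo bar"
print(tree.children[0].value)      # "lazy dog"
```

Because every visited node is a fresh copy, code holding a reference to the original tree does not see the change.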
@@ -262,6 +264,50 @@ that will smartly replace ``UnkownOperation`` by ``AndOperation`` or ``OrOperati

.. _tutorial-pretty-printing:

Head and tail
--------------

An expression may contain spaces, or other characters,
acting as delimiters between sub-expressions.

Most of the time, as we manipulate expressions, we want to keep those spaces,
as they may be meaningful to their author (for example, for formatting the expression).

Luqum manages this by computing, on each element, a `head` and `tail` pair
that gives the characters before and after the part of the expression the item represents.
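
One way a tokenizer could attach separators to neighbouring elements is sketched below. This is a simplified, hypothetical model: `Tracker`, `handle`, `make_token` and the attribute name are illustrative and are not luqum's code. It assumes tokens expose `lexer`, `lexpos`, `type` and `value`, as PLY tokens do inside rule functions. Separators seen before the first element become its head, later separators are appended to the tail of the last element, and the per-run state lives as an attribute on the lexer object, recreated whenever a token at position 0 is seen.

```python
from types import SimpleNamespace

TRACKER_ATTR = "_headtail_tracker"  # hypothetical attribute name


class Tracker:
    """Hypothetical per-tokenization state."""

    def __init__(self):
        self.head = None      # separator seen before the first element
        self.last_elt = None  # last real token; receives later separators

    def handle_token(self, token):
        if token.type == "SEPARATOR":
            if self.last_elt is None:
                # Nothing before it: keep it as head of the next element.
                self.head = (self.head or "") + token.value
            else:
                # Otherwise it extends the tail of the previous element.
                self.last_elt.tail += token.value
        else:
            if self.head is not None:
                token.head = self.head
                self.head = None
            self.last_elt = token


def handle(token):
    """Use the tracker stored on the lexer; start a fresh one at position 0."""
    if token.lexpos == 0:
        setattr(token.lexer, TRACKER_ATTR, Tracker())
    getattr(token.lexer, TRACKER_ATTR).handle_token(token)


def make_token(lexer, type_, value, lexpos):
    # Stand-in for a lexer token: only the attributes used above.
    return SimpleNamespace(
        lexer=lexer, type=type_, value=value, lexpos=lexpos, head="", tail=""
    )


lexer = SimpleNamespace()  # stand-in for the lexer object
tokens = [
    make_token(lexer, "SEPARATOR", "  ", 0),
    make_token(lexer, "TERM", "foo", 2),
    make_token(lexer, "SEPARATOR", " ", 5),
    make_token(lexer, "TERM", "bar", 6),
]
for tok in tokens:
    handle(tok)

print(repr(tokens[1].head), repr(tokens[1].tail))  # '  ' ' '
```

Keeping the state on the lexer object avoids module-level globals, so two lexers can tokenize independently.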

Those properties are computed at parsing time.
If you build trees programmatically (or change them), you will have to set them yourself.

For example, do not write::

>>> from luqum.tree import AndOperation, Word
>>> my_tree = AndOperation(Word('foo'), Word('bar'))

As it would result in::

>>> print(my_tree)
fooANDbar

Instead, you may write:

>>> my_tree = AndOperation(Word('foo', tail=" "), Word('bar', head=" "))
>>> print(my_tree)
foo AND bar

.. py:currentmodule:: luqum.auto_head_tail
Luqum also provides a utility, :py:func:`auto_head_tail`,
to quickly add a minimal head / tail where needed::

>>> from luqum.tree import Not
>>> from luqum.auto_head_tail import auto_head_tail
>>> my_tree = AndOperation(Word('foo'), Not(Word('bar')))
>>> my_tree = auto_head_tail(my_tree)
>>> print(my_tree)
foo AND NOT bar
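
Why a missing head or tail produces ``fooANDbar`` can be seen in a small standalone model. The classes and the `add_minimal_spacing` helper below are hypothetical stand-ins, not luqum's implementation. They assume that printing simply concatenates each element's head, text and tail, and that a minimal fix adds one space only where nothing is present yet.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical leaf item carrying a head and a tail."""
    value: str
    head: str = ""
    tail: str = ""

    def __str__(self) -> str:
        return self.head + self.value + self.tail


@dataclass
class And:
    """Hypothetical binary AND; the operator itself adds no spacing."""
    left: Word
    right: Word

    def __str__(self) -> str:
        return f"{self.left}AND{self.right}"


def add_minimal_spacing(op: And) -> And:
    """Return a copy with one space around the operator where none exists."""
    left = Word(op.left.value, op.left.head, op.left.tail or " ")
    right = Word(op.right.value, op.right.head or " ", op.right.tail)
    return And(left, right)


bare = And(Word("foo"), Word("bar"))
spaced = add_minimal_spacing(bare)
print(bare)    # fooANDbar
print(spaced)  # foo AND bar
```

Existing separators are kept as they are, which matches the stated goal of preserving the author's formatting.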


Pretty printing
---------------

@@ -296,7 +342,6 @@ We can pretty print it::
Named Queries
--------------


.. py:currentmodule:: luqum.naming
Luqum supports using named queries.
24 changes: 23 additions & 1 deletion luqum/head_tail.py
@@ -22,13 +22,23 @@ def __str__(self):


class HeadTailLexer:
"""Utility to handle head and tail at lexer time
"""Utility to handle head and tail at lexer time.
"""

LEXER_ATTR = "_luqum_headtail"

@classmethod
def handle(cls, token, orig_value):
"""Handling a token.
.. note::
PLY does not give access to previous tokens,
nor does it provide any infrastructure for handling specific state.
So we use the strategy
of putting a :py:class:`HeadTailLexer` instance as an attribute of the lexer
each time we start a new tokenization.
"""
# get instance
if token.lexpos == 0:
# first token make instance
@@ -41,7 +51,11 @@ def handle(cls, token, orig_value):

def __init__(self):
self.head = None
"""This will track the head of next element, useful only for first element
"""
self.last_elt = None
"""This will track the last token, so we can use it to add the tail to it.
"""

def handle_token(self, token, orig_value):
"""Handle head and tail for tokens
@@ -79,6 +93,12 @@ class HeadTailManager:
"""

def pos(self, p, head_transfer=False, tail_transfer=False):
"""Compute pos and size of element 0 based on its parts (p[1:])
:param list p: the parser expression as in PLY
:param bool head_transfer: True if head of first child will be transferred to p[0]
:param bool tail_transfer: True if tail of last child will be transferred to p[0]
"""
# pos
if p[1].pos is not None:
p[0].pos = p[1].pos
@@ -159,3 +179,5 @@ def search_field(self, p):


head_tail = HeadTailManager()
"""singleton of HeadTailManager
"""
1 change: 0 additions & 1 deletion luqum/naming.py
@@ -4,7 +4,6 @@
and retrieve their positions in the query text.
This module adds support for that.
"""
from .visitor import TreeVisitor

19 changes: 14 additions & 5 deletions luqum/tree.py
@@ -15,7 +15,8 @@ class Item(object):
An item is a part of a request.
:param int pos: position of element in orginal text (not accounting for tail)
:param int pos: position of element in original text (not accounting for head)
:param int size: size of element in original text (not accounting for head and tail)
:param str head: non meaningful text before this element
:param str tail: non meaningful text after this element
"""
@@ -35,7 +36,7 @@ class Item(object):
# other we might add in the future).

#: this attribute permits to list attributes that participate in telling equality of two item
#: this exclude children (for generic `__eq__` methode will already recursively compare them)
#: this excludes children (for generic `__eq__` method will already recursively compare them)
_equality_attrs = []
#: this attribute permits to list attributes that defines the children.
#: Order is important.
@@ -50,7 +51,10 @@ def __init__(self, pos=None, size=None, head="", tail=""):
def clone_item(self, **kwargs):
"""clone an item, but not its children !
:param dict **kwargs: those item will be added to __init__ call
This is particularly useful for the :py:class:`.visitor.TreeTransformer` pattern.
:param dict kwargs: these items will be passed to the `__init__` call.
It's a simple way to change some values of the target item.
"""
return self._clone_item(cls=type(self), **kwargs)

@@ -75,7 +79,10 @@ def children(self):

@children.setter
def children(self, value):
"""generic setter for children
"""generic setter for children.
Having a setter for children is useful for generic manipulations
like in :py:mod:`luqum.visitor`
"""
if len(value) != len(self._children_attrs):
num_children = len(value) if value else "no"
@@ -93,7 +100,9 @@ def _head_tail(self, value, head_tail):
return value

def span(self, head_tail=False):
"""return (sart, end) position of this element in global expression
"""return (start, end) position of this element in global expression.
:param bool head_tail: should span include head and tail of element ?
"""
if self.pos is None:
start, end = None, None
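
The relationship between `pos`, `size`, `head` and `tail` described in the docstrings above can be sketched as a standalone function. This is an illustration under stated assumptions, not the method body from `luqum/tree.py`: it assumes `pos` points at the first character after the head, `size` excludes head and tail, and an unknown position yields `(None, None)`.

```python
from typing import Optional, Tuple


def span(pos: Optional[int], size: int, head: str = "", tail: str = "",
         head_tail: bool = False) -> Tuple[Optional[int], Optional[int]]:
    """Return (start, end) of an element, optionally widened by head and tail."""
    if pos is None:
        # Position unknown, e.g. for a tree built by hand.
        return None, None
    start, end = pos, pos + size
    if head_tail:
        start -= len(head)
        end += len(tail)
    return start, end


text = "foo AND  bar "
# "bar" starts at index 9 and is 3 characters long,
# with two spaces before it and one after.
print(span(9, 3, head="  ", tail=" "))  # (9, 12)

start, end = span(9, 3, head="  ", tail=" ", head_tail=True)
print((start, end))           # (7, 13)
print(repr(text[start:end]))  # '  bar '
```

Slicing the original text with the returned pair recovers either the bare element or the element with its surrounding separators.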
