Commit

Complete the manual and clean the code for v.1.0.7
  - A first version of RtD manual done
  - Improved comments for create almost all the manual with autodoc
  - Remove abandoned internal helper functions
brunato committed May 7, 2018
1 parent fc642c9 commit f4ddf81
Showing 6 changed files with 102 additions and 74 deletions.
8 changes: 4 additions & 4 deletions doc/conf.py
@@ -31,7 +31,7 @@
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
-release = '1.0.6'
+release = '1.0.7'


# -- General configuration ---------------------------------------------------
@@ -135,7 +135,7 @@
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
-(master_doc, 'elementpath.tex', 'elementpath Documentation',
+(master_doc, 'elementpath.tex', 'elementpath Manual',
'Davide Brunato', 'manual'),
]

@@ -145,7 +145,7 @@
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
-(master_doc, 'elementpath', 'elementpath Documentation',
+(master_doc, 'elementpath', 'elementpath Manual',
[author], 1)
]

@@ -156,7 +156,7 @@
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
-(master_doc, 'elementpath', 'elementpath Documentation',
+(master_doc, 'elementpath', 'elementpath Manual',
author, 'elementpath', 'One line description of project.',
'Miscellaneous'),
]
49 changes: 39 additions & 10 deletions doc/pratt_api.rst
@@ -1,33 +1,62 @@
Pratt's parser API
==================

-The TDOP (Top Down Operator Precedence) parser implemented within this library is variant of the original
-Pratt's parser based on a class for the parser and metaclasses for tokens.
+The TDOP (Top Down Operator Precedence) parser implemented within this library is a variant of the
+original Pratt's parser based on a class for the parser and metaclasses for tokens.

The parser base class includes helper functions for registering token classes,
the Pratt's methods and a regexp-based tokenizer builder. There are also additional
methods and attributes to help the developing of new parsers. Parsers can be defined
by class derivation and following a tokens registration procedure.

-Token Base Class
+Token base class
----------------

.. autoclass:: elementpath.Token

.. autoattribute:: arity
.. autoattribute:: tree
.. autoattribute:: source

-Parser Base Class
+.. automethod:: nud
+.. automethod:: led
+.. automethod:: evaluate
+.. automethod:: iter
+
+Helper methods for checking symbols and for error raising:
+
+.. automethod:: expected
+.. automethod:: unexpected
+.. automethod:: wrong_syntax
+.. automethod:: wrong_name
+.. automethod:: wrong_value
+.. automethod:: wrong_type
+
+
+Parser base class
-----------------

.. autoclass:: elementpath.Parser

Parsing methods:

.. automethod:: build_tokenizer
.. automethod:: parse
.. automethod:: advance
.. automethod:: raw_advance
.. automethod:: expression







Helper methods for building effective parser classes:

.. automethod:: begin
.. automethod:: end
.. automethod:: register
.. automethod:: unregister
.. automethod:: unregistered
.. automethod:: literal
.. automethod:: nullary
.. automethod:: prefix
.. automethod:: postfix
.. automethod:: infix
.. automethod:: infixr
.. automethod:: method
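
The Pratt loop that this manual page documents (a regexp-based tokenizer, `advance`, `expression`, and the `nud`/`led` pair) can be illustrated with a minimal standalone sketch. This is not elementpath's code: `MiniParser`, its `LBP` table and the tuple-based tokens are hypothetical names chosen for illustration, and the sketch evaluates integers directly instead of building a token tree.

```python
import re


class MiniParser:
    """Toy Top Down Operator Precedence parser (illustrative, not elementpath)."""

    # Left binding powers: a higher value binds tighter.
    LBP = {"+": 10, "-": 10, "*": 20}

    def tokenize(self, source):
        # Regexp-based tokenizer: integers and a few operator symbols.
        # Characters that match neither alternative are silently skipped.
        for lexeme in re.findall(r"\d+|[-+*()]", source):
            if lexeme.isdigit():
                yield ("(integer)", int(lexeme))
            else:
                yield (lexeme, None)

    def advance(self):
        # Move to the next token; once exhausted, keep returning '(end)'.
        self.token = next(self.tokens, ("(end)", None))

    def parse(self, source):
        self.tokens = self.tokenize(source)
        self.advance()
        result = self.expression()
        if self.token[0] != "(end)":
            raise SyntaxError("unexpected symbol %r" % self.token[0])
        return result

    def expression(self, rbp=0):
        # Call nud on the current token, then keep applying led while the
        # next token's left binding power exceeds the right binding power.
        symbol, value = self.token
        self.advance()
        left = self.nud(symbol, value)
        while rbp < self.LBP.get(self.token[0], 0):
            symbol = self.token[0]
            self.advance()
            left = self.led(symbol, left)
        return left

    def nud(self, symbol, value):
        # Null denotation: the token starts an expression.
        if symbol == "(integer)":
            return value
        if symbol == "-":
            return -self.expression(rbp=30)
        if symbol == "(":
            inner = self.expression()
            if self.token[0] != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return inner
        raise SyntaxError("unexpected symbol %r" % symbol)

    def led(self, symbol, left):
        # Left denotation: the token follows a parsed left operand.
        right = self.expression(rbp=self.LBP[symbol])
        if symbol == "+":
            return left + right
        if symbol == "-":
            return left - right
        return left * right
```

For example, `MiniParser().parse("1 + 2 * 3")` parses the multiplication first because `*` has the higher binding power.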
11 changes: 11 additions & 0 deletions doc/xpath_api.rst
@@ -3,11 +3,22 @@ Public XPath API

The package includes some classes and functions that implement XPath parsers and selectors.

+XPath tokens
+------------
+
+.. autoclass:: elementpath.XPathToken
+
+.. automethod:: evaluate
+.. automethod:: select
+
+
XPath parsers
-------------

.. autoclass:: elementpath.XPath1Parser

+.. autoattribute:: DEFAULT_NAMESPACES
+
.. autoclass:: elementpath.XPath2Parser


94 changes: 41 additions & 53 deletions elementpath/tdop_parser.py
@@ -45,10 +45,14 @@

class Token(MutableSequence):
"""
-Token base class for defining a parser based on Pratt's method.
+Token base class for defining a parser based on Pratt's method. Each token instance
+is a list-like object. Empty tokens represent simple symbols, names and literals.
+Not empty tokens represent operators where list's items are the operands.
-:cvar symbol: The symbol of the token class.
:param parser: The parser instance that creates the token instance.
:param value: The token value. If not provided defaults to token symbol.
+:cvar symbol: The symbol of the token class.
:cvar lbp: Pratt's left binding power, defaults to 0.
:cvar rbp: Pratt's right binding power, defaults to 0.
:cvar label: A label that can be changed to put a custom category to a token \
@@ -103,6 +107,7 @@ def arity(self):

@property
def tree(self):
+"""Returns a tree representation string."""
symbol, length = self.symbol, len(self)
if symbol == '(name)':
return u'(%s)' % self.value
@@ -117,6 +122,7 @@ def tree(self):

@property
def source(self):
+"""Returns the source representation string."""
symbol = self.symbol
if symbol == '(name)':
return self.value
@@ -134,17 +140,18 @@ def source(self):
return u'%s %s' % (symbol, ' '.join(item.source for item in self))

def nud(self):
-"""Null denotation method"""
+"""Pratt's null denotation method"""
self.wrong_syntax()

def led(self, left):
-"""Left denotation method"""
+"""Pratt's left denotation method"""
self.wrong_syntax()

def evaluate(self, *args, **kwargs):
"""Evaluation method"""

def iter(self):
+"""Returns a generator for iterating the token's tree."""
for t in self[:1]:
for token in t.iter():
yield token
@@ -179,15 +186,15 @@ def wrong_type(self, message=None):

class Parser(object):
"""
-Parser class for implementing a version of a Top Down Operator Precedence parser.
+Parser class for implementing a Top Down Operator Precedence parser.
:cvar symbol_table: A dictionary that stores the token classes defined for the language.
:type symbol_table: dict
:cvar token_base_class: The base class for creating language's token classes.
:type token_base_class: Token
:cvar tokenizer: The language tokenizer compiled regexp.
-:cvar SYMBOLS: A unified list of the definable tokens. It's an optional list useful \
-if you want to make sure all language's symbols are included and defined.
+:cvar SYMBOLS: A list of the definable tokens for the parser. It's an optional list useful \
+if you want to make sure that all formal language's symbols are included and defined.
"""
symbol_table = {}
token_base_class = Token
@@ -242,7 +249,8 @@ def __init__(self):

def parse(self, source):
"""
-The method for parsing a source code of the formal language.
+Parses a source code of the formal language. This is the main method that has to be
+called for a parser's instance.
:param source: The source string.
:return: The root of the token's tree that parse the source.
@@ -264,12 +272,12 @@ def parse(self, source):

def advance(self, *symbols):
"""
-The function for advance to next token.
+The Pratt's function for advancing to next token.
:param symbols: Optional arguments tuple. If not empty one of the provided \
symbols is expected. If the next token's symbol differs the parser raise a \
parse error.
-:return: The next token.
+:return: The next token instance.
"""
if getattr(self.next_token, 'symbol', None) == '(end)':
if self.token is None:
@@ -321,7 +329,7 @@ def raw_advance(self, *stop_symbols):
"""
Advances until one of the symbols is found or the end of source is reached, returning
the raw source string placed before. Useful for raw parsing of comments and references
-enclosed between specific symbols.
+enclosed between specific symbols. This is an extension provided by this implementation.
:param stop_symbols: The symbols that have to be found for stopping advance.
:return: The source string chunk enclosed between the initial position and the first stop symbol.
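
The raw-advance behaviour described in this docstring can be sketched on a plain string. The sketch assumes a hypothetical standalone function `raw_until` that works on a source string and a start position. It is not the library's `raw_advance` method, which operates on the parser's tokenizer state, and where the real method leaves its position relative to the stop symbol is not visible in this diff.

```python
def raw_until(source, position, *stop_symbols):
    """
    Return the raw chunk of `source` between `position` and the first
    occurrence of any stop symbol, plus the index where scanning stopped.
    If no stop symbol is found, the chunk runs to the end of the source.
    """
    # Collect the index of each stop symbol that actually occurs.
    found = [
        index
        for index in (source.find(symbol, position) for symbol in stop_symbols)
        if index >= 0
    ]
    end = min(found) if found else len(source)
    return source[position:end], end
```

A call such as `raw_until("(: a comment :) 1 + 2", 2, ":)")` skips the comment body without tokenizing it, which is the use case the docstring mentions.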
@@ -361,7 +369,7 @@ def raw_advance(self, *stop_symbols):

def expression(self, rbp=0):
"""
-Parse function for expressions. It calls token.nud() and then advance
+Pratt's function for parsing an expression. It calls token.nud() and then advances
until the right binding power is less the left binding power of the next
token, invoking the led() method on the following token.
@@ -416,24 +424,12 @@ def wrong_syntax(self, symbol):

@classmethod
def begin(cls):
-"""
-Begin the symbol registration. Helper functions are bound to global names.
-"""
+"""Begin the symbols registration nullifying the parser tokenizer."""
cls.tokenizer = None
-globals().update({
-'register': cls.register,
-'literal': cls.literal,
-'prefix': cls.prefix,
-'infix': cls.infix,
-'infixr': cls.infixr,
-'method': cls.method,
-})

@classmethod
def end(cls):
-"""
-End the symbol registration. Registers the special (end) symbol and sets the tokenizer.
-"""
+"""End the symbols registration and build the tokenizer."""
cls.register('(end)')
cls.build_tokenizer()

@@ -497,35 +493,18 @@ def symbol_escape(s):

@classmethod
def unregister(cls, symbol):
+"""Unregister a token class from the symbol table."""
del cls.symbol_table[symbol.strip()]

-@classmethod
-def alias(cls, symbol, other):
-symbol = symbol.strip()
-try:
-other_class = cls.symbol_table[other]
-except KeyError:
-raise ElementPathKeyError("%r is not a registered symbol for %r." % (other, cls))
-
-token_class = cls.register(symbol)
-token_class.lbp = other_class.lbp
-token_class.rbp = other_class.rbp
-token_class.nud = other_class.nud
-token_class.led = other_class.led
-token_class.evaluate = other_class.evaluate
-return token_class
-
@classmethod
def unregistered(cls):
+"""Helper function that returns SYMBOLS not yet registered in the symbol table."""
if cls.SYMBOLS:
return [s for s in cls.SYMBOLS if s not in cls.symbol_table]

-@classmethod
-def symbol(cls, s):
-return cls.register(s)
-
@classmethod
def literal(cls, symbol, bp=0):
+"""Register a token for a symbol that represents a *literal*."""
def nud(self):
return self

@@ -536,40 +515,49 @@ def evaluate(self, *args, **kwargs):

@classmethod
def nullary(cls, symbol, bp=0):
+"""Register a token for a symbol that represents a *nullary* operator."""
def nud(self):
return self
return cls.register(symbol, label='operator', lbp=bp, nud=nud)

@classmethod
def prefix(cls, symbol, bp=0):
+"""Register a token for a symbol that represents a *prefix* unary operator."""
def nud(self):
self[:] = self.parser.expression(rbp=bp),
return self
return cls.register(symbol, label='operator', lbp=bp, rbp=bp, nud=nud)

+@classmethod
+def postfix(cls, symbol, bp=0):
+"""Register a token for a symbol that represents a *postfix* unary operator."""
+def led(self, left):
+self[:] = left,
+return self
+return cls.register(symbol, label='operator', lbp=bp, rbp=bp, led=led)
+
@classmethod
def infix(cls, symbol, bp=0):
+"""Register a token for a symbol that represents an *infix* binary operator."""
def led(self, left):
self[:] = left, self.parser.expression(rbp=bp)
return self
return cls.register(symbol, label='operator', lbp=bp, rbp=bp, led=led)

@classmethod
def infixr(cls, symbol, bp=0):
+"""Register a token for a symbol that represents an *infixr* binary operator."""
def led(self, left):
self[:] = left, self.parser.expression(rbp=bp-1)
return self
return cls.register(symbol, label='operator', lbp=bp, rbp=bp-1, led=led)

-@classmethod
-def postfix(cls, symbol, bp=0):
-def led(self, left):
-self[:] = left,
-return self
-return cls.register(symbol, label='operator', lbp=bp, rbp=bp, led=led)
-
@classmethod
def method(cls, symbol, bp=0):
+"""
+Register a token for a symbol that represents a custom operator or redefine
+a method for an existing token.
+"""
token_class = cls.register(symbol, label='operator', lbp=bp, rbp=bp)

def bind(func):
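
The only difference between the `infix` and `infixr` helpers above is the right binding power passed to `expression` (`bp` versus `bp-1`). A minimal standalone sketch shows how that one-off difference turns left associativity into right associativity. The names `BINDING_POWER`, `RIGHT_ASSOCIATIVE` and `parse` are hypothetical, and the sketch builds nested tuples rather than elementpath token instances; it does no error checking.

```python
BINDING_POWER = {"-": 10, "^": 20}
RIGHT_ASSOCIATIVE = {"^"}  # operators treated like infixr registrations


def parse(tokens):
    """Parse a flat token list such as ['a', '-', 'b'] into nested tuples."""
    tokens = list(tokens) + ["(end)"]
    position = 0

    def expression(rbp=0):
        nonlocal position
        left = tokens[position]  # operand: its nud returns the name itself
        position += 1
        while rbp < BINDING_POWER.get(tokens[position], 0):
            operator = tokens[position]
            position += 1
            bp = BINDING_POWER[operator]
            if operator in RIGHT_ASSOCIATIVE:
                bp -= 1  # infixr: let an equal-power operator bind on the right
            left = (operator, left, expression(bp))
        return left

    return expression()
```

With these tables `a - b - c` groups to the left, while `a ^ b ^ c` groups to the right because the recursive call is made with a binding power one lower than the operator's own.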
