Create template filtertransformer BaseTransformer (#287)
* Added BaseTransformer submodule

* Created v1.0.0 grammar and a link to v0.10.1

* Tidied older transformer code:

- Raise deprecation warning for DjangoTransformer
- Removed debug/json transformers
- Tweaked basetransformer docstrings

* Updated contributing docs

* Add no cover to CLI validator method
ml-evs committed Dec 7, 2020
1 parent 36af320 commit f5e7b57
Showing 17 changed files with 390 additions and 508 deletions.
120 changes: 43 additions & 77 deletions CONTRIBUTING.md
@@ -4,37 +4,40 @@ The [Materials Consortia](https://github.com/Materials-Consortia) is very open t

This may be anything from simple feedback and raising [new issues](https://github.com/Materials-Consortia/optimade-python-tools/issues/new) to creating [new PRs](https://github.com/Materials-Consortia/optimade-python-tools/compare).

We have below recommendations for setting up an environment in which one may develop the package further.
Recommendations for setting up a development environment can be found in the [Installation instructions](https://www.optimade.org/optimade-python-tools/install/#full-development-installation).

## Getting Started with Filter Parsing and Transforming

Example use:

```python
from optimade.filterparser import Parser
from optimade.filterparser import LarkParser

p = Parser(version=(0,9,7))
p = LarkParser(version=(1, 0, 0))
tree = p.parse("nelements<3")
print(tree)
```

```shell
Tree(start, [Tree(expression, [Tree(term, [Tree(atom, [Tree(comparison, [Token(VALUE, 'nelements'), Token(OPERATOR, '<'), Token(VALUE, '3')])])])])])
Tree('filter', [Tree('expression', [Tree('expression_clause', [Tree('expression_phrase', [Tree('comparison', [Tree('property_first_comparison', [Tree('property', [Token('IDENTIFIER', 'nelements')]), Tree('value_op_rhs', [Token('OPERATOR', '<'), Tree('value', [Tree('number', [Token('SIGNED_INT', '3')])])])])])])])])])
```

```python
print(tree.pretty())
```

```shell
start
filter
expression
term
atom
expression_clause
expression_phrase
comparison
nelements
<
3
property_first_comparison
property nelements
value_op_rhs
<
value
number 3
```

```python
@@ -43,36 +46,31 @@ print(tree.pretty())
```

```shell
start
filter
expression
term
term
atom
comparison
_mp_bandgap
>
5.0
AND
atom
expression_clause
expression_phrase
comparison
_cod_molecular_weight
<
350
```

```python
# Assumes graphviz installed on system (e.g. `conda install -c anaconda graphviz`) and `pip install pydot`
from lark.tree import pydot__tree_to_png

pydot__tree_to_png(tree, "exampletree.png")
property_first_comparison
property _mp_bandgap
value_op_rhs
>
value
number 5.0
expression_phrase
comparison
property_first_comparison
property _cod_molecular_weight
value_op_rhs
<
value
number 350
```

![example tree](images/exampletree.png)

### Flow for Parsing User-Supplied Filter and Converting to Backend Query

`optimade.filterparser.Parser` will take user input to generate a `lark.Tree` and feed that to a `lark.Transformer`.
E.g., `optimade.filtertransformers.mongo.MongoTransformer` will turn the tree into something useful for your MondoDB backend:
`optimade.filterparser.LarkParser` will take user input to generate a `lark.Tree` and feed that to a `lark.Transformer`.
E.g., `optimade.filtertransformers.mongo.MongoTransformer` will turn the tree into something useful for your MongoDB backend:

```python
# Example: Converting to MongoDB Query Syntax
@@ -85,55 +83,23 @@ query = transformer.transform(tree)
print(query)
```

```python
{'$and': [{'_mp_bandgap': {'$gt': 5.0}}, {'_cod_molecular_weight': {'$lt': 350.0}}]}
```json
{
"$and": [
{"_mp_bandgap": {"$gt": 5.0}},
{"_cod_molecular_weight": {"$lt": 350.0}}
]
}
```
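
A query like the one above is assembled from per-operator mappings. `BaseTransformer` (added in this commit) exposes these as `operator_map` and `reversed_operator_map`, which each backend fills in. The following is a minimal standalone sketch of that idea, not the actual `MongoTransformer` code: the MongoDB-style mappings and the helper functions `property_first` and `constant_first` are assumptions made for illustration, and neither `lark` nor `optimade` is imported.

```python
from typing import Any, Dict

# Assumed MongoDB-flavoured mapping from filter operators to backend operators.
OPERATOR_MAP: Dict[str, str] = {
    "<": "$lt",
    "<=": "$lte",
    ">": "$gt",
    ">=": "$gte",
    "!=": "$ne",
    "=": "$eq",
}

# Swapping the two operands of a comparison flips inequalities,
# while equality and inequality tests are unchanged.
REVERSED_OPERATOR_MAP: Dict[str, str] = {
    "$lt": "$gt",
    "$lte": "$gte",
    "$gt": "$lt",
    "$gte": "$lte",
    "$ne": "$ne",
    "$eq": "$eq",
}


def property_first(prop: str, operator: str, value: Any) -> Dict[str, Any]:
    """Hypothetical helper: build a query for `prop OPERATOR value`."""
    return {prop: {OPERATOR_MAP[operator]: value}}


def constant_first(value: Any, operator: str, prop: str) -> Dict[str, Any]:
    """Hypothetical helper: build a query for `value OPERATOR prop`.

    The property must end up on the left-hand side of the backend query,
    so the backend operator is reversed.
    """
    backend_operator = REVERSED_OPERATOR_MAP[OPERATOR_MAP[operator]]
    return {prop: {backend_operator: value}}


# `3 < nelements` and `nelements > 3` should give the same backend query.
print(constant_first(3, "<", "nelements"))
print(property_first("nelements", ">", 3))
```

Keeping the reversal in a separate map means a constant-first comparison can reuse the property-first code path once the operator has been flipped.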

There is also a [basic JSON transformer][optimade.filtertransformers.json] you can use as a simple example for developing your own transformer.
You can also use the JSON output it produces as an easy-to-parse input for a "transformer" in your programming language of choice.

```python
class JSONTransformer(Transformer):
    def __init__(self, compact=False):
        self.compact = compact
        super().__init__()

    def __default__(self, data, children):
        items = []
        for c in children:
            if isinstance(c, Token):
                token_repr = {
                    "@module": "lark.lexer",
                    "@class": "Token",
                    "type_": c.type,
                    "value": c.value,
                }
                if self.compact:
                    del token_repr["@module"]
                    del token_repr["@class"]
                items.append(token_repr)
            elif isinstance(c, dict):
                items.append(c)
            else:
                raise ValueError(f"Unknown type {type(c)} for tree child {c}")
        tree_repr = {
            "@module": "lark",
            "@class": "Tree",
            "data": data,
            "children": items,
        }
        if self.compact:
            del tree_repr["@module"]
            del tree_repr["@class"]
        return tree_repr
```

### Developing New Filter Transformers

If you would like to add a new transformer, please add:
If you would like to add a new transformer, please raise an issue to signal your intent (in case someone else is already working on this).
Adding a transformer requires the following:

1. A module (.py file) in the `optimade/filtertransformers` folder.
2. Any additional Python requirements must be optional and provided as a separate "`extra_requires`" entry in `setup.py`.
1. A new submodule (`.py` file) in the `optimade/filtertransformers` folder containing an implementation of the transformer object, preferably one that extends `optimade.filtertransformers.base_transformer.BaseTransformer`.
2. Any additional Python requirements must be optional and provided as a separate "`extra_requires`" entry in `setup.py` and in the `requirements.txt` file.
3. Tests in `optimade/filtertransformers/tests` that are skipped if the required packages fail to import.

For examples, please check out existing filter transformers.
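
Requirement 3 above (tests that are skipped when an optional package is missing) could be met in several ways. The sketch below uses only the standard library; the package name `some_optional_backend` and the test class are hypothetical placeholders, not names from this repository.

```python
import importlib.util
import unittest

# Hypothetical optional dependency, used purely for illustration.
OPTIONAL_PACKAGE = "some_optional_backend"

# find_spec returns None when a top-level package is not installed.
HAS_BACKEND = importlib.util.find_spec(OPTIONAL_PACKAGE) is not None


@unittest.skipUnless(HAS_BACKEND, f"{OPTIONAL_PACKAGE} is not installed")
class TestHypotheticalTransformer(unittest.TestCase):
    """Placeholder test case for a transformer that needs the optional package."""

    def test_simple_comparison(self):
        # A real test would import the transformer here and check its output.
        importlib.import_module(OPTIONAL_PACKAGE)


def run_suite() -> unittest.TestResult:
    """Run the placeholder test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        TestHypotheticalTransformer
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print(f"skipped: {len(result.skipped)}, failures: {len(result.failures)}")
```

If the test suite is run with pytest, `pytest.importorskip` is an alternative that skips at import time; which approach fits best depends on how the existing tests in `optimade/filtertransformers/tests` are organised.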
3 changes: 0 additions & 3 deletions docs/api_reference/filtertransformers/debug.md

This file was deleted.

3 changes: 0 additions & 3 deletions docs/api_reference/filtertransformers/json.md

This file was deleted.

5 changes: 5 additions & 0 deletions optimade/filtertransformers/__init__.py
@@ -0,0 +1,5 @@
""" This module implements filter transformer classes for different backends. These
classes typically parse the filter with Lark and produce an appropriate query for the
given backend.
"""
179 changes: 179 additions & 0 deletions optimade/filtertransformers/base_transformer.py
@@ -0,0 +1,179 @@
import abc
from lark import Transformer, v_args
from typing import Dict
from optimade.server.mappers import BaseResourceMapper

__all__ = ("BaseTransformer",)


class BaseTransformer(abc.ABC, Transformer):
    """Generic filter transformer that handles various
    parts of the grammar in a backend non-specific way.
    """

    # map from standard comparison operators to the backend-specific version
    operator_map: Dict[str, str] = {
        "<": None,
        "<=": None,
        ">": None,
        ">=": None,
        "!=": None,
        "=": None,
    }

    # map from back-end specific operators to their inverse
    # e.g. {"$lt": "$gt"} for MongoDB.
    reversed_operator_map: Dict[str, str] = {}

    def __init__(self, mapper: BaseResourceMapper = None):
        """Initialise the transformer object, optionally loading in a
        resource mapper for use when post-processing.
        """
        self.mapper = mapper

    def postprocess(self, query):
        """Post-process the query according to the rules defined for
        the backend.
        """
        return query

    def transform(self, tree):
        """ Transform the query using the Lark transformer then post-process. """
        return self.postprocess(super().transform(tree))

    def __default__(self, data, children, meta):
        raise NotImplementedError(
            f"Calling __default__, i.e., unknown grammar concept. data: {data}, children: {children}, meta: {meta}"
        )

    def filter(self, arg):
        """ filter: expression* """
        return arg[0] if arg else None

    @v_args(inline=True)
    def constant(self, value):
        """ constant: string | number """
        # Note: Return as is.
        return value

    @v_args(inline=True)
    def value(self, value):
        """ value: string | number | property """
        # Note: Return as is.
        return value

    @v_args(inline=True)
    def non_string_value(self, value):
        """ non_string_value: number | property """
        # Note: Return as is.
        return value

    @v_args(inline=True)
    def not_implemented_string(self, value):
        """not_implemented_string: value
        Raises:
            NotImplementedError: For further information, see Materials-Consortia/OPTIMADE issue 157:
                https://github.com/Materials-Consortia/OPTIMADE/issues/157
        """
        raise NotImplementedError("Comparing strings is not yet implemented.")

    def property(self, arg):
        """ property: IDENTIFIER ( "." IDENTIFIER )* """
        return ".".join(arg)

    @v_args(inline=True)
    def string(self, string):
        """ string: ESCAPED_STRING """
        return string.strip('"')

    @v_args(inline=True)
    def signed_int(self, number):
        """ signed_int : SIGNED_INT """
        return int(number)

    @v_args(inline=True)
    def number(self, number):
        """ number: SIGNED_INT | SIGNED_FLOAT """
        if number.type == "SIGNED_INT":
            type_ = int
        elif number.type == "SIGNED_FLOAT":
            type_ = float
        return type_(number)

    @v_args(inline=True)
    def comparison(self, value):
        """ comparison: constant_first_comparison | property_first_comparison """
        # Note: Return as is.
        return value

    @abc.abstractmethod
    def value_list(self, arg):
        """ value_list: [ OPERATOR ] value ( "," [ OPERATOR ] value )* """

    @abc.abstractmethod
    def value_zip(self, arg):
        """ value_zip: [ OPERATOR ] value ":" [ OPERATOR ] value (":" [ OPERATOR ] value)* """

    @abc.abstractmethod
    def value_zip_list(self, arg):
        """ value_zip_list: value_zip ( "," value_zip )* """

    @abc.abstractmethod
    def expression(self, arg):
        """ expression: expression_clause ( OR expression_clause ) """

    @abc.abstractmethod
    def expression_clause(self, arg):
        """ expression_clause: expression_phrase ( AND expression_phrase )* """

    @abc.abstractmethod
    def expression_phrase(self, arg):
        """ expression_phrase: [ NOT ] ( comparison | "(" expression ")" ) """

    @abc.abstractmethod
    def property_first_comparison(self, arg):
        """property_first_comparison: property ( value_op_rhs | known_op_rhs | fuzzy_string_op_rhs | set_op_rhs |
        set_zip_op_rhs | length_op_rhs )
        """

    @abc.abstractmethod
    def constant_first_comparison(self, arg):
        """ constant_first_comparison: constant OPERATOR ( non_string_value | not_implemented_string ) """

    @v_args(inline=True)
    @abc.abstractmethod
    def value_op_rhs(self, operator, value):
        """ value_op_rhs: OPERATOR value """

    @abc.abstractmethod
    def known_op_rhs(self, arg):
        """ known_op_rhs: IS ( KNOWN | UNKNOWN ) """

    @abc.abstractmethod
    def fuzzy_string_op_rhs(self, arg):
        """ fuzzy_string_op_rhs: CONTAINS value | STARTS [ WITH ] value | ENDS [ WITH ] value """

    @abc.abstractmethod
    def set_op_rhs(self, arg):
        """ set_op_rhs: HAS ( [ OPERATOR ] value | ALL value_list | ANY value_list | ONLY value_list ) """

    @abc.abstractmethod
    def length_op_rhs(self, arg):
        """ length_op_rhs: LENGTH [ OPERATOR ] value """

    @abc.abstractmethod
    def set_zip_op_rhs(self, arg):
        """set_zip_op_rhs: property_zip_addon HAS ( value_zip | ONLY value_zip_list | ALL value_zip_list |
        ANY value_zip_list )
        """

    @abc.abstractmethod
    def property_zip_addon(self, arg):
        """ property_zip_addon: ":" property (":" property)* """
