Merge 843f8b3 into e550dc8

textX · Sep 9, 2020 · cd16c8c
2 parents e550dc8 + 843f8b3
commit cd16c8c
Showing 16 changed files with 1,587 additions and 54 deletions.
diff --git a/docs/rrel.md b/docs/rrel.md
@@ -0,0 +1,183 @@
+# Reference resolving expression language (RREL)
+RREL allows you to specify the scope provider (lookup) in the
+grammar itself (see the [grammar example](tests/functional/test_scoping/components_model1/ComponentsRrel.tx) and
+an example [test](https://github.com/textX/textX/blob/master/tests/functional/examples/test_hierarchical_data_structures_referencing_attributes.py)).
+
+The idea is to support all current builtin scoping providers (e.g., `FQN`,
+`RelativeName` etc.; see [scoping](scoping.md)), so that the user has to resort to Python only
+for very specific cases or for references to models not handled by textX.
+
+# Reference resolving expression language (RREL)
+
+Each reference in the model forms a dot separated name, matched by the second part
+of the grammar reference, where a plain ID is just a special case. For example,
+a reference could be `package1.component4` or just `component4`. We could further
+generalize this by saying that a reference is a sequence of names where a plain
+ID is just a sequence of length 1. It doesn't have to be dot separated. A user
+could provide a match (like `FQN` in the above example) and a match processor to
+convert the matched string to a sequence of names. But for simplicity's sake, in
+this text we assume that the name is a string which consists of
+name parts separated with dots.
+
+
+The inputs to reference resolving are:
+- A dot separated name, where a plain ID is a special case
+- An RREL expression
+
+We evaluate the RREL expression against the name and yield either the referenced
+object or an error.
+
+## RREL operators
+
+Reference resolving expression language (RREL) consists of several operators (see [test](tests/functional/test_scoping/test_rrel.py)):
+- `.` - Dot navigation. Search for the attribute in the current AST context. Can
+  be used for navigation up the parent chain, e.g. `.` is this object, `..` is
+  parent, `...` is a parent of a parent. If the expression starts with a `.`
+  then we have a relative path starting from the current AST context. Otherwise
+  we have an absolute path starting from the root of the model unless `^` is
+  used (see below). For example, `.a.b` means search for `a` attribute at the
+  current level (`.`) and then search for `b` attribute. Expression `a.b` would
+  search starting from the root of the model.
+- `parent(TYPE)` - navigate up the parent chain until the exact type is found.
+- `~` - This is a marker applied to a path element to inform the resolver that the
+  current collection should not be searched by the current name part but that
+  all elements should be processed. For example, to search for a method in the
+  inheritance hierarchy one would write `~extends*.methods` which (due to `*`,
+  see below) first searches the `methods` collection of the current context object.
+  If not found, all elements of the current `extends` collection are iterated in
+  the order of definition without consuming a name part, and then the name would be
+  searched in the `methods` collection of each object from the `extends`
+  collection. If not found, `*` would expand `extends` to `extends.extends` if
+  possible and the search would continue.
+- `*` - Repeat/expand. Used in the expansion step to expand a sub-expression 0 or
+  more times. The first expansion tried is 0 times, then once, then twice etc. For
+  example, `~extends*.methods` would search in the `methods` collection of the
+  current context object for the current name part. If not found, expansion of
+  `*` would take place and we would search in `~extends.methods` by iterating
+  over the `extends` collection without consuming a name part (due to `~`) and
+  searching by the reference name part inside the `methods` collection of each iterated
+  object. The process would continue (i.e. `~extends.~extends.methods` ...)
+  until no more expansion is possible as we reach the end of the inheritance chain.
+- `^` - Bottom-up search. This operator specifies that the given path should be
+  expanded bottom-up, up the parent chain. The search should start at the
+  current AST context and go up the parent chain for the number of components in
+  the current expanded path. Then the match should be tried. See the components
+  example above using `^` in `extends`. For example, `^a.b.c` would start from
+  the current AST level and go to the parent of the parent, search there for
+  `a`, then would search for `b` in the context of the previous AST search
+  result, and finally would search for attribute `c`. `^` is a marker applied to
+  path search subexpression, i.e. it doesn't apply to the whole sequence (see
+  below).
+- `,` - Defines a sequence, i.e. a sequence of RREL expressions which should
+  be tried in order.
+
+Priorities from highest to lowest: `*`, `.`, `,`.
+
+`~` and `^` are regarded as markers, not operators.
+
+## RREL evaluation
+
+Evaluation goes like this:
+1. Expand the expression. Expand `*` starting from 0 times.
+2. Match/navigate the expression (consume part names in the process)
+3. Repeat
+
+The process stops when either:
+- All possibilities are exhausted and we haven't found anything -> error.
+- In `*` we come to a situation where we have consumed all part names before we
+  have finished with the RREL expression -> error.
+- We have consumed all path name elements, finished with the RREL expression and
+  found the object. If the type is not the same as the type given in the grammar
+  reference we report an error, else we have found our object.
+
+## RREL reference name deduction
+
+The name of a referenced object is transformed into a list of non-empty
+name parts, which is processed by an RREL expression to navigate through the
+model. Possible names are defined in the grammar, e.g. `FQN` in the
+following example (used in rule `Attribute` to reference a model class):
+
+    Model:     packages*=Package;
+    Package:   'package' name=ID '{' classes*=Class '}';
+    Class:     'class' name=ID '{' attributes*=Attribute '}';
+    Attribute: 'attr' ref=[Class|FQN|^packages*.classes] name=ID ';';
+    Comment:   /#.*/;
+    FQN:       ID('.'ID)*;
+
+The name of a reference (`Attribute.ref`) could then be,
+e.g., `P1.Part1` (the package `P1` and the class `Part1`),
+separated by a dot. The **dot is the default separator**
+(if no other separator is specified).
+
+    package P1 {
+        class Part1 {
+        }
+    }
+    package P2 {
+        class Part2 {
+            attr C2 rec;
+        }
+        class C2 {
+            attr P1.Part1 p1;
+            attr Part2 p2a;
+            attr P2.Part2 p2b;
+        }
+    }
+
+The match rule used to specify possible reference names (e.g., `FQN`)
+can **specify a separator used to split the reference name into individual
+name parts**. Use the rule parameter `split`, which must be a non-empty
+string (e.g. `split='/'`; note that the match rule itself should produce
+names, for which the given separator makes sense):
+
+    Model:          packages*=Package;
+    Package:        'package' name=ID '{' classes*=Class '}';
+    Class:          'class' name=ID '{' attributes*=Attribute '}';
+    Attribute:      'attr' ref=[Class|FQN|^packages*.classes] name=ID ';';
+    Comment:        /#.*/;
+    FQN[split='/']: ID('/'ID)*;  // separator split='/'
+
+Then the RREL scope provider (using the match rule with the extra
+rule parameter `split`) automatically uses the given split
+string to process the name.
+
+## RREL and multi-file models
+
+Use the prefix `+m:` for an RREL expression to activate a multi-file model
+scope provider. Then, in case of no match, other loaded models are searched.
+When using this extra prefix the importURI feature is activated
+(see [scoping](scoping.md) and
+[grammar example](https://github.com/textX/textX/blob/master/tests/functional/registration/projects/data_dsl/data_dsl/Data.tx)).
+
+## Using RREL from Python code
+
+An RREL expression can be used during registration in place of a scope provider.
+For example:
+
+```Python
+my_meta_model.register_scope_providers({
+        "*.*": scoping_providers.FQN(),
+        "Connection.from_port": "from_inst.component.slots",  # RREL
+        "Connection.to_port": "from_inst.component.slots"      # RREL
+    })
+```
+
+## RREL processing (internal)
+
+RREL expressions are parsed when the grammar is loaded and transformed to an AST
+consisting of RREL operator nodes (each node could be an instance of an `RREL`
+prefixed class, e.g. `RRELSequence`). The expression ASTs are stateless, so
+it is possible to define the same expression for multiple
+attributes by using a wildcard, as the same expression tree would be used for the
+evaluation.
+
+In the process of evaluation the root of the expression tree is given the
+sequence of part names and the current context, which represents the parent object
+of the reference in the model AST. The evaluation is then carried out by
+recursive calls of the RREL AST nodes. Each node gets the AST context consisting
+of a collection of objects from the model and the collection of currently unconsumed
+part names, which are the result of the previous operation or, in the case of
+the root expression AST node, the initial input. Each operator object should
+return the next AST context and the unconsumed part of the name. At the end of
+a successful search the AST context should be a single object and the name parts
+should be empty.
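
The `~extends*.methods` walk-through in the "RREL operators" section of docs/rrel.md can be sketched in plain Python. This is only an illustration of the expansion order described there (0 times, then once, then twice, ...), not textX's implementation: `Cls`, `expansions`, `find_method` and the `max_depth` bound are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Cls:
    """Hypothetical stand-in for a model object with `extends` and `methods`."""
    name: str
    methods: Dict[str, str] = field(default_factory=dict)
    extends: List["Cls"] = field(default_factory=list)


def expansions(max_depth: int) -> Iterator[List[str]]:
    """Yield the paths `~extends*.methods` expands to, shortest first."""
    for n in range(max_depth + 1):
        yield ["~extends"] * n + ["methods"]


def find_method(context: Cls, name: str, max_depth: int = 10) -> Optional[str]:
    """Try each expansion in turn; return the first match, else None."""
    for path in expansions(max_depth):
        candidates = [context]
        for _step in path[:-1]:
            # `~extends`: visit every element of the collection, in order of
            # definition, without consuming the name part.
            candidates = [base for obj in candidates for base in obj.extends]
        if not candidates:
            return None  # end of the inheritance chain, nothing left to expand
        for obj in candidates:
            # `methods`: this step is searched by the name part.
            if name in obj.methods:
                return obj.methods[name]
    return None  # max_depth reached (assumed guard against cyclic `extends`)


base = Cls("Base", methods={"run": "Base.run"})
middle = Cls("Middle", methods={"stop": "Middle.stop"}, extends=[base])
leaf = Cls("Leaf", methods={"go": "Leaf.go"}, extends=[middle])
```

With these objects, `find_method(leaf, "go")` is found with zero expansions, `"stop"` needs one and `"run"` needs two; a name that is nowhere in the chain yields `None` once `extends` is exhausted.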
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -15,6 +15,7 @@ nav:
   - Meta-model: metamodel.md
   - Model: model.md
   - Scoping: scoping.md
+  - Reference resolving expression language: rrel.md
   - Multi meta-model: multimetamodel.md
   - Parser configuration: parser_config.md
   - Registration/Discovery: registration.md

diff --git a/tests/functional/examples/test_hierarchical_data_structures_referencing_attributes.py b/tests/functional/examples/test_hierarchical_data_structures_referencing_attributes.py
@@ -25,6 +25,30 @@ class RefItem(object):
     valref = attr.ib()
 
 
+model_text = '''
+    struct A {
+        val x
+    }
+    struct B {
+        val a: A
+    }
+    struct C {
+        val b: B
+        val a: A
+    }
+    struct D {
+        val c: C
+        val b1: B
+        val a: A
+    }
+    instance d: D
+    instance a: A
+    reference d.c.b.a.x
+    reference d.b1.a.x
+    reference a.x
+'''
+
+
 def test_referencing_attributes():
     """
     The key idea is that the list of references to "Val"s in the
@@ -34,7 +58,6 @@ def test_referencing_attributes():
     With this, the list "refs" to "RefItem"s in the "Reference" object is
     build completely during initial parsing. The references inside the
     "RefItem"s, can the be resolved on after the other...
-
     We also show how to handle custom classes here.
     """
     grammar = '''
@@ -53,28 +76,6 @@ def test_referencing_attributes():
     RefItem:
         '.' valref=[Val];
     '''
-    model_text = '''
-    struct A {
-        val x
-    }
-    struct B {
-        val a: A
-    }
-    struct C {
-        val b: B
-        val a: A
-    }
-    struct D {
-        val c: C
-        val b1: B
-        val a: A
-    }
-    instance d: D
-    instance a: A
-    reference d.c.b.a.x
-    reference d.b1.a.x
-    reference a.x
-    '''
 
     for classes in [[], [Instance, Reference, RefItem]]:
 
@@ -126,7 +127,7 @@ def ref_scope(refItem, myattr, attr_ref):
         assert m.references[2].refs[0].valref == m.structs[0].vals[0]
 
         # negative tests
-        # error: "not_there" not pasrt of A
+        # error: "not_there" not part of A
         with raises(textx.exceptions.TextXSemanticError,
                     match=r'.*Unknown object.*not_there.*'):
             mm.model_from_str('''
@@ -153,3 +154,117 @@ def ref_scope(refItem, myattr, attr_ref):
             instance c: C
             reference c.b.a.x
             ''')
+
+
+def test_referencing_attributes_with_rrel_all_in_one():
+    """
+    RREL solution: all scope provider information encoded in the grammar.
+    """
+
+    mm = metamodel_from_str('''
+        Model:
+            structs+=Struct
+            instances+=Instance
+            references+=Reference;
+        Struct:
+            'struct' name=ID '{' vals+=Val '}';
+        Val:
+            'val' name=ID (':' type=[Struct])?;
+        Instance:
+            'instance' name=ID (':' type=[Struct])?;
+        Reference:
+            'reference' ref=[Val|FQN|instances.~type.vals.(~type.vals)*];
+        FQN: ID ('.' ID)*;
+        ''')
+    m = mm.model_from_str(model_text)
+    assert m.references[-1].ref == m.structs[0].vals[0]  # a.x
+
+    assert m.references[0].ref.name == 'x'
+    assert m.references[0].ref == m.structs[0].vals[0]
+
+    assert m.references[1].ref == m.structs[0].vals[0]
+
+    assert m.references[2].ref.name == 'x'
+    assert m.references[2].ref == m.structs[0].vals[0]
+
+    # negative tests
+    # error: "not_there" not part of A
+    with raises(textx.exceptions.TextXSemanticError,
+                match=r'.*Unknown object "c.b.a.not_there".*'):
+        mm.model_from_str('''
+        struct A { val x }
+        struct B { val a: A}
+        struct C {
+            val b: B
+            val a: A
+        }
+        instance c: C
+        reference c.b.a.not_there
+        ''')
+
+    # error: B.a is not of type A
+    with raises(textx.exceptions.TextXSemanticError,
+                match=r'.*Unknown object "c.b.a.x".*'):
+        mm.model_from_str('''
+        struct A { val x }
+        struct B { val a }
+        struct C {
+            val b: B
+            val a: A
+        }
+        instance c: C
+        reference c.b.a.x
+        ''')
+
+
+def test_referencing_attributes_with_rrel_all_in_one_splitstring():
+    """
+    RREL solution: variation with different split string specified in match rule.
+    """
+
+    mm = metamodel_from_str('''
+        Model:
+            structs+=Struct
+            instances+=Instance
+            references+=Reference;
+        Struct:
+            'struct' name=ID '{' vals+=Val '}';
+        Val:
+            'val' name=ID (':' type=[Struct])?;
+        Instance:
+            'instance' name=ID (':' type=[Struct])?;
+        Reference:
+            'reference' instance=[Instance]
+            '.' ref=[Val|FQN|.~instance.~type.vals.(~type.vals)*];
+        FQN[split='->']: ID ('->' ID)*;
+        ''')
+    m = mm.model_from_str('''
+        struct A {
+            val x
+        }
+        struct B {
+            val a: A
+        }
+        struct C {
+            val b: B
+            val a: A
+        }
+        struct D {
+            val c: C
+            val b1: B
+            val a: A
+        }
+        instance d: D
+        instance a: A
+        reference d.c->b->a->x
+        reference d.b1->a->x
+        reference a.x
+    ''')
+
+    assert m.references[0].ref.name == 'x'
+    assert m.references[0].ref == m.structs[0].vals[0]
+
+    assert m.references[1].ref == m.structs[0].vals[0]
+
+    assert m.references[2].ref.name == 'x'
+    assert m.references[2].ref == m.structs[0].vals[0]
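
The path `instances.~type.vals.(~type.vals)*` used in the tests above can be traced by hand with a small plain-Python sketch. It assumes the model is flattened into two dictionaries and that a single separator splits the whole reference; `STRUCTS`, `INSTANCES` and `resolve` are hypothetical names, and the code does not call textX at all.

```python
from typing import Dict, Optional

# struct name -> {val name -> type name or None}; mirrors `model_text`
STRUCTS: Dict[str, Dict[str, Optional[str]]] = {
    "A": {"x": None},
    "B": {"a": "A"},
    "C": {"b": "B", "a": "A"},
    "D": {"c": "C", "b1": "B", "a": "A"},
}
# instance name -> struct name
INSTANCES: Dict[str, str] = {"d": "D", "a": "A"}


def resolve(name: str, split: str = ".") -> str:
    """Return 'Struct.val' for the referenced val, or raise LookupError."""
    parts = name.split(split)
    if len(parts) < 2 or "" in parts:
        raise LookupError(f'Unknown object "{name}"')
    # `instances`: the first name part selects an instance; `~type` then
    # steps to its struct without consuming a name part.
    struct = INSTANCES.get(parts[0])
    owner = None
    for part in parts[1:]:
        # `vals`: the next name part must be a val of the current struct.
        if struct is None or part not in STRUCTS[struct]:
            raise LookupError(f'Unknown object "{name}"')
        # `(~type.vals)*`: follow the val's type for the next name part.
        owner, struct = struct, STRUCTS[struct][part]
    return f"{owner}.{parts[-1]}"
```

For example, `resolve("d.c.b.a.x")` walks `D -> C -> B -> A` and ends at the val `x` of struct `A`, while `resolve("d->b1->a->x", split="->")` reaches the same val with a different split string. An untyped val in the middle of the path (or an unknown name part) raises `LookupError`.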
diff --git a/tests/functional/registration/projects/data_dsl/data_dsl/Data.tx b/tests/functional/registration/projects/data_dsl/data_dsl/Data.tx
@@ -4,6 +4,6 @@ Model: includes*=Include data+=Data;
 Data: 'data' name=ID '{'
     attributes+=Attribute
 '}';
-Attribute: name=ID ':' type=[t.Type];
+Attribute: name=ID ':' type=[t.Type|ID|+m:types];
 Include: '#include' importURI=STRING;
 Comment: /\/\/.*$/;