MontiCore Grammars for Expressions, Literals and Types - an Overview

MontiCore is a language workbench. It uses grammars as the primary mechanism to describe DSLs. The extended grammar format allows language components to be composed by (1) inheriting, (2) extending, (3) embedding and (4) aggregating grammars (see the reference manual for details). A lot of infrastructure is generated from the grammars. This infrastructure is composable as well and can be extended with handwritten code. Most importantly, these extensions and the grammar composition are compatible, which leads to optimal forms of reuse.

The following is a library of language components that the core MontiCore project provides, mainly defined through a primary grammar plus associated Java- and Template-Files. These are available in the MontiCore core project together with short descriptions and their status (Status of Grammars).

The list covers mainly the core grammars to be found in the MontiCore/monticore project under monticore-grammar/src/main/grammars/ in packages

  • de.monticore
  • de.monticore.expressions
  • de.monticore.literals
  • de.monticore.statements
  • de.monticore.symbols
  • de.monticore.types

and some expression/type related grammars in extending MontiCore projects. For more languages and language components, see here.

General: List of Grammars in package de.monticore

MCBasics.mc4 (stable)

  • This grammar defines absolute basics, such as spaces, Java-like comments and Names. It should be useful in many languages.

Types: List of Grammars in package de.monticore.types

These grammars generally deal with type definitions and build on each other. Some snippets for type definitions:

grammars          some examples
MCBasicTypes      boolean  byte  short  int
                  long  char  float  double
                  void  Person  a.b.Person
                  import a.b.Foo.*;
MCCollectionTypes List<.>   Set<.>
                  Optional<.>   Map<.,.>
MCSimpleGenericTypes
                  Foo<.>  a.b.Bar<.,..,.>
MCFullGenericTypes
                  Foo<? extends .>
                  Foo<? super .>
MCArrayTypes      Person[]  int[][]
SI Unit types     km/h  km/h<long>
RegExType         R"[a-z][0-9*]"
  • This grammar defines basic types. This eases the reuse of type structures in languages similar to Java, that are somewhat simplified, e.g. without generics.
  • The grammar contains types from Java, e.g., primitives, void, classes (also sometimes called "reference types").
  • This grammar defines four generics: List<A>, Map<A,B>, Set<A> and Optional<A> on top of basic types.
  • These four generics correspond to a typical predefined set of generic types for example used in connection with UML class diagrams or the OCL. UML associations typically have those association multiplicities and therefore these types are of interest.
  • This eases the reuse of type structures in languages similar to Java, that are somewhat simplified, e.g. without general generics.
  • This grammar introduces freely defined generic types such as Blubb<A>, Bla<B,C>, Foo<Blubb<D>>
  • These generics are covering a wide range of uses for generic types, although they don't cover type restrictions on the arguments, like in Java.
  • This grammar completes the type definitions to support the full Java type system including wildcards Blubb<? extends A>
  • A general advice: If you are not sure that you need this kind of types, use a simpler version from above. Type checking is tricky.
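To illustrate what the wildcard forms Foo<? extends .> and Foo<? super .> mean in plain Java, here is a minimal sketch. It does not use any MontiCore API; the class name WildcardTypesSketch and its methods are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of MontiCore.
final class WildcardTypesSketch {

    // List<? extends Number>: elements can be read as Number,
    // but nothing (except null) can be added to the list.
    static double sum(List<? extends Number> numbers) {
        double total = 0.0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    // List<? super Integer>: Integer values can be added,
    // but elements can only be read as Object.
    static void fill(List<? super Integer> sink, int count) {
        for (int i = 0; i < count; i++) {
            sink.add(i);
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2.5)));
        List<Object> sink = new ArrayList<>();
        fill(sink, 3);
        System.out.println(sink);
    }
}
```

If a language only needs types such as List<Person>, none of this machinery is required, which is the reason for the advice above to prefer the simpler type grammars.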

Arrays are orthogonal to the generic extensions and can thus be combined with any of the above variants. The language component MCArrayTypes provides possibilities to add arrays, such as Person[] or int[][].

The known units s, m, kg, A, K, mol, cd from the international system of units (SI Units) and their combinations, such as km/h or mg, can be used as ordinary types (instead of only numbers). The typecheck is extended to prevent, e.g., the assignment of a weight to a length variable, or to add an appropriate conversion, e.g., when a km/h-based velocity is stored in a m/s-based variable.

The grammar resides in the MontiCore/SIunits project.
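The conversion mentioned above (a km/h-based velocity stored in a m/s-based variable) amounts to a fixed scaling factor. The following is a minimal plain-Java sketch of that arithmetic only; it is not the code that the SI unit typecheck generates, and the class and method names are hypothetical.

```java
// Hypothetical illustration class, not part of the MontiCore/SIunits project.
final class VelocityConversionSketch {

    private static final double METERS_PER_KILOMETER = 1000.0;
    private static final double SECONDS_PER_HOUR = 3600.0;

    // Converts a velocity given in km/h into m/s.
    static double kmhToMs(double kmh) {
        return kmh * METERS_PER_KILOMETER / SECONDS_PER_HOUR;
    }

    public static void main(String[] args) {
        // 36 km/h = 36000 m / 3600 s = 10 m/s
        System.out.println(kmhToMs(36.0)); // prints 10.0
    }
}
```

With SI unit types, such a factor would not have to be written by hand, because the unit is part of the type and the conversion can be derived from it.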

Includes the types from SIUnitTypes4Math (see above), like km/h, but also allows adding a resolution, such as km/h<int>. Here SI Unit types, like km/h<.>, are used as generic type constructors that may take a number type, such as int, long, double, or float, as argument.

The grammar resides in the MontiCore/SIunits project.

Embedded in R"...", a regular expression can be used as an ordinary type to constrain the values allowed for stored variables, attributes, and parameters. Examples of such types are R"[a-z]" (single character) or R"^([01][0-9]|2[0-3])$" (hours). A typecheck for these types can only be executed at runtime and, e.g., issue exceptions (or trigger repair functions) if violated. The static typecheck only uses String as the underlying carrier type.

This grammar resides in the MontiCore/RegEx (not yet publicly available) project.
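A runtime check of this kind could look roughly like the following plain-Java sketch, which keeps String as the carrier type and throws an exception when an assigned value violates the pattern. This is an assumption about how such a check might be written by hand, not the code of the RegEx project; the class RegExTypedValueSketch is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of the MontiCore/RegEx project.
final class RegExTypedValueSketch {

    private final Pattern constraint;
    private String value; // String is the underlying carrier type

    RegExTypedValueSketch(String regex, String initial) {
        this.constraint = Pattern.compile(regex);
        set(initial);
    }

    // Runtime typecheck: reject values that do not match the pattern.
    void set(String candidate) {
        if (!constraint.matcher(candidate).matches()) {
            throw new IllegalArgumentException(
                "Value '" + candidate + "' violates " + constraint.pattern());
        }
        this.value = candidate;
    }

    String get() {
        return value;
    }

    // Convenience check without creating a variable.
    static boolean accepts(String regex, String candidate) {
        return Pattern.compile(regex).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        RegExTypedValueSketch hour =
            new RegExTypedValueSketch("^([01][0-9]|2[0-3])$", "07");
        hour.set("23");                 // accepted
        System.out.println(hour.get()); // prints 23
        try {
            hour.set("24");             // rejected at runtime
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A repair function, as mentioned above, could replace the exception, for example by keeping the previous value.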

Symbols: List of Grammars in package de.monticore.symbols

These two grammars do not provide syntax themselves, but characterize important forms of symbols, that will be used in the type and the expression grammars to define shared kinds of symbols.

  • This grammar defines symbols for Types (of all kinds), Functions, Variables and TypeVariables.
  • The defined symbols are of general form and can be used in functional, OO and other contexts. They do not preclude a concrete syntax and do not yet embody OO specifics.
  • Remark: This grammar is not intended to define concrete or abstract syntax, but the infrastructure for symbols.

OOSymbols.mc4 (stable)

  • This grammar defines symbols for object-oriented Types, Methods, and Fields by mainly extending the symbols defined in BasicTypeSymbols.
  • The newly defined symbols extend the general ones by typical object-oriented features, such as private, static, etc. Again they do not preclude a concrete syntax.
  • Remark: This grammar is not intended to define concrete or abstract syntax, but the infrastructure for symbols in an object-oriented context.

Expressions: List of Grammars in package de.monticore.expressions

Expressions are defined in several grammars forming a (nonlinear) hierarchy, so that developers can choose the optimal grammar they want to build on for their language and combine these with the appropriate typing infrastructure.

This modularity of expressions and associated types greatly eases the reuse of type structures in languages similar to Java. Some snippets for operators defined in expressions:

grammar        operators and examples in this grammar
CommonExp:     /  %  +  -  <=  >=  ==  >  <  !=  ~.  !.  .?.:.
               &&  ||  ~. 
AssignmentExp: ++  --  =  +=  -=  *=  /=  &=  |=  ^=  >>=  >>>=  <<=  %=
BitExp:        &  |  ^  <<  >>  >>>
OclExp:        implies  <=>  |  &  forall  exists  let.in. .@pre  .[.]  .**
               Set{.|.}
SetExp:        .isin.  .in.  union  intersect  setand  setor
               { item | specifier }
OptionalOps:   ?:  ?<=  ?>=  ?<  ?>  ?==  ?!=  ?~~   ?!~ 
SIUnits:       5km  3,2m/s  22l  2.400J  
JavaClass:     this  .[.]  (.).  super  .instanceof.
  • This grammar defines core interfaces for expressions and imports the kinds of symbols necessary.
  • The symbols are taken over from the TypeSymbols grammar (see below).
  • A hierarchy of conservative extensions to this grammar realize these interfaces in various forms.
  • This grammar defines a typical standard set of operations for expressions.
  • This is a subset of Java as well as OCL/P, mainly for arithmetic, comparisons, variable use (v), attribute use (o.att), method call (foo(arg,arg2)) and brackets (exp).
  • This grammar defines all Java expressions that have side effects.
  • This includes assignment expressions like =, +=, etc. and suffix and prefix expressions like ++, --, etc.
  • This grammar defines a typical standard set of operations for expressions.
  • This is a subset of Java for binary expressions like <<, >>, >>>, &, ^ and |
  • This grammar defines expressions typical of UML's OCL. OCL expressions can safely be composed with other forms of expressions
    given in the MontiCore core project (i.e. as conservative extension).
  • It contains various logical operations, such as quantifiers, the let and the @pre construct, and a transitive closure for associations, as discussed in [Rum17,Rum17].
  • This grammar resides in the MontiCore/OCL (not yet publicly available) project.
  • This grammar defines set expressions like set union, intersection, etc. These operations are typical for a logic with set operations, like UML's OCL. These operators are usually infix and are thus more intuitive, as they allow a math-oriented style of specification.
  • Most of these operators are in principle executable, so it might be interesting to include them in a high-level programming language (see e.g. Haskell).
  • This grammar resides in the MontiCore/OCL (not yet publicly available) project.
  • This grammar defines nine operators dealing with optional values, e.g. defined by java.lang.Optional. The operators are also called Elvis operators.
  • E.g., val ?: 0W is equivalent to val.isPresent ? val.get : 0W
  • x ?>= y is equivalent to x.isPresent && x.get >= y
  • This grammar resides in the MontiCore/OCL (not yet publicly available) project.
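The two equivalences given for the optional operators can be written out in plain Java with java.util.Optional. The sketch below only mirrors those two expansions; it uses int values instead of the SI literal 0W, and the class and method names are hypothetical.

```java
import java.util.Optional;

// Hypothetical illustration class, not part of the MontiCore/OCL project.
final class OptionalOperatorsSketch {

    // Expansion of "val ?: fallback":
    // val.isPresent ? val.get : fallback
    static int elvis(Optional<Integer> val, int fallback) {
        return val.isPresent() ? val.get() : fallback;
    }

    // Expansion of "x ?>= y":
    // x.isPresent && x.get >= y
    static boolean optionalGreaterEquals(Optional<Integer> x, int y) {
        return x.isPresent() && x.get() >= y;
    }

    public static void main(String[] args) {
        System.out.println(elvis(Optional.of(5), 0));                     // prints 5
        System.out.println(elvis(Optional.empty(), 0));                   // prints 0
        System.out.println(optionalGreaterEquals(Optional.of(3), 2));     // prints true
        System.out.println(optionalGreaterEquals(Optional.empty(), 2));   // prints false
    }
}
```

Note that an absent value makes the comparison false instead of raising an exception.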

SIUnits.mc4 (not yet publicly available) for Physical SI Units (stable)

  • This grammar defines the international system of units (SI units), based on the basis units s, m, kg, A, K, mol, cd. It provides a variety of derived units, which can be refined using prefixes such as m (milli), k (kilo), etc.
  • The SI Unit grammar provides an extension to expressions, but also to the typing system, e.g. types such as km/h or km/h<long>, and literals, such as 5.3 km/h.
  • The grammars reside in the MontiCore/SIunits project.
  • This grammar defines Java specific class expressions like super, this, type cast, etc.
  • This grammar should only be included, when a mapping to Java is intended and the full power of Java should be available in the modelling language.

Literals: List of Grammars in package de.monticore.literals

Literals are the basic elements of expressions, such as numbers, strings, truth values. Some snippets:

grammar           examples of this grammar
MCCommonLit       3  -3  2.17  -4  true  false  'c' 
                  3L  2.17d  2.17f  0xAF  "string"  
                  "str\uAF01\u0001"  null
MCJavaLiterals    999_999  0x3F2A  0b0001_0101  0567  1.2e-7F
SIUnitLiterals    5.3km/h  7mg
  • This grammar defines the core interface for literals.
  • Several conservative extensions to this grammar realize various forms of literals.
  • This grammar defines the typical literals for an expression language, such as characters ('c'), strings ("text"), booleans (true), null, or numbers (10, -23, 48l, 23.1f).
  • Strings and characters use Java-like escapes, such as \n.
  • Each defined nonterminal is extended by a conversion function getValue() of appropriate type and a retrieve function getSource() for a text representation of the literal.
  • This grammar defines Java compliant literals and builds on MCCommonLiterals.
  • The scope of this grammar is to ease the reuse of literal structures in Java-like sublanguages.
  • The grammar contains literals from Java, e.g., Boolean, Char, String, ....
  • Please note that Java (and this grammar) has an extended syntax e.g. for integers using underscores or other kinds of encodings. They parse e.g. 999_999, 0x3F2A, or 0b10100.
  • As above, getValue() and getSource() allow retrieving the content as a value or as a text string, respectively.
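Since these literal forms are taken from Java, their numeric values can be checked directly in Java. The sketch below only shows how the Java compiler itself interprets the example literals; it makes no claim about the exact return types of getValue() in MontiCore, and the class name is hypothetical.

```java
// Hypothetical illustration class, not part of MontiCore.
final class JavaLiteralValuesSketch {

    static final int UNDERSCORED = 999_999; // underscores are only visual separators
    static final int HEX = 0x3F2A;          // hexadecimal: 3*4096 + 15*256 + 2*16 + 10
    static final int BINARY = 0b10100;      // binary: 16 + 4
    static final int OCTAL = 0567;          // leading 0 means octal: 5*64 + 6*8 + 7

    public static void main(String[] args) {
        System.out.println(UNDERSCORED); // prints 999999
        System.out.println(HEX);         // prints 16170
        System.out.println(BINARY);      // prints 20
        System.out.println(OCTAL);       // prints 375
    }
}
```

The difference between the source text (for example 0x3F2A) and the value (16170) is what the distinction between getSource() and getValue() captures.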

Provides concrete values, such as 5.3 km/h or 7 mg, for the international system of units (SI Units). The grammar resides in the MontiCore/SIunits project.

Statements: List of Grammars in package de.monticore.statements

Statements are the constructive part of programs: They allow changing variables, calling functions, sending messages, etc. The following hierarchy of statement definitions should allow developers to choose the needed forms of statements and extend them according to their own additional needs. The provided list of statements is inspired by Java (it is actually a subset of Java). Some example statements:

int i;   int j = 2;                     Person p[] = { foo(3+7), p2, ...}
if (.) then . else .                    for ( i = .; .; .) {.}
while (.) .                             do . while (.)
switch (.) { case .: .; default: .}
foo(1,2,3)                              return .                                
assert . : "..."
try {.} catch (.) {.} finally {.}       throw .           
break .                                 continue .
label:                                  private  static  final  native ...
  • This grammar defines the core interface for statements.
  • A hierarchy of conservative extensions to this grammar is provided below.
  • This grammar defines typical statements, such as method calls (which are actually expressions), assignment of variables, if, for, while, switch statements, and blocks.
  • This embodies a complete structured statement language, but it does not provide return, assert, exceptions, or low-level constructs like break.
  • This grammar defines exactly the assert statement as known from Java.
  • It can be used independently of other Java statements.
  • This grammar defines the exception statements.
  • This includes Java try with catch and finally, as well as throw.
  • This grammar defines the Java-like synchronized statement.
  • This grammar defines three low-level statements that Java provides.
  • It contains the break and continue statements and the possibility to label a statement.
  • This grammar defines the Java-like return statement.
  • This grammar defines all Java statements.
  • This is neither a generalized approximation nor a restricted overapproximation, but exact.

Further grammars in package de.monticore

Several other grammars are also available:

  • This grammar defines regular expressions (RegEx) as used in Java (see e.g. java.util.regex.Pattern).
  • It provides common regex tokens such as
    • character classes, e.g., lowercase letters ([a-z]), the letters a, b, and c ([abc])
    • anchors, e.g., start of line (^), end of line ($), word boundary (\b),
    • quantifiers, e.g., zero or one (?), zero or more (*), exactly 3 ({3}),
    • RegEx also supports capturing groups and referencing these captured groups in replacements.
  • For example, ^([01][0-9]|2[0-3]):[0-5][0-9]$ matches all valid timestamps in HH:MM format.
  • The main nonterminal RegularExpression is not part of the expression hierarchy and thus regular expressions are not used as ordinary values. Instead, the nonterminal RegularExpression can be used in other places of a language, e.g. we use it as an additional restriction for String values in input/output channels in architectural languages.
  • This grammar resides in the MontiCore/RegEx (not yet publicly available) project.
  • This grammar defines UML Cardinalities of forms *, [n..m] or [n..*].
  • This grammar defines completeness information in UML like ..., (c), but also (...,c).
  • The grammar contains the modifiers that UML provides.
  • This includes public, private, protected, final, abstract, local, derived, readonly, and static, but also the compact syntactic versions +, #, -, / and ? (for readonly).
  • UML modifiers are not identical to Java modifiers (e.g., native and threadsafe are missing).
  • This grammar defines stereotypes like <<val1,val2="text",...>>
  • Methods contains(name), getValue(name) assist Stereotype retrieval.
  • Values may only be of type String. In UML, the real value is unfortunately only encoded as a String.
  • We suggest to use a tagging infrastructure that even allows to type the possible forms of tags.
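Returning to the regular-expression grammar described in the list above: since it follows the Java regex syntax, the HH:MM example pattern can be tried out directly with java.util.regex. The sketch below is plain Java and does not use the MontiCore RegEx grammar or its generated classes; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of the MontiCore/RegEx project.
final class TimestampRegexSketch {

    // The HH:MM pattern from the text; group 1 captures the hours.
    private static final Pattern HH_MM =
        Pattern.compile("^([01][0-9]|2[0-3]):[0-5][0-9]$");

    static boolean isValidTimestamp(String text) {
        return HH_MM.matcher(text).matches();
    }

    // Returns the captured hour part, or null if the text is not a valid timestamp.
    static String hourOf(String text) {
        Matcher m = HH_MM.matcher(text);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(isValidTimestamp("23:59")); // prints true
        System.out.println(isValidTimestamp("24:00")); // prints false
        System.out.println(hourOf("07:30"));           // prints 07
    }
}
```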

MCCommon.mc4 (stable)

  • This grammar composes typical UML-like grammar components.
  • This includes Cardinality, Completeness, UMLModifier, and UMLStereotype.

JavaLight.mc4 (stable)

int age = 3+x; 
List<Person> myParents;

@Override
public int print(String name, Set<Person> p) {
  int a = 2 + name.length();
  if(a < p.size()) {
    System.out.println("Hello " + name);
  }
  return a;
}
  • JavaLight is a subset of Java that MontiCore itself uses as intermediate language for the code generation process.
  • JavaLight doesn't provide all forms of classes (e.g. inner classes) and reduces the type system to normal generic types.
    However, that is sufficient for representation of all generated pieces of code that MontiCore wants to make.
  • Included are: the full Java expressions (without anonymous classes), the relevant Java statements, declaration of methods, constructors, constants, interface methods, and annotations.
  • JavaLight is composed from CommonExpressions, AssignmentExpressions, JavaClassExpressions, MCCommonStatements, MCBasicTypes, and OOSymbols.
  • JavaLight can be used for other generator tools as well, especially as its core templates are reusable and new templates for specific method bodies can be added using MontiCore's Hook-Mechanisms.

Examples for Grammars under monticore-grammar/src/main/examples

These can also be used if someone is interested:

Further Information