
Transformation Rule Language

Josef Hardi edited this page Mar 6, 2020 · 28 revisions

Introduction

Note: This wiki page describes MappingMaster v2.0, which has not been released yet.

MappingMaster has a template processor that combines a template string with values from a spreadsheet to produce OWL axioms. The template string is written in OWL 2 Manchester Syntax, extended with cell reference symbols that indicate where each value is located in the spreadsheet. We call this template string a transformation rule.

An example of a MappingMaster transformation rule:

Class: @A1
   SubClassOf: GroceryItem, containsFoodStuff some @B1
   Annotations: schema:description @C1(xml:lang="en"),
                schema:image @D1(IRI)

Given the following spreadsheet:

The rule will produce the following axioms:

Class: LotusBiscoffCookiesItem
   SubClassOf: GroceryItem, containsFoodStuff some BiscoffCookie
   Annotations: schema:description "For decades, Lotus Biscoff [...] to be served on airplanes."@en,
                schema:image http://i.imgur.com/E2DQvDy.jpg

The following sections describe the features of the language in more detail.




Cell Reference

Cell references are indicated by the @ prefix. @A1 is a cell reference, as are @B*, @*2, and @**. The asterisk (*) is a wildcard that is replaced by each valid column or row label in a given cell range. The following subsections use these examples to describe the different types of cell references.

■ Absolute references

The most basic cell reference type is the absolute reference. An @A1 reference points to the cell at the intersection of column A and row 1. The MappingMaster processor looks up this coordinate, fetches the cell value, and substitutes it for the reference in the rule string.

Transformation rule:

Class: @A1

Spreadsheet:

Output:

Class: Wheat
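
The lookup described above can be sketched in Python. This is only an illustration, not MappingMaster's implementation: the sheet contents other than "Wheat" and the helper names column_index, resolve_absolute, and render are assumptions made for the example.

```python
import re

# Hypothetical in-memory sheet: a list of rows, each row a list of cell strings.
# Only the value "Wheat" in A1 comes from the example above.
SHEET = [
    ["Wheat", "Grain"],
    ["Egg", "Protein"],
]


def column_index(letters):
    """Convert column letters to a 0-based index (A -> 0, Z -> 25, AA -> 26)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def resolve_absolute(reference, sheet):
    """Return the value of the cell named by an absolute reference such as '@A1'."""
    match = re.fullmatch(r"@([A-Za-z]+)([0-9]+)", reference)
    if match is None:
        raise ValueError(f"not an absolute cell reference: {reference!r}")
    col = column_index(match.group(1))
    row = int(match.group(2)) - 1  # spreadsheet rows are 1-based
    return sheet[row][col]


def render(rule, sheet):
    """Substitute every absolute reference in the rule with its cell value."""
    return re.sub(
        r"@[A-Za-z]+[0-9]+",
        lambda m: resolve_absolute(m.group(0), sheet),
        rule,
    )


print(render("Class: @A1", SHEET))
```

With the assumed sheet, the call prints the same axiom as the output above.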

■ Row-relative references

Relative references need a cell range to operate on. A cell range is a collection of selected cells, defined by the reference of its upper-left cell (minimum value) and the reference of its lower-right cell (maximum value). MappingMaster supports only rectangular ranges, not irregular ones.

An asterisk (*) is used to indicate relative references. A row-relative reference puts the asterisk sign in the row position.

Transformation rule:

Class: @A*

Cell range: A1 to A4

Spreadsheet:

Output:

Class: Wheat
Class: Egg
Class: Flour
Class: Bean
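
One way to picture a row-relative reference is as a rule that is instantiated once per row in the range. The Python sketch below assumes that column A holds the four values shown in the output above; the function names are hypothetical and the code is not taken from MappingMaster.

```python
import re

# Assumed cell contents, inferred from the output above.
CELLS = {"A1": "Wheat", "A2": "Egg", "A3": "Flour", "A4": "Bean"}


def expand_row_relative(rule, first_row, last_row):
    """Turn every '@<column>*' into '@<column><row>' for each row in the range."""
    return [
        re.sub(r"@([A-Z]+)\*", lambda m, r=row: f"@{m.group(1)}{r}", rule)
        for row in range(first_row, last_row + 1)
    ]


def substitute(rule, cells):
    """Replace each absolute reference with the value stored for that address."""
    return re.sub(r"@([A-Z]+[0-9]+)", lambda m: cells[m.group(1)], rule)


rules = expand_row_relative("Class: @A*", 1, 4)
axioms = [substitute(rule, CELLS) for rule in rules]
print("\n".join(axioms))
```

A column-relative reference could be handled the same way, with the loop running over column letters instead of row numbers.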

■ Column-relative references

Similarly, a column-relative reference puts the asterisk sign in the column position.

Transformation rule:

Class: @*1

Cell range: A1 to D1

Spreadsheet:

Output:

Class: ChiaFlour
Class: OatFlour
Class: RiceFlour
Class: WheatFlour

■ Table-relative references

A table-relative reference puts the asterisk in both the row and column positions. MappingMaster scans the cell range column by column: it takes a column, iterates over its cells, and repeats until it has processed the last column.

Transformation rule:

Class: @**

Cell range: A1 to C3

Spreadsheet:

Output:

Class: Wheat
Class: Bean
Class: Egg
Class: Oat
Class: Millet
Class: Rice
Class: Salt
Class: Sugar
Class: Honey
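
The column-by-column scan can be sketched as two nested loops with the column loop on the outside. In this Python illustration the grid contents are inferred from the output above under the assumption that each column is read from top to bottom; column_letters and scan_table are hypothetical helper names.

```python
def column_letters(index):
    """Convert a 0-based column index to letters (0 -> A, 25 -> Z, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def scan_table(first_col, last_col, first_row, last_row):
    """Yield cell addresses one column at a time (top to bottom is an assumption)."""
    for col in range(first_col, last_col + 1):
        for row in range(first_row, last_row + 1):
            yield f"{column_letters(col)}{row}"


# Assumed grid for the range A1 to C3, inferred from the output order above.
GRID = {
    "A1": "Wheat", "B1": "Oat",    "C1": "Salt",
    "A2": "Bean",  "B2": "Millet", "C2": "Sugar",
    "A3": "Egg",   "B3": "Rice",   "C3": "Honey",
}

addresses = list(scan_table(0, 2, 1, 3))
axioms = [f"Class: {GRID[address]}" for address in addresses]
print("\n".join(axioms))
```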

Directives

Directives are parameterized commands or keywords that control the behavior of MappingMaster's data transformation. Directives are written in parentheses after a cell reference, e.g., @A2(mm:snakeCaseEncode), and a cell reference can have zero or more directives.

■ Prefix assignment

| Symbol | Type | Note |
| --- | --- | --- |
| mm:Prefix="<value>" | Parameterized | The expected value is a valid prefix that is already known by Protege. |

Transformation rule:

Class: @A1(mm:Prefix="ex")

Spreadsheet:

Output:

Class: ex:BiscoffCookie

■ Language assignment

| Symbol | Type | Note |
| --- | --- | --- |
| xml:lang="<value>" | Parameterized | The expected value is a valid language code (see ISO 639-1). |

Transformation rule:

Class: @A1
   Annotations: schema:description @C2(xml:lang="en")

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: schema:description "For decades, Lotus Biscoff [...] to be served on airplanes."@en

■ Entity indicator

| Symbol | Type | Note |
| --- | --- | --- |
| Class | Keyword | Indicate the cell as an OWL Class. |
| Individual | Keyword | Indicate the cell as an OWL Individual. |
| ObjectProperty | Keyword | Indicate the cell as an OWL Object Property. |
| DataProperty | Keyword | Indicate the cell as an OWL Data Property. |
| AnnotationProperty | Keyword | Indicate the cell as an OWL Annotation Property. |
| Literal | Keyword | Indicate the cell as an OWL Literal. |
| Datatype | Keyword | Indicate the cell as an OWL Datatype. |
| IRI | Keyword | Indicate the cell as an IRI. |

Transformation rule:

Individual: @A1
   Annotations: schema:image @C3(IRI)

Spreadsheet:

Output:

Individual: BiscoffCookie
   Annotations: schema:image http://i.imgur.com/E2DQvDy.jpg

■ Value type indicator

| Symbol | Type | Note |
| --- | --- | --- |
| mm:entityIRI | Keyword | Indicate the entity name in an IRI. |
| mm:entityTerm | Keyword | Indicate the entity name in a prefixed name. |

Transformation rule:

Class: @A1
   Annotations: @B7(ObjectProperty) @C7(mm:entityIRI)

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: isSoldBy <http://example.org/stores/Costco_Palo_Alto>

■ Datatype casting

| Symbol | Type | Note |
| --- | --- | --- |
| xsd:string | Keyword | Cast the cell as a string-typed value. |
| xsd:decimal | Keyword | Cast the cell as a decimal-typed value. |
| xsd:byte | Keyword | Cast the cell as a byte-typed value. |
| xsd:short | Keyword | Cast the cell as a short-typed value. |
| xsd:integer | Keyword | Cast the cell as an integer-typed value. |
| xsd:long | Keyword | Cast the cell as a long-typed value. |
| xsd:float | Keyword | Cast the cell as a float-typed value. |
| xsd:double | Keyword | Cast the cell as a double-typed value. |
| xsd:boolean | Keyword | Cast the cell as a boolean-typed value. |
| xsd:dateTime | Keyword | Cast the cell as a datetime-typed value. |
| xsd:time | Keyword | Cast the cell as a time-typed value. |
| xsd:duration | Keyword | Cast the cell as a duration-typed value. |
| xsd:date | Keyword | Cast the cell as a date-typed value. |
| rdf:PlainLiteral | Keyword | Cast the cell as a plain literal-typed value. |

Transformation rule:

Individual: @A1
   Facts: hasTotalFat @C4(xsd:integer)

Spreadsheet:

Output:

Individual: BiscoffCookie
   Facts: hasTotalFat 6

■ IRI encoding

| Symbol | Type | Note |
| --- | --- | --- |
| mm:camelCaseEncode | Keyword | Rewrite the value in a camelCase format. |
| mm:snakeCaseEncode | Keyword | Rewrite the value in a snake_case format. |
| mm:uuidEncode | Keyword | Rewrite the cell address in a UUID format. |
| mm:hashEncode | Keyword | Rewrite the value in a hash string format. |

Transformation rule:

Class: @A1(mm:snakeCaseEncode)

Spreadsheet:

Output:

Class: Biscoff_Cookie
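
The sample output suggests that the words of the cell value are joined with underscores while their capitalization is kept. The Python sketch below illustrates that reading. It assumes the cell contains "Biscoff Cookie" (the spreadsheet is not shown here), and snake_case_encode is a hypothetical name, not MappingMaster's own function.

```python
def snake_case_encode(value):
    """Join whitespace-separated words with underscores, keeping each word's case."""
    words = value.strip().split()
    return "_".join(words)


# Assumed cell value; the result matches the output shown above.
print("Class: " + snake_case_encode("Biscoff Cookie"))
```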

■ Shift direction

| Symbol | Type | Note |
| --- | --- | --- |
| mm:shiftUp | Keyword | Move the cursor upward until it finds a value. |
| mm:shiftDown | Keyword | Move the cursor down until it finds a value. |
| mm:shiftLeft | Keyword | Move the cursor left until it finds a value. |
| mm:shiftRight | Keyword | Move the cursor right until it finds a value. |

Transformation rule:

Class: @A6(mm:shiftUp)

Spreadsheet:

Output:

Class: BiscoffCookie
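
A shift directive is useful when a value is written once and the cells next to it are left blank, for example in a grouped or merged column. The following Python sketch shows the idea for mm:shiftUp. The column contents are assumed (a value in A1 and blanks in A2 to A6), and shift_up is a hypothetical helper rather than MappingMaster code.

```python
# Hypothetical column A: a value in row 1 and empty cells in rows 2 to 6.
COLUMN_A = {1: "BiscoffCookie", 2: "", 3: "", 4: "", 5: "", 6: ""}


def shift_up(cells, row):
    """Walk upward from `row` until a non-empty cell is found.

    Returns the value found, or None when every cell up to row 1 is empty.
    """
    while row >= 1:
        value = cells.get(row, "")
        if value.strip():
            return value
        row -= 1
    return None


print("Class: " + shift_up(COLUMN_A, 6))
```

The other three directions would differ only in which coordinate changes and in which direction it moves.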

■ Order if cell empty

| Symbol | Type | Note |
| --- | --- | --- |
| mm:createIfCellEmpty | Keyword | Create an entity (using the UUID encoding) when the cursor gets an empty cell. |
| mm:ignoreIfCellEmpty | Keyword | Don't create any entity when the cursor gets an empty cell. |
| mm:warningIfCellEmpty | Keyword | Log a warning message when the cursor gets an empty cell. |
| mm:errorIfCellEmpty | Keyword | Log an error message and stop the operation when the cursor gets an empty cell. |

Transformation rule:

Class: @A6(mm:createIfCellEmpty)

Spreadsheet:

Output:

Class: 8de0b2e9-5413-3215-91d0-1cf84a8834fa
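
The four empty-cell policies can be read as a small dispatch on the directive name. The Python sketch below is one possible interpretation. Two points are assumptions: the name-based UUID derived from the cell address (the namespace is chosen arbitrarily, so the result will not equal the sample above) and the choice to skip the cell after logging a warning. The function name resolve_cell is hypothetical.

```python
import logging
import uuid

logger = logging.getLogger("mapping_sketch")


def resolve_cell(address, value, policy):
    """Return an entity name for the cell, or None when nothing should be created."""
    if value.strip():
        return value
    if policy == "mm:createIfCellEmpty":
        # Assumption: a stable, name-based UUID computed from the cell address.
        return str(uuid.uuid3(uuid.NAMESPACE_OID, address))
    if policy == "mm:ignoreIfCellEmpty":
        return None
    if policy == "mm:warningIfCellEmpty":
        logger.warning("cell %s is empty", address)
        return None  # assumption: the cell is skipped after the warning
    if policy == "mm:errorIfCellEmpty":
        logger.error("cell %s is empty", address)
        raise ValueError(f"cell {address} is empty")
    raise ValueError(f"unknown policy: {policy}")


created = resolve_cell("A6", "", "mm:createIfCellEmpty")
print("Class: " + created)
```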

Built-in Functions

The MappingMaster library provides a number of functions that are always available, called built-in functions. A function always comes after the directives in a cell reference, e.g., @A2(Class mm:shiftUp mm:append("GO_")), and a cell reference can currently use only one function.

■ To upper case

| Symbol | Type | Description |
| --- | --- | --- |
| mm:toUpperCase | Keyword | Convert the string value to upper case. |

Transformation rule:

Class: @A1(mm:toUpperCase)

Spreadsheet:

Output:

Class: BISCOFFCOOKIE

■ To lower case

| Symbol | Type | Description |
| --- | --- | --- |
| mm:toLowerCase | Keyword | Convert the string value to lower case. |

Transformation rule:

Class: @A1(mm:toLowerCase)

Spreadsheet:

Output:

Class: biscoffcookie

■ Trim

| Symbol | Type | Description |
| --- | --- | --- |
| mm:trim | Keyword | Remove the leading and trailing spaces from a string. |

Transformation rule:

Class: @A1(mm:trim)

Spreadsheet:

Output:

Class: BiscoffCookie

■ Printf

| Symbol | Type | Description |
| --- | --- | --- |
| mm:printf("formatted-string") | Parameterized | The "formatted-string" must include the %s format specifier, which is replaced by the cell value. |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:printf("%s Item"))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "BiscoffCookie Item"

■ Decimal format

| Symbol | Type | Description |
| --- | --- | --- |
| mm:decimalFormat("decimal-format") | Parameterized | The "decimal-format" contains a formatting pattern that displays a number string in a certain way. The special pattern characters can be found here. |

Transformation rule:

Individual: @A1
   Facts: hasTotalFat @C4(mm:decimalFormat("###.00"))

Spreadsheet:

Output:

Individual: BiscoffCookie
   Facts: hasTotalFat 6.00
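
To illustrate what a pattern such as "###.00" does, the Python sketch below handles only the simplest case: a fixed number of fraction digits, given by the "0" placeholders after the decimal point. It ignores the placeholders before the decimal point and every other pattern character, and decimal_format is a hypothetical name, so treat it as an approximation of the behavior shown above rather than a full formatter.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def decimal_format(pattern, cell_value):
    """Format a number string with as many fraction digits as the pattern has '0's.

    Only patterns like '###.00' are supported by this sketch.
    """
    _, _, fraction = pattern.partition(".")
    if set(fraction) - {"0"}:
        raise ValueError("only '0' placeholders are supported after the decimal point")
    digits = len(fraction)
    quantum = Decimal(1).scaleb(-digits)  # Decimal('0.01') for two digits
    return str(Decimal(cell_value).quantize(quantum, rounding=ROUND_HALF_EVEN))


print("Facts: hasTotalFat " + decimal_format("###.00", "6"))
```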

■ Capturing

| Symbol | Type | Description |
| --- | --- | --- |
| mm:capturing("regex") | Parameterized | Extract a sequence of characters that matches the regular expression ("regex"). |

Transformation rule:

Individual: @A1
   Facts: hasSodium @C6(mm:capturing("([0-9]+)"))

Spreadsheet:

Output:

Individual: BiscoffCookie
   Facts: hasSodium 120
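
The extraction step can be illustrated with Python's re module. The cell value "120mg" is an assumption (the spreadsheet is not shown here), as is the choice to return the first capture group when the pattern has one; capturing is a hypothetical helper name.

```python
import re


def capturing(pattern, cell_value):
    """Return the first match of `pattern` in the cell value, or None if absent.

    When the pattern contains a capture group, the first group is returned;
    otherwise the whole match is returned.
    """
    match = re.search(pattern, cell_value)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


# Assumed cell value for C6.
print("Facts: hasSodium " + capturing("([0-9]+)", "120mg"))
```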

■ Reverse

| Symbol | Type | Description |
| --- | --- | --- |
| mm:reverse | Keyword | Reverse the order of characters in a string. |

Transformation rule:

Class: @A1(mm:reverse)

Spreadsheet:

Output:

Class: eikooCffocsiB

■ Replace

| Symbol | Type | Description |
| --- | --- | --- |
| mm:replace("old-char", "new-char") | Parameterized | Replace all occurrences of "old-char" with "new-char". |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:replace("Cookie", "Biscuit"))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "BiscoffBiscuit"

■ Replace all

| Symbol | Type | Description |
| --- | --- | --- |
| mm:replaceAll("regex", "replacement") | Parameterized | Replace every sequence of characters that matches the regular expression ("regex") with the "replacement" string. |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:replaceAll("o", "a"))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "BiscaffCaakie"

■ Replace first

| Symbol | Type | Description |
| --- | --- | --- |
| mm:replaceFirst("regex", "replacement") | Parameterized | Replace the first sequence of characters matching the regular expression ("regex") with the "replacement" string. |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:replaceFirst("o", "a"))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "BiscaffCookie"
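
The three replace functions differ in how the first argument is interpreted and in how many matches are rewritten. The Python sketch below contrasts them under the assumption that mm:replace treats its argument as literal text while the other two treat it as a regular expression. The function names are hypothetical, and Python's regular expression dialect may differ in details from the one MappingMaster uses.

```python
import re


def replace(value, old, new):
    """Replace every literal occurrence of `old` with `new`."""
    return value.replace(old, new)


def replace_all(value, regex, replacement):
    """Replace every match of the regular expression."""
    return re.sub(regex, replacement, value)


def replace_first(value, regex, replacement):
    """Replace only the first match of the regular expression."""
    return re.sub(regex, replacement, value, count=1)


label = "BiscoffCookie"
print(replace(label, "Cookie", "Biscuit"))
print(replace_all(label, "o", "a"))
print(replace_first(label, "o", "a"))
```

The distinction matters for characters that are special in regular expressions: under these assumptions, replacing "." in "a.b" changes one character with the literal function but every character with the regex-based one.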

■ Append

| Symbol | Type | Description |
| --- | --- | --- |
| mm:append("string") | Parameterized | Concatenate the input "string" after the cell value. |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:append(" Item"))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "BiscoffCookie Item"

■ Prepend

| Symbol | Type | Description |
| --- | --- | --- |
| mm:prepend("string") | Parameterized | Concatenate the input "string" before the cell value. |

Transformation rule:

Class: @A1
   Annotations: rdfs:label @C1(mm:prepend("Pack of "))

Spreadsheet:

Output:

Class: BiscoffCookie
   Annotations: rdfs:label "Pack of BiscoffCookie"
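
Because the direction of append and prepend is easy to mix up, the following Python sketch spells out both, together with the equivalent printf-style templates. The function names are hypothetical, and the cell value is assumed to be "BiscoffCookie" as in the outputs above.

```python
def append(cell_value, text):
    """Put `text` after the cell value."""
    return cell_value + text


def prepend(cell_value, text):
    """Put `text` before the cell value."""
    return text + cell_value


def printf(template, cell_value):
    """Fill the %s placeholder of the template with the cell value."""
    return template % (cell_value,)


value = "BiscoffCookie"
print(append(value, " Item"))
print(prepend(value, "Pack of "))
print(printf("Pack of %s Item", value))
```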

Language Grammar

The grammar for the transformation rule language is defined using a standard BNF notation, which is summarized in the table below.

| Construct | Syntax | Example |
| --- | --- | --- |
| non-terminal symbols | normal font | ClassExpression |
| terminal symbols | single-quoted | 'Class:' |
| zero or more | curly braces | { ClassExpression } |
| zero or one | square brackets | [ ClassExpression ] |
| alternative | vertical bar | IRI \| Literal |
| grouping | parentheses | ( ClassExpression ) |

■ Transformation Rules

transformationRule ::= ruleExpression

ruleExpression ::= classFrame | individualFrame

classFrame ::= classDeclaration 
      { subClasses
        | equivalentClasses
        | annotationAssertions }

classDeclaration ::= 'Class:' atomicClass

subClasses ::= 'SubClassOf:' classExpression { ',' classExpression }

equivalentClasses ::= 'EquivalentTo:' classExpression { ',' classExpression }

annotationAssertions ::= 'Annotations:' annotation { ',' annotation }

annotation ::= property value

classExpression ::= atomicClass
      | restriction
      | '(' classExpressionList ')'
      | '{' objectList '}'

restriction ::= propertySomeValue
      | propertyOnlyValue
      | propertyHasValue
      | propertyExactCardinality
      | propertyMinCardinality
      | propertyMaxCardinality

propertySomeValue ::= property 'some' filler

propertyOnlyValue ::= property 'only' filler

propertyHasValue ::= property 'value' value

propertyExactCardinality ::= property 'exactly' cardinalityValue [ filler ]

propertyMinCardinality ::= property 'min' cardinalityValue [ filler ]

propertyMaxCardinality ::= property 'max' cardinalityValue [ filler ]

property ::= iri | prefixedName | cellReference

filler ::= datatype
      | iri
      | prefixedName
      | literal
      | cellReference
      | '(' classExpressionList ')'
      | '{' objectList '}'

classExpressionList ::= unionOfClassExpression | 'not' classExpression

unionOfClassExpression ::= intersectionOfClassExpression { 'or' intersectionOfClassExpression }

intersectionOfClassExpression ::= classExpression { 'and' classExpression }

entityList ::= entity { ',' entity }

entity ::= iri | prefixedName | cellReference

atomicClass ::= iri | prefixedName | cellReference

value ::= iri | prefixedName | literal | cellReference

cardinalityValue ::= integer | cellReference

cellReference ::= '@' cellCoordinate [ '(' { directive } [ builtinFunction ] ')' ]

literal ::= integer | float | string | boolean

■ Directives

directive ::= prefixAssignment
      | languageAssignment
      | entityCast
      | valueCast
      | datatypeCast
      | iriEncoding
      | shiftDirection
      | orderIfCellEmpty

prefixAssignment ::= 'mm:Prefix=' string

languageAssignment ::= 'xml:lang=' string

entityCast ::= 'Class'
      | 'Individual'
      | 'ObjectProperty'
      | 'DataProperty'
      | 'AnnotationProperty'
      | 'Literal'
      | 'Datatype'

valueCast ::= 'IRI' | 'Literal'

datatypeCast ::= 'xsd:string'
      | 'xsd:decimal' 
      | 'xsd:byte'
      | 'xsd:short'
      | 'xsd:integer'
      | 'xsd:long'
      | 'xsd:float'
      | 'xsd:double'
      | 'xsd:boolean'
      | 'xsd:dateTime'
      | 'xsd:time'
      | 'xsd:duration'
      | 'xsd:date'
      | 'rdf:PlainLiteral'

iriEncoding ::= 'mm:camelCaseEncode'
      | 'mm:snakeCaseEncode'
      | 'mm:uuidEncode'
      | 'mm:hashEncode'

shiftDirection ::= 'mm:shiftUp'
      | 'mm:shiftDown'
      | 'mm:shiftLeft'
      | 'mm:shiftRight'

orderIfCellEmpty ::= 'mm:createIfCellEmpty'
      | 'mm:ignoreIfCellEmpty'
      | 'mm:warningIfCellEmpty'
      | 'mm:errorIfCellEmpty'

■ Builtin Functions

builtinFunction ::= 
      ( 'mm:toUpperCase'
        | 'mm:toLowerCase'
        | 'mm:trim'
        | 'mm:printf'
        | 'mm:decimalFormat'
        | 'mm:capturing'
        | 'mm:reverse'
        | 'mm:replace'
        | 'mm:replaceAll'
        | 'mm:replaceFirst'
        | 'mm:append'
        | 'mm:prepend'
      ) [ '(' argumentList ')' ]

argumentList ::= argument { ',' argument }

argument ::= cellReference | literal
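
As an illustration of the cellReference production, the Python sketch below splits a reference into its coordinate, its directives, and its optional built-in function. It is not a parser for the whole grammar. The coordinate pattern (column letters or an asterisk, followed by a row number or an asterisk) is an assumption based on the Cell Reference section, since cellCoordinate is not defined above, and parse_cell_reference is a hypothetical name.

```python
import re

# '@' + coordinate, optionally followed by a parenthesized option list.
CELL_REFERENCE = re.compile(
    r"@(?P<column>[A-Z]+|\*)(?P<row>[0-9]+|\*)(?:\((?P<options>.*)\))?"
)

# One option: a name, optionally followed by ="..." or by a quoted-argument list.
TOKEN = re.compile(r'[A-Za-z][\w:]*(?:="[^"]*"|\((?:"[^"]*"|[^)"])*\))?')

FUNCTION_NAMES = {
    "mm:toUpperCase", "mm:toLowerCase", "mm:trim", "mm:printf",
    "mm:decimalFormat", "mm:capturing", "mm:reverse", "mm:replace",
    "mm:replaceAll", "mm:replaceFirst", "mm:append", "mm:prepend",
}


def parse_cell_reference(text):
    """Split a cell reference into column, row, directives, and function."""
    match = CELL_REFERENCE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a cell reference: {text!r}")
    tokens = TOKEN.findall(match.group("options") or "")
    directives, function = [], None
    for position, token in enumerate(tokens):
        name = re.split(r"[=(]", token, maxsplit=1)[0]
        if name in FUNCTION_NAMES:
            # The grammar allows a single function, placed after all directives.
            if position != len(tokens) - 1:
                raise ValueError("a built-in function must be the last item")
            function = token
        else:
            directives.append(token)
    return {
        "column": match.group("column"),
        "row": match.group("row"),
        "directives": directives,
        "function": function,
    }


print(parse_cell_reference('@A2(Class mm:shiftUp mm:append("GO_"))'))
```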