# Part 1: Neat Data Model - Transformation Rules

[![Notebook](https://shields.io/badge/notebook-access-green?logo=jupyter&style=for-the-badge)](https://github.com/cognitedata/neat/blob/main/docs/tutorial/notebooks/part-1-data-model-generation.ipynb)

* author: Nikola Vasiljevic, Anders Albert
* date: 2023-10-23


**Prerequisite**: A Python installation with `neat` and its `excel` dependency, installed via `pip install cognite-neat[excel]`.

**Content**: This notebook is Part 1 of the NEAT onboarding tutorial. It demonstrates how to export a data model using NEAT.

This is part 1 of a series of tutorials that teach the core concepts of `neat` by using it as a package.

## Rules

*Rules* are a core concept of `neat`. They are how `neat` internally represents a data model, optionally with knowledge graph transformations from a source to a target (e.g., domain or solution) model. We will go into more detail on *Rules* in a later tutorial; for now it is sufficient to note that *Rules* are exposed to the user of `neat` through four tables in a spreadsheet. For more information about *Rules*, see [this detailed overview](../../transformation-rules.html).

The data modeling flow in `neat` is shown in the figure below:

![NEAT Data Modeling Flow](../../figs/data-modeling-flow.png)

## Parsing Transformation Rules

To get started, we use the built-in example `power_grid_model`, which we will import using `ExcelImporter`.

In [1]:
from cognite.neat.rules import importer
from cognite.neat.rules.examples import power_grid_model

%reload_ext autoreload
%autoreload 2

In [2]:
power_rules = importer.ExcelImporter(power_grid_model).to_rules()

In [3]:
power_rules.metadata

Unnamed: 0,value
prefix,power-grid
suffix,power_grid
namespace,http://purl.org/cognite/power-grid#
version,0.1.0
name,Power Grid Example Data Model
description,This is simplified power grid data model used ...
created,2022-09-29 00:00:00
updated,2023-12-06 14:22:14.649091
creator,"[Nikola Vasiljevic, Anders Albert]"
contributor,[Cognite]


As we see above, the example is simply an Excel file that we can parse to obtain the *Rules*.
We can inspect the different sheets of the rules using the properties below:

In [4]:
power_rules.metadata

Unnamed: 0,value
prefix,power-grid
suffix,power_grid
namespace,http://purl.org/cognite/power-grid#
version,0.1.0
name,Power Grid Example Data Model
description,This is simplified power grid data model used ...
created,2022-09-29 00:00:00
updated,2023-12-06 14:22:14.649091
creator,"[Nikola Vasiljevic, Anders Albert]"
contributor,[Cognite]
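
The `prefix` and `namespace` fields work as a pair: in RDF serializations, a prefixed name such as `power-grid:Terminal` is shorthand for the namespace followed by the local name. A minimal sketch of that expansion, using values copied from the metadata output and a hypothetical helper `expand` that is not part of `neat`:

```python
# Values copied by hand from the metadata output above.
PREFIX = "power-grid"
NAMESPACE = "http://purl.org/cognite/power-grid#"


def expand(prefixed_name: str, prefixes: dict[str, str]) -> str:
    """Expand a prefixed name such as 'power-grid:Terminal' into a full IRI.

    Hypothetical helper for illustration only; not a `neat` function.
    """
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"Unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


prefixes = {PREFIX: NAMESPACE}
terminal_iri = expand("power-grid:Terminal", prefixes)
print(terminal_iri)
```

This is the same shorthand that appears in the `@prefix` lines of the Turtle exports later in this tutorial.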


In [5]:
power_rules.classes

Unnamed: 0,description,cdf_resource_type,deprecated,class_id,class_name,parent_asset
0,,Asset,False,GeographicalRegion,GeographicalRegion,
1,A subset of a geographical region of a power s...,Asset,False,SubGeographicalRegion,SubGeographicalRegion,GeographicalRegion
2,A substation is a part of an electrical genera...,Asset,False,Substation,Substation,SubGeographicalRegion
3,,Asset,False,Terminal,Terminal,Substation
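
The `parent_asset` column links each class to the class above it, so the four rows form a single chain. The sketch below walks that chain with plain Python. The dictionary is typed in by hand from the output above, and `path_to_root` is a hypothetical helper for illustration, not part of the `neat` API:

```python
# class_id -> parent_asset, copied by hand from the `classes` output above.
PARENT_ASSET = {
    "GeographicalRegion": None,
    "SubGeographicalRegion": "GeographicalRegion",
    "Substation": "SubGeographicalRegion",
    "Terminal": "Substation",
}


def path_to_root(class_id: str, parents: dict) -> list:
    """Return the chain of classes from `class_id` up to the top-level class."""
    path = [class_id]
    seen = {class_id}
    while (parent := parents[path[-1]]) is not None:
        if parent in seen:
            # Guard against a cyclic parent definition in the sheet.
            raise ValueError(f"Cycle detected at {parent!r}")
        path.append(parent)
        seen.add(parent)
    return path


terminal_path = path_to_root("Terminal", PARENT_ASSET)
print(" -> ".join(reversed(terminal_path)))
```

Reading the hierarchy this way is a quick sanity check that every class in a hand-written sheet eventually reaches a top-level class.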


In [6]:
power_rules.properties

Unnamed: 0,description,cdf_resource_type,deprecated,class_id,property_id,property_name,expected_value_type,min_count,max_count,property_type,resource_type_property,source_type,target_type,label,rule_type,rule,skip_rule
0,The name that identifies Greographical,[Asset],False,GeographicalRegion,name,name,"{'prefix': 'xsd', 'suffix': 'string', 'name': ...",1,1,DatatypeProperty,[name],Asset,Asset,name,rdfpath,cim:GeographicalRegion(cim:IdentifiedObject.name),False
1,The name that identifies SubGreographical,[Asset],False,SubGeographicalRegion,name,name,"{'prefix': 'xsd', 'suffix': 'string', 'name': ...",1,1,DatatypeProperty,[name],Asset,Asset,name,rdfpath,cim:SubGeographicalRegion(cim:IdentifiedObject...,False
2,Region to which subgeographical region belongs to,"[Asset, Relationship]",False,SubGeographicalRegion,region,region,"{'prefix': 'power-grid', 'suffix': 'Geographic...",1,1,ObjectProperty,"[metadata, parent_external_id]",Asset,Asset,belongsTo,rdfpath,cim:SubGeographicalRegion(cim:SubGeographicalR...,False
3,The name that identifies Substation,[Asset],False,Substation,name,name,"{'prefix': 'xsd', 'suffix': 'string', 'name': ...",1,1,DatatypeProperty,[name],Asset,Asset,name,rdfpath,cim:Substation(cim:IdentifiedObject.name),False
4,The subgeographical region containing the subs...,[Asset],False,Substation,subGeographicalRegion,subGeographicalRegion,"{'prefix': 'power-grid', 'suffix': 'SubGeograp...",1,1,ObjectProperty,"[metadata, parent_external_id]",Asset,Asset,subGeographicalRegion,rdfpath,cim:Substation(cim:Substation.Region),False
5,The name that identifies Terminal,[Asset],False,Terminal,name,name,"{'prefix': 'xsd', 'suffix': 'string', 'name': ...",1,1,DatatypeProperty,[name],Asset,Asset,name,rdfpath,cim:Terminal(cim:IdentifiedObject.name),False
6,The alternative name that identifies Substation,[Asset],False,Terminal,aliasName,aliasName,"{'prefix': 'xsd', 'suffix': 'string', 'name': ...",1,1,DatatypeProperty,[name],Asset,Asset,aliasName,rdfpath,cim:Terminal(cim:IdentifiedObject.aliasName),False
7,Substation to which terminal belongs to,"[Asset, Relationship]",False,Terminal,substation,substation,"{'prefix': 'power-grid', 'suffix': 'Substation...",1,1,ObjectProperty,"[metadata, parent_external_id]",Asset,Asset,belongsTo,rdfpath,cim:Terminal->cim:ConnectivityNode->cim:Voltag...,False


### (Optional) Advanced: Create Your Own Transformation Rules

Before proceeding, download the *Rules* template using [this link](https://drive.google.com/uc?export=download&id=1yJxK35IaKVpZJas60ojReCjh-Ppj9fKX). Unzip the file and open the template:


<video src="../../videos/tutorial-1-download-rules-template.mp4" controls>
</video>

Let's now fill in the template, going sheet by sheet in the following order:
- `Metadata`: where we provide metadata about the data model itself
- `Classes`: where we define classes
- `Properties`: where we define properties for each of the defined classes


<video src="../../videos/tutorial-1-defining-data-model.mp4" controls>
</video>


Once we are done filling in the template, let's parse it as we did with the `power_grid_model` shown above.


## Export Data Model

After the *Rules* have been imported, `neat` supports multiple export formats. In this tutorial, we will show `GraphQL`, `OWL`, and `SHACL`.

The *Rules* exporters are available in the `exporter` module of the `neat.rules` package.

In [7]:
from cognite.neat.rules import exporter
from pathlib import Path
import tempfile

### GraphQL Schema Exporter

In [8]:
file = tempfile.NamedTemporaryFile(suffix=".graphql")

exporter.GraphQLSchemaExporter(power_rules)._export_to_file(Path(file.name))

The GraphQL schema has now been written to the temporary file. If we print its content, we can see how each of the objects (i.e., classes) is defined and represented in GraphQL:

In [9]:
print(Path(file.name).read_text())

type GeographicalRegion {
  name: String!
}

type SubGeographicalRegion {
  name: String!
  region: GeographicalRegion
}

type Substation {
  name: String!
  subGeographicalRegion: SubGeographicalRegion
}

type Terminal {
  name: String!
  aliasName: String!
  substation: Substation
}
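
To make the mapping from the `properties` table to the schema above concrete, here is a minimal sketch that renders similar output from a few hand-copied rows. It is an illustration under simplifying assumptions, not `neat`'s implementation: `render_schema` is a hypothetical helper, XSD `string` is mapped to `String`, and only datatype properties with a `min_count` of 1 get the non-null `!`, which matches what the printed schema shows. The object type for `region` is taken from the printed schema, since the table output truncates it.

```python
from collections import defaultdict

# (class_id, property_id, value type, property_type, min_count)
# Simplified rows copied by hand from the outputs above.
ROWS = [
    ("GeographicalRegion", "name", "string", "DatatypeProperty", 1),
    ("SubGeographicalRegion", "name", "string", "DatatypeProperty", 1),
    ("SubGeographicalRegion", "region", "GeographicalRegion", "ObjectProperty", 1),
]

# Assumed mapping for this sketch; only the type used in the example is listed.
XSD_TO_GRAPHQL = {"string": "String"}


def render_schema(rows) -> str:
    """Render GraphQL type definitions from simplified property rows."""
    fields = defaultdict(list)
    for class_id, property_id, value_type, property_type, min_count in rows:
        if property_type == "DatatypeProperty":
            gql_type = XSD_TO_GRAPHQL[value_type]
            if min_count >= 1:
                gql_type += "!"  # required scalar field
        else:
            gql_type = value_type  # reference to another type, left nullable
        fields[class_id].append(f"  {property_id}: {gql_type}")
    blocks = [
        f"type {class_id} {{\n" + "\n".join(lines) + "\n}"
        for class_id, lines in fields.items()
    ]
    return "\n\n".join(blocks)


schema = render_schema(ROWS)
print(schema)
```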


The derived GraphQL schema can now be uploaded to CDF and resolved as a data model by the Cognite Data Modeling service:

<video src="../../videos/tutorial-1-upload-gql-schema-to-cdf.mp4" controls>
</video>


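If you want to store the schema as a file, pass the file path to the `export` method of the exporter:

```
# power_graphql is assumed here to be the exporter instance created from the rules
power_graphql = exporter.GraphQLSchemaExporter(power_rules)
power_graphql.export(filepath=Path("schema_name.graphql"))
```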

### OWL and SHACL Object Constraints

Let's now convert the Transformation Rules to an OWL-based semantic ontology and SHACL object constraints:

In [10]:
file_owl = tempfile.NamedTemporaryFile(suffix=".ttl")
file_shacl = tempfile.NamedTemporaryFile(suffix=".ttl")

exporter.OWLExporter(power_rules)._export_to_file(Path(file_owl.name))
exporter.SHACLExporter(power_rules)._export_to_file(Path(file_shacl.name))

The ontology is written to the temporary file in Turtle format. To see its content, we read the file and print it:

In [11]:
print(Path(file_owl.name).read_text())

@prefix dct: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix power-grid: <http://purl.org/cognite/power-grid#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

power-grid: a owl:Ontology ;
    rdfs:label "Power Grid Example Data Model" ;
    dct:contributor "Cognite" ;
    dct:created "2022-09-29T00:00:00"^^xsd:dateTime ;
    dct:creator "Anders Albert",
        "Nikola Vasiljevic" ;
    dct:description "This is simplified power grid data model used in NEAT tutorial." ;
    dct:hasVersion "0.1.0" ;
    dct:modified "2023-12-06T14:22:14.649091"^^xsd:dateTime ;
    dct:rights "Free for non-commerical use" ;
    dct:title "Power Grid Example Data Model" ;
    owl:versionInfo "0.1.0" .

power-grid:aliasName a owl:DatatypeProperty ;
    rdfs:label "aliasName" ;
    rdfs:comment "The alternative name that identifies Substation" ;
    

In the same way, we can access the shape constraints:

In [12]:
print(Path(file_shacl.name).read_text())

@prefix power-grid: <http://purl.org/cognite/power-grid#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

power-grid:TerminalShape a sh:NodeShape ;
    sh:property [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:name ],
        [ sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:node power-grid:SubstationShape ;
            sh:nodeKind sh:IRI ;
            sh:path power-grid:substation ],
        [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:aliasName ] ;
    sh:targetClass power-grid:Terminal .

power-grid:GeographicalRegionShape a sh:NodeShape ;
    sh:property [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:name ] ;
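
The `sh:minCount` and `sh:maxCount` entries above are cardinality constraints: a node conforms only if the number of values on each path lies within those bounds. The following sketch checks this with plain Python rather than a SHACL engine. `TERMINAL_SHAPE` is copied by hand from the `TerminalShape` output, the sample nodes are made up for illustration, and `violations` is a hypothetical helper, not part of the `neat` API:

```python
# path -> (minCount, maxCount), copied by hand from power-grid:TerminalShape.
TERMINAL_SHAPE = {
    "name": (1, 1),
    "substation": (1, 1),
    "aliasName": (1, 1),
}


def violations(node: dict, shape: dict) -> list:
    """Return one message per path whose value count is outside its bounds."""
    problems = []
    for path, (min_count, max_count) in shape.items():
        count = len(node.get(path, []))
        if count < min_count:
            problems.append(f"{path}: expected at least {min_count}, found {count}")
        elif count > max_count:
            problems.append(f"{path}: expected at most {max_count}, found {count}")
    return problems


# Made-up sample nodes; each path maps to its list of values.
valid_terminal = {
    "name": ["T1"],
    "aliasName": ["Terminal 1"],
    "substation": ["power-grid:Substation_A"],
}
invalid_terminal = {
    "name": ["T2", "T2-duplicate"],  # two names violate maxCount 1
    "substation": ["power-grid:Substation_A"],
    # aliasName is missing, which violates minCount 1
}

valid_report = violations(valid_terminal, TERMINAL_SHAPE)
invalid_report = violations(invalid_terminal, TERMINAL_SHAPE)
print(valid_report)
print(invalid_report)
```

A real SHACL validator also checks `sh:datatype`, `sh:nodeKind`, and `sh:node`; this sketch covers the count constraints only.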
   