# Part 1: Export Data Model to CDF

[![Notebook](https://shields.io/badge/notebook-access-green?logo=jupyter&style=for-the-badge)](https://github.com/cognitedata/neat/blob/main/docs/tutorial/notebooks/part-1-data-model-generation.ipynb)

* author: Nikola Vasiljevic, Anders Albert
* date: 2023-10-07


**Prerequisite**: A Python installation with `neat` and its `excel` dependency, installed with `pip install cognite-neat[excel]`.
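
A quick way to confirm the prerequisite is to ask Python for the installed version of the distribution. The sketch below uses only the standard library; the helper name `installed_version` is ours and is not part of `neat`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


neat_version = installed_version("cognite-neat")
if neat_version is None:
    print("cognite-neat is not installed; run: pip install cognite-neat[excel]")
else:
    print(f"cognite-neat {neat_version} is installed")
```

Note that this only confirms the base package is present; it does not verify that the `excel` extra was installed with it.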

**Content**: This notebook is Part 1 of the NEAT onboarding tutorial. It demonstrates how to export a data model using NEAT.

This is part 1 of a series of tutorials focused on learning the core concepts of `neat` by using it as a package.

## Transformation Rules

*Transformation Rules* are a core concept of `neat`. They are how `neat` internally represents a data model together with the transformations from a source model. We will go into more detail on *Transformation Rules* in a later tutorial; for now it is sufficient to note that they are exposed to the user of `neat` through four tables in a spreadsheet. For more information about *Transformation Rules*, see [this detailed overview](../../transformation-rules.html).

## Parsing Transformation Rules

To get started, we use the built-in example `power_grid_model`.

In [1]:
from cognite.neat.rules import parser
from cognite.neat.rules.examples import power_grid_model

In [2]:
power_grid_model.suffix

'.xlsx'
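
The `.suffix` attribute suggests that `power_grid_model` behaves like a `pathlib.Path` (an assumption based on the output above). Under that assumption, the same check can guard your own files before they are handed to the Excel parser. The helper `looks_like_excel` and the file names below are hypothetical.

```python
from pathlib import Path


def looks_like_excel(path: Path) -> bool:
    """True if the file name ends in .xlsx (case-insensitive)."""
    return path.suffix.lower() == ".xlsx"


print(looks_like_excel(Path("my-rules.xlsx")))
print(looks_like_excel(Path("my-rules.csv")))
```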

In [3]:
power_rules = parser.parse_rules_from_excel_file(power_grid_model)
power_rules

| | value |
|---|---|
| prefix | power-grid |
| cdf_space_name | playground |
| namespace | http://purl.org/cognite/power-grid# |
| data_model_name | power_grid |
| version | 0_1_0 |
| is_current_version | True |
| created | 2022-09-29 00:00:00 |
| updated | 2023-10-07 07:30:51.310854 |
| title | Power Grid Example Data Model |
| description | This is simplified power grid data model used ... |


As we see above, the example is simply an Excel file that we can parse to obtain the *Transformation Rules*.
We can inspect the individual sheets of the rules using the properties below:

In [4]:
power_rules.metadata

| | value |
|---|---|
| prefix | power-grid |
| cdf_space_name | playground |
| namespace | http://purl.org/cognite/power-grid# |
| data_model_name | power_grid |
| version | 0_1_0 |
| is_current_version | True |
| created | 2022-09-29 00:00:00 |
| updated | 2023-10-07 07:30:51.310854 |
| title | Power Grid Example Data Model |
| description | This is simplified power grid data model used ... |
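
The `prefix` and `namespace` entries work as a pair: the prefix is shorthand for the namespace, which is why identifiers such as `power-grid:Terminal` appear in the OWL and SHACL exports later in this tutorial. The sketch below illustrates the expansion with plain string handling. The dictionary is copied by hand from the table above, and `expand` is a hypothetical helper, not a `neat` function.

```python
# Values copied from the metadata table above.
metadata = {
    "prefix": "power-grid",
    "namespace": "http://purl.org/cognite/power-grid#",
}


def expand(compact_name: str, metadata: dict) -> str:
    """Expand 'prefix:LocalName' into a full URI using the metadata namespace."""
    prefix, local_name = compact_name.split(":", 1)
    if prefix != metadata["prefix"]:
        raise ValueError(f"Unknown prefix: {prefix!r}")
    return metadata["namespace"] + local_name


print(expand("power-grid:Terminal", metadata))
```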


In [5]:
power_rules.classes

| | description | cdf_resource_type | deprecated | deprecation_date | replaced_by | source | source_entity_name | match_type | comment | class_id | class_name | parent_class | parent_asset | Dataset Id | similarTo | similarityScore | equalTo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | Asset | False | | | | | | | GeographicalRegion | GeographicalRegion | | | | | | |
| 1 | A subset of a geographical region of a power s... | Asset | False | | | | | | | SubGeographicalRegion | SubGeographicalRegion | | GeographicalRegion | | | | |
| 2 | A substation is a part of an electrical genera... | Asset | False | | | | | | | Substation | Substation | | SubGeographicalRegion | | | | |
| 3 | | Asset | False | | | | | | | Terminal | Terminal | | Substation | | | | |
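
The `parent_asset` column links each class to the class above it, so the four rows describe a chain from region down to terminal. Below is a minimal sketch of walking that chain, with the `class_id` and `parent_asset` values copied by hand from the table above; `asset_path` is a hypothetical helper.

```python
# class_id -> parent_asset, copied from the table above (None means no parent).
parent_asset = {
    "GeographicalRegion": None,
    "SubGeographicalRegion": "GeographicalRegion",
    "Substation": "SubGeographicalRegion",
    "Terminal": "Substation",
}


def asset_path(class_id: str) -> list:
    """Return the chain of classes from the root down to class_id."""
    path = []
    current = class_id
    while current is not None:
        path.append(current)
        current = parent_asset[current]
    return list(reversed(path))


print(" > ".join(asset_path("Terminal")))
```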


In [6]:
power_rules.properties

| | description | cdf_resource_type | deprecated | deprecation_date | replaced_by | source | source_entity_name | match_type | comment | class_id | ... | source_type | target_type | label | relationship_external_id_rule | rule_type | rule | skip_rule | similarTo | similarityScore | equalTo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | The name that identifies Greographical | [Asset] | False | | | | | | | GeographicalRegion | ... | Asset | Asset | name | | rdfpath | cim:GeographicalRegion(cim:IdentifiedObject.name) | False | | | |
| 1 | The name that identifies SubGreographical | [Asset] | False | | | | | | | SubGeographicalRegion | ... | Asset | Asset | name | | rdfpath | cim:SubGeographicalRegion(cim:IdentifiedObject... | False | | | |
| 2 | Region to which subgeographical region belongs to | [Asset, Relationship] | False | | | | | | | SubGeographicalRegion | ... | Asset | Asset | belongsTo | | rdfpath | cim:SubGeographicalRegion(cim:SubGeographicalR... | False | | | |
| 3 | The name that identifies Substation | [Asset] | False | | | | | | | Substation | ... | Asset | Asset | name | | rdfpath | cim:Substation(cim:IdentifiedObject.name) | False | | | |
| 4 | The subgeographical region containing the subs... | [Asset] | False | | | | | | | Substation | ... | Asset | Asset | subGeographicalRegion | | rdfpath | cim:Substation(cim:Substation.Region) | False | | | |
| 5 | The name that identifies Terminal | [Asset] | False | | | | | | | Terminal | ... | Asset | Asset | name | | rdfpath | cim:Terminal(cim:IdentifiedObject.name) | False | | | |
| 6 | The alternative name that identifies Substation | [Asset] | False | | | | | | | Terminal | ... | Asset | Asset | aliasName | | rdfpath | cim:Terminal(cim:IdentifiedObject.aliasName) | False | | | |
| 7 | Substation to which terminal belongs to | [Asset, Relationship] | False | | | | | | | Terminal | ... | Asset | Asset | belongsTo | | rdfpath | cim:Terminal->cim:ConnectivityNode->cim:Voltag... | False | | | |
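
Several of the `rule` values above have the form `Class(property)`, for example `cim:Terminal(cim:IdentifiedObject.name)`. The sketch below splits such a string into its two parts with a regular expression. It only illustrates the notation as it appears in this table: it is not `neat`'s rule parser, the pattern is an assumption based on the rows shown, and it deliberately ignores traversal rules that contain `->`.

```python
import re

# Matches 'prefix:Class(prefix:property)'; an assumption based on the rows above.
SIMPLE_RULE = re.compile(r"^(?P<cls>[\w.]+:[\w.]+)\((?P<prop>[\w.]+:[\w.]+)\)$")


def split_simple_rule(rule: str):
    """Return (class, property) for a simple rule, or None if the form differs."""
    match = SIMPLE_RULE.match(rule)
    if match is None:
        return None
    return match.group("cls"), match.group("prop")


# Simple form: class with one property in parentheses.
print(split_simple_rule("cim:Terminal(cim:IdentifiedObject.name)"))
# Traversal form (shortened here): not handled by this sketch, so None is returned.
print(split_simple_rule("cim:Terminal->cim:ConnectivityNode"))
```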


### (Optional) Advanced: Create Your Own Transformation Rules

Before proceeding, download the *Transformation Rules* template using [this link](https://drive.google.com/uc?export=download&id=1yJxK35IaKVpZJas60ojReCjh-Ppj9fKX). Unzip the file and open the template:


<video src="../../videos/tutorial-1-download-rules-template.mp4" controls>
</video>

Let's now fill in the template, going sheet by sheet in the following order:
- `Metadata`: where we provide metadata about the data model itself
- `Classes`: where we define the classes
- `Properties`: where we define the properties for each of the defined classes


<video src="../../videos/tutorial-1-defining-data-model.mp4" controls>
</video>


Once we are done filling in the template, let's parse it as we did with the `power_grid_model` above.
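
Parsing a filled-in template follows the same pattern as `In [3]`. In the sketch below, the file name `my-transformation-rules.xlsx` is a placeholder for wherever you saved your template, and the `neat` import is deferred into the function so that the snippet stays harmless if the file is not there yet.

```python
from pathlib import Path


def parse_my_rules(path: Path):
    """Parse a filled-in template with the same call used for power_grid_model."""
    from cognite.neat.rules import parser  # same import as In [1]

    return parser.parse_rules_from_excel_file(path)


my_rules_file = Path("my-transformation-rules.xlsx")  # placeholder file name

if my_rules_file.exists():
    my_rules = parse_my_rules(my_rules_file)
    print(my_rules.metadata)
else:
    print(f"{my_rules_file} not found; save the filled-in template there first")
```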


## Export Data Model

After the *Transformation Rules* have been parsed, they can be exported to multiple formats. In this tutorial, we will show `GraphQL` and `OWL`.

The *Transformation Rules* exporters are available in the `exporter` module of the `cognite.neat.rules` package.

In [7]:
from cognite.neat.rules import exporter

### GraphQL

In [8]:
power_graphql = exporter.GraphQLSchema.from_rules(power_rules, verbose=True)

If we now print the derived GraphQL schema, we can see how each of the objects (i.e. classes) is defined and represented in GraphQL:

In [9]:
print(power_graphql.schema)

type GeographicalRegion {
  """
  The name that identifies Greographical
  @name name
  """
  name: String!
}

"""
A subset of a geographical region of a power system network model.
@name SubGeographicalRegion
"""
type SubGeographicalRegion {
  """
  The name that identifies SubGreographical
  @name name
  """
  name: String!
  """
  Region to which subgeographical region belongs to
  @name region
  """
  region: GeographicalRegion
}

"""
A substation is a part of an electrical generation, transmission, and distribution system.
@name Substation
"""
type Substation {
  """
  The name that identifies Substation
  @name name
  """
  name: String!
  """
  The subgeographical region containing the substation
  @name subGeographicalRegion
  """
  subGeographicalRegion: SubGeographicalRegion
}

type Terminal {
  """
  The name that identifies Terminal
  @name name
  """
  name: String!
  """
  The alternative name that identifies Substation
  @name aliasName
  """
  aliasName: String!
  """
 

The derived GraphQL schema can now be uploaded to CDF and resolved as a Flexible Data Model:

<video src="../../videos/tutorial-1-upload-gql-schema-to-cdf.mp4" controls>
</video>
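
Before uploading, it can be useful to save the schema to a file and to check that every class appears as a GraphQL type. Here is a minimal sketch using only the standard library. `schema_text` is a short excerpt standing in for `power_graphql.schema`, and the file name `power_grid.graphql` is an arbitrary choice.

```python
import re
import tempfile
from pathlib import Path

# Excerpt standing in for power_graphql.schema (docstrings omitted for brevity).
schema_text = """\
type GeographicalRegion {
  name: String!
}

type SubGeographicalRegion {
  name: String!
  region: GeographicalRegion
}
"""

# Every top-level 'type X {' line corresponds to one class in the rules.
type_names = re.findall(r"^type (\w+) \{", schema_text, flags=re.MULTILINE)
print(type_names)

# Write the schema to disk; a temporary directory keeps the example self-cleaning.
with tempfile.TemporaryDirectory() as tmp_dir:
    schema_file = Path(tmp_dir) / "power_grid.graphql"
    schema_file.write_text(schema_text, encoding="utf-8")
    round_trip = schema_file.read_text(encoding="utf-8")

print(round_trip == schema_text)
```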


### OWL and SHACL Object Constraints

Let's now convert the *Transformation Rules* to an OWL-based semantic ontology and SHACL object constraints:

In [10]:
power_ontology = exporter.Ontology.from_rules(power_rules)

The ontology is stored as an RDF graph, accessible through `.ontology`. To see its content, we serialize it and print it:

In [11]:
print(power_ontology.semantic_data_model)

@prefix dct: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix power-grid: <http://purl.org/cognite/power-grid#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

power-grid: a owl:Ontology ;
    rdfs:label "Power Grid Example Data Model" ;
    dct:created "2022-09-29T00:00:00"^^xsd:dateTime ;
    dct:creator "Anders Albert",
        "Nikola Vasiljevic" ;
    dct:description "This is simplified power grid data model used in NEAT tutorial." ;
    dct:hasVersion "0_1_0" ;
    dct:modified "2023-10-07T07:30:51.310854"^^xsd:dateTime ;
    dct:rights "Free for non-commerical use" ;
    dct:title "Power Grid Example Data Model" ;
    owl:versionInfo "0_1_0" .

power-grid:TerminalShape a sh:NodeShape ;
    sh:property [ sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:node power-grid:S

In the same way, we can access the shape constraints:

In [12]:
print(power_ontology.constraints)

@prefix power-grid: <http://purl.org/cognite/power-grid#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

power-grid:TerminalShape a sh:NodeShape ;
    sh:property [ sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:node power-grid:SubstationShape ;
            sh:nodeKind sh:IRI ;
            sh:path power-grid:substation ],
        [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:aliasName ],
        [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:name ] ;
    sh:targetClass power-grid:Terminal .

power-grid:GeographicalRegionShape a sh:NodeShape ;
    sh:property [ sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:path power-grid:name ] ;
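
Each property shape above pairs `sh:minCount 1` with `sh:maxCount 1`, which means an instance must carry exactly one value for that property. The sketch below mimics that cardinality check in plain Python to make the meaning concrete. It is not a SHACL engine: the `terminal_shape` dictionary is hand-copied from the `TerminalShape` output above, and the sample instances are invented for illustration.

```python
# (minCount, maxCount) per property path, copied from TerminalShape above.
terminal_shape = {
    "substation": (1, 1),
    "aliasName": (1, 1),
    "name": (1, 1),
}


def cardinality_violations(instance: dict, shape: dict) -> list:
    """List the properties whose number of values falls outside [min, max]."""
    violations = []
    for path, (min_count, max_count) in shape.items():
        count = len(instance.get(path, []))
        if not min_count <= count <= max_count:
            violations.append(path)
    return violations


# Invented sample data: each property maps to a list of values.
complete = {"substation": ["S1"], "aliasName": ["T1-alias"], "name": ["T1"]}
incomplete = {"substation": ["S1"], "name": ["T1", "T1-duplicate"]}

print(cardinality_violations(complete, terminal_shape))
print(cardinality_violations(incomplete, terminal_shape))
```

The second instance fails twice: `aliasName` is missing (below `minCount`) and `name` has two values (above `maxCount`).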
   

The entire semantic data model (ontology + constraints) can be accessed through the `semantic_data_model` property:

In [13]:
print(power_ontology.semantic_data_model)

@prefix dct: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix power-grid: <http://purl.org/cognite/power-grid#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

power-grid: a owl:Ontology ;
    rdfs:label "Power Grid Example Data Model" ;
    dct:created "2022-09-29T00:00:00"^^xsd:dateTime ;
    dct:creator "Anders Albert",
        "Nikola Vasiljevic" ;
    dct:description "This is simplified power grid data model used in NEAT tutorial." ;
    dct:hasVersion "0_1_0" ;
    dct:modified "2023-10-07T07:30:51.310854"^^xsd:dateTime ;
    dct:rights "Free for non-commerical use" ;
    dct:title "Power Grid Example Data Model" ;
    owl:versionInfo "0_1_0" .

power-grid:TerminalShape a sh:NodeShape ;
    sh:property [ sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:node power-grid:S