# Asset Hierarchy Migration

**Prerequisites**:

- Access to a CDF project.
- Knowledge of how to install and set up Python.
- Ability to launch a Python notebook.



In this tutorial, we will show you how to migrate an asset hierarchy to a data model representing the same hierarchy in CDF.

This tutorial also demonstrates the breadth of neat's capabilities: extracting data,
creating a data model, exporting the data model, and loading the data.

## Extract Data from Asset Hierarchy

We will start by extracting the data from an existing asset hierarchy. 

The example used in this tutorial is an asset hierarchy for pumps, shown below in CDF's classic Data Exploration:

<img src="../artifacts/figs/asset_hierarchy_lift_pump_stations.png" width="400">

In [1]:
from cognite.neat import get_cognite_client
from cognite.neat.graph import extractors, NeatGraphStore

We start by importing the neat extractors, NeatGraphStore, and a utility method for getting a neat client.

In [2]:
store = NeatGraphStore.from_memory_store()

The `NeatGraphStore` is a triple store that can hold both data and schemas/data models.
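
To make the idea of a triple store concrete, here is a minimal sketch in plain Python. It is not how `NeatGraphStore` is implemented; the class `TinyTripleStore`, the variable `toy_store`, and the example identifiers are all hypothetical. It only illustrates that data and schema statements can live side by side as (subject, predicate, object) triples.

```python
Triple = tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory triple store: a set of (subject, predicate, object) tuples."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None) -> list[Triple]:
        """Return the triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


NEAT = "http://purl.org/cognite/neat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"

toy_store = TinyTripleStore()
# Data: one instance and one of its property values (hypothetical asset).
toy_store.add(NEAT + "pump_1", RDF_TYPE, NEAT + "Asset")
toy_store.add(NEAT + "pump_1", NEAT + "name", "Pump 1")
# Schema: a statement about the class itself is stored in the same structure.
toy_store.add(NEAT + "Asset", RDF_TYPE, RDFS_CLASS)

# Find every subject declared to be an Asset.
instances = toy_store.match(predicate=RDF_TYPE, obj=NEAT + "Asset")
```

Using `None` as a wildcard mirrors how triple patterns are typically queried: fix the parts you know and leave the rest open.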

We can populate this store by extracting from one of the sources that neat supports.

To see which extractors are available, we print the `extractors` module.

In [3]:
extractors

| | Extractor | Description |
|---|---|---|
| 0 | AssetsExtractor | Extract data from Cognite Data Fusions Assets ... |
| 1 | MockGraphGenerator | Class used to generate mock graph data for pur... |
| 2 | RelationshipsExtractor | Extract data from Cognite Data Fusions Relatio... |
| 3 | TimeSeriesExtractor | Extract data from Cognite Data Fusions TimeSer... |
| 4 | SequencesExtractor | Extract data from Cognite Data Fusions Sequenc... |
| 5 | EventsExtractor | Extract data from Cognite Data Fusions Events ... |
| 6 | FilesExtractor | Extract data from Cognite Data Fusions files m... |
| 7 | LabelsExtractor | Extract data from Cognite Data Fusions Labels ... |
| 8 | RdfFileExtractor | Extract data from RDF files into Neat. |
| 9 | DexpiExtractor | DEXPI-XML extractor of RDF triples |


At the top of the list we see the `AssetsExtractor`, which is what we are looking for.

In [4]:
extractors.AssetsExtractor

We see that there is a `.from_hierarchy` factory method that fits well with what we want to do.

In [5]:
client = get_cognite_client()

Found .env file in repository root. Loaded variables from .env file.


Note that the utility function `get_cognite_client` will prompt us for credentials if it doesn't find a `.env` file in the repo root.
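
The "use the `.env` file if present, otherwise prompt" behaviour can be sketched as follows. This is only an illustration of the pattern, not the implementation of `get_cognite_client`; the function names and the variable name `EXAMPLE_PROJECT` are hypothetical, and the real utility may expect different keys.

```python
import tempfile
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    variables: dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        variables[key.strip()] = value.strip().strip('"')
    return variables


def load_settings(repo_root: Path, prompt=input) -> dict[str, str]:
    """Read settings from <repo_root>/.env, or ask the user if it is missing."""
    env_file = repo_root / ".env"
    if env_file.exists():
        return parse_env_file(env_file)
    # Hypothetical variable name, used for illustration only.
    return {"EXAMPLE_PROJECT": prompt("Project: ")}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # No .env file yet: the (stubbed) prompt is used.
    prompted = load_settings(root, prompt=lambda _: "my-project")
    # With a .env file in place, its values are used and nothing is prompted.
    (root / ".env").write_text("# credentials\nEXAMPLE_PROJECT=from-file\n")
    from_file = load_settings(root)
```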

In [6]:
asset_extractor = extractors.AssetsExtractor.from_hierarchy(client, root_asset_external_id="lift_pump_stations:root")

The asset extractor is now ready and can be written to the store.

In [7]:
store.write(asset_extractor)

We can inspect the store to see what changes have been applied to it:

In [8]:
store

| | Agent | Activity | Entity | Description |
|---|---|---|---|---|
| 0 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Initialize graph store as Memory |
| 1 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Extracted triples to graph store using AssetsE... |
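
To give a feel for what "extracted triples" means, the sketch below flattens one asset record into triples. It is an assumption-laden illustration, not the logic of `AssetsExtractor`: the function `asset_to_triples`, the `parent` predicate, the identifier scheme, and the example asset are all hypothetical.

```python
NEAT = "http://purl.org/cognite/neat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def asset_to_triples(asset: dict) -> list[tuple[str, str, str]]:
    """Flatten one asset record into (subject, predicate, object) triples."""
    subject = NEAT + asset["external_id"]  # hypothetical identifier scheme
    triples = [(subject, RDF_TYPE, NEAT + "Asset")]
    for key in ("name", "external_id"):
        triples.append((subject, NEAT + key, asset[key]))
    # Hypothetical predicate: link to the parent so the hierarchy is preserved.
    if asset.get("parent_external_id"):
        triples.append((subject, NEAT + "parent", NEAT + asset["parent_external_id"]))
    # Each metadata key becomes a property of its own.
    for key, value in asset.get("metadata", {}).items():
        triples.append((subject, NEAT + key, value))
    return triples


example_asset = {
    "external_id": "pump_1",
    "name": "Pump 1",
    "parent_external_id": "station_1",
    "metadata": {"DesignPointFlowGPM": "120.5"},
}
triples = asset_to_triples(example_asset)
```

Turning metadata keys into properties is consistent with the property names, such as `DesignPointFlowGPM`, that show up in the inferred model later in this tutorial.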


## Inferring Data Model from Data

To infer a data model from the data, we will use an importer.

In [9]:
from cognite.neat.rules import importers

In [10]:
importers

| | Importer | Description |
|---|---|---|
| 0 | OWLImporter | Convert OWL ontology to tables/ transformation... |
| 1 | DMSImporter | Imports a Data Model from Cognite Data Fusion. |
| 2 | ExcelImporter | Import rules from an Excel file. |
| 3 | GoogleSheetImporter | Import rules from a Google Sheet. |
| 4 | DTDLImporter | Importer from Azure Digital Twin - DTDL (Digit... |
| 5 | YAMLImporter | Imports the rules from a YAML file. |
| 6 | InferenceImporter | Infers rules from a triple store. |


We see that there are multiple importers available, but we will use the `InferenceImporter`.

In [11]:
importers.InferenceImporter

In [12]:
importer = importers.InferenceImporter.from_graph_store(store)

Then we use the `.to_rules` method to convert the data into `Rules`, which is Neat's format for data models.

In [13]:
rules, issues = importer.to_rules()

First, we check whether there were any issues with creating the rules, and we find one warning:
the property `Shape__Length` contains a double underscore, which is not recommended. However,
we can continue.

In [14]:
issues

| | field_name | value |
|---|---|---|
| 0 | property | Shape__Length |
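
If you want to screen property names for this pattern yourself, a check along these lines would do. This is a sketch and not Neat's validation code; the function name and the suggested replacement (collapsing repeated underscores into one) are assumptions.

```python
import re


def flag_double_underscores(property_names: list[str]) -> dict[str, str]:
    """Map each name containing '__' to a suggested single-underscore name."""
    return {
        name: re.sub(r"_{2,}", "_", name)
        for name in property_names
        if "__" in name
    }


names = ["name", "external_id", "Shape__Length", "PumpModel"]
flagged = flag_double_underscores(names)
```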


Then, we inspect the classes found, along with their properties.

In [15]:
rules.classes

| | class_ | reference | match_type | comment |
|---|---|---|---|---|
| 0 | inferred:Asset | http://purl.org/cognite/neat#Asset | exact | Inferred from knowledge graph, where this clas... |


In [16]:
rules.properties

| | class_ | property_ | value_type | max_count | reference | transformation | comment | inherited |
|---|---|---|---|---|---|---|---|---|
| 0 | inferred:Asset | name | string | 1 | http://purl.org/cognite/neat#name | inferred:Asset(inferred:name) | Class \<Asset> has property \<name> with value t... | False |
| 1 | inferred:Asset | external_id | string | 1 | http://purl.org/cognite/neat#external_id | inferred:Asset(inferred:external_id) | Class \<Asset> has property \<external_id> with ... | False |
| 2 | inferred:Asset | created_time | dateTime | 1 | http://purl.org/cognite/neat#created_time | inferred:Asset(inferred:created_time) | Class \<Asset> has property \<created_time> with... | False |
| 3 | inferred:Asset | last_updated_time | dateTime | 1 | http://purl.org/cognite/neat#last_updated_time | inferred:Asset(inferred:last_updated_time) | Class \<Asset> has property \<last_updated_time>... | False |
| 4 | inferred:Asset | DesignPointFlowGPM | double | 1 | http://purl.org/cognite/neat#DesignPointFlowGPM | inferred:Asset(inferred:DesignPointFlowGPM) | Class \<Asset> has property \<DesignPointFlowGPM... | False |
| 5 | inferred:Asset | DesignPointHeadFT | double | 1 | http://purl.org/cognite/neat#DesignPointHeadFT | inferred:Asset(inferred:DesignPointHeadFT) | Class \<Asset> has property \<DesignPointHeadFT>... | False |
| 6 | inferred:Asset | Enabled | double | 1 | http://purl.org/cognite/neat#Enabled | inferred:Asset(inferred:Enabled) | Class \<Asset> has property \<Enabled> with valu... | False |
| 7 | inferred:Asset | HighHeadShutOff | double | 1 | http://purl.org/cognite/neat#HighHeadShutOff | inferred:Asset(inferred:HighHeadShutOff) | Class \<Asset> has property \<HighHeadShutOff> w... | False |
| 8 | inferred:Asset | LowHeadFT | double | 1 | http://purl.org/cognite/neat#LowHeadFT | inferred:Asset(inferred:LowHeadFT) | Class \<Asset> has property \<LowHeadFT> with va... | False |
| 9 | inferred:Asset | LowHeadFlowGPM | double | 1 | http://purl.org/cognite/neat#LowHeadFlowGPM | inferred:Asset(inferred:LowHeadFlowGPM) | Class \<Asset> has property \<LowHeadFlowGPM> wi... | False |


Further down the list (not among the rows shown above), we notice that, for example, the `PumpModel` property is typed as both an integer and a string, because the inference found data of both types.

We can inspect the comment that the inference attached to this property:

In [17]:
rules.properties.data[24].comment

'Class <Asset> has property <PumpModel> with value type <integer> which occurs <6> times in the graph, with value type <string> which occurs <140> times in the graph'

We see that this is most likely a string, as that type occurred far more often for this field in the graph than the integer type.
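
The reasoning above, picking the value type that occurs most often, can be sketched in a few lines. This is not the `InferenceImporter` algorithm; the helper names and the example values are hypothetical, and only the counts (6 integers, 140 strings) are taken from the comment shown above.

```python
from collections import Counter


def coarse_type(value) -> str:
    """Map a Python value to a coarse type name (illustrative subset)."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    return "string"


def most_common_type(values: list) -> tuple[str, Counter]:
    """Return the most frequent type name together with the per-type counts."""
    counts = Counter(coarse_type(v) for v in values)
    winner, _ = counts.most_common(1)[0]
    return winner, counts


# Hypothetical values with the proportions reported for PumpModel.
pump_models = [3196] * 6 + ["MT-3196"] * 140
value_type, counts = most_common_type(pump_models)
```

A majority vote is only a heuristic: if the integer-looking values are really model codes, `string` is the safe choice, since every integer can be stored as a string but not the other way around.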

## Exporting Data Model

Let's export our newly created data model to CDF.

In [18]:
from cognite.neat.rules import exporters

In [19]:
exporters

| | Exporter | Description |
|---|---|---|
| 0 | DMSExporter | Export rules to Cognite Data Fusion's Data Mod... |
| 1 | SemanticDataModelExporter | Exports rules to a semantic data model. |
| 2 | OWLExporter | Exports rules to an OWL ontology. |
| 3 | SHACLExporter | Exports rules to a SHACL graph. |
| 4 | ExcelExporter | Export rules to Excel. |
| 5 | YAMLExporter | Export rules to YAML. |


To export the data model we use the `DMSExporter`.

In [20]:
exporters.DMSExporter

To export the rules, we need them in the DMS format. However, the `rules` we have are in the Information format.

In [21]:
type(rules)

cognite.neat.rules.models.information._rules.InformationRules

Information rules are used to model the information, while the DMS format is one of the implementation formats that Neat supports.

Neat has an out-of-the-box conversion from the Information format to the DMS format. However, it does not, for example, set indexes.

In [22]:
dms_rules = rules.as_dms_architect_rules()



In [23]:
dms_rules.metadata

| | value |
|---|---|
| role | DMS Architect |
| data_model_type | enterprise |
| schema_ | partial |
| extension | addition |
| space | inferred |
| name | Inferred Model |
| description | |
| external_id | inferred_model |
| version | inferred |
| creator | NEAT |


In [24]:
dms_rules.views

| | view | reference | in_model | class_ |
|---|---|---|---|---|
| 0 | inferred:Asset(version=inferred) | http://purl.org/cognite/neat#Asset | True | inferred:Asset |


In [25]:
dms_rules.containers

| | container | class_ |
|---|---|---|
| 0 | inferred:Asset | inferred:Asset |


If you want to modify the DMS rules, it is recommended that you export them using the `ExcelExporter`, modify the resulting spreadsheet,
and import it using the `ExcelImporter`. For this tutorial, we are happy with the out-of-the-box DMS rules, so we just pass
the `InformationRules` into the DMS exporter, which will automatically do the conversion.

In [26]:
exporter = exporters.DMSExporter()

In [27]:
result = exporter.export_to_cdf(rules, client)



In [28]:
result

| | name | created |
|---|---|---|
| 0 | spaces | 1 |
| 1 | containers | 1 |
| 2 | views | 1 |
| 3 | data_models | 1 |


We see that the data model was successfully created.

## Populating Data Model

To populate the data model in CDF, we use a loader.

In [29]:
from cognite.neat.graph import loaders

In [30]:
loaders

| | Loader | Description |
|---|---|---|
| 0 | DMSLoader | Load data from Cognite Data Fusions Data Model... |


In [31]:
loaders.DMSLoader

To load the data from the graph store, we add the rules to the store.

In [32]:
store.add_rules(rules)

In [33]:
store

| | Agent | Activity | Entity | Description |
|---|---|---|---|---|
| 0 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Initialize graph store as Memory |
| 1 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Extracted triples to graph store using AssetsE... |
| 2 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Added rules to graph store as InformationRules |
| 3 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Upsert prefixes to graph store |


This is necessary for the store to be ready to load data into an external system. Note that by adding the `rules` to the store,
the prefixes have been updated to match the rules object. This is how the loader knows which triples to fetch from the store.
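
The role of the prefixes can be illustrated with a small sketch: a prefixed class name such as `inferred:Asset` is expanded to a full URI, and only instances typed with that URI are selected. This is an illustration of the idea rather than the loader's code; the namespace bound to `inferred` below is an assumption, and the triples are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed prefix map; the namespace actually bound to "inferred" may differ.
prefixes = {"inferred": "http://purl.org/cognite/neat#"}


def expand(curie: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:LocalName' into a full URI using the prefix map."""
    prefix, local_name = curie.split(":", 1)
    return prefixes[prefix] + local_name


def instances_of(class_curie: str, triples, prefixes: dict[str, str]) -> list[str]:
    """Return the subjects whose rdf:type is the expanded class URI."""
    class_uri = expand(class_curie, prefixes)
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == class_uri)


triples = [
    ("http://purl.org/cognite/neat#pump_1", RDF_TYPE, "http://purl.org/cognite/neat#Asset"),
    ("http://purl.org/cognite/neat#pump_2", RDF_TYPE, "http://purl.org/cognite/neat#Asset"),
    ("http://example.org/other#x", RDF_TYPE, "http://example.org/other#Thing"),
]
selected = instances_of("inferred:Asset", triples, prefixes)
```

If the prefix map pointed at a different namespace than the one used in the stored triples, the selection would come back empty, which is why the prefixes and the rules need to agree.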

First, we ensure the instance space exists.

In [34]:
from cognite.client import data_modeling as dm

In [35]:
created = client.data_modeling.spaces.apply(dm.SpaceApply("sp_pump_station"))
created

| | value |
|---|---|
| space | sp_pump_station |
| is_global | False |
| last_updated_time | 2024-07-09 05:31:00.944000 |
| created_time | 2024-07-09 05:31:00.944000 |


We can now use the loader to populate the data model in CDF.

Note that the `DMSLoader` requires that we have the `DMSRules` format, so we pass in the DMS rules we created above.

Alternatively, we can create the loader by passing in a data model ID.

In [36]:
loader = loaders.DMSLoader.from_rules(dms_rules, store, instance_space="sp_pump_station")

In [37]:
result = loader.load_into_cdf(client)

In [38]:
result

| | name | created |
|---|---|---|
| 0 | Nodes | 245.0 |
| 1 | Edges | |


As we see from the result above, Neat has created 245 Nodes in the new data model.

## Results

We can now go into CDF and inspect the results. Looking at the data model we created, we can see the schema for
the inferred `Asset`:

<img src="../artifacts/figs/asset_hierarchy_lift_pump_stations_dms.png" width="400">

Furthermore, we can inspect the populated nodes in this Asset schema:

<img src="../artifacts/figs/asset_hierarchy_lift_pump_stations_populated.png" width="1000">

## Final Remarks

* In this tutorial, we used the in-memory version of the Neat store. This works well for small examples, like the toy example here, but for larger asset hierarchies we will likely need a faster triple store such as `GraphDB` or `Oxigraph`. These are also available in Neat, but might require extra dependencies.
* This can be considered the first step of a full migration. At least two related problems may remain:
    1. We might want to infer a more specific type than `Asset`, for example, `Pump` and `LiftStation`. This means adding information that is not explicitly set in the existing asset hierarchy. The type might be implicitly defined by the level in the hierarchy or, for example, by the external ID of the asset.
    2. We might want to map the inferred model onto an existing data model. In this case, the existing model would be an `EnterpriseModel` and the inferred model we obtained here would be a `Source` model.
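
As a rough sketch of the first remaining problem, the snippet below derives a more specific type from an asset's level in the hierarchy, with a naming rule on the external ID taking precedence. Everything here is hypothetical: the level-to-type mapping, the `pump:` prefix convention, and the example external IDs (other than the root used earlier) are invented for illustration and would have to be replaced by rules that match your own hierarchy.

```python
from typing import Optional

# Hypothetical mapping from hierarchy level to a more specific type.
TYPE_BY_LEVEL = {0: "Root", 1: "LiftStation", 2: "Pump"}


def depth(external_id: str, parent_of: dict[str, Optional[str]]) -> int:
    """Count the steps from an asset up to the root of the hierarchy."""
    level = 0
    current = external_id
    while parent_of.get(current) is not None:
        current = parent_of[current]
        level += 1
    return level


def infer_type(external_id: str, parent_of: dict[str, Optional[str]]) -> str:
    """Guess a type from the external ID first, then from the hierarchy level."""
    # Hypothetical naming convention: pump external IDs start with "pump:".
    if external_id.startswith("pump:"):
        return "Pump"
    # Fall back to the generic type when the level is not mapped.
    return TYPE_BY_LEVEL.get(depth(external_id, parent_of), "Asset")


# Child external ID -> parent external ID (None marks the root).
parent_of = {
    "lift_pump_stations:root": None,
    "station:riverside": "lift_pump_stations:root",
    "pump:riverside_1": "station:riverside",
    "valve_7": "pump:riverside_1",
}
types = {xid: infer_type(xid, parent_of) for xid in parent_of}
```

Falling back to `Asset` keeps the migration lossless: assets that match no rule are still loaded, just without a more specific type.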
