# Classic Knowledge Graph Onboarding

**Prerequisites**:

- Access to a CDF Project.
- Know how to install and set up Python.
- Know how to launch a Python notebook.



In this tutorial, we will show you how to onboard data from the classic CDF resources (assets, time series, files, sequences, events, data sets, and labels) into an extension of the Cognite Core data model.

## Use Case

The use case in this tutorial is a wind farm named `Utsira` with two wind turbines, `WT-01` and `WT-02`. Each turbine has two time series, one for power production and one for forecast. One of the turbines also has a linked power curve sequence and a maintenance event. In addition, there is a weather station, modeled as an asset, that is connected to the two turbines through relationships. Finally, there is a file with a data sheet linked to one of the turbines.

In the CDF UI, it looks as shown below:

<img src="../../artifacts/figs/wind_farm_classic_knowledge_graph.png" width="400">
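
The same use case can be sketched as plain Python records, which makes it easier to relate the description to the instance counts reported after extraction. This is only an illustration: apart from `Utsira`, `WT-01`, and `WT-02`, all identifiers are hypothetical, the choice of `WT-01` as the turbine with the sequence, event, and file is an assumption, and labels are left out because the description does not cover them.

```python
from collections import Counter

# Only "Utsira", "WT-01" and "WT-02" are names from the tutorial; every other
# identifier below is a hypothetical placeholder.
ASSETS = ["Utsira", "WT-01", "WT-02", "weather-station"]

resources = [{"type": "Asset", "id": asset_id} for asset_id in ASSETS]

# Each turbine has one production and one forecast time series.
for turbine in ("WT-01", "WT-02"):
    for kind in ("production", "forecast"):
        resources.append(
            {"type": "TimeSeries", "id": f"{turbine}-{kind}", "asset": turbine}
        )

# Assumption: WT-01 is the turbine carrying the sequence, event and file.
resources += [
    {"type": "Sequence", "id": "power-curve", "asset": "WT-01"},
    {"type": "Event", "id": "maintenance", "asset": "WT-01"},
    {"type": "File", "id": "data-sheet", "asset": "WT-01"},
]

# The weather station is connected to both turbines through relationships.
for turbine in ("WT-01", "WT-02"):
    resources.append(
        {"type": "Relationship", "source": "weather-station", "target": turbine}
    )

# Count the records per resource type.
counts = Counter(resource["type"] for resource in resources)
print(counts)
```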

## Extracting Data

We will start by creating a new `NeatSession` and connecting to CDF.

In [1]:
from cognite.neat import NeatSession, get_cognite_client

In [2]:
client = get_cognite_client(".env")

Found .env file in repository root. Loaded variables from .env file.


In [3]:
# We have installed neat with `pip install "cognite-neat[oxi]"` so that we can use oxigraph for storage.
neat = NeatSession(client, storage="oxigraph")

Neat Engine 2.0.3 loaded.


We start by reading in the knowledge graph. We do this by pointing to the root node. Neat will then get all assets in the hierarchy, as well as all resources linked to this hierarchy.

In [4]:
# This cell is removed from the docs.
from cognite.neat._config import GLOBAL_CONFIG

# Switch to TQDM progress bar as that looks better in the docs
GLOBAL_CONFIG.progress_bar = "tqdm-notebook"

In [5]:
neat.read.cdf.classic.graph("Utsira")

Extracting Asset relationships:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting TimeSeries relationships:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting Sequence relationships:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting Event relationships:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting File relationships:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting end nodes Asset:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting labels:   0%|          | 0/1 [00:00<?, ?it/s]
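
Conceptually, reading from a root node amounts to walking the asset hierarchy and collecting every resource linked to a visited asset. The following is a minimal sketch of that idea in plain Python. It is not neat's implementation, and the function name `collect_hierarchy` and the sample identifiers are hypothetical.

```python
from collections import deque


def collect_hierarchy(root, children, linked):
    """Return all assets under `root` and the resources linked to them.

    `children` maps an asset to its child assets and `linked` maps an asset
    to the resources (time series, files, ...) attached to it.
    """
    assets, resources = [], []
    queue = deque([root])
    while queue:
        asset = queue.popleft()
        assets.append(asset)
        resources.extend(linked.get(asset, []))
        queue.extend(children.get(asset, []))
    return assets, resources


# Hypothetical sample data loosely based on the wind farm use case.
children = {"Utsira": ["WT-01", "WT-02"]}
linked = {"WT-01": ["ts-production-01"], "WT-02": ["ts-production-02"]}

assets, resources = collect_hierarchy("Utsira", children, linked)
print(assets, resources)
```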

We can now inspect the instances we have extracted.

In [6]:
neat

Unnamed: 0,Type,Occurrence
0,TimeSeries,4
1,Label,4
2,Asset,4
3,Relationship,2
4,Event,1
...,...,...
6,File,1


## Prepare for CogniteCore

We need to prepare the extracted instances for Cognite Core. This means, for example, converting relationships to edges and making `TimeSeries.isString` into an enum.

These preparation steps are gathered in a single method in neat.

In [7]:
neat.prepare.instances.classic_to_core()

Relationships to edges:   0%|          | 0/2 [00:00<?, ?it/s]
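
To illustrate the kind of transformation involved, here is a minimal sketch of the two examples mentioned above: turning a relationship record into an edge and turning a boolean `isString` flag into an enum. It is not what neat runs internally. The helper names, the dictionary keys, and the enum values are assumptions made for this example.

```python
from enum import Enum


class TimeSeriesType(Enum):
    """Hypothetical enum replacing the boolean isString flag."""

    STRING = "string"
    NUMERIC = "numeric"


def is_string_to_enum(is_string: bool) -> TimeSeriesType:
    """Map the boolean flag to an enum member."""
    return TimeSeriesType.STRING if is_string else TimeSeriesType.NUMERIC


def relationship_to_edge(relationship: dict) -> dict:
    """Turn a relationship record into an edge between two nodes."""
    edge_type = f"{relationship['source_type']}To{relationship['target_type']}Edge"
    return {
        "start_node": relationship["source_external_id"],
        "end_node": relationship["target_external_id"],
        "type": edge_type,
    }


# Hypothetical relationship between the weather station and a turbine.
edge = relationship_to_edge(
    {
        "source_external_id": "weather-station",
        "source_type": "Asset",
        "target_external_id": "WT-01",
        "target_type": "Asset",
    }
)
print(edge, is_string_to_enum(False))
```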

## Inferring Data Model

We are now ready to infer a data model.

In [8]:
neat.infer()

Inferring classes:   0%|          | 0/8 [00:00<?, ?it/s]

Unnamed: 0_level_0,count
NeatIssue,Unnamed: 1_level_1
PropertyValueTypeUndefinedWarning,8
PropertySkippedWarning,6
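
To give a feel for what inference means here, the sketch below derives a simple schema from a handful of instances by recording which properties each type carries and which value types were observed. It is a simplified illustration under assumed data shapes, not neat's inference algorithm. The function name `infer_classes` and the sample instances are hypothetical.

```python
from collections import defaultdict


def infer_classes(instances):
    """Map each type to its properties and the value types seen for them."""
    schema = defaultdict(lambda: defaultdict(set))
    for instance in instances:
        for name, value in instance["properties"].items():
            schema[instance["type"]][name].add(type(value).__name__)
    # Convert the nested defaultdicts to plain dicts for readability.
    return {cls: dict(props) for cls, props in schema.items()}


# Hypothetical instances; a property may be missing on some of them.
instances = [
    {"type": "Asset", "properties": {"name": "WT-01", "description": "Turbine"}},
    {"type": "Asset", "properties": {"name": "WT-02"}},
    {"type": "TimeSeries", "properties": {"name": "power", "isString": False}},
]

schema = infer_classes(instances)
print(schema)
```

A property whose value type cannot be determined from the data is the sort of case that a warning about undefined value types would flag.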


## Preparing Model for Mapping to CogniteCore

When we create a data model based on assets, time series, and the other classic resources, we get a model that uses reserved words as class names. To overcome this, we prefix all classes with the organization name. In this tutorial, we use the fictitious organization `DoctrinoInc`.

In [9]:
organiztion_name = "DoctrinoInc"

In [10]:
neat.prepare.data_model.prefix(organiztion_name)
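
The effect of prefixing can be sketched in a few lines of plain Python. This only illustrates the renaming idea, and the helper `prefix_classes` is hypothetical rather than part of neat.

```python
def prefix_classes(class_names, prefix):
    """Return a mapping from each original class name to its prefixed name."""
    return {name: f"{prefix}{name}" for name in class_names}


renamed = prefix_classes(["Asset", "TimeSeries", "Event"], "DoctrinoInc")
print(renamed)
```

The prefixed names correspond to the instance types that show up later when the model is populated, such as `DoctrinoIncAsset`.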

Now we are ready to verify the model, that is, to validate that it is consistent.

In [11]:
neat.verify()

In [12]:
neat.show.data_model()

http_purl.org_cognite_neat_data-model_verified_logical_neat_space_NeatInferredDataModel_v1.html


## Converting Information to DMS

The model we inferred was an information model. This is a descriptive model that does not contain any information about implementation. It must be converted to a domain model storage (DMS) format.

We apply special handling to the classes that should be implemented as edges with properties.

First, we set all classes that have the suffix `Edge` to implement `Edge`.
Then, we use a special mode when converting that turns all classes that implement `Edge` into edges instead of nodes.

In [13]:
neat.prepare.data_model.add_implements_to_classes(suffix="Edge", implements="Edge")

In [14]:
neat.convert("dms", mode="edge_properties")

Rules converted to dms
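
The suffix-based selection in the first step can be pictured with the following sketch. It shows only the selection logic on a plain dictionary. The helper `add_implements` and the data layout are assumptions for illustration and do not reflect how neat stores its model.

```python
def add_implements(classes, suffix, implements):
    """Return a copy where every class ending in `suffix` implements `implements`.

    `classes` maps a class name to the list of classes it implements.
    """
    updated = {}
    for name, parents in classes.items():
        if name.endswith(suffix) and implements not in parents:
            parents = [*parents, implements]
        updated[name] = list(parents)
    return updated


# Hypothetical class listing after prefixing.
classes = {
    "DoctrinoIncAsset": [],
    "DoctrinoIncAssetToAssetEdge": [],
}

with_edges = add_implements(classes, suffix="Edge", implements="Edge")
print(with_edges)
```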


## Mapping Model to Cognite Core

We now have a DMS representation of our data model. This can be mapped on top of `CogniteCore` using a standard mapping.

In [15]:
neat.show.data_model()

http_purl.org_cognite_neat_data-model_verified_physical_neat_space_NeatInferredDataModel_v1.html


In [16]:
neat.mapping.data_model.classic_to_core(organiztion_name)

Unnamed: 0_level_0,count
NeatIssue,Unnamed: 1_level_1
PropertyOverwritingWarning,6


The warnings appear because some of the mappings overwrote the data type.

In [17]:
neat.show.data_model()

http_purl.org_cognite_neat_data-model_verified_physical_neat_space_NeatInferredDataModel_v1.html


## Publishing Data Model

We can now publish the data model. First, we set the data model ID.

In [29]:
neat.set.data_model_id(("sp_doctrino_snapshot", "WindFarm", "v1"))

In [30]:
neat.to.cdf.data_model()

You can inspect the details with the .inspect.outcome.data_model(...) method.


Unnamed: 0,name,created
0,spaces,1
1,containers,9
2,views,9
3,data_models,1
4,nodes,0


## Populate new Model

In [31]:
neat.to.cdf.instances()

You can inspect the details with the .inspect.outcome.instances(...) method.


Unnamed: 0,name,created,changed,unchanged
0,DoctrinoIncAssetToAssetEdge,2,0,0
1,DoctrinoIncLabel,3,1,0
2,DoctrinoIncSequence,1,0,0
3,DoctrinoIncSourceSystem,2,0,0
4,DoctrinoIncAsset,3,3,1
5,edge,2,0,2
6,,3,0,0
7,DoctrinoIncFile,1,0,0
8,DoctrinoIncTimeSeries,4,0,0
9,DoctrinoIncEvent,1,0,0


## (Optional) Create a Data Product Model

The new model will be very large, as it includes 33 views from CogniteCore. We will now make a read-only model, called a Data Product, that is smaller and only uses the original views and properties we identified.

In [32]:
neat.prepare.data_model.to_data_product(("sp_doctrino_readonly", "WindFarmReadOnlny", "v1"))
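
The idea of trimming a large model down to the views and properties actually in use can be sketched as a simple filter. This is a conceptual illustration only. The helper `select_views`, the view names, and the property lists are hypothetical and are not taken from the model built in this tutorial.

```python
def select_views(all_views, used):
    """Keep only the views and properties listed in `used`.

    `all_views` maps a view name to its properties and `used` maps a view
    name to the set of properties that should be kept.
    """
    return {
        view: [prop for prop in props if prop in used[view]]
        for view, props in all_views.items()
        if view in used
    }


# Hypothetical full model: one view of our own plus views we do not use.
all_views = {
    "DoctrinoIncAsset": ["name", "description", "parent", "path"],
    "UnusedViewA": ["name"],
    "UnusedViewB": ["name", "files"],
}
used = {"DoctrinoIncAsset": {"name", "parent"}}

data_product = select_views(all_views, used)
print(data_product)
```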

Then we publish this one.

In [33]:
neat.to.cdf.data_model(existing="force")

You can inspect the details with the .inspect.outcome.data_model(...) method.


Unnamed: 0,name,created
0,spaces,1
1,containers,0
2,views,9
3,data_models,1
4,nodes,0


We can now inspect the newly created data models with instances in CDF.

<img src="../../artifacts/figs/wind_farm_data_product_instances.png" width="400">