# CSV Onboarding

**Prerequisites**:

- Installed Neat, see [Installation](../../gettingstarted/installation.html)
- Launched a notebook environment.
- Familiar with the `NeatSession` object, see [introduction](../introduction/introduction.html)
- Access to `NeatEngine`



In this tutorial, we will load data from a `csv` file, infer a data model from that data, and push both the model and the data to CDF.

## Reading Metadata

We will start by instantiating a `NeatSession` and reading the data from a URL.

In [1]:
from cognite.neat import NeatSession, get_cognite_client

In [2]:
neat = NeatSession(get_cognite_client(".env"))

Found .env file in repository root. Loaded variables from .env file.
Neat Engine 2.0.3 loaded.


A snippet of the data we are reading is shown below:


```csv
ELC_STATUS_ID,RES_ID,SOURCE_DB,SOURCE_TABLE,WMT_AREA_ID,WMT_CATEGORY_ID,WMT_CONTRACTOR_ID,WMT_FUNC_CODE_ID,WMT_LOCATION_ID,WMT_PO_ID,WMT_SAFETYCRITICALELEMENT_ID,WMT_SYSTEM_ID,WMT_TAG_CREATED_DATE,WMT_TAG_CRITICALLINE,WMT_TAG_DESC,WMT_TAG_GLOBALID,WMT_TAG_HISTORYREQUIRED,WMT_TAG_ID,WMT_TAG_ID_ANCESTOR,WMT_TAG_ISACTIVE,WMT_TAG_ISOWNEDBYPROJECT,WMT_TAG_LOOP,WMT_TAG_MAINID,WMT_TAG_NAME,WMT_TAG_UPDATED_BY,WMT_TAG_UPDATED_DATE,WMT_TAG_STATUSCHGDATE,WMT_TAG_COMMENT,WMT_TAG_SUFFIX,WMT_SYSTEM_ACTIVE,WMT_SYSTEM_CODE,WMT_SYSTEM_DESC,WMT_SYSTEM_NAME,WMT_LOCATION_ACTIVE,WMT_LOCATION_CODE,WMT_LOCATION_EXTENDACTIVEWOP,WMT_LOCATION_EXTERNALOWNERSHIP,WMT_LOCATION_MAITIS,WMT_LOCATION_NAME,WMT_LOCATION_NOCOPIESDEFAULTIC,WMT_LOCATION_NOCOPIESWOPERMIT,WMT_LOCATION_OPERATIONHOURS,WMT_LOCATION_PROGVALUE,WMT_LOCATION_SIMULATETIMEFRAME,WMT_LOCATION_SJAMAXNOOFTASKS,WMT_LOCATION_USEPLOTALTITUDE,WMT_LOCATION_WORKSTART,latestUpdateTimeSource
1211,525283,workmate,wmate_dba.wmt_tag,1600,1116,1686,4564,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTGGEAR THRUST BRG OUT,1000000000681024,Y,346434,345637,1,0,96116,681760,23-TE-96116-04,8137,11/07/2014 09:25,,,,,,,,,,,,,,,,,,,,,,
1211,532924,workmate,wmate_dba.wmt_tag,1600,1116,1686,4564,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTG COMP SEAL GAS HTR,1000000000682252,Y,346452,346633,1,0,96148,681760,23-TE-96148,8137,11/07/2014 09:25,,,,,,,,,,,,,,,,,,,,,,
1211,446683,workmate,wmate_dba.wmt_tag,1600,1116,1686,4627,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTGGEAR 1 JOURNBRG DE,1000000000715794,Y,346995,346935,1,0,96117,681760,23-YT-96117-01,9802,09/12/2013 12:53,,,,,,,,,,,,,,,,,,,,,,
1211,,workmate,wmate_dba.wmt_tag,1600,1152,1686,11275,1004,,,4440,13/12/2012 14:13,N,SOFT TAG VRD - PH 1STSTG PRIM SEAL LEAK DE,1000000000250739,Y,682956,345868,1,0,,681760,23-FI-96151,1001,09/10/2015 11:56,06/10/2014 07:45,,,,,,,,,,,,,,,,,,,,,
```

To read a `csv`, we need to tell Neat what type of data is in the source, as well as which column is the identifier.

We know that this data contains assets and that the column `WMT_TAG_GLOBALID` is the unique identifier of these assets.

In [3]:
url = "https://apps-cdn.cogniteapp.com/toolkit/publicdata/assets.Table.csv"

In [4]:
neat.read.csv(url, type="Asset", primary_key="WMT_TAG_GLOBALID")

In [5]:
neat

| | Type | Occurrence |
|---|---|---|
| 0 | Asset | 1103 |


From the output above, we see that we successfully read 1103 assets into the `NeatSession`.
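
Because the read step depends on the chosen column identifying each row, it can be useful to check the column before loading. The following is a minimal sketch using only the Python standard library; it is not part of Neat. The helper `check_primary_key` and the `SAMPLE` text are hypothetical, and the sample holds just three columns copied from the four rows of the snippet above.

```python
import csv
import io
from collections import Counter

# Illustrative data only: three columns taken from the four sample rows above.
SAMPLE = """WMT_TAG_GLOBALID,WMT_TAG_NAME,RES_ID
1000000000681024,23-TE-96116-04,525283
1000000000682252,23-TE-96148,532924
1000000000715794,23-YT-96117-01,446683
1000000000250739,23-FI-96151,
"""


def check_primary_key(csv_text: str, column: str) -> dict:
    """Report missing and duplicated values in a candidate identifier column.

    Hypothetical helper for illustration; Neat does not expose this function.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        raise KeyError(f"Column {column!r} not found in CSV header")
    values = [row[column].strip() for row in reader]
    # Count only non-empty values when looking for duplicates.
    counts = Counter(value for value in values if value)
    return {
        "rows": len(values),
        "missing": sum(1 for value in values if not value),
        "duplicates": sorted(value for value, n in counts.items() if n > 1),
    }


report = check_primary_key(SAMPLE, "WMT_TAG_GLOBALID")
print(report)
```

A column with no missing values and no duplicates is a reasonable candidate for `primary_key`. In this sample, `RES_ID` would not qualify, since one row leaves it empty.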

## Infer Data Model

We can infer a data model from data in the `NeatSession` by calling `.infer()`.

In [6]:
neat.infer()

In [7]:
neat

| | |
|---|---|
| type | Logical Data Model |
| intended for | Information Architect |
| name | Inferred Model |
| external_id | NeatInferredDataModel |
| version | v1 |
| classes | 1 |
| properties | 29 |

| | Type | Occurrence |
|---|---|---|
| 0 | Asset | 1103 |


## Inspect Data Model

In [None]:
neat.inspect.properties()

| | class_ | property_ | value_type | max_count | instance_source |
|---|---|---|---|---|---|
| 0 | neat_space:Asset | WMT_TAG_MAINID | long | 1 | prefix_16:Asset(prefix_16:WMT_TAG_MAINID) |
| 1 | neat_space:Asset | WMT_TAG_DESC | string | 1 | prefix_16:Asset(prefix_16:WMT_TAG_DESC) |
| 2 | neat_space:Asset | WMT_CATEGORY_ID | long | 1 | prefix_16:Asset(prefix_16:WMT_CATEGORY_ID) |
| 3 | neat_space:Asset | WMT_LOCATION_ID | long | 1 | prefix_16:Asset(prefix_16:WMT_LOCATION_ID) |
| 4 | neat_space:Asset | WMT_TAG_CRITICALLINE | string | 1 | prefix_16:Asset(prefix_16:WMT_TAG_CRITICALLINE) |
| 5 | neat_space:Asset | WMT_TAG_ISACTIVE | long | 1 | prefix_16:Asset(prefix_16:WMT_TAG_ISACTIVE) |
| 6 | neat_space:Asset | WMT_TAG_GLOBALID | long | 1 | prefix_16:Asset(prefix_16:WMT_TAG_GLOBALID) |
| 7 | neat_space:Asset | WMT_FUNC_CODE_ID | long | 1 | prefix_16:Asset(prefix_16:WMT_FUNC_CODE_ID) |
| 8 | neat_space:Asset | WMT_TAG_ISOWNEDBYPROJECT | long | 1 | prefix_16:Asset(prefix_16:WMT_TAG_ISOWNEDBYPRO... |
| 9 | neat_space:Asset | WMT_TAG_HISTORYREQUIRED | string | 1 | prefix_16:Asset(prefix_16:WMT_TAG_HISTORYREQUI... |
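
To build some intuition for the `value_type` column, here is a rough sketch of how a type could be guessed from raw `csv` values. It is not Neat's inference algorithm. The functions `infer_value_type` and `infer_properties` are hypothetical, only the two type names seen in the table (`long` and `string`) are handled, and skipping columns that have no values at all is a choice made for this sketch.

```python
import csv
import io

# Illustrative data only: a few columns from the first two sample rows of the tutorial.
SAMPLE = """WMT_TAG_GLOBALID,WMT_TAG_DESC,WMT_TAG_CRITICALLINE,WMT_TAG_ISACTIVE,WMT_TAG_COMMENT
1000000000681024,VRD - PH 1STSTGGEAR THRUST BRG OUT,N,1,
1000000000682252,VRD - PH 1STSTG COMP SEAL GAS HTR,N,1,
"""


def infer_value_type(values):
    """Guess 'long' if every non-empty value parses as an integer, else 'string'.

    Returns None when the column has no values to infer from.
    """
    non_empty = [value for value in values if value != ""]
    if not non_empty:
        return None
    try:
        for value in non_empty:
            int(value)
    except ValueError:
        return "string"
    return "long"


def infer_properties(csv_text):
    """Map each column that has at least one value to a guessed value type."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    inferred = {}
    for column in columns:
        value_type = infer_value_type([row[column] for row in rows])
        if value_type is not None:  # sketch-only choice: drop empty columns
            inferred[column] = value_type
    return inferred


properties = infer_properties(SAMPLE)
for name, value_type in properties.items():
    print(f"{name}: {value_type}")
```

For these four columns the sketch arrives at the same types as the table above: `long` for `WMT_TAG_GLOBALID` and `WMT_TAG_ISACTIVE`, `string` for `WMT_TAG_DESC` and `WMT_TAG_CRITICALLINE`.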


After inspecting the properties, we notice that we have a `Logical Data Model`, which cannot be written to CDF. To publish it, we first convert it to the `dms` format, which is what CDF expects for data models.

## Convert Data Model 

In [9]:
neat.convert()

Rules converted to dms


In [10]:
neat.inspect.properties

| | view | view_property | value_type | nullable | is_list | container | container_property | logical |
|---|---|---|---|---|---|---|---|---|
| 0 | neat_space:Asset(version=v1) | WMT_TAG_MAINID | int64 | True | False | neat_space:Asset | WMT_TAG_MAINID | http://purl.org/cognite/neat/data-model/verifi... |
| 1 | neat_space:Asset(version=v1) | WMT_TAG_DESC | text | True | False | neat_space:Asset | WMT_TAG_DESC | http://purl.org/cognite/neat/data-model/verifi... |
| 2 | neat_space:Asset(version=v1) | WMT_CATEGORY_ID | int64 | True | False | neat_space:Asset | WMT_CATEGORY_ID | http://purl.org/cognite/neat/data-model/verifi... |
| 3 | neat_space:Asset(version=v1) | WMT_LOCATION_ID | int64 | True | False | neat_space:Asset | WMT_LOCATION_ID | http://purl.org/cognite/neat/data-model/verifi... |
| 4 | neat_space:Asset(version=v1) | WMT_TAG_CRITICALLINE | text | True | False | neat_space:Asset | WMT_TAG_CRITICALLINE | http://purl.org/cognite/neat/data-model/verifi... |
| 5 | neat_space:Asset(version=v1) | WMT_TAG_ISACTIVE | int64 | True | False | neat_space:Asset | WMT_TAG_ISACTIVE | http://purl.org/cognite/neat/data-model/verifi... |
| 6 | neat_space:Asset(version=v1) | WMT_TAG_GLOBALID | int64 | True | False | neat_space:Asset | WMT_TAG_GLOBALID | http://purl.org/cognite/neat/data-model/verifi... |
| 7 | neat_space:Asset(version=v1) | WMT_FUNC_CODE_ID | int64 | True | False | neat_space:Asset | WMT_FUNC_CODE_ID | http://purl.org/cognite/neat/data-model/verifi... |
| 8 | neat_space:Asset(version=v1) | WMT_TAG_ISOWNEDBYPROJECT | int64 | True | False | neat_space:Asset | WMT_TAG_ISOWNEDBYPROJECT | http://purl.org/cognite/neat/data-model/verifi... |
| 9 | neat_space:Asset(version=v1) | WMT_TAG_HISTORYREQUIRED | text | True | False | neat_space:Asset | WMT_TAG_HISTORYREQUIRED | http://purl.org/cognite/neat/data-model/verifi... |


Now we see that we have information about how the data model is implemented. 
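
The relationship between the logical and `dms` property tables can be sketched as a simple mapping. This is an illustration read off the two outputs in this tutorial, not Neat's converter. The dataclasses, the `to_dms` function, and the rule used for `is_list` are assumptions, and only the two value types that appear in the tables are covered.

```python
from dataclasses import dataclass

# Type mapping as observed in the two property tables; other types are not covered.
LOGICAL_TO_DMS_TYPE = {"long": "int64", "string": "text"}


@dataclass(frozen=True)
class LogicalProperty:
    """One row of the logical property table (hypothetical structure)."""
    class_: str
    property_: str
    value_type: str
    max_count: int


@dataclass(frozen=True)
class DmsProperty:
    """One row of the dms property table (hypothetical structure)."""
    view: str
    view_property: str
    value_type: str
    nullable: bool
    is_list: bool
    container: str
    container_property: str


def to_dms(prop: LogicalProperty, version: str = "v1") -> DmsProperty:
    """Translate a logical property into its view and container counterpart."""
    return DmsProperty(
        view=f"{prop.class_}(version={version})",
        view_property=prop.property_,
        value_type=LOGICAL_TO_DMS_TYPE[prop.value_type],
        nullable=True,  # assumption: every row in the table above is nullable
        is_list=prop.max_count != 1,  # assumption: a max_count of 1 means a single value
        container=prop.class_,
        container_property=prop.property_,
    )


converted = to_dms(LogicalProperty("neat_space:Asset", "WMT_TAG_MAINID", "long", 1))
print(converted)
```

The sketch shows the main change: the logical model describes classes and properties, while the `dms` model also says in which view and container each property is stored.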

We can also display the steps we have taken so far, known as the provenance of the data model.

In [11]:
neat.show.data_model.provenance()

data_model_provenance_95e2e718.html


We notice that the data model still has the default space and model identifier, so we set a unique identifier before publishing.

## Publish Data Model

In [12]:
neat.set.data_model_id(("sp_doctrino", "DoctrinoAssetModel", "v1"))
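
The identifier above is a three-part tuple. Judging from the outputs in this tutorial (the `external_id` and `version` rows of the model summary, and the `space` shown when instances are loaded), the parts appear to be the space, the external ID, and the version. The sketch below names them with a hypothetical `DataModelId` helper, which is not a Neat class.

```python
from typing import NamedTuple


class DataModelId(NamedTuple):
    """Hypothetical helper that names the three parts of the identifier tuple."""
    space: str
    external_id: str
    version: str


model_id = DataModelId(space="sp_doctrino", external_id="DoctrinoAssetModel", version="v1")

# A NamedTuple is still a plain tuple, so it equals the literal used in the cell above.
print(tuple(model_id))
```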

Now we are ready to publish this to CDF.

In [14]:
neat.to.cdf.data_model(existing="recreate")

You can inspect the details with the .inspect.outcome.data_model(...) method.


| | name | unchanged | skipped | created | deleted |
|---|---|---|---|---|---|
| 0 | spaces | 1 | 0 | 0 | 0 |
| 1 | containers | 0 | 1 | 0 | 0 |
| 2 | views | 0 | 0 | 1 | 1 |
| 3 | data_models | 0 | 0 | 1 | 1 |
| 4 | nodes | 0 | 0 | 0 | 0 |


## Populate Data Model

Neat keeps track of the data, so we can immediately populate this data model with the original data.

In [15]:
neat.to.cdf.instances()

INFO | 2025-01-26 14:18:14,262 | Staring DMSLoader and will process 1 views.
INFO | 2025-01-26 14:18:14,290 | Starting ViewId(space='sp_doctrino', external_id='Asset', version='v1') 1/1.
Loading ViewId(space='sp_doctrino', external_id='Asset', version='v1'): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 1103/1103 [00:02<00:00, 478.40it/s]
INFO | 2025-01-26 14:18:16,637 | Finished ViewId(space='sp_doctrino', external_id='Asset', version='v1').


You can inspect the details with the .inspect.outcome.instances(...) method.


| | name | unchanged |
|---|---|---|
| 0 | Asset | 1103 |


<img src="../../artifacts/figs/working_with_metadata_published_dm.png" width="1200">