# Copy assets and time series from publicdata tenant

This notebook copies all assets and time series (without data points) from the publicdata tenant into a tenant of your choice. This is useful, for instance, for testing contextualization tools.

In [1]:
from getpass import getpass
import json
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset, TimeSeries

We have downloaded all assets and time series to `publicdata.json`, so you don't have to generate an API key for publicdata yourself. First we open the file and populate two variables, `assets` and `time_series`; then we replace them with the native `Asset` and `TimeSeries` objects that the SDK expects as input. To preserve the asset hierarchy, we use each asset's original internal id as its `external_id` and its `parent_id` as its `parent_external_id`.

In [2]:
with open("publicdata.json", "r") as f:
    data = json.load(f)
    assets = data["assets"]
    time_series = data["time_series"]
    print(f"Found {len(assets)} assets and {len(time_series)} time series")
# Create Asset and TimeSeries objects. Reuse the source ids as external_id and
# parent_external_id so that the asset hierarchy is preserved.
assets = [
    Asset(name=a["name"], description=a.get("description"), external_id=a["id"],
          parent_external_id=a.get("parent_id"), source="publicdata")
    for a in assets
]
time_series = [
    TimeSeries(name=ts["name"], description=ts.get("description"), metadata={"source": "publicdata"})
    for ts in time_series
]

Found 1106 assets and 363 time series
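
Because the hierarchy is rebuilt purely from `id` and `parent_id`, it can be worth checking that every `parent_id` in the file refers to an asset that is also in the file before uploading anything. The following is a minimal sketch on plain dictionaries and is not part of the original notebook; `find_orphans` and the toy records are hypothetical and only assume the `id` and `parent_id` keys read above.

```python
def find_orphans(raw_assets):
    """Return the ids of assets whose parent_id is not among the given assets."""
    known_ids = {a["id"] for a in raw_assets}
    return [
        a["id"]
        for a in raw_assets
        if a.get("parent_id") is not None and a["parent_id"] not in known_ids
    ]

# Toy records using the same keys as above (all values are made up).
toy_assets = [
    {"id": 1, "name": "root"},
    {"id": 2, "name": "child", "parent_id": 1},
    {"id": 3, "name": "stray", "parent_id": 99},  # parent 99 is not in the list
]
print(find_orphans(toy_assets))  # [3]
```

In the notebook this would be run on `data["assets"]`, that is, on the raw dictionaries before they are replaced by `Asset` objects.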


Here you provide the API key, project (tenant) name, and base URL of the tenant you want to copy the data to.

In [None]:
api_key = getpass()
client = CogniteClient(
    api_key=api_key, 
    project="functions-tutorial",
    client_name="DSHub",
    base_url="https://greenfield.cognitedata.com"
)
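
If the notebook is run unattended, an interactive `getpass` prompt is inconvenient. A minimal sketch, assuming the key may instead be supplied through an environment variable; the helper `read_api_key` and the variable name `COGNITE_API_KEY` are hypothetical and not something the SDK defines.

```python
import os
from getpass import getpass

def read_api_key(env_var="COGNITE_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting only if it is not set."""
    key = os.environ.get(env_var)
    if key:
        return key
    # Fall back to an interactive prompt that does not echo the input.
    return prompt("API key: ")
```

With this helper, `api_key = read_api_key()` could replace the `getpass()` call above.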

To create an asset hierarchy, we could in principle use the `client.assets.create_hierarchy` function, but we have encountered problems with it. Instead, we do it the simple way: compute the depth of each asset and create the assets one depth level at a time, so that every parent exists before its children are created.

In [8]:
assets_by_id = {asset.external_id: asset for asset in assets}

def find_depth(asset_id, depth=0):
    """Return how many parent links lie above the asset (0 for a root asset)."""
    asset = assets_by_id.get(asset_id)
    # Stop at a root asset, or at a parent id that is not present in the file.
    if asset is None or not asset.parent_external_id:
        return depth
    return find_depth(asset.parent_external_id, depth + 1)
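
To see what depth-by-depth ordering produces, here is a self-contained sketch on a toy hierarchy. It is not the notebook's code: it uses a plain `parent_of` dictionary (hypothetical) instead of `Asset` objects, and it walks up the parent links with a loop rather than recursion.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: external id -> parent external id (None for the root).
parent_of = {"root": None, "a": "root", "b": "root", "a1": "a", "a2": "a", "a1x": "a1"}

def depth_of(node):
    """Count how many parent links separate `node` from its root."""
    depth = 0
    while parent_of.get(node) is not None:
        node = parent_of[node]
        depth += 1
    return depth

# Group the nodes into one batch per depth level.
by_depth = defaultdict(list)
for node in parent_of:
    by_depth[depth_of(node)].append(node)

# Parents always land in an earlier batch than their children.
for depth in sorted(by_depth):
    print(depth, by_depth[depth])
```

Creating one batch per depth, in increasing order, means that every `parent_external_id` refers to an asset that has already been created.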

In [9]:
assets_by_depth = {}
for asset in assets:
    depth = find_depth(asset.external_id)
    assets_by_depth.setdefault(depth, []).append(asset)

for depth in sorted(assets_by_depth.keys()):
    print(f"Creating {len(assets_by_depth[depth])} assets for depth {depth}")
    client.assets.create(assets_by_depth[depth])
print(f"Done with {len(assets)} assets. Creating time series ...")

client.time_series.create(time_series)
print(f"Created {len(time_series)} time_series")

Creating 1 assets for depth 0
Creating 1 assets for depth 1
Creating 1 assets for depth 2
Creating 1 assets for depth 3
Creating 3 assets for depth 4
Creating 5 assets for depth 5
Creating 48 assets for depth 6
Creating 155 assets for depth 7
Creating 303 assets for depth 8
Creating 325 assets for depth 9
Creating 194 assets for depth 10
Creating 63 assets for depth 11
Creating 6 assets for depth 12
Done with 1106 assets. Creating time series ...
Created 363 time_series
