---
title: Purview | Custom process
description: How to create a custom Purview dataset and process with Python
date: "2021-12-14"
toc: true
badges: true
categories: [purview, python, jupyter]
hide: false
image: images/purview_custom_process.png
---

## Prerequisites

In [3]:
%pip install pyapacheatlas


Collecting pyapacheatlas
  Downloading pyapacheatlas-0.10.0-py3-none-any.whl (68 kB)
Collecting openpyxl>=3.0
  Downloading openpyxl-3.0.9-py2.py3-none-any.whl (242 kB)
Collecting et-xmlfile
  Using cached et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB)
Installing collected packages: et-xmlfile, openpyxl, pyapacheatlas
Successfully installed et-xmlfile-1.1.0 openpyxl-3.0.9 pyapacheatlas-0.10.0
Note: you may need to restart the kernel to use updated packages.


## Code

### Import

In [60]:
from pyapacheatlas.auth import ServicePrincipalAuthentication
from pyapacheatlas.core import AtlasProcess
from pyapacheatlas.core.client import PurviewClient
from pyapacheatlas.core.entity import AtlasEntity
from pyapacheatlas.core.typedef import (
    AtlasAttributeDef,
    Cardinality,
    ChildEndDef,
    EntityTypeDef,
    ParentEndDef,
    RelationshipTypeDef,
)
from pyapacheatlas.core.util import GuidTracker


### Settings

In [5]:
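tenant_id = ""      # Azure AD tenant ID
client_id = ""      # Service principal (app registration) client ID
client_secret = ""  # Service principal client secret
purview_name = ""   # Name of the Purview account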
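
Leaving a client secret in a notebook cell makes it easy to leak when the notebook is published. A minimal sketch of one alternative, assuming the four values are exported as environment variables. The variable names (`PURVIEW_TENANT_ID` and so on) and the `load_settings` helper are hypothetical, not something pyapacheatlas prescribes.

```python
import os

# Hypothetical environment variable names; adapt them to your own setup.
REQUIRED_VARS = {
    "tenant_id": "PURVIEW_TENANT_ID",
    "client_id": "PURVIEW_CLIENT_ID",
    "client_secret": "PURVIEW_CLIENT_SECRET",
    "purview_name": "PURVIEW_ACCOUNT_NAME",
}


def load_settings(env=None):
    """Return the four settings as a dict, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in REQUIRED_VARS.items()}
```

With the variables set, `settings = load_settings()` could then feed `tenant_id = settings["tenant_id"]` and the other three assignments above.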


### Clients

In [61]:
atlas_sp = ServicePrincipalAuthentication(
    tenant_id=tenant_id,
    client_id=client_id,
    client_secret=client_secret
)


In [62]:
purview_client = PurviewClient(
    account_name=purview_name,
    authentication=atlas_sp
)


### Entities

In [63]:
guid_tracker = GuidTracker()
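
Entities that do not exist yet need a temporary identifier so that other entities in the same upload can refer to them. Atlas treats a GUID that starts with `-` as unassigned and replaces it with a real one on upload. The sketch below is a conceptual stand-in for that bookkeeping, not pyapacheatlas's implementation: the class name `PlaceholderGuids`, the starting value, and the string return type are assumptions made for illustration.

```python
class PlaceholderGuids:
    """Conceptual stand-in for a GUID tracker (not the library's code)."""

    def __init__(self, starting=-1000):
        # Assumed starting point; any negative number would serve.
        self._current = starting

    def get_guid(self):
        # Step further below zero so every call yields a fresh placeholder.
        self._current -= 1
        return str(self._current)


tracker = PlaceholderGuids()
first, second = tracker.get_guid(), tracker.get_guid()
print(first, second)
```

Each call hands out a different negative value, which is why every entity below requests its own GUID from the tracker.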


In [64]:
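# Register a custom entity type that inherits from the built-in DataSet type.
custom_dataset = purview_client.upload_typedefs(
    entityDefs=[
        EntityTypeDef(
            name="myCustomDataSet",
            superTypes=["DataSet"]
        )
    ],
    # Update the type definition if it already exists.
    force_update=True
)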


In [66]:
myCustomDataset01 = AtlasEntity(
    name="myCustomDataset01",
    typeName="myCustomDataSet",
    qualified_name="pyapacheatlas://mycustomdataset01",
    guid=guid_tracker.get_guid()
)
myCustomDataset02 = AtlasEntity(
    name="myCustomDataset02",
    typeName="myCustomDataSet",
    qualified_name="pyapacheatlas://mycustomdataset02",
    guid=guid_tracker.get_guid()
)


In [69]:
# The process links its inputs to its outputs; this is what produces lineage.
myCustomProcess01 = AtlasProcess(
    name="myCustomProcess01",
    typeName="Process",
    qualified_name="pyapacheatlas://mycustomprocess01",
    inputs=[myCustomDataset01],
    outputs=[myCustomDataset02],
    guid=guid_tracker.get_guid()
)


In [70]:
results = purview_client.upload_entities(
    batch=[myCustomDataset01, myCustomDataset02, myCustomProcess01]
)
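
The upload response is worth checking before opening the Purview UI. A minimal sketch, assuming the response follows the Atlas entity mutation shape with `guidAssignments` (placeholder GUID to assigned GUID) and `mutatedEntities` (entities grouped by action). The `summarize_upload` helper is hypothetical, and the sample response is made up to show the shape only.

```python
def summarize_upload(results):
    """Return (placeholder -> assigned GUID map, entity count per action)."""
    assignments = results.get("guidAssignments", {})
    mutated = results.get("mutatedEntities", {})
    counts = {action: len(entities) for action, entities in mutated.items()}
    return assignments, counts


# Made-up response for illustration; real GUIDs come from the service.
sample_results = {
    "guidAssignments": {"-1001": "guid-a", "-1002": "guid-b", "-1003": "guid-c"},
    "mutatedEntities": {
        "CREATE": [{"guid": "guid-a"}, {"guid": "guid-b"}, {"guid": "guid-c"}]
    },
}

assignments, counts = summarize_upload(sample_results)
for placeholder, assigned in assignments.items():
    print(placeholder, "->", assigned)
print(counts)
```

In the notebook, passing the `results` variable from the cell above in place of `sample_results` would show which GUIDs the three entities received.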


## Result

![Lineage of myCustomProcess01 in Purview](images/purview_custom_process.png)


## Utils

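### Delete type definitions

In [None]:
# Remove the custom type registered above. Entities of this type may need
# to be deleted first, otherwise the service can reject the request.
purview_client.delete_typedefs(
    entityDefs=[
        {"name": "myCustomDataSet"},
    ]
)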
