![Egeria Logo](https://raw.githubusercontent.com/odpi/egeria/master/assets/img/ODPi_Egeria_Logo_color.png)

### ODPi Egeria Hands-On Lab
# Welcome to the Building a Data Catalog Lab

## Introduction

ODPi Egeria is an open source project that provides open standards and implementation libraries to connect tools, catalogues and platforms together so they can share information about data and technology (called metadata).

In this hands-on lab you will get a chance to run an Egeria metadata server, build a simple catalog of data sets, connect this server to another Egeria metadata server and then experiment with attaching feedback (comments) to the catalog entries from either server.  This feedback can be seen linked to the catalog entries from both servers.

## The Scenario

The ODPi Egeria team use the personas from the fictitious company called Coco Pharmaceuticals.  (See https://opengovernance.odpi.org/coco-pharmaceuticals/ for more information).

The two main characters in this scenario are Peter Profile and Erin Overview.

![Peter and Erin](../images/peter-and-erin.png)

In [None]:
petersUserId = "peterprofile"
erinsUserId  = "erinoverview"

Peter and Erin are cataloguing new data sets that have been received from a hospital.  These data sets are part of a clinical trial that the hospital is participating in.

## Setting up

Coco Pharmaceuticals make widespread use of ODPi Egeria for tracking and managing their data and related assets.
Figure 1 below shows the metadata servers and the platforms that are hosting them.

![Figure 1](../images/coco-pharmaceuticals-systems-omag-server-platforms.png)
> **Figure 1:** Coco Pharmaceuticals' OMAG Server Platforms

Peter is using the data lake operations metadata server called `cocoMDS1`. This server is hosted on the Data Lake OMAG Server Platform.

In [None]:
server1            = "cocoMDS1"
server1PlatformURL = "http://localhost:8081"

The following request checks that this server is running.

In [None]:
import requests
import pprint
import json

adminUserId = "garygeeke"

isServer1ActiveURL = server1PlatformURL + "/open-metadata/platform-services/users/" + adminUserId + "/server-platform/servers/" + server1 + "/status"

print (" ")
print ("GET " + isServer1ActiveURL)
print (" ")

response = requests.get(isServer1ActiveURL)

print ("Returns:")
prettyResponse = json.dumps(response.json(), indent=4)
print (prettyResponse)
print (" ")

serverStatus = response.json().get('active')
if serverStatus:
    print("Server " + server1 + " is active - ready to begin")
else:
    print("Server " + server1 + " is down - start it before proceeding")


----
If you see `Server cocoMDS1 is active - ready to begin`, the server is running.  If the server is down, follow the instructions in **Managing Servers** to start it.
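
The status check can be folded into two small helpers so it is easy to repeat for other servers. This is a minimal sketch rather than part of the lab: the function names `build_status_url` and `is_server_active` are hypothetical, and the code assumes the status response is a JSON object carrying the `active` flag that the cell above reads.

```python
def build_status_url(platform_url, admin_user_id, server_name):
    """Build the platform-services status URL used in the cell above."""
    return (platform_url.rstrip("/")
            + "/open-metadata/platform-services/users/" + admin_user_id
            + "/server-platform/servers/" + server_name + "/status")


def is_server_active(status_body):
    """Return True only when the parsed JSON body has "active" set to true.

    Anything else (missing flag, error payload, None) counts as not active.
    """
    return isinstance(status_body, dict) and status_body.get("active") is True


# Demonstration with hand-written sample bodies (no network call is made).
print(build_status_url("http://localhost:8081", "garygeeke", "cocoMDS1"))
print(is_server_active({"active": True}))
print(is_server_active({"active": False}))
```

In the notebook, `is_server_active(response.json())` could replace the direct comparison on `serverStatus`.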

## Exercise 1

### Adding assets to the catalog

In the first exercise, Peter Profile is adding new data sets (assets) to the catalog. 

Peter needs to use the **Asset Owner** Open Metadata Access Service (OMAS) to manage assets in the catalog.  All requests to the Asset Owner OMAS begin with the following URL root.

In [None]:
server1AssetOwnerURL = server1PlatformURL + '/servers/' + server1 + '/open-metadata/access-services/asset-owner/users/' + petersUserId 

First, Peter queries the current list of clinical trial assets in `cocoMDS1`. The search string is sent in the body of a POST request.

In [None]:

getAssetsURL = server1AssetOwnerURL + '/assets/by-name?startFrom=0&pageSize=50'
searchString="Drop Foot"

print (" ")
print ("POST " + getAssetsURL)
print ("{ " + searchString + " }")
print (" ")

response=requests.post(getAssetsURL, data=searchString)

print ("Returns:")
prettyResponse = json.dumps(response.json(), indent=4)
print (prettyResponse)
print (" ")

if response.json().get('assets'):
    if len(response.json().get('assets')) == 1:
        print ("1 asset found")
    else:
        print (str(len(response.json().get('assets'))) + " assets found")
else:
    print ("No assets found")


----
No assets are returned because the repository is empty.
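
The same counting logic is repeated after every search in this lab, so it may be convenient to keep it in one place. The sketch below uses hypothetical helper names (`count_assets`, `describe_asset_count`) and assumes, as the cell above does, that matching assets arrive in an `assets` list that is missing or empty when nothing matches.

```python
def count_assets(response_body):
    """Return the number of entries in the "assets" list, or 0 if it is absent."""
    if not isinstance(response_body, dict):
        return 0
    assets = response_body.get("assets")
    return len(assets) if assets else 0


def describe_asset_count(response_body):
    """Produce the same messages as the cell above from a parsed JSON body."""
    count = count_assets(response_body)
    if count == 0:
        return "No assets found"
    if count == 1:
        return "1 asset found"
    return str(count) + " assets found"


# Demonstration with hand-written sample bodies (no network call is made).
print(describe_asset_count({}))
print(describe_asset_count({"assets": [{"displayName": "example"}]}))
```

With these helpers, each search cell can end with `print(describe_asset_count(response.json()))`.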

#### Adding weekly clinical trial assets


Peter is now going to catalog three weeks of clinical trial data as three data sets, one per week. We'll start with week 1.

In [None]:
createAssetURL = server1AssetOwnerURL + '/assets/csv-files'
print (createAssetURL)

jsonHeader = {'content-type':'application/json'}
body = {
	"class" : "NewFileAssetRequestBody",
	"displayName" : "Week 1: Drop Foot Clinical Trial Measurements",
	"description" : "One week's data covering foot angle, hip displacement and mobility measurements.",
	"fullPath" : "file://secured/research/clinical-trials/drop-foot/DropFootMeasurementsWeek1.csv"
}

response=requests.post(createAssetURL, json=body, headers=jsonHeader)

response.json()


----
Notice that the response includes a property called `guid`.  This is the unique identifier of the asset, and we need to save it in a variable to use later.

In [None]:
asset1guid=response.json().get('guid')

print (" ")
print ("The guid for asset 1 is: " + asset1guid)
print (" ")


----
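If the create request fails, the response carries no `guid`, and the string concatenation in the cell above then raises a `TypeError` that hides the real cause. The following is a small defensive sketch; `extract_guid` is a hypothetical helper name, and the sample bodies are invented for illustration rather than captured from a server.

```python
def extract_guid(response_body):
    """Return the "guid" from a parsed JSON body, or raise a descriptive error."""
    guid = response_body.get("guid") if isinstance(response_body, dict) else None
    if not guid:
        raise ValueError("No guid in response: " + repr(response_body))
    return guid


# Demonstration with invented sample bodies (no network call is made).
print(extract_guid({"guid": "hypothetical-guid-0001"}))

try:
    extract_guid({"relatedHTTPCode": 500})
except ValueError as error:
    print("Create request failed: " + str(error))
```

In the notebook this would be called as `asset1guid = extract_guid(response.json())`.
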
Now let's take another look at the assets in the repository, using the same request as before.


In [None]:

print (" ")
print ("POST " + getAssetsURL)
print ("{ " + searchString + " }")
print (" ")

response=requests.post(getAssetsURL, data=searchString)

print ("Returns:")
prettyResponse = json.dumps(response.json(), indent=4)
print (prettyResponse)
print (" ")

if response.json().get('assets'):
    if len(response.json().get('assets')) == 1:
        print ("1 asset found")
    else:
        print (str(len(response.json().get('assets'))) + " assets found")
else:
    print ("No assets found")


----

Peter is now going to add the assets for the next two weeks.

In [None]:


csvbody2 = {
	"class" : "NewFileAssetRequestBody",
	"displayName" : "Week 2: Drop Foot Clinical Trial Measurements",
	"description" : "One week's data covering foot angle, hip displacement and mobility measurements.",
	"fullPath" : "file://secured/research/clinical-trials/drop-foot/DropFootMeasurementsWeek2.csv"
}

response2=requests.post(createAssetURL, json=csvbody2, headers=jsonHeader)

print ("Second request responded with: " + str(response2.status_code))

asset2guid=response2.json().get('guid')


csvbody3 = {
	"class" : "NewFileAssetRequestBody",
	"displayName" : "Week 3: Drop Foot Clinical Trial Measurements",
	"description" : "One week's data covering foot angle, hip displacement and mobility measurements.",
	"fullPath" : "file://secured/research/clinical-trials/drop-foot/DropFootMeasurementsWeek3.csv"
}

response3=requests.post(createAssetURL, json=csvbody3, headers=jsonHeader)

print ("Third request responded with: "  + str(response3.status_code))

asset3guid=response3.json().get('guid')

print (" ")
print ('Asset 1 guid is: ' + asset1guid)
print ('Asset 2 guid is: ' + asset2guid)
print ('Asset 3 guid is: ' + asset3guid)


----
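The three request bodies above differ only in the week number, so they can be generated rather than written out by hand. This is a minimal sketch; `build_csv_asset_body` is a hypothetical helper name, and the field values are copied from the cells above.

```python
def build_csv_asset_body(week_number):
    """Build the request body for one week's measurement file."""
    week = str(week_number)
    return {
        "class": "NewFileAssetRequestBody",
        "displayName": "Week " + week + ": Drop Foot Clinical Trial Measurements",
        "description": "One week's data covering foot angle, hip displacement "
                       "and mobility measurements.",
        "fullPath": "file://secured/research/clinical-trials/drop-foot/"
                    "DropFootMeasurementsWeek" + week + ".csv",
    }


# Build the bodies for weeks 1 to 3 and show the generated file paths.
bodies = [build_csv_asset_body(week) for week in (1, 2, 3)]
for body in bodies:
    print(body["fullPath"])
```

Each body could then be sent with `requests.post(createAssetURL, json=body, headers=jsonHeader)`, exactly as in the cells above.
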
And let's look again at the assets we have:

In [None]:

print (" ")
print ("POST " + getAssetsURL)
print ("{ " + searchString + " }")
print (" ")

response=requests.post(getAssetsURL, data=searchString)

print ("Returns:")
prettyResponse = json.dumps(response.json(), indent=4)
print (prettyResponse)
print (" ")

if response.json().get('assets'):
    if len(response.json().get('assets')) == 1:
        print ("1 asset found")
    else:
        print (str(len(response.json().get('assets'))) + " assets found")
else:
    print ("No assets found")


----
Peter has successfully created three assets:

In [None]:
print (" ")
print ('Asset 1 guid is: ' + asset1guid)
print ('Asset 2 guid is: ' + asset2guid)
print ('Asset 3 guid is: ' + asset3guid)

## Exercise 2 - Sharing the catalog and adding feedback

In exercise 2 we are going to start working with a second metadata repository server called cocoMDS2.  We will connect it to the same open metadata repository cohort as cocoMDS1 and show that they can share metadata in both directions.

### Bringing cocoMDS2 into the cohort

First let us test that the repository is empty and disconnected.

As before, Gary starts the server. This time a different user, Erin, looks up metadata in the server and finds it empty.


In [None]:
print('Starting server 2 ....')

coreURLroot = "http://localhost:8081"
server2="cocoMDS2"

url=coreURLroot + '/open-metadata/admin-services/users/' + adminUserId + '/servers/' + server2 + '/instance'

print (url)

response=requests.post(url)

response.json()


In [None]:
print ('Checking contents of server (Erin)')



assetUrl=coreURLroot + '/servers/' + server2 + '/open-metadata/access-services/asset-owner/users/' + erinsUserId 
url=assetUrl + '/assets/by-name?startFrom=0&pageSize=50'
print (url)

body="Drop Foot"
response=requests.post(url,data=body)

response.json()

Now we are going to shut down cocoMDS2 so that it can be joined to the cohort.


In [None]:
print('Stopping server 2 ....')

url=coreURLroot + '/open-metadata/admin-services/users/' + adminUserId + '/servers/' + server2 + '/instance'

print (url)

response=requests.delete(url)

response.json()

Now we are going to add cocoMDS2 to the cohort.

In [None]:
cohortName="cocoCohort"
url=coreURLroot + '/open-metadata/admin-services/users/' + adminUserId + '/servers/' + server2 + '/cohorts/' + cohortName

print (url)

# POST registers the server with the cohort; DELETE on this URL would remove the registration
response=requests.post(url)

response.json()

Now we'll restart the server so that it joins the cohort.

In [None]:
print('Starting server 2 ....')

url=coreURLroot + '/open-metadata/admin-services/users/' + adminUserId + '/servers/' + server2 + '/instance'

print (url)

response=requests.post(url)

response.json()
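
The stop, register and restart steps all share one admin-services URL root and differ only in HTTP method and path suffix. The sketch below lists them as (method, URL) pairs without sending anything. The function name `build_cohort_join_steps` is hypothetical, and the sketch assumes that a POST to the `/cohorts/<name>` URL is what registers the server with the cohort.

```python
def build_cohort_join_steps(platform_url, admin_user_id, server_name, cohort_name):
    """Return the ordered (HTTP method, URL) pairs for joining a server to a cohort."""
    server_root = (platform_url.rstrip("/")
                   + "/open-metadata/admin-services/users/" + admin_user_id
                   + "/servers/" + server_name)
    return [
        ("DELETE", server_root + "/instance"),            # stop the running server
        ("POST", server_root + "/cohorts/" + cohort_name),  # assumed: register with the cohort
        ("POST", server_root + "/instance"),              # start the server again
    ]


# Print the plan for cocoMDS2 (no network call is made).
steps = build_cohort_join_steps("http://localhost:8081", "garygeeke",
                                "cocoMDS2", "cocoCohort")
for method, url in steps:
    print(method + " " + url)
```

Each pair can be issued with `requests.request(method, url)`, checking the response before moving on to the next step.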


Let's see which assets cocoMDS2 now knows about.

In [None]:
assetUrl=coreURLroot + '/servers/' + server2 + '/open-metadata/access-services/asset-owner/users/' + erinsUserId 
url=assetUrl + '/assets/by-name?startFrom=0&pageSize=50'
print (url)

body="Drop Foot"
response=requests.post(url,data=body)

print ("Returns:")
prettyResponse = json.dumps(response.json(), indent=4)
print (prettyResponse)
print (" ")

Let's look at the configuration of cocoMDS1.


In [None]:
url=server1PlatformURL + '/open-metadata/admin-services/users/' + adminUserId + '/servers/' + server1 + '/configuration'

print (url)

response=requests.get(url)

response.json()

In [None]:
corePlatformURL     = "http://localhost:8080"