### Import Assets Data Product Hub Level 3 

This notebook speeds up setup for the Level 3 IBM Data Product Hub lab by importing connections and governance artifacts and by automating metadata import and metadata enrichment. It covers four steps:

1. **Creating Connections to Our Catalog**: Establishing secure and reliable connections to our data catalog to ensure seamless access and integration of data assets.

2. **Importing Metadata into Our Project**: Bringing in relevant metadata into our project to provide context and structure to our data, which is crucial for effective data management and utilization.

3. **Running Metadata Enrichment**: Enhancing the imported metadata by adding valuable information, annotations, and classifications. This step improves data quality and discoverability.

4. **Publishing Enriched Data Assets**: Once enriched, we will publish these data assets back to our catalog. This makes them readily available for creating data products and ensures that the enriched information is accessible for future use.

By using the helper script to automate these steps, we can significantly reduce the setup time, allowing us to focus on more advanced aspects of the lab. 

>**&#10032;** **Note:** The helper script is a Python script that uses the Watson Data API to automate the tasks outlined above. Before running it, make sure that the **correct environment variables are entered** so that the script can perform these tasks. With the assets prepared this way, the lab can move straight to creating a data product.


### Set Environment Variables

In [1]:
# Define the environment variables content
env_content = """
# MODIFY FOR YOUR ENVIRONMENT - This will be the base URL for your environment
# Cluster Info
CPD_CLUSTER_HOST=cpd-cpd.apps.6697f88d158155001ef8fc57.ocp.techzone.ibm.com

# Data Producer information
USERNAME=admin
PASSWORD=kjnGi-tDBwr-tWGBA-wKnzU


# Landing Zone information
CATALOG_NAME="DPH Demo Catalog"
PROJECT_ID=46b3240e-8412-4040-af03-df912d4cc28b

#Add Connection info below

# PostgreSQL Warehouse Credentials
PSQL_DB_USERNAME=cpdemo
PSQL_DB_PASSWORD=C!oudP@k4DataDem0s
PSQL_DB_PORT=31128
PSQL_DB_DATABASE=3RDPARTY
PSQL_DB_HOST=85331fa6-6b56-4355-935e-290f3ac8aa8c.8117147f814b4b2ea643610826cd2046.databases.appdomain.cloud
PSQL_DATASOURCE_TYPE=048ed1bf-516c-46f0-ae90-fa3349d8bc1c

# DB2 Warehouse Credentials
DB_USERNAME=cpdemo
DB_PASSWORD=C!oudP@k4DataDem0s
DB_PORT=50001
DB_DATABASE=BLUDB
DB_HOST=db2w-ovqfeqq.us-south.db2w.cloud.ibm.com
DB2_DATASOURCE_TYPE=cfdcb449-1204-44ba-baa6-9a8a878e6aa7

# Cloud Object Storage Credentials
COS_BUCKET=cpd-outcomes
COS_SECRET_KEY=6f3fafc225b2c8527f22d13e2c67034f4da989dc08adffac
COS_API_KEY=_-W6DMVd3cQV7YrWIdmExhX29ApoDIBAu4y0C07dCMPB
COS_ACCESS_KEY=bd3d4829929a41ac8a2e8e7ab58e5cdb
COS_RESOURCE_INSTANCE_ID=crn:v1:bluemix:public:cloud-object-storage:global:a/feb0a088323a45db90b8dd694b314c53:88ee55e3-fc61-46a9-9ccd-e81593585ba6::
COS_DATASOURCE_TYPE=193a97c1-4475-4a19-b90c-295c4fdc6517

ORIGIN_COUNTRY=us

"""


In [2]:
# Define the path for the .env file
env_file_path = './.env'

# Write the content to the .env file
with open(env_file_path, 'w') as env_file:
    env_file.write(env_content)

print(f".env file created at {env_file_path}")

.env file created at ./.env
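Because the helper script depends on these values, it can help to check the `.env` content before going further. The following is a minimal sketch, not part of `client.py`: the `parse_env` and `missing_keys` helpers and the `REQUIRED` list are hypothetical, and the sample text uses placeholder values.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" or ":".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Hypothetical subset of required keys, named after entries in the cell above.
REQUIRED = ["CPD_CLUSTER_HOST", "USERNAME", "PASSWORD", "CATALOG_NAME", "PROJECT_ID"]

# Placeholder content: PASSWORD is empty and PROJECT_ID is absent.
sample = '''
# Cluster Info
CPD_CLUSTER_HOST=example.host
USERNAME=admin
PASSWORD=
CATALOG_NAME="DPH Demo Catalog"
'''
parsed = parse_env(sample)
print(missing_keys(parsed, REQUIRED))  # ['PASSWORD', 'PROJECT_ID']
```

In the notebook, `parse_env(env_content)` could be called on the string defined above, and an empty list from `missing_keys` would indicate that every listed key has a value.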


In [3]:
# Suppress the output of the wget command
!wget https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/Data-Product-Hub-L3/client.py -q
!wget https://github.com/CloudPak-Outcomes/Outcomes-Projects/raw/main/Data-Product-Hub-L3/governance_artifacts.zip -q
# Suppress the output of the pip install command
!pip install cowsay > /dev/null 2>&1
print("Complete")

Complete
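Since `-q` hides any `wget` errors, a failed download would go unnoticed until the import step. The sketch below is one way to confirm that both files arrived; the `check_downloads` helper is hypothetical and only uses the standard library.

```python
import zipfile
from pathlib import Path


def check_downloads(directory="."):
    """Return a list of problems found with the two downloaded files."""
    base = Path(directory)
    problems = []
    script = base / "client.py"
    archive = base / "governance_artifacts.zip"
    if not script.is_file() or script.stat().st_size == 0:
        problems.append("client.py is missing or empty")
    if not archive.is_file() or not zipfile.is_zipfile(archive):
        problems.append("governance_artifacts.zip is missing or not a valid zip")
    return problems


# An empty list means both files are present and the archive is readable.
print(check_downloads() or "Downloads look fine")
```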


## Use the Import Client to Set Up the Environment

In [4]:
import time
from client import ImportClient

client = ImportClient()


In [6]:

# Check variables/credentials
client.verify_vars()

# Get a bearer token for the API calls below
bearer_token = client.get_bearer_token()

# Cloud Object Storage
client.define_cos_connection(bearer_token, catalog=False)

# DB2 Warehouse
client.define_db2_connection(bearer_token, catalog=False)

# PostgreSQL
client.define_psql_connection(bearer_token, catalog=False)

# Import the governance artifacts from the downloaded archive
client.main_import_process("governance_artifacts.zip", process_id=None)


cpd_cluster_host: cpd-cpd.apps.6697f88d158155001ef8fc57.ocp.techzone.ibm.com
username: admin
password: kjnGi-tDBwr-tWGBA-wKnzU
catalog_name: DPH Demo Catalog
project_id: 46b3240e-8412-4040-af03-df912d4cc28b
base_url: https://cpd-cpd.apps.6697f88d158155001ef8fc57.ocp.techzone.ibm.com
db_username: cpdemo
db_password: C!oudP@k4DataDem0s
database: BLUDB
host: db2w-ovqfeqq.us-south.db2w.cloud.ibm.com
db_port: 50001
db2_name: Data Warehouse
db2_description: Database that contains warehouse data needed by the business for analytics and AI.
origin_country: us
db2_datasource_type: cfdcb449-1204-44ba-baa6-9a8a878e6aa7
psql_username: cpdemo
psql_password: C!oudP@k4DataDem0s
psql_database: 3RDPARTY
psql_host: 85331fa6-6b56-4355-935e-290f3ac8aa8c.8117147f814b4b2ea643610826cd2046.databases.appd
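For readers curious about what a bearer token request might look like, here is a rough sketch. It assumes the Cloud Pak for Data `/icp4d-api/v1/authorize` endpoint with a username and password body; whether `client.get_bearer_token()` works this way is not shown in this notebook, and the `build_token_request` and `auth_header` helpers are hypothetical. The request is only built, never sent.

```python
import json
import urllib.request


def build_token_request(host, username, password):
    """Build (but do not send) a token request for a Cloud Pak for Data cluster."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/icp4d-api/v1/authorize",  # assumed endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_header(token):
    """Format the Authorization header used on later API calls."""
    return {"Authorization": f"Bearer {token}"}


# Placeholder host and credentials, for illustration only.
request = build_token_request("cpd.example.com", "admin", "not-a-real-password")
print(request.get_method(), request.full_url)
```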

## Metadata Import

In [7]:

# Create and run the metadata import DB2 Warehouse
path_db2=["/EMPLOYEE/EMPLOYEE_HISTORY","/EMPLOYEE/EMPLOYEE_RECORDS","/EMPLOYEE/EMPLOYEE_SUMMARY","/EMPLOYEE/EMPLOYEE"]
db2_mdid, db2_mdi_response = client.create_and_run_metadata_import(client.db2_id, path_db2, name="2 DB2 Metadata Import")

# Create and run the metadata import COS
path_cos = ["/cpd-outcomes/Warehouse/WAREHOUSE_ASSIGNED_SHIFTS.csv","/cpd-outcomes/Warehouse/WAREHOUSE_SHIFTS.csv","/cpd-outcomes/Warehouse/WAREHOUSE_STAFF.csv","/cpd-outcomes/Warehouse/WAREHOUSE_STAFFING.csv"]
cos_mdid, cos_mdi_response = client.create_and_run_metadata_import(client.cos_id, path_cos, name="2 Cloud Object Storage Metadata Import")

# Create and run the metadata import PostgreSQL
path_psql = ["/CUSTOMER/CUSTOMER_LOYALTY"]
psql_mdid, psql_mdi_response = client.create_and_run_metadata_import(client.psql_id, path_psql, name="2 Postgresql Metadata Import")

MDI Creation Response: 201
Job Creation Response: 201
Patch MDI Response: 200
Run Job Response: 201

MDI Creation Response: 201
Job Creation Response: 201
Patch MDI Response: 200
Run Job Response: 201

MDI Creation Response: 201
Job Creation Response: 201
Patch MDI Response: 200
Run Job Response: 201
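Each import prints four HTTP status codes, and all of them should be in the 2xx range. As a small sketch of how such codes could be checked in bulk, the hypothetical `failed_steps` helper below takes a mapping of step name to status code; how those codes would be collected from the client is not covered here.

```python
def failed_steps(statuses):
    """Return the steps whose HTTP status is outside the 2xx success range."""
    return {step: code for step, code in statuses.items() if not 200 <= code < 300}


# Status codes copied from the output above.
statuses = {
    "MDI Creation": 201,
    "Job Creation": 201,
    "Patch MDI": 200,
    "Run Job": 201,
}
print(failed_steps(statuses))                      # {}

# Same steps with one made-up failure, to show what a problem looks like.
print(failed_steps({**statuses, "Run Job": 401}))  # {'Run Job': 401}
```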



## Metadata Enrichment

In [8]:
import time  # imported here too, so this cell runs even if the earlier import cell was skipped

db2_result = client.create_and_run_metadata_enrichment(
    name="Db2 Warehouse MDE",
    mdi_id=db2_mdid,
    job_name=client.db2_name + " Enrichment Job",
    publish_job_name=client.db2_name + " publish Job",
)

cos_result = client.create_and_run_metadata_enrichment(
    name="Cloud Object Storage Enrichment",
    mdi_id=cos_mdid,
    job_name=client.cos_name + " Enrichment Job",
    publish_job_name=client.cos_name + " publish Job",
)

psql_result = client.create_and_run_metadata_enrichment(
    name="Postgresql MDE",
    mdi_id=psql_mdid,
    job_name=client.psql_name + " Enrichment Job",
    publish_job_name=client.psql_name + " publish Job",
)

# Give the enrichment jobs time to finish before publishing
time.sleep(180)

Db2 Warehouse MDE: Metadata Enrichment Creation Response: 201
Cloud Object Storage Enrichment: Metadata Enrichment Creation Response: 201
Postgresql MDE: Metadata Enrichment Creation Response: 201


## Publish Assets
> You may need to wait about 2-3 minutes before running this cell to ensure that the Metadata Enrichment Job has completed.
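A fixed wait can be too short or needlessly long. One alternative is to poll until a condition holds or a timeout expires, as in the sketch below. The `wait_until` helper is hypothetical, and `fake_job_finished` stands in for a real job status lookup, which this notebook does not show.

```python
import time


def wait_until(check, timeout=300.0, interval=10.0):
    """Call check() until it returns True or the timeout elapses.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real status lookup: reports completion on the third call.
calls = {"count": 0}


def fake_job_finished():
    calls["count"] += 1
    return calls["count"] >= 3


print(wait_until(fake_job_finished, timeout=5.0, interval=0.01))  # True
```

With a real status check in place of `fake_job_finished`, the publish cell could run as soon as the enrichment jobs report completion.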

In [None]:
db2_mde_id = db2_result.get("metadata", {}).get("asset_id")
db2_publish_result = client.publish_metadata_enrichment_assets(db2_mde_id)

cos_mde_id = cos_result.get("metadata", {}).get("asset_id")
cos_publish_result = client.publish_metadata_enrichment_assets(cos_mde_id)

psql_mde_id = psql_result.get("metadata", {}).get("asset_id")
psql_publish_result = client.publish_metadata_enrichment_assets(psql_mde_id)
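The chained `.get(...)` calls above return `None` when an enrichment result has no `asset_id`, and that `None` is then passed on to the publish call. The sketch below shows one way to fail early with a clearer message. The `require_asset_id` helper is hypothetical, and the sample dictionaries are made up to match the shape read in the cell above.

```python
def require_asset_id(result, label):
    """Return result["metadata"]["asset_id"] or raise a descriptive error."""
    asset_id = (result or {}).get("metadata", {}).get("asset_id")
    if not asset_id:
        raise ValueError(f"{label}: no asset_id found in enrichment result")
    return asset_id


# Made-up results shaped like the dictionaries read in the cell above.
good = {"metadata": {"asset_id": "1234-abcd"}}
bad = {"errors": [{"message": "enrichment failed"}]}

print(require_asset_id(good, "Db2"))  # 1234-abcd
try:
    require_asset_id(bad, "PostgreSQL")
except ValueError as error:
    print(error)  # PostgreSQL: no asset_id found in enrichment result
```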


In [None]:
print("Successfully imported environment!")
print("You are ready to begin the lab!")