# CPDCTL Demo - Python script lifecycle

CPDCTL is a command-line interface (CLI) you can use to manage the lifecycle of assets such as notebooks and Python scripts in IBM Cloud Pak for Data. This notebook uses the CLI to automate promoting a Python script from a Watson Studio project to a deployment space, running it as a batch deployment job, and moving it between spaces and clusters.

This notebook begins by showing you how to install and configure CPDCTL and is then split up into three sections with examples of how to use the commands to:

- Promote a Python script from the project to the space
- Promote the Python script to the QA space
- Download the script from the QA space and upload it to the production space

## Table of Contents

[1. Install and Configure CPDCTL](#part1)
- [1.1 Installing the latest version of CPDCTL](#part1.1)
- [1.2 Adding CPD cluster configuration settings](#part1.2)

[2. Demo 1: Promote Python script from the project to the space](#part2)
- [2.1 Promote script asset to the space](#part2.1)
- [2.2 Run batch deployment job](#part2.2)

[3. Demo 2: Promote Python script to the QA space](#part3)
- [3.1 Export all assets from the source (DEV) space](#part3.1)
- [3.2 Create QA space and import assets there](#part3.2)
- [3.3 Run batch deployment job in the QA space](#part3.3)

[4. Demo 3: Download the script from the QA space and upload it to the production space](#part4)
- [4.1 Downloading a script](#part4.1)
- [4.2 Creating the new script in the production space](#part4.2)
- [4.3 Updating the existing script deployment](#part4.3)

[5. Cleanup](#part5)

## Before you begin
Import the following libraries:

In [1]:
import base64
import json
import os
import requests
import platform
import tarfile
import zipfile
from datetime import datetime
from IPython.display import display, HTML

## 1. Install and Configure CPDCTL <a class="anchor" id="part1"></a>

### 1.1 Installing the latest version of CPDCTL <a class="anchor" id="part1.1"></a>

To use the CLI commands in this notebook, you need to install CPDCTL. Download the binary from the [CPDCTL GitHub repository](https://github.com/IBM/cpdctl/releases).

Download the binary and then display the version number:

In [2]:
PLATFORM = platform.system().lower()
CPDCTL_ARCH = "{}_amd64".format(PLATFORM)
CPDCTL_RELEASES_URL="https://api.github.com/repos/IBM/cpdctl/releases"
CWD = os.getcwd()
PATH = os.environ['PATH']
CPD_CONFIG = os.path.join(CWD, '.cpdctl.config.yml')

response = requests.get(CPDCTL_RELEASES_URL)
assets = response.json()[0]['assets']
platform_asset = next(a for a in assets if CPDCTL_ARCH in a['name'])
cpdctl_url = platform_asset['url']
cpdctl_file_name = platform_asset['name']

response = requests.get(cpdctl_url, headers={'Accept': 'application/octet-stream'})
with open(cpdctl_file_name, 'wb') as f:
    f.write(response.content)
    
display(HTML('<code>cpdctl</code> binary downloaded from: <a href="{}">{}</a>'.format(platform_asset['browser_download_url'], platform_asset['name'])))
display(HTML("<style>div.output_area pre {white-space: pre;}</style>"))
display(HTML("<style>.container { width:90% !important; }</style>"))
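
The `next(...)` call in the cell above raises a bare `StopIteration` when no release asset matches the platform string, for example on a machine that is not `amd64`. A minimal sketch of a more explicit lookup follows; the function name `select_asset` and the sample asset names are assumptions made for illustration, not part of CPDCTL.

```python
def select_asset(assets, system, arch="amd64"):
    """Return the first asset whose name contains '<system>_<arch>'."""
    needle = "{}_{}".format(system.lower(), arch)
    for asset in assets:
        if needle in asset["name"]:
            return asset
    raise LookupError("no release asset matches {!r}".format(needle))


# Hypothetical asset names, used only to exercise the function.
sample_assets = [
    {"name": "cpdctl_darwin_amd64.tar.gz"},
    {"name": "cpdctl_linux_amd64.tar.gz"},
    {"name": "cpdctl_windows_amd64.zip"},
]
print(select_asset(sample_assets, "Linux")["name"])
```

In the notebook, `assets` would be the list taken from the GitHub releases response and `system` the value of `platform.system()`.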

In [3]:
%%capture

%env PATH={CWD}:{PATH}
%env CPD_CONFIG={CPD_CONFIG}

In [4]:
if cpdctl_file_name.endswith('tar.gz'):
    with tarfile.open(cpdctl_file_name, "r:gz") as tar:
        tar.extractall()
elif cpdctl_file_name.endswith('zip'):
    with zipfile.ZipFile(cpdctl_file_name, 'r') as zf:
        zf.extractall()

if CPD_CONFIG and os.path.exists(CPD_CONFIG):
    os.remove(CPD_CONFIG)
    
version_r = ! cpdctl version
CPDCTL_VERSION = version_r.s

print("cpdctl version: {}".format(CPDCTL_VERSION))

cpdctl version: 1.0.46
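
The cells in this notebook call the CLI with IPython's `!` syntax, which only works inside a notebook or IPython session. If you want to run the same commands from a plain Python script, one possible approach is a small wrapper around `subprocess.run`. The helper name `run_cli` is an assumption, and the sketch runs the Python interpreter as a stand-in command so that it works without CPDCTL installed.

```python
import subprocess
import sys


def run_cli(args):
    """Run a command given as a list of arguments and return its stripped stdout."""
    # check=True raises CalledProcessError when the command exits with a non-zero status.
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# Stand-in command so the sketch runs anywhere; in practice the list
# would be something like ["cpdctl", "version"].
print(run_cli([sys.executable, "-c", "print('ok')"]))
```

Passing the arguments as a list avoids shell quoting issues, because no shell is involved.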


### 1.2  Adding CPD cluster configuration settings <a class="anchor" id="part1.2"></a>

Before you can use CPDCTL, you need to add configuration settings. You only need to configure these settings once for the same CPD user and cluster. Begin by entering your IBM Cloud Pak for Data (CPD) credentials and the URL of the CPD cluster.<br>**Note**: When this notebook runs inside an IBM Cloud Pak for Data cluster, cpdctl uses [zero-configuration mode](https://github.com/IBM/cpdctl#zero-configuration), which means it can connect to the cluster without explicit configuration. In that case, you can skip the cells below that set the credential and URL variables, as well as the cells that run `cpdctl config ...` commands.

In [None]:
# Replace the placeholder values before running this cell.
CPD_USER_NAME = 'YOUR CPD user name'
CPD_USER_PASSWORD = 'YOUR CPD user password'
CPD_URL = 'YOUR CPD CLUSTER URL'

CPD_PROD_USER_NAME = 'YOUR CPD production user name'
CPD_PROD_USER_PASSWORD = 'YOUR CPD production user password'
CPD_PROD_URL = 'YOUR CPD PRODUCTION CLUSTER URL'
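
If you prefer not to store credentials in the notebook itself, you could read them from environment variables instead. The following is a sketch under that assumption; the helper name `load_credentials` and the variable-naming scheme (`<prefix>_USER_NAME`, `<prefix>_USER_PASSWORD`, `<prefix>_URL`) are illustrative choices that mirror the variable names used in this notebook.

```python
import os


def load_credentials(prefix, env=None):
    """Read <prefix>_USER_NAME, <prefix>_USER_PASSWORD and <prefix>_URL from env."""
    env = os.environ if env is None else env
    keys = ("USER_NAME", "USER_PASSWORD", "URL")
    missing = [k for k in keys if not env.get("{}_{}".format(prefix, k))]
    if missing:
        raise KeyError("missing settings for {}: {}".format(prefix, ", ".join(missing)))
    return {k: env["{}_{}".format(prefix, k)] for k in keys}


# Hypothetical values passed as a plain dict so the sketch runs without real secrets.
demo_env = {
    "CPD_USER_NAME": "demo-user",
    "CPD_USER_PASSWORD": "demo-password",
    "CPD_URL": "https://cpd.example.com",
}
creds = load_credentials("CPD", demo_env)
print(creds["URL"])
```

Calling `load_credentials("CPD_PROD")` without the second argument would read the production settings from the real process environment.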

Add "cpd_user" user to the cpdctl configuration:

In [7]:
! cpdctl config user set cpd_user --username {CPD_USER_NAME} --password {CPD_USER_PASSWORD}

Add "cpd" cluster to the cpdctl configuration:

In [8]:
! cpdctl config profile set cpd --url {CPD_URL} --user cpd_user

Do the same for the PROD cluster:

In [10]:
! cpdctl config user set cpd_prod_user --username {CPD_PROD_USER_NAME} --password {CPD_PROD_USER_PASSWORD}
! cpdctl config profile set cpd_prod --url {CPD_PROD_URL} --user cpd_prod_user

List available profiles:

In [1]:
! cpdctl config profile list

Name       Type      User            URL                                              Current
cpd        private   cpd_user        https://cpd-zen.apps.wp463case.cp.fyre.ibm.com   *
cpd_prod   private   cpd_prod_user   https://cpd-zen.apps.wp463case.cp.fyre.ibm.com


Switch the current profile:

In [2]:
! cpdctl config profile use cpd

Switched to profile "cpd".


List available projects in profile:

In [14]:
! cpdctl project list

...
ID                                     Name                               Created                    Description   Tags
0cabb425-56d9-48fc-a178-f30d22737778   git-demo-project                   2021-02-19T13:47:11.811Z                 []
45f48375-4cdf-4354-8f51-3dbe933dc0aa   Clustering Demo (git integrated)   2021-05-20T06:31:48.099Z                 []
7fb76cf7-25be-435d-818e-bd6e9b5254f5   cpdctl-demo                        2021-01-29T08:01:23.363Z                 []


Choose a project in which you will work:

In [15]:
result = ! cpdctl project list --output json -j "(resources[].metadata.guid)[0]" --raw-output
project_id = result.s
print("project id: {}".format(project_id))

# You can also specify your project id directly:
# project_id = "Your project ID"

project id: 0cabb425-56d9-48fc-a178-f30d22737778
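
The pattern `result.s` captures whatever the command printed, so a query that matches nothing can leave you with an unusable value that only fails several cells later. As a sketch, and assuming that an unmatched query yields an empty string or a literal such as `null` (this depends on the CLI version), you could validate the captured ID before using it. The helper name `require_id` is hypothetical.

```python
import uuid


def require_id(value, what):
    """Return the stripped value if it parses as a UUID; otherwise raise a descriptive error."""
    text = (value or "").strip()
    try:
        uuid.UUID(text)
    except ValueError:
        raise ValueError("expected a {} ID, got {!r}".format(what, value))
    return text


print(require_id("0cabb425-56d9-48fc-a178-f30d22737778", "project"))
```

In the notebook, this would be used as `project_id = require_id(result.s, "project")`.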


## 2. Demo 1: Promote Python script from the project to the space <a class="anchor" id="part2"></a>

Before starting with this section, please ensure that you have run the cells in [Section 1](#part1) and specified the ID of the project in which you will work.

Suppose you have a Python script created with JupyterLab in Watson Studio and you would like to run the code on a CPD cluster. This section shows how to promote a script asset from a project to a space and run it as a job on a CPD cluster.



### 2.1 Promote script asset to the space<a class="anchor" id="part2.1"></a>

List all the script assets in your project, filter them by their display name and get the ID of the script:

In [16]:
! cpdctl asset search --project-id {project_id} --type-name script --query "*:*"

...
ID                                     Name               Created                    Description   Type     State       Tags   Size
14766a5b-b842-4f19-b2d0-690b64e46d25   batch_job_script   2021-02-19T22:08:45.000Z                 script   available   []     4183


In [17]:
script_name = "batch_job_script"
query = "asset.name:{}".format(script_name)
jmes_query = "results[0].metadata.asset_id"

In [18]:
! cpdctl asset search --project-id {project_id} --query {query} --type-name script --output json | jq

{
  "results": [
    {
      "href": "/v2/assets/14766a5b-b842-4f19-b2d0-690b64e46d25?project_id=0cabb425-56d9-48fc-a178-f30d22737778",
      "metadata": {
        "asset_attributes": [
          "script"
        ],
        "asset_category": "USER",
        "asset_id": "14766a5b-b842-4f19-b2d0-690b64e46d25",
        "asset_state": "available",
        "asset_type": "script",
        "created_at": "2021-02-19T22:08:45.000Z",
        "description": "",
        "name": "batch_job_script",

In [19]:
result = ! cpdctl asset search --project-id {project_id} --query {query} --type-name script --output json --jmes-query "{jmes_query}" --raw-output
script_id = result.s
print("script id: {}".format(script_id))

script id: 14766a5b-b842-4f19-b2d0-690b64e46d25


List all spaces

In [20]:
! cpdctl space list

...
ID                                     Name                            Created                    State    Tags
d9bfa660-1be7-46ab-aa53-e9010a634bba   cpdctl-demo-space               2021-01-29T08:56:07.389Z   active   []
6c205951-1c61-49b9-b46d-e5e199492775   cpdctl-prod-space-for-scripts   2021-02-22T07:45:34.787Z   active   []
51824744-7b11-4a6b-aff0-6ee16de709d0   cpdctl-demo-new-qa-space        2021-02-22T13:31:23.319Z   active   []
77ece893-36c2-43cd-b1ce-b155decbc05c   cpdctl-qa-space-for-scripts     2021-04-30T12:02:24.332Z   active   []


Select the 'cpdctl-demo-space' space

In [21]:
dev_space_name = 'cpdctl-demo-space'
jmes_query = "resources[?entity.name == '{}'] | [0].metadata.id".format(dev_space_name)
result = ! cpdctl space list --output json --jmes-query "{jmes_query}" --raw-output
space_id = result.s
print('Space ID: {}'.format(space_id))

Space ID: d9bfa660-1be7-46ab-aa53-e9010a634bba
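
The JMESPath expression `resources[?entity.name == '...'] | [0].metadata.id` filters the listing by name and takes the ID of the first match. If you would rather capture the full JSON output and select in Python, the same logic can be sketched as below. The function name `find_id_by_name` is hypothetical, and the sample listing is a trimmed-down structure built from the space names and IDs shown above.

```python
def find_id_by_name(listing, name):
    """Return metadata.id of the first resource whose entity.name equals name, else None."""
    for resource in listing.get("resources", []):
        if resource.get("entity", {}).get("name") == name:
            return resource.get("metadata", {}).get("id")
    return None


# Trimmed-down listing that keeps only the fields the query touches.
listing = {
    "resources": [
        {"entity": {"name": "cpdctl-demo-space"},
         "metadata": {"id": "d9bfa660-1be7-46ab-aa53-e9010a634bba"}},
        {"entity": {"name": "cpdctl-prod-space-for-scripts"},
         "metadata": {"id": "6c205951-1c61-49b9-b46d-e5e199492775"}},
    ]
}
print(find_id_by_name(listing, "cpdctl-demo-space"))
```

Returning `None` for a missing name makes the "no such space" case explicit, so the caller can decide whether to create the space or stop.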


Select script asset for promotion and provide expected name and metadata:

In [22]:
import json

promote = {
    "mode": 0,
    "space_id": space_id,
    "metadata": {
        "name": "batch_job_script.py",
        # Tag the promoted asset with the ID of the source script.
        "tags": ["cpdctl-demo", "promoted-asset-{}".format(script_id)]
    }
}
promote_json = json.dumps(promote)

! cpdctl asset promote --project-id {project_id} --asset-id {script_id} --request-body '{promote_json}'

...
OK


List assets in the space

In [23]:
! cpdctl asset search --space-id {space_id} --type-name script --query "*:*"

...
ID                                     Name                  Created                    Description   Type     State       Tags                                                 Size
b7b753ab-e484-4bf1-9a8c-80c729fb5f64   batch_job_script.py   2021-05-20T13:19:38.000Z                 script   available   [cpdctl-demo promoted-asset-14766a5b-b842-4f19-b2…   4183


Select the promoted script

In [24]:
query = 'asset.name:batch_job_script.py'
jmes_query = "results[0].metadata.asset_id"
result = ! cpdctl asset search --space-id {space_id} --query {query} --type-name script --output json --jmes-query "{jmes_query}" --raw-output
promoted_script_id = result.s
print("promoted script id: {}".format(promoted_script_id))

promoted script id: b7b753ab-e484-4bf1-9a8c-80c729fb5f64


Look up the ID of the software specification by name:

In [25]:
software_specification_name = "default_py3.7"
jmes_query = "resources[0].metadata.asset_id"
result = ! cpdctl environment software-specification list --space-id {space_id} --name '{software_specification_name}' --output json --jmes-query '{jmes_query}' --raw-output
software_specification_id = result.s
print("software specification id: {}".format(software_specification_id))

software specification id: e4429883-c883-42b6-87a8-f419d64088cd


Set the Python script's software specification:

In [32]:
software_spec = {
    "base_id": "{}".format(software_specification_id),
    "name": software_specification_name
}

patch = [{
    "op": "add",
    "path": "/software_spec",
    "value": software_spec
}]
patch_json = json.dumps(patch)

! cpdctl asset attribute update --space-id {space_id} --asset-id {promoted_script_id} --attribute-key script  --json-patch '{patch_json}'

...
OK


### 2.2 Run batch deployment job <a class="anchor" id="part2.2"></a>

Create batch deployment:

In [33]:
asset = {
    'id': promoted_script_id
}
asset_json = json.dumps(asset)

hardware_spec = {
    'name': 'S'
}
hardware_spec_json = json.dumps(hardware_spec)

batch_json = '{}'

deployment_name = 'script_batch_deployment'

In [34]:
result = ! cpdctl ml deployment create --space-id {space_id} --name '{deployment_name}' --asset '{asset_json}' --hardware-spec '{hardware_spec_json}' --batch '{batch_json}' --output json -j "metadata.id" --raw-output
deployment_id = result.s
print("deployment id: {}".format(deployment_id))

deployment id: 7f7a3a18-842a-422b-b60e-f742229f8853
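
The commands in this notebook pass JSON to the CLI by wrapping it in single quotes, for example `--asset '{asset_json}'`. That works as long as the JSON contains no single quote. A minimal sketch of a more defensive approach uses `shlex.quote` from the standard library. The `note` field and the abbreviated command string are made up for illustration, and the command is only built here, not executed.

```python
import json
import shlex

# A hypothetical payload whose value contains a single quote.
asset_json = json.dumps({"id": "example-asset-id", "note": "it's quoted"})

# shlex.quote escapes embedded single quotes, which plain '...' wrapping cannot do.
quoted = shlex.quote(asset_json)
command = "cpdctl ml deployment create --asset {}".format(quoted)

# Splitting the command the way a POSIX shell would recovers the original JSON.
recovered = shlex.split(command)[-1]
print(json.loads(recovered))
```

In a notebook cell, the quoted value would then be passed without extra quotes around the placeholder, for example `--asset {quoted}`.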


Create a deployment job

In [35]:
deployment_job_name = 'script_batch_deployment_job'

deployment = {
    'id': deployment_id
}
deployment_json = json.dumps(deployment)

scoring = {
    "input_data_references": [
      {
        "type": "data_asset",
        "connection": {},
        "location": {
          "href": "/v2/assets/783d8fc5-ae2c-47d0-a311-f8890dfa1ce0?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
        }
      }
    ],
    "output_data_reference": {
      "type": "data_asset",
      "connection": {},
      "location": {
        "href": "/v2/assets/c7de9cb4-0ec5-41fa-8b94-3e89eb2cb795?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
      }
    }
}
scoring_json = json.dumps(scoring)
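
The `href` values above hard-code both the asset IDs and the space ID, so the cell has to be edited by hand for every space. A sketch of a small builder that assembles the same payload from variables follows; the function names `asset_href` and `build_scoring` are hypothetical, while the payload layout is the one used in the cell above.

```python
import json


def asset_href(asset_id, space_id):
    """Build the asset reference path used in the scoring payload."""
    return "/v2/assets/{}?space_id={}".format(asset_id, space_id)


def build_scoring(input_asset_id, output_asset_id, space_id):
    """Return a scoring payload with one input and one output data asset."""
    def reference(asset_id):
        return {
            "type": "data_asset",
            "connection": {},
            "location": {"href": asset_href(asset_id, space_id)},
        }

    return {
        "input_data_references": [reference(input_asset_id)],
        "output_data_reference": reference(output_asset_id),
    }


# IDs taken from the cell above.
example_scoring = build_scoring(
    "783d8fc5-ae2c-47d0-a311-f8890dfa1ce0",
    "c7de9cb4-0ec5-41fa-8b94-3e89eb2cb795",
    "d9bfa660-1be7-46ab-aa53-e9010a634bba",
)
print(json.dumps(example_scoring, indent=2))
```

The same builder can be reused for the QA space by passing the QA asset IDs and the QA space ID.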

In [36]:
result = ! cpdctl ml deployment-job create --space-id {space_id} --name '{deployment_job_name}' --deployment '{deployment_json}' --scoring '{scoring_json}' --output json
deployment_job = json.loads(result.s)
print(json.dumps(deployment_job, indent=2))
job_id = deployment_job['entity']['platform_job']['job_id']
run_id = deployment_job['entity']['platform_job']['run_id']

{
  "entity": {
    "deployment": {
      "id": "7f7a3a18-842a-422b-b60e-f742229f8853"
    },
    "platform_job": {
      "job_id": "5b17b549-43a1-469f-9239-cf8fdfc83340",
      "run_id": "ddc60736-669d-4c73-af61-16037d835f8a"
    },
    "scoring": {
      "input_data_references": [
        {
          "connection": {},
          "location": {
            "href": "/v2/assets/783d8fc5-ae2c-47d0-a311-f8890dfa1ce0?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
          },
          "type": "data_asset"
        }
      ],
      "output_data_reference": {
        "connection": {},
        "location": {
          "href": "/v2/assets/c7de9cb4-0ec5-41fa-8b94-3e89eb2cb795?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
        },
        "type": "data_asset"
      },
      "status": {
        "state": "queued"
      }
    }
  },
  "metadata": {
    "created_at": "2021-05-20T14:59:02.893Z",
    "id": "4099ad31-fc54-4c6a-a6f5-63de53450959",
    "name": "script_batch_deployment_job",
    "space_id

Wait for job completion

In [37]:
! cpdctl job run wait --job-id {job_id} --run-id {run_id} --space-id {space_id}

...
ID:            ddc60736-669d-4c73-af61-16037d835f8a
Name:          job run
Created:       2021-05-20T14:59:02Z
Description:
State:         Completed
Tags:          []


You can see the batch deployment log:

In [38]:
! cpdctl job run logs --job-id {job_id} --run-id {run_id} --space-id {space_id}

...
{
  "deployment": {
    "id": "7f7a3a18-842a-422b-b60e-f742229f8853"
  },
  "platform_job": {
    "job_id": "5b17b549-43a1-469f-9239-cf8fdfc83340",
    "run_id": "ddc60736-669d-4c73-af61-16037d835f8a"
  },
  "scoring": {
    "input_data_references": [
      {
        "connection": {},
        "location": {
          "href": "/v2/assets/783d8fc5-ae2c-47d0-a311-f8890dfa1ce0?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
        },
        "type": "data_asset"
      }
    ],
    "output_data_reference": {
      "connection": {},
      "location": {
        "href": "/v2/assets/c7de9cb4-0ec5-41fa-8b94-3e89eb2cb795?space_id=d9bfa660-1be7-46ab-aa53-e9010a634bba"
      },
      "type": "data_asset"
    },
    "status": {
      "completed_at": "2021-05-20T14:59:41.414608Z",
      "message": {
        "text": "The directory pointed by the environment variable BATCH_OUTPUT_DIR is empty, skipping content upload to data asset",
      },
      "running_at": "2021-05-20T14:59:15.267871Z",
      "

## 3. Demo 2: Promote Python script to the QA space <a class="anchor" id="part3"></a>

Before starting with this section, please ensure that you have run the cells in [Section 1](#part1) and [Section 2](#part2).

Suppose the script you promoted to the DEV space in Section 2 is ready for testing and you would like to run it in a separate QA space. This section shows you how to export all assets from the DEV space, import them into a new QA space, and run the batch deployment job there.



### 3.1 Export all assets from the source (DEV) space<a class="anchor" id="part3.1"></a>

List all assets in the source (DEV) space

In [39]:
! cpdctl asset search --space-id {space_id} --query '*:*' --type-name asset

...
ID                                     Name                                                 Created                    Description   Type         State       Tags                                                 Size
783d8fc5-ae2c-47d0-a311-f8890dfa1ce0   car_rental_training_data.csv                         2021-01-29T08:57:12.000Z                 data_asset   available   [cpdctl-demo promoted-7fb76cf7]]                     79518
c7de9cb4-0ec5-41fa-8b94-3e89eb2cb795   bank-marketing-batch-output.csv                      2021-02-19T16:54:38.000Z                 data_asset   available   [connected-data]                                     0
d8b147ce-d7e2-4155-8fd6-97f49953fa5e   job run                                              2021-02-21T21:19:26.000Z                 job_run      available   []                                                   0
8af92e6d-8cc4-42ad-81

Export all assets from the source (DEV) space

In [40]:
EXPORT = {
    'all_assets': True
}
EXPORT_JSON = json.dumps(EXPORT)
result = ! cpdctl asset export start --space-id {space_id} --assets '{EXPORT_JSON}' --name dev-space-all-assets --output json --jmes-query "metadata.id" --raw-output
export_id = result.s
print("The new export with ID: {}".format(export_id))

The new export with ID: ff08c50c-5b94-472c-a789-2853f8c517fc
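
An export is started as a separate operation, so the archive may not be ready the moment the start command returns. If you script this flow, you may want to poll the export state before downloading. The sketch below shows a generic polling loop; `wait_for_state` and the `get_state` callable are hypothetical, `completed` is the state reported in this notebook, and the other state names (`pending`, `running`, `failed`) are assumptions.

```python
import time


def wait_for_state(get_state, done=("completed",), failed=("failed",),
                   timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll get_state() until it returns a state in done; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in done:
            return state
        if state in failed:
            raise RuntimeError("operation ended in state {!r}".format(state))
        if time.monotonic() >= deadline:
            raise TimeoutError("still {!r} after {} seconds".format(state, timeout))
        sleep(interval)


# Simulated sequence of states standing in for repeated status lookups.
states = iter(["pending", "running", "completed"])
final_state = wait_for_state(lambda: next(states), interval=0)
print(final_state)
```

In practice `get_state` would wrap a status lookup such as the `cpdctl asset export get` call in the next cell and return the reported state.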


In [41]:
! cpdctl asset export get --space-id {space_id} --export-id {export_id}

...
ID:        ff08c50c-5b94-472c-a789-2853f8c517fc
Name:      dev-space-all-assets
Created:   2021-05-20T15:04:09.140Z
State:     completed


In [42]:
dev_space_archive_path = './dev-space-assets.zip'
! cpdctl asset export download --space-id {space_id} --export-id {export_id} --output-file {dev_space_archive_path}

...
OK
Output written to ./dev-space-assets.zip


In [43]:
! ls -al {dev_space_archive_path}

-rw-r--r--@ 1 rafalbigaj  staff  58590 May 20 17:04 ./dev-space-assets.zip
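
`ls` only confirms that the archive exists. To see what the export actually contains before importing it elsewhere, you could list the archive members with the standard `zipfile` module. This is a sketch: the helper name `list_archive` is hypothetical, and it runs against a tiny stand-in archive with a made-up member name rather than the real export.

```python
import os
import tempfile
import zipfile


def list_archive(path):
    """Return (name, size_in_bytes) for every member of a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Build a tiny stand-in archive so the sketch runs without the real export.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo-assets.zip")
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("assets/example.json", "{}")
    members = list_archive(demo_path)

print(members)
```

In the notebook you would call `list_archive(dev_space_archive_path)` instead.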


### 3.2 Create QA space and import assets there<a class="anchor" id="part3.2"></a>

Create a new QA space

In [44]:
qa_space_name = 'cpdctl-qa-space-for-scripts'
result = ! cpdctl space create --name '{qa_space_name}' --output json --jmes-query "metadata.id" --raw-output
qa_space_id = result.s
print("The new '{}' space ID is: {}".format(qa_space_name, qa_space_id))

The new 'cpdctl-qa-space-for-scripts' space ID is: 76ac8fbb-af3a-456a-ad4d-6dfd37ec54bd


Import assets from the exported archive into QA space

In [45]:
result = ! cpdctl asset import start --space-id {qa_space_id} --import-file {dev_space_archive_path} --output json --jmes-query "metadata.id" --raw-output
qa_import_id = result.s
print("The new import ID is: {}".format(qa_import_id))

The new import ID is: 883e42aa-c9f1-4b3c-9a3d-707ed0a65484


In [46]:
! cpdctl asset import get --space-id {qa_space_id} --import-id {qa_import_id}

...
ID:        883e42aa-c9f1-4b3c-9a3d-707ed0a65484
Created:   2021-05-20T15:05:46.348Z
State:     completed


List all assets in the QA space

In [47]:
! cpdctl asset search --space-id {qa_space_id} --query '*:*' --type-name asset

...
ID                                     Name                                   Created                    Description   Type         State       Tags                               Size
4366cc47-d32d-4c9c-946e-a871f746d196   batch-job-outputs-connection           2021-05-20T15:05:49.000Z                 connection   available   []                                 0
6f188ad9-94fd-4a35-8aa5-532d258b47f2   bank-marketing-batch-output.csv        2021-05-20T15:05:49.000Z                 data_asset   available   [connected-data]                   0
6eff8009-81c5-4399-a8fc-f0a1bc3f5544   car_rental_training_data.csv           2021-05-20T15:05:50.000Z                 data_asset   available   [cpdctl-demo promoted-7fb76cf7]]   79518
fdaa3ecc-aed3-40b6-bcdb-fbbc73169dfe   boston-house-prices-prediction-model   2021-05-20T15:05:50.000Z                 wml_model    available   [

In [66]:
result = ! cpdctl asset search --space-id {qa_space_id} --query 'asset.name:car_rental*' --type-name data_asset --raw-output -j 'results[0].metadata.asset_id' --output json
qa_input_asset_id = result.s
print('Input asset ID: {}'.format(qa_input_asset_id))
result = ! cpdctl asset search --space-id {qa_space_id} --query 'asset.name:bank-marketing-batch-output*' --type-name data_asset --raw-output -j 'results[0].metadata.asset_id' --output json
qa_output_asset_id = result.s
print('Output asset ID: {}'.format(qa_output_asset_id))

Input asset ID: 6eff8009-81c5-4399-a8fc-f0a1bc3f5544
Output asset ID: 6f188ad9-94fd-4a35-8aa5-532d258b47f2
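
Imported assets receive new IDs in the target space, which is why the input and output assets are looked up again by name. When several assets are needed, one search that returns everything can be turned into a name-to-ID map instead of running one search per asset. The sketch below assumes the search result layout shown earlier in this notebook (`results[].metadata.name` and `results[].metadata.asset_id`); the function name `asset_ids_by_name` is hypothetical.

```python
def asset_ids_by_name(search_result):
    """Map each asset name in a search result to its asset ID."""
    return {
        item["metadata"]["name"]: item["metadata"]["asset_id"]
        for item in search_result.get("results", [])
    }


# Trimmed-down search result built from the QA space listing above.
qa_search = {
    "results": [
        {"metadata": {"name": "car_rental_training_data.csv",
                      "asset_id": "6eff8009-81c5-4399-a8fc-f0a1bc3f5544"}},
        {"metadata": {"name": "bank-marketing-batch-output.csv",
                      "asset_id": "6f188ad9-94fd-4a35-8aa5-532d258b47f2"}},
    ]
}
ids = asset_ids_by_name(qa_search)
print(ids["car_rental_training_data.csv"])
```

If two assets share a name, the later one wins in this sketch, so unique names are assumed.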


### 3.3 Run batch deployment job in the QA space<a class="anchor" id="part3.3"></a>

Search for the imported script asset in the QA space

In [48]:
asset_type = 'script'
query = 'asset.name:{}'.format('batch_job_script.py')
jmes_query = 'results[0].metadata.asset_id'
result = ! cpdctl asset search --space-id {qa_space_id} --query '{query}' --type-name {asset_type} --output json --jmes-query "{jmes_query}" --raw-output
qa_script_id = result.s
print("ID of the script in QA space: {}".format(qa_script_id))


ID of the script in QA space: 1d192318-7b85-41dd-9967-1b119b066113


Create the script batch deployment in the QA space:

In [49]:
asset = {
    'id': qa_script_id
}
asset_json = json.dumps(asset)

hardware_spec = {
    'name': 'S'
}
hardware_spec_json = json.dumps(hardware_spec)

batch_json = '{}'

deployment_name = 'script_batch_deployment'

In [50]:
result = ! cpdctl ml deployment create --space-id {qa_space_id} --name '{deployment_name}' --asset '{asset_json}' --hardware-spec '{hardware_spec_json}' --batch '{batch_json}' --output json -j "metadata.id" --raw-output
qa_deployment_id = result.s
print("ID of the deployment in QA space: {}".format(qa_deployment_id))

ID of the deployment in QA space: 293dea38-cae3-4305-a735-8ebdd2a8a097


Create a deployment job

In [67]:
deployment_job_name = 'script_batch_deployment_job'

deployment = {
    'id': qa_deployment_id
}
deployment_json = json.dumps(deployment)

scoring = {
    "input_data_references": [
      {
        "type": "data_asset",
        "connection": {},
        "location": {
          "href": "/v2/assets/{}?space_id={}".format(qa_input_asset_id, qa_space_id)
        }
      }
    ],
    "output_data_reference": {
      "type": "data_asset",
      "connection": {},
      "location": {
        "href": "/v2/assets/{}?space_id={}".format(qa_output_asset_id, qa_space_id)
      }
    }
}
scoring_json = json.dumps(scoring)

In [68]:
result = ! cpdctl ml deployment-job create --space-id {qa_space_id} --name '{deployment_job_name}' --deployment '{deployment_json}' --scoring '{scoring_json}' --output json
qa_deployment_job = json.loads(result.s)
print(json.dumps(qa_deployment_job, indent=2))
qa_job_id = qa_deployment_job['entity']['platform_job']['job_id']
qa_run_id = qa_deployment_job['entity']['platform_job']['run_id']

{
  "entity": {
    "deployment": {
      "id": "293dea38-cae3-4305-a735-8ebdd2a8a097"
    },
    "platform_job": {
      "job_id": "4cb3886e-5864-4e53-8047-92b70bfe10b8",
      "run_id": "8c756251-5d0c-44ef-9a71-6c6d4480b254"
    },
    "scoring": {
      "input_data_references": [
        {
          "connection": {},
          "location": {
            "href": "/v2/assets/6eff8009-81c5-4399-a8fc-f0a1bc3f5544?space_id=76ac8fbb-af3a-456a-ad4d-6dfd37ec54bd"
          },
          "type": "data_asset"
        }
      ],
      "output_data_reference": {
        "connection": {},
        "location": {
          "href": "/v2/assets/6f188ad9-94fd-4a35-8aa5-532d258b47f2?space_id=76ac8fbb-af3a-456a-ad4d-6dfd37ec54bd"
        },
        "type": "data_asset"
      },
      "status": {
        "state": "queued"
      }
    }
  },
  "metadata": {
    "created_at": "2021-05-21T13:41:52.431Z",
    "id": "aadec7c4-7f79-4072-84d2-0d6b2f392654",
    "name": "script_batch_deployment_job",
    "space_id

Wait for job completion

In [69]:
! cpdctl job run wait --job-id {qa_job_id} --run-id {qa_run_id} --space-id {qa_space_id}

...
ID:            8c756251-5d0c-44ef-9a71-6c6d4480b254
Name:          job run
Created:       2021-05-21T13:41:52Z
Description:
State:         Completed
Tags:          []


You can see the batch deployment log:

In [70]:
! cpdctl job run logs --job-id {qa_job_id} --run-id {qa_run_id} --space-id {qa_space_id}

...
{
  "deployment": {
    "id": "293dea38-cae3-4305-a735-8ebdd2a8a097"
  },
  "platform_job": {
    "job_id": "4cb3886e-5864-4e53-8047-92b70bfe10b8",
    "run_id": "8c756251-5d0c-44ef-9a71-6c6d4480b254"
  },
  "scoring": {
    "input_data_references": [
      {
        "connection": {},
        "location": {
          "href": "/v2/assets/6eff8009-81c5-4399-a8fc-f0a1bc3f5544?space_id=76ac8fbb-af3a-456a-ad4d-6dfd37ec54bd"
        },
        "type": "data_asset"
      }
    ],
    "output_data_reference": {
      "connection": {},
      "location": {
        "href": "/v2/assets/6f188ad9-94fd-4a35-8aa5-532d258b47f2?space_id=76ac8fbb-af3a-456a-ad4d-6dfd37ec54bd"
      },
      "type": "data_asset"
    },
    "status": {
      "completed_at": "2021-05-21T13:42:27.233464Z",
      "message": {
        "text": "The directory pointed by the environment variable BATCH_OUTPUT_DIR is empty, skipping content upload to data asset",
      },
      "running_at": "2021-05-21T13:42:04.595265Z",
      "

## 4. Demo 3: Download the script from the QA space and upload it to the production space <a class="anchor" id="part4"></a>

Before starting with this section, please ensure that you have run the cells in all previous sections: [Section 1](#part1), [Section 2](#part2) and [Section 3](#part3).



### 4.1 Downloading a script <a class="anchor" id="part4.1"></a>

Download the script from the QA space

In [71]:
jmes_query = 'attachments[0].object_key'
result = ! cpdctl asset get --space-id {qa_space_id} --asset-id {qa_script_id} --output json --jmes-query '{jmes_query}' --raw-output
qa_script_path = result.s
print('Path to the QA script: {}'.format(qa_script_path))

Path to the QA script: script/batch_job_script_9wdnvaqcqxsv1k8sjf9i279x0.py


In [72]:
local_script_path = "qa_batch_job_script.py"
! cpdctl asset file download --space-id {qa_space_id} --path {qa_script_path} --output-file {local_script_path}

...
OK
Output written to qa_batch_job_script.py


In [73]:
! head {local_script_path}

import os
import ibm_boto3
import json
import pandas as pd
import requests
from botocore.client import Config
from ibm_watson_machine_learning import APIClient


### Function to get asset details using REST API. This won't be needed once python client adds attachment details in asset meta ###


Replace the DEV cluster URL in the downloaded script with the production cluster URL:

In [74]:
! sed -i.bak 's#{CPD_URL}#{CPD_PROD_URL}#' {local_script_path}
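
The `sed` call treats the cluster URL as a regular expression (so each `.` matches any character), replaces only the first match on each line, and depends on a `sed` that accepts `-i.bak`. A portable alternative can be sketched in plain Python. The helper name `replace_in_file` and the example URLs are assumptions; unlike the `sed` command, this version replaces every literal occurrence.

```python
import shutil
import tempfile
from pathlib import Path


def replace_in_file(path, old, new, backup_suffix=".bak"):
    """Replace every occurrence of old with new in a text file, keeping a backup copy."""
    path = Path(path)
    text = path.read_text()
    count = text.count(old)
    # Keep the original next to the edited file, as sed -i.bak does.
    shutil.copy2(path, path.with_name(path.name + backup_suffix))
    path.write_text(text.replace(old, new))
    return count


# Stand-in script file with hypothetical URLs.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "qa_batch_job_script.py"
    script.write_text('URL = "https://dev.example.com"\n')
    replaced = replace_in_file(script, "https://dev.example.com", "https://prod.example.com")
    updated = script.read_text()
    backup = (Path(tmp) / "qa_batch_job_script.py.bak").read_text()

print(replaced, updated.strip())
```

The returned count makes it easy to notice when the URL was not found in the script at all.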

### 4.2 Creating the new script in the production space<a class="anchor" id="part4.2"></a>

Switch the CLI profile to `cpd_prod`, the production cluster:

In [3]:
! cpdctl config profile use cpd_prod

Switched to profile "cpd_prod".


Look up the ID of the production space, then upload the script file to it:

In [76]:
prod_space_name = 'cpdctl-prod-space-for-scripts'
jmes_query = "resources[?entity.name == '{}'] | [0].metadata.id".format(prod_space_name)
result = ! cpdctl space list --output json --jmes-query "{jmes_query}" --raw-output
prod_space_id = result.s
print('Production space ID: {}'.format(prod_space_id))

Production space ID: f67d7982-7dd0-4ed3-9161-503145f4e0ae


In [77]:
remote_script_path = "script/batch_job_script.py"
! cpdctl asset file upload --path {remote_script_path} --file {local_script_path} --space-id {prod_space_id}

...
OK


In [78]:
jmes_query = "resources[0].metadata.asset_id"
result = ! cpdctl environment software-specification list --space-id {prod_space_id} --name '{software_specification_name}' --output json --jmes-query '{jmes_query}' --raw-output --profile cpd_prod
prod_software_specification_id = result.s
print("software specification id: {}".format(prod_software_specification_id))

software specification id: e4429883-c883-42b6-87a8-f419d64088cd


Specify the metadata, entity and attachments of the script file in the production space:

In [79]:
script_ts = datetime.now().strftime('%Y%m%d-%H%M%S')

metadata = {
    "name": "batch_job_script_{}".format(script_ts),
    "asset_type": "script",
    "asset_category": "USER",
    "origin_country": "us"
}
metadata_json = json.dumps(metadata)

entity = {
    "script": {
        "language": {
            "name": "python3"
        },
        "software_spec": {
            "base_id": "{}".format(prod_software_specification_id),
            "name": software_specification_name
        }
    }
}
entity_json = json.dumps(entity)

attachments = [
    {
        "asset_type": "script",
        "name": "batch_job_script.py",
        "description": "attachment for script",
        "mime": "application/text",
        "object_key": remote_script_path
    }
]
attachments_json = json.dumps(attachments)

Create a Python script asset:

In [80]:
result = ! cpdctl asset create --metadata '{metadata_json}' --entity '{entity_json}' --attachments '{attachments_json}' --space-id {prod_space_id} --output json -j "metadata.asset_id" --raw-output
prod_script_id = result.s
print("ID of the script in production space: {}".format(prod_script_id))

ID of the script in production space: 0d39cded-9287-4ad0-91a0-d288147f431d


In [81]:
! cpdctl asset search --space-id {prod_space_id} --query '*:*' --type-name asset

...
ID                                     Name                                    Created                    Description   Type                            State       Tags                               Size
ba81206c-6f95-4200-b31d-727ba94b85fb   bank-marketing-batch-output.csv         2021-04-30T12:35:42.000Z                 data_asset                      available   [connected-data]                   0
a6f58baf-1faf-431f-b66a-fc6b3d0fe5e9   default job - script_batch_deployment   2021-04-30T12:39:28.000Z                 job                             available   []                                 0
9dbb9b27-2ba2-4ed6-bba3-461df78025d7   job run                                 2021-04-30T12:41:46.000Z                 job_run                         available   []                                 0
56474183-8df9-45ab-b132-057b2585213c   batch_job_script_20210430-1451

### 4.3 Updating the existing script deployment <a class="anchor" id="part4.3"></a>

Update the **existing** script batch deployment in the production space:

In [83]:
deployment_name = 'script_batch_deployment'
jmes_query = 'resources[0].metadata.id'
result = ! cpdctl ml deployment list --space-id {prod_space_id} --name {deployment_name} --output json --jmes-query '{jmes_query}' --raw-output
prod_deployment_id = result.s
print('Existing production deployment ID: {}'.format(prod_deployment_id))

Existing production deployment ID: 25e17b86-9deb-4950-920d-47ba1fc781c1
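
The `--jmes-query` expression `resources[0].metadata.id` selects the ID of the first deployment in the returned list. The sketch below mirrors that lookup in plain Python. The function name `first_deployment_id` and the sample dictionary in the usage note are illustrative assumptions shaped only by the query path, not real CPDCTL output:

```python
def first_deployment_id(response):
    """Plain-Python equivalent of the JMESPath query resources[0].metadata.id.

    Returns None when the list is empty or a key is missing, which mirrors
    how JMESPath yields null instead of raising an error.
    """
    resources = response.get("resources") or []
    if not resources:
        return None
    metadata = resources[0].get("metadata") or {}
    return metadata.get("id")
```

For a hand-written response such as `{"resources": [{"metadata": {"id": "abc"}}]}` the function returns `"abc"`. Note that `resources[0]` takes whichever deployment is listed first, so the lookup is only unambiguous when the deployment name is unique within the space.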


Update the deployed asset with the newly created script:

In [84]:
asset = {
    'id': prod_script_id
}
asset_json = json.dumps(asset)

In [85]:
! cpdctl ml deployment update --space-id {prod_space_id} --deployment-id {prod_deployment_id} --asset '{asset_json}'

...
ID:        25e17b86-9deb-4950-920d-47ba1fc781c1
Name:      script_batch_deployment
Created:   2021-04-30T12:39:28.441Z
State:     ready
Tags:      []

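The `!` shell magic is only available inside IPython. To run the same update from a plain Python script, one option is to build the command as an argument list. This is a minimal sketch, assuming a hypothetical `deployment_update_argv` helper and using only the flags that appear in the cell above:

```python
import json


def deployment_update_argv(space_id, deployment_id, asset_id):
    """Build the argument list for `cpdctl ml deployment update`."""
    return [
        "cpdctl", "ml", "deployment", "update",
        "--space-id", space_id,
        "--deployment-id", deployment_id,
        # The JSON is a single list element, so no shell quoting is required.
        "--asset", json.dumps({"id": asset_id}),
    ]
```

The list can be passed to `subprocess.run(argv, check=True)`, which raises `CalledProcessError` if the CLI exits with a non-zero status.
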

Get the ID of the platform job that belongs to the deployment:

In [189]:
jmes_query = 'resources[0].entity.platform_job.job_id'
result = ! cpdctl ml deployment-job list --deployment-id {prod_deployment_id} --space-id {prod_space_id} --output json --jmes-query '{jmes_query}' --raw-output
prod_job_id = result.s
print('Production job ID: {}'.format(prod_job_id))

Production job ID: 2bb4ef91-cd73-4748-a10a-6316b2a71550


Start a new run of the job:

In [180]:
run = '{}'
jmes_query = 'metadata.asset_id'
result = ! cpdctl job run create --space-id {prod_space_id} --job-id {prod_job_id} --job-run '{run}' --output json --jmes-query '{jmes_query}' --raw-output
prod_run_id = result.s
print('ID of the job run in production space: {}'.format(prod_run_id))

ID of the job run in production space: b558937d-66f1-4d7d-8182-aa1c3ce49cf9


Wait for the job run to complete:

In [181]:
! cpdctl job run wait --job-id {prod_job_id} --run-id {prod_run_id} --space-id {prod_space_id}

...
ID:            b558937d-66f1-4d7d-8182-aa1c3ce49cf9
Name:          job run
Created:       2021-05-21T15:03:32Z
Description:
State:         Completed
Tags:          []

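When the output of a command is captured rather than displayed, the terminal color codes appear as literal escape sequences. A minimal sketch, assuming the `Key: value` layout shown above (which may differ between CPDCTL versions) and a hypothetical `parse_detail_output` helper, strips the color codes and reads the fields into a dictionary:

```python
import re

# Matches ANSI SGR color sequences such as ESC[36;1m and ESC[0m.
ANSI_ESCAPE = re.compile(r'\x1b\[[0-9;]*m')


def parse_detail_output(text):
    """Hypothetical parser for the 'Key: value' detail view printed by the CLI."""
    fields = {}
    for line in ANSI_ESCAPE.sub('', text).splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(':')
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields
```

A check such as `parse_detail_output(text).get("State") == "Completed"` could then gate the following steps. Where a command supports `--output json`, as several cells in this notebook use, parsing the JSON is likely the more robust choice.
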

View the log of the batch deployment job run:

In [182]:
! cpdctl job run logs --job-id {prod_job_id} --run-id {prod_run_id} --space-id {prod_space_id}

...
{
  "deployment": {
    "id": "865c5d4b-3439-438a-a2a0-48305237f46d"
  },
  "hardware_spec": {
    "id": "f3ebac7d-0a75-410c-8b48-a931428cc4c5"
  },
  "platform_job": {
    "job_id": "2bb4ef91-cd73-4748-a10a-6316b2a71550",
    "run_id": "b558937d-66f1-4d7d-8182-aa1c3ce49cf9"
  },
  "scoring": {
    "input_data_references": [
      {
        "connection": {},
        "location": {
          "href": "/v2/assets/093424a6-c966-447e-8552-7f2d991f9a76?space_id=f67d7982-7dd0-4ed3-9161-503145f4e0ae"
        },
        "type": "data_asset"
      }
    ],
    "output_data_reference": {
      "connection": {},
      "location": {
        "href": "/v2/assets/ba81206c-6f95-4200-b31d-727ba94b85fb?space_id=f67d7982-7dd0-4ed3-9161-503145f4e0ae"
      },
      "type": "data_asset"
    },
    "status": {
      "completed_at": "2021-05-21T15:04:16.176386Z",
      "message": {
        "text": "The directory pointed by the environment variable BATCH_OUTPUT_DIR is empty, skipping content upload to data 

## 5. Cleanup <a class="anchor" id="part5"></a>

Delete the QA space:

In [227]:
! cpdctl space delete --space-id {qa_space_id} --profile cpd

...
OK


### Author

Rafał Bigaj, a System Architect with a long, successful record of building and leading teams, and broad, practical knowledge of cloud computing, machine learning, and distributed systems development.

Copyright © 2020 IBM. This notebook and its source code are released under the terms of the MIT License.