# CPDCTL Samples for Notebooks and Environments in Spaces

CPDCTL is a command-line interface (CLI) you can use to manage the lifecycle of notebooks. By using the notebook CLI, you can automate the flow for creating notebooks and running notebook jobs, as well as promoting notebooks from a project to a space.   

This notebook begins by showing you how to install and configure CPDCTL and is then split up into two sections with examples of how to use the commands for:

- Creating notebooks and running notebook jobs
- Promoting notebooks from a project to a space

## Table of Contents

[1. Installing and configuring CPDCTL](#part1)
- [1.1 Installing the latest version of CPDCTL](#part1.1)
- [1.2 Adding CPD cluster configuration settings](#part1.2)

[2. Demo 1: Creating a notebook asset and running a job](#part2)
- [2.1 Creating a notebook asset](#part2.1)
- [2.2 Running a job](#part2.2)

[3. Demo 2: Promoting a notebook from a project to a space](#part4)

## Before you begin
Import the following libraries:

In [1]:
import base64
import json
import os
import requests
import platform
import tarfile
import zipfile
from IPython.display import display, HTML

## 1. Installing and configuring CPDCTL <a class="anchor" id="part1"></a>

### 1.1 Installing the latest version of CPDCTL <a class="anchor" id="part1.1"></a>

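To use the notebook and environment CLI commands, you need to install CPDCTL. The binary is published in the [CPDCTL GitHub repository](https://github.com/IBM/cpdctl/releases).

The following cells download the latest release binary for your operating system (the code assumes an amd64 machine), add the working directory to `PATH`, and then display the version number: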

In [2]:
PLATFORM = platform.system().lower()
CPDCTL_ARCH = "{}_amd64".format(PLATFORM)
CPDCTL_RELEASES_URL="https://api.github.com/repos/IBM/cpdctl/releases"
CWD = os.getcwd()
PATH = os.environ['PATH']
CPDCONFIG = os.path.join(CWD, '.cpdctl.config.yml')

response = requests.get(CPDCTL_RELEASES_URL)
assets = response.json()[0]['assets']
platform_asset = next(a for a in assets if CPDCTL_ARCH in a['name'])
cpdctl_url = platform_asset['url']
cpdctl_file_name = platform_asset['name']
        
response = requests.get(cpdctl_url, headers={'Accept': 'application/octet-stream'})
with open(cpdctl_file_name, 'wb') as f:
    f.write(response.content)
    
display(HTML('<code>cpdctl</code> binary downloaded from: <a href="{}">{}</a>'.format(platform_asset['browser_download_url'], platform_asset['name'])))
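
The download cell hard-codes the `amd64` suffix, so it would pick the wrong binary, or none, on an ARM machine. The following is a minimal sketch of a more defensive selection step. The `select_asset` helper, the `ARCH_ALIASES` table, and the sample asset names are illustrative assumptions and not part of CPDCTL; check the actual asset names on the releases page before relying on them.

```python
import platform

# Map machine names reported by platform.machine() to the architecture
# suffixes that are ASSUMED to appear in release asset names.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def select_asset(assets, system=None, machine=None):
    """Return the first asset whose name contains '<os>_<arch>'."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    arch = ARCH_ALIASES.get(machine)
    if arch is None:
        raise ValueError("Unsupported architecture: {}".format(machine))
    wanted = "{}_{}".format(system, arch)
    for asset in assets:
        if wanted in asset["name"]:
            return asset
    raise LookupError("No release asset found for {}".format(wanted))


# Hypothetical asset list shaped like the 'assets' array used above.
sample_assets = [
    {"name": "cpdctl_darwin_arm64.tar.gz"},
    {"name": "cpdctl_linux_amd64.tar.gz"},
    {"name": "cpdctl_windows_amd64.zip"},
]
print(select_asset(sample_assets, system="Linux", machine="x86_64")["name"])
```

Raising an explicit error when nothing matches gives a clearer message than the `StopIteration` that a bare `next(...)` produces.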

In [3]:
%%capture

%env PATH={CWD}:{PATH}
%env CPDCONFIG={CPDCONFIG}
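
The `%env PATH={CWD}:{PATH}` line uses `:` as the separator, which is correct on Linux and macOS but not on Windows. As a hedged alternative, the sketch below builds the value with `os.pathsep` and sets it through `os.environ`. The `prepend_to_path` helper is a hypothetical name introduced here for illustration.

```python
import os


def prepend_to_path(directory, current_path):
    """Return a PATH value with `directory` first, without duplicating it."""
    entries = [
        entry
        for entry in current_path.split(os.pathsep)
        if entry and entry != directory
    ]
    return os.pathsep.join([directory] + entries)


# Processes started from this Python process inherit the updated value.
os.environ["PATH"] = prepend_to_path(os.getcwd(), os.environ.get("PATH", ""))
```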

In [4]:
if cpdctl_file_name.endswith('tar.gz'):
    with tarfile.open(cpdctl_file_name, "r:gz") as tar:
        tar.extractall()
elif cpdctl_file_name.endswith('zip'):
    with zipfile.ZipFile(cpdctl_file_name, 'r') as zf:
        zf.extractall()

if CPDCONFIG and os.path.exists(CPDCONFIG):
    os.remove(CPDCONFIG)
    
version_r = ! cpdctl version
CPDCTL_VERSION = version_r.s

print("cpdctl version: {}".format(CPDCTL_VERSION))

cpdctl version: 1.0.77


### 1.2 Adding CPD cluster configuration settings <a class="anchor" id="part1.2"></a>

Before you can use CPDCTL, you need to add configuration settings. You only need to configure these settings once for the same IBM Cloud Pak for Data (CPD) user and cluster. Begin by entering your CPD credentials and the URL to the CPD cluster:

In [5]:
# Replace the placeholder strings with your own values before running this cell.
CPD_USER_NAME = 'YOUR CPD user name'
CPD_USER_PASSWORD = 'YOUR CPD user password'
CPD_URL = 'YOUR CPD CLUSTER URL'
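
Typing a password into a notebook cell leaves it in the saved file. One possible alternative is to read the values from environment variables and fall back to an interactive prompt. This is a minimal sketch: the `read_setting` helper is hypothetical, and reusing the names `CPD_USER_NAME`, `CPD_USER_PASSWORD`, and `CPD_URL` as environment variable names is an assumption made for this example, not something CPDCTL itself reads.

```python
import os
from getpass import getpass


def read_setting(name, env=None, prompt=None):
    """Return a setting from the environment, or prompt for it if absent."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    if prompt is None:
        raise KeyError("Setting {} is not defined".format(name))
    return prompt("{}: ".format(name))


# Demonstration with a stand-in mapping instead of the real environment.
fake_env = {"CPD_USER_NAME": "admin", "CPD_URL": "https://cpd.example.com"}
print(read_setting("CPD_USER_NAME", env=fake_env))

# In the notebook, the password could be requested without echoing it:
# CPD_USER_PASSWORD = read_setting("CPD_USER_PASSWORD", prompt=getpass)
```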

Add "cpd_user" user to the cpdctl configuration:

In [6]:
! cpdctl config user set cpd_user --username {CPD_USER_NAME} --password {CPD_USER_PASSWORD}
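
Because the values are interpolated into a shell command without quotes, a user name or password containing spaces or shell metacharacters may be split or interpreted by the shell. The sketch below shows one way to build the same command with each argument quoted. `user_set_command` is a hypothetical helper, and `shlex.join` requires Python 3.8 or later.

```python
import shlex


def user_set_command(user_name, password, config_user="cpd_user"):
    """Build the 'cpdctl config user set' command with shell-safe quoting."""
    return shlex.join([
        "cpdctl", "config", "user", "set", config_user,
        "--username", user_name,
        "--password", password,
    ])


# Example values only; the password contains characters a shell would interpret.
cmd = user_set_command("admin", "s3cret pass&word")
print(cmd)
```

In a notebook cell, the resulting string could then be run with `! {cmd}`.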

Add "cpd" cluster to the cpdctl configuration:

In [7]:
! cpdctl config profile set cpd --url {CPD_URL}

Add "cpd" context to the cpdctl configuration:

In [8]:
! cpdctl config context set cpd --profile cpd --user cpd_user

List available contexts:

In [9]:
! cpdctl config context list

Name   Profile   User       Current
cpd    cpd       cpd_user   *


Switch to the context you just created if it is not marked in the `Current` column:

In [10]:
! cpdctl config context use cpd

Switched to context "cpd".


List available spaces in context:

In [11]:
! cpdctl space list

...
ID                                     Name         Created                    Description   State    Tags
0f9bb565-a7d8-409b-baaf-5a56cd343155   test_space   2021-05-17T13:52:54.619Z                 active   []


Choose the space in which you want to work:

In [12]:
result = ! cpdctl space list --output json -j "(resources[].metadata.id)[0]" --raw-output
space_id = result.s
print("space id: {}".format(space_id))

# You can also specify your space id directly:
# space_id = "Your space ID"

space id: 0f9bb565-a7d8-409b-baaf-5a56cd343155
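
If the query matches nothing, for example because no space exists yet, `space_id` does not hold a usable ID and every later command fails with a less obvious error. A small guard can catch this early. This is a sketch under an assumption: it treats an empty string or the literal text `null` as "no result", which may not cover every way the CLI reports an empty match. `require_id` is a hypothetical helper.

```python
def require_id(value, label):
    """Return the stripped ID, or raise if the CLI returned no usable value."""
    cleaned = (value or "").strip()
    if not cleaned or cleaned == "null":
        raise ValueError(
            "No {} found; check the output of the preceding command".format(label)
        )
    return cleaned


# Example with the space ID shown in the output above.
print(require_id("0f9bb565-a7d8-409b-baaf-5a56cd343155\n", "space id"))
```

The same check applies to the environment, notebook, job, and version IDs captured later in this notebook.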


## 2. Demo 1: Creating a notebook asset and running a job <a class="anchor" id="part2"></a>

Before starting with this section, ensure that you have run the cells in [Section 1](#part1) and specified the ID of the space in which you will work.

Suppose you have a Jupyter Notebook (.ipynb) file on your local system and you would like to run the code in the file as a job on a CPD cluster. This section shows you how to create a notebook asset and run a job on a CPD cluster. 

### 2.1 Creating a notebook asset<a class="anchor" id="part2.1"></a>

First, create a notebook asset in your space. To create a notebook asset, you need to specify:
- The environment in which your notebook is to run
- A notebook file (.ipynb)

List all the environments in your space, filter them by their display name and get the ID of the environment in which your notebook will be run:

In [13]:
environment_name = "Default Python 3.7"
query_string = "(resources[?entity.environment.display_name == '{}'].metadata.asset_id)[0]".format(environment_name)

In [14]:
result = ! cpdctl environment list --space-id {space_id} --output json -j "{query_string}" --raw-output
env_id = result.s
print("environment id: {}".format(env_id))

# You can also specify your environment id directly:
# env_id = "Your environment ID"

environment id: jupconda37oce-0f9bb565-a7d8-409b-baaf-5a56cd343155
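
The `-j` option takes a JMESPath expression. To make the filter above easier to read, here is a plain Python sketch of what it selects. The payload structure is inferred from the query itself (`resources[].entity.environment.display_name` and `resources[].metadata.asset_id`); the sample values and the `find_environment_id` helper are made up for illustration.

```python
import json


def find_environment_id(payload, display_name):
    """Return the asset_id of the first environment with the given display name."""
    matches = [
        resource["metadata"]["asset_id"]
        for resource in payload.get("resources", [])
        if resource.get("entity", {}).get("environment", {}).get("display_name")
        == display_name
    ]
    return matches[0] if matches else None


# Hypothetical, heavily trimmed response from 'cpdctl environment list --output json'.
sample = json.loads("""
{
  "resources": [
    {"metadata": {"asset_id": "env-a"},
     "entity": {"environment": {"display_name": "Default Python 3.6"}}},
    {"metadata": {"asset_id": "env-b"},
     "entity": {"environment": {"display_name": "Default Python 3.7"}}}
  ]
}
""")
print(find_environment_id(sample, "Default Python 3.7"))
```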


Upload the .ipynb file:

In [15]:
remote_file_path = "notebook/cpdctl-test-notebook.ipynb"
local_file_path = "cpdctl-test-notebook.ipynb"

In [16]:
! cpdctl asset file upload --path {remote_file_path} --file {local_file_path} --space-id {space_id}

...
OK


Create a notebook asset:

In [17]:
file_name = "cpdctl-test-notebook.ipynb"
runtime = {
    'environment': env_id
}
runtime_json = json.dumps(runtime)

In [18]:
result = ! cpdctl notebook create --file-reference {remote_file_path} --name {file_name} --space {space_id} --runtime '{runtime_json}' --output json -j "metadata.asset_id" --raw-output
notebook_id = result.s
print("notebook id: {}".format(notebook_id))

notebook id: f893cb79-ca23-4c89-9289-a9afdfd2e7dd


### 2.2 Running a job <a class="anchor" id="part2.2"></a>

To create a notebook job, you need to give your job a name, add a description, and pass the notebook ID and environment ID you determined in [2.1](#part2.1). Additionally, you can add environment variables that will be used in your notebook:

In [19]:
job_name = "cpdctl-test-job"
job = {
    'asset_ref': notebook_id, 
    'configuration': {
        'env_id': env_id, 
        'env_variables': [
            'foo=1', 
            'bar=2']
    }, 
    'description': 'my job', 
    'name': job_name
}
job_json = json.dumps(job)
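
When a job needs many environment variables, writing the `'name=value'` strings by hand is error-prone. The sketch below builds the same payload structure as the cell above from a dictionary. `build_job_payload` is a hypothetical helper, and the IDs in the example are placeholders.

```python
import json


def build_job_payload(name, notebook_id, env_id, env_variables=None, description=""):
    """Build a job definition with env variables given as a dict."""
    pairs = [
        "{}={}".format(key, value)
        for key, value in (env_variables or {}).items()
    ]
    return {
        "asset_ref": notebook_id,
        "configuration": {
            "env_id": env_id,
            "env_variables": pairs,
        },
        "description": description,
        "name": name,
    }


payload = build_job_payload(
    "cpdctl-test-job",
    "NOTEBOOK_ID_PLACEHOLDER",
    "ENV_ID_PLACEHOLDER",
    env_variables={"foo": 1, "bar": 2},
    description="my job",
)
print(json.dumps(payload, indent=2))
```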

In [20]:
result = ! cpdctl job create --job '{job_json}' --space-id {space_id} --output json -j "metadata.asset_id" --raw-output
job_id = result.s
print("job id: {}".format(job_id))

job id: 20c1c5ab-a239-477b-bfac-bade5fa82033


Run a notebook job:

In [21]:
run_data = {
    'job_run': {}
}
run_data_json = json.dumps(run_data)

In [22]:
result = ! cpdctl job run create --space-id {space_id} --job-id {job_id} --job-run '{run_data_json}' --output json -j "metadata.asset_id" --raw-output
run_id = result.s
print("run id: {}".format(run_id))

run id: d7ae75b9-d03f-46f5-bc19-601215d4bd94


You can see the output of each cell in your .ipynb file by listing job run logs:

In [23]:
! cpdctl job run logs --job-id {job_id} --run-id {run_id} --space-id {space_id}

...

Cell 1:
0
1
4
9
16
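
A job run is likely still in progress immediately after it is created, so the logs may be incomplete if you list them too early. The following is a generic polling sketch and not part of CPDCTL. `wait_for_state` is a hypothetical helper, the state names are assumptions, and `fetch_state` stands for whatever function you write to look up the current run state; the command and field to use for that are not shown in this notebook and should be checked against the CPDCTL help.

```python
import time


def wait_for_state(fetch_state, terminal_states, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Call fetch_state() until it returns a terminal state or time runs out."""
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        if state in terminal_states:
            return state
        if clock() >= deadline:
            raise TimeoutError("Run did not finish; last state: {}".format(state))
        sleep(interval)


# Stand-in for a real status lookup: yields a fixed sequence of states.
fake_states = iter(["Starting", "Running", "Completed"])
final_state = wait_for_state(
    lambda: next(fake_states),
    terminal_states={"Completed", "Failed", "Canceled"},  # assumed names
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final_state)
```

Passing `sleep` and `clock` as parameters keeps the helper easy to test without real delays.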




## 3. Demo 2: Promoting a notebook from a project to a space <a class="anchor" id="part4"></a>

Before starting with this section, ensure that you have run the cells in [Section 1](#part1) and specified the ID of the space in which you will work.

Suppose you have a notebook in a project and would like to promote a specific version of this notebook to a space. This section shows you how to promote a notebook from a project to a space on a CPD cluster.

Choose a project from which you will promote your notebook:

In [24]:
result = ! cpdctl project list --output json -j "(resources[].metadata.guid)[0]" --raw-output
project_id = result.s
print("project id: {}".format(project_id))

# You can also specify your project id directly:
# project_id = "Your project ID"

project id: 0f5a1f58-7fdc-4a34-ad75-28c5b122758a


Specify the notebook you would like to promote:

In [25]:
result = ! cpdctl asset search --type-name notebook --query "asset.asset_type:notebook" --project-id {project_id} --output json -j "(results[].metadata.asset_id)[0]" --raw-output
notebook_id_in_project = result.s
print("notebook id in project: {}".format(notebook_id_in_project))

# You can also specify your notebook id in project directly:
# notebook_id_in_project = "Your notebook ID in project"

notebook id in project: 8ead5d49-0a5d-4325-9017-996c3bf40245


Create a version of your notebook if it does not have one yet:

In [26]:
result = ! cpdctl notebook version create --notebook-id {notebook_id_in_project} --output json -j "metadata.guid" --raw-output
version_id = result.s
print("version id: {}".format(version_id))

version id: d91eb38c-68fd-4d0f-a54f-836dd78935c0


Or specify an existing version of the notebook:

In [27]:
result = ! cpdctl notebook version list --notebook-id {notebook_id_in_project} --output json -j "(resources[].metadata.guid)[0]" --raw-output
version_id = result.s
print("version id: {}".format(version_id))

# You can also specify your version id directly:
# version_id = "Your version ID"

version id: d91eb38c-68fd-4d0f-a54f-836dd78935c0
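
The two cells above are alternatives: create a version, or reuse an existing one. A minimal sketch of combining them into one decision is shown below. `resolve_version_id` is a hypothetical helper; `create_version` stands for any function that runs the create command and returns the new ID, and treating an empty string or `null` as "no version" is an assumption.

```python
def resolve_version_id(existing_version_ids, create_version):
    """Reuse the first existing version ID, or create a version if none exist."""
    for version_id in existing_version_ids:
        cleaned = (version_id or "").strip()
        if cleaned and cleaned != "null":
            return cleaned
    return create_version()


# Example with stand-in values instead of real CLI calls.
print(resolve_version_id(["version-1", "version-2"], lambda: "new-version"))
print(resolve_version_id([], lambda: "new-version"))
```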


Promote the notebook to the space. The parameters `name` and `description` are optional. If they are not specified, the name and description of the original notebook in the project will be used.

In [28]:
notebook_name = "cpdctl_test_promote"
notebook_description = "cpdctl test promote"

result = ! cpdctl notebook promote --notebook-id {notebook_id_in_project} --version-id {version_id} --name '{notebook_name}' --description '{notebook_description}' --project-id {project_id} --space-id {space_id} --output json -j "metadata.asset_id" --raw-output
notebook_id_in_space = result.s
print("notebook id in space: {}".format(notebook_id_in_space))

notebook id in space: c193910c-7385-4c24-a873-be7870694ec7
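
Since `name` and `description` are optional, the command can be assembled so that these flags are passed only when a value is given. The sketch below also quotes every argument, which matters for a description that contains spaces. `promote_command` is a hypothetical helper that only builds the command string; the IDs in the example are placeholders.

```python
import shlex


def promote_command(notebook_id, version_id, project_id, space_id,
                    name=None, description=None):
    """Build the 'cpdctl notebook promote' command, omitting unset options."""
    args = [
        "cpdctl", "notebook", "promote",
        "--notebook-id", notebook_id,
        "--version-id", version_id,
    ]
    if name is not None:
        args += ["--name", name]
    if description is not None:
        args += ["--description", description]
    args += [
        "--project-id", project_id,
        "--space-id", space_id,
        "--output", "json",
        "-j", "metadata.asset_id",
        "--raw-output",
    ]
    return shlex.join(args)  # requires Python 3.8 or later


promote_cmd = promote_command(
    "NOTEBOOK_ID", "VERSION_ID", "PROJECT_ID", "SPACE_ID",
    description="cpdctl test promote",
)
print(promote_cmd)
```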


Copyright © 2021 IBM. This notebook and its source code are released under the terms of the MIT License.