# *cpdctl* Sample Code for Promoting Notebooks to a Deployment Space

<span style="color:red">**Note: This notebook is the [sample notebook](https://github.com/IBM/cpdctl/blob/master/samples/Notebook-and-Environment-samples-for-Projects.ipynb) from the cpdctl public repo. It was tested and updated by the WW Data and AI team in a CPD 4.0.4 JupyterLab environment with cpdctl release 1.1.132.**</span>


cpdctl is a command-line interface (CLI) that you can use to manage the lifecycle of notebooks. By using the notebook CLI, you can automate the flow for creating notebooks and running notebook jobs, moving notebooks between projects in Watson Studio, and adding custom libraries to notebook runtime environments.

Using cpdctl is important for two reasons:

1. Automation of tasks
2. Performing tasks that we can't perform in the UI
    - Schedule jobs for notebooks created in JupyterLab (notebooks must be in the Project)
    - Promote notebooks to a deployment space

## Before you begin
Import the following libraries:

In [None]:
# Import required libraries and modules
import base64
import json
import os
import platform
import requests
import tarfile
import zipfile
import jmespath
import subprocess
# display and HTML are imported from IPython.display; the IPython.core.display path is deprecated
from IPython.display import display, HTML

## Installing and configuring CPDCTL <a class="anchor" id="part1"></a>

### Install version v1.1.132 of CPDCTL <a class="anchor" id="part1.1"></a>

To use the notebook and environment CLI commands, you need to install CPDCTL. Download the binary from the [CPDCTL GitHub repository](https://github.com/IBM/cpdctl/releases).

Download the binary and then display the version number:

In [None]:
PLATFORM = platform.system().lower()
CPDCTL_ARCH = "{}_amd64".format(PLATFORM)
CPDCTL_RELEASES_URL="https://api.github.com/repos/IBM/cpdctl/releases"
CWD = os.getcwd()
PATH = os.environ['PATH']
CPDCONFIG = os.path.join(CWD, '.cpdctl.config.yml')
version='v1.1.132'

response = requests.get(CPDCTL_RELEASES_URL)
response.raise_for_status()  # stop early if the GitHub API call failed
# Pick the release whose tag matches the pinned version
asset_version = next(a for a in response.json() if version==a['tag_name'])
assets=asset_version['assets']
platform_asset = next(a for a in assets if CPDCTL_ARCH in a['name'])
cpdctl_url = platform_asset['url']
cpdctl_file_name = platform_asset['name']

response = requests.get(cpdctl_url, headers={'Accept': 'application/octet-stream'})
with open(cpdctl_file_name, 'wb') as f:
    f.write(response.content)
    
display(HTML('<code>cpdctl</code> binary downloaded from: <a href="{}">{}</a>'.format(platform_asset['browser_download_url'], platform_asset['name'])))
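The release lookup above relies on `next(...)`, which raises a bare `StopIteration` when the tag or the platform asset is missing. The following is a minimal sketch of the same selection logic with clearer errors. The helper name `select_asset` and the sample payload are hypothetical and only mimic the shape of the GitHub releases response used in the cell above.

```python
def select_asset(releases, version, arch):
    """Return the asset dict for a given release tag and platform string.

    `releases` is the decoded JSON list from the releases endpoint.
    Raises LookupError with a readable message when nothing matches.
    """
    release = next((r for r in releases if r.get("tag_name") == version), None)
    if release is None:
        raise LookupError("no release tagged {!r}".format(version))
    asset = next((a for a in release.get("assets", []) if arch in a["name"]), None)
    if asset is None:
        raise LookupError("release {!r} has no asset for {!r}".format(version, arch))
    return asset


# Hypothetical payload for illustration; real responses carry many more fields.
sample_releases = [
    {
        "tag_name": "v1.1.132",
        "assets": [
            {"name": "cpdctl_linux_amd64.tar.gz"},
            {"name": "cpdctl_darwin_amd64.tar.gz"},
        ],
    },
]

print(select_asset(sample_releases, "v1.1.132", "linux_amd64")["name"])
```

In the notebook, `select_asset(response.json(), version, CPDCTL_ARCH)` would take the place of the two `next(...)` expressions.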

In [None]:
%%capture

%env PATH={CWD}:{PATH}
%env CPDCONFIG={CPDCONFIG}

In [None]:
if cpdctl_file_name.endswith('tar.gz'):
    with tarfile.open(cpdctl_file_name, "r:gz") as tar:
        tar.extractall()
elif cpdctl_file_name.endswith('zip'):
    with zipfile.ZipFile(cpdctl_file_name, 'r') as zf:
        zf.extractall()

if CPDCONFIG and os.path.exists(CPDCONFIG):
    os.remove(CPDCONFIG)
    
version_r = ! cpdctl version
CPDCTL_VERSION = version_r.s

print("cpdctl version: {}".format(CPDCTL_VERSION))

### Add CPD cluster configuration settings <a class="anchor" id="part1.2"></a>

Before you can use CPDCTL, you need to add configuration settings. You only need to configure these settings once for the same IBM Cloud Pak for Data (CPD) user and cluster. If you run cpdctl from outside of CPD, begin by entering your CPD credentials and the URL of the CPD cluster:

In [None]:
# This information is needed only when cpdctl is used from outside of CPD
#CPD_USERNAME = ' ' # for example: datascientist
#CPD_PASSWORD = ' '
#CPD_URL = ' ' #typically, this would be https://cpd-cpd-instance.apps.demo.ibmdte.net
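If you do need these values, one way to keep credentials out of the notebook file is to read them from environment variables. The sketch below assumes the variables are exported under the same names as the Python variables above (`CPD_USERNAME`, `CPD_PASSWORD`, `CPD_URL`); that naming is an assumption of this example, not a cpdctl convention, and the helper `load_cpd_credentials` is hypothetical.

```python
import os


def load_cpd_credentials(env=None):
    """Read CPD connection settings from a mapping (default: os.environ).

    Raises KeyError listing every setting that is missing or blank.
    """
    env = os.environ if env is None else env
    names = ("CPD_USERNAME", "CPD_PASSWORD", "CPD_URL")
    missing = [n for n in names if not env.get(n, "").strip()]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {n: env[n].strip() for n in names}


# Hypothetical values, for illustration only.
demo_env = {
    "CPD_USERNAME": "datascientist",
    "CPD_PASSWORD": "example-password",
    "CPD_URL": "https://cpd.example.com",
}
creds = load_cpd_credentials(demo_env)
print(creds["CPD_URL"])
```

Calling `load_cpd_credentials()` with no argument reads the real environment instead of the demo mapping.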

List the spaces available in the current context:

In [None]:
! cpdctl space list

Choose the space in which you want to work:

In [None]:
#result = ! cpdctl space list --output json -j "(resources[].metadata.id)[0]" --raw-output
#space_id = result.s
# print("space id: {}".format(space_id))

# You can also specify your space id directly:
space_id = " "
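Because the placeholder `space_id = " "` is a valid Python string, later commands would run with a blank ID and fail in less obvious ways. A minimal guard, assuming only that a usable ID is a non-empty token without whitespace, might look like this. The helper `require_id` is hypothetical and not part of cpdctl.

```python
def require_id(value, label):
    """Return the stripped ID, or raise ValueError if it is blank or contains whitespace."""
    cleaned = (value or "").strip()
    if not cleaned or any(ch.isspace() for ch in cleaned):
        raise ValueError("{} is empty or malformed: {!r}".format(label, value))
    return cleaned


# A blank placeholder is rejected before any CLI call is made.
try:
    require_id(" ", "space id")
except ValueError as err:
    print(err)
```

The same check can be applied to values captured from the CLI with `result.s`, where an error message with spaces would otherwise be treated as an ID.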

## 2. Creating a notebook asset in the deployment space and running a job <a class="anchor" id="part2"></a>

### 2.1 Create a notebook asset<a class="anchor" id="part2.1"></a>

First, we create a *notebook asset* in your deployment space. Assets are used to capture various metadata. To create a notebook asset you need to specify:

- The environment in which your notebook is to run
- A notebook file (.ipynb)

List all the environments in your space, filter them by their display name and get the ID of the environment in which your notebook will be run:

In [None]:
environment_name = "Default Python 3.8"
query_string = "(resources[?entity.environment.display_name == '{}'].metadata.asset_id)[0]".format(environment_name)

In [None]:
result = ! cpdctl environment list --space-id {space_id} --output json -j "{query_string}" --raw-output
env_id = result.s
print("environment id: {}".format(env_id))

# You can also specify your environment id directly:
# env_id = "Your environment ID"
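To make the JMESPath expression in `query_string` easier to read, here is a rough pure-Python equivalent: it keeps the resources whose `entity.environment.display_name` matches and returns the first `metadata.asset_id`, or `None` when nothing matches. The sample listing is hypothetical and only contains the fields the query touches.

```python
def first_environment_id(listing, display_name):
    """Return metadata.asset_id of the first environment with the given display name."""
    for resource in listing.get("resources", []):
        env = resource.get("entity", {}).get("environment", {})
        if env.get("display_name") == display_name:
            return resource["metadata"]["asset_id"]
    return None


# Hypothetical listing with made-up IDs.
sample_listing = {
    "resources": [
        {"metadata": {"asset_id": "env-aaa"},
         "entity": {"environment": {"display_name": "Default Python 3.7"}}},
        {"metadata": {"asset_id": "env-bbb"},
         "entity": {"environment": {"display_name": "Default Python 3.8"}}},
    ]
}

print(first_environment_id(sample_listing, "Default Python 3.8"))
```

If the display name does not exist in your space, the lookup yields no ID, so it is worth checking `env_id` before continuing.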

The notebook that we will promote is *Notebook1.ipynb*. You can either create a new notebook with this name or change the name in the code to one of the notebooks in your directory. 

In [None]:
remote_file_path = "notebook/Notebook1.ipynb"
local_file_path = "Notebook1.ipynb"

In [None]:
! cpdctl asset file upload --path {remote_file_path} --file {local_file_path} --space-id {space_id}
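Before uploading, it can help to confirm that the local file exists and parses as a notebook. The sketch below assumes an nbformat 4 document, which is JSON with top-level `cells` and `nbformat` keys; the helper `check_notebook_file` is hypothetical, and the demo writes a throwaway empty notebook to a temporary directory.

```python
import json
import os
import tempfile


def check_notebook_file(path):
    """Return the number of cells if `path` looks like an nbformat 4 notebook."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    if "cells" not in doc or "nbformat" not in doc:
        raise ValueError("{} is not a notebook document".format(path))
    return len(doc["cells"])


# Demonstration with a minimal, empty notebook in a temporary directory.
minimal = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
demo_path = os.path.join(tempfile.mkdtemp(), "Notebook1.ipynb")
with open(demo_path, "w", encoding="utf-8") as f:
    json.dump(minimal, f)

print(check_notebook_file(demo_path))
```

In the notebook you would call `check_notebook_file(local_file_path)` before the upload cell.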

Create a notebook asset:

In [None]:
file_name = "Notebook1.ipynb"
runtime = {
    'environment': env_id
}
runtime_json = json.dumps(runtime)

In [None]:
result = ! cpdctl notebook create --file-reference {remote_file_path} --name {file_name} --space {space_id} --runtime '{runtime_json}' --output json -j "metadata.asset_id" --raw-output
notebook_id = result.s
print("notebook id: {}".format(notebook_id))

<span style="color:red">Important Note: Check the target deployment space. *Notebook1* should show up in the Assets tab. </span>

### 2.2 Running a job <a class="anchor" id="part2.2"></a>

To create a notebook job, you need to give your job a name, add a description, and pass the notebook ID and environment ID you determined in [2.1](#part2.1). Additionally, you can add environment variables that will be used in your notebook:

In [None]:
job_name = "cpdctl-test-job"
job = {
    'asset_ref': notebook_id, 
    'configuration': {
        'env_id': env_id, 
        'env_variables': [
            'foo=1', 
            'bar=2'
        ]
    }, 
    'description': 'my job', 
    'name': job_name
}
job_json = json.dumps(job)
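The commands in this notebook wrap the JSON in single quotes (`'{job_json}'`), which breaks as soon as a value contains a single quote, for example in the description. A more robust sketch, assuming a POSIX shell behind the `!` magic, is to quote the JSON with `shlex.quote`. The helper name `shell_json_arg` and the sample description are hypothetical.

```python
import json
import shlex


def shell_json_arg(obj):
    """Serialize `obj` to JSON and quote it as a single POSIX shell argument."""
    return shlex.quote(json.dumps(obj))


sample_job = {
    "name": "cpdctl-test-job",
    "description": "my team's job",  # the apostrophe would break naive '...' quoting
    "configuration": {"env_variables": ["foo=1", "bar=2"]},
}
job_arg = shell_json_arg(sample_job)

# shlex.split parses the argument the way a POSIX shell would,
# so this round trip shows the JSON survives intact.
round_trip = json.loads(shlex.split(job_arg)[0])
print(round_trip == sample_job)
```

With this approach the value is passed without extra quotes, for example `--job {job_arg}` instead of `--job '{job_json}'`.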

In [None]:
result = ! cpdctl job create --job '{job_json}' --space-id {space_id} --output json -j "metadata.asset_id" --raw-output
job_id = result.s
print("job id: {}".format(job_id))

Run a notebook job:

In [None]:
job_run = {
    'configuration': {
        'env_variables': [
            'key1=value1', 
            'key2=value2'
        ]
    }
}
job_run_json = json.dumps(job_run)

In [None]:
result = ! cpdctl job run create --space-id {space_id} --job-id {job_id} --job-run '{job_run_json}' --output json -j "metadata.asset_id" --raw-output
run_id = result.s
print("run id: {}".format(run_id))

You can see the output of each cell in your .ipynb file by listing job run logs:

In [None]:
! cpdctl job run logs --job-id {job_id} --run-id {run_id} --space-id {space_id}
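Logs may be incomplete while a run is still in progress, so you may want to wait for the run to finish before reading them. The following is a generic polling sketch: `get_state` is a hypothetical callable that you would implement with whatever status lookup your cpdctl version provides, and the terminal state names are assumptions rather than values confirmed by this notebook.

```python
import time

# Assumed terminal state names; adjust them to what your cluster reports.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_run(get_state, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call get_state() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError("run still in state {!r} after {}s".format(state, timeout))
        sleep(interval)


# Stand-in for a real status lookup: yields a fixed sequence of states.
fake_states = iter(["Starting", "Running", "Completed"])
final_state = wait_for_run(lambda: next(fake_states), interval=0, sleep=lambda s: None)
print(final_state)
```

Injecting `sleep` keeps the helper easy to test; in real use the default `time.sleep` applies.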

<span style="color:red">Important Note: Check the target deployment space. You should see a running job in the Jobs tab. </span>

Copyright © 2021 IBM. This notebook and its source code are released under the terms of the MIT License.