# *cpdctl* Sample Code for Copying Notebooks to Another Project

<span style="color:red">**Note: This notebook is the [sample notebook](https://github.com/IBM/cpdctl/blob/master/samples/Notebook-and-Environment-samples-for-Projects.ipynb) from the cpdctl public repo. It was tested and updated by the WW Data and AI team for cpdctl release 1.1.132 in a CPD 4.0.4 JupyterLab environment.**</span>


cpdctl is a command-line interface (CLI) that you can use to manage the lifecycle of notebooks. By using the notebook CLI, you can automate the flow for creating notebooks and running notebook jobs, moving notebooks between projects in Watson Studio, and adding custom libraries to notebook runtime environments.

Using cpdctl is important for two reasons:

1. Automating tasks
2. Performing tasks that can't be performed in the UI
    - Schedule jobs for notebooks created in JupyterLab (notebooks must be in the Project)
    - Promote notebooks to a deployment space

## Before you begin - setup

### Import required libraries and modules

In [None]:
# Import required libraries and modules
import base64
import json
import os
import platform
import requests
import tarfile
import zipfile
import jmespath
import subprocess
from IPython.display import display, HTML  # IPython.core.display is the deprecated import path

### Install version v1.1.132 of `cpdctl`

In [None]:
PLATFORM = platform.system().lower()
CPDCTL_ARCH = "{}_amd64".format(PLATFORM)
CPDCTL_RELEASES_URL="https://api.github.com/repos/IBM/cpdctl/releases"
CWD = os.getcwd()
PATH = os.environ['PATH']
CPDCONFIG = os.path.join(CWD, '.cpdctl.config.yml')
version='v1.1.132'

response = requests.get(CPDCTL_RELEASES_URL)
asset_version = next(a for a in response.json() if version==a['tag_name'])
#assets = response.json()[0]['assets']
assets=asset_version['assets']
platform_asset = next(a for a in assets if CPDCTL_ARCH in a['name'])
cpdctl_url = platform_asset['url']
cpdctl_file_name = platform_asset['name']

response = requests.get(cpdctl_url, headers={'Accept': 'application/octet-stream'})
with open(cpdctl_file_name, 'wb') as f:
    f.write(response.content)
    
display(HTML('<code>cpdctl</code> binary downloaded from: <a href="{}">{}</a>'.format(platform_asset['browser_download_url'], platform_asset['name'])))
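
The release lookup above uses `next(...)` without a default, so a missing tag or platform surfaces as a bare `StopIteration`. A minimal sketch of a more defensive selection follows. It assumes the releases payload carries the `tag_name`, `assets`, and `name` fields that the cell above already reads. The helper `pick_asset` and the sample data are hypothetical and only illustrate the idea.

```python
def pick_asset(releases, version, arch):
    """Return the asset dict for `version` whose name contains `arch`.

    Raises LookupError with a readable message instead of StopIteration.
    """
    release = next((r for r in releases if r.get("tag_name") == version), None)
    if release is None:
        raise LookupError("release {} not found".format(version))
    asset = next(
        (a for a in release.get("assets", []) if arch in a.get("name", "")),
        None,
    )
    if asset is None:
        raise LookupError("no {} asset in release {}".format(arch, version))
    return asset


# Hypothetical sample shaped like the fields the download cell reads.
sample_releases = [
    {
        "tag_name": "v1.1.132",
        "assets": [
            {"name": "cpdctl_linux_amd64.tar.gz"},
            {"name": "cpdctl_darwin_amd64.tar.gz"},
        ],
    },
]

chosen = pick_asset(sample_releases, "v1.1.132", "linux_amd64")
print(chosen["name"])
```

In the notebook, `pick_asset(response.json(), version, CPDCTL_ARCH)` could stand in for the two `next(...)` calls, so a wrong version string fails with an explicit message.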

In [None]:
%%capture

%env PATH={CWD}:{PATH}
%env CPDCONFIG={CPDCONFIG}

In [None]:
if cpdctl_file_name.endswith('tar.gz'):
    with tarfile.open(cpdctl_file_name, "r:gz") as tar:
        tar.extractall()
elif cpdctl_file_name.endswith('zip'):
    with zipfile.ZipFile(cpdctl_file_name, 'r') as zf:
        zf.extractall()

if CPDCONFIG and os.path.exists(CPDCONFIG):
    os.remove(CPDCONFIG)
    
version_r = ! cpdctl version
CPDCTL_VERSION = version_r.s

print("cpdctl version: {}".format(CPDCTL_VERSION))

In [None]:
!which cpdctl

In [None]:
!cpdctl version

## 1. Provide CPD Cluster credentials

In [None]:
# Needed only if cpdctl is used outside of the CPD cluster
#CPD_USERNAME = ' ' # for example: datascientist
#CPD_PASSWORD = ' '
#CPD_URL = ' ' #typically, this would be https://cpd-cpd-instance.apps.demo.ibmdte.net

Because this notebook runs inside the CPD cluster that you want to interact with, cpdctl can use its zero-configuration mode, which automatically connects to the CP4D instance.

In [None]:
# Show all projects
! cpdctl project list

## 2. Choose a project to copy assets to

<span style="color:red">Important Note: Assets can be copied only to projects that are NOT configured with git.</span>

In [None]:
# You can specify your project id directly:
project_id = ""

# Or you can select it by position: [0] picks the first entry in the project ID list
#result = ! cpdctl project list --output json -j "(resources[].metadata.guid)[0]" --raw-output
#project_id = result.s

print("project id: {}".format(project_id))
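
Because `project_id` defaults to an empty string, and a failed `cpdctl` call can also leave `result.s` empty, later cells may run with a blank ID. A small guard can catch this early. The helper name `require_value` and the placeholder ID below are hypothetical.

```python
def require_value(name, value):
    """Return the stripped value, or raise ValueError if it is empty."""
    cleaned = (value or "").strip()
    if not cleaned:
        raise ValueError(
            "{} is empty; set it before running the next cells".format(name)
        )
    return cleaned


# Placeholder ID for illustration only, not a real project.
checked = require_value("project_id", "  example-project-id  ")
print(checked)
```

The same check could be applied to `env_id`, `notebook_id`, `job_id`, and `run_id` after each cell that captures command output.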

## 3. Create a notebook in a non-git project using cpdctl

First, we create a *notebook asset* in your project. Assets are used to capture various metadata. To create a notebook asset you need to specify:

- The environment in which your notebook is to run
- A notebook file (.ipynb).

### 3.1 Get the environment ID

List all the environments in your project, filter them by their display name and get the ID of the environment in which your notebook will be run:

In [None]:
# The names of the available environments are listed on the project's Environments tab

environment_name = "Default Python 3.8"
query_string = "(resources[?entity.environment.display_name == '{}'].metadata.asset_id)[0]".format(environment_name)

In [None]:
result = ! cpdctl environment list --project-id {project_id} --output json -j "{query_string}" --raw-output
env_id = result.s
print("environment id: {}".format(env_id))

# You can also specify your environment id directly:
# env_id = "Your environment ID"
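
To make the JMESPath filter easier to read, here is a roughly equivalent lookup in plain Python. It assumes the JSON listing has the shape implied by the query (`resources[].entity.environment.display_name` and `resources[].metadata.asset_id`). The function name and the sample listing are hypothetical.

```python
def find_environment_id(listing, display_name):
    """Return the asset_id of the first environment with this display name.

    Approximates the query
    (resources[?entity.environment.display_name == NAME].metadata.asset_id)[0]
    and returns None when nothing matches.
    """
    matches = [
        res["metadata"]["asset_id"]
        for res in listing.get("resources", [])
        if res.get("entity", {}).get("environment", {}).get("display_name")
        == display_name
    ]
    return matches[0] if matches else None


# Hypothetical listing with made-up IDs.
sample_listing = {
    "resources": [
        {
            "metadata": {"asset_id": "env-aaa"},
            "entity": {"environment": {"display_name": "Default Python 3.7"}},
        },
        {
            "metadata": {"asset_id": "env-bbb"},
            "entity": {"environment": {"display_name": "Default Python 3.8"}},
        },
    ]
}

found = find_environment_id(sample_listing, "Default Python 3.8")
print(found)
```

A `None` result corresponds to the case where the environment name does not exist in the project, which is worth checking before the ID is used in later commands.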

### 3.2 Upload the .ipynb file

The notebook to copy is *Notebook1.ipynb*. You can either create a new notebook with this name or change the name in the code to match a notebook in your working directory.

In [None]:
remote_file_path = "notebook/Notebook1.ipynb"
local_file_path = "Notebook1.ipynb"

In [None]:
! cpdctl asset file upload --path {remote_file_path} --file {local_file_path} --project-id {project_id}
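
Before uploading, it can help to confirm that the local file exists and looks like a notebook. The sketch below checks only that the file parses as JSON and contains a `cells` list and an `nbformat` key. It is a lightweight sanity check, not full schema validation, and the helper name `looks_like_notebook` is hypothetical.

```python
import json
import os
import tempfile


def looks_like_notebook(path):
    """Return True if `path` is readable JSON with a 'cells' list and an 'nbformat' key."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("cells"), list)
        and "nbformat" in data
    )


# Demonstration with a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "Notebook1.ipynb")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}, f)
    valid = looks_like_notebook(demo_path)
    missing = looks_like_notebook(os.path.join(tmp, "missing.ipynb"))

print(valid, missing)
```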

### 3.3 Create a notebook asset: associate environment runtime with the notebook file

Notebooks in Watson Studio must have metadata, such as the environment associated with the notebook. The following code specifies the environment that will be used for the copied notebook.  

In [None]:
file_name = "Notebook1.ipynb"

runtime = {
    'environment': env_id
}
runtime_json = json.dumps(runtime)

originate = {
    'type': 'blank'
}
originate_json = json.dumps(originate)

In [None]:
result = ! cpdctl notebook create --file-reference {remote_file_path} --name {file_name} --project {project_id} --runtime '{runtime_json}' --originates-from '{originate_json}' --output json -j "metadata.asset_id" --raw-output
notebook_id = result.s
print("notebook id: {}".format(notebook_id))
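
The command above passes JSON through the shell inside single quotes, which breaks if a value ever contains a quote character. One alternative, sketched here as an assumption rather than the sample's own approach, is to build an argument list and hand it to `subprocess.run`, which bypasses shell quoting. Only flags that already appear in the cell above are used. The function name and the placeholder IDs are hypothetical.

```python
import json
import shlex


def notebook_create_argv(remote_file_path, file_name, project_id, env_id):
    """Build the argument list for `cpdctl notebook create` without shell quoting."""
    runtime_json = json.dumps({"environment": env_id})
    originate_json = json.dumps({"type": "blank"})
    return [
        "cpdctl", "notebook", "create",
        "--file-reference", remote_file_path,
        "--name", file_name,
        "--project", project_id,
        "--runtime", runtime_json,
        "--originates-from", originate_json,
        "--output", "json",
        "-j", "metadata.asset_id",
        "--raw-output",
    ]


# Placeholder IDs for illustration only.
argv = notebook_create_argv(
    "notebook/Notebook1.ipynb",
    "Notebook1.ipynb",
    "example-project-id",
    "example-env-id",
)

# shlex.join renders a shell-safe string, useful for logging the command.
print(shlex.join(argv))
```

With `cpdctl` on the `PATH`, the list could be executed with `subprocess.run(argv, capture_output=True, text=True, check=True)`, and the notebook ID read from the stripped `stdout`.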

<span style="color:red">Important Note: Check the target project. *Notebook1* should show up in the Notebook section of the Assets tab. </span>

## 4. Create and run a notebook job

In [None]:
result = ! cpdctl notebook version create --notebook-id {notebook_id} --output json -j "metadata.guid" --raw-output
version_id = result.s
print("version id: {}".format(version_id))

In [None]:
job_name = "cpdctl-test-job"
job = {
    'asset_ref': notebook_id, 
    'configuration': {
        'env_id': env_id, 
        'env_variables': [
           # 'foo=1', 
           # 'bar=2'
        ]
    }, 
    'description': 'my job', 
    'name': job_name
}
job_json = json.dumps(job)
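
The commented-out entries suggest that `env_variables` takes `KEY=value` strings. Assuming that format, a small helper can build the job payload from an ordinary dictionary. The function `build_job_payload` and the placeholder IDs are hypothetical, and the payload keys mirror the `job` dictionary above.

```python
import json


def build_job_payload(name, notebook_id, env_id, env_vars=None, description="my job"):
    """Return a job payload dict with env_vars rendered as 'KEY=value' strings."""
    env_variables = ["{}={}".format(k, v) for k, v in (env_vars or {}).items()]
    return {
        "asset_ref": notebook_id,
        "configuration": {
            "env_id": env_id,
            "env_variables": env_variables,
        },
        "description": description,
        "name": name,
    }


# Placeholder IDs for illustration only.
payload = build_job_payload(
    "cpdctl-test-job",
    "example-notebook-id",
    "example-env-id",
    env_vars={"foo": 1, "bar": 2},
)
example_job_json = json.dumps(payload)
print(example_job_json)
```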

In [None]:
result = ! cpdctl job create --job '{job_json}' --project-id {project_id} --output json -j "metadata.asset_id" --raw-output
job_id = result.s
print("job id: {}".format(job_id))

In [None]:
job_run = {
    'configuration': {
        'env_variables': [
            #'key1=value1', 
            #'key2=value2'
        ]
    }
}
job_run_json = json.dumps(job_run)

In [None]:
result = ! cpdctl job run create --project-id {project_id} --job-id {job_id} --job-run '{job_run_json}' --output json -j "metadata.asset_id" --raw-output
run_id = result.s
print("run id: {}".format(run_id))

In [None]:
! cpdctl job run logs --job-id {job_id} --run-id {run_id} --project-id {project_id}

<span style="color:red">Important Note: Check the target project. The job for *Notebook1* should be shown as running on the Jobs tab. </span>