# Offloading workload from Cloud Pak for Data to Watson ML Accelerator

Notebook created by Kayalvizhi Ganesan and Kelvin Lui, Nov 2020.

This notebook uses the Watson Machine Learning (WML) API Python client library. You will learn to download a sample PyTorch MNIST model and train it with Watson Machine Learning Accelerator (WMLA) to identify handwritten digits in images. Then you will save and deploy the model from your workspace and use WML to score it. Lastly, you will clean up by deleting your deployment. This notebook runs on CPD 3.5.0.

## Table of Contents:

1. [Setup](#setup)<br>
    1.1 [Initialize python client](#Initialize-python-client)<br>
    1.2 [Set default space](#Set-default-space)<br>
    1.3 [Create library](#Create-library)<br>
2. [Train the model](#train)<br>
    2.1 [Create training](#Create-training)<br>
    2.2 [Monitor training](#Monitor-training)<br>
3. [Save and deploy the model](#deploy)<br>
    3.1 [Save Model to project](#Save-Model-to-project)<br>
    3.2 [Create Deployment](#Create-Deployment)<br>
4. [Scoring the model](#score)<br>
    4.1 [Score deployment](#Score-deployment)<br>
    4.2 [Prediction accuracy](#Prediction-accuracy)<br>
    4.3 [Interact with wider project](#Interact-with-wider-project)<br>
5. [Clean up resources](#Clean-up-resources)



![options](https://github.com/IBM/wmla-learning-path/raw/master/shared-images/CPD-WMLA_3_pythonclient.png)

<a id = "setup"></a>
## Setup

Run this cell if the Watson Machine Learning Python client is not installed. The install line for the older V4 client package is left commented out for reference.

In [1]:
#!pip install --upgrade watson-machine-learning-client-V4
!pip install ibm-watson-machine-learning



### Initialize python client

In [None]:
import sys,os,os.path

Authentication methods:

- If you use the same user ID on both CPD and WMLA, use the **token** authentication method
- If you use different user IDs on CPD and WMLA, use the **user** authentication
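
The choice between the two authentication methods can be sketched as a small helper. This is a minimal illustration, not part of the WML client: the function name `build_wml_credentials` is hypothetical, the example URL and token are placeholders, and the `instance_id` and `version` defaults are simply copied from the credentials cell in this notebook.

```python
def build_wml_credentials(url, token=None, username=None, password=None,
                          instance_id="openshift", version="4.0"):
    """Build a credentials dict using token or user authentication.

    Hypothetical helper: prefers the token when one is given, otherwise
    falls back to username and password.
    """
    credentials = {
        "instance_id": instance_id,
        "url": url,
        "version": version,
    }
    if token:
        # Same user ID on CPD and WMLA: token authentication
        credentials["token"] = token
    elif username and password:
        # Different user IDs on CPD and WMLA: user authentication
        credentials["username"] = username
        credentials["password"] = password
    else:
        raise ValueError("Provide either a token or both username and password")
    return credentials


# Placeholder values for illustration only
example = build_wml_credentials("https://cpd.example.com", token="my-token")
print(sorted(example))
```

Keeping the two paths in one function makes it harder to send a token and a username/password pair together by accident.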

Update the username and password used to access WMLA in the cell below, along with the URL of the local CPD cluster running WML:

In [None]:
# @hidden_cell

# Authenticate using token for a user which exists on both the CPD and WMLA cluster.
# Otherwise specify username and password

token = os.environ['USER_ACCESS_TOKEN']
# USERNAME = 'kelvin'
# PASSWORD = 'kelvinpassword'
URL = os.environ['RUNTIME_ENV_APSX_URL']


wml_credentials = {
    "token": token,
#     'username': USERNAME,
#     'password': PASSWORD,
    "instance_id" : "openshift",
    "url": URL,
    "version": "4.0"
}

# Note on the credentials:
# `version` corresponds to the CPD or OpenShift version

In [4]:
# This can take time to run - be patient :)
# Creating an instance of python client

from ibm_watson_machine_learning import APIClient
client = APIClient(wml_credentials)
client.version

'1.0.141'

### Set default space

Set the default space for CPD usage if needed. Note that this unsets the project ID.

In [5]:
# creating deployment space
meta_props = {
    client.spaces.ConfigurationMetaNames.NAME: "MSK Test Space Wendy"
}

space = client.spaces.store(meta_props)
#space_id = client.spaces.get_uid(space)
space_id = client.spaces.get_id(space)
print("Space id: {}".format(space_id))
# all assets created after this cell will be stored in this space
client.set.default_space(space_id)

Space has been created. However some background setup activities might still be on-going. Check for 'status' field in the response. It has to show 'active' before space can be used. If its not 'active', you can monitor the state with a call to spaces.get_details(space_id)
Space id: ddd3bbfc-1ab8-4e6b-ac97-0dc60764ad51


'SUCCESS'
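
The message above says the space must report an `active` status before it can be used. One way to wait for that is to poll `spaces.get_details(space_id)`, as the message suggests. The sketch below is a hypothetical helper, not part of the WML client, and it assumes the state is found at `details['entity']['status']['state']`; adjust the lookup if your response is shaped differently.

```python
import time


def wait_for_space_active(client, space_id, timeout=300, interval=5):
    """Poll spaces.get_details until the space state is 'active'.

    Assumption: the state lives at details['entity']['status']['state'].
    Raises TimeoutError if the space is not active within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        details = client.spaces.get_details(space_id)
        state = details.get("entity", {}).get("status", {}).get("state")
        if state == "active":
            return details
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Space {} not active after {}s (last state: {})".format(
                    space_id, timeout, state))
        time.sleep(interval)
```

A call such as `wait_for_space_active(client, space_id)` placed before `client.set.default_space(space_id)` would block until the space is ready.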

In [None]:
# client.spaces.list()
# space_id = "a29aca9b-948c-4ba9-945d-da56bba7a1c8"
# client.set.default_space(space_id)

In [None]:
# In case you need to clear up any old spaces before this session

# client.spaces.delete('<GUID_HERE>')

View the predefined runtimes available.

In [None]:
# Retrieve the existing runtimes in the CPD cluster, where training and deployment will happen
#client.software_specifications.list(pre_defined=True, limit=100)
client.software_specifications.list()

### Create library

Download the zip file containing the training Python code. The code can be modified to use locally stored files.

We are using a pre-created sample zip file which contains the following: 
```
pytorch_onnx_v_1.1
├── emetrics.py
└── pytorch_v_1.1_mnist_onnx.py
```

This code trains a PyTorch model on the MNIST dataset, which is downloaded to the WMLA cluster when the workload runs.

In [7]:
swid = client.software_specifications.get_id_by_name("pytorch-onnx_1.3-py3.7-edt")
print (swid)

069ea134-3346-5748-b513-49120e15d288


In [8]:
import requests

model_content_resp = requests.get("https://github.com/calinrc/model_definitions/raw/master/libs/pytorch_onnx_v_1.3.zip",
                                  headers={"Content-Type": "application/octet-stream"})
model_content_resp.raise_for_status()  # stop early if the download failed
with open("pytorch_onnx_v_1.3.zip", "wb") as f:
    f.write(model_content_resp.content)
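
Before storing the archive as a model definition, it can help to confirm that it actually contains the training script that the training command will name. The following is a minimal sketch using only the standard library; the helper name `find_script_in_zip` is hypothetical.

```python
import zipfile


def find_script_in_zip(zip_path, script_name):
    """Return the archive members whose file name equals script_name.

    zip_path may be a path on disk or a binary file-like object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [
            member for member in archive.namelist()
            # Compare only the last path component, ignoring folders
            if member.rsplit("/", 1)[-1] == script_name
        ]
```

For example, `find_script_in_zip("pytorch_onnx_v_1.3.zip", "pytorch_v_1.1_mnist_onnx.py")` returns an empty list if the script is missing from the downloaded archive.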

In [None]:
#client.runtimes.get_library_details(limit=10)

In [10]:
# Store the model definition as a custom library, which we will use for training.
meta_props = {
    client.model_definitions.ConfigurationMetaNames.NAME: "pytorch mnist library 4",
    #client.model_definitions.ConfigurationMetaNames.FILEPATH: "./pytorch_onnx_v_1.1.zip",
    client.model_definitions.ConfigurationMetaNames.PLATFORM: {"name": "python",
                                               "versions": ["3.7"]},
    client.model_definitions.ConfigurationMetaNames.VERSION: "1",
    client.model_definitions.ConfigurationMetaNames.COMMAND: "pytorch_v_1.1_mnist_onnx.py --epochs 3 --debug-level debug"
}

model_definition_details = client.model_definitions.store("./pytorch_onnx_v_1.3.zip",meta_props)

model_definition_id = client.model_definitions.get_uid(model_definition_details)

In [14]:
# Not needed with this client: calling client.runtimes.get_library_href(library) raises
# AttributeError: 'APIClient' object has no attribute 'runtimes'
# library_href = client.runtimes.get_library_href(library)

In [20]:
for element in client.software_specifications.get_details()['resources']:
    image_name = element['metadata']['name']
    if 'pytorch' in image_name:
        print(image_name)
#         print(element)
        print('-'*80)

pytorch-onnx_1.3-py3.7-edt
--------------------------------------------------------------------------------
pytorch_1.1-py3.6
--------------------------------------------------------------------------------
pytorch-onnx_1.7-py3.8-edt
--------------------------------------------------------------------------------
pytorch_1.2-py3.6
--------------------------------------------------------------------------------
pytorch-onnx_1.1-py3.6-edt
--------------------------------------------------------------------------------
pytorch-onnx_1.2-py3.6-edt
--------------------------------------------------------------------------------
pytorch-onnx_1.1-py3.6
--------------------------------------------------------------------------------
pytorch-onnx_1.7-py3.8
--------------------------------------------------------------------------------
pytorch-onnx_1.2-py3.6
--------------------------------------------------------------------------------
pytorch-onnx_1.7-py3.7
-----------------------------------

<a id = "train"></a>
## Train the model

### Create training

In [25]:
# Create the training pipeline using the model definition stored in the library
# The command does not need to be defined here.
# Library is deprecated from CPD 3.0.1
meta_props = {
    client.training.ConfigurationMetaNames.NAME: model_definition_id,
    client.training.ConfigurationMetaNames.DESCRIPTION: "PyTorch training at WMLA",
    client.training.ConfigurationMetaNames.MODEL_DEFINITION: {
     "id": model_definition_id,
     #"href": library_href,
     "runtime": {"href": "/v4/runtimes/pytorch-onnx_1.7-py3.7"},
     #"command": "pytorch_v_1.1_mnist_onnx.py --epochs 10 --debug-level debug",   -> will override training command stored in library
     "software_spec": {
      "name": "pytorch-onnx_1.7-py3.7"
      
     },   
     
     "hardware_spec": {"name": "v100", "nodes": 1},
     "parameters": {
            "name": "pytorch onnx defintion",
            "description": "pytorch onnx defintion"
          }
     },
    client.training.ConfigurationMetaNames.SPACE_UID: space_id,    
    
    client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: [
        {
          "name": "training_input_data",
          "type": "fs",
          "connection": {},
          "location": {
            "path": "wmla-data"
          },
        }
      ],
    client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE: {
        "location": {
          "path": "/spaces/" + space_id + "/assets/trainings"
        },
        "type": "fs"
      }
}

training = client.training.run(meta_props)


In [26]:
training_uid = client.training.get_uid(training)
training_state = client.training.get_status(training_uid)['state']
print("Training uid: {}".format(training_uid))

Training uid: fe5f3a1e-349a-4140-928a-8e2b68e566e7


In [27]:
client.training.get_details(training_uid)

{'metadata': {'created_at': '2021-12-03T05:04:09.795Z',
  'description': 'PyTorch training at WMLA',
  'guid': 'fe5f3a1e-349a-4140-928a-8e2b68e566e7',
  'id': 'fe5f3a1e-349a-4140-928a-8e2b68e566e7',
  'name': 'd48eea04-d412-4364-9c2e-d5cac36a4450',
  'space_id': 'ddd3bbfc-1ab8-4e6b-ac97-0dc60764ad51'},
 'entity': {'description': 'PyTorch training at WMLA',
  'model_definition': {'hardware_spec': {'name': 'v100'},
   'id': 'd48eea04-d412-4364-9c2e-d5cac36a4450',
   'parameters': {'description': 'pytorch onnx defintion',
    'name': 'pytorch onnx defintion'},
   'software_spec': {'name': 'pytorch-onnx_1.7-py3.7'}},
  'name': 'd48eea04-d412-4364-9c2e-d5cac36a4450',
  'results_reference': {'location': {'path': '/spaces/ddd3bbfc-1ab8-4e6b-ac97-0dc60764ad51/assets/trainings',
    'model': '/spaces/ddd3bbfc-1ab8-4e6b-ac97-0dc60764ad51/assets/trainings/fe5f3a1e-349a-4140-928a-8e2b68e566e7/data/model',
    'notebooks_path': '/spaces/ddd3bbfc-1ab8-4e6b-ac97-0dc60764ad51/assets/trainings/fe5f3a1e

### Monitor training

In [28]:
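import time

# Poll every 10 seconds until the training leaves the 'pending'/'running' states
while training_state in ('pending', 'running'):
    print("Current training state: {}".format(training_state))
    time.sleep(10)
    training_state = client.training.get_status(training_uid)['state']

# Report the final state rather than assuming success
print("Training finished with state: {}".format(training_state))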

Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current training state: pending
Current 
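
A training that stays in `pending` for a long time can keep a polling loop running indefinitely. The sketch below adds a timeout to the same polling pattern. It is a hypothetical helper, not part of the WML client; it relies only on `client.training.get_status(training_uid)['state']` as used in this notebook, and treats `pending` and `running` as the only in-progress states, which is an assumption.

```python
import time

# States treated as "still in progress" (assumption based on this notebook)
ACTIVE_STATES = ("pending", "running")


def wait_for_training(client, training_uid, timeout=3600, interval=10):
    """Poll the training state until it is no longer active, or time out.

    Returns the final state string; raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    state = client.training.get_status(training_uid)["state"]
    while state in ACTIVE_STATES:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Training {} still '{}' after {}s".format(
                    training_uid, state, timeout))
        time.sleep(interval)
        state = client.training.get_status(training_uid)["state"]
    return state
```

Checking the returned state before saving the model avoids trying to store results from a training that did not finish successfully.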

<a id = "deploy"></a>
## Save and deploy the model

### Save Model to project

Note that you can save your model to your Watson Studio project, instead of the space we've been working in up to this point, by setting the default project from the `PROJECT_ID` environment variable.

This saved model can then be promoted to the CPD deployment space from the UI (this is not currently possible with the Python client).

In [None]:
# Only use this if you wish to save model to Watson Studio project. Promotion to deployment space / deployment must then be done from UI.

# project_id = os.environ['PROJECT_ID']
# client.set.default_project(project_id)

Otherwise stick with the working space you've been using so far:

In [None]:
# Only use this if you need to change back to default space created in this notebook

# client.set.default_space(space_id)

In [13]:
client.software_specifications.list()


-----------------------------  ------------------------------------  ----
NAME                           ASSET_ID                              TYPE
default_py3.6                  0062b8c9-8b7d-44a0-a9b9-46c416adcbd9  base
pytorch-onnx_1.3-py3.7-edt     069ea134-3346-5748-b513-49120e15d288  base
scikit-learn_0.20-py3.6        09c5a1d0-9c1e-4473-a344-eb7b665ff687  base
spark-mllib_3.0-scala_2.12     09f4cff0-90a7-5899-b9ed-1ef348aebdee  base
ai-function_0.1-py3.6          0cdb0f1e-5376-4f4d-92dd-da3b69aa9bda  base
shiny-r3.6                     0e6e79df-875e-4f24-8ae9-62dcc2148306  base
pytorch_1.1-py3.6              10ac12d6-6b30-4ccd-8392-3e922c096a92  base
scikit-learn_0.22-py3.6        154010fa-5b3b-4ac1-82af-4d5ee5abbc85  base
default_r3.6                   1b70aec3-ab34-4b87-8aa0-a4a3c8296a36  base
tensorflow_1.15-py3.6          2b73a275-7cbf-420b-a912-eae7f436e0bc  base
pytorch_1.2-py3.6              2c8ef57d-2687-4b7d-acce-01f94976dac1  base
spark-mllib_2.3                2e51f70

In [20]:
software_spec_id = client.software_specifications.get_id_by_name('pytorch-onnx_1.3-py3.7')
print (software_spec_id)

8d5d8a87-a912-54cf-81ec-3914adaa988d


In [21]:
meta_props = {
    client.repository.ModelMetaNames.NAME: "Pytorch MNIST model demo",
    client.repository.ModelMetaNames.TRAINING_DATA_REFERENCES: [client.training.get_details(training_uid)["entity"]["results_reference"]],
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_id,
    client.repository.ModelMetaNames.TYPE: "pytorch-onnx_1.3"
}

model = client.repository.store_model(training_uid, meta_props)

ApiRequestFailure: Failure during creating new model. (POST https://cpd-qq-cpd-cpd-qq.apps.wml1x210.ma.platformlab.ibm.com/ml/v4/models?version=2020-08-01)
Status code: 201, body: {
  "entity": {
    "software_spec": {
      "id": "8d5d8a87-a912-54cf-81ec-3914adaa988d",
      "name": "pytorch-onnx_1.3-py3.7"
    },
    "training_data_references": [{
      "connection": {

      },
      "location": {
        "path": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings",
        "model": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings/f0913b0b-e423-43b6-a706-292635b95c7e/data/model",
        "training": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings/f0913b0b-e423-43b6-a706-292635b95c7e",
        "training_status": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings/f0913b0b-e423-43b6-a706-292635b95c7e/training-status.json",
        "logs": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings/f0913b0b-e423-43b6-a706-292635b95c7e/logs",
        "assets_path": "/spaces/076357a0-f85b-4865-be2c-174d59c0f0c6/assets/trainings/f0913b0b-e423-43b6-a706-292635b95c7e/assets"
      },
      "type": "fs"
    }],
    "type": "pytorch-onnx_1.3"
  },
  "metadata": {
    "created_at": "2020-11-18T15:28:02.869Z",
    "id": "8faf869c-2f0a-403a-b1c6-2ce4bed58300",
    "modified_at": "2020-11-18T15:28:02.869Z",
    "name": "Pytorch MNIST model demo",
    "owner": "1000331004",
    "space_id": "076357a0-f85b-4865-be2c-174d59c0f0c6"
  },
  "system": {
    "warnings": []
  }
}
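
The failure above carries status code 201 and a response body that includes a model `id`, which suggests the model may have been stored even though the client raised an error. Under that assumption, the stored model could be located by name in a repository listing. The helper below is a hypothetical sketch, not part of the WML client; it works on a plain dictionary with a `resources` list whose items have `metadata.name` and `metadata.id`, mirroring the `metadata` fields shown in the response above.

```python
def find_model_ids_by_name(models_details, name):
    """Return the ids of all listed models whose metadata name equals `name`.

    models_details is expected to look like
    {"resources": [{"metadata": {"id": ..., "name": ...}}, ...]}.
    """
    return [
        resource["metadata"]["id"]
        for resource in models_details.get("resources", [])
        if resource.get("metadata", {}).get("name") == name
    ]


# Illustration with made-up listing data
listing = {
    "resources": [
        {"metadata": {"id": "id-1", "name": "Pytorch MNIST model demo"}},
        {"metadata": {"id": "id-2", "name": "another model"}},
    ]
}
print(find_model_ids_by_name(listing, "Pytorch MNIST model demo"))
```

If `client.repository.get_model_details()` returns a listing of this shape in your client version (an assumption worth verifying), passing its result to this helper would give candidate values for `model_uid`.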

In [None]:
model_uid = client.repository.get_model_uid(model)

In [None]:
client.repository.list_models()

### Create Deployment

Read more on deploying models here: https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/wmls/wmls-deploy-python.html

In [None]:
client.deployments.get_details()

In [None]:
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "Pytorch MNIST deployment",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

deployment = client.deployments.create(model_uid, meta_props)

In [None]:
deployment_uid = client.deployments.get_uid(deployment)
print("Deployment uid: {}".format(deployment_uid))

<a id = "score"></a>
## Scoring the model

### Score deployment

In [None]:
values = [[[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0.6392157077789307,0.7568628191947937,0.5960784554481506,0.3607843220233917,0.20000001788139343,0.20000001788139343,0.20000001788139343,0.20000001788139343,0.12156863510608673,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.08235294371843338,0.874509871006012,0.9921569228172302,0.988235354423523,0.9921569228172302,0.988235354423523,0.9921569228172302,0.988235354423523,0.9921569228172302,0.7529412508010864,0.32156863808631897,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.20000001788139343,0.9921569228172302,0.40000003576278687,0,0.08235294371843338,0.40000003576278687,0.24313727021217346,0.40000003576278687,0.40000003576278687,0.2392157018184662,0.7176470756530762,0.1568627506494522,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.20000001788139343,0.988235354423523,0.40000003576278687,0,0,0,0,0,0,0,0.2392157018184662,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.20000001788139343,0.9921569228172302,0.40000003576278687,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.3607843220233917,0.988235354423523,0.40000003576278687,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.6784313917160034,0.9921569228172302,0.40000003576278687,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0.9921569228172302,0.988235354423523,0.874509871006012,0.7960785031318665,0.7960785031318665,0.7960785031318665,0.32156863808631897,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0.48235297203063965,0.9960784912109375,0.9921569228172302,0.8784314393997192,0.7960785031318665,0.7960785031318665,0.874509871006012,0.9960784912109375,0.27843138575553894,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0.16078431904315948,0.9529412388801575,0.9921569228172302,0.5098039507865906,0.0784313753247261,0,0,0.0784313753247261,0.9921569228172302,0.9098039865493774,0.16078431904315948,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0.5960784554481506,0.9921569228172302,0.7176470756530762,0,0,0,0,0,0.5176470875740051,0.9921569228172302,0.40000003576278687,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0.20000001788139343,0.5921568870544434,0.0784313753247261,0,0,0,0,0,0.20000001788139343,0.988235354423523,0.40000003576278687,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0.08235294371843338,0,0,0,0,0,0,0,0.4431372880935669,0.9921569228172302,0.40000003576278687,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0.32156863808631897,0.7176470756530762,0,0,0,0,0,0,0,0.7568628191947937,0.988235354423523,0.40000003576278687,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0.7960785031318665,0.7176470756530762,0,0,0,0,0,0,0.08235294371843338,0.9960784912109375,0.9921569228172302,0.16078431904315948,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0.08235294371843338,0.874509871006012,0.40000003576278687,0,0,0,0,0,0.08235294371843338,0.7960785031318665,0.9921569228172302,0.5098039507865906,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0.8000000715255737,0.48235297203063965,0,0,0,0,0.16078431904315948,0.6784313917160034,0.9921569228172302,0.7960785031318665,0.0784313753247261,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0.6352941393852234,0.874509871006012,0.40000003576278687,0.08235294371843338,0.40000003576278687,0.6392157077789307,0.9529412388801575,0.9921569228172302,0.6705882549285889,0.0784313753247261,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0.16078431904315948,0.917647123336792,0.9921569228172302,1,0.9921569228172302,1,0.6745098233222961,0.32156863808631897,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0.11764706671237946,0.5137255191802979,0.7529412508010864,0.43529415130615234,0.19607844948768616,0.03921568766236305,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]]]

meta_props = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{"values": values}]
}    
predictions = client.deployments.score(deployment_uid, meta_props)
print("Predictions returned\n{}".format(predictions))
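
The `values` payload above is a four-level nested list: one image, one channel, and a 28 x 28 grid of pixel intensities. A quick shape check before scoring can catch a malformed payload early. The helper below is a minimal, hypothetical sketch in plain Python; it only follows the first element at each level, so it does not verify that every row has the same length.

```python
def nested_shape(data):
    """Return the shape of a nested list by following the first element
    at each level, e.g. (1, 1, 28, 28) for a single MNIST image."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return tuple(shape)


# A blank single-channel 28 x 28 image wrapped in a batch of one
sample = [[[[0.0] * 28 for _ in range(28)]]]
print(nested_shape(sample))  # (1, 1, 28, 28)
```

Running `nested_shape(values)` on the payload from the scoring cell shows whether it has the same four-level layout as this sample.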

### Prediction accuracy

In [None]:
highest_log_probability = max(predictions['predictions'][0]['values'][0])
prediction = predictions['predictions'][0]['values'][0].index(highest_log_probability)
print("We predict the picture below is a {}".format(prediction))

from matplotlib import pyplot as plt 
import numpy as np

first_image = np.array(values[0][0], dtype='float')
plt.imshow(first_image, cmap='gray')
plt.show()
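
The variable name `highest_log_probability` suggests that the model returns log-probabilities, one per digit class. Assuming that is the case, the predicted class and its probability can be recovered by taking the index of the largest value and exponentiating it. The helper below is a hypothetical sketch using only the standard library, with made-up scores for illustration.

```python
import math


def top_prediction(log_probs):
    """Return (class_index, probability) for the largest log-probability.

    Assumes log_probs holds natural-log probabilities, one per class.
    """
    best_index = max(range(len(log_probs)), key=lambda i: log_probs[i])
    return best_index, math.exp(log_probs[best_index])


# Made-up scores for three classes with probabilities 0.1, 0.7, 0.2
scores = [math.log(0.1), math.log(0.7), math.log(0.2)]
index, probability = top_prediction(scores)
print("class {} with probability {:.2f}".format(index, probability))
```

With the scoring response in this notebook, the input would be `predictions['predictions'][0]['values'][0]`.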

### Interact with wider project 

In [None]:
from project_lib import Project
project = Project.access()
storage_credentials = project.get_storage_metadata()

In [None]:
project.get_name()

## Clean up resources

In [None]:
print("Deployment deletion: {}".format(client.deployments.delete(deployment_uid)))
print("Space deletion: {}".format(client.spaces.delete(space_id)))


In [None]:
# Optional: delete another leftover space by its ID (replace the GUID with your own)
print("Space deletion: {}".format(client.spaces.delete('4b8f7e78-81e6-40b3-a6d6-0057f0771695')))

<a id = "summary"></a>
## Summary

Congratulations! You have learned to:

1. Download the PyTorch MNIST model
2. Create a Watson Machine Learning model by using the PyTorch model
3. Train the model by offloading work to Watson Machine Learning Accelerator
4. Save and deploy from your workspace
5. Score the model
6. Clean up