# Use Keras and Core ML to recognize hand-written digits with `ibm-watson-machine-learning` 

This notebook contains steps and code to demonstrate support for deep learning experiments in the Watson Machine Learning service. It introduces commands for getting data, model definition persistence, experiment training, model persistence, model deployment and scoring.

Some familiarity with Python is helpful. This notebook uses Python.


## Learning goals

The learning goals of this notebook are:

-  Working with Watson Machine Learning service.
-  Training Deep Learning models (TensorFlow).
-  Saving trained models in Watson Machine Learning repository.
-  Online deployment and scoring of trained model.

## Contents

This notebook contains the following parts:

1.	[Setup](#setup)
2.	[Create model definition](#model_def)
3.	[Train model](#training)
4.  [Persist trained model](#persist)
5.	[Deploy the model via CoreML](#deploy)
6.  [Clean up](#clean)
7.	[Summary and next steps](#summary)

<a id="setup"></a>
## 1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href="https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-service-instance.html?context=analytics" target="_blank" rel="noopener no referrer">here</a>).

### Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud. You need to provide platform `api_key` and instance `location`.

You can use [IBM Cloud CLI](https://cloud.ibm.com/docs/cli/index.html) to retrieve platform API Key and instance location.

API Key can be generated in the following way:
```
ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME
```

From the output, copy the value of `api_key`.


Location of your WML instance can be retrieved in the following way:
```
ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance WML_INSTANCE_NAME
```

From the output, copy the value of `location`.

**Tip**: Your `Cloud API key` can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below. You can also get a service specific url by going to the [**Endpoint URLs** section of the Watson Machine Learning docs](https://cloud.ibm.com/apidocs/machine-learning).  You can check your instance location in your  <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance details.

You can also get a service-specific API key by going to the [**Service IDs** section of the Cloud Console](https://cloud.ibm.com/iam/serviceids).  From that page, click **Create**, then copy the created key and paste it below.

**Action**: Enter your `api_key` and `location` in the following cell.

In [None]:
api_key = 'PASTE YOUR PLATFORM API KEY HERE'
location = 'PASTE YOUR INSTANCE LOCATION HERE'

In [None]:
wml_credentials = {
    "apikey": api_key,
    "url": 'https://' + location + '.ml.cloud.ibm.com'
}
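
To keep the API key out of the notebook itself, the same dictionary could be built from environment variables. This is only a minimal sketch: the helper `build_wml_credentials` and the variable names `WML_API_KEY` and `WML_LOCATION` are hypothetical and not part of the SDK, and `us-south` is used purely as an illustrative location.

```python
import os


def build_wml_credentials(api_key, location):
    """Return the credentials dict in the shape used by this notebook."""
    # Reject empty values and the placeholder strings from the cell above.
    if not api_key or "PASTE" in api_key:
        raise ValueError("api_key is missing or still a placeholder")
    if not location or "PASTE" in location:
        raise ValueError("location is missing or still a placeholder")
    return {
        "apikey": api_key,
        "url": "https://" + location + ".ml.cloud.ibm.com",
    }


# Hypothetical environment variable names; fall back to illustrative values.
example_credentials = build_wml_credentials(
    os.environ.get("WML_API_KEY") or "example-key",
    os.environ.get("WML_LOCATION") or "us-south",
)
print(example_credentials["url"])
```

Validating the values early gives a clearer error than a failed authentication later in the notebook.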

### Install and import the `ibm-watson-machine-learning` package
**Note:** `ibm-watson-machine-learning` documentation can be found <a href="http://ibm-wml-api-pyclient.mybluemix.net/" target="_blank" rel="noopener no referrer">here</a>.

In [None]:
!pip install ibm-watson-machine-learning

In [None]:
from ibm_watson_machine_learning import APIClient

client = APIClient(wml_credentials)

### Working with spaces

First of all, you need to create a space that will be used for your work. If you do not have a space already, you can use the [Deployment Spaces Dashboard](https://dataplatform.cloud.ibm.com/ml-runtime/spaces?context=cpdaas) to create one.

- Click New Deployment Space
- Create an empty space
- Select Cloud Object Storage
- Select Watson Machine Learning instance and press Create
- Copy `space_id` and paste it below

**Tip**: You can also use SDK to prepare the space for your work. More information can be found [here](https://github.com/IBM/watson-machine-learning-samples/blob/master/cloud/notebooks/python_sdk/instance-management/Space%20management.ipynb).

**Action**: Assign space ID below

In [None]:
space_id = 'PASTE YOUR SPACE ID HERE'

You can use the `list` method to print all existing spaces.

In [None]:
client.spaces.list(limit=10)

To be able to interact with all resources available in Watson Machine Learning, you need to set the **space** that you will be using.

In [None]:
client.set.default_space(space_id)

### 1.1 Working with Cloud Object Storage

-  Create a [Cloud Object Storage (COS)](https://console.bluemix.net/catalog/infrastructure/cloud-object-storage) instance (a lite plan is offered).
    - After you create the COS instance, go to your COS dashboard.
    - In the "Service credentials" tab, click "New Credential".
    - Add the inline configuration parameter `{"HMAC":true}` and click "Add".

    This configuration parameter adds the following section to the instance credentials, which will be used later on:
    ```
      "cos_hmac_keys": {
            "access_key_id": "***",
            "secret_access_key": "***"
       }
    ```

The Boto library allows Python developers to manage Cloud Object Storage.

**Note:** If `ibm_boto3` is not preinstalled in your environment, install it by running the following command: `!pip install ibm-cos-sdk`

In [None]:
import ibm_boto3
from ibm_botocore.client import Config
import os
import json
import warnings
warnings.filterwarnings('ignore')

We define the endpoint we will use. You can find this information in the "Endpoint" section of your Cloud Object Storage instance's dashboard.

In [None]:
cos_credentials = {
  "apikey": "***",
  "cos_hmac_keys": {
    "access_key_id": "***",
    "secret_access_key": "***"
  },
  "endpoints": "***",
  "iam_apikey_description": "***",
  "iam_apikey_name": "***",
  "iam_role_crn": "***",
  "iam_serviceid_crn": "***",
  "resource_instance_id": "***"
}

api_key = cos_credentials['apikey']
service_instance_id = cos_credentials['resource_instance_id']
auth_endpoint = "https://iam.bluemix.net/oidc/token/"
service_endpoint = "https://s3.us-west.cloud-object-storage.test.appdomain.cloud"

We create a Boto resource by providing the type, `endpoint_url` and credentials.

In [None]:
cos = ibm_boto3.resource('s3',
                         ibm_api_key_id=api_key,
                         ibm_service_instance_id=service_instance_id,
                         ibm_auth_endpoint=auth_endpoint,
                         config=Config(signature_version='oauth'),
                         endpoint_url=service_endpoint)

Let's create the buckets we will use to store training data and training results.

**Note:** Bucket names have to be unique. Please change the names below to unique ones of your own.
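
One way to obtain names that are unlikely to collide is to append a short random suffix to a fixed prefix. The sketch below assumes nothing beyond the standard library; the helper `unique_bucket_names` and the prefixes are hypothetical, and the resulting list could be used in place of the hard-coded `buckets` list in the following cell.

```python
import uuid


def unique_bucket_names(prefixes):
    """Append one shared random suffix so the names are unlikely to collide."""
    suffix = uuid.uuid4().hex[:8]  # 8 lowercase hexadecimal characters
    return ["{}-{}".format(prefix, suffix) for prefix in prefixes]


# Illustrative prefixes; both names share the same suffix so they are easy to pair up.
example_buckets = unique_bucket_names(["tf-keras-data", "tf-keras-results"])
print(example_buckets)
```

Keep the prefixes lowercase and short, because bucket names are limited to lowercase letters, digits and hyphens and to 63 characters.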

In [None]:
buckets = ['tf-keras-data-example-4', 'tf-keras-results-example-4']
for bucket in buckets:
    if cos.Bucket(bucket) not in cos.buckets.all():
        print('Creating bucket "{}"...'.format(bucket))
        try:
            cos.create_bucket(Bucket=bucket)
        except ibm_boto3.exceptions.ibm_botocore.client.ClientError as e:
            print('Error: {}.'.format(e.response['Error']['Message']))

Now we should have our buckets created.

In [None]:
print(list(cos.buckets.limit(50)))

### 1.2 Downloading MNIST data and upload it to COS buckets

We will work with the Keras **MNIST** sample dataset. Let's download the training data and upload it to the first bucket created above (`buckets[0]`).

The cell below creates the 'MNIST_KERAS_DATA' folder and downloads the file from the link.

**Note:** First, install the wget library with the command below:
`!pip install wget`

In [None]:
link = 'https://s3.amazonaws.com/img-datasets/mnist.npz'

In [None]:
import wget

data_dir = 'MNIST_KERAS_DATA'
if not os.path.isdir(data_dir):
    os.mkdir(data_dir)

if not os.path.isfile(os.path.join(data_dir, link.split('/')[-1])):
    wget.download(link, out=data_dir)
        
!ls MNIST_KERAS_DATA

Upload the data files to the created bucket.

In [None]:
bucket_name = buckets[0]
bucket_obj = cos.Bucket(bucket_name)

for filename in os.listdir(data_dir):
    # upload_file takes a local path, so the file does not need to be opened first
    bucket_obj.upload_file(os.path.join(data_dir, filename), filename)
    print('{} is uploaded.'.format(filename))

Let's list the objects in the data bucket and their sizes.

In [None]:
for obj in bucket_obj.objects.all():
    print('Object key: {}'.format(obj.key))
    print('Object size (kb): {}'.format(obj.size/1024))

We are done with Cloud Object Storage and are ready to train our model!

<a id="model_def"></a>
## 2. Create model definition

For the purpose of this example, two Keras model definitions have been prepared:

 - Multilayer Perceptron (MLP)
 - Convolutional Neural Network (CNN)

### 2.1 Prepare model definition metadata

In [None]:
metaprops = {
    client.model_definitions.ConfigurationMetaNames.NAME: "MNIST cnn model definition",
    client.model_definitions.ConfigurationMetaNames.DESCRIPTION: "MNIST cnn model definition",
    client.model_definitions.ConfigurationMetaNames.COMMAND: "python3 mnist_cnn.py",
    client.model_definitions.ConfigurationMetaNames.PLATFORM: {"name": "python", "versions": ["3.7"]},
    client.model_definitions.ConfigurationMetaNames.VERSION: "2.0",
    client.model_definitions.ConfigurationMetaNames.SPACE_UID: space_id
}

### 2.2 Get sample model definition content files from GitHub (Python scripts with CNN and MLP)

In [None]:
filename_mnist = 'MNIST.zip'

if not os.path.isfile(filename_mnist):
    filename_mnist = wget.download('https://github.com/pmservice/wml-sample-models/raw/master/keras/mnist/MNIST.zip')

**Tip**: Convert the cell below to code and run it to see the model definition's code.
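
The archive can also be inspected directly with the standard `zipfile` module. The following is a minimal sketch; the helper names `list_archive_members` and `read_archive_member` are hypothetical, and the file names inside `MNIST.zip` are not assumed here beyond the `mnist_cnn.py` script referenced in the metadata above.

```python
import zipfile


def list_archive_members(archive_path):
    """Return the names of all files stored in a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


def read_archive_member(archive_path, member_name):
    """Return the text content of one file inside the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.read(member_name).decode("utf-8")


# Usage in the notebook (uncomment after MNIST.zip has been downloaded):
# for name in list_archive_members(filename_mnist):
#     print(name)
```

Once the member names are known, `read_archive_member` can print the training script without unpacking the archive to disk.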

### 2.3 Publish model definition

In [None]:
model_definition_details = client.model_definitions.store(filename_mnist, meta_props=metaprops)

In [None]:
model_definition_id = client.model_definitions.get_id(model_definition_details)
print(model_definition_id)

#### List model definitions

In [None]:
client.model_definitions.list(limit=20)

<a id="training"></a>
## 3. Train model

### 3.1 Prepare training metadata

In [None]:
training_metadata = {
    client.training.ConfigurationMetaNames.NAME: "Keras-MNIST",
    client.training.ConfigurationMetaNames.SPACE_UID: space_id,
    client.training.ConfigurationMetaNames.DESCRIPTION: "Keras-MNIST predict written digits cnn",
    client.training.ConfigurationMetaNames.TAGS :[{
      "value": "MNIST",
      "description": "predict written digits"
    }],
    client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE:  {
    "name": "MNIST results",
    "connection": {
            "endpoint_url": service_endpoint,
            "access_key_id": cos_credentials['cos_hmac_keys']['access_key_id'],
            "secret_access_key": cos_credentials['cos_hmac_keys']['secret_access_key']
      },
      "location": {
        "bucket": buckets[1]
      },
    "type": "s3"
  },
  client.training.ConfigurationMetaNames.MODEL_DEFINITION:{
        "id": model_definition_id,
        "command": "python3 mnist_cnn.py",
        "hardware_spec": {
          "name": "K80",
          "nodes": 1
        },
        "software_spec": {
          "name": "tensorflow_2.1-py3.7"
        },
        "parameters": {
          "name": "MNIST cnn",
          "description": "Simple MNIST cnn model"
        }
      },
  client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: [
       {
      "name": "training_input_data",
      "type": "s3",
      "connection": {
        "endpoint_url": service_endpoint,
        "access_key_id": cos_credentials['cos_hmac_keys']['access_key_id'],
        "secret_access_key": cos_credentials['cos_hmac_keys']['secret_access_key']
      },
      "location": {
        "bucket": buckets[0]
      },
      "schema": {
        "id":"idmlp_schema",
        "fields": [
          {
            "name": "text",
            "type": "string"
          }
        ]
      }
    }
  ]
}
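
The results reference and the training data reference above repeat the same connection block. As a sketch of how that duplication could be factored out, the hypothetical helper `s3_reference` below builds one such dictionary; it only mirrors the structure shown in the cell above and is not part of the SDK.

```python
def s3_reference(name, bucket, endpoint_url, hmac_keys):
    """Build an S3-style data reference in the shape used by training_metadata."""
    return {
        "name": name,
        "type": "s3",
        "connection": {
            "endpoint_url": endpoint_url,
            "access_key_id": hmac_keys["access_key_id"],
            "secret_access_key": hmac_keys["secret_access_key"],
        },
        "location": {"bucket": bucket},
    }


# Illustrative values only; in the notebook these come from cos_credentials,
# service_endpoint and the buckets list.
example_ref = s3_reference(
    "MNIST results",
    "example-results-bucket",
    "https://example-endpoint",
    {"access_key_id": "***", "secret_access_key": "***"},
)
print(sorted(example_ref))
```

In the notebook, the call would look like `s3_reference("MNIST results", buckets[1], service_endpoint, cos_credentials['cos_hmac_keys'])`; the training data reference additionally carries a `schema` entry that would need to be added to the returned dictionary.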

### 3.2 Train model in background

In [None]:
training = client.training.run(training_metadata)

### 3.3 Get training id and status

In [None]:
training_id = client.training.get_uid(training)

In [None]:
client.training.get_status(training_id)["state"]
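
Because training runs in the background, the status usually has to be checked repeatedly before the model can be downloaded. The sketch below shows one way to poll until a terminal state is reached. The helper `wait_for_state` is hypothetical, and the terminal state names `completed`, `failed` and `canceled` are an assumption about the values the service reports.

```python
import time


def wait_for_state(get_state, terminal_states=("completed", "failed", "canceled"),
                   poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Call get_state() until it returns a terminal state or the timeout passes."""
    waited = 0
    while True:
        state = get_state()
        if state in terminal_states:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError("still in state {!r} after {}s".format(state, waited))
        sleep(poll_seconds)
        waited += poll_seconds


# Usage in the notebook (uncomment once training has been submitted):
# final_state = wait_for_state(lambda: client.training.get_status(training_id)["state"])
# print(final_state)
```

Passing the status lookup as a callable keeps the loop independent of the client, and the `sleep` parameter makes it easy to exercise the logic without real waiting.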

### 3.4 Get training details

In [None]:
training_details = client.training.get_details(training_id)
print(json.dumps(training_details, indent=2))

#### List trainings

In [None]:
client.training.list()

#### Cancel training

You can cancel the training run by calling `client.training.cancel(training_id)`.  
**Tip**: To also delete the training run and its results, add `hard_delete=True` as a parameter.
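
A minimal sketch of the cancellation call follows, assuming that `client.training.cancel` accepts the training ID together with the `hard_delete` flag mentioned in the tip. The wrapper `cancel_training` is hypothetical and only makes the flag explicit.

```python
def cancel_training(client, training_id, hard_delete=False):
    """Cancel a training run; hard_delete=True also removes the run and its results."""
    return client.training.cancel(training_id, hard_delete=hard_delete)


# Usage in the notebook (uncomment to cancel the run started above):
# cancel_training(client, training_id)
```

Leaving `hard_delete` at `False` keeps the run's metadata available for later inspection.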

<a id="persist"></a>
## 4. Persist trained model

### 4.1 Download trained model from COS

In [None]:
uid = client.training.get_details(training_id)['entity']['results_reference']['location']['logs']

#### Download model from COS

In [None]:
bucket_name = buckets[1]
bucket_obj = cos.Bucket(bucket_name)

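model_path = ""
for obj in bucket_obj.objects.iterator():
    if uid in obj.key and obj.key.endswith(".h5"):
        model_path = obj.key
        break

# Fail early with a clear message if training has not produced a model yet
if not model_path:
    raise FileNotFoundError('No .h5 model found under "{}" in bucket "{}".'.format(uid, bucket_name))

model_name = model_path.split("/")[-1]
bucket_obj.download_file(model_path, model_name)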

#### Compress the model to tar.gz format

In [None]:
import tarfile

# The archive name must match the file passed to store_model below
model_name = "mnist_cnn.h5"
with tarfile.open(model_name + ".tar.gz", "w:gz") as tar:
    tar.add(model_name)
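
Before publishing, it can be worth confirming that the archive really contains the model file. The following sketch uses only the standard `tarfile` module; the helper name `list_tar_members` is hypothetical.

```python
import tarfile


def list_tar_members(archive_path):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Usage in the notebook (uncomment after the archive has been created):
# print(list_tar_members("mnist_cnn.h5.tar.gz"))
```

An empty list or an unexpected file name here points to a problem in the download step rather than in the publishing call.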

### 4.2 Publish model

In [None]:
software_spec_uid = client.software_specifications.get_uid_by_name('default_py3.7')

In [None]:
model_meta_props = {
                    client.repository.ModelMetaNames.NAME: "Keras MNIST_CNN",
                    client.repository.ModelMetaNames.TYPE: "tensorflow_2.1",
                    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid
}

published_model = client.repository.store_model(model='mnist_cnn.h5.tar.gz', meta_props=model_meta_props)
model_uid = client.repository.get_model_uid(published_model)

### 4.3 Get model details

In [None]:
model_details = client.repository.get_details(model_uid)
print(json.dumps(model_details, indent=2))

#### List stored models

In [None]:
client.repository.list_models(limit=5)

<a id="deploy"></a>
## 5. Deploy and score

### 5.1 Create a virtual deployment for the published model

In [None]:
deployment = client.deployments.create(model_uid, meta_props={
                                            client.deployments.ConfigurationMetaNames.NAME: "Keras MNIST",
                                            client.deployments.ConfigurationMetaNames.VIRTUAL: {"export_format": "coreml"}})

deployment_uid = client.deployments.get_uid(deployment)

You can list existing deployments by executing the following cell.

In [None]:
client.deployments.list()

### 5.2 Download the virtual deployment content: Core ML model

In [None]:
deployment_content = client.deployments.download(deployment_uid)

Use the code in the cell below to create the download link.

In [None]:
from ibm_watson_machine_learning.utils import create_download_link

create_download_link(deployment_content)

**Note:** You can use <a href="https://developer.apple.com/xcode/" target="_blank" rel="noopener no referrer">Xcode</a> to preview the model's metadata (after unzipping). 

### 5.3 Test the `Core ML` model<a id="testcoreML"></a>

Use the following steps to run a test against the downloaded Core ML model.

In [None]:
!pip install --upgrade coremltools

Use ``coremltools`` to load the model and check some basic metadata.

First, extract the model.

In [None]:
from ibm_watson_machine_learning.utils import extract_mlmodel_from_archive

extracted_model_path = extract_mlmodel_from_archive('mlartifact.tar.gz', model_uid)

Load the model and check the description.

In [None]:
import coremltools

loaded_model = coremltools.models.MLModel(extracted_model_path)
print(loaded_model.get_spec())

Use the following steps to change the model input type for an iOS application.

In [None]:
spec = coremltools.utils.load_spec(extracted_model_path)

Get current input type.

In [None]:
inp = spec.description.input[0]

Change and save the input data type.

In [None]:
inp.type.imageType.height = 28 
inp.type.imageType.width = 28
inp.type.imageType.colorSpace = coremltools.proto.FeatureTypes_pb2.ImageFeatureType.GRAYSCALE

In [None]:
spec.description.input[0]

The model now looks good and can be used on your iPhone. You can save it with the following cell.

In [None]:
coremltools.utils.save_spec(spec, 'mnistCNN.mlmodel') 

<a id="clean"></a>
## 6. Clean up

If you want to clean up all created assets:
- experiments
- trainings
- pipelines
- model definitions
- models
- functions
- deployments

please follow this sample [notebook](https://github.com/IBM/watson-machine-learning-samples/blob/master/cloud/notebooks/python_sdk/instance-management/Machine%20Learning%20artifacts%20management.ipynb).

<a id="summary"></a>
## 7. Summary and next steps

You successfully completed this notebook! You learned how to use `ibm-watson-machine-learning` to run experiments. Check out our _[Online Documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=analytics)_ for more samples, tutorials, documentation, how-tos, and blog posts.

### Author

**Szymon Kucharczyk**, Software Engineer in Watson Machine Learning.

Copyright © 2020 IBM. This notebook and its source code are released under the terms of the MIT License.