<table style="border: none" align="left">
   <tr style="border: none">
      <th style="border: none"><font face="verdana" size="5" color="black"><b>Persist, deploy, and score a PMML model to predict species of iris</b></font></th>
      <th style="border: none"><img src="https://github.com/pmservice/customer-satisfaction-prediction/blob/master/app/static/images/ml_icon_gray.png?raw=true" alt="Watson Machine Learning icon" height="40" width="40"></th>
   </tr>
  <tr style="border: none">
       <th style="border: none"><img src="https://github.com/pmservice/wml-sample-models/blob/master/spark/customer-satisfaction-prediction/images/users_banner_2-03.png?raw=true" width="600" alt="Icon"> </th>
   </tr>
</table>

This notebook demonstrates how to store a sample Predictive Model Markup Language (PMML) model, deploy it, and score test data. 
You will use the Iris data set to predict the species of an iris flower. This data set contains measurements of the sepals and petals (the perianth) of iris flowers. 

Some familiarity with Python is helpful. This notebook uses `watson-machine-learning-client-V4` and is compatible with Watson Studio Local 2.1 and Python 3.6.


## Learning goals

You will learn how to:

-  Set up the Python client.
-  Persist and deploy a PMML model.
-  Score the deployed model.


## Table of Contents

1.	[Setting up](#setup)<br>
2.	[Persist, deploy, and score a PMML model](#scoring)<br>
    2.1 [Persist a PMML model](#persist)<br>
    2.2 [Create a batch deployment](#deploy)<br>
    2.3 [Score a test data record](#score)<br>
3.	[Summary and next steps](#summary)

<a id="setup"></a>
## 1. Setting up

To get started on Watson Studio Local 2.1, find documentation on installation and setup <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/local/welcome.html" target="_blank" rel="noopener noreferrer">here</a>.

**Authenticate the Python client on Watson Studio Local 2.1.**

<div class="alert alert-block alert-info">To find your authentication information (your credentials), follow the steps provided in the <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/analyze-data/ml-authentication-local.html" target="_blank" rel="noopener noreferrer">documentation</a>.</div>

**Action**: Enter your credentials in the following configurations.

In [1]:
# Enter your credentials here.
import os
token = os.environ['USER_ACCESS_TOKEN']

from project_lib.utils import environment
url = environment.get_common_api_url()

wml_credentials = {
    "token": token,
    "instance_id" : "wml_local",
    "url": url,
    "version": "2.5.0"
}
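Reading `os.environ['USER_ACCESS_TOKEN']` directly raises a bare `KeyError` when the notebook runs outside a Watson Studio Local project. The following is a minimal sketch of a more descriptive check. The helper name `read_access_token` is hypothetical and is not part of the Watson Machine Learning client.

```python
import os


def read_access_token(env=os.environ, name='USER_ACCESS_TOKEN'):
    """Return the access token from the environment or fail with a clear message.

    Hypothetical helper for illustration; `env` can be any mapping.
    """
    token = env.get(name)
    if not token:
        raise RuntimeError(
            'Environment variable {} is not set. '
            'Run this notebook inside a Watson Studio Local project.'.format(name)
        )
    return token


# Demonstration with a stand-in mapping instead of the real environment.
print(read_access_token({'USER_ACCESS_TOKEN': 'example-token'}))
```

In the notebook, calling `read_access_token()` with no arguments would read the real environment.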

Import the `watson-machine-learning-client` module.
<div class="alert alert-block alert-info">
For more information about the <b>Watson Machine Learning Python client (V4)</b>, refer to the <a href="https://wml-api-pyclient-dev-v4.mybluemix.net/" target="_blank" rel="noopener noreferrer">Python client documentation</a>. If you are using the notebook within a project on your Watson Studio Local cluster, you do not need to install this package because it comes pre-installed with the notebooks.
</div>

In [2]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

In [3]:
client = WatsonMachineLearningAPIClient(wml_credentials)

In [4]:
client.version

'1.0.58'

<a id="scoring"></a>
## 2. Persist, deploy, and score a PMML model

In this section, you will use the Python client to create a batch deployment and score the PMML model with new data records.

### 2.1 Persist a PMML model<a id="persist"></a>

Use `wget` to download the sample PMML model, `iris_chaid.xml`, from the <a href="https://github.com/pmservice/wml-sample-models" target="_blank" rel="noopener noreferrer">Git repository</a>.  

<div class="alert alert-block alert-info">
You may need to install the <tt>wget</tt> package. To install the <tt>wget</tt> package, run the following command.</div>

In [None]:
!pip install --upgrade wget

In [6]:
# Download the sample PMML model file 'iris_chaid.xml' from the GitHub repo.
import wget
import os

sample_dir = 'pmml_sample_model'
if not os.path.isdir(sample_dir):
    os.mkdir(sample_dir)
    
filename=os.path.join(sample_dir, 'iris_chaid.xml')
if not os.path.isfile(filename):
    filename = wget.download('https://github.com/pmservice/wml-sample-models/raw/master/pmml/iris-species/model/iris_chaid.xml', out=sample_dir)
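Before storing the file, it can help to confirm that the download is well-formed XML with a `PMML` root element. The following is a minimal sketch that uses only the standard library. The helper name `describe_pmml` is hypothetical, and the inline sample is a tiny stand-in, not the real `iris_chaid.xml`.

```python
import xml.etree.ElementTree as ET


def describe_pmml(xml_content):
    """Return (root tag without namespace, value of the 'version' attribute)."""
    root = ET.fromstring(xml_content)
    # ElementTree renders namespaced tags as '{namespace}tag'.
    tag = root.tag.split('}')[-1]
    return tag, root.get('version')


# Tiny stand-in document for illustration only.
sample = '<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2"><Header/></PMML>'
print(describe_pmml(sample))
```

To check the downloaded model, you could read `filename` in binary mode and pass the bytes to `describe_pmml`.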

You can obtain the space UID using the following cells.

<div class="alert alert-block alert-info">
You can create your own <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/analyze-data/ml-spaces_local.html" target="_blank" rel="noopener noreferrer">deployment space</a> by selecting <b>Deployment Spaces</b> from the Navigation Menu on the top left of this page.</div>

If you already have a deployment space, you can look up its UID by name. The following cell defines a helper function that searches the details of all your spaces for the space name that you provide.

In [7]:
# Obtain the UID of your space.
def guid_from_space_name(client, space_name):
    spaces = client.spaces.get_details()
    # Return the GUID of the first space whose name matches.
    return next(item for item in spaces['resources'] if item['entity']['name'] == space_name)['metadata']['guid']

**Action:** Enter the name of your deployment space in the code below: `space_uid = guid_from_space_name(client, 'YOUR DEPLOYMENT SPACE')`.

In [8]:
# Enter the name of your deployment space here:
space_uid = guid_from_space_name(client, 'YOUR DEPLOYMENT SPACE')
print("Space UID = " + space_uid)

Space UID = 7760d6fb-dff6-4546-84c4-6ac90e4a371a
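The lookup helper above relies on `next`, which raises a bare `StopIteration` when no space has the requested name. The following is a minimal sketch of a variant with a clearer error. It works on a plain dictionary shaped like the `client.spaces.get_details()` result used above; the function name `find_space_guid` and the sample values are hypothetical.

```python
def find_space_guid(spaces_details, space_name):
    """Return the GUID of the space called space_name, or raise ValueError."""
    for item in spaces_details.get('resources', []):
        if item['entity']['name'] == space_name:
            return item['metadata']['guid']
    raise ValueError('No deployment space named {!r}'.format(space_name))


# Hand-written stand-in for client.spaces.get_details(); values are made up.
fake_details = {
    'resources': [
        {'entity': {'name': 'demo-space'}, 'metadata': {'guid': 'guid-123'}},
    ]
}
print(find_space_guid(fake_details, 'demo-space'))
```

In the notebook, you would pass `client.spaces.get_details()` as the first argument.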


Setting the default space is mandatory for Watson Studio Local 2.1. You can set this using the cell below.

In [9]:
client.set.default_space(space_uid)

'SUCCESS'

Store the downloaded file as *CHAID PMML model for Iris data*, and then list all the models stored in the deployment space. First, you need to create the model metadata. 

In [10]:
# Model metadata
props_pmml = {
    client.repository.ModelMetaNames.NAME: 'CHAID PMML model for Iris data',
    client.repository.ModelMetaNames.RUNTIME_UID: 'pmml_4.2.1', 
    client.repository.ModelMetaNames.TYPE: 'pmml_4.2.1'
}

You need the model UID to create the deployment. You can extract the model UID from the saved model details and use it in the next section to create the deployment.

In [11]:
# Create the model artifact.
model_artifact = client.repository.store_model(filename, meta_props=props_pmml)
model_uid = client.repository.get_model_uid(model_artifact)
print("Model UID = " + model_uid)

Model UID = 6395cff4-3f60-4c40-ae2e-65c60908e45f


Get the saved model metadata.

In [12]:
# Details about the model.
model_details = client.repository.get_details(model_uid)
from pprint import pprint
pprint(model_details)

{'entity': {'content_status': {'state': 'persisted'},
            'name': 'CHAID PMML model for Iris data',
            'runtime': {'href': '/v4/runtimes/pmml_4.2.1'},
            'space': {'href': '/v4/spaces/7760d6fb-dff6-4546-84c4-6ac90e4a371a'},
            'type': 'pmml_4.2.1'},
 'metadata': {'created_at': '2020-03-05T23:41:12.002Z',
              'guid': '6395cff4-3f60-4c40-ae2e-65c60908e45f',
              'href': '/v4/models/6395cff4-3f60-4c40-ae2e-65c60908e45f?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a',
              'id': '6395cff4-3f60-4c40-ae2e-65c60908e45f',
              'modified_at': '2020-03-05T23:41:13.002Z',
              'owner': '1000331005'}}
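The model details are a nested dictionary, so you can check individual fields in code instead of reading the printed output. The following is a minimal sketch. It assumes that `'persisted'` is the state that indicates the model content was stored, as the output above suggests. The helper name `model_is_persisted` is hypothetical.

```python
def model_is_persisted(details):
    """True when the model details report the 'persisted' content state."""
    state = details.get('entity', {}).get('content_status', {}).get('state')
    return state == 'persisted'


# Abbreviated copy of the details printed above, for illustration.
sample_details = {
    'entity': {'content_status': {'state': 'persisted'},
               'name': 'CHAID PMML model for Iris data',
               'type': 'pmml_4.2.1'},
    'metadata': {'guid': '6395cff4-3f60-4c40-ae2e-65c60908e45f'},
}
print(model_is_persisted(sample_details))
```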


You can list all stored models using the `list_models` method.

In [13]:
# Display a list of all the models.
client.repository.list_models()

------------------------------------  ------------------------------  ------------------------  ----------
GUID                                  NAME                            CREATED                   TYPE
6395cff4-3f60-4c40-ae2e-65c60908e45f  CHAID PMML model for Iris data  2020-03-05T23:41:12.002Z  pmml_4.2.1
------------------------------------  ------------------------------  ------------------------  ----------


<div class="alert alert-block alert-info">
From the list of stored models, you can see that the model was successfully stored in the deployment space.</div>

### 2.2 Create a batch deployment<a id="deploy"></a>

Now, create a batch deployment, *Iris species prediction*, for the stored model, and list all the deployments for the model.

In [14]:
# Deployment metadata.
deploy_meta = {
    client.deployments.ConfigurationMetaNames.NAME: "Iris species prediction",
    client.deployments.ConfigurationMetaNames.BATCH: {},
    client.deployments.ConfigurationMetaNames.COMPUTE: {"name": "S", "nodes": 1}
}

In [15]:
# Create a batch deployment.
deployment_details = client.deployments.create(model_uid, meta_props=deploy_meta)

# List the deployments.
client.deployments.list()



#######################################################################################

Synchronous deployment creation for uid: '6395cff4-3f60-4c40-ae2e-65c60908e45f' started

#######################################################################################


ready.


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='83d75a6a-61fd-4aa1-80a7-9e3869c39b68'
------------------------------------------------------------------------------------------------


------------------------------------  -----------------------  -----  ------------------------  -------------
GUID                                  NAME                     STATE  CREATED                   ARTIFACT_TYPE
83d75a6a-61fd-4aa1-80a7-9e3869c39b68  Iris species prediction  ready  2020-03-05T23:44:38.196Z  model
------------------------------------  -----------------------  -----  ------------------------  ----------

<div class="alert alert-block alert-info">From the list of deployments, you can see that the batch deployment was successfully created in the deployment space.</div>

Now, you can check the details of your deployment.

In [16]:
# Deployment UID.
deployment_uid = client.deployments.get_uid(deployment_details)
print('Deployment uid = {}'.format(deployment_uid))

Deployment uid = 83d75a6a-61fd-4aa1-80a7-9e3869c39b68


In [17]:
# Deployment details.
print(client.deployments.get_details(deployment_uid))

{'metadata': {'parent': {'href': ''}, 'guid': '83d75a6a-61fd-4aa1-80a7-9e3869c39b68', 'modified_at': '', 'created_at': '2020-03-05T23:44:38.196Z', 'href': '/v4/deployments/83d75a6a-61fd-4aa1-80a7-9e3869c39b68'}, 'entity': {'name': 'Iris species prediction', 'custom': {}, 'description': '', 'compute': {'name': 'S', 'nodes': 1}, 'batch': {}, 'space': {'href': '/v4/spaces/7760d6fb-dff6-4546-84c4-6ac90e4a371a'}, 'status': {'state': 'ready'}, 'asset': {'href': '/v4/models/6395cff4-3f60-4c40-ae2e-65c60908e45f?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a'}, 'auto_redeploy': False}}


### 2.3 Score a test data record<a id="score"></a>

Now, you can score test data and predict the species of each iris flower from its measurements.

In [18]:
# Prepare the scoring payload with the measurements of three flowers.
job_payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        'fields': ['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'],
        'values': [[5.1, 3.5, 1.4, 0.2],
                  [4.9, 3.0, 1.4, 0.2],
                  [7.0, 3.2, 4.7, 1.4]]
    }]
}
pprint(job_payload)

{'input_data': [{'fields': ['Sepal.Length',
                            'Sepal.Width',
                            'Petal.Length',
                            'Petal.Width'],
                 'values': [[5.1, 3.5, 1.4, 0.2],
                            [4.9, 3.0, 1.4, 0.2],
                            [7.0, 3.2, 4.7, 1.4]]}]}
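If your test data is a list of records rather than a ready-made matrix, you can build the same `fields` and `values` layout programmatically. The following is a minimal sketch in plain Python. The helper name `records_to_input_data` is hypothetical, and the `'input_data'` key is taken from the printed payload above.

```python
def records_to_input_data(records, fields):
    """Convert a list of dict records into the fields/values payload layout."""
    # Keep the column order given by `fields` for every row.
    values = [[record[name] for name in fields] for record in records]
    return {'input_data': [{'fields': list(fields), 'values': values}]}


fields = ['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width']
records = [
    {'Sepal.Length': 5.1, 'Sepal.Width': 3.5, 'Petal.Length': 1.4, 'Petal.Width': 0.2},
    {'Sepal.Length': 7.0, 'Sepal.Width': 3.2, 'Petal.Length': 4.7, 'Petal.Width': 1.4},
]
payload = records_to_input_data(records, fields)
print(payload['input_data'][0]['values'])
```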


In [19]:
job = client.deployments.create_job(deployment_id=deployment_uid, meta_props=job_payload)
job_uid = client.deployments.get_job_uid(job)
print('Job uid = {}'.format(job_uid))

Job uid = 163d7090-4ec0-46e1-994e-48f3727a327a


In [20]:
def poll_async_job(client, job_uid):
    """Poll the job every 5 seconds until it completes or fails, then return its details."""
    import time
    while True:
        job_status = client.deployments.get_job_status(job_uid)
        print(job_status)
        state = job_status['state']
        # Stop on success or on any failure state.
        if state == 'completed' or 'fail' in state:
            return client.deployments.get_job_details(job_uid)
        time.sleep(5)
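The polling loop above runs until the job reaches a final state, so a job that stays queued would block the notebook indefinitely. The following is a minimal sketch of the same loop with a timeout. The function name `wait_for_state` and its parameters are hypothetical, and the status source is passed in as a callable so that the example runs without a Watson Machine Learning connection.

```python
import time


def wait_for_state(get_status, timeout=300, interval=5):
    """Call get_status() until the job completes or fails, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        state = status['state']
        # Same stop condition as the polling cell above.
        if state == 'completed' or 'fail' in state:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError('Job still in state {!r} after {} seconds'.format(state, timeout))
        time.sleep(interval)


# Simulated status sequence instead of a real batch job.
states = iter(['queued', 'running', 'completed'])
final_status = wait_for_state(lambda: {'state': next(states)}, interval=0)
print(final_status)
```

In the notebook, the callable could be `lambda: client.deployments.get_job_status(job_uid)`.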

In [21]:
# Perform prediction and display the result.
job_details = poll_async_job(client, job_uid)
pprint(job_details)

{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'completed', 'running_at': '', 'completed_at': ''}
{'entity': {'deployment': {'href': '/v4/deployments/83d75a6a-61fd-4aa1-80a7-9e3869c39b68'},
            'scoring': {'input_data': [{'fields': ['Sepal.Length',
                                                   'Sepal.Width',
                                                   'Petal.Length',
                                                   'Petal.Width'],
                                        'values': [[5.1, 3

As we can see from the prediction, the species of the first two samples is Iris Setosa and the third is Iris Versicolor.
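The job output above is truncated, so it does not show where the predictions sit in the job details or what the output columns are called. Assuming the predictions use the same `fields` and `values` layout as the input data, the following minimal sketch turns one such block into a list of dictionaries that are easier to read. The helper name `rows_as_dicts`, the `'predicted_species'` column name, and the sample values are all hypothetical.

```python
def rows_as_dicts(block):
    """Pair each row in block['values'] with the column names in block['fields']."""
    return [dict(zip(block['fields'], row)) for row in block['values']]


# Made-up block for illustration; not copied from real job output.
fake_block = {
    'fields': ['Sepal.Length', 'predicted_species'],
    'values': [[5.1, 'setosa'], [7.0, 'versicolor']],
}
for row in rows_as_dicts(fake_block):
    print(row)
```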

<a id="summary"></a>
## 3. Summary and next steps     

You successfully completed this notebook! You learned how to use Watson Machine Learning for PMML model deployment and scoring. 

### Resources <a id="resources"></a>

To learn more about configurations used in this notebook or more sample notebooks, tutorials, documentation, how-tos, and blog posts, check out these links:

<div class="alert alert-block alert-success">

<h4>IBM documentation</h4>
 <ul>
 <li> <a href="https://wml-api-pyclient-dev-v4.mybluemix.net" target="_blank" rel="noopener noreferrer">watson-machine-learning</a></li> 
 <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/local/welcome.html" target="_blank" rel="noopener noreferrer">Watson Studio Local 2.1</a></li>
 <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/wmls/deploy-models.html#batch" target="_blank" rel="noopener noreferrer">Batch Deployments</a>
     <ul>
         <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/wmls/wmls-deploy-python.html#deploy-batch" target="_blank" rel="noopener noreferrer">Batch Deployments with the Python client</a></li>
     </ul>
 </li>
 </ul>
 
<h4>IBM Samples</h4>
 <ul>
 <li> <a href="https://github.com/IBMDataScience/sample-notebooks" target="_blank" rel="noopener noreferrer">Sample notebooks</a></li>
 <li> <a href="https://github.com/pmservice/wml-sample-models" target="_blank" rel="noopener noreferrer">Sample models</a></li>
 </ul>
 
<h4>Others</h4>
 <ul>
 <li> <a href="https://www.python.org" target="_blank" rel="noopener noreferrer">Official Python website</a></li>
 </ul>
</div>

### Citation

Dua, D. and Karra Taniskidou, E. (2017). [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml). Irvine, CA: University of California, School of Information and Computer Science.


### Authors

**Wojciech Sobala** is a Data Scientist at IBM.  <br><br>
**Jihyoung Kim**, Ph.D., is a Data Scientist at IBM who strives to make data science easy for everyone through Watson Studio.

<hr>
Copyright © 2018-2020 IBM. This notebook and its source code are released under the terms of the MIT License.
