<table style="border: none" align="left">
   <tr style="border: none">
      <th style="border: none"><font face="verdana" size="5" color="black"><b>Persist, Deploy, and Score an SPSS Model to Predict Customer Churn</b></font></th>
      <th style="border: none"><img src="https://github.com/pmservice/customer-satisfaction-prediction/blob/master/app/static/images/ml_icon_gray.png?raw=true" alt="Watson Machine Learning icon" height="40" width="40"></th>
   </tr>
  <tr style="border: none">
       <th style="border: none"><img src="https://github.com/pmservice/wml-sample-models/blob/master/spark/customer-satisfaction-prediction/images/users_banner_2-03.png?raw=true" width="600" alt="Icon"> </th>
   </tr>
</table>

This notebook demonstrates how to deploy an SPSS model and score test data. 

You will use the **Telco Customer Churn** data set, which contains anonymous data about customers of a telecommunications company. You use this data to predict customer churn, which matters to the business because retaining existing customers is easier than acquiring new ones.

Some familiarity with Python is helpful. This notebook is compatible with Watson Studio Local 2.1 and Python 3.6.


## Learning goals

In this notebook you learn how to:

-  Set up the Python client.
-  Create a batch deployment of the SPSS model.
-  Use the deployed model to score data.


## Contents

1.	[Setting up](#setup)
2.	[Save the SPSS model](#save)
3.  [Deploy the SPSS model](#deploy)
4.	[Summary](#summary)

To get started with Watson Studio Local 2.1, see the <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/local/welcome.html" target="_blank" rel="noopener noreferrer">installation and setup documentation</a>.

## 1. Setting up <a id="setup"></a>

Import the `watson-machine-learning-client` module.
<div class="alert alert-block alert-info">
For more information about the <b>Watson Machine Learning Python client (V4)</b>, see the <a href="https://wml-api-pyclient-dev-v4.mybluemix.net/" target="_blank" rel="noopener noreferrer">Python client documentation</a>. If you use this notebook within a project on your Watson Studio Local cluster, you do not need to install the package because it comes pre-installed with the notebook environment.
</div>

In [1]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

**Authenticate the Python client on Watson Studio Local 2.1.**

<div class="alert alert-block alert-info">To find your authentication information (your credentials), follow the steps in the <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/analyze-data/ml-authentication-local.html" target="_blank" rel="noopener noreferrer">documentation</a>.</div>

**Action**: Enter your credentials in the following cell.

In [2]:
# Enter your credentials here.
# The access token is read from the notebook environment.
import os
token = os.environ['USER_ACCESS_TOKEN']

from project_lib.utils import environment
url = environment.get_common_api_url()

wml_credentials = {
    "token": token,
    "instance_id" : "wml_local",
    "url": url,
    "version": "2.5.0"
}
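
Before you create the client, it can help to confirm that no credential value is missing or empty. The following is a minimal sketch, not part of the Python client: `missing_credential_keys` is a hypothetical helper, and the required keys are simply the four keys used in the `wml_credentials` dictionary above.

```python
# Keys taken from the wml_credentials dictionary shown above (assumption:
# all four are required for Watson Studio Local 2.1).
REQUIRED_KEYS = ("token", "instance_id", "url", "version")


def missing_credential_keys(credentials):
    """Return the required keys that are absent or empty in `credentials`."""
    return [key for key in REQUIRED_KEYS if not credentials.get(key)]


# Illustrative values only; the empty URL is reported as missing.
example_credentials = {
    "token": "example-token",
    "instance_id": "wml_local",
    "url": "",
    "version": "2.5.0",
}
print(missing_credential_keys(example_credentials))
```

An empty list means every key has a value; it does not prove that the token or URL is valid.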

Create the client and authenticate the service instance with your credentials.

In [3]:
client = WatsonMachineLearningAPIClient(wml_credentials)

In [4]:
client.version

'1.0.58'

You can obtain the space UID by using the following cells.

<div class="alert alert-block alert-info">
You can create your own <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/analyze-data/ml-spaces_local.html" target="_blank" rel="noopener noreferrer">deployment space</a> by selecting <b>Deployment Spaces</b> from the Navigation Menu on the top left of this page.</div>

If you already have a deployment space, you can look up its UID by name with the helper function in the following cell. The cell only defines the function; you enter the name of your space when you call it in the cell after that.

In [5]:
# Obtain the UID of your space by matching on the space name.
def guid_from_space_name(client, space_name):
    spaces = client.spaces.get_details()
    return next(item for item in spaces['resources'] if item['entity']['name'] == space_name)['metadata']['guid']

**Action:** Enter the name of your deployment space in the code below: `space_uid = guid_from_space_name(client, 'YOUR DEPLOYMENT SPACE')`.

In [6]:
# Enter the name of your deployment space here:
space_uid = guid_from_space_name(client, 'YOUR DEPLOYMENT SPACE')
print("Space UID = " + space_uid)

Space UID = 7760d6fb-dff6-4546-84c4-6ac90e4a371a
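
Note that `guid_from_space_name` raises `StopIteration` when no space has the given name, which can be hard to interpret. The sketch below shows one possible alternative that returns `None` instead. It assumes the dictionary has the same `resources` / `entity` / `metadata` layout that the helper above reads from `client.spaces.get_details()`; `find_space_guid` and the space name `churn-space` are hypothetical.

```python
def find_space_guid(spaces_details, space_name):
    """Return the GUID of the space named `space_name`, or None if absent.

    `spaces_details` is assumed to look like the value returned by
    client.spaces.get_details(): {"resources": [{"entity": ..., "metadata": ...}]}.
    """
    for item in spaces_details.get("resources", []):
        if item["entity"]["name"] == space_name:
            return item["metadata"]["guid"]
    return None


# Hand-written sample with the assumed layout (the space name is made up).
sample_spaces = {
    "resources": [
        {
            "entity": {"name": "churn-space"},
            "metadata": {"guid": "7760d6fb-dff6-4546-84c4-6ac90e4a371a"},
        }
    ]
}
print(find_space_guid(sample_spaces, "churn-space"))
print(find_space_guid(sample_spaces, "no-such-space"))
```

In the notebook you would pass `client.spaces.get_details()` as the first argument and stop with a clear message if the result is `None`.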


Setting the default space is mandatory with Watson Studio Local 2.1. You can set this using the cell below.

In [7]:
client.set.default_space(space_uid)

'SUCCESS'

<a id="save"></a>
## 2. Save the SPSS model

Download the SPSS sample model from the <a href="https://github.com/pmservice/wml-sample-models" target="_blank" rel="noopener noreferrer">Git repository</a>.

<div class="alert alert-block alert-info"><b>Note:</b> You may need to install the <tt>wget</tt> package. To install the <tt>wget</tt> package, run the following command.</div>

In [None]:
!pip install --upgrade wget

In [9]:
# Download the sample SPSS model.
import os
import wget

sample_dir = 'spss_sample_model'
if not os.path.isdir(sample_dir):
    os.mkdir(sample_dir)

filename = os.path.join(sample_dir, 'customer-satisfaction-prediction.str')
if not os.path.isfile(filename):
    filename = wget.download('https://github.com/pmservice/wml-sample-models/raw/master/spss/customer-satisfaction-prediction/model/customer-satisfaction-prediction.str',
                             out=sample_dir)
print(filename)

spss_sample_model/customer-satisfaction-prediction.str
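
If you prefer not to install the `wget` package, the same download-if-missing pattern can be sketched with the Python standard library. This is an alternative under stated assumptions, not the approach used in this notebook: `download_if_missing` is a hypothetical helper, and it assumes that the last path segment of the URL is a suitable file name.

```python
import os
import urllib.request


def download_if_missing(url, dest_dir):
    """Download `url` into `dest_dir` unless the file already exists there."""
    os.makedirs(dest_dir, exist_ok=True)
    # Assumption: the last path segment of the URL is the file name to use.
    target = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.isfile(target):
        urllib.request.urlretrieve(url, target)
    return target


# Example call (commented out so that nothing is downloaded on import):
# filename = download_if_missing(
#     'https://github.com/pmservice/wml-sample-models/raw/master/spss/'
#     'customer-satisfaction-prediction/model/customer-satisfaction-prediction.str',
#     'spss_sample_model')
```

Calling the function a second time returns the same path without downloading the file again.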


Save the downloaded SPSS sample model as *SPSS model for Churn prediction*. First, you need to create the model metadata.

In [10]:
# Save the SPSS model.
# Model Metadata.
meta_props={
    client.repository.ModelMetaNames.NAME: "SPSS model for Churn prediction",
    client.repository.ModelMetaNames.RUNTIME_UID: "spss-modeler_18.1",
    client.repository.ModelMetaNames.TYPE: "spss-modeler_18.1"
}

Store the model in the repository, then extract the model UID from the returned model details. You use this UID in the next section to create the deployment.

In [11]:
# Create the model artifact.
model_artifact = client.repository.store_model(filename, meta_props=meta_props)
model_uid = client.repository.get_model_uid(model_artifact)
print("Model UID = " + model_uid)

Model UID = 24a8c70a-5b87-4317-a825-fff1bb68ee32


Get the saved model metadata using the model UID.

In [12]:
# Details about the model.
model_details = client.repository.get_details(model_uid)
from pprint import pprint
pprint(model_details)

{'entity': {'content_status': {'state': 'persisted'},
            'name': 'SPSS model for Churn prediction',
            'runtime': {'href': '/v4/runtimes/spss-modeler_18.1'},
            'space': {'href': '/v4/spaces/7760d6fb-dff6-4546-84c4-6ac90e4a371a'},
            'type': 'spss-modeler_18.1'},
 'metadata': {'created_at': '2020-03-06T17:32:29.002Z',
              'guid': '24a8c70a-5b87-4317-a825-fff1bb68ee32',
              'href': '/v4/models/24a8c70a-5b87-4317-a825-fff1bb68ee32?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a',
              'id': '24a8c70a-5b87-4317-a825-fff1bb68ee32',
              'modified_at': '2020-03-06T17:32:30.002Z',
              'owner': '1000331005'}}
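
The details dictionary is nested, so a small helper can make the key fields easier to check. The sketch below assumes the layout shown in the output above; `summarize_model` is a hypothetical helper, and the sample dictionary is an abbreviated copy of that output.

```python
def summarize_model(details):
    """Pick the most useful fields out of a stored-model details dictionary."""
    entity, metadata = details["entity"], details["metadata"]
    return {
        "guid": metadata["guid"],
        "name": entity["name"],
        "type": entity["type"],
        "state": entity["content_status"]["state"],
    }


# Abbreviated copy of the model details printed above.
sample_model_details = {
    "entity": {
        "content_status": {"state": "persisted"},
        "name": "SPSS model for Churn prediction",
        "type": "spss-modeler_18.1",
    },
    "metadata": {"guid": "24a8c70a-5b87-4317-a825-fff1bb68ee32"},
}
print(summarize_model(sample_model_details))
```

In the notebook you would call `summarize_model(model_details)` and confirm that the state is `persisted` before you create a deployment.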


You can list all stored models using the `list_models` method.

In [13]:
# Display a list of all the models.
client.repository.list_models()

------------------------------------  -------------------------------  ------------------------  -----------------
GUID                                  NAME                             CREATED                   TYPE
24a8c70a-5b87-4317-a825-fff1bb68ee32  SPSS model for Churn prediction  2020-03-06T17:32:29.002Z  spss-modeler_18.1
------------------------------------  -------------------------------  ------------------------  -----------------


<div class="alert alert-block alert-info">
From the list of stored models, you can see that the model was successfully stored in the deployment space.</div>

## 3. Deploy the SPSS model <a id="deploy"></a>

Use the model UID obtained in the previous section to create a batch deployment. First, create the deployment metadata.

In [14]:
# Deployment metadata.
deploy_meta = {
    client.deployments.ConfigurationMetaNames.NAME: "Sample SPSS model deployment",
    client.deployments.ConfigurationMetaNames.BATCH: {},
    client.deployments.ConfigurationMetaNames.COMPUTE: {"name": "S", "nodes": 1}
}

In [15]:
# Create the deployment.
deployment_details = client.deployments.create(model_uid, meta_props=deploy_meta)



#######################################################################################

Synchronous deployment creation for uid: '24a8c70a-5b87-4317-a825-fff1bb68ee32' started

#######################################################################################


ready.


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='528570d5-9fbf-4be3-b934-97e413e69703'
------------------------------------------------------------------------------------------------




Get the list of all deployments in the deployment space.

In [16]:
# List the deployments.
client.deployments.list()

------------------------------------  ----------------------------  -----  ------------------------  -------------
GUID                                  NAME                          STATE  CREATED                   ARTIFACT_TYPE
528570d5-9fbf-4be3-b934-97e413e69703  Sample SPSS model deployment  ready  2020-03-06T17:32:31.971Z  model
------------------------------------  ----------------------------  -----  ------------------------  -------------


<div class="alert alert-block alert-info">From the list of deployments, you can see that the model was successfully deployed in the deployment space.</div>

Now you can check the details of your deployment.

In [17]:
# Deployment UID.
deployment_uid = client.deployments.get_uid(deployment_details)
print('Deployment uid = {}'.format(deployment_uid))

Deployment uid = 528570d5-9fbf-4be3-b934-97e413e69703


In [18]:
# Deployment details.
print(client.deployments.get_details(deployment_uid))

{'metadata': {'parent': {'href': ''}, 'guid': '528570d5-9fbf-4be3-b934-97e413e69703', 'modified_at': '', 'created_at': '2020-03-06T17:32:31.971Z', 'href': '/v4/deployments/528570d5-9fbf-4be3-b934-97e413e69703'}, 'entity': {'name': 'Sample SPSS model deployment', 'custom': {}, 'description': '', 'compute': {'name': 'S', 'nodes': 1}, 'batch': {}, 'space': {'href': '/v4/spaces/7760d6fb-dff6-4546-84c4-6ac90e4a371a'}, 'status': {'state': 'ready'}, 'asset': {'href': '/v4/models/24a8c70a-5b87-4317-a825-fff1bb68ee32?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a'}, 'auto_redeploy': False}}
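
Because the deployment details are printed as one long line, a quick programmatic check can be easier to read. The sketch below assumes the layout shown in the output above, where a batch deployment carries a `batch` key in its `entity`; `is_ready_batch_deployment` is a hypothetical helper, and the sample dictionary is an abbreviated copy of that output.

```python
def is_ready_batch_deployment(details):
    """Return True if the details describe a batch deployment in state 'ready'."""
    entity = details["entity"]
    return entity["status"]["state"] == "ready" and "batch" in entity


# Abbreviated copy of the deployment details printed above.
sample_deployment_details = {
    "metadata": {"guid": "528570d5-9fbf-4be3-b934-97e413e69703"},
    "entity": {
        "name": "Sample SPSS model deployment",
        "batch": {},
        "status": {"state": "ready"},
    },
}
print(is_ready_batch_deployment(sample_deployment_details))
```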


Prepare the scoring payload with values to score against the deployed model.

In [19]:
# Prepare scoring payload.
job_payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        'fields':['customerID','gender','SeniorCitizen','Partner','Dependents','tenure','PhoneService','MultipleLines','InternetService','OnlineSecurity','OnlineBackup','DeviceProtection','TechSupport','StreamingTV','StreamingMovies','Contract','PaperlessBilling','PaymentMethod','MonthlyCharges','TotalCharges','Churn','SampleWeight'],
        'values':[['3638-WEABW','Female',0,'Yes','No',58,'Yes','Yes','DSL','No','Yes','No','Yes','No','No','Two year','Yes','Credit card (automatic)',59.9,3505.1,'No',2.768],
                 ['9237-HQITU', 'Female',0,'No','No',2,'Yes','No','Fiber optic','No','No','No','No','No','No','Month-to-month','Yes','Electronic check',70.700,151.650,'Yes',1.000]]
    }]
}
pprint(job_payload)

{'input_data': [{'fields': ['customerID',
                            'gender',
                            'SeniorCitizen',
                            'Partner',
                            'Dependents',
                            'tenure',
                            'PhoneService',
                            'MultipleLines',
                            'InternetService',
                            'OnlineSecurity',
                            'OnlineBackup',
                            'DeviceProtection',
                            'TechSupport',
                            'StreamingTV',
                            'StreamingMovies',
                            'Contract',
                            'PaperlessBilling',
                            'PaymentMethod',
                            'MonthlyCharges',
                            'TotalCharges',
                            'Churn',
                            'SampleWeight'],
                 'values': [['3638-WEABW',
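
When your records are dictionaries rather than parallel lists, you can build the same `fields` / `values` layout programmatically. The sketch below is an illustration under assumptions: `records_to_input_data` is a hypothetical helper, it uses the literal key `input_data` that appears in the printed payload above, and the sample records carry only three of the 22 fields, so they are not a valid input for the churn model.

```python
def records_to_input_data(records):
    """Convert a list of dictionaries into the fields/values payload layout."""
    if not records:
        raise ValueError("at least one record is required")
    # The field order is taken from the first record.
    fields = list(records[0])
    values = [[record[field] for field in fields] for record in records]
    return {"input_data": [{"fields": fields, "values": values}]}


# Shortened records for illustration only (a real payload needs every field).
sample_records = [
    {"customerID": "3638-WEABW", "tenure": 58, "Churn": "No"},
    {"customerID": "9237-HQITU", "tenure": 2, "Churn": "Yes"},
]
print(records_to_input_data(sample_records))
```

Every record must contain the same keys as the first one; a missing key raises `KeyError`.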
 

In [20]:
job = client.deployments.create_job(deployment_id=deployment_uid, meta_props=job_payload)
job_uid = client.deployments.get_job_uid(job)
print('Job uid = {}'.format(job_uid))

Job uid = 74d0a0e8-a3c1-420f-80d5-b3840158b3ae


In [21]:
def poll_async_job(client, job_uid):
    import time
    while True:
        job_status = client.deployments.get_job_status(job_uid)
        print(job_status)
        state = job_status['state']
        if state == 'completed' or 'fail' in state:
            return client.deployments.get_job_details(job_uid)
        time.sleep(5)
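
The loop in `poll_async_job` runs until the job completes or fails, so a job that stays queued would block the notebook indefinitely. The sketch below shows one way to add an upper bound on the waiting time. It uses the same `get_job_status` and `get_job_details` calls as the cell above; the function name `poll_job_with_timeout` and the `timeout` and `interval` parameters are assumptions introduced for illustration.

```python
import time


def poll_job_with_timeout(client, job_uid, timeout=600, interval=5):
    """Poll a job until it completes or fails, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        state = client.deployments.get_job_status(job_uid)["state"]
        if state == "completed" or "fail" in state:
            return client.deployments.get_job_details(job_uid)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Job {} is still '{}' after {} seconds".format(job_uid, state, timeout)
            )
        time.sleep(interval)
```

You would call it the same way as `poll_async_job`, for example `poll_job_with_timeout(client, job_uid, timeout=300)`.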

In [None]:
# Perform prediction and display the result.
job_details = poll_async_job(client, job_uid)
pprint(job_details)

In [24]:
print(job_details)

{'metadata': {'guid': '74d0a0e8-a3c1-420f-80d5-b3840158b3ae', 'href': '/v4/deployment_jobs/74d0a0e8-a3c1-420f-80d5-b3840158b3ae', 'created_at': '2020-03-06T17:05:14.787Z', 'parent': {'href': ''}}, 'entity': {'deployment': {'href': '/v4/deployments/910eb721-56c5-439f-ba60-3e34731661d4'}, 'scoring': {'input_data': [{'fields': ['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn', 'SampleWeight'], 'values': [['3638-WEABW', 'Female', 0, 'Yes', 'No', 58, 'Yes', 'Yes', 'DSL', 'No', 'Yes', 'No', 'Yes', 'No', 'No', 'Two year', 'Yes', 'Credit card (automatic)', 59.9, 3505.1, 'No', 2.768], ['9237-HQITU', 'Female', 0, 'No', 'No', 2, 'Yes', 'No', 'Fiber optic', 'No', 'No', 'No', 'No', 'No', 'No', 'Month-to-month', 'Yes', 'Electronic 

As you can see, the model predicts that the first sample telco customer will not churn ("Predicted Churn", "No") and that the second customer will churn ("Predicted Churn", "Yes").

In [19]:
filename=os.path.join(sample_dir, 'scoreInput.csv')
if not os.path.isfile(filename):
    filename = wget.download('https://github.com/pmservice/wml-sample-models/raw/master/spss/customer-satisfaction-prediction/data/scoreInput.csv',\
                             out=sample_dir)
print(filename)

spss_sample_model/scoreInput.csv


Use the following code to add the downloaded .csv file as a data asset in your deployment space.

In [20]:
# Add the downloaded scoring input file as a data asset.
asset_details = client.data_assets.create(name="churn_input_payload.csv", file_path=filename)

Creating data asset...
SUCCESS


Retrieve the URL of the stored asset.

In [21]:
asset_href = client.data_assets.get_href(asset_details)

Prepare the job payload that you'll use to score the batch deployment.

In [22]:
# Prepare job payload.
job_payload_asset = {
    client.deployments.ScoringMetaNames.INPUT_DATA_REFERENCES: [{
        "type": "data_asset",
        "connection": {},
        "location": {
            "href": asset_href
        }
    }],
    client.deployments.ScoringMetaNames.OUTPUT_DATA_REFERENCE: {
        "type": "data_asset",
        "connection": {},
        "location": {
            "name": "churn_results_{}.csv".format(deployment_uid),
            "description": "results"
        }
    }
}
pprint(job_payload_asset)

{'input_data_references': [{'connection': {},
                            'location': {'href': '/v2/assets/5c89a439-5a7b-48ea-bcd8-c73862fe8366?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a'},
                            'type': 'data_asset'}],
 'output_data_reference': {'connection': {},
                           'location': {'description': 'results',
                                        'name': 'churn_results_528570d5-9fbf-4be3-b934-97e413e69703.csv'},
                           'type': 'data_asset'}}
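
If you score several input assets, wrapping the payload in a function avoids repeating the nested dictionary. The sketch below is a hypothetical helper, `build_asset_job_payload`, that reproduces the structure printed above. It uses the literal keys `input_data_references` and `output_data_reference` from that output rather than the `ScoringMetaNames` constants, so treat it as an illustration of the layout.

```python
def build_asset_job_payload(input_href, output_name, description="results"):
    """Build a batch job payload that reads and writes data assets."""
    return {
        "input_data_references": [{
            "type": "data_asset",
            "connection": {},
            "location": {"href": input_href},
        }],
        "output_data_reference": {
            "type": "data_asset",
            "connection": {},
            "location": {"name": output_name, "description": description},
        },
    }


# Values copied from the payload printed above.
example_payload = build_asset_job_payload(
    "/v2/assets/5c89a439-5a7b-48ea-bcd8-c73862fe8366?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a",
    "churn_results_528570d5-9fbf-4be3-b934-97e413e69703.csv",
)
print(example_payload["output_data_reference"]["location"]["name"])
```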


Use the following method to run the scoring.

In [23]:
job = client.deployments.create_job(deployment_id=deployment_uid, meta_props=job_payload_asset)
job_uid = client.deployments.get_job_uid(job)
print('Job uid = {}'.format(job_uid))

Job uid = f775aa35-b840-4303-8473-8478e521894e


In [25]:
# Perform prediction.
job_details_asset = poll_async_job(client, job_uid)
pprint(job_details_asset)

{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'queued', 'running_at': '', 'completed_at': ''}
{'state': 'running', 'running_at': '2020-03-06T17:34:45.175Z', 'completed_at': ''}
{'state': 'completed', 'running_at': '2020-03-06T17:34:45.109Z', 'completed_at': '2020-03-06T17:34:52.434Z'}
{'entity': {'deployment': {'href': '/v4/deployments/528570d5-9fbf-4be3-b934-97e413e69703'},
            'scoring': {'input_data_references': [{'connection': {},
                                                   'location': {'href': '/v2/assets/5c89a439-5a7b-48ea-bcd8-c73862fe8366?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a'},
                                                   'type': 'data_asset'}],
                        'output_data_reference': {'connection': {},
                                                  'location': {'description': 'results',
                                                               'href': 

You can see that the `churn_results` .csv file has been created as a data asset.

In [26]:
client.data_assets.list()

------------------------------------------------------  ----------  ----  ------------------------------------
NAME                                                    ASSET_TYPE  SIZE  ASSET_ID
churn_input_payload.csv                                 data_asset  1643  5c89a439-5a7b-48ea-bcd8-c73862fe8366
churn_results_528570d5-9fbf-4be3-b934-97e413e69703.csv  data_asset  382   9fd46885-12f3-4884-8f1b-dd572ebeb06f
------------------------------------------------------  ----------  ----  ------------------------------------


Get the UID of the results .csv file from the job details so that you can download the data asset.



In [27]:
import re
results_asset = job_details_asset['entity']['scoring']['output_data_reference']['location']['href']
results_uid = re.split('[?/]', results_asset)[3]
results_uid

'9fd46885-12f3-4884-8f1b-dd572ebeb06f'
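
The regular expression above relies on the asset ID being the fourth element after splitting on `/` and `?`. A sketch of an alternative that uses `urllib.parse` from the standard library is shown below. It assumes that the results href has the same `/v2/assets/<id>?space_id=<space>` shape as the input asset href printed earlier; `asset_id_from_href` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def asset_id_from_href(href):
    """Return the last path segment of an asset href, ignoring the query string."""
    path = urlsplit(href).path
    return path.rstrip("/").split("/")[-1]


# Input asset href copied from the job payload printed earlier.
example_href = "/v2/assets/5c89a439-5a7b-48ea-bcd8-c73862fe8366?space_id=7760d6fb-dff6-4546-84c4-6ac90e4a371a"
print(asset_id_from_href(example_href))  # 5c89a439-5a7b-48ea-bcd8-c73862fe8366
```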

Download the results and load them into a data frame to see the values predicted by the model.

In [28]:
client.data_assets.download(results_uid, "scoring_results.csv")

Successfully saved asset content to file: 'scoring_results.csv'


'/home/wsuser/work/scoring_results.csv'

In [29]:
import pandas as pd
pd.read_csv("scoring_results.csv")

Unnamed: 0,customerID,Churn,Predicted Churn,Probability of Churn
0,9237-HQITU,Yes,Yes,0.882983
1,3638-WEABW,No,No,0.052631
2,8665-UTDHZ,Yes,No,0.17411
3,8773-HHUOZ,Yes,No,0.484323
4,4080-IIARD,No,No,0.092014
5,6575-SUVOI,No,No,0.092092
6,7495-OOKFY,Yes,Yes,0.97215
7,0731-EBJQB,No,No,0.090598
8,1891-QRQSA,No,No,0.092103
9,5919-TMRGD,Yes,Yes,0.894228


The first two customers are predicted to churn and not to churn, respectively.
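
To summarize how often the prediction agrees with the recorded `Churn` value, you can compare the two columns. The sketch below uses only the standard library and an inline copy of the ten rows shown above, without the index and probability columns; `agreement_rate` is a hypothetical helper, and ten rows are far too few to evaluate the model.

```python
import csv
import io

# Inline copy of the ten result rows shown above (index and probability omitted).
SAMPLE_RESULTS = """customerID,Churn,Predicted Churn
9237-HQITU,Yes,Yes
3638-WEABW,No,No
8665-UTDHZ,Yes,No
8773-HHUOZ,Yes,No
4080-IIARD,No,No
6575-SUVOI,No,No
7495-OOKFY,Yes,Yes
0731-EBJQB,No,No
1891-QRQSA,No,No
5919-TMRGD,Yes,Yes
"""


def agreement_rate(rows):
    """Return the fraction of rows where the prediction matches the label."""
    rows = list(rows)
    matches = sum(row["Churn"] == row["Predicted Churn"] for row in rows)
    return matches / len(rows)


result_rows = list(csv.DictReader(io.StringIO(SAMPLE_RESULTS)))
print(agreement_rate(result_rows))  # 0.8
```

The two mismatches are customers `8665-UTDHZ` and `8773-HHUOZ`, who churned but were predicted not to.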

You can delete the two data assets that you created by using the code below.



In [30]:
client.data_assets.delete(client.data_assets.get_uid(asset_details))
client.data_assets.delete(results_uid)

'SUCCESS'

<a id="summary"></a>
## 4. Summary     

You successfully completed this notebook! 

You learned how to use Watson Machine Learning to save an SPSS model, create a batch deployment, and score data with it.

### Resources <a id="resources"></a>

To learn more about configurations used in this notebook or more sample notebooks, tutorials, documentation, how-tos, and blog posts, check out these links:

<div class="alert alert-block alert-success"><a id="resources"></a>

<h4>IBM documentation</h4>
 <ul>
 <li> <a href="https://wml-api-pyclient-dev-v4.mybluemix.net" target="_blank" rel="noopener noreferrer">watson-machine-learning</a></li>
 <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/local/welcome.html" target="_blank" rel="noopener noreferrer">Watson Studio Local 2.1</a></li>
 <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/wmls/deploy-models.html#batch" target="_blank" rel="noopener noreferrer">Batch Deployments</a>
     <ul>
         <li> <a href="https://www.ibm.com/support/knowledgecenter/SSHGWL_2.1.0/wsj/wmls/wmls-deploy-python.html#deploy-batch" target="_blank" rel="noopener noreferrer">Batch Deployments with the Python client</a></li>
     </ul>
 </li>
 </ul>
 
<h4>IBM Samples</h4>
 <ul>
 <li> <a href="https://github.com/IBMDataScience/sample-notebooks" target="_blank" rel="noopener noreferrer">Sample notebooks</a></li>
 </ul>
 
<h4>Others</h4>
 <ul>
 <li> <a href="https://www.python.org" target="_blank" rel="noopener noreferrer">Official Python website</a></li>
 <li> <a href="https://matplotlib.org" target="_blank" rel="noopener noreferrer">Matplotlib: Python plotting</a></li>
 <li> <a href="https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html" target="_blank" rel="noopener noreferrer">scikit-learn: Grid Search</a></li>
 </ul>
 </div>

### Authors

**Lukasz Cmielowski**, Ph.D., is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase the clients' ability to turn data into actionable knowledge.  
**Jihyoung Kim**, Ph.D., is a Data Scientist at IBM who strives to make data science easy for everyone through Watson Studio.

<hr>
Copyright © 2017-2020 IBM. This notebook and its source code are released under the terms of the MIT License.
