<table style="border: none" align="left">
    <tr style="border: none">
       <th style="border: none"><img src="https://raw.githubusercontent.com/pmservice/cars-4-you/master/static/images/logo.png" width="200" alt="Icon"></th>
       <th style="border: none"><font face="verdana" size="5" color="black"><b>Use python function to predict customer satisfaction</b></th>
   </tr>
</table>

<img align=left src="https://github.com/pmservice/cars-4-you/raw/master/static/images/ai_function.png" alt="Icon" width="664">

This notebook trains a **Keras** (TensorFlow) model to predict customer satisfaction based on the feedback that has been provideed. The notebook also demonstrates how you can use the **python function** for the deep learning model data preprocessing required before you start model scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.5 and the watson-machine-learning-client package.

Contents

- [1. Setup](#setup)
- [2. Load and explore data](#load)
- [3. Create Keras model using TensorFlow backend](#model)
- [4. Store the model in the repository](#persistence)
- [5. Deploy the model](#deployment)
- [6. Score the model](#score)
- [7. Python function](#ai_function)
- [8. Payload logging for AI function](#ai_function)

# <a id="setup"></a>
## 1. Setup

Install TensorFlow version 1.5 and newest version of the watson-machine-learning-client.

### 1.1 Install TensorFlow

In [1]:
!pip install --upgrade tensorflow==1.5

Collecting tensorflow==1.5
  Downloading https://files.pythonhosted.org/packages/43/aa/fe3e9d0b48db4adde9781658b9354814b6cdf6acbfeaa7b2677c7b8002d6/tensorflow-1.5.0-cp35-cp35m-manylinux1_x86_64.whl (44.4MB)
[K    100% |████████████████████████████████| 44.4MB 22kB/s  eta 0:00:01
[?25hRequirement not upgraded as not directly required: six>=1.10.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: numpy>=1.12.1 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: protobuf>=3.4.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Collecting tensorflow-tensorboard<1.6.0,>=1.5.0 (from tensorflow==1.5)
  Downloading https://files.pythonhosted.org/packages/cc/fa/91c06952517b4f1bc075545b062a4112e30cebe558a6b962816cb33efa27/tensorflow_tensorboard-1.5.1-py3-none-any.whl (3.0MB)
[K    100% |██████████

### 1.2 Install the Watson Machine Learning client

Install the watson-machine-learning-client from pypi.

In [2]:
# Install the WML client.
!pip install --upgrade watson-machine-learning-client

Collecting watson-machine-learning-client
  Downloading https://files.pythonhosted.org/packages/f6/f3/cebff2ea1088a7aca0c728c6704dc7fa1a0d592d9a989bf7cf11c5644d07/watson_machine_learning_client-1.0.333-py3-none-any.whl (932kB)
[K    100% |████████████████████████████████| 942kB 1.1MB/s eta 0:00:01
[?25hRequirement not upgraded as not directly required: tabulate in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: urllib3 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: requests in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: ibm-cos-sdk in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: lomond in

**Tip:** Restart the kernel (**Kernel** -> **Restart**)

In [3]:
# Import the WML client.
from watson_machine_learning_client import WatsonMachineLearningAPIClient



Authenticate to the Watson Machine Learning (WML) service on IBM Cloud.

**Tip**: Authentication information (your credentials) can be found in the <a href="https://console.bluemix.net/docs/services/service_credentials.html#service_credentials" target="_blank" rel="noopener no referrer">Service credentials</a> tab of the service instance that you created on IBM Cloud. 
If there are no credentials listed for your instance in **Service credentials**, click **New credential (+)** and enter the information required to generate new authentication information. 

**Action**: Enter your WML service instance credentials here.

In [4]:
wml_credentials = {
  "apikey": "***",
  "instance_id": "***",
  "password": "***",
  "url": "https://us-south.ml.cloud.ibm.com",
  "username": "***"
}


In [5]:
# The code was removed by Watson Studio for sharing.

In [6]:
client = WatsonMachineLearningAPIClient(wml_credentials)

In [7]:
client.version

'1.0.331'

<a id="load"></a>
## 2. Load and explore data

In this section you read in the `car_rental_training_data.csv` file, load it as a pandas dataFrame and then perform a basic exploration. 

In [8]:
!pip install wget

Collecting wget
  Downloading https://files.pythonhosted.org/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Building wheels for collected packages: wget
  Running setup.py bdist_wheel for wget ... [?25ldone
[?25h  Stored in directory: /home/dsxuser/.cache/pip/wheels/40/15/30/7d8f7cea2902b4db79e3fea550d7d7b85ecb27ef992b618f3f
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2


In [9]:
import wget

link_to_data = 'https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spark/cars-4-you/data/car_rental_training_data.csv'
filename = wget.download(link_to_data)

print(filename)

car_rental_training_data.csv


Load the data as a pandas dataFrame.

In [10]:
import pandas as pd

data = pd.read_csv(filename, sep=';') 
data.head()

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
0,83,Female,M,2,48.85,Inactive,Yes,I thought the representative handled the initi...,0,Product: Availability/Variety/Size,Free Upgrade
1,1307,Female,M,0,55.0,Inactive,No,I have had a few recent rentals that have take...,0,Product: Availability/Variety/Size,Voucher
2,1737,Male,M,0,42.35,Inactive,Yes,car cost more because I didn't pay when I rese...,0,Product: Availability/Variety/Size,Free Upgrade
3,3721,Male,M,2,61.71,Inactive,Yes,I didn't get the car I was told would be avail...,0,Product: Availability/Variety/Size,Free Upgrade
4,11,Male,S,2,56.47,Active,No,If there was not a desired vehicle available t...,1,Product: Availability/Variety/Size,


**Note:** 0 - not satisfied, 1 - satisfied

Extract the required columns and count the number of records.

In [11]:
complain_data = data[['Customer_Service', 'Satisfaction']]

In [12]:
print(complain_data.count())

Customer_Service    486
Satisfaction        486
dtype: int64


<a id="model"></a>
## 3. Create a Keras model using the TensorFlow backend


In [13]:
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
import os
import numpy
from keras.models import Sequential
from keras.layers.convolutional import Conv1D, MaxPooling1D
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence

### 3.1 Prepare the data

In [14]:
max_fatures = 500

for idx,row in complain_data.iterrows():
    row[0] = row[0].replace('rt',' ')

tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(complain_data['Customer_Service'].values)
X = tokenizer.texts_to_sequences(complain_data['Customer_Service'].values)

maxlen = 50

X = pad_sequences(X, maxlen=maxlen)
print(X.shape)

(486, 50)


Split the data into train and test data sets.

In [15]:
Y = complain_data['Satisfaction'].values
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)

print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)

(325, 50) (325,)
(161, 50) (161,)


### 3.2 Design and train the model

Create the network definition based on the Gated Recurrent Unit (Cho et al. 2014).

In [16]:
embedding_vector_length = 32

model = Sequential()
model.add(Embedding(max_fatures, embedding_vector_length, input_length=maxlen))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 50, 32)            16000     
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 50, 32)            3104      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 25, 32)            0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 100)               53200     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 101       
Total params: 72,405
Trainable params: 72,405
Non-trainable params: 0
_________________________________________________________________
None


Train the model.

In [17]:
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=20, batch_size=64)

Train on 325 samples, validate on 161 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [18]:
print("Best accuracy on test: %3.3f" % numpy.array(history.history['val_acc']).max())

Best accuracy on test: 0.944


**Note:** For purpose of this demo, model tuning has been skipped.

Store and archive the model in the notebook filesystem.

In [19]:
# Evaluate the model
scores = model.evaluate(X_test, Y_test, verbose=0)
print("Evaluation Accuracy: %.2f%%" % (scores[1]*100))

Evaluation Accuracy: 94.41%


In [20]:
filename = 'complain_model.h5'
model.save(filename)

# Compress Keras model
tar_filename = filename + ".tgz"
cmdstring = "tar -zcvf " + tar_filename + " " + filename
os.system(cmdstring);

In [21]:
!ls -lat

total 1772
-rw-r----- 1 dsxuser dsxuser 815457 Oct 10 08:10 complain_model.h5.tgz
drwxr-x--- 2 dsxuser dsxuser   4096 Oct 10 08:10 .
-rw-r----- 1 dsxuser dsxuser 903944 Oct 10 08:10 complain_model.h5
-rw-r----- 1 dsxuser dsxuser  79518 Oct 10 08:10 car_rental_training_data.csv
drwx------ 1 dsxuser dsxuser   4096 Oct 10 08:10 ..


<a id="persistence"></a>
## 4. Store the model in the repository

In [22]:
model_props = {
    client.repository.ModelMetaNames.NAME: "CARS4U - Satisfaction Prediction Model",
    client.repository.ModelMetaNames.FRAMEWORK_NAME: "tensorflow",
    client.repository.ModelMetaNames.FRAMEWORK_VERSION: "1.5",
    client.repository.ModelMetaNames.RUNTIME_NAME: "python",
    client.repository.ModelMetaNames.RUNTIME_VERSION: "3.5",
    client.repository.ModelMetaNames.FRAMEWORK_LIBRARIES: [{'name':'keras', 'version': '2.1.3'}]
}

published_model_details = client.repository.store_model(model=tar_filename, meta_props=model_props)       

In [23]:
model_uid = client.repository.get_model_uid(published_model_details)
print(model_uid)

6bc8024b-6fea-4f77-9441-b5e41af55f01


<a id="deployment"></a>
## 5. Deploy the model

In [24]:
deployment = client.deployments.create(model_uid, 'CARS4U - Satisfaction Prediction Model Deployment')



#######################################################################################

Synchronous deployment creation for uid: '6bc8024b-6fea-4f77-9441-b5e41af55f01' started

#######################################################################################


INITIALIZING
DEPLOY_IN_PROGRESS
DEPLOY_SUCCESS


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='480f13de-df00-476c-8f21-723cc76965f4'
------------------------------------------------------------------------------------------------




In [25]:
client.deployments.list()

------------------------------------  ---------------------------------------------------------  ------  --------------  ------------------------  --------------  -------------
GUID                                  NAME                                                       TYPE    STATE           CREATED                   FRAMEWORK       ARTIFACT TYPE
480f13de-df00-476c-8f21-723cc76965f4  CARS4U - Satisfaction Prediction Model Deployment          online  DEPLOY_SUCCESS  2018-10-10T08:10:37.451Z  tensorflow-1.5  model
98455655-1601-4bcc-9a16-b4c76bf1e23a  Apar model deployment                                      online  DEPLOY_SUCCESS  2018-10-01T07:28:41.193Z  mllib-2.1       model
1adbcee6-99f4-4ac8-8171-6eef25e3c378  Apar model deployment                                      online  DEPLOY_SUCCESS  2018-09-28T13:38:28.383Z  mllib-2.1       model
eeae4b17-e24f-4042-8dfc-349836a63a69  CARS4U - Satisfaction Prediction - AI Function Deployment  online  DEPLOY_SUCCESS  2018-09-27T11:28:5

## 6. Score the model<a id="score"></a>

Let's see if our deployment works.

In [26]:
scoring_endpoint = client.deployments.get_scoring_url(deployment)

In [27]:
print(scoring_endpoint)

https://us-south.ml.cloud.ibm.com/v3/wml_instances/e30fe554-6e3e-4e0e-af06-90f93686f358/deployments/480f13de-df00-476c-8f21-723cc76965f4/online


In [28]:
index = 5

scoring_data = X[index].tolist()
print(X_test[index])
print(Y_test[index])

[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   4  43  19   3 242  25
 235   6  22 353   1 237  12  10   8 107  14  34   5  56]
1


In [29]:
scoring_payload = {'values': [scoring_data]}
scores = client.deployments.score(scoring_endpoint, scoring_payload)

Let's print the scoring results.

In [30]:
print(str(scores))

{'values': [[[0.00020281129400245845], [0], [0.00020281129400245845]]], 'fields': ['prediction', 'prediction_classes', 'probability']}


<a id="function"></a>
## 7. Payload logging for AI function - Python function

Let's define a function that does data preprocessing and model scoring for us. As we saw in the previous cells, the model expects numerical input, so the text comment needs to be preprocessed.

### 6.1 Define generic parameters

Define some generic parameters that our function will use to score the model.

#### Parameters

In [31]:
ai_params = {
    'scoring_endpoint': scoring_endpoint,
    'wml_credentials': wml_credentials,
    'word_index': tokenizer.word_index,
}

In [33]:
def score_generator(params=ai_params):
    import re
    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient(params['wml_credentials'])
        
    def score(payload):
        max_fatures = 500
        maxlen = 50

        preprocessed_records = []
        complain_data = payload['values']
        word_index = params['word_index']

        for data in complain_data:
            comment = data[0]
            cleanString = re.sub(r"[!\"#$%&()*+,-./:;<=>?@[\]^_`{|}~]", "", comment)
            splitted_comment = cleanString.split()[:maxlen]
            hashed_tokens = []

            for token in splitted_comment:
                index = word_index.get(token, 0)
                if index < 501 and index > 0:
                    hashed_tokens.append(index)

            hashed_tokens_size = len(hashed_tokens)
            padded_tokens = [0]*(maxlen-hashed_tokens_size) + hashed_tokens
            preprocessed_records.append(padded_tokens)

        scoring_payload = {'values': preprocessed_records}
        
        return client.deployments.score(params['scoring_endpoint'], scoring_payload)
        
        
    return score


#### Test the function locally

In [34]:
sample_data = {
    'fields': ['feedback'],
    'values': [
        ['delayed shuttle, almost missed flight, bad customer service'],
        ['The car was great and they were able to provide all features I wanted with limited time they had.']
    ]
}

In [35]:
score = score_generator()
score(sample_data)

{'fields': ['prediction', 'prediction_classes', 'probability'],
 'values': [[[0.0007073709275573492], [0], [0.0007073709275573492]],
  [[0.9991133809089661], [1], [0.9991133809089661]]]}

**Note:** 0 - not satisfied. 1 - satisfied

### 6.2 Store the function

In [36]:
client.repository.FunctionMetaNames.show()

------------------  ----  --------
META_PROP NAME      TYPE  REQUIRED
NAME                str   Y
DESCRIPTION         str   N
RUNTIME_UID         str   N
INPUT_DATA_SCHEMA   dict  N
OUTPUT_DATA_SCHEMA  dict  N
TAGS                list  N
------------------  ----  --------


In [37]:
meta_data = {
    client.repository.FunctionMetaNames.NAME: 'CARS4U - Satisfaction Prediction - AI Function',
}

function_details = client.repository.store_function(meta_props=meta_data, function=score_generator)

No RUNTIME_UID passed. Creating default runtime... SUCCESS

Successfully created default runtime with uid: c7514a67-85e2-40fb-8709-ea5a66431975


In [38]:
client.repository.list_functions()

------------------------------------  ----------------------------------------------  ------------------------  ------
GUID                                  NAME                                            CREATED                   TYPE
17d6fe12-2145-4b0d-b815-d7f5a0abbf78  CARS4U - Satisfaction Prediction - AI Function  2018-10-10T08:11:01.392Z  python
007c3c89-298d-453a-b637-1f8a28a7ca34  CARS4U - Satisfaction Prediction - AI Function  2018-09-27T11:28:53.470Z  python
6bf467f1-eb7d-4bfc-8162-4a5925e17eb5  CARS4U - Satisfaction Prediction - AI Function  2018-09-26T10:07:04.352Z  python
------------------------------------  ----------------------------------------------  ------------------------  ------


### 6.3 Deploy the function

In [39]:
function_uid = client.repository.get_function_uid(function_details)

function_deployment_details = client.deployments.create(artifact_uid=function_uid, name='CARS4U - Satisfaction Prediction - AI Function Deployment')



#######################################################################################

Synchronous deployment creation for uid: '17d6fe12-2145-4b0d-b815-d7f5a0abbf78' started

#######################################################################################


INITIALIZING
DEPLOY_IN_PROGRESS.
DEPLOY_SUCCESS


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='e3a4a1f7-7c65-48a3-8c46-f26926ba0c71'
------------------------------------------------------------------------------------------------




### 6.4 Score function

In [40]:
ai_function_scoring_endpoint = client.deployments.get_scoring_url(function_deployment_details)

print(ai_function_scoring_endpoint)

https://us-south.ml.cloud.ibm.com/v3/wml_instances/e30fe554-6e3e-4e0e-af06-90f93686f358/deployments/e3a4a1f7-7c65-48a3-8c46-f26926ba0c71/online


In [41]:
response = client.deployments.score(ai_function_scoring_endpoint, sample_data)

In [42]:
print(response)

{'values': [[[0.0007073709275573492], [0], [0.0007073709275573492]], [[0.9991133809089661], [1], [0.9991133809089661]]], 'fields': ['prediction', 'prediction_classes', 'probability']}


### Author

**Lukasz Cmielowski** is a Lead Data Scientist at IBM developing enterprise-level applications that substantially increases clients' ability to turn data into actionable knowledge.

### Citations


Copyright © 2017, 2018 IBM. This notebook and its source code are released under the terms of the MIT License.

<div style="background:#F5F7FA; height:110px; padding: 2em; font-size:14px;">
<span style="font-size:18px;color:#152935;">Love this notebook? </span>
<span style="font-size:15px;color:#152935;float:right;margin-right:40px;">Don't have an account yet?</span><br>
<span style="color:#5A6872;">Share it with your colleagues and help them discover the power of Watson Studio!</span>
<span style="border: 1px solid #3d70b2;padding:8px;float:right;margin-right:40px; color:#3d70b2;"><a href="https://ibm.co/wsnotebooks" target="_blank" style="color: #3d70b2;text-decoration: none;">Sign Up</a></span><br>
</div>


---
