<table style="border: none" align="left">
    <tr style="border: none">
       <th style="border: none"><img src="https://raw.githubusercontent.com/pmservice/cars-4-you/master/static/images/logo.png" width="200" alt="Icon"></th>
       <th style="border: none"><font face="verdana" size="5" color="black"><b>Customer Satisfaction Prediction</b></th>
   </tr>
</table>

<img align=left src="https://github.com/pmservice/cars-4-you/raw/master/static/images/ai_function.png" alt="Icon" width="664">

Keras model and AI function to determine if comment is a complain.

Contents

- [0. Setup](#setup)
- [1. Introduction](#introduction)
- [2. Load and explore data](#load)
- [3. Create Keras model using TensorFlow backend](#model)
- [4. Store the model in the repository](#persistence)
- [5. Deploy the model](#deployment)
- [6. AI function](#ai_function)

<a id="setup"></a>
## 0. Setup

Install TensorFlow version 1.5 and newest version of watson-machine-learning-client.

In [30]:
!pip install --upgrade tensorflow==1.5

Requirement already up-to-date: tensorflow==1.5 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages
Requirement not upgraded as not directly required: tensorflow-tensorboard<1.6.0,>=1.5.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: six>=1.10.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: wheel>=0.26 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: protobuf>=3.4.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: numpy>=1.12.1 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from tensorflow==1.5)
Requirement not upgraded as not directly required: absl-py>=0.1.6 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from

In [31]:
!rm -rf $PIP_BUILD/watson-machine-learning-client
!pip install --upgrade watson-machine-learning-client==1.0.277

Requirement already up-to-date: watson-machine-learning-client in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages
Requirement not upgraded as not directly required: pandas in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: tabulate in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: lomond in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: urllib3 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: tqdm in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement not upgraded as not directly required: requests in /opt/conda/envs/DSX-Python35/lib/

<a id="introduction"></a>
## 1. Introduction

This notebook trains a **Keras** (TensorFlow) model to predict customer satisfaction based on provided feedback. Notebook also shows usage of **AI Function** for deep learning model data preprocessing required before model scoring.

<a id="load"></a>
## 2. Load and explore data

In this section the data is loaded as pandas dataframe.

In [69]:

from ibmdbpy import IdaDataBase, IdaDataFrame

# @hidden_cell
# This connection object is used to access your data and contains your credentials.
# You might want to remove those credentials before you share your notebook.
idadb_c166344e776040b39f477655199897f8 = IdaDataBase(dsn='DASHDB;Database=BLUDB;Hostname=dashdb-entry-yp-dal10-01.services.dal.bluemix.net;Port=50000;PROTOCOL=TCPIP;UID=dash5120;PWD=G5_CehiL4_Ux')

data_df = IdaDataFrame(idadb_c166344e776040b39f477655199897f8, 'DASH5120.CAR_RENTAL_TRAINING').as_dataframe()
data_df.head()

# You can close the database connection with the following code. Please keep the comment line with the @hidden_cell tag,
# because the close function displays parts of the credentials.
# @hidden_cell
# idadb_c166344e776040b39f477655199897f8.close()
# To learn more about the ibmdby package, please read the documentation: http://pythonhosted.org/ibmdbpy/


Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
0,74,Male,M,1,26.26,Active,No,"no wait for pick up and drop off was great, he...",1,Product: Information,
1,83,Female,M,2,48.85,Inactive,Yes,I thought the representative handled the initi...,0,Product: Availability/Variety/Size,Free Upgrade
2,140,Female,S,0,36.92,Inactive,No,Everyone was very cooperative. The auto was r...,1,Product: Functioning,
3,191,Male,M,0,45.51,Inactive,Yes,what customer service? It was a nightmare,0,Service: Knowledge,Voucher
4,239,Male,M,1,46.0,Inactive,Yes,They did not have the auto I wanted. upgraded...,0,Product: Availability/Variety/Size,Free Upgrade


**Note:** 0 - not satisfied, 1 - satisfied

Extract needed columns and count number of records.

In [70]:
complain_data = data_df[['Customer_Service', 'Satisfaction']]

In [71]:
print(complain_data.count())

Customer_Service    482
Satisfaction        482
dtype: int64


<a id="model"></a>
## 3. Create Keras model using TensorFlow backend


In [223]:
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
import os
import numpy
from keras.models import Sequential
from keras.layers.convolutional import Conv1D, MaxPooling1D
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence

### 3.1 Prepare data

In [190]:
max_fatures = 500

for idx,row in complain_data.iterrows():
    row[0] = row[0].replace('rt',' ')

tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(complain_data['Customer_Service'].values)
X = tokenizer.texts_to_sequences(complain_data['Customer_Service'].values)

maxlen = 50

X = pad_sequences(X, maxlen=maxlen)
print(X.shape)

(482, 50)


Split into train and test datasets.

In [191]:
Y = complain_data['Satisfaction'].values
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)

print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)

(322, 50) (322,)
(160, 50) (160,)


### 3.2 Design and train model

Create the network definition based on Gated Recurrent Unit (Cho et al. 2014).

In [75]:
embedding_vector_length = 32

model = Sequential()
model.add(Embedding(max_fatures, embedding_vector_length, input_length=maxlen))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (None, 50, 32)            16000     
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 50, 32)            3104      
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 25, 32)            0         
_________________________________________________________________
lstm_3 (LSTM)                (None, 100)               53200     
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 101       
Total params: 72,405
Trainable params: 72,405
Non-trainable params: 0
_________________________________________________________________
None


Train the model.

In [76]:
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=20, batch_size=64)

Train on 322 samples, validate on 160 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [78]:
print("Best accuracy on test: %3.3f" % numpy.array(history.history['val_acc']).max())

Best accuracy on test: 0.931


**Note:** For purpose of this demo model tuning has been skipped.

Store and archive the model on notebook filesystem.

In [79]:
# evaluate the model
scores = model.evaluate(X_test, Y_test, verbose=0)
print("Evaluation Accuracy: %.2f%%" % (scores[1]*100))

Evaluation Accuracy: 91.25%


In [80]:
filename = 'complain_model.h5'
model.save(filename)

#compress keras model
tar_filename = filename + ".tgz"
cmdstring = "tar -zcvf " + tar_filename + " " + filename
os.system(cmdstring);

In [81]:
!ls -lat

total 1692
-rw-r----- 1 dsxuser dsxuser 818583 Jul 24 09:49 complain_model.h5.tgz
-rw-r----- 1 dsxuser dsxuser 903992 Jul 24 09:49 complain_model.h5
drwxr-x--- 2 dsxuser dsxuser   4096 Jul 24 08:51 .
drwx------ 1 dsxuser dsxuser   4096 Jul 24 07:27 ..


<a id="persistence"></a>
## 4. Store the model in the repository

In [82]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

In [83]:
# @hidden_cell

wml_credentials = {
  "instance_id": "000263d8-04e0-4060-ad69-fcfe40069018",
  "password": "7419325b-3de4-476c-94cb-4b158fa335b0",
  "url": "https://us-south.ml.cloud.ibm.com",
  "username": "cdc4b5da-8380-42f1-bd82-da044b283959"
}


In [84]:
client = WatsonMachineLearningAPIClient(wml_credentials)

In [85]:
model_props = {
    client.repository.ModelMetaNames.NAME: "CARS4U - Satisfaction Prediction Model",
    client.repository.ModelMetaNames.FRAMEWORK_NAME: "tensorflow",
    client.repository.ModelMetaNames.FRAMEWORK_VERSION: "1.5",
    client.repository.ModelMetaNames.RUNTIME_NAME: "python",
    client.repository.ModelMetaNames.RUNTIME_VERSION: "3.5",
    client.repository.ModelMetaNames.FRAMEWORK_LIBRARIES: [{'name':'keras', 'version': '2.1.3'}]
}

published_model_details = client.repository.store_model(model=tar_filename, meta_props=model_props)       

In [86]:
model_uid = client.repository.get_model_uid(published_model_details)
print(model_uid)

0b282150-0db3-407d-ac32-84500b262341


<a id="deployment"></a>
## 5. Deploy the model

### 5.1 Create deployment

In [87]:
deployment = client.deployments.create(model_uid, 'CARS4U - Satisfaction Prediction Model Deployment')



#######################################################################################

Synchronous deployment creation for uid: '0b282150-0db3-407d-ac32-84500b262341' started

#######################################################################################


INITIALIZING
DEPLOY_IN_PROGRESS..
DEPLOY_SUCCESS


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='726e333a-73a2-4720-b249-3be832903b4e'
------------------------------------------------------------------------------------------------




In [88]:
client.deployments.list()

------------------------------------  --------------------------------------------------  ------  --------------  ------------------------  --------------  ----------
GUID                                  NAME                                                TYPE    STATE           CREATED                   FRAMEWORK       ASSET TYPE
726e333a-73a2-4720-b249-3be832903b4e  CARS4U - Satisfaction Prediction Model Deployment   online  DEPLOY_SUCCESS  2018-07-24T09:49:11.282Z  tensorflow-1.5  model
b344d58c-590d-4a93-a007-23693947ce31  CARS4U - Business Area Prediction Model Deployment  online  DEPLOY_SUCCESS  2018-07-24T09:47:40.251Z  mllib-2.1       model
584a3b49-7b3c-4663-8b6f-6a433c041b1d  CARS4U - Satisfaction Prediction Model Deployment   online  DEPLOY_SUCCESS  2018-07-24T09:46:21.374Z  tensorflow-1.5  model
f9cabb53-a6fd-426a-a678-6091566d2b3a  CARS4U - Action Model Deployment                    online  DEPLOY_SUCCESS  2018-07-23T18:59:52.145Z  mllib-2.1       model
------------------

### 5.2 Score the model

Let's see if our deployment works.

In [89]:
scoring_endpoint = client.deployments.get_scoring_url(deployment)

In [90]:
print(scoring_endpoint)

https://us-south.ml.cloud.ibm.com/v3/wml_instances/000263d8-04e0-4060-ad69-fcfe40069018/deployments/726e333a-73a2-4720-b249-3be832903b4e/online


In [324]:
index = 5

scoring_data = X[index].tolist()
print(X_test[index])
print(Y_test[index])

[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0  40 107 131  19   1  22  77]
0


In [325]:
scoring_payload = {'values': [scoring_data]}
scores = client.deployments.score(scoring_endpoint, scoring_payload)

In [327]:
print(scoring_payload)

{'values': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 266, 138, 139, 267, 207, 115, 12, 8]]}


In [330]:
len(scoring_payload['values'][0])

50

In [329]:
print({'values': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 266, 138, 139, 267, 207, 115, 12, 8], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 2, 78, 5, 7, 13, 122, 3, 109, 51, 0, 0, 58, 15, 808, 31, 7, 23]]})

{'values': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 266, 138, 139, 267, 207, 115, 12, 8], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 2, 78, 5, 7, 13, 122, 3, 109, 51, 0, 0, 58, 15, 808, 31, 7, 23]]}


Let's print scoring results.

In [326]:
print(str(scores))

{'fields': ['prediction', 'prediction_classes', 'probability'], 'values': [[[0.008548633195459843], [0], [0.008548633195459843]]]}


<a id="function"></a>
## 6. AI function

Let's define AI function that does data preprocessing and model scoring for us. As noticed above model expects numerical input, so the text comment needs to be preprocessed.

### 6.1 Definition

Define some generic parameters our function will use to score the model.

#### Parameters

In [262]:
ai_params = {
    'scoring_endpoint': scoring_endpoint,
    'wml_credentials': wml_credentials,
    'word_index': tokenizer.word_index
}

#### Function

In [345]:
def score_generator(params=ai_params):

    def score(payload):
        import re
        from watson_machine_learning_client import WatsonMachineLearningAPIClient
        client = WatsonMachineLearningAPIClient(params['wml_credentials'])
        
        max_fatures = 500
        maxlen = 50

        preprocessed_records = []
        complain_data = payload['values']
        word_index = params['word_index']

        for data in complain_data:
            comment = data[0]
            cleanString = re.sub(r"[!\"#$%&()*+,-./:;<=>?@[\]^_`{|}~]", "", comment)
            splitted_comment = cleanString.split()[:maxlen]
            hashed_tokens = []

            for token in splitted_comment:
                index = word_index.get(token, 0)
                if index < 501 and index > 0:
                    hashed_tokens.append(index)

            hashed_tokens_size = len(hashed_tokens)
            padded_tokens = [0]*(maxlen-hashed_tokens_size) + hashed_tokens
            preprocessed_records.append(padded_tokens)

        scoring_payload = {'values': preprocessed_records}
        print(str(scoring_payload))
        
        return client.deployments.score(params['scoring_endpoint'], scoring_payload)
        
        
    return score

#### Test locally

In [346]:
sample_data = {
    'fields': ['feedback'],
    'values': [
        ['delayed shuttle, almost missed flight, bad customer service'],
        ['The car was great and they were able to provide all features I wanted with limited time they had.']
    ]
}

In [347]:
score = score_generator()
score(sample_data)

{'values': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 266, 138, 139, 267, 207, 115, 12, 8], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 2, 78, 5, 7, 13, 122, 3, 109, 51, 58, 15, 31, 7, 23]]}


{'fields': ['prediction', 'prediction_classes', 'probability'],
 'values': [[[0.008548641577363014], [0], [0.008548641577363014]],
  [[0.9832062125205994], [1], [0.9832062125205994]]]}

**Note:** 0 - not satisfied. 1 - satisfied

### 6.2 AI function storing

In [350]:
client.repository.FunctionMetaNames.show()

------------------  ----  --------  -------------
META_PROP NAME      TYPE  REQUIRED  DEFAULT_VALUE
NAME                str   Y
DESCRIPTION         str   N
TYPE                str   N         python
RUNTIME_URL         str   N
INPUT_DATA_SCHEMA   dict  N
OUTPUT_DATA_SCHEMA  dict  N
TAGS                list  N
------------------  ----  --------  -------------


In [351]:
function_details = client.repository.store_function(score_generator, 'CARS4U - Sentiment Prediction - AI Function')

Recognized generator function.


In [352]:
client.repository.list_functions()

------------------------------------  ----------------------------------------------------------  ------------------------  ------
GUID                                  NAME                                                        CREATED                   TYPE
34574059-255d-48fe-aa5c-74adc1309068  CARS4U - Sentiment Prediction - AI Function                 2018-07-24T14:49:49.064Z  python
2c4fdde6-353e-4746-9058-ae3a377b868c  CARS4U - Business area and Action Prediction - AI Function  2018-07-24T13:48:30.139Z  python
0a69ab8c-0a92-4625-8350-972d09d692b9  CARS4U - Sentiment Prediction - AI Function                 2018-07-24T13:41:43.298Z  python
86a6d7d3-daac-479b-9580-3accf27a2ab6  CARS4U - Sentiment Prediction - AI Function                 2018-07-24T13:38:35.215Z  python
8c1e531b-27ef-40ac-a316-54268230356c  CARS4U - Business area and Action Prediction - AI Function  2018-07-24T13:16:18.129Z  python
f3d4a4d4-0166-4f21-b389-3cbeae467617  CARS4U - Business area and Action Prediction - 

### 6.3 AI function deployment

In [353]:
function_uid = client.repository.get_function_uid(function_details)

function_deployment_details = client.deployments.create(function_uid, name='CARS4U - Sentiment Prediction - AI Function Deployment')



#######################################################################################

Synchronous deployment creation for uid: '34574059-255d-48fe-aa5c-74adc1309068' started

#######################################################################################


INITIALIZING
DEPLOY_IN_PROGRESS....
DEPLOY_SUCCESS


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='aeb667ed-5070-4c88-b12f-333e1325d1ad'
------------------------------------------------------------------------------------------------




### Score AI function

In [354]:
ai_function_scoring_endpoint = client.deployments.get_scoring_url(function_deployment_details)

print(ai_function_scoring_endpoint)

https://us-south.ml.cloud.ibm.com/v3/wml_instances/000263d8-04e0-4060-ad69-fcfe40069018/deployments/aeb667ed-5070-4c88-b12f-333e1325d1ad/online


In [355]:
response = client.deployments.score(ai_function_scoring_endpoint, sample_data)

In [356]:
print(response)

{'fields': ['prediction', 'prediction_classes', 'probability'], 'values': [[[0.008548641577363014], [0], [0.008548641577363014]], [[0.9832062125205994], [1], [0.9832062125205994]]]}


---
