# HOMEWORK 5: TEXT CLASSIFICATION
In this homework, you will create models to classify texts from the TRUE call center. There are two classification tasks:
1. Action Classification: Identify which action the customer would like to take (e.g. enquire, report, cancel)
2. Object Classification: Identify which object the customer is referring to (e.g. payment, truemoney, internet, roaming)

In this homework, you are asked to do the following tasks:
1. Data Cleaning
2. Preprocessing data for Keras
3. Build and evaluate a model for "action" classification
4. Build and evaluate a model for "object" classification
5. Build and evaluate a multi-task model that does both "action" and "object" classifications in one go


Note: we have removed phone numbers from the dataset for privacy purposes. 

## Import Libs

In [1]:
%matplotlib inline
import pandas as pd
import sklearn
import numpy as np
from IPython.display import display

import math
import glob
import re
import random
import collections
import os
import sys

from keras.preprocessing import sequence
from keras.models import Sequential, Model
from keras.layers import GRU, Dropout
from keras.models import load_model
from keras.layers import Embedding, Reshape, Activation, Input, Dense, Masking
from keras.layers.merge import Dot
from keras.utils import np_utils
from keras.utils.data_utils import get_file
from keras.utils.np_utils import to_categorical
from keras.preprocessing.sequence import skipgrams
from keras.preprocessing import sequence
from keras import backend as K
from keras.optimizers import Adam

import matplotlib.pyplot as plt

random.seed(42)

Using TensorFlow backend.


## Loading data
First, we load the data from disk into a Dataframe.

A Dataframe is essentially a table, or 2D-array/Matrix with a name for each column.

In [2]:
data_df = pd.read_csv('clean-phone-data-for-students.csv')

Let's preview the data.

In [3]:
# Show the top 5 rows
display(data_df.head())
# Summarize the data
data_df.describe()

Unnamed: 0,Sentence Utterance,Action,Object
0,<PHONE_NUMBER_REMOVED> ผมไปจ่ายเงินที่ Counte...,enquire,payment
1,internet ยังความเร็วอยุ่เท่าไหร ครับ,enquire,package
2,ตะกี้ไปชำระค่าบริการไปแล้ว แต่ยังใช้งานไม่ได้...,report,suspend
3,พี่ค่ะยังใช้ internet ไม่ได้เลยค่ะ เป็นเครื่อ...,enquire,internet
4,ฮาโหล คะ พอดีว่าเมื่อวานเปิดซิมทรูมูฟ แต่มันโ...,report,phone_issues


Unnamed: 0,Sentence Utterance,Action,Object
count,16175,16175,16175
unique,13389,10,33
top,บริการอื่นๆ,enquire,service
freq,97,10377,2525


## Data cleaning

Look at the DataFrame.describe() output again.
Notice that there are 33 unique labels/classes for object and 10 unique labels for action that the model will try to predict.
However, some of these are unwanted duplicates that differ only in capitalization, e.g. Idd and idd, or loyalty_card and Loyalty_card.

Also note that there are only 13389 unique sentence utterances among the 16175 utterances. You have to clean that too!

## #TODO 1: 
You will have to remove unwanted label duplications as well as duplications in text inputs. 
Also, you will have to trim unwanted whitespace from the text inputs. 
This shouldn't be too hard, as you have already seen it in the demo.
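
The three cleaning steps described above can be sketched on a toy table. This is a minimal illustration, not the reference solution: the rows in `toy_df` are made up, and `clean_frame` is a hypothetical helper name. It strips whitespace before dropping duplicates, so that utterances differing only in surrounding whitespace are merged as well.

```python
import pandas as pd

# Toy stand-in for the homework data; these rows are invented for illustration.
toy_df = pd.DataFrame({
    "Sentence Utterance": ["  hello ", "hello", "check balance", "check balance"],
    "Action": ["Enquire", "enquire", "enquire", "enquire"],
    "Object": ["Idd", "idd", "balance", "balance"],
})

def clean_frame(df):
    """Lower-case the labels, strip the text, then drop duplicated utterances."""
    out = df.copy()
    out["Action"] = out["Action"].str.lower()
    out["Object"] = out["Object"].str.lower()
    out["Sentence Utterance"] = out["Sentence Utterance"].str.strip()
    # Keep the first occurrence of each utterance and renumber the rows.
    return out.drop_duplicates("Sentence Utterance", keep="first").reset_index(drop=True)

cleaned = clean_frame(toy_df)
print(cleaned)
```

On the toy table, the four rows collapse to two, and `Idd` and `idd` become a single label.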



In [4]:
#TODO1
data_df['clean_label_obj']=data_df['Object'].str.lower().copy()
data_df['clean_label_act']=data_df['Action'].str.lower().copy()
data_df.drop('Action', axis=1, inplace=True)
data_df.drop('Object', axis=1, inplace=True)
display(data_df.clean_label_obj.unique())
display(data_df.clean_label_act.unique())

array(['payment', 'package', 'suspend', 'internet', 'phone_issues',
       'service', 'nontruemove', 'balance', 'detail', 'bill', 'credit',
       'promotion', 'mobile_setting', 'iservice', 'roaming', 'truemoney',
       'information', 'lost_stolen', 'balance_minutes', 'idd', 'garbage',
       'ringtone', 'rate', 'loyalty_card', 'contact', 'officer'],
      dtype=object)

array(['enquire', 'report', 'cancel', 'buy', 'activate', 'request',
       'garbage', 'change'], dtype=object)

In [5]:
#TODO1
data_df = data_df.drop_duplicates("Sentence Utterance", keep="first")
display(data_df.describe())

Unnamed: 0,Sentence Utterance,clean_label_obj,clean_label_act
count,13389,13389,13389
unique,13389,26,8
top,ประวัติการใช้งานค่ะ,service,enquire
freq,1,2111,8658


In [6]:
# .values replaces the deprecated DataFrame.as_matrix()
data = np.array(data_df.values, copy=True)
print(data[:,0])
print(data[:,1])
print(data[:,2])

[' <PHONE_NUMBER_REMOVED> ผมไปจ่ายเงินที่ Counter Services เค้าเช็ต 3276.25 บาท เมื่อวานที่ผมเช็คที่ศูนย์บอกมียอด 3057.79 บาท'
 ' internet ยังความเร็วอยุ่เท่าไหร ครับ'
 ' ตะกี้ไปชำระค่าบริการไปแล้ว แต่ยังใช้งานไม่ได้ ค่ะ' ...
 'ยอดเงินเหลือเท่าไหร่ค่ะ' 'ยอดเงินในระบบ'
 'สอบถามโปรโมชั่นปัจจุบันที่ใช้อยู่ค่ะ']
['payment' 'package' 'suspend' ... 'balance' 'balance' 'package']
['enquire' 'enquire' 'report' ... 'enquire' 'enquire' 'enquire']


## #TODO 2: Preprocessing data for Keras
You will be using Keras in this assignment. Please show us how you prepare your data for keras.
Don't forget to split data into train and test sets (+ validation set if you want)
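
One part of this preparation is turning string labels into integer ids and then into one-hot vectors. The sketch below shows the idea in plain Python, assuming hypothetical helper names `build_label_maps` and `one_hot`; in practice NumPy and Keras utilities can do the same job.

```python
def build_label_maps(labels):
    """Assign an integer id to each distinct label, in order of first appearance."""
    label_to_id = {}
    for label in labels:
        if label not in label_to_id:
            label_to_id[label] = len(label_to_id)
    # The reverse map is needed later to turn predictions back into label names.
    id_to_label = {i: label for label, i in label_to_id.items()}
    return label_to_id, id_to_label

def one_hot(index, num_classes):
    """Return a list of zeros with a single 1 at position `index`."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec

actions = ["enquire", "report", "enquire", "cancel"]
act_to_id, id_to_act = build_label_maps(actions)
encoded = [one_hot(act_to_id[a], len(act_to_id)) for a in actions]
print(act_to_id)
print(encoded)
```

Keeping one pair of maps per task (one for actions, one for objects) avoids mixing up the two label spaces.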

In [7]:
# TODO2: Preprocessing data for Keras
# data[:, 0] => input text
# data[:, 1] => object label
# data[:, 2] => action label

# Map each object label to an integer id (and back)
unique_label_obj = data_df.clean_label_obj.unique()
obj_label_2_num_map = dict(zip(unique_label_obj, range(len(unique_label_obj))))
obj_num_2_label_map = dict(zip(range(len(unique_label_obj)), unique_label_obj))
data[:,1] = np.vectorize(obj_label_2_num_map.get)(data[:,1])

# Map each action label to an integer id (and back);
# separate names keep the object maps from being overwritten
unique_label_act = data_df.clean_label_act.unique()
act_label_2_num_map = dict(zip(unique_label_act, range(len(unique_label_act))))
act_num_2_label_map = dict(zip(range(len(unique_label_act)), unique_label_act))
data[:,2] = np.vectorize(act_label_2_num_map.get)(data[:,2])

In [8]:
def strip_str(string):
    return string.strip()
# Trim extra leading and trailing whitespace from each string
data[:,0] = np.vectorize(strip_str)(data[:,0])

In [9]:
# Create a character map
CHARS = [
  '\n', ' ', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+',
  ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8',
  '9', ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E',
  'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R',
  'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_',
  'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
  'n', 'o', 'other', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y',
  'z', '}', '~', 'ก', 'ข', 'ฃ', 'ค', 'ฅ', 'ฆ', 'ง', 'จ', 'ฉ', 'ช',
  'ซ', 'ฌ', 'ญ', 'ฎ', 'ฏ', 'ฐ', 'ฑ', 'ฒ', 'ณ', 'ด', 'ต', 'ถ', 'ท',
  'ธ', 'น', 'บ', 'ป', 'ผ', 'ฝ', 'พ', 'ฟ', 'ภ', 'ม', 'ย', 'ร', 'ฤ',
  'ล', 'ว', 'ศ', 'ษ', 'ส', 'ห', 'ฬ', 'อ', 'ฮ', 'ฯ', 'ะ', 'ั', 'า',
  'ำ', 'ิ', 'ี', 'ึ', 'ื', 'ุ', 'ู', 'ฺ', 'เ', 'แ', 'โ', 'ใ', 'ไ',
  'ๅ', 'ๆ', '็', '่', '้', '๊', '๋', '์', 'ํ', '๐', '๑', '๒', '๓',
  '๔', '๕', '๖', '๗', '๘', '๙', '‘', '’', '\ufeff'
]
CHARS_MAP = {v: k for k, v in enumerate(CHARS)}
char = np.array(CHARS)

In [10]:
def create_n_gram_df(df, n_pad):
  """
  Given an input dataframe, create a feature dataframe of shifted characters
  Input:
  df: timeseries of size (N)
  n_pad: the number of context. For a given character at position [idx],
    character at position [idx-n_pad/2 : idx+n_pad/2] will be used 
    as features for that character.
  
  Output:
  dataframe of size (N * n_pad) which each row contains the character, 
    n_pad_2 characters to the left, and n_pad_2 characters to the right
    of that character.
  """
  n_pad_2 = int((n_pad - 1)/2)
  for i in range(n_pad_2):
      df['char-{}'.format(i+1)] = df['char'].shift(i + 1)
      df['char{}'.format(i+1)] = df['char'].shift(-i - 1)
  return df[n_pad_2: -n_pad_2]


def prepare_wiki_feature(raw_text_input):
    """
    Transform a list of characters (one utterance) into a feature matrix
    in which each row holds a character and its surrounding context
    """
    # we use padding equals 21 here to consider 10 characters to the left
    # and 10 characters to the right as features for the character in the middle
    n_pad = 21
    n_pad_2 = int((n_pad - 1)/2)
    pad = [{'char': ' ', 'target': True}]
    df_pad = pd.DataFrame(pad * n_pad_2)

    df = []

    df.append(pd.DataFrame(  {'char': raw_text_input}))

    df = pd.concat(df)
    # pad with empty string feature
    df = pd.concat((df_pad, df, df_pad))

    # map characters to numbers, use 'other' if not in the predefined character set.
    df['char'] = df['char'].map(lambda x: CHARS_MAP.get(x, 80))

    # Use nearby characters as features
    df_with_context = create_n_gram_df(df, n_pad=n_pad)

    char_row = ['char' + str(i + 1) for i in range(n_pad_2)] + \
             ['char-' + str(i + 1) for i in range(n_pad_2)] + ['char']

    # convert pandas dataframe to numpy array to feed to the model
    # (.values replaces the deprecated DataFrame.as_matrix())
    x_char = df_with_context[char_row].values

    return x_char

#A function for displaying our features in text
def print_features(tfeature,index):
    feature = np.array(tfeature[index],dtype=int).reshape(21,1)
    #Convert to string
    char_list = char[feature]
    left = ''.join(reversed(char_list[10:20].reshape(10))).replace(" ", "")
    center = ''.join(char_list[20])
    right =  ''.join(char_list[0:10].reshape(10)).replace(" ", "")
    word = ''.join([left,' ',center,' ',right])
    print(center + ': ' + word )

In [11]:
from keras.layers import Conv1D, MaxPooling1D, Embedding, TimeDistributed
from keras.layers import Activation, Dropout, Flatten, Dense, Input,GRU, Bidirectional
def get_your_nn():
    max_features = len(CHARS)+1
    max_len=21
    #replace "pass" with code for your neural net
    input1 = Input(shape=(21,))
    x = Embedding(max_features, 32, input_length=max_len)(input1)
    x = Conv1D(100, 5, strides = 1, padding='same', activation='relu')(x)
    x = TimeDistributed(Dense(5, activation='relu'))(x)
    x = Flatten()(x)
    x = Dense(100, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)

    model = Model(inputs=input1, outputs=out)
    model.compile(optimizer=Adam(),
                loss='binary_crossentropy',
                metrics=['acc'])
    return model


model = get_your_nn()
model.load_weights("/media/kok/New Volume/NLP/nlp_2019/HW5/model_conv1d_nn.h5")

In [12]:
x_char= []
for ss in data[:,0]:
    x_char.append(prepare_wiki_feature(list(ss)))




In [13]:
def char_to_word(raw_text, y_pred):
    """ add spaces between words in the raw text based on your prediction
    """
    split_text=""
    for char, y in zip(raw_text,y_pred):
        if y == 1:
            split_text+=" "
            split_text+=char
        else:
            split_text+=char
    return split_text.split(" ")
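
A small worked example helps explain the empty strings that show up in the token lists. The function below repeats the logic of `char_to_word` so the snippet is self-contained; the input string and boundary predictions are made up. When a literal space is itself predicted as a word start, splitting on `" "` produces empty tokens around it.

```python
def char_to_word(raw_text, y_pred):
    """Insert a space before every character predicted as a word start, then split."""
    split_text = ""
    for char, y in zip(raw_text, y_pred):
        if y == 1:
            split_text += " "
        split_text += char
    return split_text.split(" ")

# Boundaries predicted at 'a', at the literal space, and at 'c'.
tokens = char_to_word("ab cd", [1, 0, 1, 1, 0])
print(tokens)

# Optional cleanup (not done in the notebook): drop the empty tokens.
non_empty = [t for t in tokens if t]
print(non_empty)
```

Filtering out empty tokens is one possible cleanup; whether it helps the classifiers is something to check empirically.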

In [14]:
print(' data length: ', len(x_char))

 data length:  13389


In [15]:
def getTokens(x_char,data,model):
    y_pred = model.predict(x_char)
    prob_to_class = lambda p: 1 if p[0]>=0.5 else 0
    y_pred = np.apply_along_axis(prob_to_class,1,y_pred)
    return char_to_word(data, y_pred)

In [16]:
tokenData=np.array(data)
tokenData[:,0]=[getTokens(x_char[i],data[:,0][i],model) for i in range (len(data[:,0]))]

In [17]:
print(tokenData[:5])

[[list(['', '<PHONE_NUMBER_REMOV', 'ED>', '', '', 'ผม', 'ไป', 'จ่าย', 'เงิน', 'ที่', '', '', 'Counter', '', '', 'Services', '', '', 'เค้า', 'เช็ต', '', '', '3276', '.', '25', '', '', 'บาท', '', '', 'เมื่อ', 'วาน', 'ที่', 'ผม', 'เช็ค', 'ที่', 'ศูนย์', 'บอก', 'มี', 'ยอด', '', '', '305', '7', '.', '79', '', '', 'บาท'])
  0 0]
 [list(['', 'internet', '', '', 'ยัง', 'ความ', 'เร็ว', 'อยุ่เท่า', 'ไหร', '', '', 'ครับ'])
  1 0]
 [list(['', 'ตะกี้', 'ไป', 'ชำระ', 'ค่า', 'บริการ', 'ไป', 'แล้ว', '', '', 'แต่', 'ยัง', 'ใช้', 'งาน', 'ไม่', 'ได้', '', '', 'ค่ะ'])
  2 1]
 [list(['', 'พี่', 'ค่ะ', 'ยัง', 'ใช้', '', '', 'internet', '', '', 'ไม่', 'ได้', 'เลย', 'ค่ะ', '', '', 'เป็น', 'เครื่อง', '', '', 'โกล', 'ไล'])
  3 0]
 [list(['', 'ฮาโหล', '', '', 'คะ', '', '', 'พอดี', 'ว่า', 'เมื่อ', 'วาน', 'เปิด', 'ซิม', 'ทรูมูฟ', '', '', 'แต่', 'มัน', 'โทร', 'ออก', 'ไม่', 'ได้', 'คะ', '', '', 'แต่', 'เล่น', 'เนต', 'ได้', 'คะ'])
  4 1]]


In [18]:
max_pad = max([len(i) for i in tokenData[:,0]])
print('max padding :',max_pad)

dictionary = dict()
dictionary["for_keras_zero_padding"] = 0
dictionary["UNK"] = 1
for ss in tokenData[:,0]:
    for w in ss:
        if w in dictionary:
            dictionary[w] = (dictionary[w][0],dictionary[w][1]+1)
        else:
            dictionary[w] = (len(dictionary),1)
            
def create_index(words,dictionary):
    data = list()
    for word in words:
        if word in dictionary and dictionary[word][1]>1:
            data.append(dictionary[word][0])
        else:
            data.append(dictionary["UNK"])
    return np.concatenate(([data],[[dictionary["for_keras_zero_padding"]]*(max_pad-len(data)) if len(data)<max_pad else []]),axis=1)

max padding : 158
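
The indexing scheme above (id 0 for padding, id 1 for unknown words, rare words mapped to unknown) can be sketched in plain Python. This is a simplified variant under stated assumptions: `build_vocab` and `encode` are hypothetical names, and unlike the notebook it assigns ids only to words that meet the frequency threshold, so the ids have no gaps.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # reserved ids, as in the notebook's dictionary

def build_vocab(token_lists, min_count=2):
    """Give an id (starting at 2) to every token seen at least min_count times."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {}
    for tok, n in counts.items():
        if n >= min_count:
            vocab[tok] = len(vocab) + 2
    return vocab

def encode(tokens, vocab, max_len):
    """Map tokens to ids, truncate to max_len, and right-pad with PAD_ID."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

sentences = [["a", "b"], ["a", "c"], ["b", "a"]]
vocab = build_vocab(sentences)
encoded = encode(["a", "c", "b"], vocab, max_len=5)
print(vocab)
print(encoded)
```

Here `c` occurs only once, so it is encoded as the unknown id.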


In [19]:
inData = np.array(tokenData)
inData[:,0] = [create_index(i,dictionary)[0] for i in tokenData[:,0]]
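# Map each index back to its word; regular entries are stored as (index, count) tuples
reverse_dictionary = {(v[0] if isinstance(v, tuple) else v): k for k, v in dictionary.items()}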

In [20]:
print(inData[:,0][0])

[ 2  3  4  2  2  5  6  7  8  9  2  2  1  2  2 11  2  2 12 13  2  2  1 15
 16  2  2 17  2  2 18 19  9  5 20  9 21 22 23 24  2  2  1 26 15 27  2  2
 17  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0]


In [21]:
from keras.utils import to_categorical
np.random.shuffle(inData)
act = to_categorical(inData[:,2],dtype=int)
obj = to_categorical(inData[:,1],dtype=int)
# Hold out the last 1/ratio of the shuffled rows (about 20%) as the test set
ratio = 5
x_train = np.array(list(inData[:-len(inData)//ratio,0]))
x_test = np.array(list(inData[-len(inData)//ratio:,0]))

y_train_obj = np.array(list(obj[:-len(inData)//ratio]))
y_test_obj = np.array(list(obj[-len(inData)//ratio:]))

y_train_act = np.array(list(act[:-len(inData)//ratio]))
y_test_act = np.array(list(act[-len(inData)//ratio:]))

print('X_train :',x_train.shape)
print('X_test :',x_test.shape)
print('Y_train_act :',y_train_act.shape)
print('Y_train_obj :',y_train_obj.shape)
print('Y_test_act :',y_test_act.shape)
print('Y_test_obj :',y_test_obj.shape)
print(x_train[0])
print(y_train_act[0])
print(y_train_obj[0])

X_train : (10711, 158)
X_test : (2678, 158)
Y_train_act : (10711, 8)
Y_train_obj : (10711, 26)
Y_test_act : (2678, 8)
Y_test_obj : (2678, 26)
[  2 145  69   2   2  67  41 119 118 412   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
[1 0 0 0 0 0 0 0]
[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
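
The split sizes printed above follow from how Python evaluates `-len(inData)//ratio`: the unary minus is applied first, and floor division then rounds toward negative infinity. The sketch below reproduces the arithmetic with a plain list standing in for the data.

```python
n_rows, ratio = 13389, 5

# (-13389) // 5 floors -2677.8 down to -2678.
cut = -n_rows // ratio

rows = list(range(n_rows))   # stand-in for the shuffled data
train = rows[:cut]           # everything except the last 2678 rows
test = rows[cut:]            # the last 2678 rows

print(cut, len(train), len(test))
```

Because of the flooring, the test set gets 2678 rows rather than the 2677 that `n_rows // ratio` would give.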


## #TODO 3: Build and evaluate a model for "action" classification


In [34]:
#TODO 3: Build and evaluate a model for "action" classification
def get_your_action():
    max_features = len(dictionary)+1
    max_len=max_pad
    #replace "pass" with code for your neural net
    input1 = Input(shape=(max_len,))
    x = Embedding(max_features, 32, input_length=max_len,trainable=True)(input1)
    x = Conv1D(50, 3, strides = 1, padding='same', activation='relu')(x)
    x = TimeDistributed(Dense(5, activation='relu'))(x)
    x = Flatten()(x)
    x = Dropout(0.1)(x)
    x = Dense(50, activation='relu')(x)
    out = Dense(len(data_df.clean_label_act.unique()), activation='sigmoid')(x)

    model = Model(inputs=input1, outputs=out)
    model.compile(optimizer=Adam(),
                loss='binary_crossentropy',
                metrics=['acc'])
    return model


action_model = get_your_action()
action_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_6 (InputLayer)         (None, 158)               0         
_________________________________________________________________
embedding_6 (Embedding)      (None, 158, 32)           197408    
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 158, 50)           4850      
_________________________________________________________________
time_distributed_6 (TimeDist (None, 158, 5)            255       
_________________________________________________________________
flatten_6 (Flatten)          (None, 790)               0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 790)               0         
_________________________________________________________________
dense_17 (Dense)             (None, 50)                39550     
__________

In [37]:
%%time
action_model.fit(x_train,y_train_act,validation_data=(x_test,y_test_act),batch_size=128,epochs=30,verbose=1)

Train on 10711 samples, validate on 2678 samples
Epoch 1/30
...
Epoch 30/30
CPU times: user 28.3 s, sys: 4.86 s, total: 33.2 s
Wall time: 27.7 s


<keras.callbacks.History at 0x7f169739c5f8>

In [38]:
from sklearn.metrics import classification_report
y_pred = action_model.predict(x_test)
y_pred = (y_pred > 0.5)
print(classification_report(y_test_act,y_pred,target_names=data_df.clean_label_act.unique()))

              precision    recall  f1-score   support

     enquire       0.83      0.91      0.87      1686
      report       0.76      0.57      0.66       327
      cancel       0.91      0.82      0.86       222
         buy       0.74      0.56      0.63       154
    activate       0.76      0.50      0.61       107
     request       0.50      0.13      0.20        70
     garbage       0.00      0.00      0.00         7
      change       0.65      0.66      0.65       105

   micro avg       0.82      0.79      0.81      2678
   macro avg       0.64      0.52      0.56      2678
weighted avg       0.80      0.79      0.79      2678
 samples avg       0.79      0.79      0.79      2678
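
Thresholding each sigmoid output at 0.5 can assign zero or several classes to a single utterance, which is why the micro-averaged precision and recall above differ and a "samples avg" row appears. Since every utterance has exactly one label, an alternative is to pick the highest-scoring class. The sketch below compares the two decoding rules on made-up probabilities; `threshold_decode` and `argmax_decode` are hypothetical helper names.

```python
import numpy as np

def threshold_decode(probs, threshold=0.5):
    """Mark every class whose probability exceeds the threshold."""
    return (probs > threshold).astype(int)

def argmax_decode(probs):
    """Mark exactly one class per row: the one with the highest probability."""
    one_hot = np.zeros_like(probs, dtype=int)
    one_hot[np.arange(len(probs)), probs.argmax(axis=1)] = 1
    return one_hot

# Invented scores: row 0 has two classes above 0.5, row 1 has none.
probs = np.array([[0.6, 0.7, 0.1],
                  [0.3, 0.4, 0.2],
                  [0.9, 0.05, 0.05]])

by_threshold = threshold_decode(probs)
by_argmax = argmax_decode(probs)
print(by_threshold)
print(by_argmax)
```

With argmax decoding the output can be passed to `classification_report` in the same one-hot form; pairing it with a softmax output layer and categorical cross-entropy is a common choice for single-label problems.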



## #TODO 4: Build and evaluate a model for "object" classification



In [39]:
#TODO 4: Build and evaluate a model for "object" classification
def get_your_object():
    max_features = len(dictionary)+1
    max_len=max_pad
    #replace "pass" with code for your neural net
    input1 = Input(shape=(max_len,))
    x = Embedding(max_features, 32, input_length=max_len,trainable=True)(input1)
    x = Conv1D(100, 5, strides = 1, padding='same', activation='relu')(x)
    x = TimeDistributed(Dense(5, activation='relu'))(x)
    x = Flatten()(x)
    x = Dropout(0.1)(x)
    x = Dense(100, activation='relu')(x)
    out = Dense(len(data_df.clean_label_obj.unique()), activation='sigmoid')(x)

    model = Model(inputs=input1, outputs=out)
    model.compile(optimizer=Adam(),
                loss='binary_crossentropy',
                metrics=['acc'])
    return model


obj_model = get_your_object()
obj_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_7 (InputLayer)         (None, 158)               0         
_________________________________________________________________
embedding_7 (Embedding)      (None, 158, 32)           197408    
_________________________________________________________________
conv1d_7 (Conv1D)            (None, 158, 100)          16100     
_________________________________________________________________
time_distributed_7 (TimeDist (None, 158, 5)            505       
_________________________________________________________________
flatten_7 (Flatten)          (None, 790)               0         
_________________________________________________________________
dropout_6 (Dropout)          (None, 790)               0         
_________________________________________________________________
dense_20 (Dense)             (None, 100)               79100     
__________

In [40]:
%%time
obj_model.fit(x_train,y_train_obj,validation_data=(x_test,y_test_obj),batch_size=128,epochs=50,verbose=1)

Train on 10711 samples, validate on 2678 samples
Epoch 1/50
...
Epoch 50/50
CPU times: user 1min 3s, sys: 12.4 s, total: 1min 15s
Wall time: 1min 10s


<keras.callbacks.History at 0x7f16973cd940>

In [41]:
y_pred = obj_model.predict(x_test)
y_pred = (y_pred > 0.5)
print(classification_report(y_test_obj,y_pred,target_names=data_df.clean_label_obj.unique()))

                 precision    recall  f1-score   support

        payment       0.50      0.48      0.49       130
        package       0.71      0.62      0.66       346
        suspend       0.76      0.66      0.71       152
       internet       0.75      0.63      0.68       368
   phone_issues       0.64      0.42      0.51       136
        service       0.79      0.70      0.74       418
    nontruemove       0.24      0.13      0.17        45
        balance       0.78      0.74      0.76       282
         detail       0.40      0.26      0.32        61
           bill       0.50      0.58      0.54        97
         credit       0.80      0.41      0.54        39
      promotion       0.65      0.63      0.64       237
 mobile_setting       0.44      0.35      0.39        48
       iservice       0.00      0.00      0.00         5
        roaming       0.82      0.64      0.72        50
      truemoney       0.58      0.64      0.61        44
    information       0.44    

## #TODO 5: Build and evaluate a multi-task model that does both "action" and "object" classifications in one-go 

This can be a bit tricky if you are not familiar with the Keras functional API. PLEASE READ this webpage (https://keras.io/getting-started/functional-api-guide/) before you start this task.

Your model will have 2 separate output layers: one for the action classification task and another for the object classification task.

This is a rough sketch of what your model might look like:
![image](https://raw.githubusercontent.com/ekapolc/nlp_course/master/HW5/multitask_sketch.png)
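
The structure in the sketch can be illustrated with a plain NumPy forward pass: one shared hidden representation feeds two output heads, and the training loss is the sum of the two per-task losses (Keras combines multiple output losses the same way, with equal weights unless `loss_weights` is given). All weights and inputs below are random placeholders, and the layer sizes other than the 26 object and 8 action classes are arbitrary assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over all samples and classes."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))))

rng = np.random.default_rng(0)
n_samples, n_features, n_hidden = 4, 6, 5   # arbitrary toy sizes
n_obj, n_act = 26, 8                        # class counts from the cleaned data

x = rng.normal(size=(n_samples, n_features))
w_shared = rng.normal(size=(n_features, n_hidden))
w_obj = rng.normal(size=(n_hidden, n_obj))
w_act = rng.normal(size=(n_hidden, n_act))

# Shared trunk (ReLU), then one head per task.
shared = np.maximum(0.0, x @ w_shared)
p_obj = sigmoid(shared @ w_obj)
p_act = sigmoid(shared @ w_act)

# Random one-hot targets, only to make the loss computable.
y_obj = np.eye(n_obj)[rng.integers(0, n_obj, size=n_samples)]
y_act = np.eye(n_act)[rng.integers(0, n_act, size=n_samples)]

loss_obj = binary_crossentropy(y_obj, p_obj)
loss_act = binary_crossentropy(y_act, p_act)
total_loss = loss_obj + loss_act
print(p_obj.shape, p_act.shape, total_loss)
```

Because both heads read the same `shared` activations, gradients from both tasks update the shared weights, which is the point of multi-task training.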

In [42]:
#TODO 5: Build and evaluate a multi-task model that does both "action" and "object" classifications in one-go
def get_your_mix():
    max_features = len(dictionary)+1
    max_len=max_pad
    #replace "pass" with code for your neural net
    input1 = Input(shape=(max_len,))
    x = Embedding(max_features, 32, input_length=max_len,trainable=True)(input1)
    x = Conv1D(100, 5, strides = 1, padding='same', activation='relu')(x)
    x = TimeDistributed(Dense(5, activation='relu'))(x)
    x = Flatten()(x)
    x = Dropout(0.1)(x)
    x = Dense(100, activation='relu')(x)
    out1 = Dense(len(data_df.clean_label_obj.unique()), activation='sigmoid')(x)
    out2 = Dense(len(data_df.clean_label_act.unique()), activation='sigmoid')(x)

    model = Model(inputs=input1, outputs=[out1,out2])
    model.compile(optimizer=Adam(),
                loss='binary_crossentropy',
                metrics=['acc'])
    return model


mix_model = get_your_mix()
mix_model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_8 (InputLayer)            (None, 158)          0                                            
__________________________________________________________________________________________________
embedding_8 (Embedding)         (None, 158, 32)      197408      input_8[0][0]                    
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 158, 100)     16100       embedding_8[0][0]                
__________________________________________________________________________________________________
time_distributed_8 (TimeDistrib (None, 158, 5)       505         conv1d_8[0][0]                   
__________________________________________________________________________________________________
flatten_8 

In [43]:
%%time
mix_model.fit(x_train,[y_train_obj,y_train_act],validation_data=(x_test,[y_test_obj,y_test_act]),batch_size=128,epochs=50,verbose=1)

Train on 10711 samples, validate on 2678 samples
Epoch 1/50
...
Epoch 50/50
CPU times: user 1min 20s, sys: 15.2 s, total: 1min 35s
Wall time: 1min 17s


<keras.callbacks.History at 0x7f16969eac50>

In [51]:
y_pred1,y_pred2 = mix_model.predict(x_test)
y_pred1 = (y_pred1 > 0.5)
y_pred2 = (y_pred2 > 0.5)
print(classification_report(y_test_obj,y_pred1,target_names=data_df.clean_label_obj.unique()))
print(classification_report(y_test_act,y_pred2,target_names=data_df.clean_label_act.unique()))

                 precision    recall  f1-score   support

        payment       0.49      0.41      0.44       130
        package       0.69      0.53      0.60       346
        suspend       0.69      0.55      0.61       152
       internet       0.69      0.70      0.69       368
   phone_issues       0.56      0.35      0.43       136
        service       0.78      0.64      0.71       418
    nontruemove       0.43      0.07      0.12        45
        balance       0.82      0.66      0.73       282
         detail       0.57      0.26      0.36        61
           bill       0.47      0.29      0.36        97
         credit       0.88      0.36      0.51        39
      promotion       0.67      0.52      0.59       237
 mobile_setting       0.44      0.33      0.38        48
       iservice       0.00      0.00      0.00         5
        roaming       0.74      0.52      0.61        50
      truemoney       0.89      0.77      0.83        44
    information       0.80    