## Using classification models for aspect-level sentiment analysis of reviews in Chinese with word embedding and vectorization

> This notebook file contains all the code written to obtain the results as described in the thesis report

### Importing Packages
> This next cell contains all the packages used in this project

In [1]:
import random # to create random numbers/alphabets -> https://docs.python.org/2/library/random.html
import pandas as pd #-> https://pandas.pydata.org/
import tensorflow as tf # for creating neural networks - > https://www.tensorflow.org/api_docs
#from gensim.models.word2vec import Word2Vec
import jieba # to cut chinese text -> https://github.com/fxsjy/jieba
from collections import Counter # counter -> https://docs.python.org/2/library/collections.html
import numpy as np # numpy! ->  https://numpy.org/
import os # we use this to navigate directory and read file -> https://docs.python.org/3/library/os.html
import pickle # to make file binary -> https://docs.python.org/3/library/pickle.html
from gensim.models import Word2Vec # to create vector models of chinese text -> https://radimrehurek.com/gensim/
from gensim.models.keyedvectors import KeyedVectors # https://radimrehurek.com/gensim/
import logging # https://docs.python.org/3/library/logging.html
import keras # for deep learning
from keras.models import model_from_json

Using TensorFlow backend.


### Importing the data files
> This section imports all the necessary data files used in this project.

In [3]:
train_set = pd.read_csv('sentiment_analysis_trainingset.csv', header = 0, encoding="utf-8")
val_set = pd.read_csv('sentiment_analysis_validationset.csv', header = 0, encoding="utf-8")

test_set = pd.read_csv('sentiment_analysis_testa.csv', header = 0, encoding="utf-8")

In [4]:
train_set.head(5)

Unnamed: 0,id,content,location_traffic_convenience,location_distance_from_business_district,location_easy_to_find,service_wait_time,service_waiters_attitude,service_parking_convenience,service_serving_speed,price_level,...,environment_decoration,environment_noise,environment_space,environment_cleaness,dish_portion,dish_taste,dish_look,dish_recommendation,others_overall_experience,others_willing_to_consume_again
0,0,"""吼吼吼，萌死人的棒棒糖，中了大众点评的霸王餐，太可爱了。一直就好奇这个棒棒糖是怎么个东西，...",-2,-2,-2,-2,1,-2,-2,-2,...,-2,-2,-2,-2,-2,-2,1,-2,1,-2
1,1,"""第三次参加大众点评网霸王餐的活动。这家店给人整体感觉一般。首先环境只能算中等，其次霸王餐提...",-2,-2,-2,-2,-2,-2,-2,0,...,0,0,0,0,1,-2,-2,-2,1,-2
2,2,"""4人同行 点了10个小吃\n榴莲酥 榴莲味道不足 松软 奶味浓\n虾饺 好吃 两颗大虾仁\...",-2,-2,-2,-2,0,-2,1,0,...,-2,-2,1,-2,0,1,-2,-2,0,-2
3,3,"""之前评价了莫名其妙被删 果断继续差评！ 换了菜单 价格更低 开始砸牌子 但套餐还是有150...",-2,-2,-2,-2,-2,-2,-2,0,...,-2,-2,-2,-2,-2,-1,-2,-2,-1,-1
4,4,"""出乎意料地惊艳，椰子鸡清热降火，美容养颜，大大满足了爱吃火锅怕上火星人。椰子冻是帅帅的老板...",-2,-2,-2,-2,-2,-2,-2,-2,...,-2,-2,-2,-2,-2,1,1,-2,1,-2


> This is how the data looks. The column which is the text column in Chinese, is what we have to process.

### Preprocessing Data!

> Selecting just the text column from the data sets

In [15]:
val_shape = val_set.shape[0]
train = train_set.iloc[:, 1]
val = val_set.iloc[0:7500, 1]
test = test_set.iloc[7501:val_shape-1, 1]

In [13]:
val_shape = val_set.shape[0]
train = train_set.iloc[:, 1]
val = val_set.iloc[0:7500, 1]
test = test_set.iloc[7501:val_shape-1, 1]

### Now we need to segment the Chinese words, for this a package called jieba will be used
> This was inspired from this blog: https://breezegeography.wordpress.com/2018/01/25/how-to-segment-chinese-texts-putting-in-spaces-with-jieba/

In [17]:

# do not run again, it takes couple of hours to completely run this
train_words = []

for sentences in train:
    words = jieba.lcut(sentences, cut_all = True )
    train_words.append(" ".join(words))
    
val_words = []

for sentences in val:
    words = jieba.lcut(sentences, cut_all = True )
    val_words.append(" ".join(words))
    
test_words = []

for sentences in test:
    words = jieba.lcut(sentences, cut_all = True )
    test_words.append(" ".join(words))
    



### Here, we create a word2vec model from the Chinese embedding file, covering over 8 million Chinese words and phrases.
> 

In [18]:
embedding_model = KeyedVectors.load_word2vec_format("Tencent_AILab_ChineseEmbedding.txt", binary = False)
vocab = embedding_model.wv.vocab

# do not run again, it takes several hours to completely run this

print("loading done")

loading done


> Now, we convert the words to numbers. That is encoding.

In [19]:
w2i = {}

w2i["UNK"]=1
i=2
for j in vocab:
    w2i[k] = i
    i = i+1
    
    
print("words converted to indices")

words converted to indices


> Then doing the same for the train words, validation words & test words!

In [20]:
train_2i = {}

train_2i["UNK"]=1
i=2
for j in train_words:
    train_2i[k] = i
    i = i+1
    

val_2i = {}

val_2i["UNK"]=1
i=2
for j in val_words:
    val_2i[k] = i
    i = i+1
    
    
test_2i = {}

test_2i["UNK"]=1
i=2
for j in test_words:
    test_2i[k] = i
    i = i+1

In [1]:
# do not run this

train_2i = np.zeros((m, max_len))
max = 200
train_array = np.array(train_words)
n = train_array.shape[0]

for i in range(n):
    sentence = train_words[i]
    for j in range(len(sentence)):
        if j == max:
            break
        word = sentence[j]
        k=1
        if word in vocab:
            k = w2i[word]
        train_2i[i, j] = k
        
        
val_2i = np.zeros((m, max_len))
max = 200
val_array = np.array(val_words)
n = val_array.shape[0]

for i in range(n):
    sentence = train_words[i]
    for j in range(len(sentence)):
        if j == max:
            break
        word = sentence[j]
        k=1
        if word in vocab:
            k = w2i[word]
        val_2i[i, j] = k
        

test_2i = np.zeros((m, max_len))
max = 200
test_array = np.array(test_words)
n = test_array.shape[0]

for i in range(n):
    sentence = test_words[i]
    for j in range(len(sentence)):
        if j == max:
            break
        word = sentence[j]
        k=1 #unk index
        if word in vocab:
            k = w2i[word]
        test_2i[i, j] = k



print("indices created")


train_2i.dump("ti_w2v.dat")
val_2i.dump("vi_w2v.dat")
test_2i.dump("te_w2v.dat")


print("indices_saved")

indices created
indices_saved


In [21]:
train_set.columns # service_waiters_attitude, dish_taste, others_overall_experience

Index(['id', 'content', 'location_traffic_convenience',
       'location_distance_from_business_district', 'location_easy_to_find',
       'service_wait_time', 'service_waiters_attitude',
       'service_parking_convenience', 'service_serving_speed', 'price_level',
       'price_cost_effective', 'price_discount', 'environment_decoration',
       'environment_noise', 'environment_space', 'environment_cleaness',
       'dish_portion', 'dish_taste', 'dish_look', 'dish_recommendation',
       'others_overall_experience', 'others_willing_to_consume_again'],
      dtype='object')

In [5]:
def embedding_data(length, dim, w2, embedding_model, vocab):
    length = length
    dim = dim
    embedded_matrix = np.zeros((length, dim))
    embedded_matrix[1, :] = np.ones((1, dim))
    for word, index in word2index.items():
        if word in vocab:
            emb_matrix[index, :] = embedding_model[word]

    return embedded_matrix



### Classification Models
> This section trains and evaluates the classification models

##### Intial one time data prep

In [10]:
def data_load(file, header=0, encoding="utf-8"):

    data_df = pd.read_csv(file, header=header, encoding=encoding)
    return data_df

In [6]:
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import LabelBinarizer

def multiclass_roc_auc_score(y_test, y_pred, average="macro"):
    """
    to calculate multiclass roc score
    referenced from https://medium.com/@plog397/auc-roc-curve-scoring-function-for-multi-class-classification-9822871a6659
    """
    lb = LabelBinarizer()
    lb.fit(y_test)
    y_test = lb.transform(y_test)
    y_pred = lb.transform(y_pred)
    
    return roc_auc_score(y_test, y_pred, average=average)

In [11]:
train = data_load("sentiment_analysis_trainingset.csv")
val = data_load("sentiment_analysis_validationset.csv")
m_val = val.shape[0]


train_doc = train.iloc[:, 1]
val_doc = val.iloc[0:7500, 1]
test_doc = val.iloc[7501:m_val-1:, 1]


train_labels = np.array(train.iloc[:, 2:])
val_labels = np.array(val.iloc[0:7500, 2:])
test_labels = np.array(val.iloc[7501:m_val-1, 2:])

train_sentence_features = np.load("train_features.dat")
val_sentence_features = np.load("val_features.dat")
test_sentence_features = np.load("test_features.dat")

### Linear SVM

> This was implemented as a baseline model, inspired from https://github.com/AIChallenger/AI_Challenger_2018

In [22]:
import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.metrics import f1_score

#### This 1st model is for the Aspect: Service/Waiters' Attitude

In [29]:
label_number = 0

lin_clf = svm.LinearSVC()
lin_clf.fit(train_sentence_features, train_labels[:, label_number])

predict = lin_clf.predict(val_sentence_features)
print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = lin_clf.predict(test_sentence_features)
print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate:  0 0.6554666666666666
Validation set's F1 score: 0 0.7031047177508272
Test set's accuracy rate:  0 0.6524406508402241
test set's F1 score: 0 0.7006724358450414




0.6344760935292431

In [30]:
label_number = 3

lin_clf = svm.LinearSVC()
lin_clf.fit(train_sentence_features, train_labels[:, label_number])

predict = lin_clf.predict(val_sentence_features)
print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = lin_clf.predict(test_sentence_features)
print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate:  3 0.7781333333333333
Validation set's F1 score: 3 0.8035068869799499
Test set's accuracy rate:  3 0.7720725526807148
test set's F1 score: 3 0.7941762215425053




0.614206205821898

In [31]:
label_number = 8

lin_clf = svm.LinearSVC()
lin_clf.fit(train_sentence_features, train_labels[:, label_number])

predict = lin_clf.predict(val_sentence_features)
print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = lin_clf.predict(test_sentence_features)
print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate:  8 0.7592
Validation set's F1 score: 8 0.6573897597997237
Test set's accuracy rate:  8 0.7643371565750867
test set's F1 score: 8 0.6641372761762963




0.5018816817062272

### XGBoost

In [33]:
import numpy as np
import pandas as pd
from sklearn import svm
import xgboost as xgb
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import f1_score

In [34]:
label_number = 0

clf = xgb.XGBClassifier(learning_rate=0.1, max_depth=5)
clf.fit(train_sentence_features, train_labels[:, label_number])

predict = clf.predict(val_sentence_features)

print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)


predict = clf.predict(test_sentence_features)

print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate:  0 0.8426666666666667
Validation set's F1 score: 0 0.8383807535316171
Test set's accuracy rate:  0 0.8355561483062149
test set's F1 score: 0 0.8303859176222609


0.5949593356019821

In [35]:
label_number = 3

clf = xgb.XGBClassifier(learning_rate=0.1, max_depth=5)
clf.fit(train_sentence_features, train_labels[:, label_number])

predict = clf.predict(val_sentence_features)

print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)


predict = clf.predict(test_sentence_features)

print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate:  3 0.8874666666666666
Validation set's F1 score: 3 0.83961908165458
Test set's accuracy rate:  3 0.8826353694318485
test set's F1 score: 3 0.8328593095183175


0.5204445873468494

In [36]:
label_number = 8

clf = xgb.XGBClassifier(learning_rate=0.1, max_depth=5)
clf.fit(train_sentence_features, train_labels[:, label_number])

predict = clf.predict(val_sentence_features)

print("Accuracy rate val: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])

f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("F1 score val:", label_number, f1_val)


predict = clf.predict(test_sentence_features)
# print("Accuracy rate: ", sum(predict==test_labels[:, label_number])/test_labels.shape[0])

print("Accuracy rate test: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])

f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("F1 score test:", label_number, f1_test)


multiclass_roc_auc_score(test_labels[:, label_number], predict)

Accuracy rate val:  8 0.7642666666666666
F1 score val: 8 0.6960205979634512
Accuracy rate test:  8 0.767938116831155
F1 score test: 8 0.7194332312423637


0.5097780533494374

### Random Forest

In [37]:
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
label_number = 0
rf = RandomForestClassifier()

rf.fit(train_sentence_features, train_labels[:, label_number]) 

predict = rf.predict(val_sentence_features)

print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score", label_number, f1_val)

predict = rf.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)



Validation set's accuracy rate:  0 0.8108
Validation set's F1 score 0 0.7924395137986145
Test set's accuracy rate:  0 0.8027473993064818
test set's F1 score: 0 0.7752639470310887


0.5592288109409941

In [38]:
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
label_number = 3
rf = RandomForestClassifier()

rf.fit(train_sentence_features, train_labels[:, label_number]) 

predict = rf.predict(val_sentence_features)

print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score", label_number, f1_val)

predict = rf.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)



Validation set's accuracy rate:  3 0.8849333333333333
Validation set's F1 score 3 0.835504722753045
Test set's accuracy rate:  3 0.8794345158708989
test set's F1 score: 3 0.8264611821772441


0.5067589247554134

In [39]:
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
label_number = 8
rf = RandomForestClassifier()

rf.fit(train_sentence_features, train_labels[:, label_number]) 

predict = rf.predict(val_sentence_features)

print("Validation set's accuracy rate: ", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score", label_number, f1_val)

predict = rf.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)



Validation set's accuracy rate:  8 0.7548
Validation set's F1 score 8 0.6753200869488593
Test set's accuracy rate:  8 0.7520672179247799
test set's F1 score: 8 0.6719191872381992


0.5064696169513938

### kNN

In [40]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import f1_score
label_number = 0
knc = KNeighborsClassifier()

knc.fit(train_sentence_features, train_labels[:, label_number])

predict = knc.predict(val_sentence_features)

print("Validation set's accuracy rate:", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = knc.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate: 0 0.8124
Validation set's F1 score: 0 0.7908838500666245
Test set's accuracy rate:  0 0.801146972526007
test set's F1 score: 0 0.77510629863595


0.5698106930021813

In [41]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import f1_score
label_number = 3
knc = KNeighborsClassifier()

knc.fit(train_sentence_features, train_labels[:, label_number])

predict = knc.predict(val_sentence_features)

print("Validation set's accuracy rate:", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = knc.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate: 3 0.8816
Validation set's F1 score: 3 0.8359981860620342
Test set's accuracy rate:  3 0.8770338757001868
test set's F1 score: 3 0.8298066546549947


0.5182398398284616

In [42]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import f1_score
label_number = 8
knc = KNeighborsClassifier()

knc.fit(train_sentence_features, train_labels[:, label_number])

predict = knc.predict(val_sentence_features)

print("Validation set's accuracy rate:", label_number, sum(predict==val_labels[:, label_number])/val_labels.shape[0])


f1_val=f1_score(val_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("Validation set's F1 score:", label_number, f1_val)

predict = knc.predict(test_sentence_features)


print("Test set's accuracy rate: ", label_number, sum(predict==test_labels[:, label_number])/test_labels.shape[0])


f1_test=f1_score(test_labels[:, label_number], predict, average='weighted', labels=np.unique(predict))
print("test set's F1 score:", label_number, f1_test)

multiclass_roc_auc_score(test_labels[:, label_number], predict)

Validation set's accuracy rate: 8 0.7366666666666667
Validation set's F1 score: 8 0.672340926428617
Test set's accuracy rate:  8 0.7451320352093892
test set's F1 score: 8 0.6837516780693935


0.5167611069829161

#### ANN

In [20]:
# this is to create the one hot encoding for f1 score
# written using this guide: https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/

def onehot_coversion(label, no_class):
    list_mat = []
    for j in range(len(label)):
        label = labels[j]
        label = [0 for i in range(no_class)]
        types = {
        -2: 0,
        -1: 1,
        0: 2,
        1: 3
        }
        created_label[label_types[label]] = 1
        list_mat.append(created_label) 
    return list_mat


def converstion_to_classname(y_oh):
    """
    to get the class name aka the sentiment scores
    from one hot encoded integers
    """
    class_index2name = {
    0: -2,
    1: -1,
    2: 0,
    3: 1
    }

    y_list = y_oh.tolist()
    result = []
    for i in range(len(y_list)):
        single_one_hot = y_list[i]

        index = single_one_hot.index(1)

        class_name = class_index2name.get(index)
        result.append(class_name)
    return result

def converstion_toclassname_ypred(y_prob):
    class_index2name = {
    0: -2,
    1: -1,
    2: 0,
    3: 1
    }

    result = []
    y_prob = np.matrix(y_prob)
    for i in range(y_prob.shape[0]):
        single_prob = np.array(y_prob[i, :])
        index = np.argmax(single_prob)
        class_name = class_index2name.get(index)
        result.append(class_name)
    return result


In [43]:
i = 0
n = train_labels.shape[1]
y_train=np.array(onehot_coversion(train_labels[:, i], 4))
y_val=np.array(onehot_coversion(val_labels[:, i], 4))

In [44]:
import keras
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()

classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu', input_dim = 400))
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu'))
classifier.add(Dense(output_dim = 4, init = 'uniform', activation = 'softmax')) 
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])




  import sys
  
  if __name__ == '__main__':


In [45]:
train_indices = np.load("ti_w2v.dat")
y_train=np.array(convert_to_onehot(train_labels[:, i], 4))
y_val=np.array(convert_to_onehot(val_labels[:, i], 4))
val_indices = np.load("vi_w2v.dat")
test_indices = np.load("te_w2v.dat")
max_length=train_indices.shape[1]

In [46]:
classifier.fit(train_indices, y_train, validation_data=(val_indices, y_val), batch_size = 128, epochs=20, verbose=2)

Train on 105000 samples, validate on 7500 samples
Epoch 1/20
 - 1s - loss: 0.6649 - accuracy: 0.7720 - val_loss: 0.6241 - val_accuracy: 0.7856
Epoch 2/20
 - 1s - loss: 0.6401 - accuracy: 0.7751 - val_loss: 0.6107 - val_accuracy: 0.7856
Epoch 3/20
 - 1s - loss: 0.6294 - accuracy: 0.7750 - val_loss: 0.6002 - val_accuracy: 0.7861
Epoch 4/20
 - 1s - loss: 0.6200 - accuracy: 0.7747 - val_loss: 0.5919 - val_accuracy: 0.7859
Epoch 5/20
 - 1s - loss: 0.6144 - accuracy: 0.7743 - val_loss: 0.5907 - val_accuracy: 0.7852
Epoch 6/20
 - 1s - loss: 0.6100 - accuracy: 0.7742 - val_loss: 0.5892 - val_accuracy: 0.7843
Epoch 7/20
 - 1s - loss: 0.6073 - accuracy: 0.7740 - val_loss: 0.5883 - val_accuracy: 0.7840
Epoch 8/20
 - 1s - loss: 0.6051 - accuracy: 0.7736 - val_loss: 0.5928 - val_accuracy: 0.7813
Epoch 9/20
 - 1s - loss: 0.6034 - accuracy: 0.7739 - val_loss: 0.5911 - val_accuracy: 0.7829
Epoch 10/20
 - 1s - loss: 0.6015 - accuracy: 0.7740 - val_loss: 0.5902 - val_accuracy: 0.7795
Epoch 11/20
 - 1s -

<keras.callbacks.callbacks.History at 0x18d85e3ee80>

In [47]:
y_test=np.array(onehot_coversion(test_labels[:, i], 4))
scores = classifier.evaluate(test_indices, y_test, verbose=2)
y_test_pred = classifier.predict(test_indices, verbose=2)

In [48]:
classifier.evaluate(test_indices, y_test, verbose=2)

[0.6141322560550754, 0.7791411280632019]

In [23]:
def f1(y_true, y_pred):
    
    y_pred_label = converstion_toclassname_ypred(y_pred)
    y_true_label = converstion_to_classname(y_true)
    f1score = f1_score(y_true_label, y_pred_label, average=None)
    f1_ave = np.average(f1score)
    
    
    return f1_ave

In [24]:
f1_test = f1(y_test, y_test_pred)

  'precision', 'predicted', average, warn_for)


In [26]:
f1_test

0.68002434


In [49]:
multiclass_roc_auc_score(test_labels[:, i], predict)

0.5004422141276509

In [50]:
i = 3
n = train_labels.shape[1]
	#one_hot y
y_train=np.array(onehot_coversion(train_labels[:, i], 4))
y_val=np.array(onehot_coversion(val_labels[:, i], 4))

classifier = Sequential()

classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu', input_dim = 400))
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu'))
classifier.add(Dense(output_dim = 4, init = 'uniform', activation = 'softmax')) 
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])


y_train=np.array(convert_to_onehot(train_labels[:, i], 4))
y_val=np.array(convert_to_onehot(val_labels[:, i], 4))

max_length=train_indices.shape[1]

classifier.fit(train_indices, y_train, validation_data=(val_indices, y_val), batch_size = 128, epochs=20, verbose=2)

y_test=np.array(onehot_coversion(test_labels[:, i], 4))
scores = classifier.evaluate(test_indices, y_test, verbose=2)
y_test_pred = classifier.predict(test_indices, verbose=2)

classifier.evaluate(test_indices, y_test, verbose=2)

f1_test = f1(y_test, y_test_pred)

print("f1 test", fl_test)

print("ROC score", multiclass_roc_auc_score(test_labels[:, i], predict))

  if __name__ == '__main__':
  # Remove the CWD from sys.path while we load stuff.
  # This is added back by InteractiveShellApp.init_path()


Train on 105000 samples, validate on 7500 samples
Epoch 1/20
 - 1s - loss: 0.5066 - accuracy: 0.8809 - val_loss: 0.4888 - val_accuracy: 0.8852
Epoch 2/20
 - 1s - loss: 0.4935 - accuracy: 0.8835 - val_loss: 0.4894 - val_accuracy: 0.8852
Epoch 3/20
 - 1s - loss: 0.4906 - accuracy: 0.8835 - val_loss: 0.4832 - val_accuracy: 0.8852
Epoch 4/20
 - 1s - loss: 0.4882 - accuracy: 0.8835 - val_loss: 0.4815 - val_accuracy: 0.8852
Epoch 5/20
 - 1s - loss: 0.4856 - accuracy: 0.8835 - val_loss: 0.4823 - val_accuracy: 0.8852
Epoch 6/20
 - 1s - loss: 0.4837 - accuracy: 0.8835 - val_loss: 0.4804 - val_accuracy: 0.8852
Epoch 7/20
 - 1s - loss: 0.4820 - accuracy: 0.8835 - val_loss: 0.4798 - val_accuracy: 0.8852
Epoch 8/20
 - 1s - loss: 0.4803 - accuracy: 0.8835 - val_loss: 0.4808 - val_accuracy: 0.8852
Epoch 9/20
 - 1s - loss: 0.4788 - accuracy: 0.8835 - val_loss: 0.4831 - val_accuracy: 0.8851
Epoch 10/20
 - 1s - loss: 0.4776 - accuracy: 0.8835 - val_loss: 0.4836 - val_accuracy: 0.8852
Epoch 11/20
 - 1s -

  'precision', 'predicted', average, warn_for)


In [51]:
i = 8
n = train_labels.shape[1]
	#one_hot y
y_train=np.array(onehot_coversion(train_labels[:, i], 4))
y_val=np.array(onehot_coversion(val_labels[:, i], 4))

classifier = Sequential()

classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu', input_dim = 400))
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu'))
classifier.add(Dense(output_dim = 4, init = 'uniform', activation = 'softmax')) 
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])


y_train=np.array(convert_to_onehot(train_labels[:, i], 4))
y_val=np.array(convert_to_onehot(val_labels[:, i], 4))

max_length=train_indices.shape[1]

classifier.fit(train_indices, y_train, validation_data=(val_indices, y_val), batch_size = 128, epochs=20, verbose=2)

y_test=np.array(onehot_coversion(test_labels[:, i], 4))
scores = classifier.evaluate(test_indices, y_test, verbose=2)
y_test_pred = classifier.predict(test_indices, verbose=2)

classifier.evaluate(test_indices, y_test, verbose=2)

f1_test = f1(y_test, y_test_pred)

print("f1 score", f1_test)

print("ROC score", multiclass_roc_auc_score(test_labels[:, i], predict))

  if __name__ == '__main__':
  # Remove the CWD from sys.path while we load stuff.
  # This is added back by InteractiveShellApp.init_path()


Train on 105000 samples, validate on 7500 samples
Epoch 1/20
 - 1s - loss: 0.7533 - accuracy: 0.7571 - val_loss: 0.7427 - val_accuracy: 0.7599
Epoch 2/20
 - 1s - loss: 0.7291 - accuracy: 0.7642 - val_loss: 0.7292 - val_accuracy: 0.7599
Epoch 3/20
 - 1s - loss: 0.7248 - accuracy: 0.7642 - val_loss: 0.7244 - val_accuracy: 0.7599
Epoch 4/20
 - 1s - loss: 0.7213 - accuracy: 0.7642 - val_loss: 0.7246 - val_accuracy: 0.7599
Epoch 5/20
 - 1s - loss: 0.7173 - accuracy: 0.7642 - val_loss: 0.7258 - val_accuracy: 0.7599
Epoch 6/20
 - 1s - loss: 0.7152 - accuracy: 0.7642 - val_loss: 0.7262 - val_accuracy: 0.7599
Epoch 7/20
 - 1s - loss: 0.7132 - accuracy: 0.7642 - val_loss: 0.7262 - val_accuracy: 0.7599
Epoch 8/20
 - 1s - loss: 0.7115 - accuracy: 0.7642 - val_loss: 0.7245 - val_accuracy: 0.7599
Epoch 9/20
 - 1s - loss: 0.7103 - accuracy: 0.7642 - val_loss: 0.7293 - val_accuracy: 0.7599
Epoch 10/20
 - 1s - loss: 0.7099 - accuracy: 0.7642 - val_loss: 0.7312 - val_accuracy: 0.7599
Epoch 11/20
 - 1s -

  'precision', 'predicted', average, warn_for)


In [52]:
# this is to create the data files with train/test/val features as numbers from word matrix
# inspired from https://towardsdatascience.com/mapping-word-embeddings-with-word2vec-99a799dc9695

# train_features = np.zeros((105000, 400))
# val_features = np.zeros((7500, 400))
# test_features = np.zeros((7498, 400))

# for i in range(105000):
#     for j in range(400):
#         train_features[i, :] += embedding_matrix[int(train_2i[i, j]), :]


# for i in range(7500):
#     for j in range(400):
#         val_features[i, :] += embedding_matrix[int(val_2i[i, j]), :]


# for i in range(7498):
#     for j in range(400):
#         test_features[i, :] += embedding_matrix[int(test_2i[i, j]), :]

### Additional References
The metrics were calculated with the help of the guide on this website:

https://machinelearningmastery.com/how-to-calculate-precision-recall-f1-and-more-for-deep-learning-models/

https://pythonprogramming.net/words-as-features-nltk-tutorial/