### Baseline of the classification model using Universal Sentence Encoder (USE)

In [18]:
import os
import pickle
from sklearn.svm import SVC
from sklearn.metrics import *
from mlens.ensemble import SuperLearner
from sklearn.metrics import accuracy_score
from utility.utils import json_2_dataframe
from utility.utils import train_test_spliter
from utility.feature_utility import featurized_data
from utility.feature_utility import use_vectorizer
from classification.train import train_model, save_model

In [2]:
from warnings import simplefilter
simplefilter(action='ignore', category=FutureWarning)

In [3]:
from classification.eval import get_confusion_matrix
from classification.eval import get_classfication_report

### Read the data and split it into train and test sets

In [4]:
dataset = json_2_dataframe('../data/ChatbotCorpus.json')
splited_data = train_test_spliter(dataset)

### Preparing text data for classification

In [5]:
X_train, X_test, y_train, y_test = featurized_data(splited_data, 'use')

W0830 10:47:33.727248 139880529274688 deprecation_wrapper.py:119] From /home/genesis/projects/misc/vp/verloop/notebook/utility/feature_utility.py:20: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0830 10:47:33.773461 139880529274688 deprecation_wrapper.py:119] From /home/genesis/projects/misc/vp/verloop/notebook/utility/feature_utility.py:21: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

W0830 10:47:33.777262 139880529274688 deprecation_wrapper.py:119] From /home/genesis/projects/misc/vp/verloop/notebook/utility/feature_utility.py:21: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.



In [8]:
def predict_sample(model, sample, feature='use'):
    """Predict intent labels for one sentence or a list of sentences."""
    # `feature` is kept for interface compatibility; only USE vectors are used here.
    if isinstance(sample, str):
        sample = [sample]
    sample_vector = use_vectorizer(sample)
    return model.predict(sample_vector)


def get_false_positive(model, data, true_label):
    """Print sentences labelled FindConnection that the model predicts as DepartureTime."""
    model_pred = predict_sample(model, data)
    print("false positive sample")
    for text, actual, predict in zip(data, true_label, model_pred):
        if actual == 'FindConnection' and predict == 'DepartureTime':
            print(f"sentence : {text}\nActual Label : {actual}\tPredict Label : {predict}\n\n")


def get_false_negative(model, data, true_label):
    """Print sentences labelled DepartureTime that the model predicts as FindConnection."""
    model_pred = predict_sample(model, data)
    print("false negative sample")
    for text, actual, predict in zip(data, true_label, model_pred):
        if actual == 'DepartureTime' and predict == 'FindConnection':
            print(f"sentence : {text}\nActual Label : {actual}\tPredict Label : {predict}\n\n")
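
The two helpers above differ only in which label pair they filter on. As a rough sketch, the same filtering could be written once and return the matches instead of printing them. The function name `misclassified` and the toy predictions below are hypothetical and only illustrate the idea; they are not output from any model in this notebook.

```python
def misclassified(texts, y_true, y_pred, actual, predicted):
    """Return (text, actual, predicted) triples for one kind of confusion."""
    return [
        (text, true, pred)
        for text, true, pred in zip(texts, y_true, y_pred)
        if true == actual and pred == predicted
    ]


# Toy inputs for illustration only; the predictions are made up.
texts = ["take me to the airport", "or depart from garching", "when is the next bus"]
y_true = ["FindConnection", "DepartureTime", "DepartureTime"]
y_pred = ["DepartureTime", "FindConnection", "DepartureTime"]

# Same convention as the helpers above: DepartureTime is the "positive" class.
false_positives = misclassified(texts, y_true, y_pred, "FindConnection", "DepartureTime")
false_negatives = misclassified(texts, y_true, y_pred, "DepartureTime", "FindConnection")

for text, true, pred in false_positives + false_negatives:
    print(f"sentence : {text}\nActual Label : {true}\tPredict Label : {pred}\n")
```

Returning a list rather than printing makes it possible to count or inspect the errors later, which may be handy when comparing several models.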
            

### Logistic Regression

In [9]:
clf_model = train_model('logistic',X_train, y_train)
get_confusion_matrix(clf_model,X_test, y_test)
get_classfication_report(clf_model, X_test, y_test)
save_model(model = clf_model,filepath = 'model/classification/logistic_use.sav')

Confusion Matrix :

[[31  4]
 [ 3 68]]

 Classification Report :

                precision    recall  f1-score   support

 DepartureTime       0.91      0.89      0.90        35
FindConnection       0.94      0.96      0.95        71

     micro avg       0.93      0.93      0.93       106
     macro avg       0.93      0.92      0.92       106
  weighted avg       0.93      0.93      0.93       106
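
To make the link between the confusion matrix and the report explicit, the per-class numbers can be recomputed by hand. This is a minimal sketch that assumes rows are actual labels and columns are predicted labels, in the order DepartureTime, FindConnection. The row sums (35 and 71) match the support column, which is consistent with that assumption. The helper name `per_class_metrics` is made up for this example.

```python
# Confusion matrix from the logistic regression run above.
# Assumed layout: rows = actual, columns = predicted,
# label order = [DepartureTime, FindConnection].
labels = ["DepartureTime", "FindConnection"]
cm = [[31, 4],
      [3, 68]]


def per_class_metrics(cm, i):
    """Precision, recall and F1 for the class at index i."""
    tp = cm[i][i]
    predicted_as_i = sum(row[i] for row in cm)  # column sum
    actually_i = sum(cm[i])                     # row sum (support)
    precision = tp / predicted_as_i
    recall = tp / actually_i
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


metrics = {label: per_class_metrics(cm, i) for i, label in enumerate(labels)}
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))

for label, (p, r, f1) in metrics.items():
    print(f"{label:>14}  precision={p:.2f}  recall={r:.2f}  f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

For DepartureTime this gives precision 31/34 and recall 31/35, which round to the 0.91 and 0.89 shown in the report.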





#### Observation:
    - The model generalizes well.
    - Compared to logistic regression using TF-IDF, USE gives better results.

### Decision Tree

In [11]:
clf_model = train_model('decision_tree',X_train, y_train)
get_confusion_matrix(clf_model,X_test, y_test)
get_classfication_report(clf_model, X_test, y_test)
save_model(model = clf_model,filepath = 'model/classification/decision_tree_use.sav')

Confusion Matrix :

[[27  8]
 [ 5 66]]

 Classification Report :

                precision    recall  f1-score   support

 DepartureTime       0.84      0.77      0.81        35
FindConnection       0.89      0.93      0.91        71

     micro avg       0.88      0.88      0.88       106
     macro avg       0.87      0.85      0.86       106
  weighted avg       0.88      0.88      0.88       106





#### Observation:
    - The decision tree classifier is not as good as logistic regression.
    - It actually increased both false positives and false negatives (off-diagonal counts of 8 and 5, versus 4 and 3 for logistic regression).



### KNN

In [12]:
clf_model = train_model('knn',X_train, y_train)
get_confusion_matrix(clf_model,X_test, y_test)
get_classfication_report(clf_model, X_test, y_test)
save_model(model = clf_model,filepath = 'model/classification/knn_use.sav')

Confusion Matrix :

[[32  3]
 [ 3 68]]

 Classification Report :

                precision    recall  f1-score   support

 DepartureTime       0.91      0.91      0.91        35
FindConnection       0.96      0.96      0.96        71

     micro avg       0.94      0.94      0.94       106
     macro avg       0.94      0.94      0.94       106
  weighted avg       0.94      0.94      0.94       106



#### Observation:
    - The KNN model performs slightly better than logistic regression and decision tree.

### Random Forest

In [15]:
clf_model = train_model('random_forest',X_train, y_train)
get_confusion_matrix(clf_model,X_test, y_test)
get_classfication_report(clf_model, X_test, y_test)
save_model(model = clf_model,filepath = 'model/classification/random_forest_use.sav')

Confusion Matrix :

[[32  3]
 [ 3 68]]

 Classification Report :

                precision    recall  f1-score   support

 DepartureTime       0.91      0.91      0.91        35
FindConnection       0.96      0.96      0.96        71

     micro avg       0.94      0.94      0.94       106
     macro avg       0.94      0.94      0.94       106
  weighted avg       0.94      0.94      0.94       106





#### Observation:
    - It does not work well with a small amount of data.
    - The error rate is higher when predicting the DepartureTime category.

### SVC

In [21]:
clf_model = train_model('svm',X_train, y_train)
get_confusion_matrix(clf_model,X_test, y_test)
get_classfication_report(clf_model, X_test, y_test)
save_model(model = clf_model,filepath = 'model/classification/svc_use.sav')

Confusion Matrix :

[[31  4]
 [ 4 67]]

 Classification Report :

                precision    recall  f1-score   support

 DepartureTime       0.89      0.89      0.89        35
FindConnection       0.94      0.94      0.94        71

     micro avg       0.92      0.92      0.92       106
     macro avg       0.91      0.91      0.91       106
  weighted avg       0.92      0.92      0.92       106
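
The five USE-based runs above can be put side by side by counting errors directly from their confusion matrices. The snippet below is only a convenience sketch: the matrices are copied from the outputs above, the layout (rows = actual, columns = predicted) is assumed, and `summarize` is a hypothetical helper name.

```python
# Confusion matrices copied from the runs above
# (assumed layout: rows = actual, columns = predicted;
#  label order: DepartureTime, FindConnection).
results = {
    "logistic":      [[31, 4], [3, 68]],
    "decision_tree": [[27, 8], [5, 66]],
    "knn":           [[32, 3], [3, 68]],
    "random_forest": [[32, 3], [3, 68]],
    "svm":           [[31, 4], [4, 67]],
}


def summarize(cm):
    """Count misclassified test samples and compute accuracy."""
    total = sum(map(sum, cm))
    correct = cm[0][0] + cm[1][1]
    return {"errors": total - correct, "accuracy": correct / total}


summary = {name: summarize(cm) for name, cm in results.items()}

# Print models from fewest to most errors.
for name, s in sorted(summary.items(), key=lambda kv: kv[1]["errors"]):
    print(f"{name:<14} errors={s['errors']:>2}  accuracy={s['accuracy']:.3f}")
```

With only 106 test samples, a difference of one or two errors between models is small, so this ranking should be read with caution.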





In [19]:
get_false_positive(clf_model, splited_data.test.text.values, y_test)

false positive sample
sentence : when is the next train from winterstraße 12 to kieferngarten
Actual Label : FindConnection	Predict Label : DepartureTime


sentence : when is the next rocket from winterstraße 12 to kieferngarte
Actual Label : FindConnection	Predict Label : DepartureTime


sentence : when is the train from garching to marienplatz
Actual Label : FindConnection	Predict Label : DepartureTime


sentence : take me to the airport
Actual Label : FindConnection	Predict Label : DepartureTime




In [20]:
get_false_negative(clf_model, splited_data.test.text.values, y_test)

false negative sample
sentence : what is the next train from münchner freiheit
Actual Label : DepartureTime	Predict Label : FindConnection


sentence : or depart from garching
Actual Label : DepartureTime	Predict Label : FindConnection


sentence : depart in garching, i assume
Actual Label : DepartureTime	Predict Label : FindConnection


sentence : next train from muenchen freicheit
Actual Label : DepartureTime	Predict Label : FindConnection




In [27]:
predict_sample(clf_model,['when is it going','is it possible by truck'])

array(['DepartureTime', 'FindConnection'], dtype=object)

#### Observation:
    - The SVC using TF-IDF performs much better than the current model.
    - To improve the current model, we need more data.

#### Reason:
##### Why TF-IDF works better than USE
    - One possible reason is that samples in both categories look very similar in meaning, so their sentence embeddings may lie close together.

#### Pros and cons of TF-IDF
    - With small data it works reasonably well.
    - When new data arrives containing words that are not in the current TF-IDF vocabulary,
      it will fail to detect the proper category (TF-IDF does not capture out-of-vocabulary words).
    - Because the data set is very small, the TF-IDF model becomes extremely sensitive to individual words.
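
The out-of-vocabulary limitation can be illustrated with a tiny, pure-Python TF-IDF. This is a simplified stand-in written for this note, not the vectorizer used in the project; `fit_idf` and `transform` are hypothetical names, and the three sentences are taken from the misclassified samples shown earlier.

```python
import math
from collections import Counter


def fit_idf(corpus):
    """Learn a vocabulary with smoothed IDF weights from training sentences."""
    n_docs = len(corpus)
    doc_freq = Counter(word for doc in corpus for word in set(doc.split()))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1 for w, df in doc_freq.items()}


def transform(text, idf):
    """Map text to {word: tf * idf}; words outside the vocabulary are dropped."""
    counts = Counter(text.split())
    return {w: c * idf[w] for w, c in counts.items() if w in idf}


# Toy training set: the vocabulary is whatever appears in these sentences.
train = [
    "next train from muenchen freicheit",
    "or depart from garching",
]
idf = fit_idf(train)

seen = transform("next train from garching", idf)   # every word is in the vocabulary
unseen = transform("take me to the airport", idf)   # no word is in the vocabulary

print("seen   :", seen)
print("unseen :", unseen)
```

The second sentence maps to an empty (all-zero) vector, so a classifier trained on such features has nothing to base its decision on. This is the failure mode the second bullet above describes.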
