In [1]:
from MyNLPToolBox import FilePickling as FP
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score, precision_score
import numpy as np

In this experiment we're going to fine-tune the best model from experiment 3: Logistic Regression with the logarithmic TF-IDF embedder, trained on the corpus without removing special characters or stopwords.

# Load the pre-embedded dataset

In [3]:
x_train = FP.load_obj('x_train_log')
x_test = FP.load_obj('x_test_log')
y_train = FP.load_obj('y_train')
y_test = FP.load_obj('y_test')

loaded obj/x_train_log.pkl
loaded obj/x_test_log.pkl
loaded obj/y_train.pkl
loaded obj/y_test.pkl


# Re-Perform Logistic Regression

In [4]:
model_lr = LogisticRegression().fit(x_train,y_train)
y_test_pred = model_lr.predict(x_test)
print('====\nLOGISTIC REGRESSION')
print('Accuracy: ', accuracy_score(y_test,y_test_pred))
print('Precision Score: ', precision_score(y_test,y_test_pred))
print('Recall Score: ', recall_score(y_test,y_test_pred))
print('F1 Score: ', f1_score(y_test,y_test_pred))



====
LOGISTIC REGRESSION
Accuracy:  0.8366373528096587
Precision Score:  0.8284383954154728
Recall Score:  0.789419795221843
F1 Score:  0.8084585809157638


Alright! The accuracy (0.837) and recall (0.789) are already fairly high. Now we're going to fine-tune this model to make it even better!

# Hyperparameter Tuning
sklearn's `predict` uses a decision threshold of 0.5 by default, so now we're going to try different thresholds to see whether performance on the test set improves.
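
To make the role of the threshold concrete, here is a minimal sketch on synthetic data (the `make_classification` dataset and the helper name `predict_with_threshold` are assumptions for illustration, not part of this notebook's pipeline). It shows that thresholding the positive-class probability at 0.5 should match what `predict` returns for a binary model, and that lowering the threshold can only add positive predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption: not the notebook's x_train / y_train)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Column 1 of predict_proba holds the probability of the positive class
pos_prob = clf.predict_proba(X)[:, 1]

def predict_with_threshold(pos_prob, thresh):
    """Label a sample 1 when its positive-class probability exceeds thresh."""
    return (pos_prob > thresh).astype(int)

# At 0.5 this should agree with clf.predict for 0/1 labels
default_pred = predict_with_threshold(pos_prob, 0.5)
agreement = np.mean(default_pred == clf.predict(X))

# A lower threshold flags at least as many samples as positive
lenient_pred = predict_with_threshold(pos_prob, 0.3)
print(agreement, default_pred.sum(), lenient_pred.sum())
```

Working on the probability column as a NumPy array also avoids calling `predict_proba` again for every threshold.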

In [5]:
thresholds = np.arange(0.1,0.9,0.1) # Thresholds from 0.1 up to (but not including) 0.9 in steps of 0.1
for thresh in thresholds:
    print('THRESHOLD ',thresh)
    y_test_pred = [1  if prob[1] > thresh else 0 for prob in model_lr.predict_proba(x_test)]
    print('Accuracy: ', accuracy_score(y_test,y_test_pred))
    print('Precision Score: ', precision_score(y_test,y_test_pred))
    print('Recall Score: ', recall_score(y_test,y_test_pred))
    print('F1 Score: ', f1_score(y_test,y_test_pred))
    print('---')

THRESHOLD  0.1
Accuracy:  0.820688627217171
Precision Score:  0.7446868801360159
Recall Score:  0.8969283276450511
F1 Score:  0.8137482582443103
---
THRESHOLD  0.2
Accuracy:  0.8303771053808318
Precision Score:  0.776885043263288
Recall Score:  0.8580204778156997
F1 Score:  0.8154395069737268
---
THRESHOLD  0.30000000000000004
Accuracy:  0.8339543896258756
Precision Score:  0.7959582790091264
Recall Score:  0.8334470989761092
F1 Score:  0.8142714238079359
---
THRESHOLD  0.4
Accuracy:  0.8373826203607095
Precision Score:  0.8137154554759468
Recall Score:  0.8139931740614335
F1 Score:  0.8138542910766081
---
THRESHOLD  0.5
Accuracy:  0.8366373528096587
Precision Score:  0.8284383954154728
Recall Score:  0.789419795221843
F1 Score:  0.8084585809157638
---
THRESHOLD  0.6
Accuracy:  0.8357430317483977
Precision Score:  0.8441265060240963
Recall Score:  0.7651877133105802
F1 Score:  0.8027210884353742
---
THRESHOLD  0.7000000000000001
Accuracy:  0.8345506036667164
Precision Score:  0.8602533

As we can see, accuracy peaks around a threshold of 0.4, so the best threshold is somewhere between 0.3 and 0.5. We're going to investigate this range more closely :D
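
A sweep like this can also be wrapped in a small helper that collects the scores and picks the best threshold programmatically instead of reading it off the printed output. The sketch below is only illustrative: `sweep_thresholds` is a hypothetical helper, the labels and probabilities are toy values rather than this notebook's data, and `np.linspace` with rounding is used to avoid floating-point drift such as `0.30000000000000004` in the grid.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def sweep_thresholds(y_true, pos_prob, thresholds):
    """Return (threshold, accuracy, f1) tuples, one per candidate threshold."""
    rows = []
    for t in thresholds:
        y_pred = (pos_prob > t).astype(int)
        rows.append((float(t),
                     accuracy_score(y_true, y_pred),
                     f1_score(y_true, y_pred)))
    return rows

# Toy labels and positive-class probabilities (assumption: illustrative only)
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
pos_prob = np.array([0.10, 0.35, 0.42, 0.80, 0.38, 0.45, 0.90, 0.20])

# 21 evenly spaced thresholds: 0.30, 0.31, ..., 0.50
grid = np.round(np.linspace(0.30, 0.50, 21), 2)
results = sweep_thresholds(y_true, pos_prob, grid)

# Pick the best row by each metric (ties resolve to the lowest threshold)
best_by_acc = max(results, key=lambda r: r[1])
best_by_f1 = max(results, key=lambda r: r[2])
print('best by accuracy:', best_by_acc)
print('best by F1:', best_by_f1)
```

Accuracy and F1 do not have to agree on the best threshold, so it helps to decide up front which metric drives the choice.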

In [6]:
thresholds = np.arange(0.3,0.5,0.01) # Thresholds from 0.3 up to (but not including) 0.5 in steps of 0.01
for thresh in thresholds:
    print('THRESHOLD ',thresh)
    y_test_pred = [1  if prob[1] > thresh else 0 for prob in model_lr.predict_proba(x_test)]
    print('Accuracy: ', accuracy_score(y_test,y_test_pred))
    print('Precision Score: ', precision_score(y_test,y_test_pred))
    print('Recall Score: ', recall_score(y_test,y_test_pred))
    print('F1 Score: ', f1_score(y_test,y_test_pred))
    print('---')

THRESHOLD  0.3
Accuracy:  0.8339543896258756
Precision Score:  0.7959582790091264
Recall Score:  0.8334470989761092
F1 Score:  0.8142714238079359
---
THRESHOLD  0.31
Accuracy:  0.8344015501565062
Precision Score:  0.7977086743044189
Recall Score:  0.831740614334471
F1 Score:  0.8143692564745196
---
THRESHOLD  0.32
Accuracy:  0.835146817707557
Precision Score:  0.8003952569169961
Recall Score:  0.8293515358361775
F1 Score:  0.8146161582299699
---
THRESHOLD  0.33
Accuracy:  0.8358920852586078
Precision Score:  0.8021143045920053
Recall Score:  0.8286689419795222
F1 Score:  0.8151754238710761
---
THRESHOLD  0.34
Accuracy:  0.8358920852586078
Precision Score:  0.8033167495854063
Recall Score:  0.8266211604095564
F1 Score:  0.814802354920101
---
THRESHOLD  0.35000000000000003
Accuracy:  0.8357430317483977
Precision Score:  0.8040585495675316
Recall Score:  0.8249146757679181
F1 Score:  0.8143530997304583
---
THRESHOLD  0.36000000000000004
Accuracy:  0.836041138768818
Precision Score:  0.805

So in the end our best threshold is 0.44, which gave us an accuracy of **0.8375** and an F1 score of **0.8121** :D
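
As a possible alternative to scanning a fixed grid, sklearn's `precision_recall_curve` returns precision and recall at every distinct predicted probability, from which an F1-maximizing threshold can be read off. This is a minimal sketch on synthetic data (the dataset and variable names are assumptions, not the notebook's objects). Note that it selects by F1 and treats `score >= threshold` as positive, whereas the cells above use a strict `>`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the pre-embedded dataset (assumption)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pos_prob = clf.predict_proba(X_te)[:, 1]

# precision and recall each have one more entry than thresholds
precision, recall, thresholds = precision_recall_curve(y_te, pos_prob)

# Drop the final (precision=1, recall=0) point, which has no threshold
p, r = precision[:-1], recall[:-1]
f1 = 2 * p * r / np.maximum(p + r, 1e-12)  # guard against division by zero

best_idx = int(np.argmax(f1))
best_threshold = thresholds[best_idx]
print('best threshold by F1:', best_threshold, 'F1:', f1[best_idx])
```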