### Training and Testing (Revised)

#### Table of Contents:
- [1. SVM Modeling](#1-svm-modeling)
- [2. Performance Measurement (Training)](#2-performance-measurement-training)
- [3. Testing](#3-testing)

***

In [1]:
from IPython.core.interactiveshell import InteractiveShell
import warnings
import sklearnex

InteractiveShell.ast_node_interactivity = "all"
sklearnex.patch_sklearn()
warnings.filterwarnings("ignore")

%load_ext watermark
%watermark -a "F. Waskito" -n -t -u -v

Author: F. Waskito

Last updated: Tue Jan 23 2024 12:02:55

Python implementation: CPython
Python version       : 3.9.18
IPython version      : 8.15.0



Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)


In [2]:
import pandas
from collection import analysis

path = "data/tweet/depresi_or_bipolar_tweets_id_01-10[clean].csv"
tweets_table = pandas.read_csv(path)
texts = tweets_table.loc[:, "Text"].copy().to_list()
labels = tweets_table.loc[:, "Sentiment"].copy().to_list()

analysis.get_shape(texts)
analysis.get_distribution(labels)

Shape: (3409,)
Distribution:
	('positive', 864)
	('neutral', 1253)
	('negative', 1292)


The dataset used here has already gone through pre-numeric processing, which cuts the overall preprocessing time.

In [3]:
tweets_table.head(5)
tweets_table.tail(5)

| Unnamed: 0 | Tweet_ID | Datetime | Username | Text | Sentiment |
|---|---|---|---|---|---|
| 0 | 1.520554e+18 | 2022-05-01 00:00:12+00:00 | yfnasa | ilux kelompok main taman anak arti begitu pas ... | positive |
| 1 | 1.520561e+18 | 2022-05-01 00:30:46+00:00 | SoleilLumina | et ikut akun kutip depresi suka gelisah puisi ... | neutral |
| 2 | 1.520563e+18 | 2022-05-01 00:36:48+00:00 | petitegeeky | rossy tahun laut betah pas pulang depresi kaya... | neutral |
| 3 | 1.520565e+18 | 2022-05-01 00:43:40+00:00 | raniapj | jarang nimbrung stres depresi sulit gaul tamba... | neutral |
| 4 | 1.520565e+18 | 2022-05-01 00:43:50+00:00 | Jawaban | depresi orang kristen laku atas link stressawa... | neutral |

| Unnamed: 0 | Tweet_ID | Datetime | Username | Text | Sentiment |
|---|---|---|---|---|---|
| 3404 | 1.524168e+18 | 2022-05-10 23:21:56+00:00 | puci_chuu | depresi hasil apa laku | neutral |
| 3405 | 1.52417e+18 | 2022-05-10 23:30:51+00:00 | _ilhammmmmm | pulang kota makassar milik tekan depresi ibu s... | negative |
| 3406 | 1.52417e+18 | 2022-05-10 23:31:44+00:00 | detikhot | medina zein rawat rumah sakit ganggu bipolar i... | neutral |
| 3407 | 1.524175e+18 | 2022-05-10 23:48:41+00:00 | mournchild | mahasiswa psikolog depresi baca catat materi | neutral |
| 3408 | 1.524177e+18 | 2022-05-10 23:57:02+00:00 | ramaperput | utang emang bikin tagih kes mudah uang dapat a... | negative |


Pre-Numeric Processing

_skip_

Feature Extraction

In [3]:
from preprocess.feature.extraction import TextVectorizer

In [5]:
# BOW
extractor = TextVectorizer(texts)
extractor.transform(target="bow", min_df=2, norm=True)
vector_texts = extractor.vectors

analysis.get_shape(vector_texts)

Shape: (3409, 2202)


In [4]:
# TF-IDF
extractor = TextVectorizer(texts)
extractor.transform(target="tfidf", min_df=2, norm=True, smooth_idf=True)
vector_texts = extractor.vectors

analysis.get_shape(vector_texts)

Shape: (3409, 2202)
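
`TextVectorizer` is this project's own class and its internals are not shown here. As a rough illustration of what `min_df=2`, `norm=True`, and `smooth_idf=True` would mean if the class follows scikit-learn's conventions (an assumption), the sketch below builds bag-of-words counts and smoothed, L2-normalized TF-IDF weights by hand. The helper names and the three toy documents are made up for the example.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for every term, the number of documents that contain it."""
    return Counter(term for doc in docs for term in set(doc.split()))


def build_vocabulary(df, min_df=2):
    """Keep terms found in at least `min_df` documents, in alphabetical order."""
    return sorted(term for term, n in df.items() if n >= min_df)


def bow_vector(doc, vocab):
    """Raw term counts of one document over the vocabulary."""
    counts = Counter(doc.split())
    return [counts[term] for term in vocab]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


def tfidf_vector(doc, vocab, df, n_docs):
    """Smoothed TF-IDF: tf * (ln((1 + n) / (1 + df)) + 1), then L2-normalized."""
    tf = bow_vector(doc, vocab)
    weights = [
        count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
        for count, term in zip(tf, vocab)
    ]
    return l2_normalize(weights)


# Toy documents (hypothetical), already cleaned and space-tokenized.
docs = ["depresi suka gelisah", "stres depresi sulit", "suka puisi"]
df = document_frequency(docs)
vocab = build_vocabulary(df, min_df=2)
bow = [bow_vector(d, vocab) for d in docs]
tfidf = [tfidf_vector(d, vocab, df, len(docs)) for d in docs]

print(vocab)  # terms that survive min_df=2
print(bow)
print(tfidf)
```

Both representations are built over the same vocabulary, which is consistent with the identical `(3409, 2202)` shapes reported for BoW and TF-IDF.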


Label Transformation

In [5]:
from preprocess.encoding import LabelEncoder

In [6]:
encoder = LabelEncoder(labels)
encoder.transform(target="integer")
encoded_labels = encoder.encoded_labels

analysis.get_distribution(encoded_labels)

Distribution:
	(2, 864)
	(1, 1253)
	(0, 1292)
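
The counts above line up with an alphabetical mapping (`negative` to 0, `neutral` to 1, `positive` to 2). The project's `LabelEncoder` is not shown, so the following is only a minimal sketch of integer encoding under that assumption; `encode_labels` is a hypothetical helper.

```python
def encode_labels(labels):
    """Map each distinct label to an integer, in alphabetical order of the labels."""
    classes = sorted(set(labels))
    mapping = {name: index for index, name in enumerate(classes)}
    return [mapping[name] for name in labels], mapping


sample = ["positive", "neutral", "negative", "neutral"]
encoded, mapping = encode_labels(sample)

print(mapping)
print(encoded)
```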


Data Split

In [7]:
from sklearn.model_selection import train_test_split

In [8]:
X_train, X_test, y_train, y_test = train_test_split(
    vector_texts,
    encoded_labels,
    test_size = 0.3,
    random_state = 42
)

print("> Train set:")
analysis.get_shape(X_train)
analysis.get_distribution(y_train)
print("\n> Test set:")
analysis.get_shape(X_test)
analysis.get_distribution(y_test)

> Train set:
Shape: (2386, 2202)
Distribution:
	(1, 866)
	(2, 619)
	(0, 901)

> Test set:
Shape: (1023, 2202)
Distribution:
	(0, 391)
	(1, 387)
	(2, 245)
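
A quick sanity check on the split, using only the counts printed above: the two sets add up to the full 3409 rows, and the class shares differ slightly between them because the split is random rather than stratified. The helper name is hypothetical. If matching proportions were wanted, `train_test_split` accepts a `stratify` argument.

```python
def class_shares(distribution):
    """Convert {label: count} into {label: fraction of the set}."""
    total = sum(distribution.values())
    return {label: count / total for label, count in distribution.items()}


# Counts copied from the output above.
train_counts = {0: 901, 1: 866, 2: 619}
test_counts = {0: 391, 1: 387, 2: 245}

train_total = sum(train_counts.values())
test_total = sum(test_counts.values())
test_fraction = test_total / (train_total + test_total)

print(train_total, test_total, round(test_fraction, 3))
train_shares = class_shares(train_counts)
test_shares = class_shares(test_counts)
for label in sorted(train_counts):
    print(label, round(train_shares[label], 3), round(test_shares[label], 3))
```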


***

### __1. SVM Modeling__

Jump to:
- [Table of Contents](#table-of-contents)
- [2. Performance Measurement (Training)](#2-performance-measurement-training)

In [9]:
from sklearn.svm import SVC

#### 1.1 Linear Kernel

In [10]:
linear_svm = SVC(kernel="linear")

In [11]:
linear_params = {
    "C": [
        1.0, 10.0, 100.0, 1000.0, 10000.0,
    ],
}

#### 1.2 RBF Kernel

In [18]:
rbf_svm = SVC(kernel="rbf")

In [19]:
rbf_params = {
    "C": [
        1.0, 10.0, 100.0, 1000.0, 10000.0,
    ],
    "gamma": [
        0.01, 0.1, 1.0,
    ],
}

#### 1.3 Polynomial Kernel

In [23]:
poly_svm = SVC(kernel="poly")

In [24]:
poly_params = {
    "C": [
        1.0, 10.0, 100.0, 1000.0, 10000.0,
    ],
    "gamma": [
        0.01, 0.1, 1.0,
    ],
    "degree": [
        3, 6, 9,
    ]
}
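
The three dictionaries above describe grids of 5, 15, and 45 candidates respectively (5 values of `C`, times 3 of `gamma`, times 3 of `degree`). A minimal sketch of how such a grid expands into individual parameter sets, using only `itertools.product`; `expand_grid` is a hypothetical helper, not part of the project's tuning code.

```python
from itertools import product


def expand_grid(kernel, params):
    """Return every combination of the listed values as a list of dicts."""
    names = list(params)
    combos = product(*(params[name] for name in names))
    return [{"kernel": kernel, **dict(zip(names, values))} for values in combos]


c_values = [1.0, 10.0, 100.0, 1000.0, 10000.0]
gamma_values = [0.01, 0.1, 1.0]

linear_grid = expand_grid("linear", {"C": c_values})
rbf_grid = expand_grid("rbf", {"C": c_values, "gamma": gamma_values})
poly_grid = expand_grid(
    "poly", {"C": c_values, "gamma": gamma_values, "degree": [3, 6, 9]}
)

print(len(linear_grid), len(rbf_grid), len(poly_grid))
print(poly_grid[1])
```

The last-listed parameter varies fastest, which matches the ordering of the `i=0`, `i=1`, ... rows in the outputs of this notebook.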

### __2. Performance Measurement (Training)__

Jump to:
- [1. SVM Modeling](#1-svm-modeling)
- [2.1 Performance Measurement: SVM + BoW](#21-performance-measurement-svm--bow)

In [13]:
from model.tuning import SVMGridSearchCV

In [14]:
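n_fold = 10  # number of CV folds; the runs below were executed with both k=5 and k=10
scoring = ["accuracy", "precision", "recall", "f1"]
random_state = 42
dir = "data/result/train/"  # output directory; note that this name shadows the built-in dir()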

In [31]:
linear_train = SVMGridSearchCV(
    model=linear_svm,
    params=linear_params,
    cv=n_fold,
    scoring=scoring,
    scoring_avg="macro",
    random_state=random_state,
)

In [15]:
rbf_train = SVMGridSearchCV(
    model=rbf_svm,
    params=rbf_params,
    cv=n_fold,
    scoring=scoring,
    scoring_avg="macro",
    random_state=random_state,
)

In [15]:
poly_train = SVMGridSearchCV(
    model=poly_svm,
    params=poly_params,
    cv=n_fold,
    scoring=scoring,
    scoring_avg="macro",
    random_state=random_state,
)
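
`SVMGridSearchCV` is a project class whose source is not included here. Judging by its `cv` argument and the `mean_*` scores it reports, it presumably runs k-fold cross-validation for each parameter set and averages the per-fold scores. The sketch below shows only the fold bookkeeping under that assumption: shuffle the row indices with a fixed seed, cut them into `k` nearly equal folds, and use each fold once for validation. `kfold_indices` is a hypothetical helper.

```python
import random


def kfold_indices(n_samples, k, seed=42):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        valid_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, valid_idx


# 2386 is the number of training rows in this notebook.
folds = list(kfold_indices(2386, k=10))
valid_sizes = [len(valid) for _, valid in folds]
print(valid_sizes)
```

With 2386 training rows, k=10 validates on 238 or 239 rows per fold, while k=5 validates on 477 or 478.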

#### 2.1 Performance Measurement: SVM + BoW

Jump to:
- [2. Performance Measurement (Training)](#2-performance-measurement-training)
- [2.2 Performance Measurement: SVM + TF-IDF](#22-performance-measurement-svm--tf-idf)

##### 2.1.1 Linear-BoW Training Performance

In [16]:
linear_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'mean_accuracy': 0.673, 'mean_precision': 0.663, 'mean_recall': 0.659, 'mean_f1': 0.655}
i=1|{'kernel': 'linear', 'C': 10.0}  {'mean_accuracy': 0.676, 'mean_precision': 0.665, 'mean_recall': 0.664, 'mean_f1': 0.664}
i=2|{'kernel': 'linear', 'C': 100.0}  {'mean_accuracy': 0.648, 'mean_precision': 0.637, 'mean_recall': 0.634, 'mean_f1': 0.635}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'mean_accuracy': 0.637, 'mean_precision': 0.627, 'mean_recall': 0.623, 'mean_f1': 0.624}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'mean_accuracy': 0.625, 'mean_precision': 0.616, 'mean_recall': 0.611, 'mean_f1': 0.612}


In [17]:
print(f"Best train (k={n_fold}) result of Linear-BoW:")
linear_train.get_best_result(base_on="accuracy")

Best train (k=5) result of Linear-BoW:


[{'params': {'kernel': 'linear', 'C': 10.0},
  'scores': {'mean_accuracy': 0.676,
   'mean_precision': 0.665,
   'mean_recall': 0.664,
   'mean_f1': 0.664}}]

In [21]:
linear_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'mean_accuracy': 0.685, 'mean_precision': 0.678, 'mean_recall': 0.673, 'mean_f1': 0.67}
i=1|{'kernel': 'linear', 'C': 10.0}  {'mean_accuracy': 0.689, 'mean_precision': 0.68, 'mean_recall': 0.678, 'mean_f1': 0.677}
i=2|{'kernel': 'linear', 'C': 100.0}  {'mean_accuracy': 0.654, 'mean_precision': 0.643, 'mean_recall': 0.64, 'mean_f1': 0.64}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'mean_accuracy': 0.645, 'mean_precision': 0.638, 'mean_recall': 0.632, 'mean_f1': 0.632}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'mean_accuracy': 0.629, 'mean_precision': 0.62, 'mean_recall': 0.615, 'mean_f1': 0.615}


In [23]:
print(f"Best train (k={n_fold}) result of Linear-BoW:")
linear_train.get_best_result(base_on="accuracy")

Best train (k=10) result of Linear-BoW:


[{'params': {'kernel': 'linear', 'C': 10.0},
  'scores': {'mean_accuracy': 0.689,
   'mean_precision': 0.68,
   'mean_recall': 0.678,
   'mean_f1': 0.677}}]
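
`get_best_result` returns a list, which suggests that ties are kept. A minimal sketch of that selection, assuming the results are stored as dictionaries shaped like the output above (the function below is hypothetical, not the project's implementation). The scores are the k=10 Linear-BoW values printed above, limited to accuracy and F1 for brevity.

```python
def best_results(results, base_on="accuracy"):
    """Return every result whose mean score on `base_on` equals the maximum."""
    key = f"mean_{base_on}"
    top = max(result["scores"][key] for result in results)
    return [result for result in results if result["scores"][key] == top]


# Mean scores copied from the k=10 Linear-BoW run above.
results = [
    {"params": {"kernel": "linear", "C": 1.0},
     "scores": {"mean_accuracy": 0.685, "mean_f1": 0.67}},
    {"params": {"kernel": "linear", "C": 10.0},
     "scores": {"mean_accuracy": 0.689, "mean_f1": 0.677}},
    {"params": {"kernel": "linear", "C": 100.0},
     "scores": {"mean_accuracy": 0.654, "mean_f1": 0.64}},
    {"params": {"kernel": "linear", "C": 1000.0},
     "scores": {"mean_accuracy": 0.645, "mean_f1": 0.632}},
    {"params": {"kernel": "linear", "C": 10000.0},
     "scores": {"mean_accuracy": 0.629, "mean_f1": 0.615}},
]

best = best_results(results, base_on="accuracy")
print(best)
```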

In [24]:
# Save the result
res_table = linear_train.get_table_result()
res_best_table = linear_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_linear_bow.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_linear_bow[best].csv", index=False)

##### 2.1.2 RBF-BoW Training Performance

In [29]:
rbf_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'mean_accuracy': 0.527, 'mean_precision': 0.593, 'mean_recall': 0.507, 'mean_f1': 0.471}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'mean_accuracy': 0.642, 'mean_precision': 0.646, 'mean_recall': 0.621, 'mean_f1': 0.618}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'mean_accuracy': 0.576, 'mean_precision': 0.615, 'mean_recall': 0.533, 'mean_f1': 0.513}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'mean_accuracy': 0.657, 'mean_precision': 0.665, 'mean_recall': 0.641, 'mean_f1': 0.636}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'mean_accuracy': 0.677, 'mean_precision': 0.666, 'mean_recall': 0.658, 'mean_f1': 0.659}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'mean_accuracy': 0.583, 'mean_precision': 0.614, 'mean_recall': 0.54, 'mean_f1': 0.521}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'mean_accuracy': 0.674, 'mean_precision': 0.663, 'mean_recall': 0.659, 'mean_f1': 0.658}
i=7|{'kernel': 'rbf', 'C': 100.0, 

In [30]:
print(f"Best train (k={n_fold}) result of RBF-BoW:")
rbf_train.get_best_result(base_on="accuracy")

Best train (k=5) result of RBF-BoW:


[{'params': {'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1},
  'scores': {'mean_accuracy': 0.677,
   'mean_precision': 0.666,
   'mean_recall': 0.658,
   'mean_f1': 0.659}}]

In [34]:
rbf_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'mean_accuracy': 0.544, 'mean_precision': 0.61, 'mean_recall': 0.523, 'mean_f1': 0.494}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'mean_accuracy': 0.655, 'mean_precision': 0.659, 'mean_recall': 0.638, 'mean_f1': 0.636}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'mean_accuracy': 0.579, 'mean_precision': 0.627, 'mean_recall': 0.537, 'mean_f1': 0.519}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'mean_accuracy': 0.663, 'mean_precision': 0.67, 'mean_recall': 0.649, 'mean_f1': 0.644}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'mean_accuracy': 0.691, 'mean_precision': 0.682, 'mean_recall': 0.674, 'mean_f1': 0.674}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'mean_accuracy': 0.588, 'mean_precision': 0.625, 'mean_recall': 0.545, 'mean_f1': 0.527}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'mean_accuracy': 0.689, 'mean_precision': 0.681, 'mean_recall': 0.675, 'mean_f1': 0.674}
i=7|{'kernel': 'rbf', 'C': 100.0, '

In [35]:
print(f"Best train (k={n_fold}) result of RBF-BoW:")
rbf_train.get_best_result(base_on="accuracy")

Best train (k=10) result of RBF-BoW:


[{'params': {'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1},
  'scores': {'mean_accuracy': 0.691,
   'mean_precision': 0.682,
   'mean_recall': 0.674,
   'mean_f1': 0.674}}]

In [36]:
# Save the result
res_table = rbf_train.get_table_result()
res_best_table = rbf_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_rbf_bow.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_rbf_bow[best].csv", index=False)

##### 2.1.3 Polynomial-BoW Training Performance

In [42]:
poly_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'mean_accuracy': 0.365, 'mean_precision': 0.305, 'mean_recall': 0.336, 'mean_f1': 0.182}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'mean_accuracy': 0.363, 'mean_precision': 0.188, 'mean_recall': 0.334, 'mean_f1': 0.178}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'mean_accuracy': 0.363, 'mean_precision': 0.188, 'mean_recall': 0.334, 'mean_f1': 0.178}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'mean_accuracy': 0.368, 'mean_precision': 0.566, 'mean_recall': 0.338, 'mean_f1': 0.188}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'mean_accuracy': 0.364, 'mean_precision': 0.321, 'mean_recall': 0.334, 'mean_f1': 0.18}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'mean_accuracy': 0.364, 'mean_precision': 0.321, 'mean_recall': 0.334, 'mean_f1': 0.18}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'mean_accuracy': 0.491, 'mean_p

In [43]:
print(f"Best train (k={n_fold}) result of Poly-BoW:")
poly_train.get_best_result(base_on="accuracy")

Best train (k=5) result of Poly-BoW:


[{'params': {'kernel': 'poly', 'C': 100.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'mean_accuracy': 0.593,
   'mean_precision': 0.592,
   'mean_recall': 0.576,
   'mean_f1': 0.567}}]

In [17]:
poly_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'mean_accuracy': 0.367, 'mean_precision': 0.271, 'mean_recall': 0.337, 'mean_f1': 0.186}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'mean_accuracy': 0.363, 'mean_precision': 0.154, 'mean_recall': 0.334, 'mean_f1': 0.178}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'mean_accuracy': 0.363, 'mean_precision': 0.154, 'mean_recall': 0.334, 'mean_f1': 0.178}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'mean_accuracy': 0.368, 'mean_precision': 0.405, 'mean_recall': 0.338, 'mean_f1': 0.188}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'mean_accuracy': 0.364, 'mean_precision': 0.221, 'mean_recall': 0.334, 'mean_f1': 0.18}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'mean_accuracy': 0.364, 'mean_precision': 0.221, 'mean_recall': 0.334, 'mean_f1': 0.18}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'mean_accuracy': 0.503, 'mean_p

In [18]:
print(f"Best train (k={n_fold}) result of Poly-BoW:")
poly_train.get_best_result(base_on="accuracy")

Best train (k=10) result of Poly-BoW:


[{'params': {'kernel': 'poly', 'C': 100.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'mean_accuracy': 0.609,
   'mean_precision': 0.605,
   'mean_recall': 0.59,
   'mean_f1': 0.582}}]

In [19]:
# Save the result
res_table = poly_train.get_table_result()
res_best_table = poly_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_poly_bow.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_poly_bow[best].csv", index=False)

#### 2.2 Performance Measurement: SVM + TF-IDF

Jump to:
- [2.1 Performance Measurement: SVM + BoW](#21-performance-measurement-svm--bow)
- [3. Testing](#3-testing)

##### 2.2.1 Linear-TFIDF Training Performance

In [27]:
linear_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'mean_accuracy': 0.698, 'mean_precision': 0.687, 'mean_recall': 0.685, 'mean_f1': 0.686}
i=1|{'kernel': 'linear', 'C': 10.0}  {'mean_accuracy': 0.676, 'mean_precision': 0.668, 'mean_recall': 0.663, 'mean_f1': 0.665}
i=2|{'kernel': 'linear', 'C': 100.0}  {'mean_accuracy': 0.651, 'mean_precision': 0.643, 'mean_recall': 0.638, 'mean_f1': 0.639}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'mean_accuracy': 0.627, 'mean_precision': 0.621, 'mean_recall': 0.614, 'mean_f1': 0.615}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'mean_accuracy': 0.604, 'mean_precision': 0.606, 'mean_recall': 0.592, 'mean_f1': 0.593}


In [28]:
print(f"Best train (k={n_fold}) result of Linear-TFIDF:")
linear_train.get_best_result(base_on="accuracy")

Best train (k=5) result of Linear-TFIDF:


[{'params': {'kernel': 'linear', 'C': 1.0},
  'scores': {'mean_accuracy': 0.698,
   'mean_precision': 0.687,
   'mean_recall': 0.685,
   'mean_f1': 0.686}}]

In [32]:
linear_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'mean_accuracy': 0.721, 'mean_precision': 0.713, 'mean_recall': 0.711, 'mean_f1': 0.71}
i=1|{'kernel': 'linear', 'C': 10.0}  {'mean_accuracy': 0.686, 'mean_precision': 0.679, 'mean_recall': 0.673, 'mean_f1': 0.673}
i=2|{'kernel': 'linear', 'C': 100.0}  {'mean_accuracy': 0.647, 'mean_precision': 0.639, 'mean_recall': 0.633, 'mean_f1': 0.633}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'mean_accuracy': 0.648, 'mean_precision': 0.639, 'mean_recall': 0.632, 'mean_f1': 0.633}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'mean_accuracy': 0.638, 'mean_precision': 0.634, 'mean_recall': 0.623, 'mean_f1': 0.623}


In [33]:
print(f"Best train (k={n_fold}) result of Linear-TFIDF:")
linear_train.get_best_result(base_on="accuracy")

Best train (k=10) result of Linear-TFIDF:


[{'params': {'kernel': 'linear', 'C': 1.0},
  'scores': {'mean_accuracy': 0.721,
   'mean_precision': 0.713,
   'mean_recall': 0.711,
   'mean_f1': 0.71}}]

In [34]:
# Save the result
res_table = linear_train.get_table_result()
res_best_table = linear_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_linear_tfidf.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_linear_tfidf[best].csv", index=False)

##### 2.2.2 RBF-TFIDF Training Performance

In [41]:
rbf_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'mean_accuracy': 0.47, 'mean_precision': 0.662, 'mean_recall': 0.476, 'mean_f1': 0.407}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'mean_accuracy': 0.673, 'mean_precision': 0.677, 'mean_recall': 0.667, 'mean_f1': 0.667}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'mean_accuracy': 0.709, 'mean_precision': 0.717, 'mean_recall': 0.683, 'mean_f1': 0.686}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'mean_accuracy': 0.68, 'mean_precision': 0.68, 'mean_recall': 0.676, 'mean_f1': 0.674}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'mean_accuracy': 0.704, 'mean_precision': 0.695, 'mean_recall': 0.688, 'mean_f1': 0.69}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'mean_accuracy': 0.711, 'mean_precision': 0.71, 'mean_recall': 0.69, 'mean_f1': 0.693}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'mean_accuracy': 0.7, 'mean_precision': 0.69, 'mean_recall': 0.685, 'mean_f1': 0.686}
i=7|{'kernel': 'rbf', 'C': 100.0, 'gamma':

In [42]:
print(f"Best train (k={n_fold}) result of RBF-TFIDF:")
rbf_train.get_best_result(base_on="accuracy")

Best train (k=5) result of RBF-TFIDF:


[{'params': {'kernel': 'rbf', 'C': 100.0, 'gamma': 1.0},
  'scores': {'mean_accuracy': 0.712,
   'mean_precision': 0.711,
   'mean_recall': 0.691,
   'mean_f1': 0.694}}]

In [16]:
rbf_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'mean_accuracy': 0.499, 'mean_precision': 0.663, 'mean_recall': 0.511, 'mean_f1': 0.444}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'mean_accuracy': 0.693, 'mean_precision': 0.692, 'mean_recall': 0.688, 'mean_f1': 0.686}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'mean_accuracy': 0.725, 'mean_precision': 0.736, 'mean_recall': 0.701, 'mean_f1': 0.704}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'mean_accuracy': 0.697, 'mean_precision': 0.694, 'mean_recall': 0.695, 'mean_f1': 0.691}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'mean_accuracy': 0.719, 'mean_precision': 0.711, 'mean_recall': 0.705, 'mean_f1': 0.706}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'mean_accuracy': 0.731, 'mean_precision': 0.728, 'mean_recall': 0.712, 'mean_f1': 0.714}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'mean_accuracy': 0.713, 'mean_precision': 0.704, 'mean_recall': 0.699, 'mean_f1': 0.7}
i=7|{'kernel': 'rbf', 'C': 100.0, '

In [17]:
print(f"Best train (k={n_fold}) result of RBF-TFIDF:")
rbf_train.get_best_result(base_on="accuracy")

Best train (k=10) result of RBF-TFIDF:


[{'params': {'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0},
  'scores': {'mean_accuracy': 0.731,
   'mean_precision': 0.728,
   'mean_recall': 0.712,
   'mean_f1': 0.714}}]

In [18]:
# Save the result
res_table = rbf_train.get_table_result()
res_best_table = rbf_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_rbf_tfidf.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_rbf_tfidf[best].csv", index=False)

##### 2.2.3 Polynomial-TFIDF Training Performance

In [23]:
poly_train.fit(X_train, y_train) # k = 5

Number of folds: 5
Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'mean_accuracy': 0.432, 'mean_precision': 0.758, 'mean_recall': 0.388, 'mean_f1': 0.291}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'mean_accuracy': 0.43, 'mean_precision': 0.771, 'mean_recall': 0.386, 'mean_f1': 0.286}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'mean_accuracy': 0.553, 'mean_precision': 0.559, 'mean_recall': 0.538, 'mean_f1': 0.534}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'mean_accuracy': 0.432, 'mean_precision': 0.758, 'mean_recall': 0.388, 'mean_f1': 0.291}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'mean_accuracy': 0.43, 'mean_precision': 0.771, 'mean_recall': 0.386, 'mean_f1': 0.286}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'mean_accuracy': 0.43, 'mean_precision': 0.771, 'mean_recall': 0.386, 'mean_f1': 0.286}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'mean_accuracy': 0.568, 'mean_pr

In [24]:
print(f"Best train (k={n_fold}) result of Poly-TFIDF:")
poly_train.get_best_result(base_on="accuracy")

Best train (k=5) result of Poly-TFIDF:


[{'params': {'kernel': 'poly', 'C': 10.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'mean_accuracy': 0.578,
   'mean_precision': 0.626,
   'mean_recall': 0.595,
   'mean_f1': 0.58}}]

In [16]:
poly_train.fit(X_train, y_train) # k = 10

Number of folds: 10
Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'mean_accuracy': 0.429, 'mean_precision': 0.725, 'mean_recall': 0.384, 'mean_f1': 0.283}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'mean_accuracy': 0.429, 'mean_precision': 0.681, 'mean_recall': 0.384, 'mean_f1': 0.281}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'mean_accuracy': 0.556, 'mean_precision': 0.562, 'mean_recall': 0.54, 'mean_f1': 0.536}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'mean_accuracy': 0.429, 'mean_precision': 0.725, 'mean_recall': 0.384, 'mean_f1': 0.283}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'mean_accuracy': 0.429, 'mean_precision': 0.681, 'mean_recall': 0.384, 'mean_f1': 0.281}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'mean_accuracy': 0.429, 'mean_precision': 0.681, 'mean_recall': 0.384, 'mean_f1': 0.282}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'mean_accuracy': 0.579, 'mean_

In [17]:
print(f"Best train (k={n_fold}) result of Poly-TFIDF:")
poly_train.get_best_result(base_on="accuracy")

Best train (k=10) result of Poly-TFIDF:


[{'params': {'kernel': 'poly', 'C': 10.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'mean_accuracy': 0.593,
   'mean_precision': 0.636,
   'mean_recall': 0.607,
   'mean_f1': 0.594}}]

In [18]:
# Save the result
res_table = poly_train.get_table_result()
res_best_table = poly_train.get_best_table_result(base_on="accuracy")

res_table.to_csv(dir+f"train_k{n_fold}_poly_tfidf.csv", index=False)
res_best_table.to_csv(dir+f"train_k{n_fold}_poly_tfidf[best].csv", index=False)

***

### __3. Testing__

Jump to:
- [2.2 Performance Measurement: SVM + TF-IDF](#22-performance-measurement-svm--tf-idf)
- [Table of Contents](#table-of-contents)

#### 3.1 Testing: SVM + BoW

In [13]:
from model.testing import SVMGridTester

In [14]:
scoring = ["accuracy", "precision", "recall", "f1",]
random_state = 42
dir = "data/result/test/"

##### 3.1.1 Linear-BoW Testing

In [15]:
linear_test = SVMGridTester(
    model = linear_svm,
    params = linear_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

linear_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'accuracy': 0.708, 'precision': 0.696, 'recall': 0.693, 'f1': 0.692}
i=1|{'kernel': 'linear', 'C': 10.0}  {'accuracy': 0.716, 'precision': 0.699, 'recall': 0.699, 'f1': 0.699}
i=2|{'kernel': 'linear', 'C': 100.0}  {'accuracy': 0.682, 'precision': 0.669, 'recall': 0.669, 'f1': 0.668}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'accuracy': 0.673, 'precision': 0.662, 'recall': 0.657, 'f1': 0.658}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'accuracy': 0.66, 'precision': 0.649, 'recall': 0.643, 'f1': 0.644}


In [16]:
print("Best test result of Linear-BoW:")
res_best = linear_test.get_best_result(base_on="accuracy")
res_best

Best test result of Linear-BoW:


[{'params': {'kernel': 'linear', 'C': 10.0},
  'scores': {'accuracy': 0.716,
   'precision': 0.699,
   'recall': 0.699,
   'f1': 0.699},
  'cm': array([[292,  54,  45],
         [ 33, 298,  56],
         [ 63,  40, 142]], dtype=int64)}]
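
The reported scores can be reproduced from the confusion matrix alone. The sketch below assumes scikit-learn's layout (rows are true classes, columns are predicted classes), which fits here because the row sums 391, 387, and 245 equal the test-set class counts. It computes accuracy and the macro averages, that is, the unweighted mean of the per-class values. `macro_scores` is a hypothetical helper written for this illustration.

```python
def macro_scores(cm):
    """Accuracy and macro-averaged precision, recall, and F1 from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        predicted = sum(cm[r][i] for r in range(n))  # column sum: predicted as class i
        actual = sum(cm[i])                          # row sum: truly class i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Confusion matrix of the best Linear-BoW model, copied from the output above.
cm = [[292, 54, 45], [33, 298, 56], [63, 40, 142]]
scores = {name: round(value, 3) for name, value in macro_scores(cm).items()}
print(scores)  # {'accuracy': 0.716, 'precision': 0.699, 'recall': 0.699, 'f1': 0.699}
```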

In [17]:
# Save the result
res_table = linear_test.get_table_result()
res_best_table = linear_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_linear_bow.csv", index=False)
res_best_table.to_csv(dir+"test_linear_bow[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_linear_bow[best-cm].csv")

##### 3.1.2 RBF-BoW Testing

In [20]:
rbf_test = SVMGridTester(
    model = rbf_svm,
    params = rbf_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

rbf_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'accuracy': 0.563, 'precision': 0.616, 'recall': 0.541, 'f1': 0.521}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'accuracy': 0.65, 'precision': 0.646, 'recall': 0.624, 'f1': 0.625}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'accuracy': 0.601, 'precision': 0.665, 'recall': 0.551, 'f1': 0.54}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'accuracy': 0.67, 'precision': 0.672, 'recall': 0.648, 'f1': 0.648}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'accuracy': 0.707, 'precision': 0.695, 'recall': 0.685, 'f1': 0.688}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'accuracy': 0.621, 'precision': 0.668, 'recall': 0.572, 'f1': 0.566}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'accuracy': 0.715, 'precision': 0.703, 'recall': 0.7, 'f1': 0.701}
i=7|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.1}  {'accuracy': 0.708, 'precision': 0.692, 'recall': 0.686, 'f1': 0.688}
i=8|{'kernel': 'rbf', 'C': 100.0, 'gamma': 1.0}  {'accuracy

In [21]:
print("Best test result of RBF-BoW:")
res_best = rbf_test.get_best_result(base_on="accuracy")
res_best

Best test result of RBF-BoW:


[{'params': {'kernel': 'rbf', 'C': 1000.0, 'gamma': 0.01},
  'scores': {'accuracy': 0.72,
   'precision': 0.707,
   'recall': 0.703,
   'f1': 0.704},
  'cm': array([[303,  51,  37],
         [ 45, 293,  49],
         [ 67,  37, 141]], dtype=int64)}]

In [22]:
# Save the result
res_table = rbf_test.get_table_result()
res_best_table = rbf_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_rbf_bow.csv", index=False)
res_best_table.to_csv(dir+"test_rbf_bow[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_rbf_bow[best-cm].csv")

##### 3.1.3 Polynomial-BoW Testing

In [25]:
poly_test = SVMGridTester(
    model = poly_svm,
    params = poly_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

poly_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'accuracy': 0.382, 'precision': 0.46, 'recall': 0.339, 'f1': 0.194}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'accuracy': 0.378, 'precision': 0.126, 'recall': 0.333, 'f1': 0.183}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'accuracy': 0.378, 'precision': 0.126, 'recall': 0.333, 'f1': 0.183}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'accuracy': 0.384, 'precision': 0.46, 'recall': 0.341, 'f1': 0.2}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'accuracy': 0.378, 'precision': 0.126, 'recall': 0.333, 'f1': 0.183}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'accuracy': 0.378, 'precision': 0.126, 'recall': 0.333, 'f1': 0.183}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'accuracy': 0.514, 'precision': 0.587, 'recall': 0.478, 'f1': 0.448}
i=7|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 6}  {'accuracy': 0.449, 'pr

In [26]:
print("Best test result of Poly-BoW:")
res_best = poly_test.get_best_result(base_on="accuracy")
res_best

Best test result of Poly-BoW:


[{'params': {'kernel': 'poly', 'C': 100.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'accuracy': 0.64,
   'precision': 0.635,
   'recall': 0.616,
   'f1': 0.616},
  'cm': array([[227, 124,  40],
         [ 35, 320,  32],
         [ 77,  60, 108]], dtype=int64)}]

In [27]:
# Save the result
res_table = poly_test.get_table_result()
res_best_table = poly_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_poly_bow.csv", index=False)
res_best_table.to_csv(dir+"test_poly_bow[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_poly_bow[best-cm].csv")

#### 3.2 Testing: SVM + TF-IDF

##### 3.2.1 Linear-TFIDF Testing

In [15]:
linear_test = SVMGridTester(
    model = linear_svm,
    params = linear_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

linear_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 5


i=0|{'kernel': 'linear', 'C': 1.0}  {'accuracy': 0.741, 'precision': 0.729, 'recall': 0.726, 'f1': 0.727}
i=1|{'kernel': 'linear', 'C': 10.0}  {'accuracy': 0.719, 'precision': 0.71, 'recall': 0.706, 'f1': 0.708}
i=2|{'kernel': 'linear', 'C': 100.0}  {'accuracy': 0.697, 'precision': 0.688, 'recall': 0.683, 'f1': 0.685}
i=3|{'kernel': 'linear', 'C': 1000.0}  {'accuracy': 0.697, 'precision': 0.688, 'recall': 0.684, 'f1': 0.686}
i=4|{'kernel': 'linear', 'C': 10000.0}  {'accuracy': 0.693, 'precision': 0.686, 'recall': 0.681, 'f1': 0.683}


In [16]:
print("Best test result of Linear-TFIDF:")
res_best = linear_test.get_best_result(base_on="accuracy")
res_best

Best test result of Linear-TFIDF:


[{'params': {'kernel': 'linear', 'C': 1.0},
  'scores': {'accuracy': 0.741,
   'precision': 0.729,
   'recall': 0.726,
   'f1': 0.727},
  'cm': array([[308,  37,  46],
         [ 47, 299,  41],
         [ 63,  31, 151]], dtype=int64)}]

In [17]:
# Save the result
res_table = linear_test.get_table_result()
res_best_table = linear_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_linear_tfidf.csv", index=False)
res_best_table.to_csv(dir+"test_linear_tfidf[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_linear_tfidf[best-cm].csv")

##### 3.2.2 RBF-TFIDF Testing

In [20]:
rbf_test = SVMGridTester(
    model = rbf_svm,
    params = rbf_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

rbf_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 15


i=0|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.01}  {'accuracy': 0.505, 'precision': 0.643, 'recall': 0.515, 'f1': 0.442}
i=1|{'kernel': 'rbf', 'C': 1.0, 'gamma': 0.1}  {'accuracy': 0.686, 'precision': 0.679, 'recall': 0.679, 'f1': 0.676}
i=2|{'kernel': 'rbf', 'C': 1.0, 'gamma': 1.0}  {'accuracy': 0.724, 'precision': 0.721, 'recall': 0.692, 'f1': 0.697}
i=3|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.01}  {'accuracy': 0.703, 'precision': 0.693, 'recall': 0.697, 'f1': 0.692}
i=4|{'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1}  {'accuracy': 0.741, 'precision': 0.73, 'recall': 0.722, 'f1': 0.725}
i=5|{'kernel': 'rbf', 'C': 10.0, 'gamma': 1.0}  {'accuracy': 0.731, 'precision': 0.724, 'recall': 0.704, 'f1': 0.709}
i=6|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.01}  {'accuracy': 0.736, 'precision': 0.724, 'recall': 0.718, 'f1': 0.72}
i=7|{'kernel': 'rbf', 'C': 100.0, 'gamma': 0.1}  {'accuracy': 0.721, 'precision': 0.71, 'recall': 0.706, 'f1': 0.708}
i=8|{'kernel': 'rbf', 'C': 100.0, 'gamma': 1.0}  {'accura

In [21]:
print("Best test result of RBF-TFIDF:")
res_best = rbf_test.get_best_result(base_on="accuracy")
res_best

Best test result of RBF-TFIDF:


[{'params': {'kernel': 'rbf', 'C': 10.0, 'gamma': 0.1},
  'scores': {'accuracy': 0.741,
   'precision': 0.73,
   'recall': 0.722,
   'f1': 0.725},
  'cm': array([[315,  38,  38],
         [ 50, 299,  38],
         [ 68,  33, 144]], dtype=int64)}]

In [22]:
# Save the result
res_table = rbf_test.get_table_result()
res_best_table = rbf_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_rbf_tfidf.csv", index=False)
res_best_table.to_csv(dir+"test_rbf_tfidf[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_rbf_tfidf[best-cm].csv")

##### 3.2.3 Polynomial-TFIDF Testing

In [25]:
poly_test = SVMGridTester(
    model = poly_svm,
    params = poly_params,
    scoring = scoring,
    scoring_avg = "macro",
    random_state = random_state
)

poly_test.fit_predict(
    X_train,
    X_test,
    y_train,
    y_test,
)

Total combination of parameters: 45


i=0|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 3}  {'accuracy': 0.433, 'precision': 0.792, 'recall': 0.385, 'f1': 0.287}
i=1|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 6}  {'accuracy': 0.429, 'precision': 0.792, 'recall': 0.38, 'f1': 0.277}
i=2|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.01, 'degree': 9}  {'accuracy': 0.582, 'precision': 0.583, 'recall': 0.553, 'f1': 0.55}
i=3|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 3}  {'accuracy': 0.433, 'precision': 0.792, 'recall': 0.385, 'f1': 0.287}
i=4|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 6}  {'accuracy': 0.429, 'precision': 0.792, 'recall': 0.38, 'f1': 0.277}
i=5|{'kernel': 'poly', 'C': 1.0, 'gamma': 0.1, 'degree': 9}  {'accuracy': 0.429, 'precision': 0.792, 'recall': 0.38, 'f1': 0.277}
i=6|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 3}  {'accuracy': 0.597, 'precision': 0.696, 'recall': 0.553, 'f1': 0.544}
i=7|{'kernel': 'poly', 'C': 1.0, 'gamma': 1.0, 'degree': 6}  {'accuracy': 0.515, 'pr

In [26]:
print("Best test result of Poly-TFIDF:")
res_best = poly_test.get_best_result(base_on="accuracy")
res_best

Best test result of Poly-TFIDF:


[{'params': {'kernel': 'poly', 'C': 10.0, 'gamma': 1.0, 'degree': 3},
  'scores': {'accuracy': 0.62,
   'precision': 0.643,
   'recall': 0.63,
   'f1': 0.616},
  'cm': array([[257,  32, 102],
         [ 57, 204, 126],
         [ 50,  22, 173]], dtype=int64)}]

In [27]:
# Save the result
res_table = poly_test.get_table_result()
res_best_table = poly_test.get_best_table_result(base_on="accuracy")
res_best_cm_table = pandas.DataFrame(res_best[0]["cm"])

res_table.to_csv(dir+"test_poly_tfidf.csv", index=False)
res_best_table.to_csv(dir+"test_poly_tfidf[best].csv", index=False)
res_best_cm_table.to_csv(dir+"test_poly_tfidf[best-cm].csv")
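
To compare the six configurations side by side, the best test accuracies reported in sections 3.1 and 3.2 can be collected and ranked. The numbers below are copied from the outputs above; the code is only a convenience sketch and the variable names are made up for it.

```python
# Best test accuracy per (kernel, feature) pair, copied from the outputs above.
best_test_accuracy = {
    ("linear", "bow"): 0.716,
    ("rbf", "bow"): 0.72,
    ("poly", "bow"): 0.64,
    ("linear", "tfidf"): 0.741,
    ("rbf", "tfidf"): 0.741,
    ("poly", "tfidf"): 0.62,
}

# Sort from highest to lowest accuracy; ties keep their insertion order.
ranking = sorted(best_test_accuracy.items(), key=lambda item: item[1], reverse=True)
top_score = ranking[0][1]
winners = [pair for pair, score in ranking if score == top_score]

for (kernel, feature), score in ranking:
    print(f"{kernel:<6} {feature:<5} {score:.3f}")
print(winners)
```

On these numbers the linear and RBF kernels on TF-IDF tie at 0.741, and the polynomial kernel trails on both feature sets.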