Automatically reload the .py files in this directory so that edits to them are picked up correctly by the pipeline.

In [1]:
%reload_ext autoreload
%autoreload 2

Import the functions for each part of the pipeline.

In [2]:
import settings

from load_data import read_json
from training_SVM import tune_linear_svm_with_cv
from evaluation import evaluate_eval_files
import joblib

Load the training set.

In [3]:
TRAIN_FILES = [
    str(settings.DATA_HOME / "train/usual_train.txt"),
    str(settings.DATA_HOME / "train/virus_train.txt"),
]

all_train_texts = []
all_train_labels = []

for file in TRAIN_FILES:
    texts, labels = read_json(file)
    all_train_texts += texts
    all_train_labels += labels

print(f"Total train samples: {len(all_train_texts)}")

Total train samples: 36322
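
For readers without access to `load_data.py`, here is a minimal sketch of what a loader like `read_json` might look like. The file layout (a single JSON array of objects) and the field names `content` and `label` are assumptions made for illustration; the notebook itself only shows that the function returns a list of texts and a list of labels.

```python
import json
from pathlib import Path


def read_json_sketch(path, text_key="content", label_key="label"):
    """Hypothetical stand-in for load_data.read_json.

    Assumes the file holds one JSON array of objects, each carrying a text
    field and a label field. The key names are illustrative assumptions.
    """
    with Path(path).open(encoding="utf-8") as f:
        records = json.load(f)

    # Split the records into two parallel lists, as the notebook expects.
    texts = [record[text_key] for record in records]
    labels = [record[label_key] for record in records]
    return texts, labels
```

If the real files use different keys or one JSON object per line, only the parsing step would change; the two parallel lists returned to the caller stay the same.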


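Using 5-fold cross-validation (cv = 5) with a randomized search over 100 values of C in the range 0.001 to 100, we obtain the model's best **C** and the corresponding cross-validated **F1 score** on the training set, shown in the output below.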

In [4]:
result = tune_linear_svm_with_cv(all_train_texts, all_train_labels)

best_model = result["best_model"]
vectorizer = result["vectorizer"]
print("Best params:", result["best_params"])
print("Best CV score (weighted F1):", result["best_score"])

Best params: {'C': 0.4750810162102798}
Best CV score (weighted F1): 0.6909869572878653
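
The search range for C spans five orders of magnitude, so how the 100 candidates are drawn matters. The sketch below draws them log-uniformly, which gives each decade roughly equal coverage. Whether `tune_linear_svm_with_cv` samples this way or uniformly is not stated in this notebook, so treat the distribution, the helper name `sample_c_candidates`, and the seed as assumptions.

```python
import math
import random


def sample_c_candidates(low=1e-3, high=1e2, n=100, seed=0):
    """Draw n candidate C values log-uniformly from [low, high].

    Hypothetical helper: the sampling distribution is an assumption.
    """
    rng = random.Random(seed)
    log_low, log_high = math.log10(low), math.log10(high)
    # Sample the exponent uniformly, then map back to the original scale.
    return [10 ** rng.uniform(log_low, log_high) for _ in range(n)]


candidates = sample_c_candidates()
below_one = sum(c < 1.0 for c in candidates)
print(f"{len(candidates)} candidates, {below_one} below C = 1")
```

The reported best value, C ≈ 0.475, lies in the decade between 0.1 and 1. Under plain uniform sampling over the same range, only about 1% of draws would fall below 1, so that region would be explored very sparsely.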


In [5]:
joblib.dump({"model": best_model, "vectorizer": vectorizer}, "svm_emotion_model.pkl")

['svm_emotion_model.pkl']

## Evaluation

In [6]:
eval_usual = str(settings.DATA_HOME / "eval/usual_eval.txt")
eval_usual_label = str(settings.DATA_HOME / "eval/usual_eval_labeled.txt")
eval_virus = str(settings.DATA_HOME / "eval/virus_eval.txt")
eval_virus_label = str(settings.DATA_HOME / "eval/virus_eval_labeled.txt")

usual_ids, usual_trues, usual_pred = evaluate_eval_files(
    best_model, vectorizer, eval_usual, eval_usual_label
)

print("\n=====================================================================\n")

virus_ids, virus_trues, virus_pred = evaluate_eval_files(
    best_model, vectorizer, eval_virus, eval_virus_label
)

Target Samples: ./datasets/eval/usual_eval.txt
Total eval samples: 1997
=== Classification Report (per-class F1) ===
              precision    recall  f1-score   support

       angry     0.7554    0.8370    0.7941       583
        fear     0.6404    0.6552    0.6477        87
       happy     0.7344    0.7212    0.7277       391
     neutral     0.8145    0.8048    0.8096       420
         sad     0.5936    0.5867    0.5901       346
    surprise     0.6612    0.4706    0.5498       170

    accuracy                         0.7251      1997
   macro avg     0.6999    0.6793    0.6865      1997
weighted avg     0.7226    0.7251    0.7219      1997

=== Accuracy ===
0.7250876314471708

=== Confusion Matrix (rows=true, cols=pred) ===
[[488   2  17   9  58   9]
 [ 10  57   4   2   8   6]
 [ 23   5 282  32  37  12]
 [ 13   2  34 338  27   6]
 [ 75  10  31  19 203   8]
 [ 37  13  16  15   9  80]]


Target Samples: ./datasets/eval/virus_eval.txt
Total eval samples: 1996
=== Classification

## Test set
### (./test/mixed/virus_test_*)

In [7]:
test_virus = str(settings.DATA_HOME / "test/mixed/virus_test.txt")
test_virus_label = str(settings.DATA_HOME / "test/mixed/virus_test_labeled.txt")

test_ids, test_trues, test_pred = evaluate_eval_files(
    best_model, vectorizer, test_virus, test_virus_label
)

Target Samples: ./datasets/test/mixed/virus_test.txt
Total eval samples: 3000
=== Classification Report (per-class F1) ===
              precision    recall  f1-score   support

       angry     0.6585    0.6415    0.6499       463
        fear     0.5893    0.3474    0.4371       190
       happy     0.8812    0.8721    0.8766      1540
     neutral     0.5573    0.6731    0.6098       520
         sad     0.5061    0.5662    0.5345       219
    surprise     0.4250    0.2500    0.3148        68

    accuracy                         0.7323      3000
   macro avg     0.6029    0.5584    0.5704      3000
weighted avg     0.7345    0.7323    0.7298      3000

=== Accuracy ===
0.7323333333333333

=== Confusion Matrix (rows=true, cols=pred) ===
[[ 297   12   41   91   17    5]
 [  33   66   17   33   36    5]
 [  40    7 1343  104   40    6]
 [  39   14   90  350   23    4]
 [  25    9   26   32  124    3]
 [  17    4    7   18    5   17]]
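
Every number in the classification report can be recomputed from the confusion matrix, which is a useful check when comparing classes. The sketch below does so for the test-set matrix above using only the standard library. It assumes the rows and columns follow the same alphabetical label order as the report; the helper name `per_class_metrics` is illustrative.

```python
LABELS = ["angry", "fear", "happy", "neutral", "sad", "surprise"]

# Test-set confusion matrix copied from the output above (rows=true, cols=pred).
CONFUSION = [
    [297, 12, 41, 91, 17, 5],
    [33, 66, 17, 33, 36, 5],
    [40, 7, 1343, 104, 40, 6],
    [39, 14, 90, 350, 23, 4],
    [25, 9, 26, 32, 124, 3],
    [17, 4, 7, 18, 5, 17],
]


def per_class_metrics(matrix, labels):
    """Return {label: {"precision", "recall", "f1", "support"}}."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as this class
        support = sum(matrix[i])                   # row sum: truly this class
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return metrics


metrics = per_class_metrics(CONFUSION, LABELS)
correct = sum(CONFUSION[i][i] for i in range(len(LABELS)))
accuracy = correct / sum(sum(row) for row in CONFUSION)

for label, m in metrics.items():
    print(
        f"{label:>8}  P={m['precision']:.4f}  R={m['recall']:.4f}  "
        f"F1={m['f1']:.4f}  n={m['support']}"
    )
print(f"accuracy={accuracy:.4f}")
```

Reading the surprise row shows where its low recall comes from: of 68 true surprise samples, only 17 are predicted correctly, while 18 are labeled neutral and 17 angry.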


## Summary

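Accuracy on the eval set (both usual and virus) and on the test set (virus) stays within **0.72 to 0.74**.

However, F1 scores vary considerably across emotion classes. On the virus test set, for example, happy reaches an F1 of 0.8766, while surprise reaches only 0.3148.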
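
The gap between overall accuracy and per-class F1 follows from class imbalance: a support-weighted average is dominated by the largest class. As a small worked check, the sketch below recomputes the macro and weighted F1 averages for the test set from the per-class values in the report. The variable names are illustrative.

```python
# (f1, support) per class, copied from the test-set classification report.
REPORT = {
    "angry": (0.6499, 463),
    "fear": (0.4371, 190),
    "happy": (0.8766, 1540),
    "neutral": (0.6098, 520),
    "sad": (0.5345, 219),
    "surprise": (0.3148, 68),
}

total = sum(support for _, support in REPORT.values())

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1 for f1, _ in REPORT.values()) / len(REPORT)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in REPORT.values()) / total

happy_share = REPORT["happy"][1] / total

print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
print(f"happy share = {happy_share:.1%}")
```

Because happy accounts for just over half of the test samples, it pulls the weighted F1 (about 0.73) well above the macro F1 (about 0.57). The macro figure is the more informative one when the smaller classes such as fear and surprise matter.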