# HateStack

This notebook contains the code for our proposed model, HateStack. We report the 5-fold cross-validation results (used to estimate how well the model generalizes to the competition leaderboard) and the final test score obtained after training on the whole training set.

The evaluation score used during the competition was a custom variant of the F1-score, defined as:

$$F_{1\,custom} = 0.5 \cdot (F_{1\,Hate} + Macro \, F_{1\,communities})$$
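
A minimal sketch of how this score could be computed, assuming that $F_{1\,Hate}$ is the binary F1 on the `Odio` label and that the macro term is the unweighted mean of the binary F1 scores over the remaining community labels. The helpers `binary_f1` and `custom_f1` and the toy labels are illustrative only; they are not the competition's actual scorer.

```python
def binary_f1(y_true, y_pred):
    """F1 for one binary label; returns 0.0 when TP, FP and FN are all zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def custom_f1(y_true, y_pred, hate_label="Odio"):
    """y_true / y_pred map each label name to a list of 0/1 values."""
    f1_hate = binary_f1(y_true[hate_label], y_pred[hate_label])
    # Assumption: every label other than the hate label is a community label.
    communities = [label for label in y_true if label != hate_label]
    macro_f1 = sum(binary_f1(y_true[c], y_pred[c]) for c in communities) / len(communities)
    return 0.5 * (f1_hate + macro_f1)


# Toy example with two community labels (made-up values).
y_true = {"Odio": [1, 1, 0, 0], "Mujeres": [1, 0, 0, 0], "Comunidad LGBTQ+": [0, 1, 0, 0]}
y_pred = {"Odio": [1, 0, 0, 0], "Mujeres": [1, 0, 0, 0], "Comunidad LGBTQ+": [0, 0, 0, 0]}

score = custom_f1(y_true, y_pred)
print(score)
```

In this toy case the hate F1 is 2/3 and the community F1 scores are 1 and 0, so the combined score is 0.5 · (2/3 + 1/2) = 7/12.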

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from utils import import_data, validation_train, full_train
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from loguru import logger
from pytictoc import TicToc

RANDOM_STATE = 42
df_train, df_test, stopwords = import_data()

LABELS = [
    "Odio",
    "Mujeres",
    "Comunidad LGBTQ+",
    "Comunidades Migrantes",
    "Pueblos Originarios",
]


2023-09-15 13:30:55.020 | INFO     | utils.utilities:import_data:12 - Training and Test data succesfully loaded...
2023-09-15 13:30:55.020 | INFO     | utils.utilities:import_data:13 - Train Shape: (2256, 9), Test Shape: (2291, 9)


### Model Definition

In [3]:
et = ExtraTreesClassifier(n_estimators=500, n_jobs=-1, random_state=RANDOM_STATE)
cb = CatBoostClassifier(
    n_estimators=500, thread_count=-1, random_state=RANDOM_STATE, verbose=False
)
xgb = XGBClassifier(n_estimators=500, n_jobs=-1, random_state=RANDOM_STATE)
lr = LogisticRegression(random_state=RANDOM_STATE)
mlp = MLPClassifier(
    hidden_layer_sizes=(64, 32),
    activation="relu",
    solver="adam",
    random_state=RANDOM_STATE,
    alpha=0.1,
)

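# Base learners, passed to the stack as (name, estimator) pairs.
estimators = [("et", et), ("cb", cb), ("xgb", xgb), ("lr", lr), ("mlp", mlp)]

# Meta-learner: a logistic regression fit on the base learners' 3-fold cross-validated predictions.
hate_stack = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(random_state=RANDOM_STATE),
    cv=3,
)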


### Validation Schema

The model is trained and evaluated with a 5-fold cross-validation scheme.
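
As a rough illustration of what a 5-fold scheme does, the sketch below splits the 2,256 training rows into five validation folds so that each row is validated exactly once. `k_fold_indices` is a hypothetical helper; how `validation_train` actually shuffles or stratifies the data is not shown in this notebook.

```python
import random


def k_fold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs; every sample lands in exactly one validation fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    fold_sizes = [base + (1 if i < extra else 0) for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(2256))
for fold, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {fold}: train={len(train_idx)}, validation={len(val_idx)}")
```

In each iteration the model would be fit on `train_idx` and scored on `val_idx`, and the five validation scores would then be averaged.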

In [4]:
t = TicToc()
t.tic()
dict_results = validation_train(
    df_train, hate_stack, LABELS, stopwords, random_state=RANDOM_STATE, verbose=True
)
print("Stacking Results:")
print(f"Mean Validation Score: {dict_results['mean_val_score']}")
print(f"SD Validation Score: {dict_results['sd_val_score']}")
print(f"Mean Precision Validation Score: {dict_results['mean_precision_val_score']}")
print(f"SD Precision Validation Score: {dict_results['sd_precision_val_score']}")
print(f"Mean Recall Validation Score: {dict_results['mean_recall_val_score']}")
print(f"SD Recall Validation Score: {dict_results['sd_recall_val_score']}")
t.toc("HateStack CV Training Time: ")


Train Score fold 1: 0.9992974418134202
Validation Score fold 1: 0.7690925562468738
--------------------------------------------
Train Score fold 2: 0.9986517262769286
Validation Score fold 2: 0.8049551199392759
--------------------------------------------
Train Score fold 3: 0.9994154555206065
Validation Score fold 3: 0.806837224787222
--------------------------------------------
Train Score fold 4: 0.9997538158542589
Validation Score fold 4: 0.796186917671002
--------------------------------------------
Train Score fold 5: 1.0
Validation Score fold 5: 0.8177478834597189
--------------------------------------------
Stacking Results:
Mean Validation Score: 0.7989639404208184
SD Validation Score: 0.016436123646685594
Mean Precision Validation Score: 0.8790885981248543
SD Precision Validation Score: 0.006223935806222798
Mean Recall Validation Score: 0.7387118240434579
SD Recall Validation Score: 0.02121214078333721
HateStack CV Training Time:  553.720891 seconds.
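
The reported mean and SD can be checked against the five per-fold validation scores printed above. The figures are consistent with a population standard deviation (dividing by n, as `numpy.std` does by default); whether `validation_train` computes it exactly this way is an assumption, since its source is not shown here.

```python
from statistics import fmean, pstdev

# Per-fold validation scores copied from the output above.
fold_scores = [
    0.7690925562468738,
    0.8049551199392759,
    0.806837224787222,
    0.796186917671002,
    0.8177478834597189,
]

mean_score = fmean(fold_scores)
sd_score = pstdev(fold_scores)  # population SD (divides by n, not n - 1)

print(f"Mean Validation Score: {mean_score}")
print(f"SD Validation Score: {sd_score}")
```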


### Full Train and Predictions

The model is retrained on the whole training set and then used to predict on the test set.



In [5]:
t = TicToc()
t.tic()
dict_results = full_train(df_train, df_test, hate_stack, LABELS, stopwords)
print("Stacking Results:")
print(f"Test Score: {dict_results['test_score']}")
print(f"Test Precision: {dict_results['test_precision']}")
print(f"Test Recall: {dict_results['test_recall']}")
t.toc("HateStack Full Training Time: ")


Stacking Results:
Test Score: 0.8175373271622297
Test Precision: 0.7838088936814132
Test Recall: 0.8610952188880378
HateStack Full Training Time:  129.480336 seconds.
