# Toxicity Target Type Classification

In this notebook, we will train a baseline model to predict the target type of a targeted toxic comment: an individual (`IND`), a group (`GRP`), or other (`OTH`).

We will use [simpletransformers](https://simpletransformers.ai/), a wrapper around many popular models available on [Hugging Face](https://huggingface.co/).

The base model is [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased), a BERT model pre-trained on Portuguese text.

## Imports

In [None]:
import sys
from pathlib import Path

# Add the directory two levels up to sys.path (checking the same path that is appended)
root_dir = str(Path(".").absolute().parent.parent)
if root_dir not in sys.path:
    sys.path.append(root_dir)

In [None]:
from dotenv import load_dotenv

# Initialize the env vars
load_dotenv("../../.env")

In [None]:
import os
import shutil
import logging
import numpy as np
import pandas as pd
import mlflow
import matplotlib.pyplot as plt
from typing import List, Dict, Any
from kaggle.api.kaggle_api_extended import KaggleApi
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight
from simpletransformers.classification import (
    ClassificationModel,
    ClassificationArgs
)

%matplotlib inline

logging.basicConfig(level=logging.INFO)

_logger = logging.getLogger("transformers")
_logger.setLevel(logging.WARNING)

mlflow.set_experiment("toxicity-target-type-detection")

mlflow.start_run(tags={"project": "olid-br"})

params = {
    "seed": 1993,
    "model_type": "bert",
    "model_name": "neuralmind/bert-base-portuguese-cased",
    "num_train_epochs": 6,
    "test_size": 0.3,
    "use_cuda": False
}

## Functions

In this section, we will define some helper functions.

In [None]:
def prep_data(X: List[str], Y: List[str], classes: Dict[int, str]) -> List[List[Any]]:
    """
    Pair each text with the integer id of its class.

    Args:
    - X: list of strings (texts)
    - Y: list of class names (e.g. "IND"), aligned with X
    - classes: dictionary of classes ({class_id: class_name, ...})

    Returns:
        List of [text, class_id] pairs.
    """

    def get_key_by_value(dictionary, value):
        "Get key by value in dictionary"
        return next(key for key, val in dictionary.items() if val == value)

    data = []
    for x, y in zip(X, Y):
        y = get_key_by_value(classes, y)
        data.append([x, y])
    return data
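
As a quick illustration of what `prep_data` produces, the sketch below performs the same class-name-to-id conversion on a few made-up comments. It inverts the `classes` dictionary once instead of scanning it for every row. The name `name_to_id` and the sample texts are illustrative assumptions, not part of the dataset.

```python
# Same mapping as the notebook's `classes` dictionary.
classes = {0: "IND", 1: "GRP", 2: "OTH"}

# Invert the mapping once: {class_name: class_id}.
name_to_id = {name: idx for idx, name in classes.items()}

# Made-up examples, for illustration only.
texts = ["comment a", "comment b", "comment c"]
labels = ["GRP", "IND", "OTH"]

# Each row is [text, class_id], the same shape that prep_data returns.
rows = [[text, name_to_id[label]] for text, label in zip(texts, labels)]
print(rows)
```

An unknown class name raises a `KeyError` here, which makes label problems visible early.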

## Load the data

In this section, we will download the data and load it into a pandas dataframe.

In [None]:
# Authenticate up front: the client is also needed later to log the dataset version
kaggle = KaggleApi()
kaggle.authenticate()

# Download the data
if not os.path.exists("olidbr.csv"):
    print("Downloading data from Kaggle")
    kaggle.dataset_download_files(dataset="olidbr", unzip=True)

# Load data
df = pd.read_csv("olidbr.csv")

# Delete the downloaded files (skip any that are missing)
for file in ["olidbr.csv", "metadata.csv"]:
    if os.path.exists(file):
        os.remove(file)

# Log dataset version
olidbr = kaggle.dataset_view(dataset="olidbr")
mlflow.log_param("dataset_version", f"v{olidbr.currentVersionNumber}")

print(f"Shape: {df.shape}")
df.head()

We keep only the offensive comments that are targeted (`is_offensive == "OFF"` and `is_targeted == "TIN"`).

In [None]:
df = df[(df["is_offensive"] == "OFF") & (df["is_targeted"] == "TIN")]
df.reset_index(drop=True, inplace=True)

print(f"Shape: {df.shape}")

## Exploratory analysis

In this section, we look at how the comments are distributed across the target types.

In [None]:
df_eda = df[["text", "targeted_type"]].groupby("targeted_type").count()
df_eda.reset_index(inplace=True)
df_eda

In [None]:
ax = df_eda.plot(x="targeted_type", y="text", kind="bar",
                 legend=False, figsize=(10, 6),
                 xlabel="targeted_type", ylabel="count", fontsize=14,
                 rot=1, title="targeted_type distribution")

for container in ax.containers:
    ax.bar_label(container, fontsize=14)

mlflow.log_figure(
    figure=ax.get_figure(),
    artifact_file="targeted_type_distribution.png")

In [None]:
classes = {
    0: "IND",
    1: "GRP",
    2: "OTH"
}

## Prepare the data

In this section, we will prepare the data in order to train the model.

The `simpletransformers` library expects the data in a specific format.

More information about the format can be found in the [Classification Data Formats - Simple Transformers](https://simpletransformers.ai/docs/classification-data-formats/#binary-classification) documentation.
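
To make the expected shape concrete, here is a minimal sketch of a sanity check for the rows before they are turned into a dataframe. It assumes that each row pairs a text with an integer label in the range `0` to `num_labels - 1`. The helper `check_rows` is hypothetical and not part of `simpletransformers`.

```python
from typing import List, Tuple


def check_rows(rows: List[Tuple[str, int]], num_labels: int) -> None:
    """Raise ValueError unless every row is (str, int in [0, num_labels))."""
    for i, (text, label) in enumerate(rows):
        if not isinstance(text, str):
            raise ValueError(f"row {i}: text must be a string")
        if not isinstance(label, int) or not 0 <= label < num_labels:
            raise ValueError(f"row {i}: label {label!r} is out of range")


# Made-up rows, for illustration only.
rows = [("comment a", 0), ("comment b", 2)]
check_rows(rows, num_labels=3)  # passes silently
```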

In [None]:
df = df[["text", "targeted_type"]]

X = df["text"].values
y = df["targeted_type"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=params["test_size"],
                                                    random_state=params["seed"],
                                                    stratify=y)

train_data = prep_data(X_train, y_train, classes)
test_data = prep_data(X_test, y_test, classes)

df_train = pd.DataFrame(train_data)
df_train.columns = ["text", "labels"]

df_test = pd.DataFrame(test_data)
df_test.columns = ["text", "labels"]

mlflow.log_param("train_size", len(df_train))
mlflow.log_param("test_size", len(df_test))

print(f"train_data: {df_train.shape}")
print(f"test_data: {df_test.shape}")


## Training the model

In this section, we will train a baseline model to predict the target type (`IND`, `GRP` or `OTH`) of a targeted toxic comment.

We will not perform hyperparameter tuning because it is a simple baseline model.

In [None]:
# Temporary folders
temp_folders = ["cache_dir", "outputs", "runs"]

for folder in temp_folders:
    if os.path.exists(folder):
        shutil.rmtree(folder, ignore_errors=True)
        
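# Compute class weights, ordered by class id so they line up with the integer labels
# (np.unique would sort the class names alphabetically instead)
class_names = np.array([classes[i] for i in sorted(classes)])
params["class_weights"] = compute_class_weight(
    class_weight="balanced",
    classes=class_names,
    y=y_train)

# Optional model configuration
model_args = ClassificationArgs(
    num_train_epochs=params["num_train_epochs"])

# Create a ClassificationModel (num_labels is needed for more than two classes)
model = ClassificationModel(
    model_type=params["model_type"],
    model_name=params["model_name"],
    num_labels=len(classes),
    args=model_args,
    weight=list(params["class_weights"]),
    use_cuda=params["use_cuda"]
)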

# Train the model
model.train_model(df_train)
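
For intuition about the class weights, the `"balanced"` heuristic in scikit-learn gives each class the weight `n_samples / (n_classes * class_count)`, so rare classes weigh more. The sketch below reproduces that formula in plain Python on made-up label counts. The helper `balanced_weights` is hypothetical. The final list is arranged by class id, on the assumption that the weights must follow the order of the integer labels.

```python
from collections import Counter
from typing import Dict, List


def balanced_weights(labels: List[str]) -> Dict[str, float]:
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {name: n_samples / (n_classes * count) for name, count in counts.items()}


# Made-up, imbalanced labels for illustration only.
labels = ["IND"] * 6 + ["GRP"] * 3 + ["OTH"] * 1
weights = balanced_weights(labels)

# Arrange the weights by class id (0: IND, 1: GRP, 2: OTH).
classes = {0: "IND", 1: "GRP", 2: "OTH"}
weight_list = [weights[classes[i]] for i in sorted(classes)]
print(weight_list)
```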

## Evaluating the model

In this section, we will evaluate the model with the following metrics:

- **Accuracy**: the percentage of correct predictions;
- **Precision**: for each target type, the percentage of comments predicted as that type that actually belong to it;
- **Recall**: for each target type, the percentage of comments of that type that are predicted as such;
- **F1-Score**: the harmonic mean of precision and recall;
- **ROC AUC**: the area under the receiver operating characteristic (ROC) curve.
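
The definitions above can be sketched in plain Python for a single class of a multiclass problem. This is only an illustration on made-up labels (the notebook itself relies on `classification_report`), and `per_class_scores` is a hypothetical helper.

```python
from typing import Dict, List


def per_class_scores(y_true: List[int], y_pred: List[int], label: int) -> Dict[str, float]:
    """Precision, recall and F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels, for illustration only (0: IND, 1: GRP, 2: OTH).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
grp_scores = per_class_scores(y_true, y_pred, label=1)
print(accuracy, grp_scores)
```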

In [None]:
result, model_outputs, wrong_predictions = model.eval_model(df_test)

y_true = df_test["labels"].tolist()

y_pred, raw_outputs = model.predict(df_test["text"].tolist())
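
Alongside the predicted labels, `model.predict` returns the raw model outputs for each text. Assuming these are unnormalized per-class scores (logits), a softmax turns them into probabilities that are easier to inspect. The sketch below uses made-up scores and a hand-written `softmax` helper rather than any library function.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for one comment, in class-id order (IND, GRP, OTH).
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)

# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```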

In [None]:
# Logging metrics in MLflow
metrics = classification_report(
    y_true, y_pred, digits=4,
    target_names=list(classes.values()), output_dict=True)

# AUROC may be absent from the evaluation result for non-binary tasks
if "auroc" in result:
    mlflow.log_metric("auroc", result["auroc"])
mlflow.log_metric("accuracy", metrics["accuracy"])
mlflow.log_metric("weighted_f1_score", metrics["weighted avg"]["f1-score"])
mlflow.log_metric("weighted_precision", metrics["weighted avg"]["precision"])
mlflow.log_metric("weighted_recall", metrics["weighted avg"]["recall"])

# Per-class metrics (IND, GRP, OTH)
for name in classes.values():
    mlflow.log_metric(f"{name}_precision", metrics[name]["precision"])
    mlflow.log_metric(f"{name}_recall", metrics[name]["recall"])
    mlflow.log_metric(f"{name}_f1_score", metrics[name]["f1-score"])

## Testing the model

In the last section, we will test the model with some comments from the test set.

In [None]:
df_pred = df_test.head(20).copy()

# Predict on the same rows that are displayed, so the lengths match
predictions, raw_outputs = model.predict(df_pred["text"].tolist())

df_pred = df_pred.assign(predictions=predictions)

df_pred["labels"] = df_pred["labels"].map(classes)
df_pred["predictions"] = df_pred["predictions"].map(classes)

df_pred

In [None]:
mlflow.end_run()