<a href="https://colab.research.google.com/github/iolef/Sarcasm-identification-in-implicit-misogyny/blob/main/1_2_Implicit_hate_detection_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Implicit hate detection model - binary classification**



In this notebook, [DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert) is fine-tuned into a binary implicit hate detection model on the [Implicit Hate Corpus](https://github.com/SALT-NLP/implicit-hate) by ElSherief et al. (2021). The classifier distinguishes explicit hate from implicit hate.

# **1. Setup**

1.1 Installing Transformers

In [None]:
# Transformers installation
! pip install transformers[torch] datasets evaluate
# To install from source instead of the latest release, comment out the command above and uncomment the one below.
# ! pip install git+https://github.com/huggingface/transformers.git

Collecting transformers[torch]
  Downloading transformers-4.32.1-py3-none-any.whl (7.5 MB)
Collecting datasets
  Downloading datasets-2.14.4-py3-none-any.whl (519 kB)
Collecting evaluate
  Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)
Collecting huggingface-hub<1.0,>=0.15.1 (from transformers[torch])
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers[torch])
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_

1.2 Imports

In [None]:
from sklearn.model_selection import train_test_split
from datasets import Dataset
import pandas as pd
import zipfile
import os
import re

# **2. Corpus upload**

Uploading and unzipping the corpus folder and creating a dataframe with the selected file.

In [None]:
# Unzipping the corpus folder
with zipfile.ZipFile("implicit-hate-corpus-nov-2021.zip", "r") as zip_ref:
    zip_ref.extractall()

# Selecting the desired file from the corpus folder
dataset_path = os.path.join(os.getcwd(), "implicit-hate-corpus", "implicit_hate_v1_stg1_posts.tsv")

# Creating the dataframe
df = pd.read_csv(dataset_path, delimiter="\t")

# **3. Preprocessing**

Creating a function which includes all the text preprocessing operations.

In [None]:
def clean_text(post):
    # Lowercasing
    post = post.lower()
    # Hashtag removal
    post = re.sub(r'((?<=[\s\W])|^)[#](\w+|[^#]|$)', ' ', post)
    # Leading "rt" (retweet marker) removal
    post = re.sub(r'^\s*rt\s+', '', post)
    # Special character removal (each one is replaced with a space)
    post = re.sub(r'[^\w]', ' ', post)
    # Stripping leading and trailing whitespace
    post = post.strip()
    # Collapsing repeated whitespace between words
    post = ' '.join(post.split())
    return post

# Applying the clean_text function to each post
df["post"] = df["post"].apply(clean_text)
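As a quick sanity check, the cleaning steps can be run on a made-up post. The sketch below restates `clean_text` so that it runs on its own; the sample string is hypothetical and not taken from the corpus.

```python
import re


def clean_text(post):
    """Same steps as the notebook's clean_text, restated so this runs standalone."""
    post = post.lower()
    # Drop hashtags together with the word attached to them
    post = re.sub(r'((?<=[\s\W])|^)[#](\w+|[^#]|$)', ' ', post)
    # Drop a leading retweet marker
    post = re.sub(r'^\s*rt\s+', '', post)
    # Replace every non-word character with a space
    post = re.sub(r'[^\w]', ' ', post)
    # Collapse repeated whitespace
    return ' '.join(post.split())


# Hypothetical example post
sample = "RT @user: Check THIS out!!  #hashtag"
cleaned = clean_text(sample)
print(cleaned)  # user check this out
```

Note that mentions are not removed as a whole: only the `@` character is stripped, so the username remains in the text as an ordinary token.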

# **4. Preparing the data for the training**

4.1 Converting the dataframe into a format compatible with Hugging Face

In [None]:
# Renaming the columns
df.rename(columns={"post": "text", "class": "label"}, inplace=True)

# Removing the "not hate" category (.copy() avoids pandas' SettingWithCopyWarning below)
df = df.loc[df["label"] != "not_hate"].copy()

# Converting the labels into numbers
df["label"] = df["label"].map({"explicit_hate": 0, "implicit_hate": 1})
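The filtering and label conversion can be illustrated on a toy dataframe. The rows below are invented; only the column names and class names come from the notebook. The final `value_counts` call is an optional extra that shows the class balance, which is worth knowing before interpreting accuracy later on.

```python
import pandas as pd

# Toy stand-in for the corpus (invented rows, real column and class names)
toy = pd.DataFrame({
    "text": ["a", "b", "c", "d", "e"],
    "label": ["implicit_hate", "not_hate", "explicit_hate", "implicit_hate", "not_hate"],
})

# Keep only the two hate classes; work on a copy so the assignment below is safe
toy = toy.loc[toy["label"] != "not_hate"].copy()

# Convert class names to integer ids
toy["label"] = toy["label"].map({"explicit_hate": 0, "implicit_hate": 1})

# Share of each class among the remaining rows
balance = toy["label"].value_counts(normalize=True)
print(balance)
```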

4.2 Splitting the data

In [None]:
# Setting a seed so that the experiment can be replicated
seed = 100
# Retaining 80% of the data for the training set
train_set, test_val_set = train_test_split(df, test_size=0.2, random_state=seed)

# Dividing the remaining 20% equally between the test and validation sets
test_set, val_set = train_test_split(test_val_set, test_size=0.5, random_state=seed)

# Transforming the dataframes into datasets
train_ds = Dataset.from_pandas(train_set, split="train")
val_ds = Dataset.from_pandas(val_set, split="val")
test_ds = Dataset.from_pandas(test_set, split="test")
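The splits above are drawn at random without stratification. If the classes are imbalanced, passing `stratify` to `train_test_split` keeps the class ratio the same in every split. The following is a minimal sketch on invented labels, offered as an option rather than as what this notebook does.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

seed = 100

# Invented data: 80 examples of class 1 and 20 of class 0
texts = [f"post {i}" for i in range(100)]
labels = [1] * 80 + [0] * 20

# 80/10/10 split, stratified on the label at both steps
train_x, rest_x, train_y, rest_y = train_test_split(
    texts, labels, test_size=0.2, random_state=seed, stratify=labels
)
test_x, val_x, test_y, val_y = train_test_split(
    rest_x, rest_y, test_size=0.5, random_state=seed, stratify=rest_y
)

# Each split keeps the original 80/20 class ratio
print(Counter(train_y), Counter(val_y), Counter(test_y))
```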

4.3 Creating the tokenizer

In [None]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True)

tokenized_train_set = train_ds.map(preprocess_function, batched=True)
tokenized_val_set = val_ds.map(preprocess_function, batched=True)
tokenized_test_set = test_ds.map(preprocess_function, batched=True)

Map:   0%|          | 0/6551 [00:00<?, ? examples/s]

Map:   0%|          | 0/819 [00:00<?, ? examples/s]

Map:   0%|          | 0/819 [00:00<?, ? examples/s]
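`preprocess_function` truncates long posts but does not pad them, so the tokenized examples have different lengths. Padding is left to the `DataCollatorWithPadding` used in section 5, which pads each batch to the length of its longest sequence. The plain-Python sketch below illustrates that idea only; it is not the library's implementation, and `pad_batch` and the token ids are made up for the example.

```python
def pad_batch(batch, pad_id=0):
    """Pad every sequence in a batch to the length of the longest one.

    Hypothetical helper for illustration; returns padded ids and attention masks.
    """
    longest = max(len(ids) for ids in batch)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in batch]
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token ids of different lengths
batch = [[101, 2023, 102], [101, 2023, 2003, 1037, 102]]
padded = pad_batch(batch)
print(padded["input_ids"])
print(padded["attention_mask"])
```

Padding per batch rather than to one global maximum length keeps batches of short posts small.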

In [None]:
# Printing the first 100 examples
train_ds[0:100]

{'text': ['people are getting more tribal and racially aware every day mainly due to leftists going all out anti white and pro white genocide thus',
  'yes white people are the most pathetic people in the entire',
  'the past glories of america that you feed on were built by a 90 white p',
  'ethnic norwegians disappearing as non white colonization intensifies via',
  'anti white much',
  'real quotes from anti white globalist jew',
  'non whites already have groups to help them the federal government for instance',
  'nonsense slaves unlike poor whites had access to healthcare and a better diet than even modern blacks',
  'why would any white person want to be around a fat black guy the smell alone would be overbearing',
  'when you drab the swamp in sacramento',
  'i agree sjw deserve all the derision they get and more also like to poke the alt right crowd',
  'ws irony ur really an idiot porkistop burning hindus christian alive in pakistan',
  'what would you call an arab with a woo

# **5. Training the model**

Fine-tuning the pre-trained DistilBERT model on the training set.

In [None]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

import evaluate

accuracy = evaluate.load("accuracy")

import numpy as np


def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy.compute(predictions=predictions, references=labels)

id2label = {0: "explicit_hate", 1: "implicit_hate"}
label2id = {"explicit_hate": 0, "implicit_hate": 1}

from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
)

training_args = TrainingArguments(
    output_dir="my_hate_model",
    learning_rate=3e-7,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    num_train_epochs=15,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train_set,
    eval_dataset=tokenized_val_set,  # validation split; the test split stays held out
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
You're using a DistilBertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


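| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | No log | 0.651527 | 0.868132 |
| 2 | No log | 0.627632 | 0.864469 |
| 3 | No log | 0.606034 | 0.864469 |
| 4 | No log | 0.583938 | 0.864469 |
| 5 | No log | 0.560441 | 0.864469 |
| 6 | No log | 0.536581 | 0.864469 |
| 7 | No log | 0.514411 | 0.864469 |
| 8 | No log | 0.495202 | 0.864469 |
| 9 | No log | 0.479239 | 0.864469 |
| 10 | 0.567200 | 0.46665 | 0.864469 |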
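
In the log above, accuracy stays at 0.864469 from epoch 2 through epoch 10 while the validation loss keeps falling. One possible explanation, which the log alone cannot confirm, is that the model predicts the same class for every example. A quick way to check is to compare the model's accuracy with a majority-class baseline. The sketch below uses invented labels, and `majority_baseline_accuracy` is a hypothetical helper.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Invented evaluation labels: 1 = implicit_hate, 0 = explicit_hate
eval_labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
baseline = majority_baseline_accuracy(eval_labels)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

If the fine-tuned model's accuracy equals this baseline on the real evaluation labels, per-class recall or F1 would give a clearer picture than accuracy.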


TrainOutput(global_step=780, training_loss=0.5251112620035807, metrics={'train_runtime': 587.9932, 'train_samples_per_second': 167.119, 'train_steps_per_second': 1.327, 'total_flos': 1734794346425916.0, 'train_loss': 0.5251112620035807, 'epoch': 15.0})
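
The `global_step=780` in the output can be reconstructed from the training configuration, assuming a single device and no gradient accumulation (the notebook does not set either). Each epoch takes one optimizer step per batch, and the last batch may be partial.

```python
import math

train_examples = 6551  # size of the training split, as reported by the Map progress line
batch_size = 128       # per_device_train_batch_size
epochs = 15            # num_train_epochs

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 52 780
```

This matches the `checkpoint-780` directory used in section 6. It is also consistent with the training loss first appearing at epoch 10, assuming the default logging interval of 500 steps: step 500 falls within epoch 10, which covers steps 469 to 520.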

# **6. Testing the model**

Trying the model on a sample sentence.

In [None]:
from transformers import pipeline

text = "I think all women are liars."

classifier = pipeline(task="sentiment-analysis",
                      model="my_hate_model/checkpoint-780",
                      tokenizer="my_hate_model/checkpoint-780")

classifier(text)

[{'label': 'implicit_hate', 'score': 0.706383466720581}]
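
The `score` returned by the pipeline is the probability assigned to the predicted label. For a single-label classifier with two classes, it is obtained by applying a softmax to the model's two output logits. Below is a minimal sketch with made-up logits; the values are hypothetical and chosen only to illustrate the computation.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


id2label = {0: "explicit_hate", 1: "implicit_hate"}

# Hypothetical logits for one input, ordered as [explicit_hate, implicit_hate]
logits = [-0.4, 0.5]
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print({"label": id2label[best], "score": probs[best]})
```

A score of about 0.7 therefore means the model assigns roughly 30% probability to the other class.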