# Hate Speech Detection - Part II: Training a Model

In this notebook we use the general hate speech dataset we constructed in the previous notebook ([here](https://www.kaggle.com/jessedingley/hatespeech-detection-data)) to train a model that predicts whether or not a tweet conveys hate speech. For this we fine-tune a Twitter-pretrained RoBERTa model (`cardiffnlp/twitter-roberta-base`) for sequence classification.


# 0. Setup

### Imports

In [1]:
# for gpu use, tensors etc...
import torch

# import tokenizer, model for sequence classification, trainer and training arguments
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

# we need to wrap the data in Dataset objects to pass it to the Trainer
from torch.utils.data import Dataset, DataLoader

# for opening csv
import csv

# for computing metrics
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# for saving model
import os

### Model and Tokenizer setup
We're using `cardiffnlp/twitter-roberta-base`, a RoBERTa-base model pretrained on tweets.

In [2]:
MODEL_NAME = "cardiffnlp/twitter-roberta-base" # RoBERTa-base pretrained on tweets

# define model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, padding_side = "right")

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.decoder.bias', 'roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at cardiffnlp/twitter-roberta-base and

### GPU

In [3]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

### Enable wandb logging


In [4]:
# if true then use wandb logging
USE_WANDB = False

In [5]:
logging_place = "none"
if USE_WANDB:
    import wandb
    wandb.login()
    %env WANDB_PROJECT=robertatweet
    logging_place = "wandb"
    
print(logging_place)

none


### Hyperparameters

In [6]:
N_EPOCHS = 30

# 1. Data

## 1.1. Retrieve Train and Test Data

### Choose data to work on

In [7]:
# if True, use the tweets normalized for BERTweet
BERTWEET = True

In [8]:
# get correct input_data
if BERTWEET:
    input_data = "hatespeech-detection-data-bertweet"
else:
    input_data = "hatespeech-detection-data"

### Open Data

In [9]:
# open train data
with open("../input/"+input_data+"/hatespeech_train.csv", "r", encoding="utf8") as f:
    train_data = [dict(row) for row in csv.DictReader(f, skipinitialspace=True)]

# open test data
with open("../input/"+input_data+"/hatespeech_test.csv", "r", encoding="utf8") as f:
    test_data = [dict(row) for row in csv.DictReader(f, skipinitialspace=True)]

## 1.2. Separate Tweets from Labels

In [10]:
train_data_tweets = [row["tweet"] for row in train_data] 
train_data_labels = [int(row["label"]) for row in train_data]

test_data_tweets = [row["tweet"] for row in test_data]
test_data_labels = [int(row["label"]) for row in test_data]
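
To make the loading and separating steps concrete, here is a minimal sketch that runs the same `csv.DictReader` logic on an in-memory string instead of the Kaggle files. The column names `tweet` and `label` come from the cells above; the two sample rows are invented placeholders, not rows from the dataset.

```python
import csv
import io

# Invented two-row sample; note the space after each comma.
sample_csv = "tweet, label\nhave a nice day, 0\nsome hateful text, 1\n"

# skipinitialspace=True drops the space after the delimiter,
# so the header is read as "label" rather than " label".
rows = [dict(row) for row in csv.DictReader(io.StringIO(sample_csv), skipinitialspace=True)]

# DictReader returns every value as a string, hence the int() conversion.
tweets = [row["tweet"] for row in rows]
labels = [int(row["label"]) for row in rows]

print(rows[0])
print(tweets, labels)
```

Every value comes back as a string, which is why the labels are converted with `int(...)` before being handed to the model.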

## 1.3. Split Training Data into Train and Validation splits

In [11]:
from sklearn.model_selection import train_test_split

# calculate validation split size as a fraction of the training data
# (the validation set needs to represent 15% of all data, train + test)
val_split_size = (0.15*(len(train_data)+len(test_data)))/len(train_data)

# split
train_tweets, val_tweets, train_labels, val_labels = train_test_split(train_data_tweets, train_data_labels, test_size=val_split_size)
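
The split size above is a fraction of the training data only, chosen so that the validation set ends up being 15% of all data. A small worked example of that arithmetic, using hypothetical set sizes (8,500 training rows and 1,500 test rows are assumptions for illustration, not the real dataset sizes):

```python
def validation_fraction(n_train, n_test, share_of_all=0.15):
    """Fraction of the training rows to hold out so that the
    validation set is `share_of_all` of train + test combined."""
    return share_of_all * (n_train + n_test) / n_train

# Hypothetical sizes, chosen only to make the numbers easy to follow.
n_train, n_test = 8500, 1500

fraction = validation_fraction(n_train, n_test)
n_val = round(fraction * n_train)

print(f"test_size passed to train_test_split: {fraction:.4f}")
print(f"validation rows: {n_val} of {n_train + n_test} total")
```

With these numbers the fraction is about 0.1765 of the training rows, which is 1,500 rows, i.e. 15% of the 10,000 rows overall.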

## 1.4. Tokenize Data
More specifically, we tokenize the tweets.

In [12]:
tokenized_train_tweets = tokenizer(train_tweets, truncation=True, padding=True)
tokenized_val_tweets = tokenizer(val_tweets, truncation=True, padding=True)
tokenized_test_tweets = tokenizer(test_data_tweets, truncation=True, padding=True)

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
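
The warning above suggests that `truncation=True` had no effect here because no maximum length was known, so the only thing shaping the batches is `padding=True` (pad to the longest tweet). Passing an explicit `max_length` to the tokenizer call would likely be the way to enforce a cap. The toy function below is not the tokenizer's implementation, just a sketch of what truncating and padding do to the `input_ids` and `attention_mask`. It assumes RoBERTa's special ids (`0` = start, `2` = end, `1` = padding); all other ids are made up.

```python
PAD_ID = 1  # RoBERTa's padding index (see padding_idx=1 in the model printout)

def pad_and_truncate(batch_ids, max_length, pad_id=PAD_ID):
    """Toy version of truncation + padding on lists of token ids."""
    clipped = []
    for ids in batch_ids:
        if len(ids) > max_length:
            # keep the final end-of-sequence token when cutting
            ids = ids[:max_length - 1] + ids[-1:]
        clipped.append(list(ids))
    # pad every sequence to the longest one left in the batch
    width = max(len(ids) for ids in clipped)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in clipped]
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in clipped]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Two made-up tokenized tweets: one too long, one short.
toy_batch = [[0, 11, 12, 13, 14, 2], [0, 21, 2]]
encoded = pad_and_truncate(toy_batch, max_length=4)

print(encoded["input_ids"])
print(encoded["attention_mask"])
```

The attention mask marks padded positions with `0` so that the model ignores them.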


## 1.5. Construct `Dataset` class
The `Trainer` expects its data as `Dataset` objects, so we wrap the tokenized tweets and their labels in one.

In [13]:
class HateSpeechDataset(Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

In [14]:
train_dataset = HateSpeechDataset(tokenized_train_tweets, train_labels)
val_dataset = HateSpeechDataset(tokenized_val_tweets, val_labels)
test_dataset = HateSpeechDataset(tokenized_test_tweets, test_data_labels)
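
To see what such a dataset hands to the training loop, here is a small self-contained sketch. It redefines an equivalent class (named `ToyEncodedDataset` here so it does not clash with the one above) and feeds it hand-written encodings instead of real tokenizer output. The token ids and labels are invented.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyEncodedDataset(Dataset):
    """Same structure as HateSpeechDataset, repeated so the sketch runs on its own."""
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        # one example: a dict of tensors plus its label
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

# Hand-written stand-in for tokenizer output (already padded to length 4).
toy_encodings = {
    "input_ids": [[0, 11, 12, 2], [0, 13, 2, 1], [0, 14, 15, 2]],
    "attention_mask": [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 1, 1]],
}
toy_labels = [1, 0, 0]

toy_dataset = ToyEncodedDataset(toy_encodings, toy_labels)
first_item = toy_dataset[0]

# The default collate function stacks the per-example dicts into batch tensors.
batch = next(iter(DataLoader(toy_dataset, batch_size=2, shuffle=False)))
print({key: tuple(value.shape) for key, value in batch.items()})
```

Each batch is a dict whose keys match the model's keyword arguments (`input_ids`, `attention_mask`, `labels`), which is what lets the `Trainer` pass it straight to the model.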

# 2. Training

## 2.1. Set various Parameters

### 2.1.1. Model Hyperparameters

In [15]:
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=N_EPOCHS,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    load_best_model_at_end=True,   # only meaningful if evaluation runs during training
    learning_rate=0.001,           # high for fine-tuning, but only the classifier head is trained (see 2.2.3)
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=100,
    report_to=logging_place
)
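
To get a feel for what `learning_rate=0.001` and `warmup_steps=100` mean together, here is a sketch of the learning rate over time. It assumes the `Trainer`'s default schedule (linear warmup followed by linear decay to zero) and takes the total of 9,300 steps from the `global_step` reported by the training run further down. The function name is ours, not part of the library.

```python
def linear_warmup_decay_lr(step, peak_lr=0.001, warmup_steps=100, total_steps=9300):
    """Learning rate at `step` under linear warmup then linear decay (assumed default)."""
    if step < warmup_steps:
        # ramp up from 0 to peak_lr over the warmup steps
        return peak_lr * step / warmup_steps
    # decay linearly from peak_lr at the end of warmup to 0 at the last step
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)

# 9300 optimizer steps over 30 epochs
steps_per_epoch = 9300 // 30

for step in (0, 50, 100, 4700, 9300):
    print(step, round(linear_warmup_decay_lr(step), 6))
print("steps per epoch:", steps_per_epoch)
```

At 310 steps per epoch and a batch size of 32, the training split would hold somewhere between 9,889 and 9,920 tweets, assuming a single device and no gradient accumulation.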

### 2.1.2. Evaluation Metrics

In [16]:
# A function computing metrics based on model output
def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    acc = accuracy_score(labels, preds)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
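
For intuition about what these four numbers measure, the sketch below recomputes them by hand on a tiny invented example, without scikit-learn. The logits and labels are made up, and `binary_metrics` is a hypothetical helper that mirrors `average='binary'` with label `1` as the positive class.

```python
def binary_metrics(labels, preds):
    """Accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}

# Invented model outputs: one [logit_class_0, logit_class_1] pair per tweet.
toy_logits = [[0.2, 1.5], [2.0, -1.0], [0.9, 0.1], [-0.3, 0.8],
              [1.2, 0.4], [0.1, 0.6], [-1.0, 2.2], [0.7, 0.2]]
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]

# argmax over the last axis: pick the class with the highest logit
toy_preds = [max(range(len(row)), key=row.__getitem__) for row in toy_logits]

toy_metrics = binary_metrics(toy_labels, toy_preds)
print(toy_preds)
print(toy_metrics)
```

Here three of the four positive tweets are found and one of the four predicted positives is wrong, so precision, recall, F1 and accuracy all come out at 0.75.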

## 2.2. Train the model

### 2.2.1. Set model to training mode

In [17]:
model.train()

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0): RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerN

### 2.2.2. Define `Trainer` (training setup)

In [18]:
trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    compute_metrics=compute_metrics,
    train_dataset=train_dataset,         # training dataset
    eval_dataset=val_dataset             # evaluation dataset
)

### 2.2.3. Freeze RoBERTa layers
(We only want to train the classifier head, so every parameter outside it is frozen.)

In [19]:
for name, param in model.named_parameters():
    if 'classifier' not in name:
        param.requires_grad = False
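
A quick way to check that the freezing did what we intended is to count trainable and frozen parameters. The sketch below applies the same loop to a tiny stand-in model, so it runs without downloading anything. `ToyClassifier` and its layer sizes are invented; only the attribute name `classifier` mirrors the real model.

```python
import torch
from torch import nn

class ToyClassifier(nn.Module):
    """Tiny stand-in: an 'encoder' followed by a 'classifier' head."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 4)      # plays the role of the RoBERTa body
        self.classifier = nn.Linear(4, 2)   # plays the role of the classification head

    def forward(self, x):
        return self.classifier(torch.relu(self.encoder(x)))

toy_model = ToyClassifier()

# Same freezing rule as above: everything outside the classifier is frozen.
for name, param in toy_model.named_parameters():
    if "classifier" not in name:
        param.requires_grad = False

trainable = sum(p.numel() for p in toy_model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in toy_model.parameters() if not p.requires_grad)
trainable_names = [n for n, p in toy_model.named_parameters() if p.requires_grad]

print(f"trainable: {trainable}, frozen: {frozen}")
print(trainable_names)
```

Run against the real model, the same two sums should show only the parameters under the `classifier.` prefix as trainable.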

### 2.2.4. Actually train the model

In [20]:
trainer.train()

| Step | Training Loss |
|------|---------------|
| 100  | 0.4845 |
| 200  | 0.4266 |
| 300  | 0.3759 |
| 400  | 0.3590 |
| 500  | 0.3769 |
| 600  | 0.3555 |
| 700  | 0.3712 |
| 800  | 0.3534 |
| 900  | 0.3508 |
| 1000 | 0.3433 |


TrainOutput(global_step=9300, training_loss=0.3236829769995905, metrics={'train_runtime': 676.4091, 'train_samples_per_second': 13.749, 'total_flos': 1.3131629218368e+16, 'epoch': 30.0, 'init_mem_cpu_alloc_delta': 1517355008, 'init_mem_gpu_alloc_delta': 499887104, 'init_mem_cpu_peaked_delta': 380329984, 'init_mem_gpu_peaked_delta': 0, 'train_mem_cpu_alloc_delta': 144986112, 'train_mem_gpu_alloc_delta': 7138816, 'train_mem_cpu_peaked_delta': 0, 'train_mem_gpu_peaked_delta': 63822336})

# 3. Evaluation

## 3.1. Evaluation on Dev Set

In [21]:
trainer.evaluate()

{'eval_loss': 0.2758982181549072,
 'eval_accuracy': 0.8857008466603951,
 'eval_f1': 0.7643064985451019,
 'eval_precision': 0.8073770491803278,
 'eval_recall': 0.7255985267034991,
 'eval_runtime': 3.4989,
 'eval_samples_per_second': 607.615,
 'epoch': 30.0,
 'eval_mem_cpu_alloc_delta': 0,
 'eval_mem_gpu_alloc_delta': 0,
 'eval_mem_cpu_peaked_delta': 0,
 'eval_mem_gpu_peaked_delta': 116925440}

In [22]:
if USE_WANDB:
    # stop wandb
    wandb.finish()

## 3.2. Evaluation on Test Set

### Set model to evaluation mode

In [23]:
model.eval()

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0): RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerN

### Function to predict output of a tweet

In [24]:
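def get_sent_pred(input_str, device=device):
    # tokenize a single tweet and move the tensors to the model's device
    tok = tokenizer(input_str, return_tensors="pt", truncation=True, padding=True)
    tok = tok.to(device)
    # no gradients are needed for inference
    with torch.no_grad():
        pred = model(**tok)
    # the predicted label is the index of the highest logit
    return pred['logits'].argmax(-1).item()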
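
`get_sent_pred` handles one tweet per call, which is simple but can be slow on a large test set. A possible alternative is to predict in batches. The sketch below shows only the batching logic and is written against a stand-in prediction function so that it runs on its own. Both `predict_in_batches` and `fake_predict_batch` are hypothetical names; a real `predict_batch` would tokenize the list of tweets with padding, run the model under `torch.no_grad()`, and return `logits.argmax(-1).tolist()`.

```python
def predict_in_batches(texts, predict_batch, batch_size=64):
    """Call `predict_batch` on consecutive slices of `texts` and join the results."""
    predictions = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        predictions.extend(predict_batch(chunk))
    return predictions

# Stand-in for the model: "predicts" 1 whenever the text contains "!".
def fake_predict_batch(batch):
    return [int("!" in text) for text in batch]

toy_texts = ["hello!", "good morning", "stop it!", "ok", "no!"]
toy_batched_preds = predict_in_batches(toy_texts, fake_predict_batch, batch_size=2)
print(toy_batched_preds)
```

The order of the predictions matches the order of the input tweets, so the result can be passed to the metric function in the same way as the per-tweet predictions.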

### Function to compute metrics of model output for test data

In [25]:
def compute_metrics_test(y_true, y_pred):
    precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average='binary')
    acc = accuracy_score(y_true, y_pred)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }

### Compute Metrics

In [26]:
compute_metrics_test(test_data_labels, [get_sent_pred(sent) for sent in test_data_tweets])

{'accuracy': 0.8814675446848542,
 'f1': 0.7640449438202247,
 'precision': 0.8143712574850299,
 'recall': 0.7195767195767195}
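
As a quick consistency check on the reported numbers: F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two values. The sketch below does this for the dev set and test set results printed above (`f1_from` is just a helper name we chose).

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the outputs above.
dev_f1 = f1_from(0.8073770491803278, 0.7255985267034991)
test_f1 = f1_from(0.8143712574850299, 0.7195767195767195)

print(f"dev F1:  {dev_f1:.4f}")
print(f"test F1: {test_f1:.4f}")
```

Both values agree with the reported F1 scores (about 0.764 on each set). On both sets recall is lower than precision, meaning the model misses more positive tweets than it wrongly flags.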

# 4. Save model

In [27]:
# Saving best practice: if you use the default file names for the model, you can reload it using from_pretrained()
output_dir = '/kaggle/working/hatespeech_model/'

# Create output directory if needed
os.makedirs(output_dir, exist_ok=True)

print("Saving model to %s" % output_dir)

# Save a trained model, configuration and tokenizer using `save_pretrained()`.
# They can then be reloaded using `from_pretrained()`
model_to_save = model.module if hasattr(model, 'module') else model  # Take care of distributed/parallel training
model_to_save.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

Saving model to /kaggle/working/hatespeech_model/


('/kaggle/working/hatespeech_model/tokenizer_config.json',
 '/kaggle/working/hatespeech_model/special_tokens_map.json',
 '/kaggle/working/hatespeech_model/vocab.json',
 '/kaggle/working/hatespeech_model/merges.txt',
 '/kaggle/working/hatespeech_model/added_tokens.json')