# Toxic Comment Classification Challenge

Identify and classify toxic online comments

Discussing things you care about can be difficult. The threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions. Platforms struggle to effectively facilitate conversations, leading many communities to limit or completely shut down user comments.

The Conversation AI team, a research initiative founded by Jigsaw and Google (both a part of Alphabet) are working on tools to help improve online conversation. One area of focus is the study of negative online behaviors, like toxic comments (i.e. comments that are rude, disrespectful or otherwise likely to make someone leave a discussion). So far they’ve built a range of publicly available models served through the Perspective API, including toxicity. But the current models still make errors, and they don’t allow users to select which types of toxicity they’re interested in finding (e.g. some platforms may be fine with profanity, but not with other types of toxic content).

In this competition, you’re challenged to build a multi-headed model that’s capable of detecting different types of of toxicity like threats, obscenity, insults, and identity-based hate better than Perspective’s current models. You’ll be using a dataset of comments from Wikipedia’s talk page edits. Improvements to the current model will hopefully help online discussion become more productive and respectful.

Disclaimer: the dataset for this competition contains text that may be considered profane, vulgar, or offensive.

Link: https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge

Help: https://www.kaggle.com/code/satyamkryadav/bert-model-96-77

In [1]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, RandomSampler, TensorDataset
from tqdm.notebook import tqdm
from transformers import (AdamW, BertModel, BertTokenizer,
                          get_linear_schedule_with_warmup)

In [2]:
%load_ext nb_black

<IPython.core.display.Javascript object>

In [3]:
train_df = pd.read_csv(
    "../data/jigsaw-toxic-comment-classification-challenge/train.csv"
).set_index("id")
train_df

Unnamed: 0_level_0,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
0000997932d777bf,Explanation\nWhy the edits made under my usern...,0,0,0,0,0,0
000103f0d9cfb60f,D'aww! He matches this background colour I'm s...,0,0,0,0,0,0
000113f07ec002fd,"Hey man, I'm really not trying to edit war. It...",0,0,0,0,0,0
0001b41b1c6bb37e,"""\nMore\nI can't make any real suggestions on ...",0,0,0,0,0,0
0001d958c54c6e35,"You, sir, are my hero. Any chance you remember...",0,0,0,0,0,0
...,...,...,...,...,...,...,...
ffe987279560d7ff,""":::::And for the second time of asking, when ...",0,0,0,0,0,0
ffea4adeee384e90,You should be ashamed of yourself \n\nThat is ...,0,0,0,0,0,0
ffee36eab5c267c9,"Spitzer \n\nUmm, theres no actual article for ...",0,0,0,0,0,0
fff125370e4aaaf3,And it looks like it was actually you who put ...,0,0,0,0,0,0


<IPython.core.display.Javascript object>

In [4]:
test_df = pd.read_csv(
    "../data/jigsaw-toxic-comment-classification-challenge/test.csv"
).set_index("id")
test_df

Unnamed: 0_level_0,comment_text
id,Unnamed: 1_level_1
00001cee341fdb12,Yo bitch Ja Rule is more succesful then you'll...
0000247867823ef7,== From RfC == \n\n The title is fine as it is...
00013b17ad220c46,""" \n\n == Sources == \n\n * Zawe Ashton on Lap..."
00017563c3f7919a,":If you have a look back at the source, the in..."
00017695ad8997eb,I don't anonymously edit articles at all.
...,...
fffcd0960ee309b5,". \n i totally agree, this stuff is nothing bu..."
fffd7a9a6eb32c16,== Throw from out field to home plate. == \n\n...
fffda9e8d6fafa9e,""" \n\n == Okinotorishima categories == \n\n I ..."
fffe8f1340a79fc2,""" \n\n == """"One of the founding nations of the..."


<IPython.core.display.Javascript object>

In [5]:
test_labels_df = pd.read_csv(
    "../data/jigsaw-toxic-comment-classification-challenge/test_labels.csv"
).set_index("id")
test_labels_df

Unnamed: 0_level_0,toxic,severe_toxic,obscene,threat,insult,identity_hate
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
00001cee341fdb12,-1,-1,-1,-1,-1,-1
0000247867823ef7,-1,-1,-1,-1,-1,-1
00013b17ad220c46,-1,-1,-1,-1,-1,-1
00017563c3f7919a,-1,-1,-1,-1,-1,-1
00017695ad8997eb,-1,-1,-1,-1,-1,-1
...,...,...,...,...,...,...
fffcd0960ee309b5,-1,-1,-1,-1,-1,-1
fffd7a9a6eb32c16,-1,-1,-1,-1,-1,-1
fffda9e8d6fafa9e,-1,-1,-1,-1,-1,-1
fffe8f1340a79fc2,-1,-1,-1,-1,-1,-1


<IPython.core.display.Javascript object>

In [6]:
sample_submission_df = pd.read_csv(
    "../data/jigsaw-toxic-comment-classification-challenge/sample_submission.csv"
)
sample_submission_df

Unnamed: 0,id,toxic,severe_toxic,obscene,threat,insult,identity_hate
0,00001cee341fdb12,0.5,0.5,0.5,0.5,0.5,0.5
1,0000247867823ef7,0.5,0.5,0.5,0.5,0.5,0.5
2,00013b17ad220c46,0.5,0.5,0.5,0.5,0.5,0.5
3,00017563c3f7919a,0.5,0.5,0.5,0.5,0.5,0.5
4,00017695ad8997eb,0.5,0.5,0.5,0.5,0.5,0.5
...,...,...,...,...,...,...,...
153159,fffcd0960ee309b5,0.5,0.5,0.5,0.5,0.5,0.5
153160,fffd7a9a6eb32c16,0.5,0.5,0.5,0.5,0.5,0.5
153161,fffda9e8d6fafa9e,0.5,0.5,0.5,0.5,0.5,0.5
153162,fffe8f1340a79fc2,0.5,0.5,0.5,0.5,0.5,0.5


<IPython.core.display.Javascript object>

In [7]:
df = pd.concat([train_df, test_df]).fillna(0)
df

Unnamed: 0_level_0,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
0000997932d777bf,Explanation\nWhy the edits made under my usern...,0.0,0.0,0.0,0.0,0.0,0.0
000103f0d9cfb60f,D'aww! He matches this background colour I'm s...,0.0,0.0,0.0,0.0,0.0,0.0
000113f07ec002fd,"Hey man, I'm really not trying to edit war. It...",0.0,0.0,0.0,0.0,0.0,0.0
0001b41b1c6bb37e,"""\nMore\nI can't make any real suggestions on ...",0.0,0.0,0.0,0.0,0.0,0.0
0001d958c54c6e35,"You, sir, are my hero. Any chance you remember...",0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...
fffcd0960ee309b5,". \n i totally agree, this stuff is nothing bu...",0.0,0.0,0.0,0.0,0.0,0.0
fffd7a9a6eb32c16,== Throw from out field to home plate. == \n\n...,0.0,0.0,0.0,0.0,0.0,0.0
fffda9e8d6fafa9e,""" \n\n == Okinotorishima categories == \n\n I ...",0.0,0.0,0.0,0.0,0.0,0.0
fffe8f1340a79fc2,""" \n\n == """"One of the founding nations of the...",0.0,0.0,0.0,0.0,0.0,0.0


<IPython.core.display.Javascript object>

In [8]:
df["comment_text"] = (
    df["comment_text"]
    .str.replace(r"(@.*?)[\s]", " ", regex=True)
    .str.replace(r"[0-9]+", "", regex=True)
    .str.replace(r"\s([@][\w_-]+)", "", regex=True)
    .str.strip()
    .str.replace(r"&amp;", "&", regex=True)
    .str.replace(r"\s+", " ", regex=True)
    .str.replace("#", " ", regex=False)
)
df["comment_text"]

id
0000997932d777bf    Explanation Why the edits made under my userna...
000103f0d9cfb60f    D'aww! He matches this background colour I'm s...
000113f07ec002fd    Hey man, I'm really not trying to edit war. It...
0001b41b1c6bb37e    " More I can't make any real suggestions on im...
0001d958c54c6e35    You, sir, are my hero. Any chance you remember...
                                          ...                        
fffcd0960ee309b5    . i totally agree, this stuff is nothing but t...
fffd7a9a6eb32c16    == Throw from out field to home plate. == Does...
fffda9e8d6fafa9e    " == Okinotorishima categories == I see your c...
fffe8f1340a79fc2    " == ""One of the founding nations of the EU -...
ffffce3fb183ee80    " :::Stop already. Your bullshit is not welcom...
Name: comment_text, Length: 312735, dtype: object

<IPython.core.display.Javascript object>

In [9]:
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

<IPython.core.display.Javascript object>

In [10]:
def tokenizer_encode(text, column="input_ids"):
    encoding = tokenizer.encode_plus(
        text,
        add_special_tokens=True,
        truncation=True,
        padding="max_length",
        return_attention_mask=True,
        # return_tensors="pt",
    )

    return encoding.get(column)


tqdm.pandas()

df["input_ids"] = df["comment_text"].progress_map(lambda x: tokenizer_encode(x))
df["attention_masks"] = df["comment_text"].progress_map(
    lambda x: tokenizer_encode(x, "attention_mask")
)

df

  0%|          | 0/312735 [00:00<?, ?it/s]

  0%|          | 0/312735 [00:00<?, ?it/s]

Unnamed: 0_level_0,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate,input_ids,attention_masks
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
0000997932d777bf,Explanation Why the edits made under my userna...,0.0,0.0,0.0,0.0,0.0,0.0,"[101, 7526, 2339, 1996, 10086, 2015, 2081, 210...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
000103f0d9cfb60f,D'aww! He matches this background colour I'm s...,0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1040, 1005, 22091, 2860, 999, 2002, 3503...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
000113f07ec002fd,"Hey man, I'm really not trying to edit war. It...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 4931, 2158, 1010, 1045, 1005, 1049, 2428...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
0001b41b1c6bb37e,""" More I can't make any real suggestions on im...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1000, 2062, 1045, 2064, 1005, 1056, 2191...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
0001d958c54c6e35,"You, sir, are my hero. Any chance you remember...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 2017, 1010, 2909, 1010, 2024, 2026, 5394...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
...,...,...,...,...,...,...,...,...,...
fffcd0960ee309b5,". i totally agree, this stuff is nothing but t...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1012, 1045, 6135, 5993, 1010, 2023, 4933...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
fffd7a9a6eb32c16,== Throw from out field to home plate. == Does...,0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1027, 1027, 5466, 2013, 2041, 2492, 2000...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
fffda9e8d6fafa9e,""" == Okinotorishima categories == I see your c...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1000, 1027, 1027, 7929, 5740, 29469, 247...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."
fffe8f1340a79fc2,""" == """"One of the founding nations of the EU -...",0.0,0.0,0.0,0.0,0.0,0.0,"[101, 1000, 1027, 1027, 1000, 1000, 2028, 1997...","[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ..."


<IPython.core.display.Javascript object>

In [11]:
X_test = df[df.index.isin(test_df.index)][["input_ids", "attention_masks"]]

X = df[df.index.isin(train_df.index)][["input_ids", "attention_masks"]]
y = df[df.index.isin(train_df.index)][["toxic"]].astype(int)

X_test.shape, X.shape, y.shape

((153164, 2), (159571, 2), (159571, 1))

<IPython.core.display.Javascript object>

In [12]:
y.value_counts()

toxic
0        144277
1         15294
dtype: int64

<IPython.core.display.Javascript object>

In [13]:
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.1, random_state=2022
)
X_train.shape, X_val.shape, y_train.shape, y_val.shape

((143613, 2), (15958, 2), (143613, 1), (15958, 1))

<IPython.core.display.Javascript object>

In [14]:
# For fine-tuning BERT, the authors recommend a batch size of 16 or 32.
BATCH_SIZE = 8

train_dataset = TensorDataset(
    torch.tensor(X_train["input_ids"]),
    torch.tensor(X_train["attention_masks"]),
    torch.tensor(y_train["toxic"]),
)
train_sampler = RandomSampler(train_dataset)
train_loader = DataLoader(train_dataset, sampler=train_sampler, batch_size=BATCH_SIZE)


val_dataset = TensorDataset(
    torch.tensor(X_val["input_ids"]),
    torch.tensor(X_val["attention_masks"]),
    torch.tensor(y_val["toxic"]),
)
val_sampler = RandomSampler(val_dataset)
val_loader = DataLoader(val_dataset, sampler=val_sampler, batch_size=BATCH_SIZE)

<IPython.core.display.Javascript object>

In [15]:
# Create the BertClassfier class
class BertClassifier(nn.Module):
    def __init__(self, freeze_bert=False):
        super(BertClassifier, self).__init__()
        # Specify hidden size of BERT, hidden size of our classifier, and number of labels
        D_in, H, D_out = 768, 50, 2

        # Instantiate BERT model
        # https://huggingface.co/transformers/v4.9.2/pretrained_models.html?highlight=bert-base-multilingual-cased
        self.bert = BertModel.from_pretrained("bert-base-multilingual-cased")

        # Instantiate an one-layer feed-forward classifier
        self.classifier = nn.Sequential(
            nn.Linear(D_in, H),
            nn.ReLU(),
            nn.Linear(H, D_out),
        )

        # Freeze the BERT model
        if freeze_bert:
            for param in self.bert.parameters():
                param.requires_grad = False

    def forward(self, input_ids, attention_mask):
        # Feed input to BERT
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Extract the last hidden state of the token `[CLS]` for classification task
        last_hidden_state_cls = outputs[0][:, 0, :]
        # Feed input to classifier to compute logits
        return self.classifier(last_hidden_state_cls)

<IPython.core.display.Javascript object>

In [16]:
device = torch.device("cuda:0")
device

device(type='cuda', index=0)

<IPython.core.display.Javascript object>

In [17]:
num_epochs = 4

model = BertClassifier(freeze_bert=False)
model.to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = AdamW(model.parameters(), lr=5e-5, eps=1e-8)

total_steps = len(train_loader) * num_epochs
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=total_steps
)

Downloading:   0%|          | 0.00/625 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/681M [00:00<?, ?B/s]

Some weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


<IPython.core.display.Javascript object>

In [18]:
def accuracy_score(model, loader, desc=""):
    model.eval()

    accuracy_score = 0

    pbar = tqdm(enumerate(loader), total=len(loader))
    if desc:
        pbar.set_description(desc)

    for _, batch in pbar:
        input_ids, attention_masks, labels = tuple(t.to(device) for t in batch)

        y_preds = model(input_ids, attention_masks)
        accuracy = (y_preds.argmax(dim=1) == labels).float().mean()
        accuracy_score += accuracy.cpu().detach().numpy()

    return accuracy_score / len(loader)

<IPython.core.display.Javascript object>

In [19]:
for epoch in range(num_epochs):
    model.train()

    pbar = tqdm(enumerate(train_loader), total=len(train_loader))
    pbar.set_description("Epoch %d" % epoch)

    epoch_loss = 0
    epoch_accuracy = 0

    for _, batch in pbar:
        input_ids, attention_masks, labels = tuple(t.to(device) for t in batch)

        y_preds = model(input_ids, attention_masks)
        loss = loss_fn(y_preds, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        accuracy = (y_preds.argmax(dim=1) == labels).float().mean()

        epoch_loss += loss.cpu().detach().numpy()
        epoch_accuracy += accuracy.cpu().detach().numpy()

    scheduler.step()

    print(
        "Loss: {}, Train accuracy: {}, Val accuracy: {}".format(
            epoch_loss / len(train_loader),
            epoch_accuracy / len(train_loader),
            accuracy_score(model, val_loader, "Validation"),
        )
    )

  0%|          | 0/17952 [00:00<?, ?it/s]

  0%|          | 0/1995 [00:00<?, ?it/s]

Loss: 0.309984236599445, Train accuracy: 0.904000946969697, Val accuracy: 0.9052631578947369


  0%|          | 0/17952 [00:00<?, ?it/s]

  0%|          | 0/1995 [00:00<?, ?it/s]

Loss: 0.3167595041765534, Train accuracy: 0.904031584225263, Val accuracy: 0.9052631578947369


  0%|          | 0/17952 [00:00<?, ?it/s]

  0%|          | 0/1995 [00:00<?, ?it/s]

Loss: 0.3157706250172646, Train accuracy: 0.9040357620320856, Val accuracy: 0.9052631578947369


  0%|          | 0/17952 [00:00<?, ?it/s]

  0%|          | 0/1995 [00:00<?, ?it/s]

Loss: 0.31501127647172905, Train accuracy: 0.9040357620320856, Val accuracy: 0.9052631578947369


<IPython.core.display.Javascript object>