# 💥 Implicit Toxic Comment Detection – Task Overview

This notebook provides a baseline solution for detecting **toxic language** in social media text using PyTorch and TorchText. The main goal is to classify each input as either **toxic (1)** or **non-toxic (0)**, based on linguistic patterns learned from labeled examples.

## 1. 🧩 Problem

The goal is to detect **implicit toxic language** in social media posts, particularly tweets. The task is framed as a **binary classification** problem:  
> Predict whether a given text is **toxic (1)** or **non-toxic (0)**.

## 2. 📚 Dataset Description

The pre-processed dataset is sourced from the [**ToxicDet Datasets** repository](https://github.com/duyngtr16061999/toxicdet_datasets) and contains four CSV files:

| File             | Description                                                                 |
|------------------|-----------------------------------------------------------------------------|
| `train.csv`      | Labeled training data with two columns: `text` (input sentence) and `label` (1 = toxic, 0 = non-toxic). |
| `unlabeled.csv`  | Unlabeled training samples—useful for semi-supervised learning.             |
| `test.csv`       | Evaluation set with known labels for local validation.                      |
| `hidden_test.csv`| The final test set used for leaderboard evaluation. No labels are provided. |

## 3. 🧠 Implemented base model

The baseline model in this notebook uses:

- **TorchText** for preprocessing and vocabulary building
- **RNN-based classifier**: an embedding layer, a hidden RNN layer (where the recurrent connection occurs), and a fully connected output layer.
- Training in **PyTorch** with the Adam optimizer and a cross-entropy loss as the training objective.
- Optional extensions with pretrained embeddings (e.g., GloVe, FastText)
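
To make the "recurrent connection" concrete, here is a minimal pure-Python sketch of a single-layer RNN with a tanh activation. The weights and the two-dimensional inputs are made-up toy values, and the two bias vectors that `nn.RNN` keeps separately are merged into one `b` for brevity, so treat it as an illustration rather than the notebook's model.

```python
import math

def rnn_step(x, h_prev, w_ih, w_hh, b):
    """One time step: h_t = tanh(W_ih x_t + W_hh h_{t-1} + b)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(w_ih[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(sequence, hidden_dim, w_ih, w_hh, b):
    """Feed the sequence one vector at a time, carrying the hidden state."""
    h = [0.0] * hidden_dim  # initial hidden state is all zeros
    for x in sequence:
        h = rnn_step(x, h, w_ih, w_hh, b)
    return h

# Toy (hypothetical) parameters: 2-dim inputs, 2-dim hidden state.
w_ih = [[0.5, -0.5], [0.1, 0.2]]
w_hh = [[0.0, 0.3], [0.3, 0.0]]
b = [0.0, 0.0]

final_hidden = run_rnn([[1.0, 0.0], [0.0, 1.0]], 2, w_ih, w_hh, b)
print(final_hidden)
```

The final hidden state is what a classifier head would consume: it summarizes the whole sequence because each step depends on the previous one.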

## 4. 🧾 Input and output of the model

- **Input**: A single text comment (string), which the pipeline converts into:
  - A sequence of token ids, produced by a tokenizer and a vocabulary (from TorchText)
  - The **length** of the tokenized sequence (used for batching and masking in models like RNNs)
- **Output**: A binary label:
  - `1` → toxic / implicit hate
  - `0` → non-toxic
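
As a rough illustration of the input side, the sketch below turns a string into token ids plus a length. The vocabulary `toy_vocab` is invented for this example, and whitespace splitting stands in for the TorchText tokenizer, so the ids will not match the real notebook.

```python
# Toy illustration only: this vocabulary is made up, not the notebook's.
toy_vocab = {"<unk>": 0, "<pad>": 1, "they": 2, "are": 3, "no": 4, "better": 5}

def encode(text, vocab, max_length=256):
    """Lowercase, split on whitespace, truncate, and map tokens to ids."""
    tokens = text.lower().split()[:max_length]
    # Tokens missing from the vocabulary fall back to the <unk> id.
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens]
    return ids, len(ids)

ids, length = encode("They are no better than democrats", toy_vocab)
print(ids, length)  # [2, 3, 4, 5, 0, 0] 6
```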

## 5. 📏 What metric is used?

The primary evaluation metric is **Accuracy**:
> Accuracy = (Correct Predictions) / (Total Samples)

This metric will be computed on both `test.csv` and `hidden_test.csv`.
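
For reference, a minimal sketch of the accuracy formula on plain Python lists. The example predictions and labels are invented for illustration.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the corresponding label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

example_preds = [1, 0, 0, 1, 1]
example_labels = [1, 0, 1, 1, 0]
print(accuracy(example_preds, example_labels))  # 0.6
```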

## 6. 🎯 Task for Students

You are tasked with **improving the toxic comment classification model**. You are free to implement any model you want. Some optional suggestions:

1. **Model Improvements**:
   - Enhance the baseline using more advanced models like BiLSTM, CNN, Transformer-based models (e.g., BERT, PhoBERT).
   - Consider pretrained embeddings or language models to boost performance.

2. **Semi-Supervised Learning**:
   - Use `unlabeled.csv` to create pseudo-labels or apply consistency-based regularization.
   - Experiment with self-training or co-training frameworks.

3. **Evaluation and Tuning**:
   - Validate your models thoroughly on `test.csv`.
   - Tune hyperparameters for better generalization to `hidden_test.csv`.
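
One way to approach the pseudo-labeling suggestion is sketched below: keep only the unlabeled texts on which the current model is confident, and add them to the training set with the predicted label. The function name, the `0.9` threshold, and the example probabilities are all assumptions for illustration; in practice the probabilities would come from a trained model.

```python
def select_pseudo_labels(texts, toxic_probs, threshold=0.9):
    """Keep only texts whose predicted class is confident enough.

    toxic_probs[i] is the model's probability that texts[i] is toxic.
    """
    selected = []
    for text, p_toxic in zip(texts, toxic_probs):
        confidence = max(p_toxic, 1.0 - p_toxic)
        if confidence >= threshold:
            selected.append({"text": text, "label": int(p_toxic >= 0.5)})
    return selected

# Hypothetical model outputs for three unlabeled posts.
texts = ["post a", "post b", "post c"]
toxic_probs = [0.97, 0.55, 0.04]

pseudo = select_pseudo_labels(texts, toxic_probs)
print(pseudo)
# "post a" is kept with label 1, "post c" with label 0, "post b" is dropped.
```

A high threshold trades coverage for label quality; a lower one adds more data but also more noisy labels.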

## 7. 📤 Submission Instructions

- The code will save the results in a file named `predictions.csv`.  
  ✅ Please rename it to: `submission_<your_name>.csv` before uploading to the competition platform.
- You must also submit the notebook file (`.ipynb`) containing your training and inference code.  
  ✅ Rename it to: `notebook_<your_name>.ipynb`.
- ❌ **Do not submit any model checkpoint files** (e.g., `.pt`, `.bin`, etc.).

In [1]:
!pip install datasets tqdm -q
!pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121 -q
!pip install torchtext==0.18.0 -q

In [2]:
import collections
import csv
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import tqdm
from datasets import Dataset
import torchtext
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator, GloVe



In [3]:
seed = 1234

np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True

In [4]:
!git clone https://github.com/duyngtr16061999/toxicdet_datasets

fatal: destination path 'toxicdet_datasets' already exists and is not an empty directory.


### Prepare data

In [5]:
def load_dataset(csv_file, labeled=True):
    samples = []

    with open(csv_file, mode='r', encoding='utf-8') as file:
        reader = csv.DictReader(file)

        for row in reader:
            sample = {
                'text': row.get('text', '').strip()
            }
            if labeled:
                sample['label'] = int(row.get('label', '').strip())
            samples.append(sample)

    return samples

In [6]:
# Training pool: ~25% labeled (train.csv), ~75% unlabeled (unlabeled.csv)
# Evaluation pool: split 50/50 into test.csv and hidden_test.csv
train_dataset = load_dataset("./toxicdet_datasets/train.csv")
unlabeled_dataset = load_dataset("./toxicdet_datasets/unlabeled.csv", False)

test_dataset = load_dataset("./toxicdet_datasets/test.csv")
hidden_test_dataset = load_dataset("./toxicdet_datasets/hidden_test.csv", False)

In [7]:
def gen(dataset, labeled=True):
    for i in dataset:
        item = {"text": i["text"]}
        if labeled:
            item["label"] = i["label"]
        yield item

In [8]:
train_data = Dataset.from_generator(gen, gen_kwargs={"dataset": train_dataset})
test_data = Dataset.from_generator(gen, gen_kwargs={"dataset": test_dataset})
unlabeled_data = Dataset.from_generator(gen, gen_kwargs={"dataset": unlabeled_dataset, "labeled": False})
hidden_test_data = Dataset.from_generator(gen, gen_kwargs={"dataset": hidden_test_dataset, "labeled": False})

train_data[:5]  # First 5

{'text': ['republicans like the ones the gop tries to fob off on white america ? ! they are no better than democrats !',
  '. feature : legal alliance gains host of court victories for conservative christian movement',
  'black thug attacks old white man with urine because of slavery #swrm #blackthug #whiteman via',
  "it's outrageous that good journalists would white wash an islamic phrase  used to kill  and main  and destroy good .",
  'a cm or his ministers behaving like street-goondas certainly reflects poor understanding of democracy .'],
 'label': [0, 1, 0, 0, 1]}

In [9]:
def tokenize_example(example, tokenizer, max_length):
    tokens = tokenizer(example["text"])[:max_length]
    length = len(tokens)
    return {"tokens": tokens, "length": length}

In [10]:
tokenizer = get_tokenizer("basic_english")
max_length = 256

train_data = train_data.map(
    tokenize_example, fn_kwargs={"tokenizer": tokenizer, "max_length": max_length}
)
test_data = test_data.map(
    tokenize_example, fn_kwargs={"tokenizer": tokenizer, "max_length": max_length}
)
unlabeled_data = unlabeled_data.map(
    tokenize_example, fn_kwargs={"tokenizer": tokenizer, "max_length": max_length}
)
hidden_test_data = hidden_test_data.map(
    tokenize_example, fn_kwargs={"tokenizer": tokenizer, "max_length": max_length}
)

In [11]:
test_size = 0.25

train_valid_data = train_data.train_test_split(test_size=test_size)
train_data = train_valid_data["train"]
valid_data = train_valid_data["test"]

In [12]:
min_freq = 5
special_tokens = ["<unk>", "<pad>"]

vocab = build_vocab_from_iterator(
    train_data["tokens"],
    min_freq=min_freq,
    specials=special_tokens,
)

unk_index = vocab["<unk>"]
pad_index = vocab["<pad>"]
vocab.set_default_index(unk_index)
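
The effect of `min_freq` and of the default `<unk>` index can be seen on a toy corpus. This is a pure-Python approximation written for illustration; the helper `build_toy_vocab` is hypothetical, and the exact ordering that TorchText assigns to tokens may differ.

```python
from collections import Counter

def build_toy_vocab(tokenized_texts, min_freq=2, specials=("<unk>", "<pad>")):
    """Map tokens seen at least min_freq times to ids, specials first."""
    counts = Counter(tok for tokens in tokenized_texts for tok in tokens)
    itos = list(specials)
    # Most frequent tokens first; ties broken alphabetically.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            itos.append(tok)
    return {tok: idx for idx, tok in enumerate(itos)}

corpus = [["white", "house", "news"], ["white", "house"], ["white", "paper"]]
toy_vocab = build_toy_vocab(corpus)
print(toy_vocab)  # {'<unk>': 0, '<pad>': 1, 'white': 2, 'house': 3}

# Rare or unseen tokens fall back to the <unk> id.
lookup = [toy_vocab.get(t, toy_vocab["<unk>"]) for t in ["white", "paper", "house"]]
print(lookup)  # [2, 0, 3]
```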

In [13]:
def numericalize_example(example, vocab):
    ids = vocab.lookup_indices(example["tokens"])
    return {"ids": ids}

In [14]:
train_data = train_data.map(numericalize_example, fn_kwargs={"vocab": vocab})
valid_data = valid_data.map(numericalize_example, fn_kwargs={"vocab": vocab})
test_data = test_data.map(numericalize_example, fn_kwargs={"vocab": vocab})
hidden_test_data = hidden_test_data.map(numericalize_example, fn_kwargs={"vocab": vocab})

In [15]:
train_data = train_data.with_format(type="torch", columns=["ids", "label", "length"])
valid_data = valid_data.with_format(type="torch", columns=["ids", "label", "length"])
test_data = test_data.with_format(type="torch", columns=["ids", "label", "length"])
hidden_test_data = hidden_test_data.with_format(type="torch", columns=["ids", "length"])

In [16]:
def get_collate_fn(pad_index, labeled=True):
    
    def collate_fn(batch):
        batch_ids = [i["ids"] for i in batch]
        batch_ids = nn.utils.rnn.pad_sequence(
            batch_ids, padding_value=pad_index, batch_first=True
        )
        
        batch_length = [i["length"] for i in batch]
        batch_length = torch.stack(batch_length)
        
        batch_dict = {"ids": batch_ids, "length": batch_length}
        
        if labeled:
            batch_label = [i["label"] for i in batch]
            batch_label = torch.stack(batch_label)
            batch_dict["label"] = batch_label
        
        return batch_dict

    return collate_fn
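
What the collate function does to a batch can be shown without tensors. The sketch below pads variable-length id lists to the longest sequence and keeps the original lengths. It is a plain-list stand-in for `pad_sequence`, with made-up ids.

```python
def pad_batch(sequences, pad_index):
    """Right-pad every sequence to the batch maximum and return the lengths."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_index] * (max_len - len(seq)) for seq in sequences]
    lengths = [len(seq) for seq in sequences]
    return padded, lengths

padded, lengths = pad_batch([[5, 8, 2], [7], [4, 9]], pad_index=1)
print(padded)   # [[5, 8, 2], [7, 1, 1], [4, 9, 1]]
print(lengths)  # [3, 1, 2]
```

The lengths are what later allow the RNN to skip the padded positions when the batch is packed.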

In [17]:
def get_data_loader(dataset, batch_size, pad_index, shuffle=False, labeled=True):
    collate_fn = get_collate_fn(pad_index, labeled=labeled)
    data_loader = torch.utils.data.DataLoader(
        dataset=dataset,
        batch_size=batch_size,
        collate_fn=collate_fn,
        shuffle=shuffle,
    )
    return data_loader

In [18]:
batch_size = 512

train_data_loader = get_data_loader(train_data, batch_size, pad_index, shuffle=True)
valid_data_loader = get_data_loader(valid_data, batch_size, pad_index)
test_data_loader = get_data_loader(test_data, batch_size, pad_index)
hidden_test_data_loader = get_data_loader(hidden_test_data, batch_size, pad_index, labeled=False)

### Define model

In [19]:
class SimpleRNN(nn.Module):
    def __init__(
        self,
        vocab_size,
        embedding_dim,
        hidden_dim,
        output_dim,
        n_layers,
        bidirectional,
        dropout_rate,
        pad_index,
    ):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_index)
        self.rnn = nn.RNN(
            embedding_dim,
            hidden_dim,
            n_layers,
            bidirectional=bidirectional,
            dropout=dropout_rate if n_layers > 1 else 0,
            batch_first=True,
        )
        self.fc = nn.Linear(hidden_dim * 2 if bidirectional else hidden_dim, output_dim)
        self.dropout = nn.Dropout(dropout_rate)

    def forward(self, ids, length):
        # ids = [batch size, seq len]
        embedded = self.dropout(self.embedding(ids))
        # embedded = [batch size, seq len, embedding dim]
        packed_embedded = nn.utils.rnn.pack_padded_sequence(
            embedded, length, batch_first=True, enforce_sorted=False
        )
        packed_output, hidden = self.rnn(packed_embedded)
        # hidden = [n layers * n directions, batch size, hidden dim]
        output, _ = nn.utils.rnn.pad_packed_sequence(packed_output, batch_first=True)

        if self.rnn.bidirectional:
            hidden = self.dropout(torch.cat([hidden[-1], hidden[-2]], dim=-1))
        else:
            hidden = self.dropout(hidden[-1])
        prediction = self.fc(hidden)
        return prediction

In [20]:
vocab_size = len(vocab)
embedding_dim = 300
hidden_dim = 300
output_dim = len(train_data.unique("label"))
n_layers = 2
bidirectional = False
dropout_rate = 0.5

model = SimpleRNN(
    vocab_size,
    embedding_dim,
    hidden_dim,
    output_dim,
    n_layers,
    bidirectional,
    dropout_rate,
    pad_index,
)

f"The model has {sum(p.numel() for p in model.parameters() if p.requires_grad):,} trainable parameters"

'The model has 704,702 trainable parameters'

In [21]:
def initialize_weights(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_normal_(m.weight)
        nn.init.zeros_(m.bias)
    elif isinstance(m, nn.RNN):
        for name, param in m.named_parameters():
            if "bias" in name:
                nn.init.zeros_(param)
            elif "weight" in name:
                nn.init.orthogonal_(param)

model.apply(initialize_weights)

SimpleRNN(
  (embedding): Embedding(1143, 300, padding_idx=1)
  (rnn): RNN(300, 300, num_layers=2, batch_first=True, dropout=0.5)
  (fc): Linear(in_features=300, out_features=2, bias=True)
  (dropout): Dropout(p=0.5, inplace=False)
)
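
The 704,702 trainable parameters reported earlier can be checked by hand from the shapes in the model summary above. The sketch assumes the standard layout of an RNN layer (an input-to-hidden matrix, a hidden-to-hidden matrix, and two bias vectors).

```python
# Sizes taken from the printed model summary.
vocab_size, embedding_dim, hidden_dim, output_dim, n_layers = 1143, 300, 300, 2, 2

embedding_params = vocab_size * embedding_dim

def rnn_layer_params(input_dim, hidden_dim):
    """weight_ih + weight_hh + bias_ih + bias_hh for one unidirectional layer."""
    return hidden_dim * input_dim + hidden_dim * hidden_dim + 2 * hidden_dim

# The first layer reads embeddings; later layers read the previous hidden state.
rnn_params = rnn_layer_params(embedding_dim, hidden_dim)
rnn_params += sum(rnn_layer_params(hidden_dim, hidden_dim) for _ in range(n_layers - 1))

fc_params = hidden_dim * output_dim + output_dim  # weights + bias

total = embedding_params + rnn_params + fc_params
print(f"{total:,}")  # 704,702
```

Almost half of the parameters sit in the embedding table, which is worth keeping in mind when swapping in pretrained embeddings.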

In [22]:
vectors = GloVe()
pretrained_embedding = vectors.get_vecs_by_tokens(vocab.get_itos())
model.embedding.weight.data = pretrained_embedding

In [23]:
lr = 5e-4

optimizer = optim.Adam(model.parameters(), lr=lr)

criterion = nn.CrossEntropyLoss()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = model.to(device)
criterion = criterion.to(device)

### Train and infer

In [24]:
def train(dataloader, model, criterion, optimizer, device):
    model.train()
    epoch_losses = []
    epoch_accs = []
    for batch in tqdm.tqdm(dataloader, desc="training..."):
        ids = batch["ids"].to(device)
        length = batch["length"]
        label = batch["label"].to(device)
        
        prediction = model(ids, length)
        
        loss = criterion(prediction, label)
        accuracy = get_accuracy(prediction, label)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        epoch_losses.append(loss.item())
        epoch_accs.append(accuracy.item())
    
    return np.mean(epoch_losses), np.mean(epoch_accs)

In [25]:
@torch.no_grad()
def evaluate(dataloader, model, criterion, device):
    model.eval()
    epoch_losses = []
    epoch_accs = []
    for batch in tqdm.tqdm(dataloader, desc="evaluating..."):
        ids = batch["ids"].to(device)
        length = batch["length"]
        label = batch["label"].to(device)
        
        prediction = model(ids, length)
        
        loss = criterion(prediction, label)
        accuracy = get_accuracy(prediction, label)
        
        epoch_losses.append(loss.item())
        epoch_accs.append(accuracy.item())
    
    return np.mean(epoch_losses), np.mean(epoch_accs)

In [26]:
def get_accuracy(prediction, label):
    batch_size, _ = prediction.shape
    
    predicted_classes = prediction.argmax(dim=-1)
    correct_predictions = predicted_classes.eq(label).sum()
    
    accuracy = correct_predictions / batch_size
    return accuracy

In [27]:
n_epochs = 10
best_valid_loss = float("inf")

metrics = collections.defaultdict(list)

for epoch in range(n_epochs):
    train_loss, train_acc = train(
        train_data_loader, model, criterion, optimizer, device
    )
    
    valid_loss, valid_acc = evaluate(valid_data_loader, model, criterion, device)
    
    metrics["train_losses"].append(train_loss)
    metrics["train_accs"].append(train_acc)
    metrics["valid_losses"].append(valid_loss)
    metrics["valid_accs"].append(valid_acc)
    
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), "rnn_baseline.pt")
    
    print(f"epoch: {epoch}")
    print(f"train_loss: {train_loss:.3f}, train_acc: {train_acc:.3f}")
    print(f"valid_loss: {valid_loss:.3f}, valid_acc: {valid_acc:.3f}")

epoch: 0
train_loss: 0.875, train_acc: 0.585
valid_loss: 0.728, valid_acc: 0.661

epoch: 1
train_loss: 0.816, train_acc: 0.579
valid_loss: 0.736, valid_acc: 0.670

epoch: 2
train_loss: 0.769, train_acc: 0.611
valid_loss: 0.782, valid_acc: 0.672

epoch: 3
train_loss: 0.748, train_acc: 0.621
valid_loss: 0.704, valid_acc: 0.673

epoch: 4
train_loss: 0.727, train_acc: 0.631
valid_loss: 0.716, valid_acc: 0.674

epoch: 5
train_loss: 0.706, train_acc: 0.638
valid_loss: 0.690, valid_acc: 0.677

epoch: 6
train_loss: 0.711, train_acc: 0.646
valid_loss: 0.680, valid_acc: 0.679

epoch: 7
train_loss: 0.671, train_acc: 0.658
valid_loss: 0.696, valid_acc: 0.683

epoch: 8
train_loss: 0.665, train_acc: 0.669
valid_loss: 0.660, valid_acc: 0.639

epoch: 9
train_loss: 0.671, train_acc: 0.672
valid_loss: 0.640, valid_acc: 0.692


In [28]:
# Load the best checkpoint; map_location keeps this working on CPU-only machines
model.load_state_dict(torch.load("rnn_baseline.pt", map_location=device))

test_loss, test_acc = evaluate(test_data_loader, model, criterion, device)
test_loss, test_acc


(0.6444113105535507, 0.6854678243398666)
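
The checkpoint evaluated here is the one with the lowest validation loss. Replaying the checkpoint rule on the validation losses printed in the training log shows which epochs triggered a save. The list below is copied from that log (rounded to three decimals as printed).

```python
# Validation losses for epochs 0-9, as printed in the training log.
valid_losses = [0.728, 0.736, 0.782, 0.704, 0.716, 0.690, 0.680, 0.696, 0.660, 0.640]

best_epoch = None
best_loss = float("inf")
saved_at = []  # epochs at which the checkpoint would have been overwritten

for epoch, loss in enumerate(valid_losses):
    if loss < best_loss:
        best_loss = loss
        best_epoch = epoch
        saved_at.append(epoch)

print(best_epoch, best_loss)  # 9 0.64
print(saved_at)               # [0, 3, 5, 6, 8, 9]
```

Because the last epoch is also the best one, the validation loss may still be improving, so training for more epochs could be worth trying.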

In [29]:
@torch.no_grad()
def predict_sentiment(text, model, tokenizer, vocab, device):
    tokens = tokenizer(text)
    ids = vocab.lookup_indices(tokens)
    length = torch.LongTensor([len(ids)])
    tensor = torch.LongTensor(ids).unsqueeze(dim=0).to(device)
    
    prediction = model(tensor, length).squeeze(dim=0)
    
    probability = torch.softmax(prediction, dim=-1)
    predicted_class = prediction.argmax(dim=-1).item()
    predicted_probability = probability[predicted_class].item()
    
    return predicted_class, predicted_probability

In [30]:
text = "that and what culture there is  is a bunch of self-hating white new englanders and left coast radical"

predict_sentiment(text, model, tokenizer, vocab, device)

(1, 0.5322336554527283)
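
The second value in the output is the softmax probability of the predicted class. A pure-Python sketch of that step follows. The logits are hypothetical values chosen for illustration, not the model's actual outputs.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.20, 0.33]  # hypothetical scores for [non-toxic, toxic]
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(predicted_class, probs[predicted_class])
```

A probability only slightly above 0.5, as in the cell output above, means the model is close to undecided on that example.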

In [31]:
@torch.no_grad()
def infer_predictions(dataloader, model, device):
    """
    Perform inference to get predictions (0 or 1) for the entire dataset.

    Args:
        dataloader: DataLoader providing batches of input data.
        model: Trained PyTorch model.
        device: torch.device ('cuda' or 'cpu').

    Returns:
        List[int]: A list of predicted labels (0 or 1).
    """
    model.eval()
    all_predictions = []
    for batch in tqdm.tqdm(dataloader, desc="inferring..."):
        ids = batch["ids"].to(device)
        length = batch["length"]
        
        prediction = model(ids, length)
        
        # argmax over the logits gives the class; softmax is not needed for that
        predicted_class = prediction.argmax(dim=-1).detach().cpu().tolist()
        
        all_predictions.extend(predicted_class)
    
    return all_predictions

In [32]:
hidden_predictions = infer_predictions(hidden_test_data_loader, model, device)


### Save to file

In [33]:
def save_predictions_to_csv(predictions, filename):
    """Write one predicted label per row under a single 'label' header."""
    with open(filename, mode='w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(['label'])
        for item in predictions:
            writer.writerow([item])

In [34]:
# Save the hidden-test predictions to a CSV file
your_name = "baseline"  # <- IMPORTANT: Replace with your name
csv_filename = f"predictions_{your_name}.csv"  # matches the naming example below
save_predictions_to_csv(hidden_predictions, csv_filename)

## **IMPORTANT**: Remember to download both your notebook and the output CSV file, and submit them using the NLP Submission Link.

Make sure both files are named correctly.

For example:
- `JohnSmith-IOAI2025-Task3-NaturalLanguageProcessing.ipynb`.
- `predictions_JohnSmith.csv`.