# 💥 Implicit Toxic Comment Detection – Task Overview

This notebook provides a baseline solution for detecting **toxic language** in social media text using PyTorch and TorchText. The main goal is to classify each input as either **toxic (1)** or **non-toxic (0)**, based on linguistic patterns learned from labeled examples.

## 1. 🧩 Problem

The goal is to detect **implicit toxic language** in social media posts, particularly tweets. The task is framed as a **binary classification** problem:  
> Predict whether a given text is **toxic (1)** or **non-toxic (0)**.

## 2. 📚 Dataset Description

The pre-processed dataset is sourced from the [**ToxicDet Datasets** repository](https://github.com/duyngtr16061999/toxicdet_datasets) and contains four CSV files:

| File             | Description                                                                 |
|------------------|-----------------------------------------------------------------------------|
| `train.csv`      | Labeled training data with two columns: `text` (input sentence) and `label` (1 = toxic, 0 = non-toxic). |
| `unlabeled.csv`  | Unlabeled training samples—useful for semi-supervised learning.             |
| `test.csv`       | Evaluation set with known labels for local validation.                      |
| `hidden_test.csv`| The final test set used for leaderboard evaluation. No labels are provided. |

## 3. 🧠 Implemented base model

The baseline model in this notebook uses:

- **TorchText** for preprocessing and vocabulary building
- **RNN-based classifier**: an embedding layer, a hidden RNN layer (where the recurrent connection occurs), and a fully connected output layer.
- Training via **PyTorch**: the Adam optimizer with cross-entropy loss as the training objective.
- Optional extensions with pretrained embeddings (e.g., GloVe, FastText)
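
To make the training objective concrete, here is a minimal pure-Python sketch of the cross-entropy loss for a single sample with two output logits. The helper names `softmax` and `cross_entropy` are illustrative and are not part of the notebook; in the notebook itself the loss comes from `nn.CrossEntropyLoss`, which by default also averages over the batch.

```python
import math

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-probability that the model assigns to the true class.
    return -math.log(softmax(logits)[target])

# Equal logits: the model is undecided, so the loss equals ln(2).
print(cross_entropy([0.0, 0.0], 1))
# Confident and correct: the loss is close to zero.
print(cross_entropy([-3.0, 3.0], 1))
# Confident and wrong: the loss is large.
print(cross_entropy([-3.0, 3.0], 0))
```

The loss grows as the probability of the true class shrinks, which is what pushes the classifier toward the labeled class during training.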

## 4. 🧾 Input and output of the model

- **Input**: A single text comment (string), which is converted into:
  - A sequence of token IDs produced by a vocabulary-based tokenizer (from TorchText)
  - The **length** of the tokenized sequence (used for batching and masking in models like RNNs)
- **Output**: A binary label:
  - `1` → toxic / implicit hate
  - `0` → non-toxic
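
The input representation described above can be sketched in plain Python. This is only an illustration under simplifying assumptions: tokens are given as pre-split lists rather than produced by a real tokenizer, and the helpers `build_vocab` and `encode` are hypothetical names. IDs 0 and 1 are reserved here for padding and unknown words.

```python
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"

def build_vocab(token_lists, vocab_size):
    # Reserve ID 0 for padding and ID 1 for out-of-vocabulary words,
    # then assign the remaining IDs to the most frequent tokens.
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {PAD: 0, UNK: 1}
    for word, _ in counts.most_common(vocab_size - 2):
        vocab[word] = len(vocab)
    return vocab

def encode(tokens, vocab, max_len):
    # Map tokens to IDs, truncate to max_len, then right-pad to max_len.
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    length = len(ids)  # number of real (non-padding) tokens
    ids += [vocab[PAD]] * (max_len - length)
    return ids, length

corpus = [["you", "are", "great"], ["you", "are", "not"]]
vocab = build_vocab(corpus, vocab_size=10)
ids, length = encode(["you", "are", "awful"], vocab, max_len=5)
print(ids, length)
```

Here `awful` is not in the vocabulary, so it maps to the `<UNK>` ID, and the last two positions are filled with the `<PAD>` ID. The returned `length` records how many positions hold real tokens.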

## 5. 📏 What metric is used?

The primary evaluation metric is **Accuracy**:
> Accuracy = (Correct Predictions) / (Total Samples)

This metric will be computed on both `test.csv` and `hidden_test.csv`.
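
As a quick illustration of the metric, the sketch below computes accuracy for a handful of made-up predictions. The function name `accuracy` and the sample values are hypothetical and only serve the example.

```python
def accuracy(predictions, labels):
    # Fraction of positions where the predicted label equals the true label.
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

preds = [1, 0, 1, 1, 0]
gold = [1, 0, 0, 1, 1]
print(accuracy(preds, gold))  # 3 of 5 predictions match -> 0.6
```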

## 6. 🎯 Task for Students

You are tasked with **improving the toxic comment classification model**. You are free to implement any model you want. Some suggestions (none of them compulsory):

1. **Model Improvements**:
   - Enhance the baseline using more advanced models like BiLSTM, CNN, Transformer-based models (e.g., BERT, PhoBERT).
   - Consider pretrained embeddings or language models to boost performance.

2. **Semi-Supervised Learning**:
   - Use `unlabeled.csv` to create pseudo-labels or apply consistency-based regularization.
   - Experiment with self-training or co-training frameworks.

3. **Evaluation and Tuning**:
   - Validate your models thoroughly on `test.csv`.
   - Tune hyperparameters for better generalization to `hidden_test.csv`.
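
One simple way to create pseudo-labels is confidence thresholding: keep only the unlabeled samples whose most likely class has a predicted probability at or above a threshold. The sketch below shows the selection step in plain Python. The function name `select_pseudo_labels`, the probabilities, and the threshold are illustrative assumptions, not a prescribed recipe.

```python
def select_pseudo_labels(probabilities, threshold):
    # probabilities: one list of per-class probabilities per unlabeled sample.
    # Returns (sample_index, predicted_label) pairs for confident predictions.
    selected = []
    for index, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected

probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.20, 0.80]]
print(select_pseudo_labels(probs, threshold=0.85))
# -> [(0, 0), (2, 1)]
```

The selected samples can then be added to the labeled training set. A higher threshold yields fewer but cleaner pseudo-labels, while a lower one adds more data at the cost of more label noise.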

## 7. 📤 Submission Instructions

- The code will save the results in a file named `predictions.csv`.  
  ✅ Please rename it to: `submission_<your_name>.csv` before uploading to the competition platform.
- You must also submit the notebook file (`.ipynb`) containing your training and inference code.  
  ✅ Rename it to: `notebook_<your_name>.ipynb`.
- ❌ **Do not submit any model checkpoint files** (e.g., `.pt`, `.bin`, etc.).

In [1]:
!git clone https://github.com/duyngtr16061999/toxicdet_datasets

fatal: destination path 'toxicdet_datasets' already exists and is not an empty directory.


In [2]:
import csv
from pathlib import Path
from collections import Counter
from copy import deepcopy

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, ConcatDataset
from torchtext.vocab import GloVe

import nltk
nltk.download("punkt")
from nltk.tokenize import word_tokenize

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\ASUS\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [3]:
seed = 1234

np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True

In [4]:
device = "cuda" if torch.cuda.is_available() else "cpu"

### Prepare data

In [5]:
base_dir = Path("./toxicdet_datasets")

df_train = pd.read_csv(base_dir / "train.csv")
df_train_unlabeled = pd.read_csv(base_dir / "unlabeled.csv")
df_test = pd.read_csv(base_dir / "test.csv")
df_test_unlabeled = pd.read_csv(base_dir / "hidden_test.csv")

len(df_train), len(df_train_unlabeled)

(3908, 11727)

In [6]:
df_train.head()

Unnamed: 0,text,label
0,republicans like the ones the gop tries to fob...,0
1,. feature : legal alliance gains host of court...,1
2,black thug attacks old white man with urine be...,0
3,it's outrageous that good journalists would wh...,0
4,a cm or his ministers behaving like street-goo...,1


In [7]:
def preprocess(texts, vocab=None, *, max_len=500, vocab_size=10000):
    # Tokenize
    text_tokens = [word_tokenize(text.lower()) for text in texts]

    # Build vocabulary
    vocab_provided = vocab is not None
    if not vocab_provided:
        common_words = Counter([token for text in text_tokens for token in text]).most_common(vocab_size - 2)
        vocab = {word: idx + 2 for idx, (word, _) in enumerate(common_words)}
        vocab["<UNK>"] = 1
        vocab["<PAD>"] = 0

    # Tokens to token IDs
    text_token_ids = []
    for text in text_tokens:
        encoded = [vocab.get(word, vocab["<UNK>"]) for word in text]
        # Truncate if more, pad if less
        encoded += [vocab["<PAD>"]] * (max_len - len(encoded))
        encoded = encoded[:max_len]
        text_token_ids.append(encoded)
    text_token_ids = torch.tensor(text_token_ids).to(device)
    
    if vocab_provided:
        return text_token_ids
    return text_token_ids, vocab

In [8]:
X_train, vocab = preprocess(df_train["text"])
y_train = df_train["label"]

len(vocab)

9245

In [9]:
X_train1 = preprocess(df_train_unlabeled["text"], vocab)
X_test, y_test = preprocess(df_test["text"], vocab), df_test["label"]
X_test1 = preprocess(df_test_unlabeled["text"], vocab)

In [10]:
class TextDataset(Dataset):
    def __init__(self, texts, labels=None):
        self.texts = texts
        self.labels = labels

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
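        # Unlabeled datasets return the text twice so that every item is an (input, target) pair.
        label = self.labels[idx] if self.labels is not None else self.texts[idx]
        return self.texts[idx], label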

In [11]:
ds_train = TextDataset(X_train, y_train)
dl_train = DataLoader(ds_train, batch_size=64)

In [12]:
ds_train1 = TextDataset(X_train1)
dl_train1 = DataLoader(ds_train1, batch_size=64)

In [13]:
ds_test = TextDataset(X_test, y_test)
dl_test = DataLoader(ds_test, batch_size=64)

In [14]:
ds_test1 = TextDataset(X_test1)
dl_test1 = DataLoader(ds_test1, batch_size=64)

### Train

In [15]:
class MyModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, num_classes, padding_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=2, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)
    
    def forward(self, x):
        embedded = self.embedding(x)
        _, (hidden, cell) = self.lstm(embedded)
        last_hidden = torch.cat((hidden[-2], hidden[-1]), dim=1)
        logits = self.fc(last_hidden)
        return logits

In [16]:
def train(model, optimizer, criterion, dataloader, num_epochs, eval_dataloader=None, model_filename="best_lstm.pt"):
    train_losses = []
    highest_accuracy = 0
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0
        
        for inputs, labels in dataloader:
            inputs, labels = inputs.to(device), labels.to(device)
            
            # Logits have shape (batch, num_classes); no squeeze, so a batch of size 1 keeps its batch dimension
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item() * inputs.size(0)

        running_loss /= len(dataloader.dataset)
        train_losses.append(running_loss)
        
        accuracy = evaluate(model, dataloader)
        score = accuracy
        if eval_dataloader is not None:
            score = evaluate(model, eval_dataloader)
            accuracy = [accuracy, score]

        if score > highest_accuracy:
            torch.save(model.state_dict(), model_filename)
            highest_accuracy = score
        
        print(f"Epoch [{epoch + 1}/{num_epochs}], Loss: {running_loss}, Accuracy: {accuracy}")
    return train_losses

In [17]:
@torch.no_grad()
def predict(model, dataloader):
    model.eval()
    all_preds = []
    all_probas = []
    for inputs, _ in dataloader:
        inputs = inputs.to(device)
        
        # Logits have shape (batch, num_classes); no squeeze, so softmax over dim=1 also works for a batch of size 1
        outputs = model(inputs)
        outputs = torch.softmax(outputs, dim=1)
        preds = torch.argmax(outputs, dim=1)
        
        all_preds.append(preds)
        all_probas.append(outputs[torch.arange(outputs.size(0)), preds])

    return torch.hstack(all_preds), torch.hstack(all_probas)

In [18]:
def evaluate(model, dataloader):
    correct = 0
    total = 0
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)
        
        preds, _ = predict(model, [(inputs, None)])
        
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

In [19]:
embedding_dim = 300
hidden_dim = 128

model = MyModel(len(vocab), embedding_dim, hidden_dim, 2, vocab["<PAD>"]).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

In [20]:
glove = GloVe(dim=embedding_dim)
embedding_matrix = np.zeros((len(vocab), embedding_dim))

for word, idx in vocab.items():
    if word in glove.stoi:
        embedding_matrix[idx] = glove[word].numpy()
    else:
        embedding_matrix[idx] = np.random.normal(scale=0.6, size=(embedding_dim,))

model.embedding.weight.data.copy_(torch.from_numpy(embedding_matrix))
model.embedding.requires_grad_(False);

In [21]:
train(model, optimizer, criterion, dl_train, 10, dl_test);

Epoch [1/10], Loss: 0.6069801191712597, Accuracy: [0.7320880245649949, 0.6860940695296524]
Epoch [2/10], Loss: 0.5253773936534734, Accuracy: [0.7750767656090072, 0.7290388548057259]
Epoch [3/10], Loss: 0.47478856642024175, Accuracy: [0.79503582395087, 0.7326175869120655]
Epoch [4/10], Loss: 0.43032087228178, Accuracy: [0.8208802456499488, 0.7367075664621677]
Epoch [5/10], Loss: 0.37666725930104844, Accuracy: [0.8349539406345957, 0.7392638036809815]
Epoch [6/10], Loss: 0.33917839367975966, Accuracy: [0.8592630501535312, 0.7351738241308794]
Epoch [7/10], Loss: 0.29498731748257195, Accuracy: [0.8902251791197543, 0.7065439672801636]
Epoch [8/10], Loss: 0.22462493440368905, Accuracy: [0.8359774820880246, 0.7356850715746421]
Epoch [9/10], Loss: 0.22999919789934772, Accuracy: [0.9124872057318322, 0.7167689161554193]
Epoch [10/10], Loss: 0.18234349896644214, Accuracy: [0.9347492323439099, 0.7183026584867076]


In [22]:
best_model = deepcopy(model)
best_model.load_state_dict(torch.load("best_lstm.pt"))
best_model.lstm.flatten_parameters()  # Flatten weights in a single block of memory (for CuDNN optimization)

score = evaluate(best_model, dl_test)
score

0.7392638036809815

### Train with unlabeled data

In [23]:
threshold = 0.85

preds, probas = predict(model, dl_train1)
mask = probas >= threshold

mask.sum().item()

9647

In [24]:
ds_train1_pseu = TextDataset(X_train1[mask], preds[mask])

In [25]:
ds_train_extended = ConcatDataset([ds_train, ds_train1_pseu])
dl_train_extended = DataLoader(ds_train_extended, batch_size=64)

len(ds_train_extended)

13555

In [26]:
train(model, optimizer, criterion, dl_train_extended, 10, dl_test, "best_lstm_pseu.pt");

Epoch [1/10], Loss: 0.11833385116410723, Accuracy: [0.9692364441165622, 0.712678936605317]
Epoch [2/10], Loss: 0.07239131128318589, Accuracy: [0.9692364441165622, 0.717280163599182]
Epoch [3/10], Loss: 0.0567424925826751, Accuracy: [0.9742530431575065, 0.7131901840490797]
Epoch [4/10], Loss: 0.04117908401896388, Accuracy: [0.9721873847288823, 0.7177914110429447]
Epoch [5/10], Loss: 0.0378376873827705, Accuracy: [0.9713021025451862, 0.7229038854805726]
Epoch [6/10], Loss: 0.037005999634482624, Accuracy: [0.9803762449280709, 0.7131901840490797]
Epoch [7/10], Loss: 0.04195667778635885, Accuracy: [0.9772039837698266, 0.7111451942740287]
Epoch [8/10], Loss: 0.03882673933668518, Accuracy: [0.9811877535964588, 0.7223926380368099]
Epoch [9/10], Loss: 0.027516346048409215, Accuracy: [0.9855403909996311, 0.7116564417177914]
Epoch [10/10], Loss: 0.028740759741178085, Accuracy: [0.983548506086315, 0.7203476482617587]


In [27]:
best_model_pseu = deepcopy(model)
best_model_pseu.load_state_dict(torch.load("best_lstm_pseu.pt"))
best_model_pseu.lstm.flatten_parameters()

score_pseu = evaluate(best_model_pseu, dl_test)
score_pseu

0.7229038854805726

In [28]:
model = best_model_pseu if score_pseu >= score else best_model

hidden_predictions, _ = predict(model, dl_test1)
hidden_predictions

tensor([1, 0, 1,  ..., 1, 1, 1], device='cuda:0')

### Save to file

In [29]:
def save_predictions_to_csv(dataset, filename):
    with open(filename, mode='w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(['label'])
        for item in dataset:
            writer.writerow([item])

In [30]:
# Save the hidden-test predictions (a list of labels) to a CSV file
your_name = "wlamb"  # <- IMPORTANT: Replace with your name
csv_filename = f"predictions_{your_name}.csv"
save_predictions_to_csv(hidden_predictions.tolist(), csv_filename)

## **IMPORTANT**: Remember to download both your notebook and the output CSV file, and submit them using the NLP Submission Link.

Make sure both files are named correctly.  

For example:
- `JohnSmith-IOAI2025-Task3-NaturalLanguageProcessing.ipynb`.
- `predictions_JohnSmith.csv`.