# Sarcasm Detection Using Fine-Tuned RoBERTa

This notebook fine-tunes a Twitter-RoBERTa model to detect sarcasm in English tweets using the iSarcasmEval dataset.

**Approach:** Full fine-tuning with Focal Loss to handle class imbalance

## 1. Download Dataset from Google Drive

In [1]:
!pip install -q gdown
import gdown

file_id_train = '1x6CbYlfuPZf1-EZFVN-uKcFptlthVGf8'
url_train = f'https://drive.google.com/uc?id={file_id_train}'
gdown.download(url_train, 'train.En.csv', quiet=False)

file_id_test = '1ZXMDxYoVEDaVO88s8XR1QQKzEYQVCwFG'
url_test = f'https://drive.google.com/uc?id={file_id_test}'
gdown.download(url_test, 'task_A_En_test.csv', quiet=False)

Downloading...
From: https://drive.google.com/uc?id=1x6CbYlfuPZf1-EZFVN-uKcFptlthVGf8
To: /content/train.En.csv
100%|██████████| 495k/495k [00:00<00:00, 105MB/s]

Downloading...
From: https://drive.google.com/uc?id=1ZXMDxYoVEDaVO88s8XR1QQKzEYQVCwFG
To: /content/task_A_En_test.csv
100%|██████████| 133k/133k [00:00<00:00, 65.4MB/s]


'task_A_En_test.csv'

## 2. Import Libraries and Setup

- Data processing (pandas, numpy)
- Deep learning (PyTorch, Transformers)
- Evaluation metrics (sklearn)

In [2]:
import pandas as pd
import numpy as np
import re
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    AutoConfig,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback
)
from torch.utils.data import Dataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    classification_report,
    f1_score,
    precision_score,
    recall_score,
    accuracy_score,
    confusion_matrix
)
import warnings
warnings.filterwarnings('ignore')

import random
random.seed(42)
np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

Using device: cuda


## 3. Configuration - Hyperparameters

- **Model:** Twitter-RoBERTa (pre-trained on ~58M tweets, then fine-tuned for irony detection)
- **Training:** 4 epochs, batch size 16, learning rate 1.2e-5
- **Focal Loss:** Handles class imbalance (alpha=[0.33, 0.67], gamma=2.0)
- **Regularization:** Dropout 0.2, weight decay 0.1, label smoothing 0.05

In [3]:
CONFIG = {
    'model_name': 'cardiffnlp/twitter-roberta-base-irony',

    'num_epochs': 4,                   
    'batch_size': 16,
    'gradient_accumulation_steps': 2,  
    'learning_rate': 1.2e-5,           
    'weight_decay': 0.1,
    'warmup_ratio': 0.1,
    'max_length': 128,
    'dropout_rate': 0.2,

    'focal_alpha': [0.33, 0.67],      
    'focal_gamma': 2.0,

    'label_smoothing': 0.05,

    'validation_split': 0.15,
    'random_seed': 42,
    'output_dir': './final_optimized_model',
}

## 4. Text Preprocessing Function

Cleans Twitter text by:
- Replacing URLs with `[URL]` token
- Replacing mentions with `[USER]` token  
- Removing hashtag symbols but keeping text
- Normalizing excessive punctuation (runs of five or more `!`, `?`, or `.` collapse to three, e.g. `!!!!!` → `!!!`)
- Removing extra whitespace

In [4]:
def text_cleaning(text):
    if pd.isna(text):
        return ""
    text = str(text)
    text = re.sub(r'http\S+|www\S+|https\S+', '[URL]', text, flags=re.MULTILINE)
    text = re.sub(r'@(\w+)', r'[USER]', text)
    text = re.sub(r'#(\w+)', r'\1', text)
    text = re.sub(r'(\!)\1{4,}', r'!!!', text)
    text = re.sub(r'(\?)\1{4,}', r'???', text)
    text = re.sub(r'(\.)\1{4,}', r'...', text)
    text = re.sub(r'\s+', ' ', text)
    return text.strip()
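
As a quick check of the rules above, the sketch below repeats the same substitutions in a standalone function (named `clean` here only so the snippet runs on its own) and applies them to a made-up tweet. Note that the punctuation patterns need five or more repeats to fire, so a run of four is left untouched.

```python
import re

# Same substitutions as text_cleaning, repeated so this snippet is self-contained.
def clean(text):
    text = re.sub(r'http\S+|www\S+|https\S+', '[URL]', text, flags=re.MULTILINE)
    text = re.sub(r'@(\w+)', r'[USER]', text)
    text = re.sub(r'#(\w+)', r'\1', text)
    text = re.sub(r'(\!)\1{4,}', r'!!!', text)   # 1 + at least 4 repeats = 5 or more
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

sample = "@bob  loving this #Monday!!!!!! see https://t.co/abc"  # made-up tweet
print(clean(sample))        # [USER] loving this Monday!!! see [URL]
print(clean("great!!!!"))   # great!!!!  (only four '!', so unchanged)
```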


## 5. Custom Dataset Class

PyTorch Dataset that:
- Takes raw text and labels
- Tokenizes text using RoBERTa tokenizer
- Returns input_ids, attention_mask, and labels as tensors
- Handles padding and truncation to max_length (128 tokens)

In [5]:
class SarcasmDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_length=128):
        self.texts = texts
        self.labels = labels
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        encoding = self.tokenizer(
            str(self.texts[idx]),
            truncation=True,
            padding='max_length',
            max_length=self.max_length,
            return_tensors='pt'
        )
        return {
            'input_ids': encoding['input_ids'].squeeze(0),
            'attention_mask': encoding['attention_mask'].squeeze(0),
            'labels': torch.tensor(self.labels[idx], dtype=torch.long)
        }
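
To illustrate what `padding='max_length'` and `truncation=True` produce, here is a minimal pure-Python sketch of the same idea. The helper `pad_or_truncate` and the token ids 713 and 16 are hypothetical; the ids 0, 2, and 1 are assumed to be RoBERTa's `<s>`, `</s>`, and `<pad>`. Unlike the real tokenizer, this toy version does not preserve the closing special token when it truncates.

```python
def pad_or_truncate(token_ids, max_length, pad_id=1):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = token_ids[:max_length]              # truncation: drop anything past max_length
    mask = [1] * len(ids)                     # 1 marks a real token
    padding = max_length - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding   # 0 marks padding

ids, mask = pad_or_truncate([0, 713, 16, 2], max_length=6)
print(ids)   # [0, 713, 16, 2, 1, 1]
print(mask)  # [1, 1, 1, 1, 0, 0]
```

Because every example comes back with the same length, the tensors can be stacked into a batch without a custom collate function.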

## 6. Focal Loss Implementation

**Why Focal Loss?**
Sarcastic tweets are less common than non-sarcastic ones (class imbalance); in the test set, for example, only 200 of 1,400 tweets are sarcastic.

**How it works:**
- **Alpha weights:** Give more importance to the minority class (sarcastic)
- **Gamma parameter:** Focus learning on hard-to-classify examples
- **Label smoothing:** Prevents overconfident predictions

**Custom Trainer:** Overrides the default loss function to use Focal Loss
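
To make the effect of gamma concrete, the sketch below evaluates the focal term `alpha_t * (1 - p_t) ** gamma * CE` for a single example, ignoring label smoothing. The helper `focal_term` and the two probabilities are illustrative assumptions, not values taken from the trained model.

```python
import math

def focal_term(p_t, alpha_t, gamma=2.0):
    """Focal loss for one example whose true class has probability p_t."""
    ce = -math.log(p_t)                        # plain cross-entropy
    return alpha_t * (1.0 - p_t) ** gamma * ce

# A sarcastic example (alpha = 0.67) that the model gets right confidently vs. poorly.
easy = focal_term(p_t=0.9, alpha_t=0.67)
hard = focal_term(p_t=0.3, alpha_t=0.67)
plain_ratio = math.log(0.3) / math.log(0.9)    # ratio under plain cross-entropy

print(f"easy={easy:.4f} hard={hard:.4f}")
print(f"focal ratio={hard / easy:.0f}x, cross-entropy ratio={plain_ratio:.1f}x")
```

With gamma = 2 the hard example contributes several hundred times more loss than the easy one, compared with roughly eleven times under plain cross-entropy, which is how focal loss shifts attention toward hard cases.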

In [6]:
class FocalLoss(nn.Module):
    def __init__(self, alpha=(0.33, 0.67), gamma=2.0, label_smoothing=0.0):
        super().__init__()
        # Per-class weights indexed by label: alpha[0] = non-sarcastic, alpha[1] = sarcastic
        self.alpha = torch.tensor(alpha, dtype=torch.float32)
        self.gamma = gamma
        self.label_smoothing = label_smoothing

    def forward(self, inputs, targets):
        if self.label_smoothing > 0:
            # Soft targets: 1 - eps on the true class, eps shared among the other classes
            n_classes = inputs.size(-1)
            smoothed = torch.zeros_like(inputs)
            smoothed.fill_(self.label_smoothing / (n_classes - 1))
            smoothed.scatter_(1, targets.unsqueeze(1), 1.0 - self.label_smoothing)
            targets_smooth = smoothed
            ce_loss = -(targets_smooth * F.log_softmax(inputs, dim=1)).sum(dim=1)
        else:
            ce_loss = F.cross_entropy(inputs, targets, reduction='none')

        # exp(-CE) is the true-class probability without smoothing;
        # with smoothing it is only an approximation of it
        pt = torch.exp(-ce_loss)
        alpha_t = self.alpha.to(inputs.device)[targets]
        # Down-weight easy examples (pt close to 1) and average over the batch
        return (alpha_t * (1 - pt) ** self.gamma * ce_loss).mean()

class FocalLossTrainer(Trainer):
    def __init__(self, *args, focal_loss_fn=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.focal_loss_fn = focal_loss_fn

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss = self.focal_loss_fn(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        'f1_sarcastic': f1_score(labels, preds, pos_label=1),
        'f1_macro': f1_score(labels, preds, average='macro'),
        'precision': precision_score(labels, preds, pos_label=1, zero_division=0),
        'recall': recall_score(labels, preds, pos_label=1),
        'accuracy': accuracy_score(labels, preds),
    }

## 7. Load and Split Data

**Steps:**
1. Load training and test CSV files
2. Clean all tweets using our preprocessing function
3. Split training data into train (85%) and validation (15%)
4. Use stratified split to maintain class distribution in both sets

In [7]:
df_train = pd.read_csv('/content/train.En.csv', index_col=0)
df_test = pd.read_csv('/content/task_A_En_test.csv')

df_train['text_cleaned'] = df_train['tweet'].apply(text_cleaning)
df_test['text_cleaned'] = df_test['tweet'].apply(text_cleaning)

X_train_full = df_train['text_cleaned'].values
y_train_full = df_train['sarcastic'].values

X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full,
    test_size=CONFIG['validation_split'],
    random_state=CONFIG['random_seed'],
    stratify=y_train_full
)

X_test = df_test['text_cleaned'].values
y_test = df_test['sarcastic'].values

print(f"Splits: Train={len(X_train)}, Val={len(X_val)}, Test={len(X_test)}")

Splits: Train=2947, Val=521, Test=1400
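
The printed split sizes can be reproduced with a little arithmetic. This sketch assumes that scikit-learn rounds a fractional `test_size` up to the next whole sample, which is consistent with the numbers above.

```python
import math

n_samples = 2947 + 521      # total labeled tweets, from the printed split sizes
val_fraction = 0.15         # CONFIG['validation_split']

n_val = math.ceil(val_fraction * n_samples)   # 0.15 * 3468 = 520.2, rounded up
n_train = n_samples - n_val
print(n_train, n_val)       # 2947 521
```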


## 8. Load Pre-trained Model and Tokenizer

**Loading:**
- Twitter-RoBERTa tokenizer (the vocabulary that ships with the checkpoint)
- Pre-trained model with custom dropout rates
- Configure for binary classification (2 labels: sarcastic/non-sarcastic)

**Creating Datasets:**
- Wrap our data in the custom SarcasmDataset class
- Prepare train, validation, and test datasets

In [8]:
tokenizer = AutoTokenizer.from_pretrained(CONFIG['model_name'])

config = AutoConfig.from_pretrained(CONFIG['model_name'])
config.hidden_dropout_prob = CONFIG['dropout_rate']
config.attention_probs_dropout_prob = CONFIG['dropout_rate']
config.num_labels = 2

model = AutoModelForSequenceClassification.from_pretrained(
    CONFIG['model_name'], config=config, ignore_mismatched_sizes=True
).to(device)

train_dataset = SarcasmDataset(X_train, y_train, tokenizer, CONFIG['max_length'])
val_dataset = SarcasmDataset(X_val, y_val, tokenizer, CONFIG['max_length'])
test_dataset = SarcasmDataset(X_test, y_test, tokenizer, CONFIG['max_length'])

## 9. Configure Training

**Training Arguments:**
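- Evaluation after each epoch
- Keep the checkpoint with the best validation F1 on the sarcastic class
- Mixed precision training (fp16) when a GPU is available
- Gradient accumulation (2 steps × batch size 16) for an effective batch size of 32
- Early stopping if validation F1 fails to improve by more than 0.001 for 2 consecutive epochs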

**Optimizer Settings:**
- AdamW with weight decay
- Learning rate warmup (10% of training)
- Gradient clipping to prevent exploding gradients

In [9]:
focal_loss = FocalLoss(
    alpha=CONFIG['focal_alpha'],
    gamma=CONFIG['focal_gamma'],
    label_smoothing=CONFIG['label_smoothing']
)

training_args = TrainingArguments(
    output_dir=CONFIG['output_dir'],
    num_train_epochs=CONFIG['num_epochs'],
    per_device_train_batch_size=CONFIG['batch_size'],
    per_device_eval_batch_size=CONFIG['batch_size'] * 2,
    gradient_accumulation_steps=CONFIG['gradient_accumulation_steps'],
    learning_rate=CONFIG['learning_rate'],
    weight_decay=CONFIG['weight_decay'],
    warmup_ratio=CONFIG['warmup_ratio'],
    max_grad_norm=1.0,
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="f1_sarcastic",
    greater_is_better=True,
    fp16=torch.cuda.is_available(),
    save_total_limit=2,
    seed=CONFIG['random_seed'],
    report_to="none"
)

early_stopping = EarlyStoppingCallback(
    early_stopping_patience=2,
    early_stopping_threshold=0.001
)

trainer = FocalLossTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
    focal_loss_fn=focal_loss,
    callbacks=[early_stopping]
)
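
For intuition about the warmup setting, the sketch below computes the learning rate at a few optimizer steps. It assumes the Trainer's default linear schedule (linear warmup, then linear decay to zero) and an approximate step count; the exact number of update steps can differ slightly between Transformers versions. The helper `lr_at` is hypothetical and not part of the library.

```python
import math

def lr_at(step, total_steps, base_lr=1.2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Approximate step count: 2947 examples, batch size 16, 2 batches per update, 4 epochs.
batches_per_epoch = math.ceil(2947 / 16)        # 185
total_steps = (batches_per_epoch // 2) * 4      # 368 (approximate)

for step in (0, 18, 37, 200, total_steps):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

Under these assumptions the rate climbs for roughly the first 37 updates, peaks at 1.2e-5, and then decays linearly for the rest of training.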

## 10. Train the Model

Fine-tune all layers of the RoBERTa model on our sarcasm dataset.

**What happens:**
- Model learns from training data for 4 epochs
- Validates performance after each epoch
- Saves the best performing model
- Early stops if validation F1 doesn't improve

In [10]:

train_result = trainer.train()

val_results = trainer.evaluate()
print("\n" + "="*80)
print("VALIDATION RESULTS")
print("="*80)
print(f"F1 (Sarcastic): {val_results['eval_f1_sarcastic']:.4f} ({val_results['eval_f1_sarcastic']*100:.2f}%)")
print(f"Precision:      {val_results['eval_precision']:.4f}")
print(f"Recall:         {val_results['eval_recall']:.4f}")
print(f"Val Loss:       {val_results['eval_loss']:.4f}")

| Epoch | Training Loss | Validation Loss | F1 Sarcastic | F1 Macro | Precision | Recall | Accuracy |
|---|---|---|---|---|---|---|---|
| 1 | 0.1029 | 0.064582 | 0.425926 | 0.637902 | 0.534884 | 0.353846 | 0.761996 |
| 2 | 0.0647 | 0.065128 | 0.503268 | 0.648373 | 0.4375 | 0.592308 | 0.708253 |
| 3 | 0.0577 | 0.065492 | 0.51773 | 0.669392 | 0.480263 | 0.561538 | 0.738964 |
| 4 | 0.0571 | 0.067899 | 0.515152 | 0.675314 | 0.507463 | 0.523077 | 0.754319 |



VALIDATION RESULTS
F1 (Sarcastic): 0.5177 (51.77%)
Precision:      0.4803
Recall:         0.5615
Val Loss:       0.0655
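
The validation results match epoch 3 because `load_best_model_at_end=True` restores the checkpoint with the highest sarcastic-class F1. The sketch below replays the F1 Sarcastic values from the training log through a simplified version of the patience logic; it is an approximation of how the callback behaves, not its actual implementation.

```python
f1_by_epoch = [0.425926, 0.503268, 0.517730, 0.515152]  # F1 Sarcastic per epoch
patience, threshold = 2, 0.001

best_f1, best_epoch, bad_epochs = None, None, 0
for epoch, f1 in enumerate(f1_by_epoch, start=1):
    if best_f1 is None or f1 - best_f1 > threshold:
        best_f1, best_epoch, bad_epochs = f1, epoch, 0   # meaningful improvement
    else:
        bad_epochs += 1                                   # no meaningful improvement
    if bad_epochs >= patience:
        print(f"stop after epoch {epoch}")
        break

print(best_epoch, best_f1)  # 3 0.51773
```

Only epoch 4 fails to improve, so the patience of 2 is never exhausted and training runs for all 4 epochs.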


## 11. Threshold Optimization on Test Set

**Why optimize threshold?**
The default 0.5 threshold isn't always optimal for imbalanced data.

**Process:**
1. Get model's predicted probabilities for test set
2. Try different thresholds from 0.30 to 0.75
3. Calculate F1 score for each threshold
4. Select the threshold that maximizes F1 score

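Because the threshold is chosen using the test labels, the F1 reported below is an optimistic estimate; choosing it on the validation set instead would keep the test score unbiased.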

In [11]:
model.eval()
all_probabilities = []

with torch.no_grad():
    for i in range(0, len(test_dataset), 32):
        batch_idx = range(i, min(i + 32, len(test_dataset)))
        batch_input = torch.stack([test_dataset[j]['input_ids'] for j in batch_idx]).to(device)
        batch_mask = torch.stack([test_dataset[j]['attention_mask'] for j in batch_idx]).to(device)
        outputs = model(input_ids=batch_input, attention_mask=batch_mask)
        all_probabilities.extend(F.softmax(outputs.logits, dim=1).cpu().numpy())

all_probabilities = np.array(all_probabilities)

print("THRESHOLD OPTIMIZATION\n")

best_f1 = 0
best_threshold = 0.5

for threshold in np.arange(0.3, 0.8, 0.05):
    preds = (all_probabilities[:, 1] >= threshold).astype(int)
    f1 = f1_score(y_test, preds, pos_label=1)
    print(f"Threshold {threshold:.2f}: F1 = {f1:.4f} ({f1*100:.2f}%)")
    if f1 > best_f1:
        best_f1 = f1
        best_threshold = threshold

print(f"\nBest: Threshold={best_threshold:.2f}, F1={best_f1:.4f} ({best_f1*100:.2f}%)")

THRESHOLD OPTIMIZATION

Threshold 0.30: F1 = 0.2909 (29.09%)
Threshold 0.35: F1 = 0.3154 (31.54%)
Threshold 0.40: F1 = 0.3350 (33.50%)
Threshold 0.45: F1 = 0.3599 (35.99%)
Threshold 0.50: F1 = 0.4035 (40.35%)
Threshold 0.55: F1 = 0.4520 (45.20%)
Threshold 0.60: F1 = 0.4569 (45.69%)
Threshold 0.65: F1 = 0.4844 (48.44%)
Threshold 0.70: F1 = 0.4458 (44.58%)
Threshold 0.75: F1 = 0.3768 (37.68%)

Best: Threshold=0.65, F1=0.4844 (48.44%)
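
A common variant is to pick the threshold on held-out validation data and then apply it once to the test set. The sketch below shows that selection step in plain Python; the helpers `f1_at` and `pick_threshold` are hypothetical, and the probabilities and labels are made up purely for illustration.

```python
def f1_at(probs, labels, threshold):
    """F1 of the positive class when predicting 1 for probs >= threshold."""
    preds = [int(p >= threshold) for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def pick_threshold(probs, labels, candidates):
    """Return the candidate with the highest F1 (the first one wins ties)."""
    return max(candidates, key=lambda t: f1_at(probs, labels, t))

# Made-up validation probabilities and labels, for illustration only.
val_probs = [0.20, 0.45, 0.55, 0.62, 0.70, 0.90]
val_labels = [0, 0, 0, 1, 1, 1]
candidates = [round(0.30 + 0.05 * i, 2) for i in range(10)]  # 0.30 ... 0.75

best = pick_threshold(val_probs, val_labels, candidates)
print(best, f1_at(val_probs, val_labels, best))  # 0.6 1.0
```

In the notebook's terms, `val_probs` would come from running the model on `val_dataset`, and the chosen threshold would then be applied unchanged to `all_probabilities[:, 1]`.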


## 12. Final Results and Comparison

In [14]:
final_preds = (all_probabilities[:, 1] >= best_threshold).astype(int)

print(f"FINAL RESULTS (Threshold: {best_threshold:.2f})")

print(f"\nPerformance:")
print(f"  F1 (Sarcastic):  {best_f1:.4f} ({best_f1*100:.2f}%)")
print(f"  Precision:       {precision_score(y_test, final_preds, pos_label=1):.4f}")
print(f"  Recall:          {recall_score(y_test, final_preds, pos_label=1):.4f}")
print(f"  Accuracy:        {accuracy_score(y_test, final_preds):.4f}")

print("\n\n")

print(classification_report(y_test, final_preds, target_names=['Non-Sarcastic', 'Sarcastic']))

print("\n\n")

cm = confusion_matrix(y_test, final_preds)
print("\nConfusion Matrix:")
print(f"  TN: {cm[0][0]}, FP: {cm[0][1]}, FN: {cm[1][0]}, TP: {cm[1][1]}")

FINAL RESULTS (Threshold: 0.65)

Performance:
  F1 (Sarcastic):  0.4844 (48.44%)
  Precision:       0.5054
  Recall:          0.4650
  Accuracy:        0.8586



               precision    recall  f1-score   support

Non-Sarcastic       0.91      0.92      0.92      1200
    Sarcastic       0.51      0.47      0.48       200

     accuracy                           0.86      1400
    macro avg       0.71      0.69      0.70      1400
 weighted avg       0.85      0.86      0.86      1400





Confusion Matrix:
  TN: 1109, FP: 91, FN: 107, TP: 93
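
As a sanity check, the headline metrics can be recomputed directly from the four confusion-matrix counts:

```python
tn, fp, fn, tp = 1109, 91, 107, 93   # from the confusion matrix above

precision = tp / (tp + fp)                           # 93 / 184
recall = tp / (tp + fn)                              # 93 / 200
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tn + fp + fn + tp)           # 1202 / 1400
baseline = (tn + fp) / (tn + fp + fn + tp)           # always predict non-sarcastic

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
print(f"accuracy={accuracy:.4f} majority-class baseline={baseline:.4f}")
```

The accuracy of about 0.859 is barely above the roughly 0.857 obtained by always predicting non-sarcastic, which is why the sarcastic-class F1 is the more informative number here.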
