**World News Sentiment Classifier**

*   **Objective**: Fine-tune *microsoft/deberta-v3-large* to classify world news headlines into three sentiment categories:
  *   positive
  *   neutral
  *   negative
*   **Resource**:
  *   [Model](https://huggingface.co/microsoft/deberta-v3-large)

# Environment Setup

*   Install Libraries

In [38]:
!pip install transformers datasets evaluate wandb scikit-learn



*   Import Libraries

In [39]:
import re
import numpy as np
import pandas as pd
import torch
import evaluate
from sklearn.model_selection import train_test_split
from datasets import Dataset, DatasetDict
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback,
    DataCollatorWithPadding,
    set_seed
)
import wandb

*   Set Random Seed for Reproducibility

In [40]:
SEED = 42
set_seed(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Tracking

*   Initialize Wandb

In [41]:
wandb.init(
    project="deberta-v3-large_world_news_sentiment_classifier",
    config={
        "model": "microsoft/deberta-v3-large",
        "seed": SEED,
        "batch_size": 8,
        "learning_rate": 6e-6,
        "num_train_epochs": 6,
        "dataset_size": 5652
    }
)

# Data Preparation

*   Define Label Mapping

In [42]:
LABEL_MAP = {
    "positive": 0,
    "neutral": 1,
    "negative": 2
}
ID2LABEL = {v: k for k, v in LABEL_MAP.items()}
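
As a quick sanity check on the two dictionaries, the sketch below encodes a few label names and decodes them back. The `raw_labels` list is made up for illustration and is not taken from the dataset.

```python
# Same mapping as in the cell above.
LABEL_MAP = {"positive": 0, "neutral": 1, "negative": 2}
ID2LABEL = {v: k for k, v in LABEL_MAP.items()}

# Hypothetical raw labels, used only for illustration.
raw_labels = ["negative", "positive", "neutral", "negative"]

# Encode names to integer ids, then decode the ids back to names.
encoded = [LABEL_MAP[name] for name in raw_labels]
decoded = [ID2LABEL[idx] for idx in encoded]

print(encoded)
print(decoded == raw_labels)
```

Because the mapping is one-to-one, the round trip should return the original names unchanged.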

*   Load Dataset

In [43]:
df = pd.read_csv("cleaned_world_news_sentiment.csv")

*   Process Filtering and Mapping

In [44]:
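# Keep only rows with a known label; copy so the assignment below does not write to a view of the original frame.
df = df[df['sentiment'].isin(LABEL_MAP.keys())].copy()
df['sentiment'] = df['sentiment'].map(LABEL_MAP).astype(int)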

*   Process Sampling

In [45]:
SAMPLES_PER_LABEL = 1884
df = (
    df.groupby('sentiment', group_keys=False)
    .apply(lambda x: x.sample(n=SAMPLES_PER_LABEL, random_state=SEED))
    .reset_index(drop=True)
)



In [46]:
print(f"Dataset Scale: {len(df)}")
print(f"Sentiment Distribution:\n{df['sentiment'].value_counts()}")

Dataset Scale: 5652
Sentiment Distribution:
sentiment
0    1884
1    1884
2    1884
Name: count, dtype: int64
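
The balancing step above draws the same number of rows from every class. A minimal standard-library sketch of that idea follows; `balanced_sample` and the toy rows are hypothetical names and data introduced here, and the random draws will not match what pandas selects.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(rows, label_key, n_per_label, seed):
    """Draw n_per_label rows from each label group without replacement."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)

    sampled = []
    for label in sorted(groups):
        # Re-seed per group, similar in spirit to passing random_state=SEED
        # to each per-group sample call in the notebook.
        rng = random.Random(seed)
        sampled.extend(rng.sample(groups[label], n_per_label))
    return sampled


# Hypothetical imbalanced toy data: 5 / 8 / 3 rows per class.
toy_rows = (
    [{"title": f"pos {i}", "sentiment": 0} for i in range(5)]
    + [{"title": f"neu {i}", "sentiment": 1} for i in range(8)]
    + [{"title": f"neg {i}", "sentiment": 2} for i in range(3)]
)

balanced = balanced_sample(toy_rows, "sentiment", n_per_label=3, seed=42)
counts = Counter(row["sentiment"] for row in balanced)
print(counts)
```

Sampling without replacement fails if a class has fewer rows than requested, so every class needs at least `SAMPLES_PER_LABEL` rows for the notebook cell to run.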


*   Split Dataset

In [47]:
train_val, test_df = train_test_split(
    df,
    test_size=0.1,
    stratify=df['sentiment'],
    random_state=SEED
)

train_df, val_df = train_test_split(
    train_val,
    test_size=0.1,
    stratify=train_val['sentiment'],
    random_state=SEED
)

In [48]:
print(f"Train: {len(train_df)}, Validation: {len(val_df)}, Test: {len(test_df)}")

Train: 4577, Validation: 509, Test: 566
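
The printed sizes can be checked by hand. The sketch below assumes the held-out part of each split is rounded up to a whole sample, which is consistent with the counts above; `split_sizes` is a hypothetical helper, not part of scikit-learn.

```python
import math


def split_sizes(n_samples, test_fraction):
    """Return (n_train, n_test), rounding the held-out part up."""
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test


n_total = 5652

# First split: hold out 10% of everything as the test set.
n_train_val, n_test = split_sizes(n_total, 0.1)

# Second split: hold out 10% of the remainder as the validation set.
n_train, n_val = split_sizes(n_train_val, 0.1)

print(n_train, n_val, n_test)
print(f"validation share of full dataset: {n_val / n_total:.3f}")
```

Since the second split takes 10% of the remaining 90%, the validation set is roughly 9% of the full dataset rather than 10%.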


*   Convert pandas DataFrame to Hugging Face DatasetDict

In [49]:
dataset = DatasetDict({
    "train": Dataset.from_pandas(train_df, preserve_index=False),
    "validation": Dataset.from_pandas(val_df, preserve_index=False),
    "test": Dataset.from_pandas(test_df, preserve_index=False)
})

# Tokenizing

*   Initialize Model & Tokenizer

In [50]:
MODEL_NAME = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.model_max_length = 256



*   Define Tokenization Function

In [51]:
def tokenize_fn(batch):
    return tokenizer(
        batch["title"],
        truncation=True,
        max_length=256,
        padding=False,
        add_special_tokens=True
    )
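
With `padding=False`, each headline keeps its own token length, and padding is deferred to batch time (the `DataCollatorWithPadding` used later in this notebook). The sketch below illustrates that idea in plain Python; `pad_batch`, the token ids, and the pad id of 0 are assumptions made for illustration, not the collator's actual implementation.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id lists to the longest one and build attention masks.

    pad_id=0 is an assumed padding id; read it from the tokenizer in practice.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    attention_mask = [
        [1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Hypothetical token ids for three headlines of different lengths.
batch = pad_batch([[1, 101, 102, 2], [1, 103, 2], [1, 104, 105, 106, 107, 2]])

for ids, mask in zip(batch["input_ids"], batch["attention_mask"]):
    print(ids, mask)
```

Padding only to the longest sequence in each batch, rather than to 256 tokens everywhere, keeps short-headline batches small.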

*   Tokenize Dataset & Format for PyTorch

In [52]:
tokenized_ds = dataset.map(
    tokenize_fn,
    batched=True,
    batch_size=2048,
    remove_columns=["title"],
    num_proc=8
)
tokenized_ds = tokenized_ds.rename_column("sentiment", "labels")


# Model Initialization

*   Load Pretrained Model

In [53]:
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=3,
    id2label=ID2LABEL,
    label2id=LABEL_MAP
)

Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-large and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


*   Set Training Arguments

In [54]:
training_args = TrainingArguments(
    output_dir="./deberta-v3-large_world_news_sentiment_classifier-checkpoints",
    save_strategy="steps",
    save_steps=300,
    save_total_limit=20,

    eval_strategy="steps",
    do_eval=True,
    eval_steps=300,
    load_best_model_at_end=True,

    metric_for_best_model="f1",
    greater_is_better=True,

    learning_rate=6e-6,
    weight_decay=0.01,
    warmup_steps=300,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",

    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=6,

    fp16=True,
    bf16=False,
    max_grad_norm=1.0,

    logging_steps=300,
    report_to="wandb",

    seed=SEED,
    dataloader_num_workers=2
)
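
To see how these arguments interact, the step counts and the learning-rate curve can be worked out by hand. The sketch below assumes a single device, no gradient accumulation, and a final partial batch that is kept; `linear_lr` is a simplified stand-in for the linear warmup and decay schedule, not the library's scheduler.

```python
import math

TRAIN_SIZE = 4577      # training rows after the split
BATCH_SIZE = 8         # per_device_train_batch_size
EPOCHS = 6             # num_train_epochs
WARMUP_STEPS = 300     # warmup_steps
EVAL_STEPS = 300       # eval_steps
BASE_LR = 6e-6         # learning_rate

steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
n_evals = total_steps // EVAL_STEPS


def linear_lr(step):
    """Linear warmup to BASE_LR, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = total_steps - step
    return BASE_LR * max(0.0, remaining / (total_steps - WARMUP_STEPS))


print(steps_per_epoch, total_steps, n_evals)
for step in (0, 150, 300, 1800, total_steps):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the warmup covers a little under 9% of training, and the peak learning rate is reached at the first evaluation step.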

# Metrics & Trainer

*   Define Metric Functions

In [55]:
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    preds = np.argmax(preds, axis=1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        "f1": f1.compute(predictions=preds, references=labels, average="macro")["f1"]
    }
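
Macro F1 averages the per-class F1 scores with equal weight, so no single class dominates the metric. The sketch below recomputes it without the `evaluate` library on a small, made-up set of logits and labels; `macro_f1` is a hypothetical helper written for illustration.

```python
def macro_f1(predictions, references, num_labels=3):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in range(num_labels):
        pairs = list(zip(predictions, references))
        tp = sum(p == label and r == label for p, r in pairs)
        fp = sum(p == label and r != label for p, r in pairs)
        fn = sum(p != label and r == label for p, r in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_labels


# Hypothetical logits for six headlines (rows) over three classes (columns).
logits = [
    [2.0, 0.1, -1.0],
    [0.2, 1.5, 0.0],
    [-0.5, 2.2, 0.1],
    [0.0, 0.9, 0.3],
    [-1.0, 0.0, 3.0],
    [1.2, 0.1, 0.8],
]
labels = [0, 0, 1, 1, 2, 2]

# Argmax over each row, as compute_metrics does with np.argmax(..., axis=1).
preds = [max(range(len(row)), key=row.__getitem__) for row in logits]

accuracy_value = sum(p == r for p, r in zip(preds, labels)) / len(labels)
f1_value = macro_f1(preds, labels)
print(preds, round(accuracy_value, 4), round(f1_value, 4))
```

On a balanced dataset like this one, accuracy and macro F1 tend to stay close together, which matches the training log.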

*   Initialize Trainer

In [56]:
data_collator = DataCollatorWithPadding(tokenizer)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_ds["train"],
    eval_dataset=tokenized_ds["validation"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=15)],
)

# Train & Evaluate

In [57]:
trainer.train()



| Step | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 300 | 0.9831 | 0.674727 | 0.632613 | 0.551556 |
| 600 | 0.3551 | 0.178485 | 0.960707 | 0.960639 |
| 900 | 0.1164 | 0.158937 | 0.972495 | 0.972573 |
| 1200 | 0.1189 | 0.187177 | 0.966601 | 0.966434 |
| 1500 | 0.0501 | 0.239793 | 0.956778 | 0.956729 |
| 1800 | 0.0293 | 0.203145 | 0.966601 | 0.966741 |
| 2100 | 0.0147 | 0.175513 | 0.972495 | 0.972547 |
| 2400 | 0.0204 | 0.235171 | 0.966601 | 0.966652 |
| 2700 | 0.0035 | 0.212162 | 0.968566 | 0.968625 |
| 3000 | 0.0148 | 0.206356 | 0.972495 | 0.97252 |


TrainOutput(global_step=3438, training_loss=0.14977117601420173, metrics={'train_runtime': 799.7312, 'train_samples_per_second': 34.339, 'train_steps_per_second': 4.299, 'total_flos': 1746728275251834.0, 'train_loss': 0.14977117601420173, 'epoch': 6.0})
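
The logged F1 values can be replayed against a simplified patience rule to see why training ran all six epochs. The sketch below is an approximation of how a patience counter behaves, not the `EarlyStoppingCallback` source; `early_stop_index` is a hypothetical helper.

```python
def early_stop_index(metric_history, patience, threshold=0.0):
    """Return the index of the evaluation that triggers a stop, or None.

    Assumes a higher metric is better, as with F1 here.
    """
    best = None
    stale = 0
    for i, value in enumerate(metric_history):
        if best is None or value - best > threshold:
            best = value
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None


# F1 at each logged evaluation (steps 300 through 3000).
f1_log = [
    0.551556, 0.960639, 0.972573, 0.966434, 0.956729,
    0.966741, 0.972547, 0.966652, 0.968625, 0.97252,
]

print(early_stop_index(f1_log, patience=15))  # configured patience
print(early_stop_index(f1_log, patience=3))   # a tighter, hypothetical setting
```

Under this rule, a patience of 15 needs at least 16 evaluations before it can fire, which is more than a 3438-step run evaluated every 300 steps provides. A smaller patience would be needed for early stopping to have any effect here.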

*   Evaluate on the Validation Set

In [58]:
val_results = trainer.evaluate(tokenized_ds["validation"])
print(f"Validation Accuracy: {val_results['eval_accuracy']:.4f}")
print(f"Validation F1: {val_results['eval_f1']:.4f}")

Validation Accuracy: 0.9725
Validation F1: 0.9726


*   Evaluate on the Test Set

In [59]:
test_results = trainer.evaluate(tokenized_ds["test"])
print(f"Test Accuracy: {test_results['eval_accuracy']:.4f}")
print(f"Test F1: {test_results['eval_f1']:.4f}")

Test Accuracy: 0.9859
Test F1: 0.9859


*   Save Model

In [60]:
trainer.save_model("deberta-v3-large-world-news-sentiment-classifier")
tokenizer.save_pretrained("deberta-v3-large-world-news-sentiment-classifier")

('deberta-v3-large-world-news-sentiment-classifier/tokenizer_config.json',
 'deberta-v3-large-world-news-sentiment-classifier/special_tokens_map.json',
 'deberta-v3-large-world-news-sentiment-classifier/spm.model',
 'deberta-v3-large-world-news-sentiment-classifier/added_tokens.json',
 'deberta-v3-large-world-news-sentiment-classifier/tokenizer.json')

*   Finish Wandb

In [61]:
wandb.finish()

Run history:
eval/accuracy,▁███▇████████
eval/f1,▁████████████
eval/loss,█▂▂▂▃▂▂▃▂▂▂▂▁
eval/runtime,▂▃▁▁▁▂▁▃▂▂▃▁█
eval/samples_per_second,▅▃█▇█▄▇▁▄▃▂▇▃
eval/steps_per_second,▅▃█▇█▄▇▁▄▃▂▇▃
train/epoch,▁▁▂▂▂▂▃▃▄▄▄▄▅▅▆▆▆▆▇▇█████
train/global_step,▁▁▂▂▂▂▃▃▄▄▄▄▅▅▆▆▆▆▇▇█████
train/grad_norm,▁▁█▁▁▁▁▁▁▁▁
train/learning_rate,█▇▇▆▅▅▄▃▂▂▁

Run summary:
eval/accuracy,0.98587
eval/f1,0.98586
eval/loss,0.09534
eval/runtime,4.585
eval/samples_per_second,123.446
eval/steps_per_second,15.485
total_flos,1746728275251834.0
train/epoch,6.0
train/global_step,3438.0
train/grad_norm,0.00348
