**Business News Sentiment Classifier**

*   **Objective**: Fine-tune *microsoft/deberta-v3-large* to classify business news headlines into three sentiment categories:
  *   positive
  *   neutral
  *   negative
*   **Resource**
  *   [Model](https://huggingface.co/microsoft/deberta-v3-large)

# Environment Setup

*   Install Libraries

In [1]:
!pip install transformers datasets evaluate wandb scikit-learn


*   Import Libraries

In [2]:
import re
import numpy as np
import pandas as pd
import torch
import evaluate
from sklearn.model_selection import train_test_split
from datasets import Dataset, DatasetDict
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback,
    DataCollatorWithPadding,
    set_seed
)
import wandb

*   Set Random Seed for Reproducibility

In [3]:
SEED = 42
set_seed(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Tracking

*   Initialize Wandb

In [4]:
wandb.init(
    project="deberta-v3-large_business_news_sentiment_classifier",
    config={
        "model": "microsoft/deberta-v3-large",
        "seed": SEED,
        "batch_size": 8,
        "learning_rate": 6e-6,
        "num_train_epochs": 3,
        "dataset_size": 136995
    }
)



# Data Preparation

*   Define Label Mapping

In [5]:
LABEL_MAP = {
    "positive": 0,
    "neutral": 1,
    "negative": 2
}
ID2LABEL = {v: k for k, v in LABEL_MAP.items()}
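
The model config later receives both directions of this mapping, so the two dictionaries need to be exact inverses. A minimal sanity check, with the mapping repeated so that the snippet runs on its own:

```python
# Label mapping repeated from the cell above so this snippet is self-contained.
LABEL_MAP = {"positive": 0, "neutral": 1, "negative": 2}
ID2LABEL = {v: k for k, v in LABEL_MAP.items()}

# Every label should survive a round trip through both dictionaries.
for name, idx in LABEL_MAP.items():
    assert ID2LABEL[idx] == name

# IDs should form the contiguous range 0..num_labels-1 used by the classifier head.
assert sorted(ID2LABEL) == list(range(len(LABEL_MAP)))
```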

*   Load Dataset

In [6]:
df = pd.read_csv("cleaned_business_news_sentiment.csv")

*   Filter Rows and Map Labels to Integer IDs

In [7]:
# Keep only rows with a known label; copy so the assignment below does not act on a view
df = df[df['sentiment'].isin(LABEL_MAP.keys())].copy()
df['sentiment'] = df['sentiment'].map(LABEL_MAP).astype(int)

*   Sample an Equal Number of Rows per Label

In [8]:
SAMPLES_PER_LABEL = 45665
df = (
    df.groupby('sentiment', group_keys=False)
    .apply(lambda x: x.sample(n=SAMPLES_PER_LABEL, random_state=SEED))
    .reset_index(drop=True)
)


In [9]:
print(f"Dataset Scale: {len(df)}")
print(f"Sentiment Distribution:\n{df['sentiment'].value_counts()}")

Dataset Scale: 136995
Sentiment Distribution:
sentiment
0    45665
1    45665
2    45665
Name: count, dtype: int64
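
The same per-class sampling can also be written with `DataFrameGroupBy.sample`, which avoids the `groupby(...).apply(...)` pattern. The sketch below runs on a hypothetical toy frame rather than the real CSV. The column names `title` and `sentiment` follow this notebook, and everything else is illustrative.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real CSV.
toy = pd.DataFrame({
    "title": [f"headline {i}" for i in range(12)],
    "sentiment": [0] * 6 + [1] * 4 + [2] * 2,
})

# Draw the same number of rows from every class.
# Without replacement, n cannot exceed the size of the smallest class.
n_per_label = int(toy["sentiment"].value_counts().min())
balanced = (
    toy.groupby("sentiment")
    .sample(n=n_per_label, random_state=42)
    .reset_index(drop=True)
)

print(balanced["sentiment"].value_counts().sort_index().to_dict())
```

Fixing `random_state` keeps the drawn subset stable across reruns, which matters here because the train, validation, and test splits are derived from it.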


*   Split Dataset

In [10]:
train_val, test_df = train_test_split(
    df,
    test_size=0.1,
    stratify=df['sentiment'],
    random_state=SEED
)

train_df, val_df = train_test_split(
    train_val,
    test_size=0.1,
    stratify=train_val['sentiment'],
    random_state=SEED
)

In [11]:
print(f"Train: {len(train_df)}, Validation: {len(val_df)}, Test: {len(test_df)}")

Train: 110965, Validation: 12330, Test: 13700
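
The reported sizes follow from applying a 10% hold-out twice, with the held-out share rounded up each time. The arithmetic below is a sketch that assumes this rounding behavior. The helper `split_sizes` is hypothetical and only reproduces the counts.

```python
import math

def split_sizes(n_samples: int, test_size: float) -> tuple[int, int]:
    """Return (n_train, n_test), rounding the held-out share up."""
    n_test = math.ceil(n_samples * test_size)
    return n_samples - n_test, n_test

total = 136_995
n_train_val, n_test = split_sizes(total, 0.1)    # first split: hold out the test set
n_train, n_val = split_sizes(n_train_val, 0.1)   # second split: hold out validation

print(n_train, n_val, n_test)
print(f"{n_train / total:.1%} / {n_val / total:.1%} / {n_test / total:.1%}")
```

Because the second split takes 10% of the remaining 90%, the effective proportions are roughly 81% train, 9% validation, and 10% test.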


*   Convert pandas DataFrame to Hugging Face DatasetDict

In [12]:
dataset = DatasetDict({
    "train": Dataset.from_pandas(train_df, preserve_index=False),
    "validation": Dataset.from_pandas(val_df, preserve_index=False),
    "test": Dataset.from_pandas(test_df, preserve_index=False)
})

# Tokenizing

*   Initialize Model & Tokenizer

In [13]:
MODEL_NAME = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.model_max_length = 256


*   Define Tokenization Function

In [14]:
def tokenize_fn(batch):
    return tokenizer(
        batch["title"],
        truncation=True,
        max_length=256,
        padding=False,
        add_special_tokens=True
    )
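
With `padding=False`, each headline keeps its natural token length, and padding is deferred to batch time, where `DataCollatorWithPadding` pads only up to the longest sequence in each batch. The pure-Python sketch below illustrates that idea. `pad_batch`, the pad ID of 0, and the token IDs are hypothetical stand-ins, not the collator's actual implementation.

```python
def pad_batch(sequences, pad_id=0):
    """Pad every sequence to the longest one in this batch only."""
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Hypothetical token IDs for three short headlines.
batch = pad_batch([[1, 45, 87, 2], [1, 12, 2], [1, 9, 33, 71, 5, 2]])

# The batch width equals the longest sequence in the batch, not max_length=256.
print(len(batch["input_ids"][0]))
print(batch["attention_mask"][1])
```

For short texts such as headlines, this typically wastes far less compute than padding every example to the 256-token limit.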

*   Tokenize Dataset & Rename the Label Column

In [15]:
tokenized_ds = dataset.map(
    tokenize_fn,
    batched=True,
    batch_size=2048,
    remove_columns=["title"],
    num_proc=8
)
tokenized_ds = tokenized_ds.rename_column("sentiment", "labels")


# Model Initialization

*   Load Pretrained Model

In [16]:
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=3,
    id2label=ID2LABEL,
    label2id=LABEL_MAP
)


Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-large and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


*   Set Training Arguments

In [17]:
training_args = TrainingArguments(
    output_dir="./deberta-v3-large_business_news_sentiment_classifier-checkpoints",
    save_strategy="steps",
    save_steps=1000,
    save_total_limit=20,

    eval_strategy="steps",
    do_eval=True,
    eval_steps=1000,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    greater_is_better=True,

    learning_rate=6e-6,
    weight_decay=0.01,
    warmup_steps=300,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",

    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,

    fp16=True,
    bf16=False,
    max_grad_norm=1.0,

    logging_steps=1000,
    report_to="wandb",

    seed=SEED,
    dataloader_num_workers=8
)
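
To see what these arguments imply, the total number of optimizer steps and the shape of a linear schedule with warmup can be worked out by hand. The sketch below assumes a single device and no gradient accumulation. `linear_lr` is a hypothetical helper that mirrors the usual warmup-then-linear-decay formula, not code taken from the library.

```python
import math

TRAIN_SIZE = 110_965   # training rows from the split in this notebook
BATCH_SIZE = 8         # per_device_train_batch_size; one device assumed
EPOCHS = 3
WARMUP_STEPS = 300
PEAK_LR = 6e-6

steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def linear_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - WARMUP_STEPS))

print(total_steps)
for step in (0, WARMUP_STEPS, total_steps // 2, total_steps):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the step count matches the `global_step` reported by `trainer.train()`, and with `eval_steps=1000` the model is evaluated and checkpointed 41 times over the run.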

# Metrics & Trainer

*   Define Metric Functions

In [18]:
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    preds = np.argmax(preds, axis=1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        "f1": f1.compute(predictions=preds, references=labels, average="macro")["f1"]
    }
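
Macro F1 is the unweighted mean of the per-class F1 scores, so each sentiment class counts equally. The dependency-free sketch below computes both metrics by hand on hypothetical predictions. The helper name and the toy data are illustrative, and the results should agree with what the `evaluate` metrics return for the same inputs.

```python
def accuracy_and_macro_f1(preds, labels, num_classes=3):
    """Compute accuracy and unweighted (macro) F1 from integer class IDs."""
    correct = sum(p == y for p, y in zip(preds, labels))
    f1_scores = []
    for c in range(num_classes):
        tp = sum(p == c and y == c for p, y in zip(preds, labels))
        fp = sum(p == c and y != c for p, y in zip(preds, labels))
        fn = sum(p != c and y == c for p, y in zip(preds, labels))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return correct / len(labels), sum(f1_scores) / num_classes

# Hypothetical predictions, not taken from the actual run.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
acc, macro_f1 = accuracy_and_macro_f1(preds, labels)
print(f"accuracy={acc:.4f}, macro_f1={macro_f1:.4f}")
```

On a class-balanced dataset like this one, accuracy and macro F1 tend to track each other closely, which is consistent with the near-identical values in the training log.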


*   Initialize Trainer

In [19]:
data_collator = DataCollatorWithPadding(tokenizer)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_ds["train"],
    eval_dataset=tokenized_ds["validation"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=15)],
)
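
With evaluation every 1,000 steps, a patience of 15 means training stops only after 15 consecutive evaluations (15,000 steps) without a new best F1. The sketch below is a simplified model of that counter. `first_stop_index` is a hypothetical helper, not the callback's actual code, which tracks the best metric through the trainer state.

```python
def first_stop_index(metric_history, patience, threshold=0.0):
    """Return the index of the evaluation that would trigger a stop, or None."""
    best = None
    bad_evals = 0
    for i, value in enumerate(metric_history):
        if best is None or value - best > threshold:
            best = value       # a new best score resets the counter
            bad_evals = 0
        else:
            bad_evals += 1     # no improvement at this evaluation
        if bad_evals >= patience:
            return i
    return None

# Validation F1 at steps 1000..10000, copied from the training log in this notebook.
f1_history = [0.959372, 0.970571, 0.974901, 0.976084, 0.980456,
              0.982811, 0.984917, 0.981346, 0.983034, 0.985895]

print(first_stop_index(f1_history, patience=15))  # never triggers on this excerpt
print(first_stop_index(f1_history, patience=2))   # a much stricter setting would stop early
```

On this excerpt the longest run without improvement is two evaluations (steps 8000 and 9000), so a patience of 15 leaves plenty of room for the temporary dips seen in the log.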

# Train & Evaluate

In [20]:
trainer.train()



| Step | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1000 | 0.4664 | 0.216541 | 0.959367 | 0.959372 |
| 2000 | 0.1801 | 0.173803 | 0.97056 | 0.970571 |
| 3000 | 0.1234 | 0.153648 | 0.974858 | 0.974901 |
| 4000 | 0.125 | 0.145937 | 0.976075 | 0.976084 |
| 5000 | 0.1254 | 0.092853 | 0.980454 | 0.980456 |
| 6000 | 0.1135 | 0.091413 | 0.982806 | 0.982811 |
| 7000 | 0.1071 | 0.068909 | 0.984915 | 0.984917 |
| 8000 | 0.0977 | 0.109545 | 0.981346 | 0.981346 |
| 9000 | 0.0921 | 0.09066 | 0.983049 | 0.983034 |
| 10000 | 0.0935 | 0.072686 | 0.985888 | 0.985895 |


TrainOutput(global_step=41613, training_loss=0.060704488479030864, metrics={'train_runtime': 13444.5878, 'train_samples_per_second': 24.761, 'train_steps_per_second': 3.095, 'total_flos': 2.1259149660644616e+16, 'train_loss': 0.060704488479030864, 'epoch': 3.0})
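
The throughput figures in `TrainOutput` can be cross-checked against the dataset size and step count. This is a small arithmetic sketch using only numbers reported in this notebook:

```python
TRAIN_SIZE = 110_965      # training rows
EPOCHS = 3
GLOBAL_STEPS = 41_613     # global_step from TrainOutput
RUNTIME_S = 13_444.5878   # train_runtime from TrainOutput, in seconds

samples_per_second = TRAIN_SIZE * EPOCHS / RUNTIME_S
steps_per_second = GLOBAL_STEPS / RUNTIME_S
hours = RUNTIME_S / 3600

print(f"{samples_per_second:.3f} samples/s, {steps_per_second:.3f} steps/s, {hours:.2f} h")
```

Both rates agree with the reported `train_samples_per_second` and `train_steps_per_second`, and the full three-epoch run takes a little under four hours.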

*   Evaluate on the Validation Set

In [21]:
val_results = trainer.evaluate(tokenized_ds["validation"])
print(f"Validation Accuracy: {val_results['eval_accuracy']:.4f}")
print(f"Validation F1: {val_results['eval_f1']:.4f}")

Validation Accuracy: 0.9904
Validation F1: 0.9904


*   Evaluate on the Test Set

In [22]:
test_results = trainer.evaluate(tokenized_ds["test"])
print(f"Test Accuracy: {test_results['eval_accuracy']:.4f}")
print(f"Test F1: {test_results['eval_f1']:.4f}")

Test Accuracy: 0.9896
Test F1: 0.9896


*   Save Model

In [23]:
trainer.save_model("deberta-v3-large-business-news-sentiment-classifier")
tokenizer.save_pretrained("deberta-v3-large-business-news-sentiment-classifier")

('deberta-v3-large-business-news-sentiment-classifier/tokenizer_config.json',
 'deberta-v3-large-business-news-sentiment-classifier/special_tokens_map.json',
 'deberta-v3-large-business-news-sentiment-classifier/spm.model',
 'deberta-v3-large-business-news-sentiment-classifier/added_tokens.json',
 'deberta-v3-large-business-news-sentiment-classifier/tokenizer.json')
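
Once the saved model produces logits for a headline, turning them into a sentiment label is a softmax followed by an argmax and a lookup in `ID2LABEL`. The sketch below shows only that decoding step. The logits are hypothetical, and `softmax` and `decode` are illustrative helpers rather than part of the saved artifacts.

```python
import math

# Mapping repeated from the label-definition cell so this snippet is self-contained.
ID2LABEL = {0: "positive", 1: "neutral", 2: "negative"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Map one row of logits to (label, probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits for a single headline, not produced by the trained model.
label, prob = decode([-1.2, 0.3, 4.1])
print(label, round(prob, 3))
```

Because `id2label` and `label2id` were passed when the model was loaded, the saved config carries the same mapping, so downstream code does not need to hard-code it.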

*   Finish Wandb

In [24]:
wandb.finish()

Run history:

| Metric | History |
|---|---|
| eval/accuracy | ▁▄▄▅▆▆▇▆▆▇▇▇▇▆▇▇▇▇▇▇▇▇██████████████████ |
| eval/f1 | ▁▄▄▅▆▆▇▆▆▇▇▇▇▆▇▇▇▇▇▇▇▇██████████████████ |
| eval/loss | █▆▅▅▂▂▁▃▂▁▂▁▂▄▂▁▂▃▁▂▂▂▁▁▁▁▂▂▂▂▂▂▂▂▂▁▂▂▁▂ |
| eval/runtime | ▁▁▁▂▁▂▁▁▂▂▁▁▁▂▂▂▂▁▁▂▂▂▂▁▂▁▂▂▂▂▁▁▂▁▂▂▂▂▂█ |
| eval/samples_per_second | ▆█▇▆▇▅█▇▅▅▇▇▆▅▅▄▅▆▆▅▅▄▅▆▅▆▆▆▅▄▆▆▄▇▆▅▅▃▃▁ |
| eval/steps_per_second | ▆█▇▆▇▅█▇▅▅▇▇▆▅▆▄▅▆▆▅▅▄▅▆▅▆▆▆▅▄▆▆▄▇▆▅▅▃▃▁ |
| train/epoch | ▁▁▁▂▂▂▂▂▂▂▃▃▃▄▄▄▄▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇▇██████ |
| train/global_step | ▁▁▁▁▁▂▂▂▂▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇█████ |
| train/grad_norm | ▁▁▃▁▁▄▁▁▂▁▁▁▁▁▁▂▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |
| train/learning_rate | ███▇▇▇▇▇▇▆▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▁▁ |

Run summary:

| Metric | Value |
|---|---|
| eval/accuracy | 0.98956 |
| eval/f1 | 0.98957 |
| eval/loss | 0.08274 |
| eval/runtime | 102.2436 |
| eval/samples_per_second | 133.994 |
| eval/steps_per_second | 16.754 |
| total_flos | 2.1259149660644616e+16 |
| train/epoch | 3.0 |
| train/global_step | 41613.0 |
| train/grad_norm | 0.00032 |
