## Product Price Prediction with QLoRA Fine-Tuning

### Learning Objectives:
1. Configure QLoRA for efficient fine-tuning
2. Implement supervised fine-tuning with PEFT
3. Monitor training with W&B
4. Save and share trained adapters

### Install required packages (commented to prevent accidental execution)

In [None]:
# Uncomment the line below to install the dependencies
# !pip install -q datasets requests torch peft bitsandbytes transformers trl accelerate sentencepiece wandb matplotlib

In [None]:
# Import with clear grouping
import os
import re
import math
from datetime import datetime
from tqdm import tqdm
import matplotlib.pyplot as plt

# HuggingFace and Colab specific
from google.colab import userdata
from huggingface_hub import login
import wandb

# PyTorch and Transformers
import torch
import transformers
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    set_seed
)
from peft import LoraConfig
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM
from datasets import load_dataset

## Project Configuration

Key settings for our fine-tuning experiment:

In [None]:
# Model and Data
BASE_MODEL = "meta-llama/Meta-Llama-3.1-8B"
HF_USER = "ed-donner"
DATASET_NAME = f"{HF_USER}/pricer-data"
MAX_SEQUENCE_LENGTH = 182

# Run Management
RUN_NAME = f"{datetime.now():%Y-%m-%d_%H.%M.%S}"
PROJECT_NAME = "pricer"
PROJECT_RUN_NAME = f"{PROJECT_NAME}-{RUN_NAME}"
HUB_MODEL_NAME = f"{HF_USER}/{PROJECT_RUN_NAME}"

# --- QLoRA Hyperparameters ---
# Parameters for efficient fine-tuning:
LORA_R = 32          # Rank of low-rank adaptation matrices
LORA_ALPHA = 64      # Scaling factor for LoRA weights
TARGET_MODULES = [   # Which attention layers to adapt
    "q_proj",        # Query projections
    "v_proj",        # Value projections
    "k_proj",        # Key projections
    "o_proj"         # Output projections
]
LORA_DROPOUT = 0.1   # Dropout probability for LoRA layers
QUANT_4_BIT = True   # Use 4-bit quantization
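
To see what these settings imply, the rough calculation below counts the adapter weights that LoRA adds. It is a sketch under stated assumptions: the projection shapes (hidden size 4096, 32 decoder layers, 8 key/value heads of dimension 128) are taken from the published Llama 3.1 8B configuration rather than from this notebook, and `lora_param_count` is a hypothetical helper name.

```python
# Assumed shapes for Llama 3.1 8B (not defined anywhere in this notebook)
HIDDEN = 4096
KV_DIM = 8 * 128      # 8 key/value heads of dimension 128
NUM_LAYERS = 32

# (input features, output features) of each targeted projection
PROJECTION_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(r, shapes=PROJECTION_SHAPES, num_layers=NUM_LAYERS):
    """Each adapted projection adds A (r x in) and B (out x r): r * (in + out) weights."""
    per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(r=32)   # r matches LORA_R
scaling = 64 / 32                    # LORA_ALPHA / LORA_R
print(f"Trainable adapter parameters: {trainable:,}")
print(f"LoRA scaling factor (alpha / r): {scaling}")
```

Under these assumptions the adapters amount to a few tens of millions of weights, a small fraction of the roughly 8 billion frozen base weights. Halving `r` halves the adapter size.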

## Training Hyperparameters

Optimization settings:

In [None]:
EPOCHS = 1                    # Training cycles through dataset
BATCH_SIZE = 4                # Samples per batch
GRADIENT_ACCUMULATION_STEPS = 1  # Effective batch size = BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
LEARNING_RATE = 1e-4          # Initial learning rate
LR_SCHEDULER_TYPE = 'cosine'  # Learning rate schedule
WARMUP_RATIO = 0.03           # Fraction of total steps used for LR warmup
OPTIMIZER = "paged_adamw_32bit"  # Memory-efficient AdamW variant

# --- Logging Configuration ---
STEPS = 50                    # Log metrics every N steps
SAVE_STEPS = 2000             # Save checkpoint every N steps
LOG_TO_WANDB = True           # Enable Weights & Biases logging
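
The settings above interact: batch size and accumulation set the number of optimizer steps, which in turn fixes the warmup length and the number of checkpoints. The sketch below works through that arithmetic and a linear-warmup, cosine-decay learning-rate curve. The 400,000-example dataset size is a hypothetical figure for illustration, the step count is an approximation for a single device, and `training_plan` and `cosine_lr` are illustrative helpers rather than library functions.

```python
import math


def training_plan(num_examples, batch_size=4, grad_accum=1, epochs=1,
                  warmup_ratio=0.03, save_steps=2000):
    """Approximate optimizer steps, warmup steps, and checkpoint count (single device)."""
    effective_batch = batch_size * grad_accum
    total_steps = math.ceil(num_examples / effective_batch) * epochs
    return {
        "effective_batch": effective_batch,
        "total_steps": total_steps,
        "warmup_steps": math.ceil(total_steps * warmup_ratio),
        "checkpoints": total_steps // save_steps,
    }


def cosine_lr(step, total_steps, warmup_steps, peak_lr=1e-4):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


plan = training_plan(num_examples=400_000)  # hypothetical dataset size
print(plan)
for step in (0, plan["warmup_steps"], plan["total_steps"] // 2, plan["total_steps"]):
    lr = cosine_lr(step, plan["total_steps"], plan["warmup_steps"])
    print(f"step {step:>7}: lr = {lr:.2e}")
```

Raising `GRADIENT_ACCUMULATION_STEPS` increases the effective batch size without extra GPU memory, at the cost of fewer optimizer steps per epoch.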

## HuggingFace and W&B Login

Required for model access and experiment tracking:
1. Get your HF token from https://huggingface.co/settings/tokens
2. Get your W&B API key from https://wandb.ai/settings
3. Add both to Colab secrets

In [None]:
# Log in to HuggingFace

hf_token = userdata.get('HF_TOKEN')
login(hf_token, add_to_git_credential=True)

In [None]:
# Log in to Weights & Biases
wandb_api_key = userdata.get('WANDB_API_KEY')
os.environ["WANDB_API_KEY"] = wandb_api_key
wandb.login()

# Configure Weights & Biases to record against our project
os.environ["WANDB_PROJECT"] = PROJECT_NAME
os.environ["WANDB_LOG_MODEL"] = "checkpoint" if LOG_TO_WANDB else "end"
os.environ["WANDB_WATCH"] = "gradients"
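
The login cells above rely on Colab secrets. For running the notebook elsewhere, one option is a small lookup that falls back to environment variables. This is a sketch: `get_secret` is a hypothetical helper, and it assumes the same secret names (`HF_TOKEN`, `WANDB_API_KEY`) are exported in the environment.

```python
import os


def get_secret(name):
    """Return a secret from Colab's store when available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
        value = userdata.get(name)
        if value:
            return value
    except Exception:
        pass  # not running in Colab, or the secret is missing or not shared there
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Secret {name!r} not found in Colab secrets or environment")
    return value
```

With this helper, `userdata.get('HF_TOKEN')` could be replaced by `get_secret('HF_TOKEN')`, and likewise for the W&B key.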

## Loading and Preparing Dataset

Our price prediction dataset contains:
- Product descriptions
- Corresponding prices

In [None]:
dataset = load_dataset(DATASET_NAME)
train = dataset['train']
test = dataset['test']
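
Each training example is a single string in the `text` field that ends with the price after the marker `Price is $`. The sketch below splits such a string into prompt and price. The record shown is hypothetical and written only for illustration (the real prompt wording is not shown in this notebook), and `split_prompt_and_price` is an illustrative helper.

```python
import re

# Hypothetical record for illustration only; real records come from the dataset
example_text = (
    "How much does this cost?\n\n"
    "Wireless optical mouse with USB receiver\n\n"
    "Price is $25.00"
)


def split_prompt_and_price(text, marker="Price is $"):
    """Return (prompt up to and including the marker, price as float or None)."""
    if marker not in text:
        raise ValueError(f"Marker {marker!r} not found in text")
    prompt, _, completion = text.rpartition(marker)
    # Drop thousands separators, then take the first number after the marker
    match = re.search(r"\d*\.?\d+", completion.replace(",", ""))
    price = float(match.group()) if match else None
    return prompt + marker, price


prompt, price = split_prompt_and_price(example_text)
print(repr(prompt[-12:]), price)
```

The same split is useful at inference time: feed the model the prompt up to the marker and parse the number it generates.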

## Loading Base Model with QLoRA

Using 4-bit quantization for memory efficiency:
- NF4 quantization type
- Double quantization for additional savings
- bfloat16 compute dtype

In [None]:
# pick the right quantization

if QUANT_4_BIT:
  quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
  )
else:
  # 8-bit loading has no compute-dtype option, so only the flag is set
  quant_config = BitsAndBytesConfig(
    load_in_8bit=True
  )
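
A back-of-the-envelope estimate shows why quantization matters here. The sketch below computes the storage for the weights alone at different precisions. The 8.0e9 parameter count is an assumed round figure for an 8B model, and `weight_memory_gb` is an illustrative helper.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


APPROX_PARAMS = 8.0e9  # assumed round figure for an 8B-parameter model

for label, bits in [("bf16 (unquantized)", 16), ("8-bit", 8), ("4-bit NF4", 4)]:
    print(f"{label:>20}: ~{weight_memory_gb(APPROX_PARAMS, bits):.1f} GB")
```

The footprint reported by `get_memory_footprint()` after loading is typically somewhat higher than the 4-bit figure, because some layers stay in higher precision and the quantization constants add overhead.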

In [None]:
# Initialize tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

In [None]:
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=quant_config,
    device_map="auto",
)

In [None]:
base_model.generation_config.pad_token_id = tokenizer.pad_token_id

print(f"\nMemory footprint: {base_model.get_memory_footprint() / 1e6:.1f} MB")

## Configuring the Training Process

We use two key configurations:
1. LoRA parameters for efficient adaptation
2. Training arguments for the optimization process

In [None]:
# LoRA configuration: adapter rank, scaling, dropout, and target modules
lora_parameters = LoraConfig(
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    r=LORA_R,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=TARGET_MODULES,
)
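
What the adapter computes can be sketched with toy matrices: the frozen weight `W` is left untouched and a scaled low-rank product is added to its output. This is an illustration only, assuming NumPy is available. The sizes are toy values rather than the model's real dimensions, dropout is omitted, and `lora_forward` is a hypothetical helper rather than PEFT's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 4, 8   # toy sizes for illustration

W = rng.normal(size=(d_out, d_in))    # frozen base weight
A = rng.normal(size=(r, d_in))        # trainable, random init
B = np.zeros((d_out, r))              # trainable, zero init


def lora_forward(x, W, A, B, alpha, r):
    """Base projection plus the scaled low-rank update: x W^T + (alpha / r) x A^T B^T."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T


x = rng.normal(size=(2, d_in))

# With B = 0 the adapter contributes nothing, so training starts from the base model
print(np.allclose(lora_forward(x, W, A, B, alpha, r), x @ W.T))

# After training, the update is equivalent to a merged weight W + (alpha / r) * B @ A
B_trained = rng.normal(size=(d_out, r))
merged = W + (alpha / r) * B_trained @ A
print(np.allclose(lora_forward(x, W, A, B_trained, alpha, r), x @ merged.T))
```

Only `A` and `B` receive gradients, which is why the trainable parameter count depends on `r` and on which modules are targeted.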

In [None]:
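# Training configuration: SFTConfig (TRL) extends TrainingArguments and accepts
# SFT-specific options such as max_seq_length, which TrainingArguments rejects
from trl import SFTConfig

train_parameters = SFTConfig(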
    output_dir=PROJECT_RUN_NAME,
    num_train_epochs=EPOCHS,
    per_device_train_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION_STEPS,
    optim=OPTIMIZER,
    learning_rate=LEARNING_RATE,
    weight_decay=0.001,
    max_grad_norm=0.3,
    warmup_ratio=WARMUP_RATIO,
    lr_scheduler_type=LR_SCHEDULER_TYPE,
    logging_steps=STEPS,
    save_steps=SAVE_STEPS,
    save_total_limit=10,
    bf16=True,
    group_by_length=True,
    report_to="wandb" if LOG_TO_WANDB else "none",  # None would fall back to the default integrations
    run_name=RUN_NAME,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    save_strategy="steps",
    hub_strategy="every_save",
    push_to_hub=True,
    hub_model_id=HUB_MODEL_NAME,
    hub_private_repo=True
)

## Data Collator Setup

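The collator masks every token up to and including the response template, so the loss is computed only on the price tokens that follow `Price is $` (not on the product description):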

In [None]:
response_template = "Price is $"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)
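
The masking idea can be sketched in plain Python: labels are a copy of the input ids, with everything through the end of the response template replaced by an ignore value. This is a simplified illustration, not TRL's implementation. The token ids are made up, and `mask_prompt_tokens` is a hypothetical helper.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_prompt_tokens(input_ids, template_ids, ignore_index=IGNORE_INDEX):
    """Copy input_ids to labels, ignoring all tokens through the end of the template."""
    n = len(template_ids)
    # Find the last position where the template occurs
    start = -1
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == template_ids:
            start = i
    if start == -1:
        # Template not found: nothing in this example contributes to the loss
        return [ignore_index] * len(input_ids)
    end = start + n
    return [ignore_index] * end + list(input_ids[end:])


# Made-up token ids: description tokens, then the template, then the price tokens
input_ids = [101, 7, 8, 9, 50, 51, 52, 2, 5]
template_ids = [50, 51, 52]
print(mask_prompt_tokens(input_ids, template_ids))
```

Only the last two positions keep their labels, so gradient updates come from predicting the price rather than from reproducing the description.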

## Starting Fine-Tuning

The SFTTrainer will:
1. Apply LoRA adapters to the base model
2. Train only the adapter parameters
3. Log progress to W&B
4. Save checkpoints periodically

In [None]:
trainer = SFTTrainer(
    model=base_model,
    train_dataset=train,
    peft_config=lora_parameters,
    args=train_parameters,
    data_collator=collator,
    dataset_text_field="text"
)


In [None]:
print(f"\nStarting training run: {RUN_NAME}")
trainer.train()

## Saving and Sharing Results


In [None]:
# Save final model
trainer.model.push_to_hub(PROJECT_RUN_NAME, private=True)
print(f"\nModel saved to Hub: {HUB_MODEL_NAME}")

In [None]:
# Clean up W&B
if LOG_TO_WANDB:
    wandb.finish()