# Fine-Tuning Llama for Price Prediction

**Purpose**: Fine-tune Meta's Llama 3.1-8B model on a custom dataset to predict product prices from descriptions.  
**Why**: Enables accurate price estimation for e-commerce applications using specialized data.  
**How**: Use QLoRA for memory-efficient fine-tuning on limited hardware, using Kaggle's free 30-hour weekly GPU quota. One epoch is used to limit overfitting; 2-3 epochs may refine the model further but increase the risk of overfitting.

In [1]:
# Install dependencies
# Why: Ensure compatible versions for fine-tuning on GPU

!pip install -q --upgrade torch==2.5.1+cu124 torchvision==0.20.1+cu124 torchaudio==2.5.1+cu124 --index-url https://download.pytorch.org/whl/cu124
!pip install -q --upgrade requests==2.32.3 bitsandbytes==0.46.0 transformers==4.48.3 accelerate==1.3.0 datasets==3.2.0 peft==0.14.0 trl==0.14.0 matplotlib wandb

In [3]:
# Import libraries
# Why: Core tools for model loading, training, and visualization

import os
import re
import math
from tqdm import tqdm
from dotenv import load_dotenv
from huggingface_hub import login
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, set_seed, BitsAndBytesConfig
from datasets import load_dataset, Dataset, DatasetDict
import wandb
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig
from datetime import datetime
import matplotlib.pyplot as plt



In [4]:
# Load environment variables and log in
# Why: Secure API keys; WandB for tracking loss/metrics visually

load_dotenv()
hf_token = os.getenv('HF_TOKEN')
wandb_api_key = os.getenv('WANDB_API_KEY')

login(token=hf_token, add_to_git_credential=True)
wandb.login(key=wandb_api_key)

# Configure WandB project
os.environ["WANDB_PROJECT"] = "pricer"
os.environ["WANDB_LOG_MODEL"] = "checkpoint"
os.environ["WANDB_WATCH"] = "gradients"

Token has not been saved to git credential helper.


Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
Successfully logged into Hugging Face.


wandb: Currently logged in as: ishant24singh (ishant24singh-technical-board-iiit-bhagalpur) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin


Successfully logged into Weights & Biases.
Could not log into Weights & Biases. Please ensure 'WANDB_API_KEY' is a valid secret.
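
The login cell assumes both secrets are present in the environment. As a minimal sketch, a guard like the one below fails early with a clear message when one is missing. `require_env` and the `EXAMPLE_TOKEN` variable are hypothetical names used only for illustration; they are not part of the notebook or of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Illustration with a placeholder variable (hypothetical name and value)
os.environ["EXAMPLE_TOKEN"] = "dummy-value"
print(require_env("EXAMPLE_TOKEN"))
```

In the notebook, the same check could wrap the `HF_TOKEN` and `WANDB_API_KEY` lookups before calling `login` and `wandb.login`.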


In [11]:
# Define constants
# Why: Centralize hyperparameters for easy tuning
# Other datasets like 'ed-donner/pricer-data' could be used for broader coverage

BASE_MODEL = "meta-llama/Meta-Llama-3.1-8B"
PROJECT_NAME = "pricer"
HF_USER = "ishant24"

DATASET_NAME = "ishant24/lite-data"
# Alternative dataset with broader coverage (see note above)
# DATASET_NAME = "ed-donner/pricer-data"
MAX_SEQUENCE_LENGTH = 182

RUN_NAME = f"{datetime.now():%Y-%m-%d_%H.%M.%S}"
PROJECT_RUN_NAME = f"{PROJECT_NAME}-{RUN_NAME}"
HUB_MODEL_NAME = f"{HF_USER}/{PROJECT_RUN_NAME}"

# QLoRA params
LORA_R = 32
LORA_ALPHA = 64
TARGET_MODULES = ["q_proj", "v_proj", "k_proj", "o_proj"]
LORA_DROPOUT = 0.1
QUANT_4_BIT = True

# Training params
EPOCHS = 1  # One epoch limits overfitting; 2-3 are possible but monitor the loss
BATCH_SIZE = 6  # Low due to GPU memory limits
GRADIENT_ACCUMULATION_STEPS = 1
LEARNING_RATE = 1e-4
LR_SCHEDULER_TYPE = 'cosine'
WARMUP_RATIO = 0.03
OPTIMIZER = "paged_adamw_32bit"

# Admin
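STEPS = 50  # Log the training loss every 50 steps
SAVE_STEPS = 2000  # Checkpoint interval; larger than the ~1,667 steps of a 10k-example epoch at batch size 6, so no intermediate checkpoint is written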
LOG_TO_WANDB = True

%matplotlib inline
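
The constants above fix the length of the run. The following sketch works out the number of optimizer steps, warmup steps, and logged loss values for one epoch. It assumes the 10,000-example subsample this notebook trains on, a single training process, and that the last partial batch is kept; the variable names other than the notebook's constants are illustrative.

```python
import math

# Values copied from the constants cell
TRAIN_EXAMPLES = 10_000  # size of the training subsample used in this notebook
BATCH_SIZE = 6
GRADIENT_ACCUMULATION_STEPS = 1
EPOCHS = 1
WARMUP_RATIO = 0.03
STEPS = 50  # logging interval

# Assumption: one process, last partial batch kept
batches_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
updates_per_epoch = max(batches_per_epoch // GRADIENT_ACCUMULATION_STEPS, 1)
total_updates = updates_per_epoch * EPOCHS
warmup_updates = math.ceil(total_updates * WARMUP_RATIO)
logged_values = total_updates // STEPS

print(f"optimizer steps: {total_updates}")
print(f"warmup steps:    {warmup_updates}")
print(f"logged losses:   {logged_values}")
```

Under these assumptions the run is shorter than `SAVE_STEPS`, which is worth keeping in mind when relying on step-based checkpointing.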

## Dataset Loading

**What**: Load the dataset and keep the first 10,000 training examples.  
**Why**: A 10k subset reduces computational cost on the Kaggle GPU while still giving usable results.  
**How**: Hugging Face Datasets library for efficient loading.

In [None]:
# Load dataset
# Why: Custom lite dataset for price prediction

dataset = load_dataset(DATASET_NAME)
train = dataset['train'].select(range(10000))  # 10k for cost efficiency
test = dataset['test']
print(f"New training set size: {len(train)}")

New training set size: 10000
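
The cell above takes the first 10,000 rows with `select(range(10000))`. If the dataset happens to be ordered (for example by category or price), a seeded random subset may be more representative. The sketch below only builds the index list; `sample_indices` is a hypothetical helper and the dataset size of 25,000 is a placeholder, not the real size of the dataset.

```python
import random


def sample_indices(dataset_size: int, sample_size: int, seed: int = 42) -> list[int]:
    """Return `sample_size` distinct row indices drawn reproducibly from range(dataset_size)."""
    rng = random.Random(seed)  # local generator, does not touch global random state
    return sorted(rng.sample(range(dataset_size), sample_size))


# Placeholder dataset size for illustration only
indices = sample_indices(dataset_size=25_000, sample_size=10_000)
print(len(indices), indices[:5])
```

The resulting list could then be passed to `dataset['train'].select(indices)` in place of `range(10000)`.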


In [7]:
# Initialize WandB run
# Why: Visualize training progress

if LOG_TO_WANDB:
    wandb.init(project=PROJECT_NAME, name=RUN_NAME)

## Model Quantization

**What**: Set up 4-bit quantization.  
**Why**: Reduces memory usage for fine-tuning on Kaggle GPU.  
**How**: BitsAndBytesConfig for NF4 quantization.

In [8]:
# Quantization config

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
)
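
`bnb_4bit_use_double_quant=True` quantizes the per-block scaling constants themselves. The arithmetic below is a rough sketch of the storage overhead this saves. It assumes the block sizes described in the QLoRA paper (64 weights per scaling constant, 256 constants per second-level block); these values are not stated in this notebook.

```python
# Assumed block sizes (from the QLoRA paper, not from this notebook)
WEIGHT_BLOCK = 64     # weights that share one scaling constant
CONSTANT_BLOCK = 256  # scaling constants that share one second-level constant

# Without double quantization: one 32-bit constant per block of weights
overhead_plain = 32 / WEIGHT_BLOCK

# With double quantization: an 8-bit constant per block of weights,
# plus one 32-bit constant per block of constants
overhead_double = 8 / WEIGHT_BLOCK + 32 / (WEIGHT_BLOCK * CONSTANT_BLOCK)

print(f"overhead without double quant: {overhead_plain:.3f} bits/weight")
print(f"overhead with double quant:    {overhead_double:.3f} bits/weight")
print(f"effective size with double quant: {4 + overhead_double:.3f} bits/weight")
```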

## Model Loading

**What**: Load tokenizer and quantized model.  
**Why**: Prepare Llama for fine-tuning with low memory.  
**How**: AutoTokenizer and AutoModelForCausalLM.

In [9]:
# Load tokenizer and model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=quant_config,
    device_map="auto",
)
base_model.generation_config.pad_token_id = tokenizer.pad_token_id

print(f"Memory footprint: {base_model.get_memory_footprint() / 1e6:.1f} MB")

Memory footprint: 5591.5 MB
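
The reported footprint can be sanity-checked by hand. The sketch below assumes the publicly documented Llama 3.1 8B architecture (hidden size 4096, 32 layers, key/value width 1024, MLP width 14336, vocabulary 128256), none of which appears in this notebook. It also assumes that only the linear layers inside the transformer blocks are stored in 4 bits, while embeddings, the output head, and norms stay in 16 bits, and it ignores quantization constants.

```python
# Assumed architecture values for Llama 3.1 8B (not taken from this notebook)
HIDDEN = 4096
KV_DIM = 1024          # width of the key/value projections
INTERMEDIATE = 14336   # MLP width
LAYERS = 32
VOCAB = 128256

# Linear layers inside each transformer block (assumed to be quantized to 4 bits)
attention_params = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o, k, v
mlp_params = 3 * HIDDEN * INTERMEDIATE                        # gate, up, down
linear_params = LAYERS * (attention_params + mlp_params)

# Parameters assumed to stay in 16 bits
embedding_params = 2 * VOCAB * HIDDEN      # input embeddings + output head
norm_params = LAYERS * 2 * HIDDEN + HIDDEN  # two norms per layer + final norm

quantized_bytes = linear_params // 2                       # 4 bits = 0.5 byte each
unquantized_bytes = (embedding_params + norm_params) * 2   # 2 bytes each
total_mb = (quantized_bytes + unquantized_bytes) / 1e6

print(f"Estimated footprint: {total_mb:.1f} MB")
```

Under these assumptions the estimate lands on the value printed above, which suggests that a large share of the footprint comes from the unquantized embedding and output matrices rather than from the 4-bit weights.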


In [12]:
# Data collator for completions
# Why: Focus training on price prediction output

from trl import DataCollatorForCompletionOnlyLM
response_template = "Price is $"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)
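
The idea behind a completion-only collator is that tokens up to and including the response template receive an ignored label, so the loss is computed only on the price that follows. The pure-Python sketch below illustrates that masking on toy token ids. It is a simplification for illustration, not TRL's implementation; `mask_prompt` and the ids are hypothetical.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt(input_ids: list[int], template_ids: list[int]) -> list[int]:
    """Copy input_ids into labels, masking everything up to and including the template."""
    n = len(template_ids)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == template_ids:
            end = start + n
            return [IGNORE_INDEX] * end + input_ids[end:]
    # Template not found: mask the whole example so it contributes no loss
    return [IGNORE_INDEX] * len(input_ids)


# Toy ids: [7, 8, 9] stands in for the tokens of "Price is $"
input_ids = [11, 12, 13, 7, 8, 9, 42, 43]
template_ids = [7, 8, 9]
labels = mask_prompt(input_ids, template_ids)
print(labels)
```

Only the last two positions keep their token ids as labels here, which mirrors the goal of training on the price tokens alone.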

## Training Configuration

**What**: Set LoRA and training params.  
**Why**: Efficient adaptation; cosine scheduler for smooth learning.  
**How**: LoraConfig and SFTConfig.
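
To make the cosine scheduler concrete, the sketch below computes a learning rate with linear warmup followed by a half-cosine decay to zero. It is a plain-Python approximation of the usual formula, not the library's code. The step counts (1,667 total, 51 warmup) are assumptions derived from 10,000 examples, batch size 6, and a warmup ratio of 0.03.

```python
import math


def cosine_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Assumed run length: ceil(10_000 / 6) steps with a 3% warmup
TOTAL_STEPS = 1667
WARMUP_STEPS = 51
PEAK_LR = 1e-4

for step in (0, 25, 51, 500, 1000, 1667):
    print(step, f"{cosine_lr(step, TOTAL_STEPS, WARMUP_STEPS, PEAK_LR):.2e}")
```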

In [14]:
# LoRA config
lora_parameters = LoraConfig(
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    r=LORA_R,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=TARGET_MODULES,
)

# Training config
train_parameters = SFTConfig(
    output_dir=PROJECT_RUN_NAME,
    num_train_epochs=EPOCHS,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=1,
    eval_strategy="no",
    gradient_accumulation_steps=GRADIENT_ACCUMULATION_STEPS,
    optim=OPTIMIZER,
    save_steps=SAVE_STEPS,
    save_total_limit=10,
    logging_steps=STEPS,
    learning_rate=LEARNING_RATE,
    weight_decay=0.001,
    fp16=False,
    bf16=True,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=WARMUP_RATIO,
    group_by_length=True,
    lr_scheduler_type=LR_SCHEDULER_TYPE,
    report_to="wandb" if LOG_TO_WANDB else "none",  # the string "none" disables reporting
    run_name=RUN_NAME,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    dataset_text_field="text",
    save_strategy="steps",
    hub_strategy="every_save",
    push_to_hub=True,
    hub_model_id=HUB_MODEL_NAME,
    hub_private_repo=True
)

# SFT Trainer
fine_tuning = SFTTrainer(
    model=base_model,
    train_dataset=train,
    peft_config=lora_parameters,
    args=train_parameters,
    data_collator=collator
)
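
The LoRA settings above determine how many parameters are actually trained. Each adapted projection with input width `d_in` and output width `d_out` gains two low-rank matrices with `r * (d_in + d_out)` parameters in total. The sketch below counts them, assuming the public Llama 3.1 8B dimensions (hidden size 4096, key/value width 1024, 32 layers), which are not stated in this notebook.

```python
# Values from the constants cell
LORA_R = 32
LORA_ALPHA = 64

# Assumed architecture values for Llama 3.1 8B (not taken from this notebook)
HIDDEN = 4096
KV_DIM = 1024
LAYERS = 32

# (input width, output width) of each targeted projection
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}

# Each adapter adds A (r x d_in) and B (d_out x r)
per_layer = sum(LORA_R * (d_in + d_out) for d_in, d_out in shapes.values())
trainable = per_layer * LAYERS
scaling = LORA_ALPHA / LORA_R  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / r):  {scaling}")
```

Under these assumptions the adapters amount to well under 1% of the base model's roughly 8 billion parameters.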

## Fine-Tuning Execution

**What**: Train and save the model.  
**Why**: Adapt Llama to price data; push to Hub for reuse.  
**How**: SFTTrainer.train(); monitor on WandB.

In [None]:
# Train and save
fine_tuning.train()
fine_tuning.model.push_to_hub(PROJECT_RUN_NAME, private=True)
print(f"Saved to the hub: {PROJECT_RUN_NAME}")

| Step | Training Loss |
|------|---------------|
| 50   | 1.7179 |
| 100  | 1.375  |
| 150  | 1.4032 |
| 200  | 1.4069 |
| 250  | 1.3758 |
| 300  | 1.3923 |
| 350  | 1.3776 |
| 400  | 1.4208 |
| 450  | 1.4056 |
| 500  | 1.3777 |
