# Regression Head Pricer: Improved Price Prediction from Embeddings

## Overview

This notebook demonstrates an enhanced price prediction model that uses a regression head trained on embeddings from a fine-tuned LLaMA model. The approach concatenates last-token embeddings from the final transformer layers and feeds them into a small feed-forward network that predicts the log-transformed price.

## Key Results

**Model Performance:**
- **Mean Prediction Error: $38.82** (tested on 10,000 samples)
- **Improvement: ~$4.96** compared to the fine-tuned model from Week 7 ($43.78)
- **Relative improvement: ~11%** reduction in prediction error
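
The improvement figures follow directly from the two reported errors. A minimal sketch of the arithmetic, using only the numbers quoted above (the variable names are illustrative):

```python
# Mean prediction errors reported in this notebook (in dollars)
week7_error = 43.78       # fine-tuned model from Week 7
regression_error = 38.82  # regression head on embeddings

absolute_gain = week7_error - regression_error
relative_gain = absolute_gain / week7_error

print(f"Absolute improvement: ${absolute_gain:.2f}")
print(f"Relative improvement: {relative_gain:.1%}")
```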

## Methodology

1. **Base Model**: LLaMA 3.2-3B fine-tuned on product pricing data
2. **Embedding Extraction**: Multi-layer embeddings from the final 3 transformer layers
3. **Regression Head**: Feed-forward network with two hidden layers, batch normalization after each hidden layer, and dropout after the first
4. **Training**: Regression head trained on log-transformed prices
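
Steps 2 and 4 can be illustrated without loading the model. The sketch below uses plain Python lists as stand-ins for tensors. The helper names `concat_last_token`, `to_log_target`, and `from_log_target` are hypothetical, and the choice of `log1p`/`expm1` (rather than plain `log`/`exp`) is an assumption based on the `np.expm1` call in the evaluation cell.

```python
import math

def concat_last_token(hidden_states, n_layers):
    """Concatenate the last-token vector from each of the final n_layers.

    hidden_states: list of layers, each a list of per-token vectors.
    This stands in for the tuple of [batch, seq, hidden] tensors a
    transformer returns; the batch dimension is omitted here.
    """
    features = []
    for layer in hidden_states[-n_layers:]:
        features.extend(layer[-1])  # last token position of this layer
    return features

def to_log_target(price):
    """Assumed training target: log(1 + price), which compresses the price range."""
    return math.log1p(price)

def from_log_target(log_price):
    """Inverse transform: map a log-scale prediction back to dollars."""
    return math.expm1(log_price)

# Toy example: 4 layers, 2 tokens, hidden size 3
hidden_states = [
    [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5]],
    [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]],
    [[2.0, 2.1, 2.2], [2.3, 2.4, 2.5]],
    [[3.0, 3.1, 3.2], [3.3, 3.4, 3.5]],
]
features = concat_last_token(hidden_states, n_layers=3)
print(len(features))  # 3 layers x hidden size 3 = 9 features

round_trip = from_log_target(to_log_target(249.0))
print(round(round_trip, 2))
```

With the real model, the regression head's input dimension equals `n_layers` times the model's hidden size, since one last-token vector per selected layer is concatenated.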

## Additional Resources

For detailed experiments and analysis, see: [github.com/antonawinkler/slm-pricer](https://github.com/antonawinkler/slm-pricer) (Notebook 03)

In [None]:
import torch
import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from huggingface_hub import hf_hub_download

# 1. Configuration
llama_base_model_name = "meta-llama/Llama-3.2-3B"
llama_fine_tuned_model_name = "ed-donner/price-2025-11-28_18.47.07"
llama_fine_tuned_model_revision = "b19c8bfea3b6ff62237fbb0a8da9779fc12cefbd"

# 2. Load Base Model & Adapter
tokenizer = AutoTokenizer.from_pretrained(llama_base_model_name)
llama_base_model = AutoModelForCausalLM.from_pretrained(llama_base_model_name, device_map="auto", dtype=torch.float16)
llama_fine_tuned_model = PeftModel.from_pretrained(llama_base_model, llama_fine_tuned_model_name, revision=llama_fine_tuned_model_revision).merge_and_unload()
llama_fine_tuned_model.eval()

# 3. Download Regression Head
model_path = hf_hub_download(repo_id="antonawinkler/llama-pricer-regression-head", filename="model.pth")
checkpoint = torch.load(model_path, map_location="cpu")
model_config = checkpoint["model_config"]
embedding_config = checkpoint["embedding_config"]

# 4. Define Regression Head
class PriceRegressor(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim1, hidden_dim2, dropout):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(input_dim, hidden_dim1),
            torch.nn.BatchNorm1d(hidden_dim1),
            torch.nn.ReLU(),
            torch.nn.Dropout(dropout),
            torch.nn.Linear(hidden_dim1, hidden_dim2),
            torch.nn.BatchNorm1d(hidden_dim2),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden_dim2, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

regression_head = PriceRegressor(
    input_dim=model_config["input_dim"],
    hidden_dim1=model_config["hidden_dim1"],
    hidden_dim2=model_config["hidden_dim2"],
    dropout=model_config["dropout"]
)
regression_head.load_state_dict(checkpoint["model_state_dict"])
regression_head.eval().to(llama_fine_tuned_model.device)

# 5. Prediction Helper
def get_single_embedding(model, tokenizer, text, n_layers):
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
        selected_layers = outputs.hidden_states[-n_layers:]
        # Extract last token from each layer
        layer_vecs = [layer[:, -1, :] for layer in selected_layers]
        return torch.cat(layer_vecs, dim=-1).float()

# 6. Predict
description = "Apple AirPods Pro (2nd Generation) with MagSafe Charging Case"
embedding = get_single_embedding(
    llama_fine_tuned_model,
    tokenizer,
    description,
    n_layers=embedding_config["n_layers"]
)

with torch.no_grad():
    log_price = regression_head(embedding)
    # Use expm1 to map back to dollars, consistent with model_predict in the evaluation cell
    price = np.expm1(log_price.item())

print(f"Predicted Price: ${price:.2f}")


In [None]:
!pip install -q --upgrade bitsandbytes trl
!wget -q https://raw.githubusercontent.com/ed-donner/llm_engineering/main/week7/util.py -O util.py

In [None]:
from util import evaluate
from datasets import load_dataset

In [None]:
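def model_predict(item):
    """Predict a dollar price for one test item from its prompt text."""
    embedding = get_single_embedding(
        llama_fine_tuned_model,
        tokenizer,
        item["prompt"],
        # Use the layer count stored with the checkpoint instead of a hard-coded 3
        n_layers=embedding_config["n_layers"],
    )

    with torch.no_grad():
        log_price = regression_head(embedding)
        # expm1 maps the log-scale prediction back to dollars
        return np.expm1(log_price.item())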


In [None]:
df_test = load_dataset('ed-donner/items_prompts_full', split="test")

In [None]:
evaluate(model_predict, df_test)



## Mean prediction error: $38.82 on all 10,000 test samples

## Improvement of almost $5 over the Week 7 fine-tuned model (mean prediction error $43.78)
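
The exact metric is computed by `evaluate` in `util.py`. Assuming the reported prediction error is a mean absolute error in dollars, it can be sketched as follows. The function name `mean_absolute_error` and the toy prices are illustrative and are not taken from the test set.

```python
def mean_absolute_error(predictions, truths):
    """Average absolute difference between predicted and true prices."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    total = sum(abs(p - t) for p, t in zip(predictions, truths))
    return total / len(predictions)

# Toy values for illustration only
predicted = [19.99, 250.00, 75.50]
actual = [24.99, 230.00, 80.50]
mae = mean_absolute_error(predicted, actual)
print(f"${mae:.2f}")
```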

In [None]:
evaluate(model_predict, df_test, size=10000)