# The Product Pricer Continued

A model that can estimate how much something costs, from its description.

## AT LAST - it's time for Fine Tuning!

After all this data preparation and old-school machine learning, we've finally arrived at the moment you've been waiting for: fine-tuning a model.

In [1]:
# imports

import os
import re
import math
import json
import random
from dotenv import load_dotenv
from huggingface_hub import login
import matplotlib.pyplot as plt
import numpy as np
import pickle
from collections import Counter
from openai import OpenAI
from anthropic import Anthropic

In [2]:
# environment

load_dotenv(override=True)
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')
os.environ['ANTHROPIC_API_KEY'] = os.getenv('ANTHROPIC_API_KEY', 'your-key-if-not-using-env')
os.environ['HF_TOKEN'] = os.getenv('HF_TOKEN', 'your-key-if-not-using-env')

In [3]:
# Log in to HuggingFace

hf_token = os.environ['HF_TOKEN']
login(hf_token, add_to_git_credential=True)

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


In [4]:
# moved our Tester into a separate package
# call it with Tester.test(function_name, test_dataset)

from items import Item
from testing import Tester

In [5]:
openai = OpenAI()

In [6]:
%matplotlib inline

In [7]:
# from datasets import load_dataset

# dataset = load_dataset("stevhliu/demo")

In [1]:
## NOTE: the downloaded datasets don't function as Items.
## Instead, download Ed's pickle files from:
## https://drive.google.com/drive/folders/1f_IZGybvs9o0J5sb3xmtTEQB3BXllzrW

# Ed Donner's larger set: 400k
# https://huggingface.co/datasets/ed-donner/pricer-data
# My smaller dataset: 290k
# https://huggingface.co/datasets/altstuff001/pricer-data

from datasets import load_dataset
from types import SimpleNamespace

hf_dataset = load_dataset("ed-donner/pricer-data")

## Access the TRAIN data

# Convert the split to a list of dicts:
dict_train = [{"text": item["text"], "price": item["price"]} for item in hf_dataset["train"]]
# Convert the dicts to objects so fields can be read with dot notation:
train = [SimpleNamespace(**item) for item in dict_train]

## Access the TEST data

# Convert the split to a list of dicts:
dict_test = [{"text": item["text"], "price": item["price"]} for item in hf_dataset["test"]]
# Convert the dicts to objects so fields can be read with dot notation:
test = [SimpleNamespace(**item) for item in dict_test]

print(train[1].text)
print(train[1].price)
print(test[1].text)
print(test[1].price)



How much does this cost to the nearest dollar?

Power Stop Rear Z36 Truck and Tow Brake Kit with Calipers
The Power Stop Z36 Truck & Tow Performance brake kit provides the superior stopping power demanded by those who tow boats, haul loads, tackle mountains, lift trucks, and play in the harshest conditions. The brake rotors are drilled to keep temperatures down during extreme braking and slotted to sweep away any debris for constant pad contact. Combined with our Z36 Carbon-Fiber Ceramic performance friction formulation, you can confidently push your rig to the limit and look good doing it with red powder brake calipers. Components are engineered to handle the stress of towing, hauling, mountainous driving, and lifted trucks. Dust-free braking performance. Z36 Carbon-Fiber Ceramic formula provides the extreme braking performance demanded by your truck or 4x

Price is $507.00
506.98
How much does this cost to the nearest dollar?

Motorcraft YB3125 Fan Clutch
Motorcraft YB3125 Fan Clutch
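
The dot-notation conversion in the cell above relies on `types.SimpleNamespace`. A minimal sketch of the idea, using two made-up rows (hypothetical, not taken from the dataset):

```python
from types import SimpleNamespace

# Hypothetical rows standing in for hf_dataset["train"]
rows = [
    {"text": "How much does this cost?\n\nWidget A", "price": 12.5},
    {"text": "How much does this cost?\n\nWidget B", "price": 99.0},
]

# Each dict becomes an object whose keys are readable as attributes
items = [SimpleNamespace(**row) for row in rows]

print(items[0].price)
print(vars(items[1]))  # vars() recovers the underlying dict
```

Objects built this way carry only the fields they were given. They have no methods such as `test_prompt()`, which later cells call on `Item` objects.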

In [9]:
# One more thing!
# Let's pickle the training and test dataset so we don't have to execute all this code next time!

#with open('train-ed.pkl', 'wb') as file:
#    pickle.dump(train_data, file)

#with open('test-ed.pkl', 'wb') as file:
#    pickle.dump(test_data, file)

In [10]:
# Let's avoid curating all our data again! Load in the pickle files:

#with open('pickle-ed/train.pkl', 'rb') as file:
#    train = pickle.load(file)

#with open('pickle-ed/test.pkl', 'rb') as file:
#    test = pickle.load(file)

print(train[0])    
print(test[0])    

<Delphi FG0166 Fuel Pump Module = $226.95>
<OEM AC Compressor w/A/C Repair Kit For Ford F150 F-150 V8 & Lincoln Mark LT 2007 2008 - BuyAutoParts 60-83447RN NEW = $374.41>


In [24]:
print(f'Loaded {len(train)} training points\n eg:')
print(train[0])   
print(f'\n\nLoaded {len(test)} test points\n eg:')
print(test[0]) 

Loaded 400000 training points
 eg:
<Delphi FG0166 Fuel Pump Module = $226.95>


Loaded 2000 test points
 eg:
<OEM AC Compressor w/A/C Repair Kit For Ford F150 F-150 V8 & Lincoln Mark LT 2007 2008 - BuyAutoParts 60-83447RN NEW = $374.41>


In [11]:
# OpenAI recommends fine-tuning with populations of 50-100 examples
# But as our examples are very small, I'm suggesting we go with 200 examples (and 1 epoch)

fine_tune_train = train[:200]
fine_tune_validation = train[200:250]

# Step 1

Prepare our data for fine-tuning in JSONL (JSON Lines) format and upload to OpenAI
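
As a rough illustration of the JSONL layout, here is a minimal sketch with two hypothetical records (the real ones are built from the training items in the cells that follow):

```python
import json

# Hypothetical records; each one is a full conversation under a "messages" key
records = [
    {"messages": [
        {"role": "system", "content": "You estimate prices of items."},
        {"role": "user", "content": "How much does this cost?\n\nWidget A"},
        {"role": "assistant", "content": "Price is $12.50"},
    ]},
    {"messages": [
        {"role": "system", "content": "You estimate prices of items."},
        {"role": "user", "content": "How much does this cost?\n\nWidget B"},
        {"role": "assistant", "content": "Price is $99.00"},
    ]},
]

# JSONL: one JSON document per line, newline-separated, no enclosing list
jsonl = "\n".join(json.dumps(record) for record in records)

# Every line parses on its own
parsed = [json.loads(line) for line in jsonl.splitlines()]
print(len(parsed), parsed[0]["messages"][-1]["content"])
```

`json.dumps` escapes the newlines inside each `content` string, so every record stays on a single physical line.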

In [12]:
# First let's work on a good prompt for a Frontier model
# Notice that I'm removing the " to the nearest dollar"
# When we train our own models, we'll need to make the problem as easy as possible, 
# but a Frontier model needs no such simplification.

def messages_for(item):
    system_message = "You estimate prices of items. Reply only with the price, no explanation"
    user_prompt = item.test_prompt().replace(" to the nearest dollar","").replace("\n\nPrice is $","")
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": f"Price is ${item.price:.2f}"}
    ]

In [42]:
print(train[0].test_prompt())

messages_for(train[0])

How much does this cost to the nearest dollar?

Delphi FG0166 Fuel Pump Module
Delphi brings 80 years of OE Heritage into each Delphi pump, ensuring quality and fitment for each Delphi part. Part is validated, tested and matched to the right vehicle application Delphi brings 80 years of OE Heritage into each Delphi assembly, ensuring quality and fitment for each Delphi part Always be sure to check and clean fuel tank to avoid unnecessary returns Rigorous OE-testing ensures the pump can withstand extreme temperatures Brand Delphi, Fit Type Vehicle Specific Fit, Dimensions LxWxH 19.7 x 7.7 x 5.1 inches, Weight 2.2 Pounds, Auto Part Position Unknown, Operation Mode Mechanical, Manufacturer Delphi, Model FUEL PUMP, Dimensions 19.7

Price is $


[{'role': 'system',
  'content': 'You estimate prices of items. Reply only with the price, no explanation'},
 {'role': 'user',
  'content': 'How much does this cost?\n\nDelphi FG0166 Fuel Pump Module\nDelphi brings 80 years of OE Heritage into each Delphi pump, ensuring quality and fitment for each Delphi part. Part is validated, tested and matched to the right vehicle application Delphi brings 80 years of OE Heritage into each Delphi assembly, ensuring quality and fitment for each Delphi part Always be sure to check and clean fuel tank to avoid unnecessary returns Rigorous OE-testing ensures the pump can withstand extreme temperatures Brand Delphi, Fit Type Vehicle Specific Fit, Dimensions LxWxH 19.7 x 7.7 x 5.1 inches, Weight 2.2 Pounds, Auto Part Position Unknown, Operation Mode Mechanical, Manufacturer Delphi, Model FUEL PUMP, Dimensions 19.7'},
 {'role': 'assistant', 'content': 'Price is $226.95'}]

In [25]:
# Convert the items into a "jsonl" string: one JSON object per line
# Each row represents one training example (a full conversation) in the form:
# {"messages" : [{"role": "system", "content": "You estimate prices...


def make_jsonl(items):
    result = ""
    for item in items:
        messages = messages_for(item)
        messages_str = json.dumps(messages)
        result += '{"messages": ' + messages_str +'}\n'
    return result.strip()

In [26]:
print(make_jsonl(train[:3]))

{"messages": [{"role": "system", "content": "You estimate prices of items. Reply only with the price, no explanation"}, {"role": "user", "content": "How much does this cost?\n\nDelphi FG0166 Fuel Pump Module\nDelphi brings 80 years of OE Heritage into each Delphi pump, ensuring quality and fitment for each Delphi part. Part is validated, tested and matched to the right vehicle application Delphi brings 80 years of OE Heritage into each Delphi assembly, ensuring quality and fitment for each Delphi part Always be sure to check and clean fuel tank to avoid unnecessary returns Rigorous OE-testing ensures the pump can withstand extreme temperatures Brand Delphi, Fit Type Vehicle Specific Fit, Dimensions LxWxH 19.7 x 7.7 x 5.1 inches, Weight 2.2 Pounds, Auto Part Position Unknown, Operation Mode Mechanical, Manufacturer Delphi, Model FUEL PUMP, Dimensions 19.7"}, {"role": "assistant", "content": "Price is $226.95"}]}
{"messages": [{"role": "system", "content": "You estimate prices of items. 

In [27]:
# Convert the items into jsonl and write them to a file

def write_jsonl(items, filename):
    with open(filename, "w") as f:
        jsonl = make_jsonl(items)
        f.write(jsonl)

In [28]:
write_jsonl(fine_tune_train, "fine_tune_train.jsonl")

In [29]:
write_jsonl(fine_tune_validation, "fine_tune_validation.jsonl")
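
Before uploading, it can be worth reading the files back to confirm every line parses. The sketch below is one possible check, not part of the original notebook; `check_jsonl` is a hypothetical helper, and the rule that the last message must come from the assistant is an assumption based on the examples built above.

```python
import json
import os
import tempfile

def check_jsonl(path):
    """Return the number of records, raising ValueError on the first bad line."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"line {line_number}: not valid JSON") from error
            messages = record.get("messages") if isinstance(record, dict) else None
            if not messages:
                raise ValueError(f"line {line_number}: missing 'messages' list")
            if messages[-1].get("role") != "assistant":
                raise ValueError(f"line {line_number}: last message is not from the assistant")
            count += 1
    return count

# Demo on a throwaway file holding one hypothetical record
record = {"messages": [
    {"role": "user", "content": "How much does this cost?\n\nWidget A"},
    {"role": "assistant", "content": "Price is $12.50"},
]}
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    tmp.write(json.dumps(record))
demo_count = check_jsonl(tmp.name)
os.remove(tmp.name)
print(demo_count)
```

With the split used in this notebook, `check_jsonl("fine_tune_train.jsonl")` would be expected to return 200 and `check_jsonl("fine_tune_validation.jsonl")` 50.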

In [30]:
## Upload to OpenAI

# IMPORTANT: "rb" (read binary) opens the file as raw bytes

with open("fine_tune_train.jsonl", "rb") as f:
    train_file = openai.files.create(file=f, purpose="fine-tune")

In [31]:
train_file

FileObject(id='file-NY3WrFPGfVYxBVhhVdDJG6', bytes=188543, created_at=1744285865, filename='fine_tune_train.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None, expires_at=None)

In [32]:
## Upload to OpenAI

with open("fine_tune_validation.jsonl", "rb") as f:
    validation_file = openai.files.create(file=f, purpose="fine-tune")

In [33]:
validation_file

FileObject(id='file-C9DxZLRKUQPE2JQcwU2XzC', bytes=47036, created_at=1744285875, filename='fine_tune_validation.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None, expires_at=None)

# Step 2

I love Weights and Biases - a beautiful, free platform for monitoring training runs.  
Weights and Biases is integrated with OpenAI for fine-tuning.

First set up your weights & biases free account at:

https://wandb.ai

From the Avatar >> Settings menu, near the bottom, you can create an API key.

Then visit the OpenAI dashboard at:

https://platform.openai.com/account/organization

In the integrations section, you can add your Weights & Biases key.

## And now time to Fine-tune!

In [34]:
wandb_integration = {"type": "wandb", "wandb": {"project": "gpt-pricer"}}

In [37]:
print(train_file.id)
print(validation_file.id)

file-NY3WrFPGfVYxBVhhVdDJG6
file-C9DxZLRKUQPE2JQcwU2XzC


In [43]:
# SEE: https://platform.openai.com/docs/models/gpt-4o-mini
## The latest version is still gpt-4o-mini-2024-07-18 (as of April 2025)
## Using mini because when we tested earlier vs gpt-4o (regular) there was little difference
## and mini is MUCH cheaper.

#- seed : makes results repeatable
#- hyperparameters: configures any extra optimisation/tuning params for training, eg n_epochs
#  - n_epochs : OPTIONAL, but we are providing 200 data points which
#             is already more than recommended.
#             In other words, we have plenty of training data.
#             With less training data, we may have wanted to run several "epochs".

#- integrations: configures "wandb" (https://wandb.ai/)
#- suffix : OPTIONAL, adds to the name of the resulting model


openai.fine_tuning.jobs.create(
    training_file=train_file.id,
    validation_file=validation_file.id,
    model="gpt-4o-mini-2024-07-18",
    seed=42,
    hyperparameters={"n_epochs": 1},
    integrations = [wandb_integration],
    suffix="pricer"
)

FineTuningJob(id='ftjob-aM235qvMmHMpgEM2BBL9FKgh', created_at=1744369243, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size='auto', learning_rate_multiplier='auto', n_epochs=1), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-xs4ecm7YkfNN9ZAa9jPpQvcw', result_files=[], seed=42, status='validating_files', trained_tokens=None, training_file='file-NY3WrFPGfVYxBVhhVdDJG6', validation_file='file-C9DxZLRKUQPE2JQcwU2XzC', estimated_finish=None, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='gpt-pricer', entity=None, name=None, tags=None, run_id='ftjob-aM235qvMmHMpgEM2BBL9FKgh'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size='auto', learning_rate_multiplier='auto', n_epochs=1)), type='supervised'), user_provided_suffix='pricer', metadata=None)

In [44]:
openai.fine_tuning.jobs.list(limit=1)

SyncCursorPage[FineTuningJob](data=[FineTuningJob(id='ftjob-aM235qvMmHMpgEM2BBL9FKgh', created_at=1744369243, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size=1, learning_rate_multiplier=1.8, n_epochs=1), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-xs4ecm7YkfNN9ZAa9jPpQvcw', result_files=[], seed=42, status='running', trained_tokens=None, training_file='file-NY3WrFPGfVYxBVhhVdDJG6', validation_file='file-C9DxZLRKUQPE2JQcwU2XzC', estimated_finish=1744369782, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='gpt-pricer', entity=None, name=None, tags=None, run_id='ftjob-aM235qvMmHMpgEM2BBL9FKgh'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size=1, learning_rate_multiplier=1.8, n_epochs=1)), type='supervised'), user_provided_suffix='pricer', metadata

In [45]:
job_id = openai.fine_tuning.jobs.list(limit=1).data[0].id

In [46]:
job_id

'ftjob-aM235qvMmHMpgEM2BBL9FKgh'

In [47]:
openai.fine_tuning.jobs.retrieve(job_id)

FineTuningJob(id='ftjob-aM235qvMmHMpgEM2BBL9FKgh', created_at=1744369243, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size=1, learning_rate_multiplier=1.8, n_epochs=1), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-xs4ecm7YkfNN9ZAa9jPpQvcw', result_files=[], seed=42, status='running', trained_tokens=None, training_file='file-NY3WrFPGfVYxBVhhVdDJG6', validation_file='file-C9DxZLRKUQPE2JQcwU2XzC', estimated_finish=1744369775, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='gpt-pricer', entity=None, name=None, tags=None, run_id='ftjob-aM235qvMmHMpgEM2BBL9FKgh'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size=1, learning_rate_multiplier=1.8, n_epochs=1)), type='supervised'), user_provided_suffix='pricer', metadata=None)
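
Rather than re-running the retrieve cell by hand, the status can be polled in a loop. This is a minimal sketch, not part of the original notebook: `wait_for_job` is a hypothetical helper, the set of terminal statuses is an assumption about the API, and the demo uses a fake `retrieve` function so it runs without network access.

```python
import time
from types import SimpleNamespace

# Assumed terminal statuses; anything else is treated as "still in progress"
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_job(retrieve, job_id, poll_seconds=30, max_polls=120):
    """Call retrieve(job_id) until the job reports a terminal status."""
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")

# Demo with a fake retrieve function that walks through three statuses
_statuses = iter(["validating_files", "running", "succeeded"])

def fake_retrieve(_job_id):
    return SimpleNamespace(status=next(_statuses))

print(wait_for_job(fake_retrieve, "ftjob-example", poll_seconds=0).status)
```

In this notebook the real call would look like `wait_for_job(openai.fine_tuning.jobs.retrieve, job_id)`.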

In [56]:
events = openai.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10).data

In [61]:
events = openai.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10).data
for index, obj in enumerate(events):
    print(f"EVENT{index} - {obj.message}\n----------\n{obj.data}\n\n")

EVENT0 - The job has successfully completed
----------
{}


EVENT1 - New fine-tuned model created
----------
{}


EVENT2 - Step 200/200: training loss=1.14, validation loss=1.13, full validation loss=1.12
----------
{'step': 200, 'train_loss': 1.1376311779022217, 'valid_loss': 1.1295273303985596, 'total_steps': 200, 'full_valid_loss': 1.1193085956573485, 'train_mean_token_accuracy': 0.75, 'valid_mean_token_accuracy': 0.75, 'full_valid_mean_token_accuracy': 0.7925}


EVENT3 - Step 199/200: training loss=1.42
----------
{'step': 199, 'train_loss': 1.4206628799438477, 'total_steps': 200, 'train_mean_token_accuracy': 0.75}


EVENT4 - Step 198/200: training loss=0.52
----------
{'step': 198, 'train_loss': 0.5175371170043945, 'total_steps': 200, 'train_mean_token_accuracy': 0.875}


EVENT5 - Step 197/200: training loss=1.24
----------
{'step': 197, 'train_loss': 1.2442662715911865, 'total_steps': 200, 'train_mean_token_accuracy': 0.75}


EVENT6 - Step 196/200: training loss=0.87
----------
{

In [62]:
## NOTE: Step x/200 (eg 199/200) because we have 200 training data points.
## i.e. each data point is a "step" (batch_size is 1)

events = openai.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=250).data
for index, obj in enumerate(events):
    print(f"EVENT{index} - {obj.message}")

EVENT0 - The job has successfully completed
EVENT1 - New fine-tuned model created
EVENT2 - Step 200/200: training loss=1.14, validation loss=1.13, full validation loss=1.12
EVENT3 - Step 199/200: training loss=1.42
EVENT4 - Step 198/200: training loss=0.52
EVENT5 - Step 197/200: training loss=1.24
EVENT6 - Step 196/200: training loss=0.87
EVENT7 - Step 195/200: training loss=1.25
EVENT8 - Step 194/200: training loss=0.99
EVENT9 - Step 193/200: training loss=1.23
EVENT10 - Step 192/200: training loss=1.41
EVENT11 - Step 191/200: training loss=1.06
EVENT12 - Step 190/200: training loss=0.92, validation loss=0.91
EVENT13 - Step 189/200: training loss=1.76
EVENT14 - Step 188/200: training loss=1.24
EVENT15 - Step 187/200: training loss=1.11
EVENT16 - Step 186/200: training loss=1.31
EVENT17 - Step 185/200: training loss=1.62
EVENT18 - Step 184/200: training loss=0.90
EVENT19 - Step 183/200: training loss=0.75
EVENT20 - Step 182/200: training loss=1.52
EVENT21 - Step 181/200: training loss=
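
The step messages can be turned into numbers for a quick look at the loss curve. The sketch below is illustrative only: `training_losses` and `expected_steps` are hypothetical helpers, the regular expression is inferred from the message format printed above, and the step-count formula is an assumption that happens to agree with the 200/200 seen here (200 examples, 1 epoch, batch size 1).

```python
import math
import re

def expected_steps(n_examples, n_epochs, batch_size):
    """Assumed relationship: one step per batch, repeated for each epoch."""
    return math.ceil(n_examples / batch_size) * n_epochs

STEP_PATTERN = re.compile(r"Step (\d+)/(\d+): training loss=(\d+(?:\.\d+)?)")

def training_losses(messages):
    """Return (step, loss) pairs, oldest step first, skipping non-step messages."""
    pairs = []
    for message in messages:
        match = STEP_PATTERN.search(message)
        if match:
            pairs.append((int(match.group(1)), float(match.group(3))))
    return sorted(pairs)

# A few messages copied from the output above (newest first, as returned)
messages = [
    "The job has successfully completed",
    "New fine-tuned model created",
    "Step 200/200: training loss=1.14, validation loss=1.13, full validation loss=1.12",
    "Step 199/200: training loss=1.42",
    "Step 198/200: training loss=0.52",
]
losses = training_losses(messages)
print(expected_steps(200, 1, 1))
print(losses)
```

With a batch size of 1 the per-step loss is noisy, so an average over a window of steps is usually more informative than any single value.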

# Step 3

Test our fine-tuned model

In [63]:
fine_tuned_model_name = openai.fine_tuning.jobs.retrieve(job_id).fine_tuned_model

In [64]:
fine_tuned_model_name

'ft:gpt-4o-mini-2024-07-18:personal:pricer:BL6Z9eik'
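
The model name packs several pieces of information into one colon-separated string. A small sketch for pulling them apart; `parse_fine_tuned_name` is a hypothetical helper, and the meaning assigned to each field is inferred from this single example rather than from a documented format.

```python
def parse_fine_tuned_name(name):
    """Split a fine-tuned model name into its colon-separated parts."""
    # Unpacking raises ValueError if the name does not have exactly five parts
    prefix, base_model, organization, suffix, model_id = name.split(":")
    if prefix != "ft":
        raise ValueError(f"not a fine-tuned model name: {name}")
    return {
        "base_model": base_model,
        "organization": organization,  # assumed meaning of the third field
        "suffix": suffix,              # matches suffix="pricer" passed at job creation
        "model_id": model_id,
    }

parts = parse_fine_tuned_name("ft:gpt-4o-mini-2024-07-18:personal:pricer:BL6Z9eik")
print(parts)
```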

In [65]:
# The prompt

def test_messages_for(item):
    system_message = "You estimate prices of items. Reply only with the price, no explanation"
    user_prompt = item.test_prompt().replace(" to the nearest dollar","").replace("\n\nPrice is $","")
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": "Price is $"}
    ]

In [66]:
# Try this out

test_messages_for(test[0])

[{'role': 'system',
  'content': 'You estimate prices of items. Reply only with the price, no explanation'},
 {'role': 'user',
  'content': "How much does this cost?\n\nOEM AC Compressor w/A/C Repair Kit For Ford F150 F-150 V8 & Lincoln Mark LT 2007 2008 - BuyAutoParts NEW\nAs one of the world's largest automotive parts suppliers, our parts are trusted every day by mechanics and vehicle owners worldwide. This A/C Compressor and Components Kit is manufactured and tested to the strictest OE standards for unparalleled performance. Built for trouble-free ownership and 100% visually inspected and quality tested, this A/C Compressor and Components Kit is backed by our 100% satisfaction guarantee. Guaranteed Exact Fit for easy installation 100% BRAND NEW, premium ISO/TS 16949 quality - tested to meet or exceed OEM specifications Engineered for superior durability, backed by industry-leading unlimited-mileage warranty Included in this K"},
 {'role': 'assistant', 'content': 'Price is $'}]

In [67]:
# A utility function to extract the price from a string

def get_price(s):
    s = s.replace('$','').replace(',','')
    match = re.search(r"[-+]?\d*\.\d+|\d+", s)
    return float(match.group()) if match else 0

In [68]:
get_price("The price is roughly $99.99 because blah blah")

99.99
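
A few more inputs show how `get_price` behaves at the edges. The function below is copied from the cell above so the sketch runs on its own; the sample strings are made up.

```python
import re

def get_price(s):
    s = s.replace('$', '').replace(',', '')
    match = re.search(r"[-+]?\d*\.\d+|\d+", s)
    return float(match.group()) if match else 0

# Thousands separators are removed before matching
print(get_price("Price is $1,234.50"))
# Whole-dollar amounts fall through to the second alternative in the regex
print(get_price("about 75 dollars"))
# No digits at all: the function falls back to 0
print(get_price("no idea"))
# Caveat: the FIRST number in the string wins, even if it is not the price
print(get_price("2 items for $30"))
```

The last case is why the system prompt asks for the price only and why `max_tokens` is kept small: the less free text in the reply, the less likely a stray number is picked up.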

In [69]:
# The function for the fine-tuned gpt-4o-mini

def gpt_fine_tuned(item):
    response = openai.chat.completions.create(
        model=fine_tuned_model_name, 
#        messages=messages_for(item),
        messages=test_messages_for(item),
        seed=42,
        max_tokens=7
    )
    reply = response.choices[0].message.content
    return get_price(reply)

In [70]:
print(test[0].price)
print(gpt_fine_tuned(test[0]))

374.41
174.77


In [76]:
print(len(test))
print(test[0].test_prompt())

2000
How much does this cost to the nearest dollar?

OEM AC Compressor w/A/C Repair Kit For Ford F150 F-150 V8 & Lincoln Mark LT 2007 2008 - BuyAutoParts NEW
As one of the world's largest automotive parts suppliers, our parts are trusted every day by mechanics and vehicle owners worldwide. This A/C Compressor and Components Kit is manufactured and tested to the strictest OE standards for unparalleled performance. Built for trouble-free ownership and 100% visually inspected and quality tested, this A/C Compressor and Components Kit is backed by our 100% satisfaction guarantee. Guaranteed Exact Fit for easy installation 100% BRAND NEW, premium ISO/TS 16949 quality - tested to meet or exceed OEM specifications Engineered for superior durability, backed by industry-leading unlimited-mileage warranty Included in this K

Price is $


In [75]:
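#Tester.test(gpt_fine_tuned, test)
#Tester.test(gpt_fine_tuned, fine_tune_validation)
# NOTE: this 50-item slice ended in the IndexError shown below, presumably because Tester
# steps through more items than the slice contains (an assumption; Tester's code is not shown here).
Tester.test(gpt_fine_tuned, test[250:300])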

1: Guess: $128.66 Truth: $170.99 Error: $42.33 SLE: 0.08 Item: Moen YB6462BN Belfield 2-Light Dual-Moun...
2: Guess: $209.00 Truth: $489.00 Error: $280.00 SLE: 0.72 Item: Yinfente 4/4 Cello 5 Srting Electric Cel...
3: Guess: $276.10 Truth: $427.97 Error: $151.87 SLE: 0.19 Item: Evan Fischer Front Bumper Cover Compatib...
4: Guess: $393.69 Truth: $166.99 Error: $226.70 SLE: 0.73 Item: OREDY Front Struts Coil Spring Compatibl...
5: Guess: $154.65 Truth: $102.98 Error: $51.67 SLE: 0.16 Item: # 4 Duranodic Heavy Duty Door Closer
6: Guess: $127.99 Truth: $249.99 Error: $122.00 SLE: 0.44 Item: Fantasy Flight Games Runewars
7: Guess: $390.99 Truth: $699.99 Error: $309.00 SLE: 0.34 Item: Futaba Systems 7PXR 7-Channel FASST Tran...
8: Guess: $726.65 Truth: $669.36 Error: $57.29 SLE: 0.01 Item: Fabtech FTS22182 Uniball Upper Control A...
9: Guess: $66.47 Truth: $216.99 Error: $150.52 SLE: 1.38 Item: [Blue Frame] for Sam

IndexError: list index out of range
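
The `Error` and `SLE` columns in the rows above can be reproduced by hand. The sketch below assumes SLE means squared log error computed on `1 + price`; Tester's own code is not shown here, but this formula agrees with the first rows of the output.

```python
import math

def absolute_error(guess, truth):
    return abs(guess - truth)

def squared_log_error(guess, truth):
    # Assumed definition: (log(1 + truth) - log(1 + guess)) squared
    return (math.log1p(truth) - math.log1p(guess)) ** 2

# (guess, truth) pairs taken from rows 1 to 3 of the output above
rows = [(128.66, 170.99), (209.00, 489.00), (276.10, 427.97)]
for guess, truth in rows:
    error = absolute_error(guess, truth)
    sle = squared_log_error(guess, truth)
    print(f"Error: ${error:.2f} SLE: {sle:.2f}")
```

Because it works on logarithms, this measure penalises a guess that is off by a given ratio about equally at any price level, which is why row 2 (a $280 miss on a $489 item) and row 4 (a $227 miss on a $167 item) score almost the same.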