## Fine Tuning a Model

In this notebook we continue our work cleaning the Dram Shop items. We've followed a pretty class progression: 

1. Begin writing prompts to do the cleaning in the web UI.
1. Use the API to explore "zero shot" learning, measuring accuracy and developing a training data set.
1. Build a prompt that has a "few shot" learning approach with 8-15 training examples included in the prompt. 

I completed some of this work as part of my data engineering work for the integration of the Dram Shop data into "Telling Stories with Data". I used the "few shot" approach to clean all the beer and wine items. In this notebook we'll build [fine-tuned models](https://platform.openai.com/docs/guides/fine-tuning) for both of these categories. (And we'll throw Cider in the mix because it seems pretty similar to beer.) Let's get started.



In [57]:
import os
import openai
import pandas as pd
import random
from pprint import pprint
import json
import tiktoken # you may not have this installed. It's
                # useful for counting tokens

In [58]:
openai.api_key = os.getenv("OPENAI_API_KEY")


In [59]:
full_item_data = pd.read_csv('../data/20231101_item_data.csv')

Let's start by looking at a few of the values in the raw data.

In [60]:
full_item_data.query('meta_category == "Beer"').sample(10)[['name','clean_item_name']]

Unnamed: 0,name,clean_item_name
31308,ZP Super Pils - Bavik (1),Super Pils
8635,Z10P Tropical IPA - Great Burn,Tropical IPA
31500,ZP Mango Cart - Golden Road,Mango Cart
27255,Z NITRO CBS - Founders,NITRO CBS
10935,ZP Creek Side Session IPA Blacksmith,Creek Side Session IPA
27462,Z22P White Stout - Missouri River Brewing,White Stout
29806,Z02P Kolsch - Fruh,Kolsch
7482,Z15C Snow Brainer Hazy IPA - Gild,Snow Brainer Hazy IPA
24825,ZM Ripper - Stone,Ripper
30974,ZP Mad Mile Cream Ale - Bridger Brewing,Mad Mile Cream Ale


In [65]:
full_item_data.query('meta_category == "Wine"').sample(10)[['name','clean_item_name']]

Unnamed: 0,name,clean_item_name
21359,Z Jauma - Archie's Shiraz 2017,Archie's Shiraz
22450,Z Roca Altxerri Camino,Roca Altxerri Camino
21422,Z Ochoa Calendas Tempranillo Garnacha blend 2016,Calendas Tempranillo Garnacha blend
21825,ZLa Lecciaia Lupaia Toscana IGT,Lupaia Toscana IGT
21297,z Christina St Laurent - Low Intervention - 2021,Christina St Laurent
21839,ZRoco Winery Gravel Road Pinot Noir 2014,Gravel Road Pinot Noir
16255,Z Chardonnay - Tangent,Chardonnay
28589,Z Aubry Champagne Brut Premier Cru,Aubry Champagne Brut Premier Cru
16475,Z35W Coopers Hall Rose' of Malbec,Rose' of Malbec
16126,ZR Piccola Cellars Upright Red Mourvedre/Syrah...,Upright Red Mourvedre/Syrah Blend


Let's put together the fine-tuning training set. I'm going to write out 500 beer/cider combinations and then 250 wine combinations. (Wine is less critical to the business, I know it less well, and the site above says you can start with only 50 observations.)

If you were doing this for real, you would manually go through these files deleting any examples that were not good training samples. I've already done that for you, so we'll write out our examples to `_for_cleaning` files and then below we'll read in the same files with without that string in the file name. To be clear: you don't need to do this! 

In [66]:
random.seed(20231101)

to_write = full_item_data[full_item_data['meta_category'].isin(['Beer', 'Cider'])].sample(500)[['name','clean_item_name']]
to_write.to_csv('../data/beer_cider_sample_for_cleaning.csv', index=False)


In [67]:
to_write = full_item_data[full_item_data['meta_category'].isin(['Wine'])].sample(250)[['name','clean_item_name']]
to_write.to_csv('../data/wine_sample_for_cleaning.csv', index=False)


Pause here for a moment and reflect that you don't have to go clean these. :-) 

## Building a Fine-Tuned Model

Now that "we" have hand-cleaned the training data, we'll read it in and follow the steps necessary to build a fine-tuned model.

In [10]:
beer_training = pd.read_csv('../data/beer_cider_sample.csv')
wine_training = pd.read_csv('../data/wine_sample.csv')

```
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

```

In [16]:
beer_system_prompt = "You are a worldwide expert in beer and cider and a master cicerone."
wine_system_prompt = "You are a worldwide expert in wine and a master sommelier."

The format of these fine-tuned examples are a bit unusual and I'm guessing this is going to continue to evolve. We're going to fine-tune the `gpt-3.5-turbo` model, which requires fine-tuning prompts in a conversational format. We'll have the system prompt and user prompt as before, but now we'll have the "assistant" role as part of the object that will contain the answer. 

There are ways to build this directly from Pandas, but since JSON looks a lot like a Python dictionary _and_ that's a format I'm more comfortable with, I'm going to build the object manually. 

In [19]:
prompts = []

for i, row in beer_training.iterrows():
  name = row['name']
  clean_name = row['clean_item_name']

  prompts.append({
    'messages':[
      {"role":"system", "content": beer_system_prompt},
      {"role":"user", "content": f"What beverage is in this name: {name}?"},
      {"role":"assistant", "content": clean_name}
    ] 
  })

  

In [21]:
with open('../data/beer_cider_prompts.jsonl', 'w') as f:
  for prompt in prompts : 
    json.dump(prompt, f)
    f.write("\n")

At this point, we've built the raw materials to train our fine-tuned model.

In [24]:
openai.File.create(
  file=open("../data/beer_cider_prompts.jsonl", "rb"),
  purpose='fine-tune'
)


<File file id=file-qY4vsgtPYaABbXl4Vw3JGwci at 0x7f8df8e4e2c0> JSON: {
  "object": "file",
  "id": "file-qY4vsgtPYaABbXl4Vw3JGwci",
  "purpose": "fine-tune",
  "filename": "file",
  "bytes": 135279,
  "created_at": 1698869428,
  "status": "processed",
  "status_details": null
}

In [25]:
openai.FineTuningJob.create(training_file="file-qY4vsgtPYaABbXl4Vw3JGwci", 
                            model="gpt-3.5-turbo")

<FineTuningJob fine_tuning.job id=ftjob-BovXPBbKbWi6BFhBPuL2RzcE at 0x7f8df8e4e7c0> JSON: {
  "object": "fine_tuning.job",
  "id": "ftjob-BovXPBbKbWi6BFhBPuL2RzcE",
  "model": "gpt-3.5-turbo-0613",
  "created_at": 1698869467,
  "finished_at": null,
  "fine_tuned_model": null,
  "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
  "result_files": [],
  "status": "validating_files",
  "validation_file": null,
  "training_file": "file-qY4vsgtPYaABbXl4Vw3JGwci",
  "hyperparameters": {
    "n_epochs": "auto"
  },
  "trained_tokens": null,
  "error": null
}

The link has a number of functions we can call to check on the status of the fine tuning, which can take a while.

In [68]:
openai.FineTuningJob.list(limit=3)

<OpenAIObject list at 0x7f8e0a0edf90> JSON: {
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job",
      "id": "ftjob-2dyPuhyf0QAZDH9cpTKkZGpx",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1698872705,
      "finished_at": 1698876300,
      "fine_tuned_model": "ft:gpt-3.5-turbo-0613:john-chandler-umt::8GDcbV4N",
      "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
      "result_files": [
        "file-x0EkrSHkszHniHIvTFh7008x"
      ],
      "status": "succeeded",
      "validation_file": null,
      "training_file": "file-70cfrEIWD3QX1aOZQmfgMHNz",
      "hyperparameters": {
        "n_epochs": 3
      },
      "trained_tokens": 74592,
      "error": null
    },
    {
      "object": "fine_tuning.job",
      "id": "ftjob-McyJGEiW34umr0G5Hvw6KBDX",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1698870111,
      "finished_at": 1698873228,
      "fine_tuned_model": "ft:gpt-3.5-turbo-0613:john-chandler-umt::8GCp3BDs",
      "organization_id":

In [48]:
openai.FineTuningJob.retrieve("ftjob-BovXPBbKbWi6BFhBPuL2RzcE")
model_name = openai.FineTuningJob.retrieve("ftjob-BovXPBbKbWi6BFhBPuL2RzcE")['fine_tuned_model']

if model_name : 
  print(f"Our model ID is {model_name}") 

Our model ID is ft:gpt-3.5-turbo-0613:john-chandler-umt::8GCeP5e3


Now let's use the model to see how it works. 

In [71]:
data_for_scoring = full_item_data[full_item_data['meta_category'].isin(["Beer","Cider"])].sample(10)

In [72]:
for i, row in data_for_scoring.iterrows():
  name = row['name']
  clean_name = row['clean_item_name']
  print("-"*40)
  print(f"Name: {name}")
  print(f"Clean Name: {clean_name}")

  completion = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:john-chandler-umt::8GCeP5e3",
    messages=[
      {"role": "system", "content": beer_system_prompt},
      {"role": "user", "content": f"What beverage is in this name: {name}?"}
    ]
  )
  
  chat_reply = completion.choices[0].message
  print(f"Chat Reply: {chat_reply}")
  print("-"*40)
  print("\n")


----------------------------------------
Name: Z11P IPA - Stone
Clean Name: IPA
Chat Reply: {
  "role": "assistant",
  "content": "IPA"
}
----------------------------------------


----------------------------------------
Name: Z06P Switchyard Scottish Ale - Carter's
Clean Name: Switchyard Scottish Ale
Chat Reply: {
  "role": "assistant",
  "content": "Switchyard Scottish Ale"
}
----------------------------------------


----------------------------------------
Name: Z26M Twisted Karma- Mountains Walking
Clean Name: Twisted Karma
Chat Reply: {
  "role": "assistant",
  "content": "Twisted Karma"
}
----------------------------------------


----------------------------------------
Name: Z Stone Old Guardian
Clean Name: Stone Old Guardian
Chat Reply: {
  "role": "assistant",
  "content": "Stone Old Guardian"
}
----------------------------------------


----------------------------------------
Name: ZP Plum St. Porter - Bozeman Brewing
Clean Name: Plum St. Porter
Chat Reply: {
  "role": "a

### Fine Tuning a Wine Model

Now we'll do the same for wine.

In [35]:
prompts = []

for i, row in beer_training.iterrows():
  name = row['name']
  clean_name = row['clean_item_name']

  prompts.append({
    'messages':[
      {"role":"system", "content": wine_system_prompt},
      {"role":"user", "content": f"What wine is in this name: {name}?"},
      {"role":"assistant", "content": clean_name}
    ] 
  })

  

In [36]:
with open('../data/wine_prompts.jsonl', 'w') as f:
  for prompt in prompts : 
    json.dump(prompt, f)
    f.write("\n")

At this point, we've built the raw materials to train our fine-tuned model.

In [37]:
openai.File.create(
  file=open("../data/wine_prompts.jsonl", "rb"),
  purpose='fine-tune'
)


<File file id=file-70cfrEIWD3QX1aOZQmfgMHNz at 0x7f8e38cd5d10> JSON: {
  "object": "file",
  "id": "file-70cfrEIWD3QX1aOZQmfgMHNz",
  "purpose": "fine-tune",
  "filename": "file",
  "bytes": 128779,
  "created_at": 1698870084,
  "status": "processed",
  "status_details": null
}

In [51]:
openai.FineTuningJob.create(training_file="file-70cfrEIWD3QX1aOZQmfgMHNz", model="gpt-3.5-turbo")

<FineTuningJob fine_tuning.job id=ftjob-2dyPuhyf0QAZDH9cpTKkZGpx at 0x7f8e48b6e130> JSON: {
  "object": "fine_tuning.job",
  "id": "ftjob-2dyPuhyf0QAZDH9cpTKkZGpx",
  "model": "gpt-3.5-turbo-0613",
  "created_at": 1698872705,
  "finished_at": null,
  "fine_tuned_model": null,
  "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
  "result_files": [],
  "status": "validating_files",
  "validation_file": null,
  "training_file": "file-70cfrEIWD3QX1aOZQmfgMHNz",
  "hyperparameters": {
    "n_epochs": "auto"
  },
  "trained_tokens": null,
  "error": null
}

The link has a number of functions we can call to check on the status of the fine tuning, which can take a while.

In [52]:
openai.FineTuningJob.list(limit=3)

<OpenAIObject list at 0x7f8e0a0e5630> JSON: {
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job",
      "id": "ftjob-2dyPuhyf0QAZDH9cpTKkZGpx",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1698872705,
      "finished_at": null,
      "fine_tuned_model": null,
      "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
      "result_files": [],
      "status": "queued",
      "validation_file": null,
      "training_file": "file-70cfrEIWD3QX1aOZQmfgMHNz",
      "hyperparameters": {
        "n_epochs": 3
      },
      "trained_tokens": null,
      "error": null
    },
    {
      "object": "fine_tuning.job",
      "id": "ftjob-McyJGEiW34umr0G5Hvw6KBDX",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1698870111,
      "finished_at": null,
      "fine_tuned_model": null,
      "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
      "result_files": [],
      "status": "running",
      "validation_file": null,
      "training_file": "file-70cfrEIWD3Q

In [54]:
print(openai.FineTuningJob.retrieve("ftjob-McyJGEiW34umr0G5Hvw6KBDX"))
model_name = openai.FineTuningJob.retrieve("ftjob-McyJGEiW34umr0G5Hvw6KBDX")['fine_tuned_model']

if model_name : 
  print(f"Our model ID is {model_name}") 

{
  "object": "fine_tuning.job",
  "id": "ftjob-McyJGEiW34umr0G5Hvw6KBDX",
  "model": "gpt-3.5-turbo-0613",
  "created_at": 1698870111,
  "finished_at": null,
  "fine_tuned_model": null,
  "organization_id": "org-iMaZSwjCe3pTA3LXtnAOSclE",
  "result_files": [],
  "status": "running",
  "validation_file": null,
  "training_file": "file-70cfrEIWD3QX1aOZQmfgMHNz",
  "hyperparameters": {
    "n_epochs": 3
  },
  "trained_tokens": null,
  "error": null
}


Now let's use the model to see how it works. 

In [55]:
data_for_scoring = full_item_data.query("meta_category == 'Wine'").sample(10)

In [56]:
for i, row in data_for_scoring.iterrows():
  name = row['name']
  clean_name = row['clean_item_name']
  print("-"*40)
  print(f"Name: {name}")
  print(f"Clean Name: {clean_name}")

  completion = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:john-chandler-umt::8GCp3BDs",
    messages=[
      {"role": "system", "content": wine_system_prompt},
      {"role": "user", "content": f"What wine is in this name: {name}?"}
    ]
  )
  
  chat_reply = completion.choices[0].message
  print(f"Chat Reply: {chat_reply}")
  print("-"*40)
  print("\n")


----------------------------------------
Name: Z40W Poggio Anima Asmodeus Nero d'Avola
Clean Name: Asmodeus Nero d'Avola
Chat Reply: {
  "role": "assistant",
  "content": "Asmodeus Nero d'Avola"
}
----------------------------------------


----------------------------------------
Name: Z Bodegas Albero Tempranillo
Clean Name: Tempranillo
Chat Reply: {
  "role": "assistant",
  "content": "Bodegas Albero Tempranillo"
}
----------------------------------------


----------------------------------------
Name: Z Vinum Pinot Noir
Clean Name: Pinot Noir
Chat Reply: {
  "role": "assistant",
  "content": "Vinum Pinot Noir"
}
----------------------------------------


----------------------------------------
Name: Z Queen of The Sierra Red
Clean Name: Queen of The Sierra Red
Chat Reply: {
  "role": "assistant",
  "content": "Queen of The Sierra Red"
}
----------------------------------------


----------------------------------------
Name: Domaine Faury Rhodaniennes Syrah 2022
Clean Name: Rhodan