## Fine-tune GPT-3 with Weights & Biases

## Imports

In [2]:
import openai
import wandb
import pandas as pd
from dotenv import find_dotenv, load_dotenv
dotenv_path = find_dotenv()
load_dotenv(dotenv_path)
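
The `openai` client reads its key from the `OPENAI_API_KEY` environment variable and `wandb` reads `WANDB_API_KEY`, so it can help to confirm that `load_dotenv` actually populated them before going further. A minimal sketch, assuming those are the names defined in the `.env` file; the helper `missing_env_vars` is hypothetical and not part of either library:

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Assumed variable names; adjust to whatever the .env file defines.
required = ["OPENAI_API_KEY", "WANDB_API_KEY"]
missing = missing_env_vars(required)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```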


## Dataset preparation

In [3]:
# create a job for splitting dataset
project_name = "GPT-3 blog title"
run = wandb.init(project=project_name, job_type='split dataset')

[34m[1mwandb[0m: Currently logged in as: [33mbenneo[0m. Use [1m`wandb login --relogin`[0m to force relogin


In [6]:
dataset_path= "../data/2_final/prompts.jsonl"

Let's look at a few samples from the dataset.

In [7]:
!head $dataset_path

{"prompt":"Title: Literally Nobody Voted in the Quincy Midterm Elections ->","completion":" good"}
{"prompt":"Title: Neural Networks: Is Your Brain Like A Computer? ->","completion":" good"}
{"prompt":"Title: Must have MacBook apps for productivity ->","completion":" good"}
{"prompt":"Title: Remedial Data Science Engineering ->","completion":" good"}
{"prompt":"Title: Verifiable Deployment of Smart Contracts ->","completion":" good"}
{"prompt":"Title: Security Tokens vs. Fat Protocols ->","completion":" good"}
{"prompt":"Title: The Fundamental Problem of the Data Economy Nobody is Talking About ->","completion":" good"}
{"prompt":"Title: Hello Triangle, Meet Swift! (And Wide Color) ->","completion":" good"}
{"prompt":"Title: Time series analysis and its different approach in python : Part 1 ->","completion":" good"}
{"prompt":"Title: Making serverless variables work for you ->","completion":" good"}


Verify that the data is correctly formatted.
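
Before handing the file to the CLI, the same conventions visible in the samples can be checked directly in Python. The sketch below is an illustration, not the CLI's own validation: `check_pairs` is a hypothetical helper, and the rule that completions start with a space is inferred from the `" good"` completions shown above.

```python
import json


def check_pairs(lines, prefix="Title: ", suffix=" ->"):
    """Return (line_number, problem) tuples for malformed prompt/completion records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "invalid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        prompt = record.get("prompt")
        completion = record.get("completion")
        if not isinstance(prompt, str) or not isinstance(completion, str):
            problems.append((number, "missing prompt or completion"))
            continue
        if not prompt.startswith(prefix):
            problems.append((number, "prompt lacks prefix"))
        if not prompt.endswith(suffix):
            problems.append((number, "prompt lacks suffix"))
        if not completion.startswith(" "):
            problems.append((number, "completion lacks leading space"))
    return problems


sample = [
    '{"prompt":"Title: Remedial Data Science Engineering ->","completion":" good"}',
    '{"prompt":"Remedial Data Science Engineering","completion":"good"}',
]
print(check_pairs(sample))
```

On the real file, the call would be `check_pairs(open(dataset_path))`; an empty list means every record follows the expected shape.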

## Train/valid split with the OpenAI CLI

In [8]:
!openai tools fine_tunes.prepare_data -f $dataset_path -q

Analyzing...

- Your file contains 33373 prompt-completion pairs
- Based on your data it seems like you're trying to fine-tune a model for classification
- For classification, we recommend you try one of the faster and cheaper models, such as `ada`
- For classification, you can estimate the expected model performance by keeping a held out dataset, which is not used for training
- All prompts end with suffix ` ->`
- All prompts start with prefix `Title: `

No remediations found.
- [Recommended] Would you like to split into training and validation set? [Y/n]: Y


Your data will be written to a new JSONL file. Proceed [Y/n]: Y

Wrote modified files to `../data/2_final/prompts_prepared_train.jsonl` and `../data/2_final/prompts_prepared_valid.jsonl`
Feel free to take a look!

Now use that file when fine-tuning:
> openai api fine_tunes.create -t "../data/2_final/prompts_prepared_train.jsonl" -v "../data/2_final/prompts_prepared_valid.jsonl" --compute_classification_metrics --classification_p

In [10]:
# check number of samples
!wc -l ../data/2_final/prompts_prepared_train.jsonl
!wc -l ../data/2_final/prompts_prepared_valid.jsonl

   32373 ../data/2_final/prompts_prepared_train.jsonl
    1000 ../data/2_final/prompts_prepared_valid.jsonl
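
The counts show that 1,000 of the 33,373 pairs were held out for validation. For readers who want to reproduce a split of the same size without the CLI, here is a rough sketch using a seeded shuffle. It is an assumption-laden stand-in, not the CLI's actual splitting logic, and `split_records` is a hypothetical helper.

```python
import random


def split_records(records, n_valid, seed=0):
    """Shuffle a copy of records and hold out n_valid of them for validation."""
    records = list(records)
    if not 0 <= n_valid <= len(records):
        raise ValueError("n_valid must be between 0 and len(records)")
    random.Random(seed).shuffle(records)  # seeded so the split is reproducible
    return records[n_valid:], records[:n_valid]


# Integers stand in for the 33,373 JSONL records.
train, valid = split_records(range(33373), n_valid=1000)
print(len(train), len(valid))
```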


In [14]:
n_train = 32373
n_valid = 1000
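
Hard-coding the counts works, but they can drift if the files are regenerated. A small sketch of computing them instead; `count_records` is a hypothetical helper, and the temporary file only stands in for `train_path` and `valid_path`:

```python
import os
import tempfile


def count_records(path):
    """Count the non-empty lines in a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


# Demonstrate on a throwaway file; in the notebook, pass train_path or valid_path.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"prompt":"Title: A ->","completion":" good"}\n')
    tmp.write('{"prompt":"Title: B ->","completion":" good"}\n')
demo_count = count_records(tmp.name)
os.remove(tmp.name)
print(demo_count)
```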

## Log train/valid split as W&B artifact

In [11]:
# Create tables for better visualization (optional)

train_path = "../data/2_final/prompts_prepared_train.jsonl"
valid_path = "../data/2_final/prompts_prepared_valid.jsonl"

df_train = pd.read_json(train_path, orient='records', lines=True)
df_valid = pd.read_json(valid_path, orient='records', lines=True)
table_train = wandb.Table(dataframe=df_train)
table_valid = wandb.Table(dataframe=df_valid)

## Upload artifacts

In [16]:
# Create artifacts
artifact_train = wandb.Artifact('medium_train.jsonl', type='training_files', metadata={'samples': n_train})
artifact_train.add_file(train_path)
artifact_train.add(table_train, 'medium_train')

artifact_valid = wandb.Artifact('medium_valid.jsonl', type='validation_files', metadata={'samples': n_valid})
artifact_valid.add_file(valid_path)
artifact_valid.add(table_valid, 'medium_valid')

# Log files
run.log_artifact(artifact_train)
run.log_artifact(artifact_valid)

<wandb.sdk.wandb_artifacts.Artifact at 0x12547e250>

In [17]:
entity = wandb.run.entity

In [18]:
wandb.finish()


## Fine-tuning

Retrieve the training and validation datasets from the logged W&B artifacts.

In [21]:
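# wandb.finish() closed the previous run, so start a new one to record artifact usage
# (the job_type value is an arbitrary label)
run = wandb.init(project=project_name, job_type='finetune')
artifact_train = run.use_artifact(f'{entity}/{project_name}/medium_train.jsonl:latest', type='training_files')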
train_file = artifact_train.get_path('prompts_prepared_train.jsonl').download()

artifact_valid = run.use_artifact(f'{entity}/{project_name}/medium_valid.jsonl:latest', type='validation_files')
valid_file = artifact_valid.get_path('prompts_prepared_valid.jsonl').download()

train_file, valid_file

('./artifacts/medium_train.jsonl:v0/prompts_prepared_train.jsonl',
 './artifacts/medium_valid.jsonl:v0/prompts_prepared_valid.jsonl')

## Upload files to OpenAI

In [30]:
# upload train
openai.File.create(
  file=open(train_file, "rb"),
  purpose='fine-tune'
)

# upload validation
openai.File.create(
  file=open(valid_file, "rb"),
  purpose='fine-tune'
)

<File file id=file-ueOemwgI8SRCkiffG18K6WNK at 0x107fc2810> JSON: {
  "bytes": 93362,
  "created_at": 1669849603,
  "filename": "file",
  "id": "file-ueOemwgI8SRCkiffG18K6WNK",
  "object": "file",
  "purpose": "fine-tune",
  "status": "uploaded",
  "status_details": null
}

In [31]:
openai.File.list()

<OpenAIObject list at 0x127f9ee50> JSON: {
  "data": [
    {
      "bytes": 93362,
      "created_at": 1669849603,
      "filename": "file",
      "id": "file-ueOemwgI8SRCkiffG18K6WNK",
      "object": "file",
      "purpose": "fine-tune",
      "status": "processed",
      "status_details": null
    },
    {
      "bytes": 3014524,
      "created_at": 1669849602,
      "filename": "file",
      "id": "file-eRRY7tNbGl7v8c7GLqDr3NRn",
      "object": "file",
      "purpose": "fine-tune",
      "status": "processed",
      "status_details": null
    }
  ],
  "object": "list"
}
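
The fine-tuning commands need the two file IDs from this listing. Copying them by hand works; as an alternative, the sketch below picks them out of a listing shaped like the JSON above. It assumes exactly two `fine-tune` files are present and that the larger one is the training set, which holds here (3,014,524 vs. 93,362 bytes) but is only a heuristic. `train_valid_ids` is a hypothetical helper.

```python
def train_valid_ids(listing):
    """Pick (train_id, valid_id) from a file listing, assuming the larger file is the training set."""
    files = [f for f in listing["data"] if f["purpose"] == "fine-tune"]
    if len(files) != 2:
        raise ValueError(f"expected exactly 2 fine-tune files, found {len(files)}")
    smaller, larger = sorted(files, key=lambda f: f["bytes"])
    return larger["id"], smaller["id"]


# Trimmed copy of the listing shown above.
listing = {
    "data": [
        {"bytes": 93362, "id": "file-ueOemwgI8SRCkiffG18K6WNK", "purpose": "fine-tune"},
        {"bytes": 3014524, "id": "file-eRRY7tNbGl7v8c7GLqDr3NRn", "purpose": "fine-tune"},
    ]
}
train_id, valid_id = train_valid_ids(listing)
print(train_id, valid_id)
```

Keeping the return value of each `openai.File.create` call and reading its `"id"` field would avoid the size heuristic altogether.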

Define the GPT-3 fine-tuning hyperparameters.

In [89]:
!openai api fine_tunes.create \
    -t "file-eRRY7tNbGl7v8c7GLqDr3NRn" \
    -v "file-ueOemwgI8SRCkiffG18K6WNK" \
    -m "ada" \
    --n_epochs 4 \
    --batch_size 256 \
    --classification_n_classes 2 \
    --suffix "blog title scorer" \
    --classification_positive_class " good" \
    --compute_classification_metrics

Created fine-tune: ft-cNrLBjqB9V6g0WLayqpRxcGK
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-11-30 22:57:29] Created fine-tune: ft-cNrLBjqB9V6g0WLayqpRxcGK
[2022-11-30 22:57:33] Fine-tune costs $0.74
[2022-11-30 22:57:34] Fine-tune enqueued. Queue number: 0
[2022-11-30 22:57:35] Fine-tune started


wandb: ERROR Dropped streaming file chunk (see wandb/debug-internal.log)


^C


In [None]:
!openai api fine_tunes.create \
    -t "file-eRRY7tNbGl7v8c7GLqDr3NRn" \
    -v "file-ueOemwgI8SRCkiffG18K6WNK" \
    -m "ada" \
    --n_epochs 4 \
    --batch_size 256 \
    --classification_n_classes 2 \
    --suffix "blog title scorer" \
    --classification_positive_class " good" \
    --prompt_loss_weight 0.1 \
    --compute_classification_metrics

Adding a `learning_rate_multiplier`:

In [82]:
!openai api fine_tunes.create \
    -t "file-eRRY7tNbGl7v8c7GLqDr3NRn" \
    -v "file-ueOemwgI8SRCkiffG18K6WNK" \
    -m "ada" \
    --n_epochs 4 \
    --batch_size 256 \
    --classification_n_classes 2 \
    --suffix "blog title scorer" \
    --classification_positive_class " good" \
    --prompt_loss_weight 0.1 \
    --learning_rate_multiplier 0.2 \
    --compute_classification_metrics

Created fine-tune: ft-2ooqd5t0ycMwvf8TM3MUuyAt
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-11-30 22:26:58] Created fine-tune: ft-2ooqd5t0ycMwvf8TM3MUuyAt
[2022-11-30 22:27:01] Fine-tune costs $0.74
[2022-11-30 22:27:02] Fine-tune enqueued. Queue number: 2
^C
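
The three commands above differ only in a couple of flags, so it may be easier to keep the shared settings in one place. The sketch below builds the equivalent keyword arguments in Python, assuming the pre-1.0 `openai` package, whose `FineTune.create` takes fields with the same names as the CLI flags. `finetune_params` is a hypothetical helper, and the API call itself is left as a comment because it needs credentials and incurs cost.

```python
def finetune_params(train_id, valid_id, model="ada", **overrides):
    """Build keyword arguments mirroring the CLI flags used above."""
    params = {
        "training_file": train_id,
        "validation_file": valid_id,
        "model": model,
        "n_epochs": 4,
        "batch_size": 256,
        "suffix": "blog title scorer",
        "compute_classification_metrics": True,
        "classification_n_classes": 2,
        "classification_positive_class": " good",
    }
    params.update(overrides)  # e.g. prompt_loss_weight, learning_rate_multiplier
    return params


TRAIN_ID = "file-eRRY7tNbGl7v8c7GLqDr3NRn"
VALID_ID = "file-ueOemwgI8SRCkiffG18K6WNK"

baseline = finetune_params(TRAIN_ID, VALID_ID)
with_lr = finetune_params(
    TRAIN_ID, VALID_ID, prompt_loss_weight=0.1, learning_rate_multiplier=0.2
)

# With the pre-1.0 openai package, a job could then be started with:
#   openai.FineTune.create(**with_lr)
print(with_lr)
```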


Using the `babbage` model:

In [41]:
!openai api fine_tunes.create \
    -t "file-eRRY7tNbGl7v8c7GLqDr3NRn" \
    -v "file-ueOemwgI8SRCkiffG18K6WNK" \
    -m "babbage" \
    --n_epochs 4 \
    --batch_size 256 \
    --classification_n_classes 2 \
    --suffix "blog title scorer" \
    --classification_positive_class " good" \
    --prompt_loss_weight 0.1 \
    --compute_classification_metrics

Created fine-tune: ft-3Svmi2GPLKTLnZAQCmHlEYql
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-11-30 21:27:16] Created fine-tune: ft-3Svmi2GPLKTLnZAQCmHlEYql
[2022-11-30 21:27:25] Fine-tune costs $1.11
[2022-11-30 21:27:25] Fine-tune enqueued. Queue number: 0
[2022-11-30 21:27:28] Fine-tune started

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-3Svmi2GPLKTLnZAQCmHlEYql



In [90]:
!openai wandb sync --project "GPT-3 blog title"

[34m[1mwandb[0m: Currently logged in as: [33mbenneo[0m. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Tracking run with wandb version 0.13.5
[34m[1mwandb[0m: Run data is saved locally in [35m[1m/Users/benedictneo/gpt3-blog-title/notebooks/wandb/run-20221130_230647-ft-cNrLBjqB9V6g0WLayqpRxcGK[0m
[34m[1mwandb[0m: Run [1m`wandb offline`[0m to turn off syncing.
[34m[1mwandb[0m: Syncing run [33mft-cNrLBjqB9V6g0WLayqpRxcGK[0m
[34m[1mwandb[0m: ⭐️ View project at [34m[4mhttps://wandb.ai/benneo/GPT-3%20blog%20title[0m
[34m[1mwandb[0m: 🚀 View run at [34m[4mhttps://wandb.ai/benneo/GPT-3%20blog%20title/runs/ft-cNrLBjqB9V6g0WLayqpRxcGK[0m
[34m[1mwandb[0m: Waiting for W&B process to finish... [32m(success).[0m
[34m[1mwandb[0m: | 0.002 MB of 0.002 MB uploaded (0.000 MB deduped)
[34m[1mwandb[0m: Run history:
[34m[1mwandb[0m:      classification/accuracy ▁███
[34m[1mwandb[0m:         classification/auprc ▁█▆▄
[34m[1mwandb[0

In [112]:
!wandb offline

W&B offline. Running your script from this directory will only write metadata locally. Use wandb disabled to completely turn off W&B.


Testing the model

In [91]:
finetunes = openai.FineTune.list()

In [106]:
pd.DataFrame.from_dict(finetunes["data"])

| | object | id | hyperparams | organization_id | model | training_files | validation_files | result_files | created_at | updated_at | status | fine_tuned_model |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fine-tune | ft-8V9ak7gQ4jv3X4zM3wZwa4ma | {'n_epochs': 4, 'batch_size': 256, 'prompt_los... | org-Kz5UVJ3lj9OBEwe4ukIaOuoU | ada | [{'object': 'file', 'id': 'file-eRRY7tNbGl7v8c... | [{'object': 'file', 'id': 'file-ueOemwgI8SRCki... | [] | 1669849957 | 1669849995 | failed | |
| 1 | fine-tune | ft-uSwqlpAX9RlIcFfrYkyNI8g1 | {'n_epochs': 4, 'batch_size': 256, 'prompt_los... | org-Kz5UVJ3lj9OBEwe4ukIaOuoU | ada | [{'object': 'file', 'id': 'file-eRRY7tNbGl7v8c... | [{'object': 'file', 'id': 'file-ueOemwgI8SRCki... | [{'object': 'file', 'id': 'file-AC9o2olJ4SBYSL... | 1669850014 | 1669850837 | succeeded | ada:ft-personal:blog-title-scorer-2022-11-30-2... |
| 2 | fine-tune | ft-3Svmi2GPLKTLnZAQCmHlEYql | {'n_epochs': 4, 'batch_size': 256, 'prompt_los... | org-Kz5UVJ3lj9OBEwe4ukIaOuoU | babbage | [{'object': 'file', 'id': 'file-eRRY7tNbGl7v8c... | [{'object': 'file', 'id': 'file-ueOemwgI8SRCki... | [{'object': 'file', 'id': 'file-mhWxYNRuWpzRWT... | 1669865236 | 1669867527 | succeeded | babbage:ft-personal:blog-title-scorer-2022-12-... |
| 3 | fine-tune | ft-2ooqd5t0ycMwvf8TM3MUuyAt | {'n_epochs': 4, 'batch_size': 256, 'prompt_los... | org-Kz5UVJ3lj9OBEwe4ukIaOuoU | ada | [{'object': 'file', 'id': 'file-eRRY7tNbGl7v8c... | [{'object': 'file', 'id': 'file-ueOemwgI8SRCki... | [{'object': 'file', 'id': 'file-fqRlTuGaUwFyUk... | 1669868818 | 1669869869 | succeeded | ada:ft-personal:blog-title-scorer-2022-12-01-0... |
| 4 | fine-tune | ft-cNrLBjqB9V6g0WLayqpRxcGK | {'n_epochs': 4, 'batch_size': 256, 'prompt_los... | org-Kz5UVJ3lj9OBEwe4ukIaOuoU | ada | [{'object': 'file', 'id': 'file-eRRY7tNbGl7v8c... | [{'object': 'file', 'id': 'file-ueOemwgI8SRCki... | [{'object': 'file', 'id': 'file-8v1xTMFu7WjWnl... | 1669870649 | 1669871181 | succeeded | ada:ft-personal:blog-title-scorer-2022-12-01-0... |


wandb: ERROR Dropped streaming file chunk (see wandb/debug-internal.log)


In [113]:
model_ids = []
for ft in finetunes["data"]:  # avoid reusing the name `run`, which held the W&B run
    if ft["status"] == "succeeded":
        print(f"{ft['fine_tuned_model']} \t {ft['id']}")
        model_ids.append(ft["id"])  # these are fine-tune job IDs, not model names

ada:ft-personal:blog-title-scorer-2022-11-30-23-27-16 	 ft-uSwqlpAX9RlIcFfrYkyNI8g1
babbage:ft-personal:blog-title-scorer-2022-12-01-04-05-26 	 ft-3Svmi2GPLKTLnZAQCmHlEYql
ada:ft-personal:blog-title-scorer-2022-12-01-04-44-27 	 ft-2ooqd5t0ycMwvf8TM3MUuyAt
ada:ft-personal:blog-title-scorer-2022-12-01-05-06-20 	 ft-cNrLBjqB9V6g0WLayqpRxcGK
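
With the model names in hand, a title can be scored by formatting it exactly like the training prompts and asking for a single token. This is a sketch under several assumptions: it targets the pre-1.0 `openai` package (`openai.Completion.create`), it assumes each class label is a single token, and both helper names are hypothetical. Only `build_prompt` runs here; `classify_title` needs an API key and a fine-tuned model.

```python
def build_prompt(title):
    """Format a title the same way as the training prompts."""
    return f"Title: {title.strip()} ->"


def classify_title(title, model):
    """Ask a fine-tuned model for a one-token label."""
    import openai  # imported lazily so build_prompt works without the package

    response = openai.Completion.create(
        model=model,
        prompt=build_prompt(title),
        max_tokens=1,   # the label is assumed to be a single token
        temperature=0,  # deterministic output for classification
    )
    return response["choices"][0]["text"].strip()


print(build_prompt("Making serverless variables work for you"))
```

A call would then look like `classify_title("Some title", "ada:ft-personal:blog-title-scorer-2022-11-30-23-27-16")`, using one of the model names printed above.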


In [117]:
for ft_id in model_ids:
    res = openai.FineTune.retrieve(id=ft_id)
    df = pd.DataFrame.from_dict(res["events"])
    df["created_at"] = pd.to_datetime(df["created_at"], unit='s')
    # span from the first to the last event, so time spent in the queue is included
    total_time = df["created_at"].max() - df["created_at"].min()
    print(f"fine tuning took {total_time}")

fine tuning took 0 days 00:13:43
fine tuning took 0 days 00:38:11
fine tuning took 0 days 00:17:31
fine tuning took 0 days 00:08:52
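
To put the two base models side by side, the logged costs and the durations above can be combined with a little arithmetic. The numbers below are copied from the outputs in this notebook; the durations run from the first to the last event of each job, so they include queue time and are only a rough proxy for training time.

```python
from datetime import timedelta

# (base model, fine-tune id, elapsed time), in the order printed above.
runs = [
    ("ada", "ft-uSwqlpAX9RlIcFfrYkyNI8g1", timedelta(minutes=13, seconds=43)),
    ("babbage", "ft-3Svmi2GPLKTLnZAQCmHlEYql", timedelta(minutes=38, seconds=11)),
    ("ada", "ft-2ooqd5t0ycMwvf8TM3MUuyAt", timedelta(minutes=17, seconds=31)),
    ("ada", "ft-cNrLBjqB9V6g0WLayqpRxcGK", timedelta(minutes=8, seconds=52)),
]
# Costs reported in the "Fine-tune costs" log lines.
cost_usd = {"ada": 0.74, "babbage": 1.11}

ada_times = [elapsed for model, _, elapsed in runs if model == "ada"]
ada_mean = sum(ada_times, timedelta()) / len(ada_times)
babbage_time = next(elapsed for model, _, elapsed in runs if model == "babbage")

cost_ratio = cost_usd["babbage"] / cost_usd["ada"]
time_ratio = babbage_time / ada_mean
print(f"babbage cost {cost_ratio:.2f}x ada and took {time_ratio:.2f}x the mean ada time ({ada_mean})")
```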


In [116]:
wandb.finish()