<h1> Fine-Tune Your LLMs </h1>

**CUSTOMER SUPPORT AUTOMATION PROJECT**

Goal: Automate responses to customer inquiries across several channels (email, chatbots, social media).

<h3> Step 1. Prepare your data for fine-tuning </h3>

Dataset: A synthetic dataset of customer inquiries and responses was generated for a fashion boutique (Charu's Boutique). It covers a wide range of common questions, complaints, and feedback, each paired with the company's standard response.
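For orientation, here is a minimal sketch of what one training example looks like in the chat format. The record is the first example of this dataset, as printed further down in the notebook. The variable names `example`, `jsonl_line`, and `roles` are illustrative only.

```python
import json

# One training example in the chat format: a "messages" list whose entries
# each carry a "role" and a "content" string.
example = {
    "messages": [
        {"role": "system", "content": "Automating responses to customer inquiries for Charu's Boutique."},
        {"role": "user", "content": "Hello, I haven't received my order #123456 and it's been over a week since I placed it. Can you provide an update?"},
        {"role": "assistant", "content": "Dear Customer, we apologize for the delay. Your order #123456 is currently being processed and should be shipped within the next 2 days. Thank you for your patience."},
    ]
}

# In a JSONL file, each example occupies exactly one line.
jsonl_line = json.dumps(example)
roles = [m["role"] for m in example["messages"]]
print(roles)
```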

*Install the libraries*

In [1]:
!pip install openai openai[datalib] urllib3==1.26.6 python-dotenv tiktoken

Collecting openai
  Downloading openai-1.33.0-py3-none-any.whl (325 kB)
Collecting urllib3==1.26.6
  Downloading urllib3-1.26.6-py2.py3-none-any.whl (138 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Collecting tiktoken
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Collecting 

*Authenticate with the OpenAI API using an API key*

In [2]:
import os
from google.colab import userdata

os.environ["OPENAI_API_KEY"]=userdata.get('OPENAI_API_KEY')

In [3]:
from openai import OpenAI
client = OpenAI()
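Outside Google Colab, `userdata` is not available. The sketch below shows one possible way to read the key from the environment instead, optionally loading a local `.env` file through python-dotenv if that package is installed. The helper name `get_openai_api_key` is hypothetical and not part of any library.

```python
import os

def get_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    try:
        # Optional: python-dotenv loads variables from a local .env file.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # python-dotenv not installed; rely on the existing environment
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```

Because `OpenAI()` reads `OPENAI_API_KEY` from the environment by default, calling this helper first mainly serves to fail early with a readable message.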

*Some helper functions:*

In [4]:
import json
import tiktoken # for token counting
import numpy as np
from collections import defaultdict

encoding = tiktoken.get_encoding("cl100k_base")

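# Example: input_file="formatted_custom_support.json", output_file="output.jsonl"
def json_to_jsonl(input_file, output_file):
    """Convert a JSON file holding a list of examples into JSONL (one example per line)."""
    # Load the JSON array; the context manager closes the file afterwards
    with open(input_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    # Write each entry as one JSON object per line
    with open(output_file, 'w', encoding='utf-8') as outfile:
        for entry in data:
            json.dump(entry, outfile)
            outfile.write('\n')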

def check_file_format(dataset):
    # Format error checks
    format_errors = defaultdict(int)

    for ex in dataset:
        if not isinstance(ex, dict):
            format_errors["data_type"] += 1
            continue

        messages = ex.get("messages", None)
        if not messages:
            format_errors["missing_messages_list"] += 1
            continue

        for message in messages:
            if "role" not in message or "content" not in message:
                format_errors["message_missing_key"] += 1

            if any(k not in ("role", "content", "name", "function_call") for k in message):
                format_errors["message_unrecognized_key"] += 1

            if message.get("role", None) not in ("system", "user", "assistant", "function"):
                format_errors["unrecognized_role"] += 1

            content = message.get("content", None)
            function_call = message.get("function_call", None)

            if (not content and not function_call) or not isinstance(content, str):
                format_errors["missing_content"] += 1

        if not any(message.get("role", None) == "assistant" for message in messages):
            format_errors["example_missing_assistant_message"] += 1

    if format_errors:
        print("Found errors:")
        for k, v in format_errors.items():
            print(f"{k}: {v}")
    else:
        print("No errors found")


# not exact!
# simplified from https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
def num_tokens_from_messages(messages, tokens_per_message=3, tokens_per_name=1):
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3
    return num_tokens

*Converting our JSON file to JSONL*

In [5]:
json_to_jsonl('syndata.json', 'output.jsonl')
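To make the conversion concrete, here is a small round-trip sketch. It assumes the source JSON file holds a top-level list of examples. The two toy examples and the file name `toy.jsonl` are made up for illustration.

```python
import json
import os
import tempfile

# Two toy examples standing in for the contents of the source JSON file.
examples = [
    {"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]},
    {"messages": [{"role": "user", "content": "Bye"}, {"role": "assistant", "content": "Goodbye!"}]},
]

with tempfile.TemporaryDirectory() as tmp:
    jsonl_path = os.path.join(tmp, "toy.jsonl")

    # Write one JSON object per line.
    with open(jsonl_path, "w", encoding="utf-8") as out:
        for entry in examples:
            out.write(json.dumps(entry) + "\n")

    # Read the file back line by line, skipping blank lines.
    with open(jsonl_path, "r", encoding="utf-8") as f:
        loaded = [json.loads(line) for line in f if line.strip()]

print(len(loaded), loaded == examples)
```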

In [6]:
data_path = "output.jsonl"

# Load the dataset
with open(data_path, 'r', encoding='utf-8') as f:
    dataset = [json.loads(line) for line in f]

# Initial dataset stats
print("Num examples:", len(dataset))
print("First example:")
for message in dataset[0]["messages"]:
    print(message)

Num examples: 92
First example:
{'role': 'system', 'content': "Automating responses to customer inquiries for Charu's Boutique."}
{'role': 'user', 'content': "Hello, I haven't received my order #123456 and it's been over a week since I placed it. Can you provide an update?"}
{'role': 'assistant', 'content': 'Dear Customer, we apologize for the delay. Your order #123456 is currently being processed and should be shipped within the next 2 days. Thank you for your patience.'}


In [7]:
# Format validation
check_file_format(dataset)

No errors found
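To illustrate what the validation would flag, the sketch below applies a simplified subset of the same checks to one well-formed and one malformed example. `find_problems` is a hypothetical helper written for this illustration. It does not replace `check_file_format`.

```python
def find_problems(example):
    """Return a list of problem labels for one training example (simplified)."""
    problems = []
    messages = example.get("messages") if isinstance(example, dict) else None
    if not messages:
        return ["missing_messages_list"]
    for message in messages:
        if "role" not in message or "content" not in message:
            problems.append("message_missing_key")
        if message.get("role") not in ("system", "user", "assistant", "function"):
            problems.append("unrecognized_role")
    # Every example needs at least one assistant turn to learn from.
    if not any(m.get("role") == "assistant" for m in messages):
        problems.append("example_missing_assistant_message")
    return problems

good = {"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}
bad = {"messages": [{"role": "customer", "content": "Where is my order?"}]}

print(find_problems(good))
print(find_problems(bad))
```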


<h3> Estimate the cost of the fine-tuning job </h3>

In [8]:
# Get the length of the conversation
conversation_length = []

for msg in dataset:
    messages = msg["messages"]
    conversation_length.append(num_tokens_from_messages(messages))

# Pricing and default n_epochs estimate
MAX_TOKENS_PER_EXAMPLE = 4096
TARGET_EPOCHS = 5
MIN_TARGET_EXAMPLES = 100
MAX_TARGET_EXAMPLES = 25000
MIN_DEFAULT_EPOCHS = 1
MAX_DEFAULT_EPOCHS = 25

n_epochs = TARGET_EPOCHS
n_train_examples = len(dataset)

if n_train_examples * TARGET_EPOCHS < MIN_TARGET_EXAMPLES:
    n_epochs = min(MAX_DEFAULT_EPOCHS, MIN_TARGET_EXAMPLES // n_train_examples)
elif n_train_examples * TARGET_EPOCHS > MAX_TARGET_EXAMPLES:
    n_epochs = max(MIN_DEFAULT_EPOCHS, MAX_TARGET_EXAMPLES // n_train_examples)

n_billing_tokens_in_dataset = sum(min(MAX_TOKENS_PER_EXAMPLE, length) for length in conversation_length)
print(f"Dataset has ~{n_billing_tokens_in_dataset} tokens that will be charged for during training")
print(f"By default, you'll train for {n_epochs} epochs on this dataset")
print(f"By default, you'll be charged for ~{n_epochs * n_billing_tokens_in_dataset} tokens")

num_tokens = n_epochs * n_billing_tokens_in_dataset

Dataset has ~6107 tokens that will be charged for during training
By default, you'll train for 5 epochs on this dataset
By default, you'll be charged for ~30535 tokens
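The epoch heuristic in the cell above can be restated as a small function to see how it behaves for other dataset sizes. This is a sketch of the notebook's own rule with the same constants. Whether the service applies exactly these defaults is an assumption, and the name `default_epochs` is hypothetical.

```python
def default_epochs(n_examples, target_epochs=5, min_target=100, max_target=25000,
                   min_epochs=1, max_epochs=25):
    """Estimate the number of epochs from the dataset size (notebook heuristic)."""
    if n_examples * target_epochs < min_target:
        # Very small datasets: raise the epoch count, capped at max_epochs.
        return min(max_epochs, min_target // n_examples)
    if n_examples * target_epochs > max_target:
        # Very large datasets: lower the epoch count, floored at min_epochs.
        return max(min_epochs, max_target // n_examples)
    return target_epochs

for n in (10, 92, 10000):
    print(n, default_epochs(n))
```

With 92 examples, as in this dataset, neither branch applies and the estimate stays at the target of 5 epochs.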


In [9]:
# gpt-3.5-turbo training price used for this estimate: $0.0080 / 1K tokens
cost = (num_tokens/1000) * 0.0080
print(cost)

0.24428
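The arithmetic behind this estimate can be written out as a short sketch. The rate of $0.0080 per 1K tokens is the figure used in this notebook and should be checked against current pricing. The function name `estimate_training_cost` is hypothetical. The second call uses the `trained_tokens=29615` value that the finished job reports later in this notebook.

```python
def estimate_training_cost(tokens_per_epoch, n_epochs, usd_per_1k_tokens):
    """Return (billed_tokens, cost_in_usd) for a fine-tuning run."""
    billed_tokens = tokens_per_epoch * n_epochs
    return billed_tokens, billed_tokens / 1000 * usd_per_1k_tokens

# Estimate from the token count above: 6107 tokens per epoch, 5 epochs.
billed, cost = estimate_training_cost(6107, 5, 0.0080)
print(billed, round(cost, 5))

# Same rate applied to the token count reported by the finished job.
_, actual_cost = estimate_training_cost(29615, 1, 0.0080)
print(round(actual_cost, 5))
```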


*Fine-tuning runs as a fine-tuning job, which takes a training file uploaded through the Files API:*

In [10]:
client.files.create(
  file=open("output.jsonl", "rb"),
  purpose="fine-tune"
)

FileObject(id='file-j1OnzRb1B0v53uBxrWKlbSHY', bytes=34640, created_at=1718011137, filename='output.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None)

*Fine-tuned model creation:*

In [11]:
client.fine_tuning.jobs.create(
  training_file="file-j1OnzRb1B0v53uBxrWKlbSHY",
  model="gpt-3.5-turbo",
  hyperparameters={
    "n_epochs":5
  }
)

FineTuningJob(id='ftjob-ZY3gd64vMLZWbAnoTyc7vI4a', created_at=1718011379, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs=5, batch_size='auto', learning_rate_multiplier='auto'), model='gpt-3.5-turbo-0125', object='fine_tuning.job', organization_id='org-dSLMN9ArgkuhMcmr7i2iPjb2', result_files=[], seed=862277189, status='validating_files', trained_tokens=None, training_file='file-j1OnzRb1B0v53uBxrWKlbSHY', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix=None)

In [12]:
# Retrieve job status
job_id = "ftjob-ZY3gd64vMLZWbAnoTyc7vI4a"

# Retrieve the state of a fine-tune
# Status field can contain: running or succeeded or failed, etc.
client.fine_tuning.jobs.retrieve(job_id)


FineTuningJob(id='ftjob-ZY3gd64vMLZWbAnoTyc7vI4a', created_at=1718011379, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs=5, batch_size=1, learning_rate_multiplier=2), model='gpt-3.5-turbo-0125', object='fine_tuning.job', organization_id='org-dSLMN9ArgkuhMcmr7i2iPjb2', result_files=[], seed=862277189, status='running', trained_tokens=None, training_file='file-j1OnzRb1B0v53uBxrWKlbSHY', validation_file=None, estimated_finish=1718012281, integrations=[], user_provided_suffix=None)

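NOTE: *Wait a few minutes before checking the job status again. In this run the job finished about 14 minutes after it was created (created_at=1718011379, finished_at=1718012201).*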
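Instead of re-running the status cell by hand, the wait can be automated with a small polling loop. The sketch below is one possible approach, not part of the OpenAI library. `wait_for_job` and its parameters are hypothetical, and the loop is demonstrated offline with canned statuses so it runs without network access.

```python
import time

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_job(fetch_status, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or polls run out."""
    status = None
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job still '{status}' after {max_polls} polls")

# Offline demonstration with canned statuses. With the real client, pass
# lambda: client.fine_tuning.jobs.retrieve(job_id).status
canned = iter(["validating_files", "running", "running", "succeeded"])
final_status = wait_for_job(lambda: next(canned), poll_seconds=0)
print(final_status)
```

Passing the status lookup as a callable keeps the loop independent of the client, which also makes it easy to test.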

In [14]:
# Retrieve job status
job_id = "ftjob-ZY3gd64vMLZWbAnoTyc7vI4a"

# Retrieve the state of a fine-tune
# Status field can contain: running or succeeded or failed, etc.
client.fine_tuning.jobs.retrieve(job_id)


FineTuningJob(id='ftjob-ZY3gd64vMLZWbAnoTyc7vI4a', created_at=1718011379, error=Error(code=None, message=None, param=None), fine_tuned_model='ft:gpt-3.5-turbo-0125:personal::9YVkAs6v', finished_at=1718012201, hyperparameters=Hyperparameters(n_epochs=5, batch_size=1, learning_rate_multiplier=2), model='gpt-3.5-turbo-0125', object='fine_tuning.job', organization_id='org-dSLMN9ArgkuhMcmr7i2iPjb2', result_files=['file-KfgI53nU232ZX7h7rMEeCnmm'], seed=862277189, status='succeeded', trained_tokens=29615, training_file='file-j1OnzRb1B0v53uBxrWKlbSHY', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix=None)

<h2> Evaluate results: </h2>


In [21]:
import io
import pandas as pd
import base64

# Once training has finished, the metrics file ID is listed in the job's result_files
result_file = "file-KfgI53nU232ZX7h7rMEeCnmm"

file_data = client.files.content(result_file)

file_data_bytes = file_data.read()

# Decode the payload, which is base64-encoded
decoded_data = base64.b64decode(file_data_bytes).decode('utf-8')
# Create a file-like object from the decoded data
file_like_object = io.StringIO(decoded_data)

# Read the CSV into a DataFrame
df = pd.read_csv(file_like_object)
df

Unnamed: 0,step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy
0,1,1.35519,0.72222,,
1,2,1.24711,0.75000,,
2,3,1.31051,0.52941,,
3,4,1.30321,0.80000,,
4,5,1.21014,0.60000,,
...,...,...,...,...,...
455,456,0.01293,1.00000,,
456,457,0.09370,0.96154,,
457,458,0.00361,1.00000,,
458,459,0.00287,1.00000,,


<h3> Use the fine-tuned model </h3>

Below, the same question is sent to the base gpt-3.5-turbo model and to our fine-tuned model to compare their answers.

In [24]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "This is a customer support chatbot designed to help with common inquiries."},
    {"role": "user", "content": "Does Charu's Boutique offer international shipping?"}
  ]
)
print(response.choices[0].message.content)

It is not specified whether Charu's Boutique offers international shipping on their website. It is recommended to contact them directly or check their shipping policies for more information.


In [25]:
fine_tuned_model = "ft:gpt-3.5-turbo-0125:personal::9YVkAs6v"

response = client.chat.completions.create(
  model=fine_tuned_model,
  messages=[
    {"role": "system", "content": "This is a customer support chatbot designed to help with common inquiries for Charu's Boutique."},
    {"role": "user", "content": "Does Charu's Boutique offer international shipping?"}
  ]
)
print(response.choices[0].message.content)

Yes, Charu's Boutique offers international shipping to select countries. You can view the list of countries and shipping options during the checkout process.
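For a like-for-like comparison, it helps to send exactly the same message list to both models, with one dictionary per message. The sketch below shows a hypothetical helper, `build_messages`, that builds such a list. The client calls are left as comments because they require network access and an API key.

```python
def build_messages(system_prompt, user_prompt):
    """Return a two-message chat list: one dict per message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "This is a customer support chatbot designed to help with common inquiries for Charu's Boutique.",
    "Does Charu's Boutique offer international shipping?",
)
print(len(messages), [m["role"] for m in messages])

# With the real client, the same list can go to both models, e.g.
# for model in ("gpt-3.5-turbo", fine_tuned_model):
#     client.chat.completions.create(model=model, messages=messages)
```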


<h2> CONCLUSION AND RESULTS: </h2>

Here we can see that the base **gpt-3.5-turbo model does not know details** about Charu's Boutique because it has not been fine-tuned on the boutique's data. In contrast, **our fine-tuned model**, "ft:gpt-3.5-turbo-0125:personal::9YVkAs6v", gives a **better result**: a direct, specific answer to the shipping question. Further fine-tuning may improve the results.