# How to fine-tune your GPT3.5-Turbo model?



Before we begin, we need to import all the packages we need and set up an OpenAI API Key. You can create an OpenAI API Key by following the [instruction here](https://www.maisieai.com/help/how-to-get-an-openai-api-key-for-chatgpt)

In [1]:
import openai
import json
import random
from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential,
)

openai.api_key = ''

## 1. Generate a dataset

Firstly, we need a dataset. You can use a real dataset or generate one, depending on your situation.

In our case, we will use both GPT4-generated data and real-life data.

📝 Note: It's a good idea to use a larger training samples.

In [5]:
dataset = []

# Data from GPT4
messages = [
    {'role': 'system', 'content': 'You are an employee in the 200FTE company'},
    {'role': 'user', 'content': 'Give me 25 common questions employees ask HR team. Answer in JSON format, eg: [{"question": "..."}, {"question", "..."}]'}
]
response = openai.ChatCompletion.create(
    model='gpt-4',
    messages=messages,
    temperature=0,
)
data_from_gpt4 = json.loads(response.choices[0]['message']['content'])

for question in data_from_gpt4:
    dataset.append({'question': question.popitem()[-1]})

# Data from Human
data_from_human = [
    {"question": "Can you get my payslip last month?"},
    {"question": "I need more information from HR team on our maternity policy. Could you let them know?"},
    {"question": "I'm not feeling well today, can you book a dayoff for me?"},
    {"question": "How can I file my expenses from client meeting?"},
    {"question": "Can you ask HR to get me a new laptop?"},
]

# Save dataset
dataset = dataset + data_from_human
with open("data/raw.json", "w") as outfile:
    json.dump(dataset, outfile)

print('Dataset: ', dataset)

Dataset:  [{'question': 'What is the process for requesting time off or vacation days?'}, {'question': 'How does the health insurance coverage work?'}, {'question': 'What retirement savings plans are available?'}, {'question': 'How do I report a problem with a coworker or supervisor?'}, {'question': "What is the company's policy on work from home or remote work?"}, {'question': 'How can I apply for a promotion or transfer to a different department?'}, {'question': 'What is the procedure for filing a complaint or grievance?'}, {'question': "What is the company's policy on maternity or paternity leave?"}, {'question': 'How does the performance review process work?'}, {'question': "What are the company's policies on diversity and inclusion?"}, {'question': 'What training or professional development opportunities are available?'}, {'question': "What is the company's policy on overtime work?"}, {'question': 'How do I update my personal information or emergency contacts?'}, {'question': 'Wha

## 2. Label the dataset

After we get generate our first dataset, it's just a list of questions. We can't use it since there is no expected result yet.
We need to complete it.

We'll use GPT4 to generate the missing pieces, which is the interpretation of each question.

📝 Note: You might notice that we shuffle the prompt on each interpretation. We found out that randomness can help the model weigh interpret better so it might be a good idea to add some randomness to your dataset as well.

💡 Tips: We found out later that real data (with human labeling) is the best way to get the best responses. Allowing humans to be in the process helps us make sure that no defective data will be trained in the model.

*Garbage in = Garbage out*

In [13]:
# Prevent OpenAI Rate limits
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(20))
def completion_with_backoff(**kwargs):
    return openai.ChatCompletion.create(**kwargs)

def get_actions():
    actions = [
        '    get_policy: Find policy - ',
        '    book_holiday: Book holiday -',
        '    get_payslip: Get payslip - ',
        '    escalate: Escalate to HR - ',
    ]
    random.shuffle(actions)
    return '\n'.join(actions)
    
def get_prompt():
    prompt = """
    You are question analyst.
    
    Given a raw text input to a language model select the next action best suited for the input. 
    You will be given the names of the available action and a description of what the action is best suited for.

    Answer in JSON format. Eg. {"action": "get_policy"}
    
    Action:
"""
    return prompt + get_actions()

dataset = []

with open('data/raw.json') as f:
    questions = json.load(f)
    random.shuffle(questions)

    for question in questions:
        q = question.popitem()[-1]
        print(f'Asking: {q}')
        messages = [
            {'role': 'system', 'content': get_prompt()},
            {'role': 'user', 'content': q}
        ]
        response = completion_with_backoff(
            model='gpt-4',
            messages=messages,
            temperature=0, 
        )
        response_json = json.loads(response.choices[0]['message']['content'])
        dataset.append({
            'question': q,
            'action': response_json['action'],
        })
        print(f"Response: {response_json}")

json_object = json.dumps(dataset, indent=4)
with open("data/processed.json", "w") as outfile:
    outfile.write(json_object)

print('Dataset: ', dataset)

Asking: What is the company's policy on employee compensation?
Response: {'action': 'get_policy'}
Asking: What is the procedure for filing a complaint or grievance?
Response: {'action': 'escalate'}
Asking: Can you get my payslip last month?
Response: {'action': 'get_payslip'}
Asking: What is the company's policy on maternity or paternity leave?
Response: {'action': 'get_policy'}
Asking: What is the company's policy on conflict resolution?
Response: {'action': 'get_policy'}
Asking: How can I apply for a promotion or transfer to a different department?
Response: {'action': 'get_policy'}
Asking: Can you ask HR to get me a new laptop?
Response: {'action': 'escalate'}
Asking: What is the process for submitting expenses for reimbursement?
Response: {'action': 'get_policy'}
Asking: What is the process for requesting time off or vacation days?
Response: {'action': 'get_policy'}
Asking: What is the company's policy on work from home or remote work?
Response: {'action': 'get_policy'}
Asking: How

## 3. Format the dataset

OpenAI Model Fine-tuning can only read datasets in a certain format. We need to convert it to OpenAI Message Format first, which contains 3 important keys
- `"role": "system"` - the prompt 
- `"role": "user"` - the question
- `"role": "assistant"` - the expected output

After formatting, We need to split the dataset into Training and Validation sets in `jsonl` files, which are loads of JSON objects that are separated by a newline.

In [17]:
with open('data/processed.json') as f:
    questions = json.load(f)

    dataset = []
    for question in questions:
        dataset.append({
            'messages': [
                {'role': 'system', 'content': get_prompt()},
                {'role': 'user', 'content': question['question']},
                {'role': 'assistant', 'content': "{\"action\": " + str(question['action']) + "}"},
            ]
        })
    
    def save_to_jsonl(conversations, file_path):
        with open(file_path, 'w') as file:
            for conversation in conversations:
                json_line = json.dumps(conversation)
                file.write(json_line + '\n')

            
    # train 80%, validate 20%
    train_ratio = 0.8 
    num_train = int(len(dataset) * train_ratio)
    save_to_jsonl(dataset[:num_train], 'data/train.jsonl')
    save_to_jsonl(dataset[num_train:], 'data/validate.jsonl')

## 4. Fine-tune a model

This step is super simple. We just need to upload our files and tell OpenAI to start fine-tuning. You can do it by the code below.

In [None]:
# Upload dataset
training_file_name = './data/train.jsonl'
validation_file_name = './data/validate.jsonl'

training_response = openai.File.create(
    file=open(training_file_name, "rb"), purpose="fine-tune"
)
training_file_id = training_response["id"]

validation_response = openai.File.create(
    file=open(validation_file_name, "rb"), purpose="fine-tune"
)
validation_file_id = validation_response["id"]

print("Training file id:", training_file_id)
print("Validation file id:", validation_file_id)

In [None]:
suffix_name = "routing-demo"

response = openai.FineTuningJob.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="gpt-3.5-turbo",
    suffix=suffix_name,
)
print("Response: ", response)

Well done! Now, we just need OpenAI to train a model. You can check the status on the [OpenAI Dashboard](https://platform.openai.com/finetune).

How long does it take is totally depends on the size of your dataset.

## 5. Test the model

After OpenAI finishes training your model, You will get an email from OpenAI that will contain the finetuned model's information or you can just go to your dashboard.
Copy and paste the model name into the code.

🎉 Tada! Your fine-tuned model is ready to use now! We recommend validating the performance of the model next. 

In [21]:
model = 'ft:gpt-3.5-turbo-0613:personal:routing-demo:xxxxx'
question = 'Can you find me my payslip on June 2023'
messages = [
    {'role': 'system', 'content': get_prompt()},
    {'role': 'user', 'content': question}
]
response = completion_with_backoff(
    model=model,
    messages=messages,
    temperature=0, 
    max_tokens=500,
)
print("Response: ", response.choices[0]['message']['content'])

Response:  {"action": get_payslip}
