### **Decision Support System**

#### **Fine-Tuning GPT-3.5 with GPT-4 Generated Dataset**

#### Dataset
The dataset consists of complex decision scenarios, each paired with a detailed step-by-step analysis and recommendation, all generated by GPT-4.

#### Fine-Tuning Process
1. **Data Preparation**: Use the dataset generated by GPT-4 as training data.
2. **Fine-Tuning**: Fine-tune the GPT-3.5 model on this dataset.

#### Objective
The objective is to enhance GPT-3.5's reasoning abilities to break down complex scenarios, consider multiple factors, and provide well-structured recommendations/action plans to support decision-making processes.

#### Result
A fine-tuned GPT-3.5 model that emulates GPT-4's reasoning process. This model can be integrated into decision support systems to assist human decision-makers by analyzing new scenarios and providing informed recommendations.



In [2]:
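# This notebook uses the legacy `openai.ChatCompletion` interface, which was removed in openai 1.0.
# !pip install "openai<1.0" tenacity python-dotenv pandas openpyxl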

In [3]:
prompt = '''
        The model could assist in decision-making by analyzing complex scenarios, 
        considering multiple factors, and providing well-reasoned recommendations or action plans.
        '''
temperature = 0.4       # Temperature is chosen to balance factual accuracy with creativity.
number_of_examples = 25

#### **Generate Dataset**

In [4]:
import os
import openai
import random
from tenacity import retry, stop_after_attempt, wait_exponential

In [55]:
# Get OPENAI API key from .env file
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

In [None]:
N_RETRIES = 3

@retry(stop=stop_after_attempt(N_RETRIES), wait=wait_exponential(multiplier=1, min=4, max=70))
def generate_example(prompt, prev_examples, temperature=.5):
    messages=[
        {
            "role": "system",
            "content" : (
                "You are generating data which will be used to train a machine learning model.\n\n"
                "You will be given a high-level description of the model we want to train, and from that, "
                "you will generate data samples, each with a prompt/response pair.\n\n"
                "You will do so in this format:\n\n"
                "```\n"
                "prompt\n"
                "-----------\n"
                "$prompt_goes_here\n"
                "-----------\n\n"
                "response\n"
                "-----------\n"
                "$response_goes_here\n"
                "-----------\n"
                "```\n\n"
                "Only one prompt/response pair should be generated per turn.\n\n"
                "For each turn, make the example slightly more complex than the last, "
                "while ensuring diversity.\n\n"
                "Make sure your samples are unique and diverse, yet high-quality and complex enough "
                "to train a well-performing model.\n\n"
                f"Here is the type of model we want to train:\n\n{prompt}")
        
        }
    ]

    if len(prev_examples) > 0:
        if len(prev_examples) > 8:
            prev_examples = random.sample(prev_examples, 8)
        for example in prev_examples:
            messages.append({
                "role": "assistant",
                "content": example
            })

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        temperature=temperature,
        max_tokens=1500,
    )

    return response.choices[0].message['content']

# Generate examples
prev_examples = []
for i in range(number_of_examples):
    print(f'Generating example {i}')
    example = generate_example(prompt, prev_examples, temperature)
    prev_examples.append(example)

print(prev_examples)
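
The generation loop feeds earlier outputs back to GPT-4 as assistant turns and caps them at eight randomly sampled examples, which keeps each request small while still steering the model away from repeats. The following is a minimal offline sketch of that message assembly only. The `build_messages` helper, the `MAX_CONTEXT_EXAMPLES` constant, and the seeded `random.Random` are illustrative assumptions and are not part of the notebook; no API call is made.

```python
import random

MAX_CONTEXT_EXAMPLES = 8  # same cap as in generate_example above


def build_messages(system_prompt, prev_examples, rng=None):
    """Return the chat messages for one generation request (no API call)."""
    rng = rng or random.Random()
    messages = [{"role": "system", "content": system_prompt}]
    # Sample a copy so the caller's history list is never modified.
    if len(prev_examples) > MAX_CONTEXT_EXAMPLES:
        context = rng.sample(prev_examples, MAX_CONTEXT_EXAMPLES)
    else:
        context = list(prev_examples)
    for example in context:
        messages.append({"role": "assistant", "content": example})
    return messages


history = [f"example {i}" for i in range(12)]
messages = build_messages("Describe the model here.", history, random.Random(0))
print(len(messages))  # one system message plus the sampled examples
```

Passing a seeded `random.Random` is only there to make the sampling reproducible while experimenting; the notebook itself uses the module-level `random.sample`.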

We also need to generate a system message.

In [7]:
def generate_system_message(prompt):

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
          {
            "role": "system",
            "content" : (
              "You will be given a high-level description of the model we are training, "
              "and from that, you will generate a simple system prompt for that model to use. "
              "Remember, you are not generating the system message for data generation -- "
              "you are generating the system message to use for inference. "
              "A good format to follow is `Given $INPUT_DATA, you will $WHAT_THE_MODEL_SHOULD_DO`.\n\n"
              "Make it as concise as possible. Include nothing but the system prompt in your response.\n\n"
              "For example, never write: `\"$SYSTEM_PROMPT_HERE\"`.\n\n"
              "It should be like: `$SYSTEM_PROMPT_HERE`.")
          },
          
          {
              "role": "user",
              "content": prompt.strip(),
          }
        ],
        temperature=temperature,
        max_tokens=500,
    )

    return response.choices[0].message['content']

system_message = generate_system_message(prompt)

print(f'The system message is: {system_message}.')

The system message is: `Given the complex scenario and multiple factors, provide a well-reasoned recommendation or action plan.`. Feel free to re-run this cell if you want a better result.


Now let's put our examples into a dataframe and turn them into a final pair of datasets.

In [8]:
import json
import pandas as pd

# Initialize lists to store prompts and responses
prompts = []
responses = []

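# Parse out prompts and responses from examples
for example in prev_examples:
    split_example = example.split('-----------')
    # Skip malformed examples. Check the length before appending anything
    # so the two lists always stay the same length.
    if len(split_example) < 4:
        continue
    prompts.append(split_example[1].strip())
    responses.append(split_example[3].strip())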

# Create a DataFrame
df = pd.DataFrame({
    'prompt': prompts,
    'response': responses
})

# Remove exact duplicates before splitting so a test sample cannot also appear in the training set
df = df.drop_duplicates()

### create train and test datasets
# last five samples for test
test_df = df.iloc[-5:]
df = df.iloc[:-5]       # Remove the last five records from df

df.to_excel('training_samples.xlsx')
test_df.to_excel('testing_samples.xlsx')

print('There are ' + str(len(df)) + ' successfully-generated training examples.')

# Initialize list to store training examples
training_examples = []

# Create training examples in the format required for GPT-3.5 fine-tuning
for index, row in df.iterrows():
    training_example = {
        "messages": [
            {"role": "system", "content": system_message.strip()},
            {"role": "user", "content": row['prompt']},
            {"role": "assistant", "content": row['response']}
        ]
    }
    training_examples.append(training_example)

There are 20 successfully-generated examples.
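
The parsing step relies on GPT-4 following the delimiter template from the data-generation prompt, so it helps to check a single example in isolation. Below is a small sketch of that idea, assuming a hypothetical `parse_example` helper (not defined in the notebook) that returns `None` for any output that does not match the template.

```python
DELIMITER = "-----------"  # the separator requested in the data-generation prompt


def parse_example(raw):
    """Return (prompt, response), or None if the text does not match the template."""
    parts = raw.split(DELIMITER)
    # Expected layout: [header, prompt, "response" label, response, trailer]
    if len(parts) < 4:
        return None
    prompt, response = parts[1].strip(), parts[3].strip()
    if not prompt or not response:
        return None
    return prompt, response


sample = (
    f"prompt\n{DELIMITER}\nShould we expand?\n{DELIMITER}\n\n"
    f"response\n{DELIMITER}\nWeigh costs first.\n{DELIMITER}\n"
)
print(parse_example(sample))
print(parse_example("malformed output"))
```

Counting how many raw examples come back as `None` gives a quick sense of how often the model strayed from the requested format.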


In [9]:
def print_record(record_data):
    """
    Print messages from a record.
    """
    messages = record_data['messages']
    print(f"{messages[0]['role'].upper()} : {messages[0]['content']}\n")
    print(f"{messages[1]['role'].upper()} : {messages[1]['content']}\n")
    print(f"{messages[2]['role'].upper()} : {messages[2]['content']}\n")


def create_jsonl_file(data, save_jsonl_name):
    '''
    Save examples to a .jsonl file
    '''
    with open(save_jsonl_name, 'w') as f:
        for i, example in enumerate(data):
            print(f"=== SAMPLE {i+1} ===")
            print_record(example)
            f.write(json.dumps(example) + '\n')

In [10]:
# Save training examples to a .jsonl file
create_jsonl_file(training_examples, 'training_examples.jsonl')

=== SAMPLE 1 ===
SYSTEM : Given the complex scenario and multiple factors, provide a well-reasoned recommendation or action plan.

USER : You are the CEO of a tech startup. Your company has been losing money for the past three quarters. Your investors are getting worried and your employees are getting anxious. You have two options: Cut costs by laying off half of your staff or seek additional funding. What do you do?

ASSISTANT : Given the situation, it's crucial to consider both short-term and long-term impacts. Laying off half of the staff could provide immediate financial relief, but it could also harm the company's morale and capacity to innovate, which are vital for a tech startup. On the other hand, seeking additional funding could alleviate immediate financial pressure without sacrificing human resources. However, it could dilute current shareholders' equity and may not be possible if investors lack confidence in the company's future.

Before making a decision, I would first ana
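
Before uploading, it can be worth confirming that every line of the `.jsonl` file is valid JSON with the system/user/assistant structure built above. The sketch below is one way to do that with the standard library. The `validate_jsonl_lines` helper is hypothetical, and the exact role order it enforces reflects the records in this notebook rather than a general requirement of the fine-tuning API.

```python
import json

EXPECTED_ROLES = ["system", "user", "assistant"]  # the order used in this notebook


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) for records that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list):
            problems.append((number, "missing 'messages' list"))
            continue
        roles = [m.get("role") for m in messages if isinstance(m, dict)]
        if roles != EXPECTED_ROLES:
            problems.append((number, f"unexpected roles: {roles}"))
        elif any(not str(m.get("content") or "").strip() for m in messages):
            problems.append((number, "empty content"))
    return problems


good = json.dumps({"messages": [
    {"role": "system", "content": "Give a recommendation."},
    {"role": "user", "content": "Should we expand?"},
    {"role": "assistant", "content": "Weigh costs first."},
]})
bad = json.dumps({"messages": [{"role": "user", "content": "hi"}]})
problems = validate_jsonl_lines([good, bad, "not json"])
print(problems)
```

To check the real file, the same function could be called as `validate_jsonl_lines(open('training_examples.jsonl'))`, since iterating over a file object yields its lines.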

#### Upload the file to OpenAI

In [11]:
# Open the file in a context manager so the handle is closed after the upload
with open('training_examples.jsonl', 'rb') as f:
    file_id = openai.File.create(file=f, purpose='fine-tune').id

#### **Train the model**
This takes a few minutes to process on OpenAI's servers.

In [12]:
job = openai.FineTuningJob.create(training_file=file_id, model="gpt-3.5-turbo")
job_id = job.id

Run the next cell every 20 minutes or so to check on the job.

Eventually, you'll see a message "New fine-tuned model created: ft:gpt-3.5-turbo-0613:xxxxxxxxxxxx"

Once you see that message, you can go to the OpenAI Playground (or keep going to the next cells and use the API) to try the model!

In [13]:
openai.FineTuningJob.list_events(id=job_id, limit=10)

<OpenAIObject list at 0x1fc6bada4a0> JSON: {
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-6QHd9DmUwsefJWuPwvKCkqJr",
      "created_at": 1714213577,
      "level": "info",
      "message": "Validating training file: file-221jocdtrLl6TaHD5NBRfQFY",
      "data": {},
      "type": "message"
    },
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-yhdx0dolKWXRfBkEnYpInmqP",
      "created_at": 1714213577,
      "level": "info",
      "message": "Created fine-tuning job: ftjob-anL97Fu09M2gsjdZQZoXDLbj",
      "data": {},
      "type": "message"
    }
  ],
  "has_more": false
}
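
Instead of re-running the events cell by hand, the job status can be polled in a loop until it reaches a terminal state. The following is a minimal sketch under stated assumptions: `wait_for_job` is a hypothetical helper, the polling interval and limit are arbitrary, and the retrieval function is passed in as a parameter so the example runs offline with a stand-in. In the notebook, the stand-in would be replaced by `openai.FineTuningJob.retrieve`.

```python
import time

# Terminal states of a fine-tuning job
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve, job_id, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll `retrieve(job_id)` until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} still running after {max_polls} polls")


# Offline demonstration with a stand-in for the API call.
_fake_statuses = iter(["validating_files", "running", "succeeded"])


def fake_retrieve(job_id):
    return {"id": job_id, "status": next(_fake_statuses)}


finished = wait_for_job(fake_retrieve, "ftjob-example", sleep=lambda seconds: None)
print(finished["status"])
```

With the real API, the call would look like `wait_for_job(openai.FineTuningJob.retrieve, job_id)`, after which `fine_tuned_model` on the returned job holds the model name if the job succeeded.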

#### Run the next cell to grab the fine-tuned model name.

In [69]:
model_name_pre_object = openai.FineTuningJob.retrieve(job_id)
fine_tuned_model_name = model_name_pre_object.fine_tuned_model
print(fine_tuned_model_name)

ft:gpt-3.5-turbo-0125:personal::9IZe48S8



### Let's try it out!

In [43]:
def get_model_prediction(prompt, model_name, system_message):
    """
    Fetches the model's prediction for a given prompt.
    """
    response = openai.ChatCompletion.create(
        model=model_name,
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": prompt}
        ],
    )
    return response.choices[0].message['content']

In [71]:
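## For test_df: compare the fine-tuned model's output with the held-out GPT-4 response
for index, row in test_df.iterrows():
    test_prompt = row['prompt']             # separate name so the global `prompt` is not overwritten
    reference_response = row['response']    # response generated by GPT-4
    finetuned_model_output = get_model_prediction(test_prompt, fine_tuned_model_name, system_message)

    print(f"=== SAMPLE {index} ===")
    print(f"PROMPT\n{test_prompt}\n")
    print(f"FINE-TUNED MODEL RESPONSE\n{finetuned_model_output}\n")
    print(f"GPT-4 REFERENCE RESPONSE\n{reference_response}\n")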
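
Reading the printed pairs side by side works for five samples, but collecting them into records makes later review easier. The sketch below is one possible approach, not part of the original notebook: `compare_outputs` is a hypothetical helper, the prediction function is passed in so the example runs without the API, and the `difflib` similarity is only a rough character-level overlap, not a measure of reasoning quality.

```python
from difflib import SequenceMatcher


def compare_outputs(rows, predict):
    """Pair each reference response with the model output and a rough similarity score."""
    results = []
    for row in rows:
        output = predict(row["prompt"])
        results.append({
            "prompt": row["prompt"],
            "reference": row["response"],
            "model_output": output,
            # Character-level overlap in [0, 1]; a crude signal, not a quality metric.
            "similarity": SequenceMatcher(None, row["response"], output).ratio(),
        })
    return results


rows = [{"prompt": "Expand to a second store?", "response": "Assess demand and costs first."}]
results = compare_outputs(rows, lambda prompt: "Assess demand and costs first.")
print(results[0]["similarity"])
```

In the notebook this could be called as `compare_outputs(test_df.to_dict("records"), lambda p: get_model_prediction(p, fine_tuned_model_name, system_message))`.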

In [None]:
# For custom prompt
custom_prompt = '''
    A small business owner is considering expanding their business. 
    They currently have one store in a small town and are considering opening a second store in a larger city. 
    They have a loyal customer base in their current location but are unsure about how their products will be received in a larger market. 
    They also worry about the increased costs associated with running a larger store. What should they consider before making a decision?
'''

output = get_model_prediction(custom_prompt, fine_tuned_model_name, system_message)
print(output)


Pamudu Ranasinghe

Reference:
https://github.com/mshumer/gpt-llm-trainer