## Importing libraries

In [2]:
import json  # needed later for json.dump when writing the train/test files
import openai
import random
import jsonlines

### Reading the JSON file

In [8]:
data = []
with jsonlines.open('formatted_datafinal.jsonl','r') as reader:
    for obj in reader:
        data.append(obj)

### Shuffling the entries in the dataset

In [10]:
random.shuffle(data)

### Splitting the dataset into training and testing sets
Allocating 10 entries to the training set and the remaining 5 to the testing set.

In [11]:
ds_train = data[:10]
ds_test = data[10:]
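
Because the shuffle above is unseeded, every run of the notebook produces a different split. The following is a minimal sketch of a reproducible alternative, assuming a fixed seed is acceptable. The names `split_dataset`, `n_train`, and `seed` are hypothetical and do not appear in the original notebook.

```python
import random


def split_dataset(records, n_train, seed=42):
    """Return (train, test) after a seeded shuffle; the input list is left unchanged."""
    if not 0 < n_train < len(records):
        raise ValueError("n_train must leave at least one record in each split")
    shuffled = list(records)               # copy so the caller's list keeps its order
    random.Random(seed).shuffle(shuffled)  # private RNG: the same seed gives the same order
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in records; in the notebook this would be the `data` list read from the JSONL file.
example = [{"id": i} for i in range(15)]
train_part, test_part = split_dataset(example, n_train=10)
print(len(train_part), len(test_part))
```

Using a private `random.Random` instance keeps the split independent of any other code that touches the global random state.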

### Saving the different sets into a file in JSONL format

In [12]:
with open('train.jsonl', 'w') as f:
    for line in ds_train:
        json.dump(line,f)
        f.write('\n')

with open('test.jsonl', 'w') as f:
    for line in ds_test:
        json.dump(line,f)
        f.write('\n')
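
Before uploading, it can help to check that every line in the saved files looks like a chat-format training example. The notebook does not show the records themselves, so this sketch assumes each line is a JSON object with a `messages` list of `role`/`content` pairs, which is the format fine-tuning `gpt-3.5-turbo` expects. The helper `find_format_problems` is a hypothetical name.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def find_format_problems(path):
    """Return (line_number, problem) pairs for lines that do not look like chat examples."""
    problems = []
    with open(path, "r", encoding="utf-8") as f:
        for number, raw in enumerate(f, start=1):
            if not raw.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(raw)
            except json.JSONDecodeError:
                problems.append((number, "not valid JSON"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append((number, "missing or empty 'messages' list"))
                continue
            # Report only the first problem found on each line
            for message in messages:
                if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                    problems.append((number, "message with unknown role"))
                    break
                if not isinstance(message.get("content"), str):
                    problems.append((number, "message content is not a string"))
                    break
    return problems
```

Calling `find_format_problems('train.jsonl')` returns an empty list when every line passes these checks. It is only a local sanity check and does not replace the validation the API performs on upload.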

### Uploading the training dataset to the API

In [14]:
import os

# Read the API key from the environment rather than hardcoding a secret in the notebook
openai.api_key = os.environ["OPENAI_API_KEY"]
train = openai.File.create(
    file= open("train.jsonl", "rb"),
    purpose='fine-tune',
)
train_id = train['id']
print(train_id)

file-LKjfD8F4z8waLSEjxftDepa5


### Uploading the testing dataset to the API

In [15]:
test = openai.File.create(
    file= open("test.jsonl", "rb"),
    purpose='fine-tune',
)
test_id = test['id']
print(test_id)

file-UY80K2PUQs224dRls1UYmiKE


### Using the uploaded file to build the model
After the files are uploaded to the API, each one receives a unique ID. These IDs are used to create the fine-tuning job, which trains the model. The job itself also receives a unique ID when it is created.

In [16]:
model_name = "gpt-3.5-turbo"

response = openai.FineTuningJob.create(
    training_file= train_id,
    validation_file= test_id,
    model=model_name
)

job_id = response['id']
print(f"Fine-tuning job created successfully with ID: {job_id}")


Fine-tuning job created successfully with ID: ftjob-N7DY24akNfGICjYAxUwbCETf


### Retrieving the job information
After creating the job, retrieve its information to check its status: whether it is still running, has succeeded, or has failed. The response also contains the *fine_tuned_model* ID, which is needed to test the model.

In [4]:
openai.FineTuningJob.retrieve("ftjob-N7DY24akNfGICjYAxUwbCETf")

<FineTuningJob fine_tuning.job id=ftjob-N7DY24akNfGICjYAxUwbCETf at 0x22faba5c180> JSON: {
  "object": "fine_tuning.job",
  "id": "ftjob-N7DY24akNfGICjYAxUwbCETf",
  "model": "gpt-3.5-turbo-0613",
  "created_at": 1695985305,
  "finished_at": 1695985613,
  "fine_tuned_model": "ft:gpt-3.5-turbo-0613:personal::845ccAlV",
  "organization_id": "org-r4gBB5s7pb1MKeiDVQzg93jQ",
  "result_files": [
    "file-rDt51ipvabta0F7rhc2I32Ua"
  ],
  "status": "succeeded",
  "validation_file": "file-UY80K2PUQs224dRls1UYmiKE",
  "training_file": "file-LKjfD8F4z8waLSEjxftDepa5",
  "hyperparameters": {
    "n_epochs": 10
  },
  "trained_tokens": 12240,
  "error": null
}
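
A job usually takes some minutes to finish, so a single retrieve call may return a non-final status. Below is a minimal polling sketch under stated assumptions: `wait_for_job` is a hypothetical helper, `fetch_job` is any callable that returns the job as a mapping with a `status` key, and the terminal status names are taken to be `succeeded`, `failed`, and `cancelled`.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(fetch_job, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call fetch_job() until the job reaches a terminal status, then return the job."""
    for _ in range(max_polls):
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError("fine-tuning job did not finish within the polling budget")
```

In this notebook, `fetch_job` could be `lambda: openai.FineTuningJob.retrieve(job_id)`, after which `job["fine_tuned_model"]` holds the model ID once the status is `succeeded`.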

# Testing the job
To test the model, we can contrast the response given by our fine-tuned model with that of the regular GPT model when both are asked the same question.

### Finetuned Model

In [17]:
# ID of the fine-tuned model, taken from the "fine_tuned_model" field of the job retrieved above
fine_tuned_chat_model_id = "ft:gpt-3.5-turbo-0613:personal::845ccAlV"

user_input = input("Enter your message: ")

conversation = [{"role": "system", "content": "You are Chatbot for a financial institution. You are to respond to questions in a friendly yet proffesional manner. Responses should be crafted based on data provided. Replies should be as detailed as possible. Reply to the customer directly as the customer's assistant"}, {"role": "user", "content": user_input}]

response = openai.ChatCompletion.create(
    model=fine_tuned_chat_model_id,
    messages=conversation,
    max_tokens=4000,
)

assistant_reply = response['choices'][0]['message']['content']
print("Assistant's Reply:")
print(assistant_reply)

Enter your message: I can't find my atm card what can i do
Assistant's Reply:
We understand how inconvenient it can be to lose your card. Here is what you can do, initiate by blocking your card by dialing *347*34# from the mobile number registered with your bank. You can also reach out to one of our customer care representatives through to 0700CallSterling (0700225578375464). They would guide you on the necessary steps to take to get a new card


### GPT-3.5 Model

In [15]:
user_input = input("Enter your message: ")

conversation = [{"role": "system", "content": "You are Chatbot for a financial institution. You are to respond to questions in a friendly yet proffesional manner. Responses should be crafted based on data provided. Replies should be as detailed as possible. Reply to the customer directly as the customer's assistant"}, {"role": "user", "content": user_input}]

response = openai.ChatCompletion.create(
    model= "gpt-3.5-turbo",  
    messages=conversation,
    max_tokens=4000,  
)

assistant_reply = response['choices'][0]['message']['content']
print("Assistant's Reply:")
print(assistant_reply)

Enter your message: I can't find my atm card what can i do
Assistant's Reply:
I apologize for the inconvenience. If you are unable to find your ATM card, I recommend taking the following steps:

1. Secure your account: Contact our customer support as soon as possible to report your lost card. This will ensure that your card is blocked, preventing any unauthorized use.

2. Review recent transactions: Check your account statement or online banking to review any recent transactions. If you notice any suspicious activity, report it to the customer support representative.

3. Request a replacement card: Our customer support team will guide you through the process of requesting a new ATM card. This may involve filling out an application form or visiting a branch.

4. Identity verification: Be prepared to provide identification documents and answer security questions to verify your identity during the replacement card request.

5. Consider other security measures: While waiting for your new c

### Inference
The response from the base gpt-3.5-turbo model is generic, while the response from our fine-tuned model includes details specific to the organization, such as the card-blocking code and the customer care number. On this prompt the fine-tuned model behaves as intended, although a single question is not enough to judge its overall quality.
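
The two cells above differ only in the model ID, so the comparison can be expressed as one loop. This is a sketch, not the notebook's own method: `ask_models` is a hypothetical helper, and `create` stands for any callable with the same keyword arguments and response shape as the `openai.ChatCompletion.create` call used above.

```python
def ask_models(model_ids, messages, create):
    """Send the same messages to each model and return {model_id: reply_text}."""
    replies = {}
    for model_id in model_ids:
        response = create(model=model_id, messages=messages)
        # Same response layout the notebook indexes into above
        replies[model_id] = response["choices"][0]["message"]["content"]
    return replies
```

Passing `create=openai.ChatCompletion.create` with the fine-tuned and base model IDs would send the identical conversation to both, which keeps the comparison fair.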

# Determining model accuracy

In [11]:
response = openai.FineTuningJob.list_events(id="ftjob-N7DY24akNfGICjYAxUwbCETf", limit=10)
response

<OpenAIObject list at 0x22fb48740e0> JSON: {
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-urd4L4XwHjjIrhFXlYVda0qF",
      "created_at": 1695985619,
      "level": "info",
      "message": "The job has successfully completed",
      "data": {},
      "type": "message"
    },
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-TQXESjDxmPuPuDlAY2H7CiPq",
      "created_at": 1695985617,
      "level": "info",
      "message": "New fine-tuned model created: ft:gpt-3.5-turbo-0613:personal::845ccAlV",
      "data": {},
      "type": "message"
    },
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-36bRUIs4ARrvVoujTpAjKPel",
      "created_at": 1695985593,
      "level": "info",
      "message": "Step 91/100: training loss=1.02, validation loss=1.85",
      "data": {
        "step": 91,
        "train_loss": 1.0218843221664429,
        "valid_loss": 1.852871436481328,
        "train_mean_token_accuracy":

In [14]:
events = response["data"]
events.reverse()

for event in events:
    print(event["message"])

Step 21/100: training loss=1.75, validation loss=1.70
Step 31/100: training loss=1.73, validation loss=1.54
Step 41/100: training loss=1.26, validation loss=1.81
Step 51/100: training loss=1.14, validation loss=1.65
Step 61/100: training loss=1.19, validation loss=2.08
Step 71/100: training loss=1.20, validation loss=1.43
Step 81/100: training loss=1.01, validation loss=1.61
Step 91/100: training loss=1.02, validation loss=1.85
New fine-tuned model created: ft:gpt-3.5-turbo-0613:personal::845ccAlV
The job has successfully completed


### Inference
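Training loss falls from 1.75 at step 21 to 1.02 at step 91, though not monotonically, which suggests the model is fitting the training set. Validation loss, however, fluctuates between 1.43 and 2.08 without a clear downward trend. These events therefore do not show that the model generalizes, and with only 5 validation entries the validation figures are noisy.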
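
To read the trend from numbers rather than by eye, the step messages can be parsed into tuples. This is a minimal sketch: `parse_losses` and `STEP_PATTERN` are hypothetical names, and the regular expression assumes the message format shown in the output above.

```python
import re

STEP_PATTERN = re.compile(
    r"Step (\d+)/(\d+): training loss=([\d.]+), validation loss=([\d.]+)"
)


def parse_losses(messages):
    """Return (step, training_loss, validation_loss) tuples for step messages only."""
    rows = []
    for message in messages:
        match = STEP_PATTERN.match(message)
        if match:  # non-step messages such as "The job has successfully completed" are skipped
            rows.append((int(match.group(1)), float(match.group(3)), float(match.group(4))))
    return rows


# Messages copied from the event log printed above.
messages = [
    "Step 21/100: training loss=1.75, validation loss=1.70",
    "Step 31/100: training loss=1.73, validation loss=1.54",
    "Step 41/100: training loss=1.26, validation loss=1.81",
    "Step 51/100: training loss=1.14, validation loss=1.65",
    "Step 61/100: training loss=1.19, validation loss=2.08",
    "Step 71/100: training loss=1.20, validation loss=1.43",
    "Step 81/100: training loss=1.01, validation loss=1.61",
    "Step 91/100: training loss=1.02, validation loss=1.85",
    "New fine-tuned model created: ft:gpt-3.5-turbo-0613:personal::845ccAlV",
    "The job has successfully completed",
]

rows = parse_losses(messages)
first, last = rows[0], rows[-1]
print(f"training loss:   {first[1]:.2f} -> {last[1]:.2f}")
print(f"validation loss: {first[2]:.2f} -> {last[2]:.2f}")
```

In the notebook, `messages` could instead be built from the retrieved events with `[event["message"] for event in events]`.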