# Fine-Tuning GPT-4.1-mini
Copyright 2025 Denis Rothman

**This notebook upgrades fine-tuning to GPT-4.1.**

[OpenAI fine-tuning documentation](https://beta.openai.com/docs/guides/fine-tuning/)

Check the cost of fine-tuning your dataset on OpenAI before running the notebook.

Run this notebook cell by cell to:

1. Download and prepare the SQuAD dataset  
   The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset.
2. Fine-tune a model
3. Run a fine-tuned model

# Installing the environment


In [None]:
# You can retrieve your API key from a file (1)
# or enter it manually (2).
# Comment out this cell if you want to enter your key manually.
# (1) Retrieve the API key from a file
# Store your key in a file and read it (you can type it directly in the notebook, but it will be visible to anybody next to you)
from google.colab import drive
drive.mount('/content/drive')
with open("drive/MyDrive/files/api_key.txt", "r") as f:
    API_KEY = f.readline().strip()  # strip the trailing newline so the key is sent cleanly

Mounted at /content/drive


In [None]:
try:
  import openai
except ImportError:
  # Install the pinned version only if the package is missing
  !pip install openai==1.42.0
  import openai

In [None]:
# (2) Enter your key manually by
# replacing API_KEY with your key.
# The OpenAI key
import os
os.environ['OPENAI_API_KEY'] =API_KEY
openai.api_key = os.getenv("OPENAI_API_KEY")
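The two options above (reading the key from a file or entering it manually) can be folded into one helper. The following is a minimal sketch, not part of the original notebook: `load_api_key` and its parameters are hypothetical names, and it assumes the key sits on the first line of the file.

```python
import os
from typing import Optional


def load_api_key(key_file: Optional[str] = None, env_var: str = "OPENAI_API_KEY") -> str:
    """Return an API key from a file if one is given, else from the environment.

    Hypothetical helper for illustration only.
    """
    if key_file is not None:
        # Read the first line only and drop surrounding whitespace and the newline.
        with open(key_file, "r", encoding="utf-8") as f:
            key = f.readline().strip()
    else:
        # Fall back to the environment variable (empty string if it is unset).
        key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError("No API key found in the file or the environment.")
    return key


# Example usage in Colab (path taken from the cell above):
# API_KEY = load_api_key("drive/MyDrive/files/api_key.txt")
```

Stripping whitespace matters because `readline()` keeps the trailing newline, which would otherwise end up inside the key.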

In [None]:
!pip install jsonlines==4.0.0

Collecting jsonlines==4.0.0
  Downloading jsonlines-4.0.0-py3-none-any.whl.metadata (1.6 kB)
Downloading jsonlines-4.0.0-py3-none-any.whl (8.7 kB)
Installing collected packages: jsonlines
Successfully installed jsonlines-4.0.0


In [None]:
!pip install datasets==2.20.0

Collecting datasets==2.20.0
  Downloading datasets-2.20.0-py3-none-any.whl.metadata (19 kB)
Collecting pyarrow-hotfix (from datasets==2.20.0)
  Downloading pyarrow_hotfix-0.7-py3-none-any.whl.metadata (3.6 kB)
Collecting fsspec<=2024.5.0,>=2023.1.0 (from fsspec[http]<=2024.5.0,>=2023.1.0->datasets==2.20.0)
  Downloading fsspec-2024.5.0-py3-none-any.whl.metadata (11 kB)
Downloading datasets-2.20.0-py3-none-any.whl (547 kB)
Downloading fsspec-2024.5.0-py3-none-any.whl (316 kB)
Downloading pyarrow_hotfix-0.7-py3-none-any.whl (7.9 kB)
Installing collected packages: pyarrow-hotfix, fsspec, datasets
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2025.3.0
    Uninstalling fsspec-2025.3.0:
      Successfully uninstalled 

Listing the installed packages

In [None]:
import subprocess

# Run pip list and capture the output
result = subprocess.run(['pip', 'list'], stdout=subprocess.PIPE, text=True)

# Split the output into lines and count them
package_list = result.stdout.split('\n')

# Adjust count for headers or empty lines
package_count = len([line for line in package_list if line.strip() != '']) - 2

print(f"Number of installed packages: {package_count}")

Number of installed packages: 636


In [None]:
import subprocess

# Run pip list and capture the output
result = subprocess.run(['pip', 'list'], stdout=subprocess.PIPE, text=True)

# Print the output
print(result.stdout)

Package                               Version
------------------------------------- ------------------
absl-py                               1.4.0
accelerate                            1.9.0
aiofiles                              24.1.0
aiohappyeyeballs                      2.6.1
aiohttp                               3.12.15
aiosignal                             1.4.0
alabaster                             1.0.0
albucore                              0.0.24
albumentations                        2.0.8
ale-py                                0.11.2
altair                                5.5.0
annotated-types                       0.7.0
antlr4-python3-runtime                4.9.3
anyio                                 4.10.0
anywidget                             0.9.18
argon2-cffi                           25.1.0
argon2-cffi-bindings                  25.1.0
array_record                          0.7.2
arviz                                 0.22.0
astropy                               7.1.0
astropy

counting the number of packages

# 1.Preparing the dataset for fine-tuning

## 1.1.Downloading and displaying the dataset

In [None]:
from datasets import load_dataset
import pandas as pd

# Load the SQuAD dataset from HuggingFace
dataset = load_dataset("squad", split="train[:500]")

# Filter the dataset to ensure context and answer are present
filtered_dataset = dataset.filter(lambda x: x["context"] != "" and x["answers"]["text"] != [])

# Extract prompt (context + question) and response (answer)
def extract_prompt_response(example):
    return {
        "prompt": example["context"] + " " + example["question"],
        "response": example["answers"]["text"][0]  # Take the first answer
    }

filtered_dataset = filtered_dataset.map(extract_prompt_response)

# Print the number of examples
print("Number of examples: ", len(filtered_dataset))

Error while fetching `HF_TOKEN` secret value from your vault: 'Requesting secret HF_TOKEN timed out. Secrets can only be fetched when running from the Colab UI.'.
You are not authenticated with the Hugging Face Hub in this notebook.
If the error persists, please let us know by opening an issue on GitHub (https://github.com/huggingface/huggingface_hub/issues/new).


Downloading readme: 0.00B [00:00, ?B/s]

Downloading data:   0%|          | 0.00/14.5M [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/1.82M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/87599 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/10570 [00:00<?, ? examples/s]

Filter:   0%|          | 0/500 [00:00<?, ? examples/s]

Map:   0%|          | 0/500 [00:00<?, ? examples/s]

Number of examples:  500
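To see what the filter and map steps do without downloading anything, the same logic can be applied to a single record. This is only a sketch: the record below is hand-written to mimic the SQuAD schema (`context`, `question`, `answers`) and is not taken from the dataset, and the two function names are illustrative.

```python
# A hand-written record that mimics the SQuAD schema (not taken from the dataset).
example = {
    "context": "The Eiffel Tower is located in Paris.",
    "question": "Where is the Eiffel Tower located?",
    "answers": {"text": ["Paris"], "answer_start": [31]},
}


def has_context_and_answer(record):
    """Same test as the filter lambda: non-empty context and at least one answer."""
    return record["context"] != "" and record["answers"]["text"] != []


def to_prompt_response(record):
    """Same mapping as extract_prompt_response: prompt is context + question."""
    return {
        "prompt": record["context"] + " " + record["question"],
        "response": record["answers"]["text"][0],  # take the first answer
    }


if has_context_and_answer(example):
    pair = to_prompt_response(example)
    print(pair["prompt"])
    print(pair["response"])
```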


In [None]:
# Convert the filtered dataset to a pandas DataFrame
df_view = pd.DataFrame(filtered_dataset)

# Display the DataFrame
df_view.head()

Unnamed: 0,id,title,context,question,answers,prompt,response
0,5733be284776f41900661182,University_of_Notre_Dame,"Architecturally, the school has a Catholic cha...",To whom did the Virgin Mary allegedly appear i...,"{'text': ['Saint Bernadette Soubirous'], 'answ...","Architecturally, the school has a Catholic cha...",Saint Bernadette Soubirous
1,5733be284776f4190066117f,University_of_Notre_Dame,"Architecturally, the school has a Catholic cha...",What is in front of the Notre Dame Main Building?,"{'text': ['a copper statue of Christ'], 'answe...","Architecturally, the school has a Catholic cha...",a copper statue of Christ
2,5733be284776f41900661180,University_of_Notre_Dame,"Architecturally, the school has a Catholic cha...",The Basilica of the Sacred heart at Notre Dame...,"{'text': ['the Main Building'], 'answer_start'...","Architecturally, the school has a Catholic cha...",the Main Building
3,5733be284776f41900661181,University_of_Notre_Dame,"Architecturally, the school has a Catholic cha...",What is the Grotto at Notre Dame?,{'text': ['a Marian place of prayer and reflec...,"Architecturally, the school has a Catholic cha...",a Marian place of prayer and reflection
4,5733be284776f4190066117e,University_of_Notre_Dame,"Architecturally, the school has a Catholic cha...",What sits on top of the Main Building at Notre...,{'text': ['a golden statue of the Virgin Mary'...,"Architecturally, the school has a Catholic cha...",a golden statue of the Virgin Mary


## 1.2A Streaming the output to JSON


In [None]:
import json
import pandas as pd

## 1.2. Preparing the dataset for fine-tuning

In [None]:
import jsonlines
import pandas as pd
from datasets import load_dataset

# Convert to DataFrame and clean
df = pd.DataFrame(filtered_dataset)
#columns_to_drop = ['title','question','answers']
#df = df.drop(columns=columns_to_drop)

# Prepare the data items for JSON lines file
items = []
for idx, row in df.iterrows():
    detailed_answer = row['response'] + " Explanation: " + row['context']
    items.append({
        "messages": [
            {"role": "system", "content": "Given a SQuAD question built from Wikipedia with crowdworkers, provide the correct answer with a detailed explanation."},
            {"role": "user", "content": row['question']},
            {"role": "assistant", "content": detailed_answer}
        ]
    })

# Write to JSON lines file
with jsonlines.open('/content/QA_prompts_and_completions.json', 'w') as writer:
    writer.write_all(items)
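Before uploading, it can be worth checking that every line of the file has the chat structure built above. The sketch below is one possible check, under the assumption that only the three roles used in this notebook (`system`, `user`, `assistant`) should appear; `validate_chat_lines` is a hypothetical helper, not an OpenAI function.

```python
import json

# Roles used by this notebook's records; other roles are treated as unexpected here.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_lines(lines):
    """Return (line_number, problem) tuples; an empty list means no problem was found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((number, "unexpected role"))
                break
            if not isinstance(message.get("content"), str):
                problems.append((number, "content is not a string"))
                break
        else:
            # Only reached when every message passed the checks above.
            if messages[-1]["role"] != "assistant":
                problems.append((number, "last message is not from the assistant"))
    return problems


sample_lines = [
    json.dumps({"messages": [
        {"role": "system", "content": "Answer the question."},
        {"role": "user", "content": "What is SQuAD?"},
        {"role": "assistant", "content": "A reading comprehension dataset."},
    ]}),
    '{"messages": []}',
]
print(validate_chat_lines(sample_lines))
```

To check the real file, the open file object can be passed directly, since iterating over it yields one line at a time.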

### Visualizing the JSON file

In [None]:
dfile="/content/QA_prompts_and_completions.json"

In [None]:
import pandas as pd

# Load the data
df = pd.read_json(dfile, lines=True)
df

Unnamed: 0,messages
0,"[{'role': 'system', 'content': 'Given a SQuAD ..."
1,"[{'role': 'system', 'content': 'Given a SQuAD ..."
2,"[{'role': 'system', 'content': 'Given a SQuAD ..."
3,"[{'role': 'system', 'content': 'Given a SQuAD ..."
4,"[{'role': 'system', 'content': 'Given a SQuAD ..."
...,...
495,"[{'role': 'system', 'content': 'Given a SQuAD ..."
496,"[{'role': 'system', 'content': 'Given a SQuAD ..."
497,"[{'role': 'system', 'content': 'Given a SQuAD ..."
498,"[{'role': 'system', 'content': 'Given a SQuAD ..."
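Since the introduction recommends checking the cost before fine-tuning, a rough estimate can be computed from the prepared records. The sketch below rests on loudly stated assumptions: about 4 characters per token is only a rule of thumb for English text, the price is a placeholder (not an actual OpenAI price), and the number of epochs is a guess because the job uses `n_epochs='auto'`. Both function names are hypothetical.

```python
def estimate_training_tokens(records, chars_per_token=4.0):
    """Approximate the token count from the character count (heuristic, not a tokenizer)."""
    total_chars = sum(
        len(message["content"])
        for record in records
        for message in record["messages"]
    )
    return int(total_chars / chars_per_token)


def estimate_training_cost(tokens, price_per_million_tokens, n_epochs):
    """Assumes billing is proportional to tokens in the file times epochs trained."""
    return tokens * n_epochs * price_per_million_tokens / 1_000_000


# One synthetic record with 400 characters of content in total.
sample_records = [
    {"messages": [
        {"role": "system", "content": "s" * 100},
        {"role": "user", "content": "u" * 100},
        {"role": "assistant", "content": "a" * 200},
    ]}
]

tokens = estimate_training_tokens(sample_records)
# 5.0 is a placeholder price per million tokens; look up the current price before relying on it.
cost = estimate_training_cost(tokens, price_per_million_tokens=5.0, n_epochs=3)
print(tokens, cost)
```

For an exact count, a real tokenizer would replace the character heuristic. The figure is meant as an order of magnitude only.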


# 2.Fine-tuning the model



In [None]:
from openai import OpenAI
import jsonlines
client = OpenAI()
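# Uploading the training file
# (the absolute path matches where the file was written in section 1.2)
with open("/content/QA_prompts_and_completions.json", "rb") as training_file:
  result_file = client.files.create(
    file=training_file,
    purpose="fine-tune"
  )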

print(result_file)
param_training_file_name = result_file.id
print(param_training_file_name)

# Creating the fine-tuning job
ft_job = client.fine_tuning.jobs.create(
  training_file=param_training_file_name,
  model="gpt-4.1-mini-2025-04-14"
)

# Printing the fine-tuning job
print(ft_job)

FileObject(id='file-6VgWqp3wmd6Tos6aUihLLZ', bytes=645098, created_at=1755077028, filename='QA_prompts_and_completions.json', object='file', purpose='fine-tune', status='processed', expires_at=None, status_details=None)
file-6VgWqp3wmd6Tos6aUihLLZ
FineTuningJob(id='ftjob-f9JxcdrNVYiVnO4M3EZkLaLT', created_at=1755077029, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size='auto', learning_rate_multiplier='auto', n_epochs='auto'), model='gpt-4.1-mini-2025-04-14', object='fine_tuning.job', organization_id='org-h2Kjmcir4wyGtqq1mJALLGIb', result_files=[], seed=1735636110, status='validating_files', trained_tokens=None, training_file='file-6VgWqp3wmd6Tos6aUihLLZ', validation_file=None, estimated_finish=None, integrations=[], metadata=None, method=Method(type='supervised', dpo=None, reinforcement=None, supervised=SupervisedMethod(hyperparameters=SupervisedHyperparameters(batch_size='auto', learning_rate_multipli

## Monitoring the fine-tunes

In [None]:
import pandas as pd
from openai import OpenAI
client = OpenAI()
# Assume client is already set up and authenticated
response = client.fine_tuning.jobs.list(limit=3)# increase to see history

# Initialize lists to store the extracted data
job_ids = []
created_ats = []
statuses = []
models = []
training_files = []
error_messages = []
fine_tuned_models = []  # List to store the fine-tuned model names

# Iterate over the jobs in the response
for job in response.data:
    job_ids.append(job.id)
    created_ats.append(job.created_at)
    statuses.append(job.status)
    models.append(job.model)
    training_files.append(job.training_file)
    error_message = job.error.message if job.error else None
    error_messages.append(error_message)

    # Append the fine-tuned model name
    fine_tuned_model = job.fine_tuned_model if hasattr(job, 'fine_tuned_model') else None
    fine_tuned_models.append(fine_tuned_model)

# Create a DataFrame
df = pd.DataFrame({
    'Job ID': job_ids,
    'Created At': created_ats,
    'Status': statuses,
    'Model': models,
    'Training File': training_files,
    'Error Message': error_messages,
    'Fine-Tuned Model': fine_tuned_models  # Include the fine-tuned model names
})

# Convert timestamps to readable format
df['Created At'] = pd.to_datetime(df['Created At'], unit='s')
df = df.sort_values(by='Created At', ascending=False)

# Display the DataFrame
df

Unnamed: 0,Job ID,Created At,Status,Model,Training File,Error Message,Fine-Tuned Model
0,ftjob-f9JxcdrNVYiVnO4M3EZkLaLT,2025-08-13 09:23:49,succeeded,gpt-4.1-mini-2025-04-14,file-6VgWqp3wmd6Tos6aUihLLZ,,ft:gpt-4.1-mini-2025-04-14:personal::C42Vkm29
1,ftjob-5BnCL6gUsFCTzyDYAN4CfvlZ,2024-12-02 10:04:26,succeeded,gpt-4o-mini-2024-07-18,file-KYWaDkNUDFSqqUcjnSv8hw,,ft:gpt-4o-mini-2024-07-18:personal::AZy8s878
2,ftjob-vEQg9PSCb323TBGuxtWVqiRy,2024-12-02 09:47:46,succeeded,gpt-4o-mini-2024-07-18,file-VqrvbFAVDfg6bGyrBBdXRb,,ft:gpt-4o-mini-2024-07-18:personal::AZxxRhfd
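Instead of re-running the monitoring cell by hand, the status can be polled in a loop. The sketch below is written so it runs without an API call: `wait_for_job` is a hypothetical helper that takes any `retrieve` callable, and the demo uses a fake one. It assumes the terminal statuses are `succeeded`, `failed`, and `cancelled`; in the notebook, `client.fine_tuning.jobs.retrieve` could be passed in together with `ft_job.id`.

```python
import time
from types import SimpleNamespace

# Statuses after which the job will not change any more (assumed set).
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve, job_id, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Call retrieve(job_id) until the job reaches a terminal status or polls run out."""
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls.")


# Demo with a fake retrieve function that walks through three statuses.
_fake_statuses = iter(["validating_files", "running", "succeeded"])


def fake_retrieve(job_id):
    return SimpleNamespace(id=job_id, status=next(_fake_statuses))


finished = wait_for_job(fake_retrieve, "ftjob-demo", poll_seconds=0, sleep=lambda s: None)
print(finished.status)

# With the real client (not executed here):
# finished = wait_for_job(client.fine_tuning.jobs.retrieve, ft_job.id)
```

Passing `retrieve` and `sleep` as parameters keeps the loop testable. The `max_polls` cap avoids an endless loop if the job stalls.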


### Make sure to obtain your fine-tuned model here

If your OpenAI notifications are activated you should receive an email.

Otherwise run the "Monitoring the fine-tunes" cell above to check the status of your fine-tune job.

In [None]:
import pandas as pd

generation=False  # Stays False until the latest fine-tuned model is found. Make sure it was trained on your dataset!
# Attempt to find the first non-empty Fine-Tuned Model
non_empty_models = df[df['Fine-Tuned Model'].notna() & (df['Fine-Tuned Model'] != '')]

if not non_empty_models.empty:
    first_non_empty_model = non_empty_models['Fine-Tuned Model'].iloc[0]
    print("The latest fine-tuned model is:", first_non_empty_model)
    generation=True
else:
    first_non_empty_model=None  # None, not the string 'None', so a later truthiness check behaves as expected
    print("No fine-tuned models found.")

The latest fine-tuned model is: ft:gpt-4.1-mini-2025-04-14:personal::C42Vkm29
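The comment in the cell above warns that the selected model should be the one trained on your dataset. One way to make that explicit is to filter on the training file ID as well as the status. This is a sketch on plain dictionaries with made-up IDs, not the notebook's DataFrame; `pick_fine_tuned_model` is a hypothetical helper.

```python
def pick_fine_tuned_model(jobs, training_file_id):
    """Return the newest succeeded model trained on the given file, or None."""
    candidates = [
        job for job in jobs
        if job["status"] == "succeeded"
        and job["fine_tuned_model"]
        and job["training_file"] == training_file_id
    ]
    if not candidates:
        return None
    # created_at is a Unix timestamp, so the largest value is the most recent job.
    newest = max(candidates, key=lambda job: job["created_at"])
    return newest["fine_tuned_model"]


# Made-up job records for illustration only.
sample_jobs = [
    {"status": "succeeded", "fine_tuned_model": "ft:model-old", "training_file": "file-old", "created_at": 100},
    {"status": "succeeded", "fine_tuned_model": "ft:model-new", "training_file": "file-new", "created_at": 300},
    {"status": "running", "fine_tuned_model": None, "training_file": "file-new", "created_at": 400},
]

print(pick_fine_tuned_model(sample_jobs, "file-new"))
print(pick_fine_tuned_model(sample_jobs, "file-missing"))
```

In the notebook, the file ID to match would be the one stored in `param_training_file_name`.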


In [None]:
# Fine-tuned model found(True) or not(False)
generation

True

*Note:* Only continue to Step 3 to use the fine-tuned model once it is ready. If your OpenAI notifications are activated, you will receive an email with the status of your fine-tuning job.

# 3.Using the fine-tuned OpenAI model

Note: This is a fine-tuning job. As such, be patient!
Run the `Monitoring the fine-tunes` cell and the `first_non_empty_model` cell from time to time.

If the fine-tuning succeeded and your model is ready, the name of your model will be stored in `first_non_empty_model`.

1. Go to the OpenAI Playground to test your model: https://platform.openai.com/playground

2. Check the metrics in the fine-tuning UI:
https://platform.openai.com/finetune/

3. Try the fine-tuned model out in the cell below.

In [None]:
# Define the prompt
prompt="Which prize did Frederick Buechner create?"

*Note:* Only run the following cell if your fine-tuning job has succeeded and a fine-tuned model is found in the *Monitoring the fine-tunes* section of *2.Fine-tuning the model*.

In [None]:
# Assume first_non_empty_model is defined above this snippet
if generation==True:
    response = client.chat.completions.create(
        # fine-tuned model or fallback if the model is not fine-tuned yet
        model=first_non_empty_model or "gpt-4.1-mini-2025-04-14",
        temperature=0.0,  # Adjust as needed for variability
        messages=[
            {"role": "system", "content": "Given a question, reply with a complete explanation for students."},
            {"role": "user", "content": prompt}
        ]
    )
else:
    print("Error: Model is None, cannot proceed with the API request.")

In [None]:
if generation==True:
  print(response)

ChatCompletion(id='chatcmpl-C42hsGT0RiJhxJmjhV0kI7VCERHOX', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Frederick Buechner Prize for Preaching Explanation: The university is affiliated with the Congregation of Holy Cross (Latin: Congregatio a Sancta Cruce, abbreviated postnominally as "CSC"). While religious affiliation is not a criterion for admission, more than 93% of students identify as Christian, with over 80% of the total being Catholic. Collectively, Catholic Mass is celebrated over 100 times per week on campus, and a large campus ministry program provides for the spiritual needs of the community. There are multitudes of religious statues and artwork around campus, most prominent of which are the statue of Mary on the Main Building, the Notre Dame Grotto, and the Word of Life mural on Hesburgh Library depicting Christ as a teacher. Additionally, every classroom displays a crucifix. There are many religious clubs (catholic 

In [None]:
if (generation==True):
  # Access the response from the first choice
  response_text = response.choices[0].message.content
  # Print the response
  print(response_text)

Frederick Buechner Prize for Preaching Explanation: The university is affiliated with the Congregation of Holy Cross (Latin: Congregatio a Sancta Cruce, abbreviated postnominally as "CSC"). While religious affiliation is not a criterion for admission, more than 93% of students identify as Christian, with over 80% of the total being Catholic. Collectively, Catholic Mass is celebrated over 100 times per week on campus, and a large campus ministry program provides for the spiritual needs of the community. There are multitudes of religious statues and artwork around campus, most prominent of which are the statue of Mary on the Main Building, the Notre Dame Grotto, and the Word of Life mural on Hesburgh Library depicting Christ as a teacher. Additionally, every classroom displays a crucifix. There are many religious clubs (catholic and non-Catholic) at the university, including Council #1477 of the Knights of Columbus (KOC), Baptist Collegiate Ministry (BCM), Jewish Club, Muslim Student Ass

In [None]:
import textwrap

if generation==True:
  wrapped_text = textwrap.fill(response_text.strip(), 60)
  print(wrapped_text)

Frederick Buechner Prize for Preaching Explanation: The
university is affiliated with the Congregation of Holy Cross
(Latin: Congregatio a Sancta Cruce, abbreviated
postnominally as "CSC"). While religious affiliation is not
a criterion for admission, more than 93% of students
identify as Christian, with over 80% of the total being
Catholic. Collectively, Catholic Mass is celebrated over 100
times per week on campus, and a large campus ministry
program provides for the spiritual needs of the community.
There are multitudes of religious statues and artwork around
campus, most prominent of which are the statue of Mary on
the Main Building, the Notre Dame Grotto, and the Word of
Life mural on Hesburgh Library depicting Christ as a
teacher. Additionally, every classroom displays a crucifix.
There are many religious clubs (catholic and non-Catholic)
at the university, including Council #1477 of the Knights of
Columbus (KOC), Baptist Collegiate Ministry (BCM), Jewish
Club, Muslim Student Ass
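The output mirrors the training format, in which each assistant message was built as the answer followed by `" Explanation: "` and the context. Under that assumption, the short answer can be separated from the explanation. The sketch below uses a hypothetical `split_answer` helper and a shortened copy of the response above.

```python
def split_answer(text, marker=" Explanation: "):
    """Split 'answer Explanation: context' into (answer, explanation)."""
    answer, found, explanation = text.partition(marker)
    if not found:
        # The marker is absent, so treat the whole text as the answer.
        return text.strip(), ""
    return answer.strip(), explanation.strip()


sample_response = (
    "Frederick Buechner Prize for Preaching Explanation: "
    "The university is affiliated with the Congregation of Holy Cross"
)

answer, explanation = split_answer(sample_response)
print("Answer:", answer)
print("Explanation:", explanation)
```

`str.partition` splits on the first occurrence of the marker only, so any later "Explanation:" inside the context stays in the explanation part.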

[Consult the OpenAI fine-tuning documentation for more](https://platform.openai.com/docs/guides/fine-tuning)