### Sammi Beard
##### DSC 670 | Term Project Milestone 3

#### Setup

##### Import Libraries

In [1]:
import os 
from dotenv import load_dotenv
import json
import openai
import datetime
import sys
import time
import requests

##### Connect to API
I imported dotenv to securely manage environment variables and avoid hardcoding sensitive credentials like API keys directly into the code.

In [2]:
load_dotenv()

# Access the environment variables from the .env file
openai.api_key = os.environ['OPENAI_API_KEY']
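
A small guard can make a missing key easier to diagnose than a bare `KeyError`. The sketch below is not part of the original notebook; `require_env` is a hypothetical helper name, and the error wording is only a suggestion.

```python
import os


def require_env(name):
    """Return an environment variable's value, or raise a clear error if it is missing or empty.

    `require_env` is a hypothetical helper, not something provided by dotenv or openai.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell"
        )
    return value
```

After `load_dotenv()` has run, it could be used as `openai.api_key = require_env("OPENAI_API_KEY")`.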

##### Files
Since the assignment was not about building the dataset, I used the dataset from the repository that goes with the book.

In [3]:
file_path = "gymnastics_train.jsonl"

I created a function to validate the JSONL file because any formatting errors could cause the fine-tuning process to fail. This ensures that the file adheres to the required JSON schema before proceeding further.
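
The validation function itself does not appear in this export, so the following is only a minimal sketch of what such a check might look like. It assumes the chat-style training format (one JSON object per line with a `messages` list of `role`/`content` pairs); the name `validate_jsonl` and the `ALLOWED_ROLES` set are illustrative assumptions, and the real format permits more than this sketch checks.

```python
import json

# Assumption: only these three roles appear in the training data.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl(path):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append((line_number, "missing or empty 'messages' list"))
                continue
            for message in messages:
                well_formed = (
                    isinstance(message, dict)
                    and message.get("role") in ALLOWED_ROLES
                    and isinstance(message.get("content"), str)
                )
                if not well_formed:
                    problems.append((line_number, "message without a valid role/content pair"))
                    break  # one report per line is enough
    return problems
```

Calling `validate_jsonl(file_path)` before the upload and stopping if the returned list is non-empty would catch formatting errors locally rather than during the job's `validating_files` stage.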

In [4]:
headers = {
    "Authorization": f"Bearer {openai.api_key}"
}

In [5]:
# Upload the training file; the context manager closes the file handle afterwards
with open(file_path, "rb") as f:
    upload_response = requests.post(
        "https://api.openai.com/v1/files",
        headers=headers,
        files={"file": (file_path, f)},
        data={"purpose": "fine-tune"}
    )

In [6]:
response_data = upload_response.json()
print(response_data)

{'object': 'file', 'id': 'file-4QqbYL7feyA5UoXjtLmeM3', 'purpose': 'fine-tune', 'filename': 'gymnastics_train.jsonl', 'bytes': 25050, 'created_at': 1740947706, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [7]:
file_id = response_data["id"]

response = requests.get(
    f"https://api.openai.com/v1/files/{file_id}",
    headers=headers
)

file_status = response.json()
print(file_status)

{'object': 'file', 'id': 'file-4QqbYL7feyA5UoXjtLmeM3', 'purpose': 'fine-tune', 'filename': 'gymnastics_train.jsonl', 'bytes': 25050, 'created_at': 1740947706, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [8]:
data = {
    "training_file": file_id,
    "model": "gpt-3.5-turbo"
}

response = requests.post(
    "https://api.openai.com/v1/fine_tuning/jobs",
    headers=headers,
    json=data
)

fine_tuning_job = response.json()
print(fine_tuning_job)


{'object': 'fine_tuning.job', 'id': 'ftjob-sIZ8iKplOXcwrIniVurDmOVg', 'model': 'gpt-3.5-turbo-0125', 'created_at': 1740947708, 'finished_at': None, 'fine_tuned_model': None, 'organization_id': 'org-kYyBAwWpPRWIx9kUZilHZ0Po', 'result_files': [], 'status': 'validating_files', 'validation_file': None, 'training_file': 'file-4QqbYL7feyA5UoXjtLmeM3', 'hyperparameters': {'n_epochs': 'auto', 'batch_size': 'auto', 'learning_rate_multiplier': 'auto'}, 'trained_tokens': None, 'error': {}, 'user_provided_suffix': None, 'seed': 1585263679, 'estimated_finish': None, 'integrations': [], 'method': {'type': 'supervised', 'supervised': {'hyperparameters': {'batch_size': 'auto', 'learning_rate_multiplier': 'auto', 'n_epochs': 'auto'}}}}


**I'm going to poll at 10-minute intervals this time. This section is a little longer with the REST API method, but it functions essentially the same way.**

In [9]:
job_id = fine_tuning_job["id"]
model_id = None

# Check the job status periodically until it's finished
while True:
    response = requests.get(
        f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}",
        headers=headers
    )

    job_status = response.json()
    print(f"Current status: {job_status['status']}. Waiting for completion...")

    # Stop on any terminal status; a cancelled job would otherwise loop forever
    if job_status['status'] in ["succeeded", "failed", "cancelled"]:
        break

    time.sleep(600)  # Wait for 10 minutes before checking again

# Once the job is finished, proceed
if job_status['status'] == "succeeded":
    model_id = job_status['fine_tuned_model']  # Access the fine-tuned model
    print(f"\nModel created, Model ID: {model_id}")
else:
    print("\n***** DO NOT PROCEED YET *****\nThe fine-tuning job failed.")

Current status: validating_files. Waiting for completion...
Current status: succeeded. Waiting for completion...

Model created, Model ID: ft:gpt-3.5-turbo-0125:personal::B6kPTwVp
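
Because the same polling loop is repeated for every training run, it could be factored into a helper. The sketch below is one possible shape, not the notebook's actual code: `wait_for_job`, `fetch_status`, `max_checks`, and `TERMINAL_STATUSES` are hypothetical names, and the status fetch is passed in as a callable so the helper itself makes no network calls.

```python
import time

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(fetch_status, interval_seconds=600, max_checks=None, sleep=time.sleep):
    """Call fetch_status() repeatedly until the job reaches a terminal status.

    fetch_status: zero-argument callable returning the job as a dict with a "status" key.
    max_checks:   optional cap on the number of checks; None means poll indefinitely.
    sleep:        injectable for testing; defaults to time.sleep.
    """
    checks = 0
    while True:
        job = fetch_status()
        checks += 1
        status = job["status"]
        print(f"Current status: {status}")
        if status in TERMINAL_STATUSES:
            return job
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError(f"Job still '{status}' after {checks} checks")
        sleep(interval_seconds)
```

With the variables from this notebook, the call might look like `job_status = wait_for_job(lambda: requests.get(f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}", headers=headers).json())`.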


In [10]:
if model_id:
    url = "https://api.openai.com/v1/chat/completions"
    data = {
        "model": model_id,
        "temperature": 0.0,
        "messages": [
            {"role":"system","content":"You are an expert gymnastics coach."},
            {"role":"user","content":"Create a 50-minute gymnastics practice plan for intermediate girls. They have trampoline, floor, and bars."}
        ]
    }

    response = requests.post(url, headers=headers, json=data)
    
    if response.status_code == 200:
        completion = response.json()
        print(completion['choices'][0]['message']['content'])
    else:
        print(f"Error: {response.status_code} - {response.text}")
else:
    print("The model isn't ready yet... You need to ensure the model is ready and obtain the model ID.")

## 50-Minute Practice Plan

**Skill Focus:** Intermediate Girls

| Time | Activity |
| ---- | -------- |
| 0:00 - 0:05 | **Warm-Up**<br>1. Jog around the floor<br>2. Stretching |
| 0:05 - 0:10 | **Trampoline**<br>1. Seat drops<br>2. Tuck jumps |
| 0:10 - 0:20 | **Floor**<br>1. Round-offs<br>2. Backbend kickovers |
| 0:20 - 0:30 | **Bars**<br>1. Pull-overs<br>2. Back hip circles |
| 0:30 - 0:35 | **Trampoline**<br>1. Pike jumps<br>2. Straddle jumps |
| 0:35 - 0:40 | **Floor**<br>1. Cartwheels<br>2. Handstands |
| 0:40 - 0:45 | **Bars**<br>1. Front hip circles<br>2. Glide swings |
| 0:45 - 0:50 | **Cool Down**<br>1. Stretching<br>2. Good job talk |


This is not what I wanted, so let's adjust the file and try again.

In [11]:
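# Upload the adjusted training file; the context manager closes the file handle afterwards
with open("gymnastics_train1.jsonl", "rb") as f:
    upload_response = requests.post(
        "https://api.openai.com/v1/files",
        headers=headers,
        files={"file": ("gymnastics_train1.jsonl", f)},
        data={"purpose": "fine-tune"}
    )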

In [12]:
response_data = upload_response.json()
print(response_data)

{'object': 'file', 'id': 'file-Kasd9RfTaWQjnoo1XZpGXR', 'purpose': 'fine-tune', 'filename': 'gymnastics_train1.jsonl', 'bytes': 11018, 'created_at': 1740948319, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [13]:
file_id = response_data["id"]

response = requests.get(
    f"https://api.openai.com/v1/files/{file_id}",
    headers=headers
)

file_status = response.json()
print(file_status)

{'object': 'file', 'id': 'file-Kasd9RfTaWQjnoo1XZpGXR', 'purpose': 'fine-tune', 'filename': 'gymnastics_train1.jsonl', 'bytes': 11018, 'created_at': 1740948319, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [14]:
data = {
    "training_file": file_id,
    "model": "gpt-3.5-turbo"
}

response = requests.post(
    "https://api.openai.com/v1/fine_tuning/jobs",
    headers=headers,
    json=data
)

fine_tuning_job = response.json()
print(fine_tuning_job)


{'object': 'fine_tuning.job', 'id': 'ftjob-VSar9JNuPxCQCRttS3CeVhF0', 'model': 'gpt-3.5-turbo-0125', 'created_at': 1740948320, 'finished_at': None, 'fine_tuned_model': None, 'organization_id': 'org-kYyBAwWpPRWIx9kUZilHZ0Po', 'result_files': [], 'status': 'validating_files', 'validation_file': None, 'training_file': 'file-Kasd9RfTaWQjnoo1XZpGXR', 'hyperparameters': {'n_epochs': 'auto', 'batch_size': 'auto', 'learning_rate_multiplier': 'auto'}, 'trained_tokens': None, 'error': {}, 'user_provided_suffix': None, 'seed': 1043080509, 'estimated_finish': None, 'integrations': [], 'method': {'type': 'supervised', 'supervised': {'hyperparameters': {'batch_size': 'auto', 'learning_rate_multiplier': 'auto', 'n_epochs': 'auto'}}}}


In [15]:
job_id = fine_tuning_job["id"]
model_id = None

# Check the job status periodically until it's finished
while True:
    response = requests.get(
        f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}",
        headers=headers
    )

    job_status = response.json()
    print(f"Current status: {job_status['status']}. Waiting for completion...")

    # Stop on any terminal status; a cancelled job would otherwise loop forever
    if job_status['status'] in ["succeeded", "failed", "cancelled"]:
        break

    time.sleep(600)  # Wait for 10 minutes before checking again

# Once the job is finished, proceed
if job_status['status'] == "succeeded":
    model_id = job_status['fine_tuned_model']  # Access the fine-tuned model
    print(f"\nModel created, Model ID: {model_id}")
else:
    print("\n***** DO NOT PROCEED YET *****\nThe fine-tuning job failed.")

Current status: validating_files. Waiting for completion...
Current status: succeeded. Waiting for completion...

Model created, Model ID: ft:gpt-3.5-turbo-0125:personal::B6kbEILJ


In [16]:
if model_id:
    url = "https://api.openai.com/v1/chat/completions"
    data = {
        "model": model_id,
        "temperature": 0.0,
        "messages": [
            {"role":"system","content":"You are an expert gymnastics coach."},
            {"role":"user","content":"Create a 50-minute gymnastics practice plan for intermediate girls. They have trampoline, floor, and bars."}
        ]
    }

    response = requests.post(url, headers=headers, json=data)
    
    if response.status_code == 200:
        completion = response.json()
        print(completion['choices'][0]['message']['content'])
    else:
        print(f"Error: {response.status_code} - {response.text}")
else:
    print("The model isn't ready yet... You need to ensure the model is ready and obtain the model ID.")

## 50-Minute Intermediate Girls Practice Plan:

**Warm Up (6 min):**
- 1 minute jog
- 1 minute stretch
- 1 minute pike sit and reach
- 1 minute straddle sit and reach
- 1 minute bridge
- 1 minute split sit

**Trampoline (12 min):**
- 3 jumps to front drop
- 3 jumps to seat drop
- 3 jumps to back drop
- 3 jumps to stomach drop

**Floor (16 min):**
- Round off rebound
- Handstand forward roll
- Cartwheel on low beam
- Handstand against wall

**Bars (12 min):**
- 3 chin ups
- 3 pull overs
- 3 casts to 45 degree
- 3 back hip circles

**Games (4 min):**
- Simon says

**Dismissal (2 min):**
- Good job Song


This still isn't what I want. Let me try again.

In [17]:
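# Upload the second revision of the training file; the context manager closes the file handle afterwards
with open("gymnastics_train2.jsonl", "rb") as f:
    upload_response = requests.post(
        "https://api.openai.com/v1/files",
        headers=headers,
        files={"file": ("gymnastics_train2.jsonl", f)},
        data={"purpose": "fine-tune"}
    )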

In [18]:
response_data = upload_response.json()
print(response_data)

{'object': 'file', 'id': 'file-GFEpHk4JZ2NrGqSSHvg8py', 'purpose': 'fine-tune', 'filename': 'gymnastics_train2.jsonl', 'bytes': 59131, 'created_at': 1740948928, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [19]:
file_id = response_data["id"]

response = requests.get(
    f"https://api.openai.com/v1/files/{file_id}",
    headers=headers
)

file_status = response.json()
print(file_status)

{'object': 'file', 'id': 'file-GFEpHk4JZ2NrGqSSHvg8py', 'purpose': 'fine-tune', 'filename': 'gymnastics_train2.jsonl', 'bytes': 59131, 'created_at': 1740948928, 'expires_at': None, 'status': 'processed', 'status_details': None}


In [20]:
data = {
    "training_file": file_id,
    "model": "gpt-3.5-turbo"
}

response = requests.post(
    "https://api.openai.com/v1/fine_tuning/jobs",
    headers=headers,
    json=data
)

fine_tuning_job = response.json()
print(fine_tuning_job)


{'object': 'fine_tuning.job', 'id': 'ftjob-bF4WRaaGVR57xEBU8qqERVsm', 'model': 'gpt-3.5-turbo-0125', 'created_at': 1740948929, 'finished_at': None, 'fine_tuned_model': None, 'organization_id': 'org-kYyBAwWpPRWIx9kUZilHZ0Po', 'result_files': [], 'status': 'validating_files', 'validation_file': None, 'training_file': 'file-GFEpHk4JZ2NrGqSSHvg8py', 'hyperparameters': {'n_epochs': 'auto', 'batch_size': 'auto', 'learning_rate_multiplier': 'auto'}, 'trained_tokens': None, 'error': {}, 'user_provided_suffix': None, 'seed': 823818013, 'estimated_finish': None, 'integrations': [], 'method': {'type': 'supervised', 'supervised': {'hyperparameters': {'batch_size': 'auto', 'learning_rate_multiplier': 'auto', 'n_epochs': 'auto'}}}}


In [21]:
job_id = fine_tuning_job["id"]
model_id = None

# Check the job status periodically until it's finished
while True:
    response = requests.get(
        f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}",
        headers=headers
    )

    job_status = response.json()
    print(f"Current status: {job_status['status']}. Waiting for completion...")

    # Stop on any terminal status; a cancelled job would otherwise loop forever
    if job_status['status'] in ["succeeded", "failed", "cancelled"]:
        break

    time.sleep(600)  # Wait for 10 minutes before checking again

# Once the job is finished, proceed
if job_status['status'] == "succeeded":
    model_id = job_status['fine_tuned_model']  # Access the fine-tuned model
    print(f"\nModel created, Model ID: {model_id}")
else:
    print("\n***** DO NOT PROCEED YET *****\nThe fine-tuning job failed.")

Current status: validating_files. Waiting for completion...
Current status: succeeded. Waiting for completion...

Model created, Model ID: ft:gpt-3.5-turbo-0125:personal::B6kk1I2j


In [22]:
if model_id:
    url = "https://api.openai.com/v1/chat/completions"
    data = {
        "model": model_id,
        "temperature": 0.0,
        "messages": [
            {"role":"system","content":"You are an expert gymnastics coach."},
            {"role":"user","content":"Create a 90-minute gymnastics practice plan for intermediate girls. They have trampoline, floor, and bars."}
        ]
    }

    response = requests.post(url, headers=headers, json=data)
    
    if response.status_code == 200:
        completion = response.json()
        print(completion['choices'][0]['message']['content'])
    else:
        print(f"Error: {response.status_code} - {response.text}")
else:
    print("The model isn't ready yet... You need to ensure the model is ready and obtain the model ID.")

## 90-Minute Intermediate Recreational Girls Practice Plan

### Warm Up (8 minutes):
- 20 jumping jacks
- Arm circles forward and backward
- Pike stretch
- Shoulder stretch
- Bridge with 10 rocks and 10 kicks
- Dynamic leg swings

---

### Trampoline (20 minutes):

| **Station**            | **Equipment**            | **Execution**                                                               | **Focus**                  |
|------------------------|--------------------------|-----------------------------------------------------------------------------|----------------------------|
| Seat Drops             | Trampoline                | Sit, bounce, and land in a seated position, focusing on good form           | Core and leg strength      |
| Tuck Jumps             | Trampoline                | Jump and tuck knees to chest before landing                                 | Height and body control    |
| Straddle Jumps         | Trampoline                | Jump and open legs to a straddle 

Yay! This is more what I was looking for.

In [27]:
response = requests.get(
    f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}",
    headers=headers
)
# Parse the job details from the JSON response
data = response.json()

print(f"Fine-Tuning Job ID: {data['id']}")
print(f"Model: {data['model']}")
# These keys exist with a None value until the job finishes, so use `or` for the fallback
print(f"Fine-Tuned Model: {data.get('fine_tuned_model') or 'Not available'}")
print(f"Status: {data['status']}")
print(f"Created At: {data['created_at']}")
print(f"Finished At: {data.get('finished_at') or 'Still running'}")
print(f"Training File ID: {data['training_file']}")
print(f"Trained Tokens: {data['trained_tokens']}")
print("Hyperparameters:")
for key, value in data["hyperparameters"].items():
    print(f"  - {key}: {value}")

Fine-Tuning Job ID: ftjob-bF4WRaaGVR57xEBU8qqERVsm
Model: gpt-3.5-turbo-0125
Fine-Tuned Model: ft:gpt-3.5-turbo-0125:personal::B6kk1I2j
Status: succeeded
Created At: 1740948929
Finished At: 1740949340
Training File ID: file-GFEpHk4JZ2NrGqSSHvg8py
Trained Tokens: 80360
Hyperparameters:
  - n_epochs: 8
  - batch_size: 1
  - learning_rate_multiplier: 2
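
The `created_at` and `finished_at` values above are Unix timestamps. As a rough sketch, they can be converted to readable UTC times and used to work out how long the job took. The per-epoch figure assumes that `trained_tokens` counts the training tokens once per epoch, which is my reading of how the total is reported rather than something this notebook confirms; `to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Values copied from the job summary printed above
created_at = 1740948929
finished_at = 1740949340
trained_tokens = 80360
n_epochs = 8


def to_utc(timestamp):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


duration_seconds = finished_at - created_at
minutes, seconds = divmod(duration_seconds, 60)

# Assumption: trained_tokens is the per-epoch token count multiplied by n_epochs
tokens_per_epoch = trained_tokens / n_epochs

print(f"Created:  {to_utc(created_at).isoformat()}")
print(f"Finished: {to_utc(finished_at).isoformat()}")
print(f"Duration: {minutes} min {seconds} s")
print(f"Approx. tokens per epoch: {tokens_per_epoch:.0f}")
```

Under these assumptions the job ran for 6 min 51 s from creation to completion, and the training file contributed roughly 10,045 tokens per epoch.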


#### References

OpenAI. (n.d.). Fine-tuning guide. OpenAI. Retrieved January 26, 2025, from https://platform.openai.com/docs/guides/fine-tuning

Bahree, M. (2023). Listing 9.1: emoji_ft_train.jsonl. In GenAI Book (Chapter 9). Retrieved January 26, 2025, from https://github.com/bahree/GenAIBook/blob/main/chapters/ch09/Listing-9.1-emoji_ft_train.jsonl

Suliot, M. (2023). Open AI fine-tuning example: Jupyter notebook. Retrieved January 26, 2025, from https://github.com/msuliot/jupyter_fine_tuning/blob/main/open_ai_fine_tuning.ipynb

OpenAI. (2023, December 14). Discussion: OpenAI Python - Fine-tuning model errors. OpenAI GitHub Discussions. Retrieved January 26, 2025, from https://github.com/openai/openai-python/discussions/742