# Azure OpenAI GPT-4o-mini fine-tuning

The first step of Reinforcement Fine-Tuning (RFT) is to train the model using Supervised Fine-Tuning (SFT), enabling it to generate Chains of Thought (CoT) in a specific format. This prepares the model for subsequent Reinforcement Learning, which guides it to reason step-by-step along the correct CoT to arrive at accurate answers.

In [1]:
import sys, os
import json
from openai import AzureOpenAI
from dotenv import load_dotenv
load_dotenv()

aoai_api_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
aoai_api_key = os.getenv("AZURE_OPENAI_API_KEY")
aoai_api_version = os.getenv("AZURE_OPENAI_API_VERSION")

subscription = os.getenv("AZURE_SUBSCRIPTION_ID")
resource_group = os.getenv("AZURE_RESOURCE_GROUP")
resource_name = os.getenv("AZURE_OPENAI_SERVICE_NAME")
model_deployment_name = "gpt-4o-mini-2024-07-18-ft" # Custom deployment name you chose for your fine-tuning model


In [7]:
client = AzureOpenAI(
  azure_endpoint = aoai_api_endpoint,
  api_key = aoai_api_key,
  api_version = "2024-08-01-preview"  # This API version or later is required to access seed/events/checkpoint features
)
client

<openai.lib.azure.AzureOpenAI at 0x7f4de757a1a0>

## Data Set

After executing `data/reasoningplaning/run_debug.sh`, a data file for SFT named `math_500_tst.{uuid}.flat.sft.jsonl` is generated in the `data/reasoningplaning/samples` directory. For demonstration purposes, we use this file as both the training and the validation dataset.

Then we can upload the data files to Azure.
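Reusing one file for training and validation is fine for a demo, but the validation metrics then say little about generalization. A minimal sketch of a hold-out split follows; the helper name `split_jsonl`, the output paths, and the default 10% validation fraction are assumptions for illustration, not part of the original notebook.

```python
import json
import random
from pathlib import Path


def split_jsonl(src, train_path, valid_path, valid_fraction=0.1, seed=105):
    """Shuffle the JSONL records in `src` and write separate train/validation files.

    Hypothetical helper: returns (number of training rows, number of validation rows).
    """
    text = Path(src).read_text(encoding="utf-8")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for ln in lines:
        json.loads(ln)  # fail early if any record is not valid JSON

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(lines)

    n_valid = max(1, int(len(lines) * valid_fraction))
    valid, train = lines[:n_valid], lines[n_valid:]

    Path(train_path).write_text("\n".join(train) + "\n", encoding="utf-8")
    Path(valid_path).write_text("\n".join(valid) + "\n", encoding="utf-8")
    return len(train), len(valid)
```

With a 122-example file and `valid_fraction=0.1`, this would hold out 12 examples and keep 110 for training.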

In [1]:
# After executing `data/reasoningplaning/run_debug.sh`, a data file for SFT, named `math_500_tst.{uuid}.flat.sft.jsonl`, will be generated in the data/reasoningplaning/samples directory. 
# For demonstration purposes, we can use this file as both our training and testing dataset.
filepath = os.path.normpath(os.path.join(os.path.abspath(os.getcwd()),'..','..','data/evolvemcts4rl/samples/math_500_tst.3e7b.flat.sft.jsonl'))

# Run preliminary checks

# Load the training set
with open(filepath, 'r', encoding='utf-8') as f:
    training_dataset = [json.loads(line) for line in f]

# Training dataset stats
print("Number of examples in data set:", len(training_dataset))
print("First example in data set:")
for message in training_dataset[0]["messages"]:
    print(message)

training_file_name = filepath
validation_file_name = filepath

Number of examples in data set: 122
First example in data set:
{'role': 'system', 'content': 'You are a math expert who excels at solving mathematical problems step by step.'}
{'role': 'user', 'content': 'In the math club at my educational institution, which comprises 6 male participants and 8 female participants, I am tasked with forming a team to represent us at the upcoming state-level mathematics competition. Our goal is to assemble a group of exactly 6 individuals. How many distinct combinations can I utilize to form this team, assuming there are no limitations on the selection process?'}
{'role': 'assistant', 'content': 'To find the total number of distinct combinations to form a team of 6 individuals from 14 participants (6 males and 8 females), we can use the combination formula:\n\n\\[\n\\binom{n}{r} = \\frac{n!}{r!(n - r)!}\n\\]\n\nwhere \\( n \\) is the total number of participants, and \\( r \\) is the number of participants to choose.\n\nHere, \\( n = 14 \\) and \\( r = 6 
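Printing the first example only confirms that one record looks right. A minimal sketch of a structural check over every record is shown below, assuming the chat format seen above (a `messages` list of `role`/`content` dicts); the function name `validate_example` and the specific rules are illustrative assumptions, not the service's own validation.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_example(example):
    """Return a list of problems found in one chat-format training example."""
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    problems = []
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: empty or non-string content")

    # SFT needs a target to learn from, so the last turn should be the assistant's.
    if messages[-1].get("role") != "assistant":
        problems.append("last message is not from the assistant")
    return problems


# Example usage with the list loaded earlier in the notebook:
# bad = {i: p for i, ex in enumerate(training_dataset) if (p := validate_example(ex))}
# print(f"{len(bad)} problematic examples")
```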

In [2]:
# Then we can upload the data files to Azure OpenAI with the SDK.
# Context managers close the file handles once each upload finishes.
with open(training_file_name, "rb") as f:
    training_response = client.files.create(file=f, purpose="fine-tune")
training_file_id = training_response.id

with open(validation_file_name, "rb") as f:
    validation_response = client.files.create(file=f, purpose="fine-tune")
validation_file_id = validation_response.id

print("Training file ID:", training_file_id)
print("Validation file ID:", validation_file_id)

Training file ID: file-5d729d05e4fb4c2c936c76f4d448a5ee
Validation file ID: file-53b00097ebcd4197952cd1ea6165d7c1


## Begin fine-tuning

Now that the fine-tuning files have been uploaded, you can submit the fine-tuning training job:

In [4]:
# Submit fine-tuning training job

response = client.fine_tuning.jobs.create(
    training_file = training_file_id,
    validation_file = validation_file_id,
    model = "gpt-4o-mini-2024-07-18", # Enter base model name. Note that in Azure OpenAI the model name contains dashes and cannot contain dot/period characters.
    seed = 105 # seed parameter controls reproducibility of the fine-tuning job. If no seed is specified one will be generated automatically.
)

job_id = response.id

# You can use the job ID to monitor the status of the fine-tuning job.
# The fine-tuning job will take some time to start and complete.

print("Job ID:", response.id)
print("Status:", response.status)
print(response.model_dump_json(indent=2))

Job ID: ftjob-ec3e8decda384cddaacf81b7f573b1c9
Status: pending
{
  "id": "ftjob-ec3e8decda384cddaacf81b7f573b1c9",
  "created_at": 1735688105,
  "error": null,
  "fine_tuned_model": null,
  "finished_at": null,
  "hyperparameters": {
    "n_epochs": -1,
    "batch_size": -1,
    "learning_rate_multiplier": 1
  },
  "model": "gpt-4o-mini-2024-07-18",
  "object": "fine_tuning.job",
  "organization_id": null,
  "result_files": null,
  "seed": 105,
  "status": "pending",
  "trained_tokens": null,
  "training_file": "file-5d729d05e4fb4c2c936c76f4d448a5ee",
  "validation_file": "file-53b00097ebcd4197952cd1ea6165d7c1",
  "estimated_finish": 1735689027,
  "integrations": null
}
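The `-1` values for `n_epochs` and `batch_size` in the response above indicate that the service will choose them. If you want to pin them instead, the OpenAI Python SDK accepts a `hyperparameters` argument on `fine_tuning.jobs.create`. The sketch below only assembles the keyword arguments; `build_job_kwargs` is a hypothetical helper, and the values in the usage comment are examples rather than recommendations.

```python
def build_job_kwargs(training_file_id, validation_file_id, base_model,
                     seed=None, n_epochs=None, batch_size=None,
                     learning_rate_multiplier=None):
    """Assemble keyword arguments for client.fine_tuning.jobs.create().

    Only hyperparameters that are explicitly set are included, so anything
    left as None keeps the service default.
    """
    kwargs = {
        "training_file": training_file_id,
        "validation_file": validation_file_id,
        "model": base_model,
    }
    if seed is not None:
        kwargs["seed"] = seed

    candidates = {
        "n_epochs": n_epochs,
        "batch_size": batch_size,
        "learning_rate_multiplier": learning_rate_multiplier,
    }
    hyperparameters = {k: v for k, v in candidates.items() if v is not None}
    if hyperparameters:
        kwargs["hyperparameters"] = hyperparameters
    return kwargs


# Example usage (values are illustrative):
# response = client.fine_tuning.jobs.create(
#     **build_job_kwargs(training_file_id, validation_file_id,
#                        "gpt-4o-mini-2024-07-18", seed=105, n_epochs=3)
# )
```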


In [8]:
job_id="ftjob-ec3e8decda384cddaacf81b7f573b1c9"

If you would like to **poll the training job status** until it's complete, you can run:

In [9]:
# Track training status
from IPython.display import clear_output
import time

start_time = time.time()

# Get the status of our fine-tuning job.
response = client.fine_tuning.jobs.retrieve(job_id)

status = response.status

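# If the job isn't done yet, poll it every 10 seconds.
# "cancelled" is also a terminal state; without it the loop never exits for a cancelled job.
while status not in ["succeeded", "failed", "cancelled"]: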
    time.sleep(10)

    response = client.fine_tuning.jobs.retrieve(job_id)
    print(response.model_dump_json(indent=2))
    print("Elapsed time: {} minutes {} seconds".format(int((time.time() - start_time) // 60), int((time.time() - start_time) % 60)))
    status = response.status
    print(f'Status: {status}')
    clear_output(wait=True)

print(f'Fine-tuning job {job_id} finished with status: {status}')

# List all fine-tuning jobs for this resource.
print('Checking other fine-tune jobs for this resource.')
response = client.fine_tuning.jobs.list()
print(f'Found {len(response.data)} fine-tune jobs.')

Fine-tuning job ftjob-ec3e8decda384cddaacf81b7f573b1c9 finished with status: succeeded
Checking other fine-tune jobs for this resource.
Found 1 fine-tune jobs.


It isn't unusual for training to take more than an hour to complete. Once training is completed, the output message will change to something like:
```
Fine-tuning job ftjob-900fcfc7ea1d4360a9f0cb1697b4eaa6 finished with status: succeeded
Checking other fine-tune jobs for this resource.
Found 4 fine-tune jobs.
```
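Because training can run for an hour or more, an unattended polling loop benefits from a timeout. The following is a minimal sketch, assuming the status can be fetched through a zero-argument callable; `wait_for_job`, its default intervals, and the injected `sleep`/`clock` parameters (there to make the loop testable) are illustrative choices, not part of the SDK.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(get_status, poll_seconds=10, timeout_seconds=4 * 60 * 60,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal status is returned or the timeout expires."""
    start = clock()
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() - start >= timeout_seconds:
            raise TimeoutError(
                f"job still {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


# Example usage with the client and job_id defined earlier in the notebook:
# final = wait_for_job(lambda: client.fine_tuning.jobs.retrieve(job_id).status)
# print(f"Fine-tuning job {job_id} finished with status: {final}")
```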

While not necessary to complete fine-tuning, it can be helpful to **examine the individual fine-tuning events** that were generated during training. The full training results can also be examined after training is complete in the training results file.

In [10]:
response = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10)
print(response.model_dump_json(indent=2))

{
  "data": [
    {
      "id": "ftevent-8467d44edd3949c1ab3cc432f475609d",
      "created_at": 1735693954,
      "level": "info",
      "message": "Training tokens billed: 159000",
      "object": "fine_tuning.job.event",
      "type": "message"
    },
    {
      "id": "ftevent-ac9413c9e83d4e91b95f2a9a6e6c0736",
      "created_at": 1735693954,
      "level": "info",
      "message": "Model Evaluation Passed.",
      "object": "fine_tuning.job.event",
      "type": "message"
    },
    {
      "id": "ftevent-8f6bf4a901a543f2a29c40904e6184fe",
      "created_at": 1735693954,
      "level": "info",
      "message": "Completed results file: file-05d60eedd8d84226b4e047860dba9f03",
      "object": "fine_tuning.job.event",
      "type": "message"
    },
    {
      "id": "ftevent-4d7a06bec7f942178850a1fda209166b",
      "created_at": 1735693951,
      "level": "info",
      "message": "Postprocessing started.",
      "object": "fine_tuning.job.event",
      "type": "message"
    },
    {
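The events in the output above are listed newest first, and the results file ID is embedded in a free-text message. A minimal sketch for putting events in chronological order and extracting that ID follows; the helper names and the regular expression are assumptions based only on the message wording shown above, which the service may change.

```python
import re

# Matches messages such as "Completed results file: file-05d6..." (format assumed from the output above).
RESULTS_FILE_PATTERN = re.compile(r"Completed results file:\s*(file-[0-9a-f]+)")


def chronological(events):
    """Return events oldest first.

    The input is assumed to be newest first, so it is reversed before the
    stable sort to keep same-second events in their original emission order.
    """
    return sorted(reversed(events), key=lambda e: e["created_at"])


def find_results_file(events):
    """Return the results file ID mentioned in the events, or None if absent."""
    for event in events:
        match = RESULTS_FILE_PATTERN.search(event.get("message", ""))
        if match:
            return match.group(1)
    return None


# Example usage with the response from list_events():
# events = [e.model_dump() for e in response.data]
# for e in chronological(events):
#     print(e["created_at"], e["message"])
# print(find_results_file(events))
```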
  

When each training epoch completes, a checkpoint is generated. A checkpoint is a fully functional version of the model that can be deployed or used as the base model for a subsequent fine-tuning job. Checkpoints are particularly useful because they can provide a snapshot of the model from before overfitting occurred. When a fine-tuning job completes, the three most recent versions of the model are available to deploy: the final epoch is represented by your fine-tuned model, and the previous two epochs are available as checkpoints. Use the code below to **list checkpoints**.

In [11]:
response = client.fine_tuning.jobs.checkpoints.list(job_id)
print(response.model_dump_json(indent=2))

{
  "data": [
    {
      "id": "ftchkpt-37e580e8262549a0808a0ce33c13a2f5",
      "created_at": 1735693706,
      "fine_tuned_model_checkpoint": "gpt-4o-mini-2024-07-18.ft-ec3e8decda384cddaacf81b7f573b1c9",
      "fine_tuning_job_id": "ftjob-ec3e8decda384cddaacf81b7f573b1c9",
      "metrics": {
        "full_valid_loss": 0.19185392828990863,
        "full_valid_mean_token_accuracy": 0.7687091483267644,
        "step": 366.0,
        "train_loss": 0.09808275103569031,
        "train_mean_token_accuracy": 0.9593495726585388,
        "valid_loss": 0.0745824507947238,
        "valid_mean_token_accuracy": 0.9669811320754716
      },
      "object": "fine_tuning.job.checkpoint",
      "step_number": 366
    },
    {
      "id": "ftchkpt-c60ef8693a6d4c82a0bf1d6704709ecb",
      "created_at": 1735693443,
      "fine_tuned_model_checkpoint": "gpt-4o-mini-2024-07-18.ft-ec3e8decda384cddaacf81b7f573b1c9:ckpt-step-244",
      "fine_tuning_job_id": "ftjob-ec3e8decda384cddaacf81b7f573b1c9",
      "me

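Since checkpoints exist to let you step back from an overfitted final epoch, one simple way to choose among them is to compare a validation metric. The sketch below picks the checkpoint with the lowest `full_valid_loss`; the function name `best_checkpoint` and the choice of metric are assumptions, and checkpoints that lack the metric are skipped.

```python
def best_checkpoint(checkpoints, metric="full_valid_loss"):
    """Return the checkpoint dict with the lowest value of `metric`, or None."""
    scored = [
        c for c in checkpoints
        if (c.get("metrics") or {}).get(metric) is not None
    ]
    if not scored:
        return None
    return min(scored, key=lambda c: c["metrics"][metric])


# Example usage with the response from checkpoints.list():
# checkpoints = [c.model_dump() for c in response.data]
# best = best_checkpoint(checkpoints)
# if best is not None:
#     print(best["fine_tuned_model_checkpoint"], best["metrics"]["full_valid_loss"])
```

The returned `fine_tuned_model_checkpoint` name can then be used as the model name when deploying, in place of the final fine-tuned model.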
To get the **final results**, run the following:

In [12]:
# Retrieve fine_tuned_model name

response = client.fine_tuning.jobs.retrieve(job_id)

print(response.model_dump_json(indent=2))
fine_tuned_model = response.fine_tuned_model

{
  "id": "ftjob-ec3e8decda384cddaacf81b7f573b1c9",
  "created_at": 1735688105,
  "error": null,
  "fine_tuned_model": "gpt-4o-mini-2024-07-18.ft-ec3e8decda384cddaacf81b7f573b1c9",
  "finished_at": 1735693954,
  "hyperparameters": {
    "n_epochs": 3,
    "batch_size": 1,
    "learning_rate_multiplier": 1
  },
  "model": "gpt-4o-mini-2024-07-18",
  "object": "fine_tuning.job",
  "organization_id": null,
  "result_files": [
    "file-05d60eedd8d84226b4e047860dba9f03"
  ],
  "seed": 105,
  "status": "succeeded",
  "trained_tokens": 133713,
  "training_file": "file-5d729d05e4fb4c2c936c76f4d448a5ee",
  "validation_file": "file-53b00097ebcd4197952cd1ea6165d7c1",
  "estimated_finish": 1735689027,
  "integrations": null
}
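The `result_files` entry above points to the training results file with per-step metrics. A minimal sketch for turning that file into rows of numbers is shown below. It assumes the file is plain CSV with numeric columns and may contain empty cells; the exact column names are not taken from the original notebook, so the parser stays generic over whatever headers are present.

```python
import csv
import io


def parse_results_csv(csv_text):
    """Parse a results CSV into a list of {column: float} dicts, skipping empty cells."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for key, value in raw.items():
            if key is None or value is None or value.strip() == "":
                continue  # ignore surplus fields and metrics not logged at this step
            row[key] = float(value)
        rows.append(row)
    return rows


# Example usage (assumes the download returns plain CSV bytes):
# result_file_id = response.result_files[0]
# csv_text = client.files.content(result_file_id).read().decode("utf-8")
# rows = parse_results_csv(csv_text)
# print(rows[-1])  # metrics at the final logged step
```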


## Deploy fine-tuned model

Unlike the previous Python SDK commands in this tutorial, model deployment must be done through the REST API (a requirement since the introduction of the quota feature). This requires separate authorization, a different API path, and a different API version.

Alternatively, you can deploy your fine-tuned model using any of the other common deployment methods, such as Azure OpenAI Studio or the Azure CLI.

Before you run the command below, run `az login` in the terminal.

In [17]:
# you should run `az login` in the terminal first
!az account get-access-token --subscription {subscription}

{
  "accessToken": "<redacted>",

Copy the `accessToken` value from the output of `az account get-access-token` and assign it to a variable.
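Copying the token by hand is error-prone and risks leaving a credential in the saved notebook. As an alternative, the CLI output can be captured and parsed in Python. This is a minimal sketch, assuming the `az` executable is on the `PATH` and `az login` has already been run; the helper names are hypothetical, and on Windows the executable may need to be invoked as `az.cmd`.

```python
import json
import subprocess


def parse_access_token(cli_output):
    """Extract the accessToken field from `az account get-access-token` JSON output."""
    payload = json.loads(cli_output)
    token = payload.get("accessToken")
    if not token:
        raise ValueError("no 'accessToken' field in az CLI output")
    return token


def get_management_token(subscription_id):
    """Run the Azure CLI and return a bearer token for the management API."""
    result = subprocess.run(
        ["az", "account", "get-access-token",
         "--subscription", subscription_id, "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return parse_access_token(result.stdout)


# Example usage with the subscription variable defined at the top of the notebook:
# accesstoken = get_management_token(subscription)
```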

In [18]:
accesstoken = "<paste the accessToken value here>"  # keep real tokens out of saved notebooks

In [20]:
# Deploy fine-tuned model

import json
import requests

token = accesstoken

deploy_params = {'api-version': "2023-05-01"}
deploy_headers = {'Authorization': 'Bearer {}'.format(token), 'Content-Type': 'application/json'}

deploy_data = {
    "sku": {"name": "standard", "capacity": 1},
    "properties": {
        "model": {
            "format": "OpenAI",
            "name": "gpt-4o-mini-2024-07-18.ft-ec3e8decda384cddaacf81b7f573b1c9", #retrieve this value from the previous call, it will look like gpt-4o-mini-2024-07-18.ft-0e208cf33a6a466994aff31a08aba678
            "version": "1"
        }
    }
}
deploy_data = json.dumps(deploy_data)

request_url = f'https://management.azure.com/subscriptions/{subscription}/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices/accounts/{resource_name}/deployments/{model_deployment_name}'

print('Creating a new deployment...')

r = requests.put(request_url, params=deploy_params, headers=deploy_headers, data=deploy_data)

print(r)
print(r.reason)
print(r.json())

Creating a new deployment...


<Response [201]>
Created
{'id': '/subscriptions/<your subscription id>/resourceGroups/<resource group name>/providers/Microsoft.CognitiveServices/accounts/<azure ai service name>/deployments/gpt-4o-mini-2024-07-18-ft', 'type': 'Microsoft.CognitiveServices/accounts/deployments', 'name': 'gpt-4o-mini-2024-07-18-ft', 'sku': {'name': 'standard', 'capacity': 1}, 'properties': {'model': {'format': 'OpenAI', 'name': 'gpt-4o-mini-2024-07-18.ft-ec3e8decda384cddaacf81b7f573b1c9', 'version': '1'}, 'versionUpgradeOption': 'NoAutoUpgrade', 'capabilities': {'area': 'US', 'chatCompletion': 'true', 'jsonObjectResponse': 'true', 'maxContextToken': '128000', 'maxOutputToken': '16384', 'assistants': 'true'}, 'provisioningState': 'Creating'}, 'systemData': {'createdBy': '<account name>', 'createdByType': 'User', 'createdAt': '2025-01-01T01:59:53.9322892Z', 'lastModifiedBy': '<account name>', 'lastModifiedByType': 'User', 'lastModifiedAt': '2025-01-01T01:59:53.9322892Z'}, 'etag': '"52b74cb6-2965-4d16-a741-
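The response above reports `'provisioningState': 'Creating'`, so the deployment is not yet ready to serve requests. One way to check on it is to issue a `GET` against the same resource URL and read the state back. The sketch below covers only the URL construction and state extraction; the helper names are hypothetical, and treating `Succeeded` as the ready state is an assumption based on the usual Azure Resource Manager convention.

```python
def deployment_url(subscription, resource_group, resource_name, deployment_name):
    """Build the management API URL for one Azure OpenAI deployment."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.CognitiveServices"
        f"/accounts/{resource_name}"
        f"/deployments/{deployment_name}"
    )


def provisioning_state(deployment_payload):
    """Return properties.provisioningState from a deployment response, or None."""
    return (deployment_payload.get("properties") or {}).get("provisioningState")


# Example usage, reusing deploy_params and deploy_headers from the cell above:
# import requests
# url = deployment_url(subscription, resource_group, resource_name, model_deployment_name)
# r = requests.get(url, params=deploy_params, headers=deploy_headers)
# print(provisioning_state(r.json()))  # assumed to become 'Succeeded' once ready
```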

Test the deployment.

In [4]:
# Use the deployed customized model

import os
from openai import AzureOpenAI

client2 = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
  api_key = os.getenv("AZURE_OPENAI_API_KEY"),
  api_version = "2024-06-01"
)

response = client2.chat.completions.create(
    model = model_deployment_name, # model = "Custom deployment name you chose for your fine-tuning model"
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure AI services support this too?"}
    ]
)

print(response.choices[0].message.content)

Yes, other Azure AI services also support customer-managed keys for data encryption. Services like Azure Cognitive Services, Azure Machine Learning, and others within the Azure ecosystem allow users to manage their encryption keys using Azure Key Vault, providing an added layer of security and control over sensitive data. Always check the specific documentation for each service for detailed information on how to implement customer-managed keys.
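The smoke test above confirms that the deployment responds, but it does not exercise the behavior the model was fine-tuned for. A closer check would reuse the system prompt from the training data together with a math question. The following is a minimal sketch; `build_math_messages` is a hypothetical helper, and the question in the usage comment is only an example.

```python
# System prompt copied from the first training example printed earlier in the notebook.
TRAINING_SYSTEM_PROMPT = (
    "You are a math expert who excels at solving mathematical problems step by step."
)


def build_math_messages(question, system_prompt=TRAINING_SYSTEM_PROMPT):
    """Build a chat message list that mirrors the format of the SFT training data."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question.strip()},
    ]


# Example usage with the client and deployment name defined above:
# response = client2.chat.completions.create(
#     model=model_deployment_name,
#     messages=build_math_messages("How many ways can 6 people be chosen from 14?"),
# )
# print(response.choices[0].message.content)
```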
