# 2. OpenAI Model fine tuning

Now that we've created labelled training data, we can fine tune our model using the Supervised Fine Tuning (SFT) technique. Azure OpenAI uses LoRA to fine tune models efficiently. **LoRA (Low-Rank Adaptation)** is a technique for adapting a pre-trained Large Language Model to a specific task with fewer computational resources than retraining the whole model.

Instead of adjusting all the model parameters, LoRA introduces a small number of additional parameters (low-rank matrices) that modify the model's behavior. Only these new parameters are trained, while the original model's parameters stay frozen. This way, the model can learn the new task without the need for extensive computational resources or time.
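
To make the savings concrete, here is a minimal sketch that compares the number of trainable parameters for a full update of a single weight matrix against a LoRA update of the same matrix. The layer dimensions and the rank are illustrative assumptions, not values used by Azure OpenAI.

```python
# Illustrative sketch: trainable parameters for full fine tuning vs. LoRA
# on one weight matrix W of shape (d_out, d_in). LoRA learns two small
# matrices B (d_out x r) and A (r x d_in); the adapted weight is W + B @ A
# (up to a scaling factor), and W itself is not updated.

def full_finetune_params(d_out: int, d_in: int) -> int:
    """Every entry of W is trainable."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Only B (d_out x rank) and A (rank x d_in) are trainable."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"Full fine tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.2%} of the parameters")
```

The ratio shrinks further as the layer gets larger or the rank gets smaller, which is why LoRA is attractive for large models.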

Azure OpenAI lets developers customize OpenAI models with their own data and deploy the resulting custom model through an affordable, easy-to-use managed service.

While fine tuning can be a complex process, Azure OpenAI abstracts away much of the complexity to make it accessible to any developer.

## Overview
![](./doc/raft-process-ft.png)

### 0. How much will this cost?

Fine tuning on Azure OpenAI is priced by the number of tokens you train your model on, which makes the cost of fine tuning experiments easy to predict and manage.

For GPT-4o mini, training price is $0.003300 per 1K tokens.

To estimate the cost of our fine tuning job, we can use the following formula:

`(Training cost per 1K input tokens / 1K) * number of tokens in input file * number of epochs trained`

**epoch:** one complete pass through the training dataset during the training of a model.

1. If the number of epochs is too low: Your model might be underfitted, which means it could perform poorly because it hasn't learned enough from the training data. In essence, it may not have had enough iterations to effectively learn and adjust its parameters (e.g., weights and biases).

2. If the number of epochs is too high: There's a risk of overfitting, where the model becomes too specialized in the training data and performs poorly on unseen data (examples that weren’t in your training dataset).

The number of epochs is a parameter of the fine tuning job. Usually, 3 epochs is a reasonable starting point.
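
As a worked example of the cost formula, the sketch below applies it to a hypothetical dataset of one million training tokens, using the GPT-4o mini price quoted above and 3 epochs. The function name and the token count are assumptions made for illustration.

```python
def estimate_training_cost(num_tokens: int, price_per_1k_tokens: float, num_epochs: int) -> float:
    """Apply the formula: (price per 1K tokens / 1K) * tokens in file * epochs."""
    return (price_per_1k_tokens / 1000) * num_tokens * num_epochs


# Hypothetical dataset size, chosen only for illustration
example_tokens = 1_000_000

cost = estimate_training_cost(example_tokens, price_per_1k_tokens=0.0033, num_epochs=3)
print(f"Estimated training cost: {cost:.2f} USD")
```

With these inputs the estimate comes to about 9.90 USD. Doubling either the dataset size or the number of epochs doubles the cost.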

**Let's explore our dataset and estimate our fine tuning costs.**

In [None]:
import os
from dotenv import load_dotenv

# Variables passed by previous notebooks
load_dotenv(".env.state")

ds_name = os.getenv("DATASET_NAME")
ds_path = f"dataset/{ds_name}"
dataset_path_ft_train = f"{ds_path}-files/{ds_name}-ft.train.jsonl"
dataset_path_ft_valid = f"{ds_path}-files/{ds_name}-ft.valid.jsonl"

print(f"Using dataset {ds_name} for fine tuning")

STUDENT_MODEL_NAME = os.getenv("STUDENT_MODEL_NAME")
print(f"Training student model {STUDENT_MODEL_NAME}")

In [None]:
import tiktoken
import json

training_file_path = dataset_path_ft_train
try:
    encoding = tiktoken.encoding_for_model(STUDENT_MODEL_NAME)
except KeyError:
    print(f"Model {STUDENT_MODEL_NAME} not found in tiktoken encodings, using o200k_base")
    encoding = tiktoken.get_encoding("o200k_base")

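def num_tokens_from_messages(messages, tokens_per_message=3, tokens_per_name=1):
    """Approximate the number of tokens used by a list of chat messages."""
    num_tokens = 0
    for message in messages:
        # Fixed overhead for the special tokens that wrap every message
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            # An optional "name" field costs extra tokens
            if key == "name":
                num_tokens += tokens_per_name
    # Overhead for priming the assistant reply
    num_tokens += 3
    return num_tokens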

# Load the JSONL training file: one JSON object per line, each with a "messages" list
with open(training_file_path, 'r', encoding='utf-8') as f:
    dataset = [json.loads(line) for line in f if line.strip()]

# Sum the token counts over every conversation in the dataset
num_tokens = sum(num_tokens_from_messages(d.get('messages')) for d in dataset)

print(f"Number of tokens in training data: {num_tokens}")

training_cost_per_token = 0.003300 / 1000
num_epochs = 3
total_cost = num_tokens * training_cost_per_token * num_epochs

print(f"Total estimated cost for training: {total_cost:.2f} USD")

### 1. Uploading the training and validation data to Azure OpenAI

In [None]:
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential
from azure.identity import get_bearer_token_provider

aoai_endpoint = os.environ["FINETUNE_AZURE_OPENAI_ENDPOINT"]

# Authenticate using the default Azure credential chain
azure_credential = DefaultAzureCredential()

client = AzureOpenAI(
  azure_endpoint = aoai_endpoint,
  api_version = "2024-05-01-preview",  # This API version or later is required to access seed/events/checkpoint features
  azure_ad_token_provider = get_bearer_token_provider(
    azure_credential, "https://cognitiveservices.azure.com/.default"
  )
)

validation_file_path = dataset_path_ft_valid

# Upload the training and validation dataset files to Azure OpenAI with the SDK.
# Context managers ensure the file handles are closed after each upload.
with open(training_file_path, "rb") as training_file:
    training_response = client.files.create(file=training_file, purpose="fine-tune")
training_file_id = training_response.id

with open(validation_file_path, "rb") as validation_file:
    validation_response = client.files.create(file=validation_file, purpose="fine-tune")
validation_file_id = validation_response.id

print("Training file ID:", training_file_id)
print("Validation file ID:", validation_file_id)

### 2. Creating the fine tuning job

For each fine tuning job, you can specify the following hyperparameters:

- epochs: the number of complete passes through the entire training dataset.
- learning rate multiplier: the learning rate used for the fine tuning job, expressed as a multiple of the model's original learning rate. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results.
- batch size: the number of training examples processed together in one training step. Common choices are 32, 64, 128 and 256. Tune this value based on the size of your data and available compute.

The general recommendation is to start by training without specifying any of these, so that Azure OpenAI picks defaults based on the dataset size, and then to adjust them based on the results to find the best combination.
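
If you later decide to override the defaults, a minimal sketch could look like the following. It assumes the `hyperparameters` argument of the OpenAI Python SDK's `fine_tuning.jobs.create`, with the keys `n_epochs`, `batch_size` and `learning_rate_multiplier`. The helper `build_hyperparameters` is hypothetical and the values are illustrative, so check which options your API version supports.

```python
from typing import Optional


def build_hyperparameters(
    n_epochs: Optional[int] = None,
    batch_size: Optional[int] = None,
    learning_rate_multiplier: Optional[float] = None,
) -> dict:
    """Return only the hyperparameters that were explicitly set.

    Anything left as None is omitted, so the service keeps its default for it.
    """
    candidates = {
        "n_epochs": n_epochs,
        "batch_size": batch_size,
        "learning_rate_multiplier": learning_rate_multiplier,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Illustrative values only: override epochs and learning rate, keep default batch size
hyperparameters = build_hyperparameters(n_epochs=3, learning_rate_multiplier=0.1)
print(hyperparameters)

# Assumed usage with the client and file IDs created earlier in this notebook:
# response = client.fine_tuning.jobs.create(
#     training_file=training_file_id,
#     validation_file=validation_file_id,
#     model=STUDENT_MODEL_NAME,
#     hyperparameters=hyperparameters,
# )
```

Omitting unset values rather than sending `None` keeps the request equivalent to the default job for every parameter you have not chosen to tune.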

In [None]:
# Submit fine-tuning training job

response = client.fine_tuning.jobs.create(
    training_file = training_file_id,
    validation_file = validation_file_id,
    model = STUDENT_MODEL_NAME, # Enter base model name. Note that in Azure OpenAI the model name contains dashes and cannot contain dot/period characters.
    seed = 105 # seed parameter controls reproducibility of the fine-tuning job. If no seed is specified one will be generated automatically.
)

job_id = response.id

# You can use the job ID to monitor the status of the fine-tuning job.
# The fine-tuning job will take some time to start and complete.

print("Job ID:", response.id)
print("Status:", response.status)
print("Student model:", response.model)
#print(response.model_dump_json(indent=2))
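
Since the job runs asynchronously, one way to monitor it is to poll its status with the job ID. The sketch below assumes the SDK's `client.fine_tuning.jobs.retrieve` call and the terminal statuses `succeeded`, `failed` and `cancelled`. The helper `wait_for_job` and its polling interval are hypothetical choices for illustration.

```python
import time

# Statuses after which the job will not change any more
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id: str, poll_seconds: float = 60, max_polls: int = 120):
    """Poll the fine tuning job until it reaches a terminal status.

    Raises TimeoutError if the job is still running after max_polls checks.
    """
    for _ in range(max_polls):
        job = client.fine_tuning.jobs.retrieve(job_id)
        print("Status:", job.status)
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Assumed usage with the client and job_id defined earlier in this notebook:
# job = wait_for_job(client, job_id)
# print("Fine-tuned model:", job.fine_tuned_model)
```

Fine tuning jobs can take a long time to start and complete, so a generous polling interval avoids unnecessary API calls.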

In [None]:
from utils import update_state

update_state("STUDENT_OPENAI_JOB_ID", response.id)

## Next step -> Deployment
Continue with [./3_deploy_oai.ipynb](./3_deploy_oai.ipynb) to deploy the fine-tuned student model.