# Fine-tuning and deploying an Azure OpenAI model with function calling

## Objective

This notebook walks you through fine-tuning and deploying a gpt-35-turbo-0613 model with function calling using stock use case datasets.

Note that fine-tuning with function calling is currently available for the gpt-35-turbo (0613) and gpt-35-turbo-16k (1106) models. With support for function calling, you can incorporate functions into your training data and have your fine-tuned model make function calls. You can find more details in the [Azure OpenAI documentation on fine-tuning with function calling](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning-functions).

## Time

You should expect to spend 60-90 minutes running this sample.

## Before you begin

### Installation

In [None]:
# json, os, and time ship with the Python standard library and must not be passed to pip.
%pip install openai requests tiktoken numpy

## Prepare your data

The training data you use must be formatted as a JSON Lines (JSONL) document. Structure your examples as demonstrated below, with each line including a list of "messages" and an optional list of "functions". The example features two functions: the first retrieves the current stock price, and the second retrieves the stock price for the last n days.

```json
{"messages": [{"role": "system", "content": "Don't make assumptions about what values to plug into functions. If you can't find the exact stock ticker symbol, you can ask for clarification. "}, {"role": "user", "content": "What was the highest price that Bank of America's stock reached last month?"}, {"role": "assistant", "function_call": {"name": "get_last_nday_stock_price", "arguments": "{\"symbol\": \"BAC\", \"period\": \"1mo\"}"}}], "functions": [{"name": "get_current_stock_price", "description": "Get the current stock price", "parameters": {"type": "object", "properties": {"symbol": {"type": "string", "description": "The stock symbol"}}, "required": ["symbol"]}}, {"name": "get_last_nday_stock_price", "description": "Get stock price last n days", "parameters": {"type": "object", "properties": {"symbol": {"type": "string", "description": "The stock symbol"}, "period": {"type": "string", "description": "Valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max"}}, "required": ["symbol", "period"]}}]}
```

This single example shows the general format, but steering your custom fine-tuned model to respond in a similar way requires more examples. Generally, you want **at least 100 high-quality examples**.
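
Before collecting a larger set of examples, it can help to check each JSONL line against the format described above. The following is a minimal sketch, not an official validator: `validate_example` is a hypothetical helper, and the rules it applies (known roles, calls only to declared functions, `arguments` being a JSON-encoded string) are assumptions drawn from the sample line rather than from the service's own validation logic.

```python
import json
from typing import Any, Dict, List


def validate_example(example: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one training example (empty if none)."""
    problems: List[str] = []
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    # Names declared in the optional "functions" list.
    declared = {f.get("name") for f in example.get("functions", [])}

    for i, message in enumerate(messages):
        if message.get("role") not in {"system", "user", "assistant", "function"}:
            problems.append(f"message {i}: unexpected role {message.get('role')!r}")
        call = message.get("function_call")
        if call is None:
            continue
        # Only check the name when the example declares functions at all.
        if declared and call.get("name") not in declared:
            problems.append(f"message {i}: call to undeclared function {call.get('name')!r}")
        try:
            # "arguments" is itself a JSON-encoded string, as in the sample line.
            json.loads(call.get("arguments", ""))
        except (json.JSONDecodeError, TypeError):
            problems.append(f"message {i}: 'arguments' is not valid JSON")
    return problems


# A small, made-up example in the same shape as the sample line.
sample_line = json.dumps({
    "messages": [
        {"role": "user", "content": "What is Microsoft's stock price?"},
        {"role": "assistant", "function_call": {
            "name": "get_current_stock_price",
            "arguments": json.dumps({"symbol": "MSFT"}),
        }},
    ],
    "functions": [{"name": "get_current_stock_price", "parameters": {"type": "object"}}],
})
print(validate_example(json.loads(sample_line)))  # prints []
```

Running a check like this over every line of a dataset surfaces malformed examples before the file is uploaded.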

We have already created training and test datasets for our two scenarios:

**Hallucination:** A common problem with large language models is hallucination: providing plausible but false responses. With function calling, hallucinations can happen when the model calls a function in the wrong context or provides incorrect information for the function call. We will evaluate whether the fine-tuned model can correctly identify fake companies and respond appropriately, instead of trying to quote a stock price.

Hallucination datasets: `stock-train-hallucination.jsonl` and `stock-test-hallucination.jsonl`
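
One simple way to score this scenario is to count how often the model attempts a function call when asked about a company that does not exist. The sketch below assumes that model responses are available as assistant message dictionaries in the same shape as the training data. The helper names and the two sample responses are hypothetical and purely illustrative.

```python
import json
from typing import Any, Dict, List


def is_function_call(message: Dict[str, Any]) -> bool:
    """True if an assistant message contains a function call."""
    return bool(message.get("function_call"))


def function_call_rate(responses: List[Dict[str, Any]]) -> float:
    """Fraction of responses that attempted a function call."""
    if not responses:
        return 0.0
    return sum(is_function_call(r) for r in responses) / len(responses)


# Made-up responses to prompts about companies that do not exist.
fake_company_responses = [
    {"role": "assistant", "content": "I couldn't find a ticker symbol for that company."},
    {"role": "assistant", "function_call": {
        "name": "get_current_stock_price",
        "arguments": json.dumps({"symbol": "XYZQ"}),
    }},
]
print(function_call_rate(fake_company_responses))  # prints 0.5
```

For prompts about fake companies, a lower rate is better, because every function call there is a hallucinated one.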

**Token reduction:** The inclusion of functions in the system message directly impacts token usage. As the number of functions grows, so does the number of tokens within the system message, resulting in verbose prompts and increased costs. Fine-tuning lets you shorten your function calls.

Token reduction datasets: `stock-train-token-reduction.jsonl` and `stock-test-token-reduction.jsonl`
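
To illustrate how shortening works, the sketch below strips the `description` fields from a function definition and compares the serialized sizes. It is an assumption-laden illustration: `strip_descriptions` is a hypothetical helper, character count is only a rough proxy for token count, and the function assumes no parameter is itself named `description`.

```python
import json
from typing import Any


def strip_descriptions(schema: Any) -> Any:
    """Recursively drop every 'description' key from a function definition.

    Returns a new structure; the input is left unchanged.
    """
    if isinstance(schema, dict):
        return {k: strip_descriptions(v) for k, v in schema.items() if k != "description"}
    if isinstance(schema, list):
        return [strip_descriptions(v) for v in schema]
    return schema


full = {
    "name": "get_current_stock_price",
    "description": "Get the current stock price",
    "parameters": {
        "type": "object",
        "properties": {"symbol": {"type": "string", "description": "The stock symbol"}},
        "required": ["symbol"],
    },
}
short = strip_descriptions(full)

# Character counts are a rough proxy; use tiktoken for actual token counts.
print(len(json.dumps(full)), "->", len(json.dumps(short)), "characters")
```

Whether a model still calls the shortened function correctly depends on having seen the full definition during fine-tuning, which is the behavior the token reduction datasets are meant to teach.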

You can fine-tune a model with function calling for either of these two use cases. The fine-tuned models will be used in the inference notebooks in this repo.

Let's run some preliminary checks on our training and validation files.

In [None]:
# Preliminary check: load the training set and print the first example.

import json
from pathlib import Path

# Load the training set

# Assuming the current directory is the root of your repository
with Path("Data/stock-train-hallucination.jsonl").open("r", encoding="utf-8") as f:
    training_dataset = [json.loads(line) for line in f]


# Training dataset stats
print("Number of examples in training set:", len(training_dataset))
print("First example in training set:")
for message in training_dataset[0]["messages"]:
    print(message)

Next, run some additional code from OpenAI that uses the tiktoken library to validate the token counts. Individual examples need to remain under the gpt-35-turbo-0613 model's input token limit of 4096 tokens.

In [None]:
# Validate the token counts of each training example with the tiktoken library.

import json
import tiktoken
import numpy as np
from typing import List, Dict, Any

encoding = tiktoken.get_encoding(
    "cl100k_base"
)  # default encoding used by gpt-4, turbo, and text-embedding-ada-002 models


def num_tokens_from_messages(
    messages: List[Dict[str, Any]], tokens_per_message: int = 3, tokens_per_name: int = 1
) -> int:
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            if isinstance(value, str):
                num_tokens += len(encoding.encode(value))
            else:
                num_tokens += len(encoding.encode(str(value)))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3
    return num_tokens


def num_assistant_tokens_from_messages(messages: List[Dict[str, Any]]) -> int:
    num_tokens = 0
    for message in messages:
        content = message.get("content")
        if content and message["role"] == "assistant":
            num_tokens += len(encoding.encode(content))
    return num_tokens


def print_distribution(values: List[int], name: str) -> None:
    print(f"\n#### Distribution of {name}:")
    print(f"min / max: {min(values)}, {max(values)}")
    print(f"mean / median: {np.mean(values)}, {np.median(values)}")
    # Use the 5th and 95th percentiles so the values match the "p5 / p95" label.
    print(f"p5 / p95: {np.quantile(values, 0.05)}, {np.quantile(values, 0.95)}")


files: List[str] = ["Data/stock-train-hallucination.jsonl"]

from pathlib import Path

for file in files:
    print(f"Processing file: {file}")
    with Path(file).open("r", encoding="utf-8") as f:
        dataset = [json.loads(line) for line in f]

    total_tokens = []
    assistant_tokens = []

    for ex in dataset:
        messages = ex.get("messages", [])  # default to an empty list of messages
        total_tokens.append(num_tokens_from_messages(messages))
        assistant_tokens.append(num_assistant_tokens_from_messages(messages))

    print_distribution(total_tokens, "total tokens")
    print_distribution(assistant_tokens, "assistant tokens")
    print("*" * 50)
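
The distribution printed above does not say which examples break the 4096-token limit. A minimal sketch for finding them follows. `find_oversized` is a hypothetical helper, and the counts passed to it here are illustrative stand-ins for the `total_tokens` list built in the cell above. Note that those counts cover only the `messages` list, so function definitions add further tokens on top.

```python
from typing import List

MAX_TOKENS_PER_EXAMPLE = 4096  # limit quoted for gpt-35-turbo-0613


def find_oversized(token_counts: List[int], limit: int = MAX_TOKENS_PER_EXAMPLE) -> List[int]:
    """Return the indices of examples whose token count exceeds the limit."""
    return [i for i, n in enumerate(token_counts) if n > limit]


# Illustrative counts; in the notebook you would pass `total_tokens` instead.
print(find_oversized([812, 4096, 5120]))  # prints [2]
```

Examples flagged this way can be shortened or removed before the file is uploaded.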

In [None]:
# Upload fine-tuning files

import os
from pathlib import Path

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2023-12-01-preview",  # This API version or later is required to access fine-tuning for turbo
)

training_file_name = "Data/stock-train-hallucination.jsonl"

# Upload the training dataset file to Azure OpenAI with the SDK.
with Path(training_file_name).open("rb") as file:
    training_response = client.files.create(file=file, purpose="fine-tune")

training_file_id = training_response.id

print("Training file ID:", training_file_id)

Now that the fine-tuning file has been uploaded successfully, you can submit your fine-tuning training job:

In [None]:
response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    model="gpt-35-turbo-0613",  # Enter base model name. Note that in Azure OpenAI the model name contains dashes and cannot contain dot/period characters.
)

job_id = response.id

# You can use the job ID to monitor the status of the fine-tuning job.
# The fine-tuning job will take some time to start and complete.

print("Job ID:", response.id)
print("Status:", response.status)
print(response.model_dump_json(indent=2))

## Track training job status

In [None]:
# Track training status

from IPython.display import clear_output
import time

start_time = time.time()

# Get the status of our fine-tuning job.
response = client.fine_tuning.jobs.retrieve(job_id)

status = response.status

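# If the job isn't done yet, poll it every 10 seconds.
# "cancelled" is included so that a cancelled job does not keep the loop running forever.
while status not in ["succeeded", "failed", "cancelled"]: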
    time.sleep(10)

    response = client.fine_tuning.jobs.retrieve(job_id)
    print(response.model_dump_json(indent=2))
    print(
        "Elapsed time: {} minutes {} seconds".format(
            int((time.time() - start_time) // 60), int((time.time() - start_time) % 60)
        )
    )
    status = response.status
    print(f"Status: {status}")
    clear_output(wait=True)

print(f"Fine-tuning job {job_id} finished with status: {status}")

# List all fine-tuning jobs for this resource.
print("Checking other fine-tune jobs for this resource.")
response = client.fine_tuning.jobs.list()
print(f"Found {len(response.data)} fine-tune jobs.")
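
The polling loop above runs until the job reaches a final state, however long that takes. The sketch below shows one way to bound the wait with a timeout. `wait_for_terminal_status` is a hypothetical helper, the three-hour default is arbitrary, and treating `cancelled` as a terminal status is an assumption. The status source is passed in as a callable so the helper can be exercised without calling the service.

```python
import time
from typing import Callable, Iterable


def wait_for_terminal_status(
    get_status: Callable[[], str],
    terminal: Iterable[str] = ("succeeded", "failed", "cancelled"),
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    terminal_set = set(terminal)
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal_set:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Stand-in for `lambda: client.fine_tuning.jobs.retrieve(job_id).status`.
statuses = iter(["pending", "running", "succeeded"])
print(wait_for_terminal_status(lambda: next(statuses), sleep=lambda s: None))  # prints succeeded
```

In the notebook, the call would look like `wait_for_terminal_status(lambda: client.fine_tuning.jobs.retrieve(job_id).status)`.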

In [None]:
# Retrieve fine_tuned_model name

response = client.fine_tuning.jobs.retrieve(job_id)

print(response.model_dump_json(indent=2))
fine_tuned_model = response.fine_tuned_model

## Deploy fine-tuned model

You can deploy your fine-tuned model using the [REST API](https://learn.microsoft.com/en-us/rest/api/cognitiveservices/accountmanagement/deployments/create-or-update?tabs=HTTP), which requires separate authorization, a different API path, and a different API version. Alternatively, you can use any of the other common deployment methods, such as [Azure OpenAI Studio](https://oai.azure.com/) or the [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/cognitiveservices/account/deployment#az-cognitiveservices-account-deployment-create).

In [None]:
import json
import os

import requests

token = os.getenv("TEMP_AUTH_TOKEN")
subscription = "<YOUR_SUBSCRIPTION_ID>"
resource_group = "<YOUR_RESOURCE_GROUP_NAME>"
resource_name = "<YOUR_AZURE_OPENAI_RESOURCE_NAME>"
model_deployment_name = "<YOUR_CUSTOM_MODEL_DEPLOYMENT_NAME>"

deploy_params = {"api-version": "2023-05-01"}
deploy_headers = {"Authorization": "Bearer {}".format(token), "Content-Type": "application/json"}

deploy_data = {
    "sku": {"name": "standard", "capacity": 1},
    "properties": {
        "model": {
            "format": "OpenAI",
            "name": "<YOUR_FINE_TUNED_MODEL>",  # retrieve this value from the previous call, it will look like gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83
            "version": "1",
        }
    },
}
deploy_data = json.dumps(deploy_data)

request_url = f"https://management.azure.com/subscriptions/{subscription}/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices/accounts/{resource_name}/deployments/{model_deployment_name}"

print("Creating a new deployment...")

r = requests.put(request_url, params=deploy_params, headers=deploy_headers, data=deploy_data)

print(r)
print(r.reason)
print(r.json())
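
The URL and request body above can also be assembled by a small function, which makes it easy to pass in the `fine_tuned_model` value retrieved earlier instead of pasting it by hand. This is a sketch under the same assumptions as the cell above (same path, SKU, and model version). `build_deployment_request` and the placeholder values are hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def build_deployment_request(
    subscription: str,
    resource_group: str,
    resource_name: str,
    deployment_name: str,
    fine_tuned_model: str,
    capacity: int = 1,
) -> Tuple[str, Dict[str, Any]]:
    """Return the management URL and JSON body for a deployment PUT request."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.CognitiveServices"
        f"/accounts/{resource_name}"
        f"/deployments/{deployment_name}"
    )
    body = {
        "sku": {"name": "standard", "capacity": capacity},
        "properties": {
            "model": {"format": "OpenAI", "name": fine_tuned_model, "version": "1"}
        },
    }
    return url, body


# Placeholder values; in the notebook, pass the `fine_tuned_model` variable instead.
url, body = build_deployment_request(
    "sub-id", "my-rg", "my-aoai", "my-deployment",
    "gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83",
)
print(url)
print(json.dumps(body, indent=2))
```

The result can then be sent with `requests.put(url, params={"api-version": "2023-05-01"}, headers=deploy_headers, json=body)`.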

## Cleaning up

You can delete a custom model on the Models pane in Azure OpenAI Studio. Select the custom model to delete from the Customized models tab, and then select Delete to delete the custom model.

You can delete the deployment for your custom model on the Deployments pane in Azure OpenAI Studio. Select the deployment to delete, and then select Delete to delete the deployment.
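
If you prefer to remove the deployment from code rather than through the Studio, a DELETE request against the same management URL used for deployment is one option. The sketch below assumes the same API version and bearer-token authorization as the deployment step. `delete_deployment` is a hypothetical helper, and `http` is expected to be the `requests` module or a `requests.Session`.

```python
from typing import Any


def delete_deployment(
    http: Any,
    token: str,
    subscription: str,
    resource_group: str,
    resource_name: str,
    deployment_name: str,
    api_version: str = "2023-05-01",
) -> Any:
    """Send a DELETE request for one deployment and return the HTTP response."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.CognitiveServices"
        f"/accounts/{resource_name}"
        f"/deployments/{deployment_name}"
    )
    return http.delete(
        url,
        params={"api-version": api_version},
        headers={"Authorization": f"Bearer {token}"},
    )
```

In the notebook, a call would look like `delete_deployment(requests, token, subscription, resource_group, resource_name, model_deployment_name)`. The uploaded training file can be removed separately with `client.files.delete(training_file_id)`.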