# Program your Drone - Fine tuning for Function Calling

This exercise covers how to fine-tune to increase function calling accuracy and reliability. You can find more information on function calling [here](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb), and on fine tuning [here](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_finetune_chat_models.ipynb). Also keep the [Fine Tuning docs page](https://platform.openai.com/docs/guides/fine-tuning) close, you'll definitely need it along the way! 

Function calling is a very powerful tool when it functions as intended. However, we have seen that as the number of functions increases, and the complexity of the task at hand increases, function calling becomes less accurate (e.g.: more hallucinated invocations, and incorrect invocations).

Before fine tuning for function calling, it's best to begin with:
- Improvements to the function definitions. Make them more clear, and more distinct from one another.
- Experiment with prompt engineering: often a more detailed prompt can help the model call the correct function.

If the steps above fail to improve function calling to a satisfactory level, then you can try fine tuning for function calling.



## Overview

This notebook contains three sections: 
1. **Assessing a baseline for function calling**: Evaluating an out of the box `gpt-4o-mini` model on given functions
3. **Fine-tuning**: Running the fine tuning job, and evaluating the fine-tuned model
4. **Extension**: If you finished the exercise and still have some time, you can try these ideas! 

## 1. Assessing a baseline for funtion calling 

When Fine Tuning a model, it's important to understand what your starting point is. Let's create a baseline on how well our model performs on our test dataset. You can find 200 test examples in the `drone_test.csv` file. 

We also provided a set of functions that you can use in your exercise. As an extension, you can think about what other applications the drone may have and expand this function list! 

In [1]:
!pip install tenacity -q
!pip install openai -q
!pip install typing -q

In [2]:
import numpy as np
import json
from IPython.display import display
import pandas as pd
import openai
import itertools
import time
import base64
from tqdm import tqdm
from tenacity import retry, wait_random_exponential, stop_after_attempt
from typing import Any, Dict, List, Generator
import ast

client = openai.OpenAI()

### Utils
Let's define utility functions for making calls to the Chat Completions API, one to get the completion and one to get the function call.

In [3]:
def get_chat_completion(
    messages: list[dict[str, str]],
    model: str = "gpt-3.5-turbo",
    max_tokens=500,
    temperature=0.0,
    stop=None,
    tools=None,
    seed=42,
    functions=None,
    tool_choice=None,
) -> str:
    params = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": stop,
        "tools": tools,
        "seed": seed,
        "tool_choice": tool_choice,
    }
    if functions:
        params["functions"] = functions

    completion = client.chat.completions.create(**params)
    return completion.choices[0].message, completion.usage


def eval_row(row, model: str, system_prompt: str, function_list):
    """
    Evaluate the performance of a model in selecting the correct function based on a single row of a DataFrame.

    Args:
        row (pd.Series): A row from a DataFrame containing 'prompt' and 'function' columns.
        model (str): The name of the model to be evaluated.
        system_prompt (str): The system prompt to be used in the chat completion.
        function_list (list): A list of functions that the model can call.

    Returns:
        dict: A dictionary containing the prompt, actual function, expected function, match status, latency, and tokens used.
    """

    prompt = row['prompt']
    expected_function = row['function']

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]

    start_time = time.time()
    completion, usage = get_chat_completion(
                model=model,
                messages=messages,
                seed=42,
                tools=function_list,
                temperature=0.0,
                tool_choice="required",
            )
    end_time = time.time()


    latency = (end_time - start_time) * 1000  # convert to milliseconds
    actual_function = completion.tool_calls[0].function.name
    tokens_used = usage.total_tokens

    match = actual_function == expected_function

    return {
        "prompt": prompt,
        "actual": actual_function,
        "expected": expected_function,
        "match": "Yes" if match else "No",
        "latency": latency,
        "tokens_used": tokens_used,
    }

from concurrent.futures import ThreadPoolExecutor

def eval_dataframe(df: pd.DataFrame, model: str, system_prompt: str, function_list):
    """
    Evaluate the performance of a model in selecting the correct function based on a DataFrame of prompts and expected functions.

    Args:
        df (pd.DataFrame): A DataFrame containing 'prompt' and 'function' columns.
        model (str): The name of the model to be evaluated.
        system_prompt (str): The system prompt to be used in the chat completion.
        function_list (list): A list of functions that the model can call.

    Returns:
        None
    """

    with ThreadPoolExecutor() as executor:
        results_list = list(tqdm(executor.map(lambda row: eval_row(row, model, system_prompt, function_list), [row for _, row in df.iterrows()]), total=len(df), desc="Evaluating rows"))

    results_df = pd.DataFrame(results_list)

    # Display the DataFrame as a table
    display(results_df)

    total_prompts = len(df)
    matches = results_df['match'].value_counts().get("Yes", 0)
    match_percentage = (matches / total_prompts) * 100

    avg_latency = results_df['latency'].mean()
    avg_tokens_used = results_df['tokens_used'].mean()

    print(f"Number of matches: {matches} out of {total_prompts} ({match_percentage:.2f}%)")
    print(f"Average latency per request: {avg_latency:.2f} ms")
    print(f"Average tokens used per request: {avg_tokens_used:.2f}")

### Baseline testing

In [4]:
# Function list for the drone assistant
function_list = [
    {
        "type": "function",
        "function": {
            "name": "takeoff_drone",
            "description": "Initiate the drone's takeoff sequence.",
            "parameters": {
                "type": "object",
                "properties": {
                    "altitude": {
                        "type": "integer",
                        "description": "Specifies the altitude in meters to which the drone should ascend.",
                    }
                },
                "required": ["altitude"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "land_drone",
            "description": "Land the drone at its current location or a specified landing point.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "enum": ["current", "home_base", "custom"],
                        "description": "Specifies the landing location for the drone.",
                    },
                    "coordinates": {
                        "type": "object",
                        "description": "GPS coordinates for custom landing location. Required if location is 'custom'.",
                    },
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "control_drone_movement",
            "description": "Direct the drone's movement in a specific direction.",
            "parameters": {
                "type": "object",
                "properties": {
                    "direction": {
                        "type": "string",
                        "enum": ["forward", "backward", "left", "right", "up", "down"],
                        "description": "Direction in which the drone should move.",
                    },
                    "distance": {
                        "type": "integer",
                        "description": "Distance in meters the drone should travel in the specified direction.",
                    },
                },
                "required": ["direction", "distance"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_drone_speed",
            "description": "Adjust the speed of the drone.",
            "parameters": {
                "type": "object",
                "properties": {
                    "speed": {
                        "type": "integer",
                        "description": "Specifies the speed in km/h. Valid range is 0 to 100.",
                        "minimum": 0,
                    }
                },
                "required": ["speed"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "control_camera",
            "description": "Control the drone's camera to capture images or videos.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mode": {
                        "type": "string",
                        "enum": ["photo", "video", "panorama"],
                        "description": "Camera mode to capture content.",
                    },
                    "duration": {
                        "type": "integer",
                        "description": "Duration in seconds for video capture. Required if mode is 'video'.",
                    },
                },
                "required": ["mode"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "control_gimbal",
            "description": "Adjust the drone's gimbal for camera stabilization and direction.",
            "parameters": {
                "type": "object",
                "properties": {
                    "tilt": {
                        "type": "integer",
                        "description": "Tilt angle for the gimbal in degrees.",
                    },
                    "pan": {
                        "type": "integer",
                        "description": "Pan angle for the gimbal in degrees.",
                    },
                },
                "required": ["tilt", "pan"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_drone_lighting",
            "description": "Control the drone's lighting for visibility and signaling.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mode": {
                        "type": "string",
                        "enum": ["on", "off", "blink", "sos"],
                        "description": "Lighting mode for the drone.",
                    }
                },
                "required": ["mode"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "return_to_home",
            "description": "Command the drone to return to its home or launch location.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_battery_saver_mode",
            "description": "Toggle battery saver mode.",
            "parameters": {
                "type": "object",
                "properties": {
                    "status": {
                        "type": "string",
                        "enum": ["on", "off"],
                        "description": "Toggle battery saver mode.",
                    }
                },
                "required": ["status"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_obstacle_avoidance",
            "description": "Configure obstacle avoidance settings.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mode": {
                        "type": "string",
                        "enum": ["on", "off"],
                        "description": "Toggle obstacle avoidance.",
                    }
                },
                "required": ["mode"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_follow_me_mode",
            "description": "Enable or disable 'follow me' mode.",
            "parameters": {
                "type": "object",
                "properties": {
                    "status": {
                        "type": "string",
                        "enum": ["on", "off"],
                        "description": "Toggle 'follow me' mode.",
                    }
                },
                "required": ["status"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calibrate_sensors",
            "description": "Initiate calibration sequence for drone's sensors.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_autopilot",
            "description": "Enable or disable autopilot mode.",
            "parameters": {
                "type": "object",
                "properties": {
                    "status": {
                        "type": "string",
                        "enum": ["on", "off"],
                        "description": "Toggle autopilot mode.",
                    }
                },
                "required": ["status"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "configure_led_display",
            "description": "Configure the drone's LED display pattern and colors.",
            "parameters": {
                "type": "object",
                "properties": {
                    "pattern": {
                        "type": "string",
                        "enum": ["solid", "blink", "pulse", "rainbow"],
                        "description": "Pattern for the LED display.",
                    },
                    "color": {
                        "type": "string",
                        "enum": ["red", "blue", "green", "yellow", "white"],
                        "description": "Color for the LED display. Not required if pattern is 'rainbow'.",
                    },
                },
                "required": ["pattern"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_home_location",
            "description": "Set or change the home location for the drone.",
            "parameters": {
                "type": "object",
                "properties": {
                    "coordinates": {
                        "type": "object",
                        "description": "GPS coordinates for the home location.",
                    }
                },
                "required": ["coordinates"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "reject_request",
            "description": "Use this function if the request is not possible.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
]

In [5]:
DRONE_SYSTEM_PROMPT = """You are an intelligent AI that controls a drone. Given a command or request from the user,
call one of your functions to complete the request. If the request cannot be completed by your available functions, call the reject_request function.
If the request is ambiguous or unclear, reject the request."""

In [6]:
# Read the data
data = pd.read_csv('data/drone_test.csv')
data.head()


Unnamed: 0,prompt,function
0,Bring the UAV down to ground level at its orig...,land_drone
1,Adjust cruising speed to 50 km/h,set_drone_speed
2,Illuminate the LEDs in a pulsing blue pattern,set_drone_lighting
3,Activate energy conservation mode,set_battery_saver_mode
4,Disable the collision avoidance system,set_obstacle_avoidance


In [7]:
eval_dataframe(data, "gpt-4o-mini", DRONE_SYSTEM_PROMPT, function_list)

Evaluating rows: 100%|██████████| 200/200 [00:11<00:00, 18.06it/s]


Unnamed: 0,prompt,actual,expected,match,latency,tokens_used
0,Bring the UAV down to ground level at its orig...,land_drone,land_drone,Yes,1460.933685,771
1,Adjust cruising speed to 50 km/h,set_drone_speed,set_drone_speed,Yes,1040.052891,768
2,Illuminate the LEDs in a pulsing blue pattern,configure_led_display,set_drone_lighting,No,1073.705912,772
3,Activate energy conservation mode,set_battery_saver_mode,set_battery_saver_mode,Yes,1000.656128,766
4,Disable the collision avoidance system,set_obstacle_avoidance,set_obstacle_avoidance,Yes,991.908073,767
...,...,...,...,...,...,...
195,Enable tracking mode,reject_request,set_follow_me_mode,No,776.808023,757
196,Run full sensor diagnostics,calibrate_sensors,calibrate_sensors,Yes,925.042868,760
197,Activate manual override,reject_request,set_autopilot,No,799.841642,757
198,Set LEDs to display solid blue,configure_led_display,configure_led_display,Yes,930.707932,769


Number of matches: 172 out of 200 (86.00%)
Average latency per request: 960.55 ms
Average tokens used per request: 764.05


In [8]:
# Extension 1. Evaluate the performance of the GPT-4o model.
eval_dataframe(data, "gpt-4o", DRONE_SYSTEM_PROMPT, function_list)

Evaluating rows: 100%|██████████| 200/200 [00:09<00:00, 21.43it/s]


Unnamed: 0,prompt,actual,expected,match,latency,tokens_used
0,Bring the UAV down to ground level at its orig...,control_drone_movement,land_drone,No,1344.120741,808
1,Adjust cruising speed to 50 km/h,set_drone_speed,set_drone_speed,Yes,1059.011698,768
2,Illuminate the LEDs in a pulsing blue pattern,configure_led_display,set_drone_lighting,No,1026.203156,772
3,Activate energy conservation mode,set_battery_saver_mode,set_battery_saver_mode,Yes,1069.860697,766
4,Disable the collision avoidance system,set_obstacle_avoidance,set_obstacle_avoidance,Yes,1040.807962,767
...,...,...,...,...,...,...
195,Enable tracking mode,reject_request,set_follow_me_mode,No,710.376024,757
196,Run full sensor diagnostics,calibrate_sensors,calibrate_sensors,Yes,749.083042,760
197,Activate manual override,reject_request,set_autopilot,No,614.238024,757
198,Set LEDs to display solid blue,configure_led_display,configure_led_display,Yes,898.158789,769


Number of matches: 186 out of 200 (93.00%)
Average latency per request: 803.74 ms
Average tokens used per request: 764.30


You can clearly see that `gpt-4o` performs better than `gpt-4o-mini`, achieving an accuracy almost 10% better than the smaller model. 

## 2. Fine tuning

You're doing great so far! Let's kick off the fine tuning job. You have a good training dataset that you can use in `drone_train.csv`. This is a synthetically generated training set. You can learn how to generate your own in one of the extensions! 

In [9]:
# Upload the training file
file = client.files.create(
    file=open("data/drone_training.jsonl", "rb"),
    purpose="fine-tune",
)
file_id = file.id
print(f"FileID: {file_id}")

# Create a fine-tuning job

ft = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",
    training_file=file_id,
    suffix="drone",
    hyperparameters={
        "n_epochs": 7,
        "batch_size": 1,
        "learning_rate_multiplier": 3,
    },
)

print(f"Fine-tuning job created: {ft}")

FileID: file-mSSuXJrhWoPxoteepuZGF9KO
Fine-tuning job created: FineTuningJob(id='ftjob-VL8rZUtnTe36j40AUIBM6Yln', created_at=1729861670, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs=7, batch_size=1, learning_rate_multiplier=3.0), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-GLHrIv00VVN9dEQC2b4wsBkf', result_files=[], seed=1467955209, status='validating_files', trained_tokens=None, training_file='file-mSSuXJrhWoPxoteepuZGF9KO', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix='drone', method=None)


In addition to creating a fine-tuning job, you can also list existing jobs, retrieve the status of a job, or cancel a job.

In [10]:
ftjob_id = "ftjob-VL8rZUtnTe36j40AUIBM6Yln"

# Retrieve the state of a fine-tune
client.fine_tuning.jobs.retrieve(ftjob_id)

FineTuningJob(id='ftjob-VL8rZUtnTe36j40AUIBM6Yln', created_at=1729861670, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs=7, batch_size=1, learning_rate_multiplier=3.0), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-GLHrIv00VVN9dEQC2b4wsBkf', result_files=[], seed=1467955209, status='validating_files', trained_tokens=None, training_file='file-mSSuXJrhWoPxoteepuZGF9KO', validation_file=None, estimated_finish=None, integrations=[], user_provided_suffix='drone', method=None)

Here there'll be a little bit of wait time... This may take 30-40 minutes. You'll usually get an email notification from OpenAI once your job is complete. While you wait, I'd suggest looking at some of the extensions and optional tasks! You can also start writing the evaluation code to have it ready for when the job is complete. 

(**Hint:** The only thing you're missing is the model's name)

Once your fine tuning job finishes, compute the accuracy of your shiny new fine tuned model. How does it compare to your baseline? 

In [11]:
ft_model = "ft:gpt-4o-mini-2024-07-18:distillation-test:drone:AMEcD9qM"

print(f"\nEvaluating fine-tuned model on test dataset: {ft_model}")
eval_dataframe(data, ft_model, DRONE_SYSTEM_PROMPT, function_list)


Evaluating fine-tuned model on test dataset: ft:gpt-4o-mini-2024-07-18:distillation-test:drone:AMEcD9qM


Evaluating rows: 100%|██████████| 200/200 [00:10<00:00, 19.28it/s]


Unnamed: 0,prompt,actual,expected,match,latency,tokens_used
0,Bring the UAV down to ground level at its orig...,land_drone,land_drone,Yes,1092.622757,771
1,Adjust cruising speed to 50 km/h,set_drone_speed,set_drone_speed,Yes,1043.295145,766
2,Illuminate the LEDs in a pulsing blue pattern,configure_led_display,set_drone_lighting,No,1015.235186,774
3,Activate energy conservation mode,set_battery_saver_mode,set_battery_saver_mode,Yes,1071.300268,766
4,Disable the collision avoidance system,set_obstacle_avoidance,set_obstacle_avoidance,Yes,945.672035,767
...,...,...,...,...,...,...
195,Enable tracking mode,set_follow_me_mode,set_follow_me_mode,Yes,678.037882,763
196,Run full sensor diagnostics,reject_request,calibrate_sensors,No,979.718208,755
197,Activate manual override,reject_request,set_autopilot,No,1206.733227,754
198,Set LEDs to display solid blue,configure_led_display,configure_led_display,Yes,1320.944786,771


Number of matches: 181 out of 200 (90.50%)
Average latency per request: 871.26 ms
Average tokens used per request: 762.42


With fine tuning we get a small improvement in performance that gets us closer to `gpt-4o` but there are a few other tricks you can do to get the accuracy up: 
- Generate more synthetic data! 
- Tweak the hyperparameters to learn more 
- Use distillation to take real examples or responses from 4o to make 4o-mini better. (See exercise 1) 

## 3. Extension

If you've already completed the execise above, congratulations! Here are a few ideas on how to turn this into a more exciting project: 

**Note:** These ideas are in difficulty order and they're all valuable experiences, so I'd recommend doing them in this order. The `solution` branch only showcases points 1 and 2. The last 3 points are for you to test the limits of your creativity!  

1. Find a baseline for `gpt-4o`. How does it compare with `4o-mini`?
2. Instead of using the provided dataset, generate your own synthetic training dataset! 
3. Build a front end application where you can chat to the drone and visualize its actions. 
4. Build voice capabilities for your drone! Why write instructions, when you can chat to it?
5. **Requires extra hardware**: Connect a real drone to your code and watch it react to your instructions. 

## [Optional] Generating Synthetic data 

In traditional ML models you can split your dataset into `train` and `test`, however, to keep things interesting, let's learn how to generate new examples as well. 

**Note:** While real-world production test evals are preferable, this method produces strong results and can be used in conjunction with real-world training data.

This step is a bit more challenging since we want to generate every invocation of every function, so that we have full coverage of all potential invocations to create synthetic data for. Then, we will use `gpt-4o` to come up with prompts that would call each invocation, and we will use that prompt - function invocation pair as training data.

**Steps:**
1. Write a function to generate all permutations for given parameters 
2. Write a function to generate permutations for required field
3. Write a function to generate the optimal permutations. 

You can give this a go yourself if you'd like for an extra challenge. However, since generating permutations is not the goal of this workshop, you can use the helper functions below.  


In [10]:
placeholder_int = "fill_in_int"
placeholder_string = "fill_in_string"

def generate_permutations(
    params: Dict[str, Dict[str, Any]]
) -> Generator[Dict[str, Any], None, None]:
    """
    Generates all possible permutations for given parameters.

    :param params: Parameter dictionary containing required and optional fields.
    :return: A generator yielding each permutation.
    """

    # Extract the required fields from the parameters
    required_fields = params.get("required", [])

    # Generate permutations for required fields
    required_permutations = generate_required_permutations(params, required_fields)

    # Generate optional permutations based on each required permutation
    for required_perm in required_permutations:
        yield from generate_optional_permutations(params, required_perm)


def generate_required_permutations(
    params: Dict[str, Dict[str, Any]], required_fields: List[str]
) -> List[Dict[str, Any]]:
    """
    Generates permutations for the required fields.

    :param params: Parameter dictionary.
    :param required_fields: List of required fields.
    :return: A list of permutations for required fields.
    """

    # Get all possible values for each required field
    required_values = [get_possible_values(params, field) for field in required_fields]

    # Generate permutations from possible values
    return [
        dict(zip(required_fields, values))
        for values in itertools.product(*required_values)
    ]


def generate_optional_permutations(
    params: Dict[str, Dict[str, Any]], base_perm: Dict[str, Any]
) -> Generator[Dict[str, Any], None, None]:
    """
    Generates permutations for optional fields based on a base permutation.

    :param params: Parameter dictionary.
    :param base_perm: Base permutation dictionary.
    :return: A generator yielding each permutation for optional fields.
    """

    # Determine the fields that are optional by subtracting the base permutation's fields from all properties
    optional_fields = set(params["properties"]) - set(base_perm)

    # Iterate through all combinations of optional fields
    for field_subset in itertools.chain.from_iterable(
        itertools.combinations(optional_fields, r)
        for r in range(len(optional_fields) + 1)
    ):

        # Generate product of possible values for the current subset of fields
        for values in itertools.product(
            *(get_possible_values(params, field) for field in field_subset)
        ):

            # Create a new permutation by combining base permutation and current field values
            new_perm = {**base_perm, **dict(zip(field_subset, values))}

            yield new_perm


def get_possible_values(params: Dict[str, Dict[str, Any]], field: str) -> List[Any]:
    """
    Retrieves possible values for a given field.

    :param params: Parameter dictionary.
    :param field: The field for which to get possible values.
    :return: A list of possible values.
    """

    # Extract field information from the parameters
    field_info = params["properties"][field]

    # Based on the field's type or presence of 'enum', determine and return the possible values
    if "enum" in field_info:
        return field_info["enum"]
    elif field_info["type"] == "integer":
        return [placeholder_int]
    elif field_info["type"] == "string":
        return [placeholder_string]
    elif field_info["type"] == "boolean":
        return [True, False]
    elif field_info["type"] == "array" and "enum" in field_info["items"]:
        enum_values = field_info["items"]["enum"]
        all_combinations = [
            list(combo)
            for i in range(1, len(enum_values) + 1)
            for combo in itertools.combinations(enum_values, i)
        ]
        return all_combinations
    return []

Now that we have all possibilities, ask the model to generate prompts that could have these examples as a result.

**Hint 1**: Make sure your data is properly fomatted as howed in [our docs](https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples)

**Hint 2**: For `reject_request` you may need to give the model a little more help to get realistic examples

In [11]:
INVOCATION_FILLER_PROMPT = """
1) Input reasonable values for 'fill_in_string' and 'fill_in_int' in the invocation here: {invocation}. Reasonable values are determined by the function definition. Use the
the entire function provided here :{function} to get context over what proper fill_in_string and fill_in_int values would be.
Example:

Input: invocation: {{
    "name": "control_camera",
    "arguments": {{
      "mode":"video",
      "duration":"fill_in_int"
    }}
}},
function:{function}

Output: invocation: {{
    "name": "control_camera",
    "arguments": {{
      "mode":"video",
      "duration": 30
    }}
}}


MAKE SURE output is just a dictionary with keys 'name' and 'arguments', no other text or response.

Input: {invocation}
Output:
"""


COMMAND_GENERATION_PROMPT = """
You are to output 2 commands, questions or statements that would generate the inputted function and parameters.
Please make the commands or questions natural, as a person would ask, and the command or questions should be varied and not repetitive.
It should not always mirror the exact technical terminology used in the function and parameters, rather reflect a conversational and intuitive request.
For instance, the prompt should not be 'turn on the dome light', as that is too technical, but rather 'turn on the inside lights'.
Another example, is the prompt should not be 'turn on the HVAC', but rather 'turn on the air conditioning'. Use language a normal driver would use, even if
it is technically incorrect but colloquially used.

RULES: ALWAYS put a backwards slash before an apostrophe or single quote '. For example, do not say don't but say don\'t.
Prompts MUST be in double quotes as well.

Example

Input: {{'name': 'calibrate_sensors','arguments': {{}}'' }}
Prompt: ["The sensors are out of whack, can you reset them", "The calibration of the drone is off, fix it please!"]

Input: {{'name': 'set_autopilot','arguments': {{'status': 'off'}}}}
Prompt: ["OK, I want to take back pilot control now","Turn off the automatic pilot I'm ready control it"]

Input: {invocation}
Prompt:
"""

In the below snippet, we generate the invocation of each function except for the `reject_request` function.

To perform effective fine-tuning we need correctly labeled data. We could manually come up with examples and label the data, or we can generate synthetic data with the help of `gpt-4o`

Empirically, `gpt-4o` needs a bit more help to get good realistic examples of prompts that would generate the `reject_request` function, so we'll do that next...

In [12]:
input_objects = []
all_but_reject = [f for f in function_list if f.get("name") != "reject_request"]

for function in all_but_reject:
    func_name = function["function"]["name"]
    params = function["function"]["parameters"]
    for arguments in generate_permutations(params):
        if any(val in arguments.values() for val in ["fill_in_int", "fill_in_str"]):
            input_object = {"name": func_name, "arguments": arguments}
            messages = [
                {
                    "role": "user",
                    "content": INVOCATION_FILLER_PROMPT.format(
                        invocation=str(input_object), function=function
                    ),
                }
            ]
            input_object, usage = get_chat_completion(
                model="gpt-4o", messages=messages, max_tokens=200, temperature=0.1
            )
            input_object = input_object.content
        else:
            input_object = {"name": func_name, "arguments": arguments}

        input_objects.append(input_object)

Now that we have all the invocations, let's use gpt-4o to generate prompts that would result in those invocations

In [13]:
def remove_sequences(input_string):
    # Replace the specific sequences with an empty string
    cleaned_string = input_string.replace("```json", "")  # Remove "```json" first
    cleaned_string = cleaned_string.replace("```", "")  # Then remove "```"
    
    # Debugging: Print the cleaned string before parsing
    print("Cleaned JSON String:", cleaned_string)
    
    # Ensure the string is properly formatted as JSON
    try:
        return json.loads(cleaned_string)
    except json.JSONDecodeError as e:
        print("JSONDecodeError:", e)
        return None

In [26]:
def create_commands(invocation_list):
    example_list = []
    for i, invocation in enumerate(invocation_list):
        if i < 100:
            print(
                f"\033[34m{np.round(100*i/len(invocation_list),1)}% complete\033[0m")
            if type(invocation) == str or "json" in invocation:
                invocation = remove_sequences(invocation)
            print(invocation)

        # Format the prompt with the invocation string
        request_prompt = COMMAND_GENERATION_PROMPT.format(
            invocation=invocation)

        messages = [{"role": "user", "content": f"{request_prompt}"}]
        completion, usage = get_chat_completion(messages, temperature=0.8)
        command_dict = {"Input": invocation, "Prompt": completion.content}
        example_list.append(command_dict)
    return example_list

In [27]:
# Only printing the first 10 rows
training_examples_unformatted = create_commands(input_objects)

[34m0.0% complete[0m
Cleaned JSON String: 
{
    "name": "takeoff_drone",
    "arguments": {
        "altitude": 100
    }
}

{'name': 'takeoff_drone', 'arguments': {'altitude': 100}}
[34m1.8% complete[0m
{'name': 'land_drone', 'arguments': '{"location": "current"}'}
[34m3.5% complete[0m
{'name': 'land_drone', 'arguments': '{"location": "home_base"}'}
[34m5.3% complete[0m
{'name': 'land_drone', 'arguments': '{"location": "custom"}'}
[34m7.0% complete[0m
Cleaned JSON String: 
{
    "name": "control_drone_movement",
    "arguments": {
        "direction": "forward",
        "distance": 10
    }
}

{'name': 'control_drone_movement', 'arguments': {'direction': 'forward', 'distance': 10}}
[34m8.8% complete[0m
Cleaned JSON String: 
{
    "name": "control_drone_movement",
    "arguments": {
        "direction": "backward",
        "distance": 10
    }
}

{'name': 'control_drone_movement', 'arguments': {'direction': 'backward', 'distance': 10}}
[34m10.5% complete[0m
Cleaned JSON 

Now let's format the training examples properly. For more documentation on the proper training data formatting for fine tuning for function calling, see [here](https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples)

In [28]:
def remove_descriptions(function_list):
    for function in function_list:
        func = function["function"]
        if "description" in func:
            del func["description"]

        params = func["parameters"]
        if "properties" in params:
            for param in params["properties"].values():
                if "description" in param:
                    del param["description"]

    return function_list


modified_function_list = remove_descriptions(function_list)

In [29]:
training_examples = []

for prompt in training_examples_unformatted:
    # adjust formatting for training data specs
    if prompt["Input"] == None:
        continue

    # if its not a dict, convert to dict
    if type(prompt["Input"]) != dict:
        prompt["Input"] = ast.literal_eval(prompt["Input"])
    prompt["Input"]["arguments"] = json.dumps(prompt["Input"]["arguments"])
    try:
        prompt["Prompt"] = json.loads(prompt["Prompt"])
    except:
        continue
    for p in prompt["Prompt"]:
        print(p)
        print(prompt["Input"])
        tool_calls = [
            {"id": "call_id", "type": "function", "function": prompt["Input"]}
        ]
        training_examples.append(
            {
                "messages": [
                    {"role": "system", "content": DRONE_SYSTEM_PROMPT},
                    {"role": "user", "content": p},
                    {"role": "assistant", "tool_calls": tool_calls},
                ],
                "parallel_tool_calls": False,
                "tools": modified_function_list,
            }
        )

Let's get the drone in the air, can you lift off now at 100 feet?
{'name': 'takeoff_drone', 'arguments': '{"altitude": 100}'}
Time to take flight, can you raise the drone to an altitude of 100?
{'name': 'takeoff_drone', 'arguments': '{"altitude": 100}'}
Can you bring the drone down here?
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"current\\"}"'}
I need the drone to land at its current location, please.
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"current\\"}"'}
Let's bring the drone back home, can you land it there
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"home_base\\"}"'}
Can you safely land the drone at its home base
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"home_base\\"}"'}
Can you land the drone at a specific spot?
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"custom\\"}"'}
I need the drone to land at a specific location, can you do that?
{'name': 'land_drone', 'arguments': '"{\\"location\\": \\"custom\\"}"'}
Ca

Now, back to the rejection function. Let's generate some prompts that are nearly possible, but should result in the reject_request function being called. To do so, we queried gpt-4o asking for requests that are related to, but not quite possible with, the given list of functions.

In [31]:
reject_list = [
    "Translate broadcast message to another language",
    "Automatically capture photos when face is detected",
    "Detect nearby drones",
    "Measure wind resistance",
    "Capture slow motion video",
    "Move the drone forward and backward by same distance at the same time.",
    "Adjust drone's altitude to ground level changes",
    "Display custom message on LED display",
    "Sync drone's time with smartphone",
    "Alert when drone travels out of designated area",
    "Calibrate sensors and land simultaneously",
    "Detect moisture levels",
    "Automatically follow GPS tagged object",
    "Toggle night vision mode",
    "Maintain current altitude when battery is low",
    "Decide best landing spot using AI",
    "Program drone's route based on wind direction",
    "Fly to the moon",
    "Turn into a submarine",
    "Make me a sandwich",
    "Self-destruct",
    "Turn invisible",
    "Give me a weather forecast",
    "Translate this document into French",
    "Play some music",
    "Teleport to location X",
    "Perform a magic trick",
]

In [32]:
reject_training_list = []
for prompt in reject_list:
    # Adjust formatting
    tool_calls = [
        {
            "id": "call_id",
            "type": "function",
            "function": {"name": "reject_request", "arguments": "{}"},
        }
    ]
    reject_training_list.append(
        {
            "messages": [
                {"role": "system", "content": DRONE_SYSTEM_PROMPT},
                {"role": "user", "content": prompt},
                {"role": "assistant", "tool_calls": tool_calls},
            ],
            "parallel_tool_calls": False,
            "tools": modified_function_list,
        }
    )

Now combine all the training examples together

In [33]:
training_list_total = training_examples + reject_training_list

In [34]:
training_file = "data/drone_training.jsonl"
with open(training_file, "w") as f:
    for item in training_list_total:
        json_str = json.dumps(item)
        f.write(f"{json_str}\n")

## Conclusion

If everything went well in your above implementation, you should be able to see an improvement in your fine tuned model accuracy. Congratulations! You are now ready to build a high accuracy drone interface in natural language. We can't wait to see what you build next.