# LLM to Motion

OpenAI Documentation: https://platform.openai.com/docs/quickstart?context=python

Find your keys here: https://platform.openai.com/api-keys

Keep an eye on your credit usage: https://platform.openai.com/usage

Following this guide: https://wandb.ai/onlineinference/gpt-python/reports/Setting-Up-GPT-4-In-Python-Using-the-OpenAI-API--VmlldzozODI1MjY4

## Set your OpenAI API key in your environment

Uncomment the line below to list your environment variables. The API key should be set in your .zshrc or .bashrc; the `openai` client reads it from `OPENAI_API_KEY` by default.

In [1]:
# %env
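
A narrower check than dumping the whole environment is to look up only the variable the client needs. The sketch below assumes the key is stored under `OPENAI_API_KEY`; `key_status` is a hypothetical helper, not part of the `openai` package, and it prints only a short prefix so the secret never lands in the notebook output.

```python
import os

def key_status(env=None, name="OPENAI_API_KEY"):
    """Return a short, non-secret description of whether the key is set."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value:
        return f"{name} is not set"
    # Show only the first few characters so the secret is never printed in full.
    return f"{name} is set ({value[:3]}..., {len(value)} chars)"

print(key_status())
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.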

In [2]:
%pwd

'/Users/liamroy/Documents/Studies/Monash_31194990/PHD/Studies/Study_04/LLM_vocabulary/scripts'

## Imports

In [3]:
import os
from openai import OpenAI
client = OpenAI()

import re

import random
import time
random.seed(time.time())

from openpyxl import load_workbook

## Start Generating

### Build your prompt

In [4]:
gpt_assistant_prompt = "You are simulating human feedback for training a robot expression model. Your responses should be accurate but have slight variability to simulate human interpretation. Sometimes, humans may interpret the robot's pose differently in similar scenarios, so slight variations are encouraged." 
                       # "You are an expert roboticist and understand how to design communicative expressions for human-robot interaction."

robot_morphology = "dog-shaped quadruped"

deployment_context = f"Consider a scenario where you are collaborating with a {robot_morphology} robot to locate and pick strawberries in a strawberry patch."

state_01 = "S01: [Waiting for Input, The robot is in standby mode waiting for a command from the user]"
state_02 = "S02: [Analyzing Object, The robot is analyzing a target object in front of it on the ground]"
state_03 = "S03: [Found Object, The robot has found a target object in front of it on the ground]"
state_04 = "S04: [Needs Help, The robot is experiencing an error and needs help from the user]"
state_05 = "S05: [Confused, The robot is confused and unsure what to do]"
state_06 = "S06: [None of the Above, It is unclear. The robot does not appear to be in any of the described states.]"


# This is done to randomize the order in which the states are presented to GPT, to eliminate any bias for states presented first
state_list = [state_01, state_02, state_03, state_04, state_05]
random.shuffle(state_list) # Comment this out to remove randomness 

parameter_01 = "P01: [Body Tilt, Controls the left-right tilt angle of the robot's torso. If set to ‘left’ robot’s torso tilts to the left. If set to ‘neutral’ robot’s torso remains unchanged. If set to ‘right’ robot’s torso tilts to the right., (left, neutral, right)]"
parameter_02 = "P02: [Body Lean, Controls the forwards-backwards lean angle of the robot's torso. If set to ‘backwards’ robot’s torso leans backwards. If set to ‘neutral’ robot’s torso remains unchanged. If set to ‘forwards’ robot’s torso leans forwards.,(backwards, neutral, forwards)]"
parameter_03 = "P03: [Body Turn, Controls the left-right turn angle of the robot's torso. If set to ‘left’ robot’s torso turns to the left. If set to ‘neutral’ robot’s torso remains unchanged. If set to ‘right’ robot’s torso turns to the right.,(left, neutral, right)]"
parameter_04 = "P04: [Body Height, Controls the height of the robot's torso. If set to ‘low’ robot lowers its body to the ground. If set to ‘neutral’ robot dog maintains a neutral body height. If set to ‘high’ robot dog raises its torso higher.,(low, neutral, high)]"
parameter_05 = "P05: [Body Direction, Controls whether the robot faces the user or a relevant object in the scene. If set to ‘user’ robot faces the user. If set to ‘object’ robot faces the relevant object (the strawberry) in the scene,(user, object)]"
parameter_06 = "P06: [Pose Duration, Controls the duration in which the robot holds its pose. If set to ‘short’ robot holds pose for a short length duration of 1 second. If set to ‘medium’ robot holds pose for a medium length duration of 4 seconds. If set to ‘long’ robot holds pose for a long length duration of 8 seconds.,(short, medium, long)]"
parameter_07 = "P07: [Motion Velocity, Controls the velocity of the robot as it moves to a given pose. If set to ‘slow’ robot moves at a slow speed. If set to ‘medium’ robot moves at a normal speed. If set to ‘fast’ robot moves at a high speed. ,(slow, medium, fast)]"
parameter_08 = "P08: [Motion Smoothness, Controls whether the robot's motion is smooth or shaky. If set to ‘smooth’ the robot’s motion is smooth without any disturbances. If set to ‘shaky’ the robot’s motion is shaky looks like it is trembling.,(smooth, shaky)]"

# This is done to randomize the order in which the parameters are presented to GPT, to eliminate any bias for params presented first
param_list = [parameter_01, parameter_02, parameter_03, parameter_04, parameter_05, parameter_06, parameter_07, parameter_08]
random.shuffle(param_list) # Comment this out to remove randomness 
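
If a run needs to be replayed later, the shuffle can be made reproducible by recording the seed. This is a minimal sketch, assuming a hypothetical helper `shuffled_copy` and placeholder state labels; it uses a private `random.Random` instance so the global generator seeded in the imports cell is left untouched.

```python
import random

def shuffled_copy(items, seed):
    """Return a shuffled copy of items; the same seed gives the same order."""
    rng = random.Random(seed)   # private generator, independent of random.seed()
    result = list(items)        # copy so the caller's list keeps its order
    rng.shuffle(result)
    return result

states = ["S01", "S02", "S03", "S04", "S05"]  # placeholder labels
order_a = shuffled_copy(states, seed=42)
order_b = shuffled_copy(states, seed=42)
print(order_a == order_b)         # same seed, same order
print(sorted(order_a) == states)  # every state still appears exactly once
```

Logging the seed next to each spreadsheet row would let a specific presentation order be reconstructed after the fact.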

### Define Expression Being Tested

In [5]:
study_condition = 'LLM'         # LLM, HUMAN, RAND 
real_state = 'FO'               # WFI, AO, FO, NH, C
                                # Waiting for Input, Analyzing Object, Found Object, Needs Help, Confused
expression_ID = study_condition + "_" + real_state

parameter_01_value = "neutral"
parameter_02_value = "forward"  # NOTE: P02's declared value range is (backwards, neutral, forwards)
parameter_03_value = "neutral"
parameter_04_value = "low"
parameter_05_value = "object"
parameter_06_value = "medium"
parameter_07_value = "slow"
parameter_08_value = "smooth"

justification_toggle = None # True or None

In [7]:
def generate_gpt_user_prompt():
    
    if justification_toggle:  

        return f'''
{gpt_assistant_prompt}

{deployment_context}

The robot can be in one of these 5 possible states. The data in this list is in the format: "State Number: [State Name, State Description]"

{state_list[0]}

{state_list[1]}

{state_list[2]}

{state_list[3]}

{state_list[4]}

{state_06}

—————————

This robot uses its body pose to express its internal state. Below is a list of eight (8) motion parameters, complete with a description and a value range for each of those parameters. These parameters govern the characteristics of the robot's pose. The data in this list is in the format: "Parameter Number: [Parameter Name, Parameter Description, (Value Range)]"

{param_list[0]}

{param_list[1]}

{param_list[2]}

{param_list[3]}

{param_list[4]}

{param_list[5]}

{param_list[6]}

{param_list[7]}

—————————

Now consider a scenario where this robot has the parameter configuration:

Parameter 01: Body Tilt={parameter_01_value}
Parameter 02: Body Lean={parameter_02_value}
Parameter 03: Body Turn={parameter_03_value}
Parameter 04: Body Height={parameter_04_value}
Parameter 05: Body Direction={parameter_05_value}
Parameter 06: Pose Duration={parameter_06_value}
Parameter 07: Motion Velocity={parameter_07_value}
Parameter 08: Motion Smoothness={parameter_08_value}

—————————

Your task is to simulate human feedback to train a robot expression model. You should occasionally select your second-best option to mimic stochastic human feedback when answering the following question.  

Based on the robot's current parameter configuration, which of the five robot states do you think the robot is currently in? If more than one state seems plausible based on the robot's current configuration, you may select what feels most appropriate or the second-best alternative if they're close. If none of the states seem to match, select 'None of the Above'.

Your response must be a single line in the exact format shown below (see example and reference):

[State_Number, State_Name], Justification

Reference: 
State_Number = number of the selected robot state (e.g. S01)
State_Name = name of the selected robot state (e.g. Analyzing Object)
Justification = A single sentence description as to why you made this selection
'''
    

    else:

        return f'''
{gpt_assistant_prompt}

{deployment_context}

The robot can be in one of these 5 possible states. The data in this list is in the format: "State Number: [State Name, State Description]"

{state_list[0]}

{state_list[1]}

{state_list[2]}

{state_list[3]}

{state_list[4]}

{state_06}

—————————

This robot uses its body pose to express its internal state. Below is a list of eight (8) motion parameters, complete with a description and a value range for each of those parameters. These parameters govern the characteristics of the robot's pose. The data in this list is in the format: "Parameter Number: [Parameter Name, Parameter Description, (Value Range)]"

{param_list[0]}

{param_list[1]}

{param_list[2]}

{param_list[3]}

{param_list[4]}

{param_list[5]}

{param_list[6]}

{param_list[7]}

—————————

Now consider a scenario where this robot has the parameter configuration:

Parameter 01: Body Tilt={parameter_01_value}
Parameter 02: Body Lean={parameter_02_value}
Parameter 03: Body Turn={parameter_03_value}
Parameter 04: Body Height={parameter_04_value}
Parameter 05: Body Direction={parameter_05_value}
Parameter 06: Pose Duration={parameter_06_value}
Parameter 07: Motion Velocity={parameter_07_value}
Parameter 08: Motion Smoothness={parameter_08_value}

—————————

Your task is to simulate human feedback to train a robot expression model. You should occasionally select your second-best option to mimic stochastic human feedback when answering the following question.  

Based on the robot's current parameter configuration, which of the five robot states do you think the robot is currently in? If more than one state seems plausible based on the robot's current configuration, you may select what feels most appropriate or the second-best alternative if they're close. If none of the states seem to match, select 'None of the Above'.

Your response must be a single line in the exact format shown below (see example and reference):

[State_Number, State_Name]

Reference: 
State_Number = number of the selected robot state (e.g. S01)
State_Name = name of the selected robot state (e.g. Analyzing Object)
'''
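
The two branches of `generate_gpt_user_prompt` differ only in the response-format block at the end, so the shared text could be written once and the format block appended per mode. The following is a sketch under that assumption; `format_block` and `build_prompt` are hypothetical names, and the shared body is a placeholder rather than the full prompt.

```python
def format_block(justify):
    """Return the response-format instructions for the chosen mode."""
    fields = "[State_Number, State_Name]"
    reference = [
        "State_Number = number of the selected robot state (e.g. S01)",
        "State_Name = name of the selected robot state (e.g. Analyzing Object)",
    ]
    if justify:
        fields += ", Justification"
        reference.append(
            "Justification = A single sentence description as to why you made this selection"
        )
    return fields + "\n\nReference: \n" + "\n".join(reference) + "\n"

def build_prompt(shared_body, justify):
    """Join the shared prompt text with the mode-specific format block."""
    return shared_body.rstrip("\n") + "\n\n" + format_block(justify)

demo = build_prompt("...shared prompt text...", justify=False)
print(demo)
```

Keeping one copy of the shared text means a wording change cannot drift between the justification and no-justification conditions.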

### Generate Your Prompt

In [8]:
gpt_user_prompt = generate_gpt_user_prompt()

print(f'Your prompt for GPT is: \n\n{gpt_user_prompt}')

Your prompt for GPT is: 


You are simulating human feedback for training a robot expression model. Your responses should be accurate but have slight variability to simulate human interpretation. Sometimes, humans may interpret the robot's pose differently in similar scenarios, so slight variations are encouraged.

Consider a scenario where you are collaborating with a dog-shaped quadruped robot to locate and pick strawberries in a strawberry patch.

The robot can be in one of these 5 possible states. The data in this list is in the format: "State Number: [State Name, State Description]"

S02: [Analyzing Object, The robot is analyzing a target object in front of it on the ground]

S05: [Confused, The robot is confused and unsure what to do]

S01: [Waiting for Input, The robot is in standby mode waiting for a command from the user]

S04: [Needs Help, The robot is experiencing an error and needs help from the user]

S03: [Found Object, The robot has found a target object in front of it o

## Now build the Request

**Read the request documentation to tune your model for your application**

**Request Documentation** (the code below calls the Chat Completions endpoint): https://platform.openai.com/docs/api-reference/chat/create

**Models**: The model you want to use (e.g. "gpt-4-turbo", "gpt-4", "gpt-4o", "gpt-4o-mini" or "gpt-3.5-turbo")

**Message**: The message being sent (the prompt)

**Temperature**: Number between 0 and 2 (the API default is 1) that controls the randomness of the responses. Higher values make the output more random and diverse; values closer to 0 make it more deterministic. Suggestion: Start with a moderate value, around 0.4–0.6. This gives some randomness, mimicking the variability in human input, while keeping enough accuracy for the responses to be meaningful.
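
To make the effect of temperature concrete, here is a toy illustration of temperature-scaled softmax. The API does not expose its sampler, so this is a sketch of the standard technique and not OpenAI's implementation; the logits are made up, and the function assumes a temperature greater than 0.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature (> 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                        # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]   # made-up scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.5)
warm = softmax_with_temperature(logits, 1.5)
print([round(p, 3) for p in cool])   # sharper: most mass on the top token
print([round(p, 3) for p in warm])   # flatter: mass spread more evenly
```

The lower temperature concentrates probability on the highest-scoring token, which is why repeated calls at low temperature tend to return the same state.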

**Frequency_penalty**: Number between -2 and 2. Positive values penalize tokens in proportion to how often they have already appeared in the response so far, which lowers the probability of verbatim repetition. Suggestion: Use a low frequency penalty (e.g., 0.0–0.2) to reduce redundancy while still allowing the model to reuse relevant tokens where necessary.
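
The penalty can be pictured as subtracting `penalty * count` from each token's score, where `count` is how many times the token has already been generated. The sketch below follows that general form and leaves out the separate presence penalty; the token names and scores are invented for illustration, and `apply_frequency_penalty` is a hypothetical helper.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty):
    """Lower each token's logit by penalty times its count in the text so far."""
    counts = Counter(generated_tokens)
    return {tok: score - penalty * counts[tok] for tok, score in logits.items()}

logits = {"berry": 2.0, "robot": 1.5, "leaf": 1.0}   # made-up scores
history = ["berry", "berry", "robot"]                 # tokens generated so far
adjusted = apply_frequency_penalty(logits, history, penalty=0.5)
print(adjusted)
```

For the one-line answers requested in this notebook, very few tokens repeat, so this setting is likely to have little influence on which state is chosen.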

**Top-p (Nucleus Sampling)**: This parameter sets the cumulative probability mass that the model samples from at each token. A lower top_p (e.g., 0.7) restricts sampling to the most likely tokens, while a top_p of 1.0 considers the full distribution. Suggestion: Set top_p to 0.8–0.9 to introduce some controlled randomness while keeping most of the likelihood in the more probable tokens, ensuring coherent responses.

Higher top_p values (closer to 1): This includes a broader range of possible tokens, increasing diversity and making the behaviour more stochastic (i.e., less predictable). A value of top_p = 1 means the model can sample from the full probability distribution.

Lower top_p values (closer to 0): This restricts sampling to a smaller subset of tokens (those with the highest probability), leading to more deterministic and conservative outputs. For example, top_p = 0.1 means the model only considers the most likely tokens that together make up the top 10% of probability mass, which is not the same as the top 10% of tokens.
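
A small sketch of nucleus filtering shows why top_p is a statement about probability mass and not about a fraction of the vocabulary. The distribution is made up and `nucleus` is a hypothetical helper illustrating the general technique, not the API's internal sampler.

```python
def nucleus(probs, top_p):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p, renormalized."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    return {token: p / mass for token, p in kept}

probs = {"S02": 0.6, "S03": 0.25, "S01": 0.1, "S05": 0.05}  # made-up distribution
print(nucleus(probs, 0.1))   # only the single most likely token survives
print(nucleus(probs, 0.8))   # the top two tokens are needed to reach 0.8
print(nucleus(probs, 1.0))   # the full distribution is kept
```

When one option already holds most of the mass, a low top_p removes every alternative, so the "second-best" answers requested in the prompt can no longer be sampled.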

### SETUP CELL BELOW


In [18]:
attempt_ID = '05'

iteration_quantity = 20
gpt_model = "gpt-4"     # "gpt-3.5-turbo" for dev/testing
                        # "gpt-4" / "gpt-4o" when deployed (more expensive); apparently gpt-4 is better with stochasticity

temperature_coefficient = 1.0           # Moderately stochastic @ 0.6 to 1.0
frequency_penalty_coefficient = 0.5     # Lightly penalize repetition @ 0.2
top_p_coefficient = 1.0                 # Nucleus sampling for controlled randomness @ 0.6 to 0.85



### Multi-Iteration

#### First initialize your workbook with the correct data headers 

In [14]:
sheet_name = expression_ID + '_' + attempt_ID 

# Enter the data in spreadsheet format
workbook_path = "./../data/proxy_validation/proxy_validation.xlsx"
response_book = load_workbook(workbook_path)

try: # Try to open existing sheet
    response_sheet = response_book[sheet_name]
except KeyError:  # If it doesn't exist, create it
    response_sheet = response_book.create_sheet(title=sheet_name)
response_sheet["A1"] = "model"
response_sheet["B1"] = "study cond"
response_sheet["C1"] = "real state"
response_sheet["D1"] = "iteration"
response_sheet["E1"] = "state number"
response_sheet["F1"] = "state name"
response_sheet["G1"] = "justification"
response_sheet["H1"] = "P1 Tilt"
response_sheet["I1"] = "P2 Lean"
response_sheet["J1"] = "P3 Turn"
response_sheet["K1"] = "P4 Body Height"
response_sheet["L1"] = "P5 Direction"
response_sheet["M1"] = "P6 Duration"
response_sheet["N1"] = "P7 Velocity"
response_sheet["O1"] = "P8 Smoothness"

response_book.save(workbook_path)

#### Now run the iterations

In [15]:
print(f'expression ID: {expression_ID}\n')
error_counter = 0

for iteration in range(0, iteration_quantity):
    print(f'~~~~~~~~~~~~ Iteration {iteration:02d}')

    # adding to excel
    response_sheet["A"+str(iteration+2)] = gpt_model
    response_sheet["B"+str(iteration+2)] = study_condition
    response_sheet["C"+str(iteration+2)] = real_state
    response_sheet["D"+str(iteration+2)] = iteration

    
    # calling GPT client
    completion = client.chat.completions.create(
        model=gpt_model,  
        messages=[
            {"role": "system", "content": gpt_assistant_prompt},
            {"role": "user", "content": gpt_user_prompt}],
        temperature=temperature_coefficient,
        max_tokens=500,
        frequency_penalty= frequency_penalty_coefficient,
        top_p = top_p_coefficient
    )

    # Printing result from call to GPT client
    print(completion.choices[0].message.content, '\n\n')

    
    # Now iterate and count the responses
    for line in completion.choices[0].message.content.split('\n'):

        if not line.strip():
            continue  # skip blank lines so they are not counted as match errors

        if justification_toggle:
            match = re.match(r"\[(\w+), (.+?)\], (.+)", line)

            if match:
                state_number, state_name, justification = match.groups()
            else:
                print(f"ERROR at iteration {iteration}: No match found")
                error_counter +=1
                continue  # nothing parsed: do not write undefined or stale values

        else: #if justification_toggle OFF 
            match = re.match(r"\[(\w+), (.+?)\]", line)

            if match:
                state_number, state_name = match.groups()

            else:
                print(f"ERROR at iteration {iteration}: No match found")
                error_counter +=1
                continue  # nothing parsed: do not write undefined or stale values

        print(f'Appending: {state_number}, {state_name}')
        response_sheet["E"+str(iteration+2)] = state_number
        response_sheet["F"+str(iteration+2)] = state_name

        if justification_toggle:
            response_sheet["G"+str(iteration+2)] = justification
        else: 
            response_sheet["G"+str(iteration+2)] = 'off'

        response_sheet["H"+str(iteration+2)] = parameter_01_value
        response_sheet["I"+str(iteration+2)] = parameter_02_value
        response_sheet["J"+str(iteration+2)] = parameter_03_value
        response_sheet["K"+str(iteration+2)] = parameter_04_value
        response_sheet["L"+str(iteration+2)] = parameter_05_value
        response_sheet["M"+str(iteration+2)] = parameter_06_value
        response_sheet["N"+str(iteration+2)] = parameter_07_value
        response_sheet["O"+str(iteration+2)] = parameter_08_value

    print('\n')

print(f'completed {iteration+1} iterations with {error_counter} match errors')
      
response_book.save(workbook_path)


expression ID: LLM_FO

~~~~~~~~~~~~ Iteration 00
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 01
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 02
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 03
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 04
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 05
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 06
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 07
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 08
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 09
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~~~~ Iteration 10
[S02, Analyzing Object] 


Appending: S02, Analyzing Object


~~~~~~~~~
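
Since the point of repeating the call is to see how often each state is chosen, it can help to parse and tally the replies in one place. This is a minimal sketch, assuming a hypothetical `parse_reply` helper whose single pattern accepts replies with or without a justification; the reply strings are illustrative and are not real model output.

```python
import re
from collections import Counter

# Optional trailing group captures the justification when it is present.
PATTERN = re.compile(r"\[(\w+), (.+?)\](?:, (.+))?")

def parse_reply(line):
    """Return (state_number, state_name, justification) or None if the line does not match."""
    match = PATTERN.match(line.strip())
    return match.groups() if match else None

replies = [
    "[S02, Analyzing Object] ",
    "[S03, Found Object], The low forward lean suggests it located something.",
    "I think the robot is confused.",
]   # illustrative strings, not real model output
parsed = [parse_reply(r) for r in replies]
tally = Counter(p[0] for p in parsed if p)
print(parsed)
print(tally)
```

A tally that contains a single state across all iterations, as in the run above, suggests the sampling settings are not producing the variability the prompt asks for.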

### Single Iteration

In [16]:
completion = client.chat.completions.create(
    model=gpt_model,
    messages=[
        {"role": "system", "content": gpt_assistant_prompt},
        {"role": "user", "content": gpt_user_prompt}],
    temperature=temperature_coefficient,
    max_tokens=500,
    frequency_penalty= frequency_penalty_coefficient,
    top_p = top_p_coefficient
)

print('Raw Model Output:\n\n')
print(completion.choices[0].message)

Raw Model Output:


ChatCompletionMessage(content='[S02, Analyzing Object]', role='assistant', function_call=None, tool_calls=None, refusal=None)


In [17]:
print(f'''
      Parameter Config:\n
P01_value = {parameter_01_value} Body Tilt
P02_value = {parameter_02_value} Body Lean
P03_value = {parameter_03_value} Body Turn
P04_value = {parameter_04_value} Body Height
P05_value = {parameter_05_value} Body Direction
P06_value = {parameter_06_value} Pose Duration
P07_value = {parameter_07_value} Motion Velocity
P08_value = {parameter_08_value} Motion Smoothness
''')


print(completion.choices[0].message.content)


      Parameter Config:

P01_value = neutral Body Tilt
P02_value = forward Body Lean
P03_value = neutral Body Turn
P04_value = low Body Height
P05_value = object Body Direction
P06_value = medium Pose Duration
P07_value = slow Motion Velocity
P08_value = smooth Motion Smoothness

[S02, Analyzing Object]
