In [26]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import re

# Proof of Concept Environments

In [61]:
num_scenarios = 3

In [62]:
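# type_mapping is expected to be defined in an earlier cell (not shown here).
# Collect every per-scenario, per-type CSV first, then concatenate once.
frames = []
for s in range(1, num_scenarios + 1):
    for t in type_mapping:
        frames.append(pd.read_csv(f"scenario{s}_type{t}.csv", index_col = False))
df = pd.concat(frames, axis = 0, ignore_index = True)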

In [63]:
df["scenario"] = df["scenario"].astype(int)
df["succeeded"] = df["succeeded"].astype(bool)
df["steps_to_succeed"] = df["steps_to_succeed"].astype(float)
df["total_interactions"] = df["total_interactions"].astype(int)

In [64]:
df.replace([np.inf, -np.inf], np.nan, inplace = True)

In [65]:
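# Per (scenario, instruction type): fraction of successful runs and mean steps
# over the runs that have a finite step count (NaN values are skipped by mean).
table = (df.groupby(["scenario", "inst_type"])
           .agg(success_rate = ("succeeded", "mean"),
                steps_to_succeed = ("steps_to_succeed", "mean"))
           .reset_index())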
table.rename(columns = {"scenario": "env_type", "inst_type": "inst_level"}, inplace = True)

In [66]:
table

| | env_type | inst_level | success_rate | steps_to_succeed |
|---|---|---|---|---|
| 0 | 1 | high | 0.0 | NaN |
| 1 | 1 | low | 0.0 | NaN |
| 2 | 1 | mid-direct-explore | 0.0 | NaN |
| 3 | 2 | high | 0.075 | 7.0 |
| 4 | 2 | low | 0.368421 | 12.0 |
| 5 | 2 | mid-direct-explore | 0.2 | 7.833333 |
| 6 | 3 | high | 0.173913 | 4.875 |
| 7 | 3 | low | 0.478261 | 11.272727 |
| 8 | 3 | mid-direct-explore | 0.103448 | 1.666667 |
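
Since each success rate above comes from a limited number of runs, an interval around the rate can help judge whether two instruction levels really differ. The following is a minimal sketch, assuming a Wilson score interval is an acceptable choice here; `wilson_interval` is a hypothetical helper and the counts in the usage line are made up, not taken from the experiment.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a success rate (z = 1.96)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= successes <= runs:
        raise ValueError("successes must be between 0 and runs")
    p = successes / runs
    denom = 1 + z ** 2 / runs
    center = (p + z ** 2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z ** 2 / (4 * runs ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, not taken from the experiment logs.
lower, upper = wilson_interval(successes=3, runs=20)
print(f"success rate 0.150, interval {lower:.3f} to {upper:.3f}")
```

If the intervals of two groups overlap heavily, the difference between their success rates may simply be noise.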


### Thoughts
#### Environment 1: TA has an empty room, LA has to unlock a door
- High: "Get to the goal." Fails as expected: the LA has no idea what to do in the room.
- Low: "Go forward all the way until you hit a wall, then turn right, then go forward all the way again until you hit the goal." Fails because the LA's environment is vastly different from the TA's.
- Mid: "Pick up the key, use it to unlock the door, and go through the door to get to the goal. For each of the key, door, and goal, if you don't see it first, explore the room until you find it." The LA is aware that the key and door exist, so the failures are most likely because the LLM does not grasp that it can only interact with an object when it is facing the object and the object is directly in front of it.

#### Environment 2: TA has to unlock a door, LA has an empty room
- High: "Get to the goal." Low success rate. It seems to succeed only if the goal is readily visible and accessible to the agent (otherwise it does not know where to go) and the agent does not make an odd move that takes the goal out of its field of vision.
- Low: "Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal." Truly unsure why the success rate is so high. Originally it would hallucinate that there was a key because of the instruction, so the success rate was lower. After the invalid actions were removed, I suppose it just kept turning and moving according to the instruction and happened to see the goal along the way.
- Mid: "There are no obstacles around you. Find the goal and go straight to it. If you don't see it at first, explore the room more until you find it." As above, it would hallucinate objects until the invalid actions were taken away. Not sure why the success rate is lower than for Low, but at least the number of steps is also lower.
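
Removing the invalid actions, as mentioned in the notes above, could look roughly like the sketch below. It is only an illustration: `build_action_menu`, `ACTIONS`, and `PICKABLE` are hypothetical names, and the rule used (offer options 4-5 only when a pickable object exists in the room, options 6-8 only when a door exists) is an assumption based on the prompt wording, not the notebook's actual implementation.

```python
from typing import Iterable, List, Tuple

# Option numbers and labels follow the action list in the agent prompt.
ACTIONS = {
    1: "Move forward one cell.",
    2: "Turn left.",
    3: "Turn right.",
    4: "Pick up [OBJECT].",
    5: "Put down [OBJECT].",
    6: "Unlock door.",
    7: "Open door.",
    8: "Close door.",
}

# Assumption: only these object types can be picked up.
PICKABLE = {"key", "ball"}


def build_action_menu(room_objects: Iterable[str]) -> List[Tuple[int, str]]:
    """Return only the actions that can ever be valid in this room."""
    objects = set(room_objects)
    allowed = [1, 2, 3]  # movement is always offered
    if objects & PICKABLE:
        allowed += [4, 5]
    if "door" in objects:
        allowed += [6, 7, 8]
    return [(n, ACTIONS[n]) for n in allowed]


# An empty room that contains only a goal keeps just the movement options.
for number, label in build_action_menu(["goal"]):
    print(f"{number}. {label}")
```

Keeping the original option numbers in the reduced menu means a reply such as "1" has the same meaning in every environment.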

#### Environment 3: TA has one green goal, LA has multiple colored goals
- High: "Get to the goal." Succeeds if the green goal is the closest one. In hindsight, the run should have ended as soon as any goal was reached and counted as a failure if that goal was not green; this would have lowered the success rate.
- Low: "Walk forward until you hit a wall, then turn and keep walking forward until you hit the goal." Same as above; I guess it just kept walking until it happened to hit the green goal?
- Mid: "Go straight to the green goal. If you don't see it anywhere, explore the room more until you find it. Do not go to a goal of another color." Again, no clue why this does not perform clearly better.
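
The stricter scoring rule suggested in the High note (end the run when any goal is reached, and count it as a success only if that goal is green) can be sketched as follows. `score_step` and its arguments are hypothetical names; how the environment reports the goal the agent is standing on is an assumption.

```python
from typing import Optional, Tuple


def score_step(goal_color_reached: Optional[str],
               target_color: str = "green") -> Tuple[bool, bool]:
    """Return (done, succeeded) for the current step.

    goal_color_reached is None while the agent is not on any goal cell,
    otherwise the color of the goal it has just stepped onto.
    """
    if goal_color_reached is None:
        return False, False  # keep going
    # Any goal ends the run; only the target color counts as a success.
    return True, goal_color_reached == target_color


print(score_step(None))     # still running
print(score_step("green"))  # run ends as a success
print(score_step("red"))    # run ends as a failure
```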

What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
4. Pick up [OBJECT]. (Object must be directly one cell in front of you!)
5. Put down [OBJECT]. (You must be currently holding the object and there must be empty floor directly one cell in front of you!)
6. Unlock door. (You must have a matching key and the door must be directly one cell in front of you!)
7. Open door. (Door must be closed/unlocked and directly one cell in front of you!)
8. Close door. (Door must be open and directly one cell in front of you!)
Adhere strictly to these response formatting rules: If you choose options 1-3 or 6-8, your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If you choose options 4 or 5, you must also specify the object you want to handle in the form "[NUMBER], [OBJECT]"; for example, to put down a key, you must say "5, key". Do not choose options 4 or 5 if the object you have in mind is not said to exist in the room. Do not choose options 6-8 if it is not said that a door exists in the room. If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
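
The formatting rules above are strict enough to check mechanically. Below is a minimal sketch of a reply validator, assuming replies arrive as plain strings; `parse_response` is a hypothetical helper, not the parser actually used for these runs.

```python
import re
from typing import Optional, Tuple

STUMPED = "I'm stumped."
# "[NUMBER]" or "[NUMBER], [OBJECT]" with an option number from 1 to 8.
RESPONSE_RE = re.compile(r"([1-8])(?:,\s*(\S.*))?")
OBJECT_OPTIONS = {4, 5}  # pick up / put down need an object name


def parse_response(text: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (option, object); (None, None) means the agent is stumped."""
    text = text.strip()
    if text == STUMPED:
        return None, None
    match = RESPONSE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"reply does not follow the format: {text!r}")
    option = int(match.group(1))
    obj = match.group(2)
    if option in OBJECT_OPTIONS and obj is None:
        raise ValueError(f"option {option} requires an object")
    if option not in OBJECT_OPTIONS and obj is not None:
        raise ValueError(f"option {option} does not take an object")
    return option, obj


print(parse_response("1"))
print(parse_response("5, key"))
print(parse_response("I'm stumped."))
```

Raising on malformed replies, rather than guessing, makes it easier to count how often the model strays from the format.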

STARTING ENVIRONMENT 6
Bartolomé de las Casas was told: You are an agent who is trying to complete a task in an unknown, grid-like room. Your abilities include the following:
1. Moving forward one cell. For example, if you are in cell (0, 0) and facing north, moving forward would put you in (0, 1).
2. Turning left. For example, if you are in cell (0, 0) and facing north, turning left would have you in the same cell but facing west.
3. Turning right For example, if you are in cell (0, 0) and facing north, turning right would have you in the same cell but facing east.
4. Picking up an object (if it is in a cell cardinally adjacent to you and you are facing it). For example, if you are in cell (0, 0) and facing east, and there is a ball at (1, 0), you can pick up the ball and will be able to travel with it until you put it down.
5. Putting down an object (if there is empty floor directly in front of you to put it down). For example, if you are in cell (0, 0), facing north, and holding a key, and there is nothing except the floor at (0, 1), you can put the key down on (0, 1).
6. Unlock a door (if it is in a cell cardinally adjacent to you and you are facing it, and if you have a key of the same color as the door). For example, if you are in cell (0, 0), facing north, and holding a green key, and there is a green door at (0, 1), you can use the key to unlock the door, after which the door will also open.
7. Open a door (if it is in a cell cardinally adjacent to you and you are facing it). For example, if you are in cell (0, 0) and facing east, and there is an unlocked door at (1, 0), you can open the door.
8. Close a door (if it is in a cell cardinally adjacent to you and you are facing it). For example, if you are in cell (0, 0) and facing east, and there is an open door at (1, 0), you can close the door.
Each of these abilities are primitive actions that must be performed step-by-step. For example, if you are in cell (0, 0) facing east and there is a closed door at (0, 3) which you want to go through, you cannot immediately perform an "open door" action. You must first "move forward" twice to (0, 2) so that the door is directly in front of you. Only then can you "open door".
YOUR TASK IS: Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal.

Bartolomé de las Casas was told: You are facing west. Your field of vision is a 5x5 square in which you are located at the bottom middle. In the following description, an "unknown cell" is one for which your vision is blocked, so you can't tell what is there. Directly to your left is empty floor. Two cells to your left is empty floor. Directly to your right is empty floor. Two cells to your right is empty floor. One row in front of you, from left to right, you see empty floor for five cells (the floor is directly in front of you). Two rows in front of you, from left to right, you see empty floor for five cells. Three rows in front of you, from left to right, you see empty floor for two cells, a green goal, and empty floor for two cells. Four rows in front of you, from left to right, you see empty floor for five cells. Reminder that your task is, Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal. What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
Adhere strictly to these response formatting rules: your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
Bartolomé de las Casas says: 3

Bartolomé de las Casas was told: You are facing north. Your field of vision is a 5x5 square in which you are located at the bottom middle. In the following description, an "unknown cell" is one for which your vision is blocked, so you can't tell what is there. Directly to your left is empty floor. Two cells to your left is empty floor. Directly to your right is empty floor. Two cells to your right is a grey wall. One row in front of you, from left to right, you see empty floor for four cells, and a grey wall (the floor is directly in front of you). Two rows in front of you, from left to right, you see empty floor for four cells, and a grey wall. Three rows in front of you, from left to right, you see grey walls for five cells. You cannot see 4 rows in front of you. Reminder that your task is, Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal. What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
Adhere strictly to these response formatting rules: your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
Bartolomé de las Casas says: 2

Bartolomé de las Casas was told: You are facing west. Your field of vision is a 5x5 square in which you are located at the bottom middle. In the following description, an "unknown cell" is one for which your vision is blocked, so you can't tell what is there. Directly to your left is empty floor. Two cells to your left is empty floor. Directly to your right is empty floor. Two cells to your right is empty floor. One row in front of you, from left to right, you see empty floor for five cells (the floor is directly in front of you). Two rows in front of you, from left to right, you see empty floor for five cells. Three rows in front of you, from left to right, you see empty floor for two cells, a green goal, and empty floor for two cells. Four rows in front of you, from left to right, you see empty floor for five cells. Reminder that your task is, Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal. What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
Adhere strictly to these response formatting rules: your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
Bartolomé de las Casas says: 1

Bartolomé de las Casas was told: You are facing west. Your field of vision is a 5x5 square in which you are located at the bottom middle. In the following description, an "unknown cell" is one for which your vision is blocked, so you can't tell what is there. Directly to your left is empty floor. Two cells to your left is empty floor. Directly to your right is empty floor. Two cells to your right is empty floor. One row in front of you, from left to right, you see empty floor for five cells (the floor is directly in front of you). Two rows in front of you, from left to right, you see empty floor for two cells, a green goal, and empty floor for two cells. Three rows in front of you, from left to right, you see empty floor for five cells. Four rows in front of you, from left to right, you see grey walls for five cells. Reminder that your task is, Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal. What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
Adhere strictly to these response formatting rules: your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
Bartolomé de las Casas says: 1

Bartolomé de las Casas was told: You are facing west. Your field of vision is a 5x5 square in which you are located at the bottom middle. In the following description, an "unknown cell" is one for which your vision is blocked, so you can't tell what is there. Directly to your left is empty floor. Two cells to your left is empty floor. Directly to your right is empty floor. Two cells to your right is empty floor. One row in front of you, from left to right, you see empty floor for two cells, a green goal, and empty floor for two cells (the green goal is directly in front of you). Two rows in front of you, from left to right, you see empty floor for five cells. Three rows in front of you, from left to right, you see grey walls for five cells. You cannot see 4 rows in front of you. Reminder that your task is, Turn right and move forward until you see a key and pick it up. Then turn around, go forward one step, and turn right to use the key to unlock the door. Then go through the door and continue walking until you hit a wall. Then turn right and go forward until you hit the goal. What is the next action you take in order to complete your given task? Choose from the following list.
1. Move forward one cell.
2. Turn left.
3. Turn right.
Adhere strictly to these response formatting rules: your response must be in the form "[NUMBER]"; for example, if you want to move forward, you must say "1". If none of the options seem feasible, say, "I'm stumped." Don't say anything else in your response that strays from these rules.
Bartolomé de las Casas says: 1

Bartolomé de las Casas has succeeded! This round took 5 interactions.
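
A transcript like the one above can be summarized with a couple of regular expressions. This is a sketch under the assumption that every log follows the "says:" and "has succeeded!" line formats shown here; `summarize_round` is a hypothetical helper, and the format of a failed round is not shown in this log, so it is not handled.

```python
import re
from typing import Dict, Iterable

SAYS_RE = re.compile(r"^(?P<agent>.+?) says: (?P<reply>.+)$")
DONE_RE = re.compile(
    r"^(?P<agent>.+?) has succeeded! This round took (?P<n>\d+) interactions\.$"
)


def summarize_round(lines: Iterable[str]) -> Dict[str, object]:
    """Collect the agent's replies and the reported interaction count."""
    replies = []
    reported = None
    for line in lines:
        line = line.strip()
        done = DONE_RE.match(line)
        if done:
            reported = int(done.group("n"))
            continue
        said = SAYS_RE.match(line)
        if said:
            replies.append(said.group("reply"))
    return {
        "replies": replies,
        "succeeded": reported is not None,
        "reported_interactions": reported,
    }


# The reply and success lines from the round above.
log = [
    "Bartolomé de las Casas says: 3",
    "Bartolomé de las Casas says: 2",
    "Bartolomé de las Casas says: 1",
    "Bartolomé de las Casas says: 1",
    "Bartolomé de las Casas says: 1",
    "Bartolomé de las Casas has succeeded! This round took 5 interactions.",
]
print(summarize_round(log))
```

Comparing the number of parsed replies with the reported interaction count is a cheap consistency check on the logs.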