# Reflection Wrapper

- Created 28 Oct by John Tan Chong Min
- After each subtask, generates a new reflection from the task, the past reflection, and the subtasks completed so far
- Can figure out a dynamically changing rock, scissors, paper strategy

In [1]:
from agentjo import *

In [2]:
### Put in your Azure OpenAI keys here ###
AZURE_OPENAI_KEY = ""
AZURE_OPENAI_ENDPOINT = ""
API_VERSION = ''

In [4]:
from openai import AzureOpenAI

azure_open_ai_client = AzureOpenAI(
    api_key = AZURE_OPENAI_KEY,
    azure_endpoint = AZURE_OPENAI_ENDPOINT,
    api_version = API_VERSION
)
 
def llm(system_prompt: str, user_prompt: str) -> str:
    response = azure_open_ai_client.chat.completions.create(
        model = "gpt-4o-mini",
        messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}],
        temperature = 0
    )
 
    return response.choices[0].message.content

# Adaptable Use Case
- Learning from a rock, scissors, paper game

In [5]:
def opp_strategy(action_sequence: list):
    ''' Chooses a strategy based on the action_sequence played '''
    return "rock"

In [6]:
def play_game(shared_variables, action: str) -> str:
    ''' Plays an action from rock, scissors, or paper '''
    
    # Simulate the opponent's strategy by retrieving the opponent's move
    opp_action = opp_strategy(shared_variables["action_sequence"])
    
    # Append the round's moves to action_sequence as a tuple (opp_action, action)
    shared_variables["action_sequence"].append((opp_action, action))
    
    # Determine the result based on the player's action
    if action == "rock":
        if opp_action == "scissors":
            result = "You win"
        elif opp_action == "paper":
            result = "You lose"
        else:
            result = "Draw"
    
    elif action == "scissors":
        if opp_action == "paper":
            result = "You win"
        elif opp_action == "rock":
            result = "You lose"
        else:
            result = "Draw"
    
    elif action == "paper":
        if opp_action == "rock":
            result = "You win"
        elif opp_action == "scissors":
            result = "You lose"
        else:
            result = "Draw"
    
    else:
        return "Invalid action"
    
    # Return the result along with the actions
    return f"Opponent played: {opp_action}, You played: {action}. Result: {result}"
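
The nested `if`/`elif` branches in `play_game` can also be written as a lookup table. The following is a minimal sketch of that idea, not part of the original notebook: `BEATS` and `judge` are hypothetical names, and unlike `play_game` it also rejects an invalid opponent move.

```python
# Each key beats its value: rock beats scissors, and so on.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}

def judge(action: str, opp_action: str) -> str:
    """Return the round result from the player's point of view."""
    if action not in BEATS or opp_action not in BEATS:
        return "Invalid action"
    if action == opp_action:
        return "Draw"
    # The player wins only if their move beats the opponent's move.
    return "You win" if BEATS[action] == opp_action else "You lose"

print(judge("paper", "rock"))
```

The result strings match the ones `play_game` returns, so `judge` could replace the branching block without changing what the agent sees.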

In [7]:
agent = Agent('Scissors Paper Stone Agent', 'You are an agent meant to win the rock, scissors, paper game',
              shared_variables = {"action_sequence": []},
              global_context = "Action Sequence of (Opponent Action, Your Action): <action_sequence>",
              llm = llm).assign_functions([play_game])
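
The `<action_sequence>` placeholder in `global_context` names a key in `shared_variables`. As a rough illustration only, assuming the placeholder is filled in with the current value of that key before each LLM call (the library's actual mechanism may differ), the substitution could look like this. `fill_context` is a hypothetical helper, not an agentjo function.

```python
import re

def fill_context(template: str, shared_variables: dict) -> str:
    """Replace each <name> placeholder with str(shared_variables[name])."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Leave unknown placeholders untouched instead of raising.
        if name in shared_variables:
            return str(shared_variables[name])
        return match.group(0)
    return re.sub(r"<(\w+)>", replace, template)

context = fill_context(
    "Action Sequence of (Opponent Action, Your Action): <action_sequence>",
    {"action_sequence": [("rock", "paper")]},
)
print(context)
```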

In [8]:
agent.run("Play the game 5 times, try to win as many times as possible and figure out the opponent's strategy", num_subtasks = 5)

[1m[30mObservation: The task is to play the game 5 times, try to win as many times as possible, and figure out the opponent's strategy. No games have been played yet.[0m
[1m[32mThoughts: To start the task, I need to play the first game. Since there is no information about the opponent's strategy yet, I will choose an action randomly.[0m
[1m[34mSubtask identified: Play the first game by choosing an action randomly.[0m
Calling function play_game with parameters {'action': 'rock'}
> {'output_1': 'Opponent played: rock, You played: rock. Result: Draw'}

[1m[30mObservation: The first round resulted in a draw with both the opponent and me playing rock.[0m
[1m[32mThoughts: To win the next round, I should consider the possibility that the opponent might change their strategy. If the opponent continues with rock, I should play paper to win. If they switch to scissors, my rock would win. However, to cover both possibilities, playing paper might be a safer choice.[0m
[1m[34mSubta

[{'output_1': 'Opponent played: rock, You played: rock. Result: Draw'},
 {'output_1': 'Opponent played: rock, You played: paper. Result: You win'},
 {'output_1': 'Opponent played: rock, You played: paper. Result: You win'},
 {'output_1': 'Opponent played: rock, You played: paper. Result: You win'},
 {'output_1': 'Opponent played: rock, You played: paper. Result: You win'}]

# Now for the hard mode

In [16]:
def opp_strategy(action_sequence: list):
    ''' Chooses a strategy based on the action_sequence played '''
    import numpy as np
    
    if len(action_sequence) == 0:
        return np.random.choice(["rock", "paper", "scissors"])
        
    if action_sequence[-1][1] == "rock":
        return "paper"
    elif action_sequence[-1][1] == "paper":
        return "scissors"
    else:
        return "rock"
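
In hard mode the opponent always plays the move that beats the player's previous move. Once that rule is known, the winning reply is the move that the player's own previous move beats. The sketch below simulates this without an LLM; `BEATS`, `BEATEN_BY`, `counter_move`, and `simulate` are hypothetical names, and `opp_strategy` is re-implemented with the standard library `random` module instead of NumPy.

```python
import random

# Each key beats its value; BEATEN_BY is the inverse mapping.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}
BEATEN_BY = {loser: winner for winner, loser in BEATS.items()}

def opp_strategy(action_sequence: list) -> str:
    """Random first move, then the move that beats the player's last move."""
    if not action_sequence:
        return random.choice(list(BEATS))
    # Fall back to rock for unknown moves, like the else branch above.
    return BEATEN_BY.get(action_sequence[-1][1], "rock")

def counter_move(action_sequence: list) -> str:
    """Best reply once the rule is known: play what our last move beats."""
    if not action_sequence:
        return random.choice(list(BEATS))
    return BEATS[action_sequence[-1][1]]

def simulate(rounds: int = 5) -> list:
    """Play the counter strategy against opp_strategy and collect results."""
    history, results = [], []
    for _ in range(rounds):
        action = counter_move(history)
        opp_action = opp_strategy(history)
        history.append((opp_action, action))
        if action == opp_action:
            results.append("Draw")
        elif BEATS[action] == opp_action:
            results.append("You win")
        else:
            results.append("You lose")
    return results

print(simulate(5))
```

Only the first round is left to chance. Every later round is a win, which is the pattern the agent has to discover from the action sequence.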

In [17]:
agent = Agent('Scissors Paper Stone Agent', 'You are an agent meant to win the rock, scissors, paper game',
              shared_variables = {"action_sequence": []},
              global_context = "Action Sequence of (Opponent Action, Your Action): <action_sequence>",
              llm = llm).assign_functions([play_game])

In [18]:
agent.run("Play the game 5 times, try to win as many times as possible and figure out the opponent's strategy", num_subtasks = 5)

[1m[30mObservation: The task is to play the game 5 times and try to win as many times as possible while figuring out the opponent's strategy. No games have been played yet.[0m
[1m[32mThoughts: To start the task, I need to play the first game. Since there is no information about the opponent's strategy yet, I will choose an action to begin the sequence.[0m
[1m[34mSubtask identified: Play the first game by choosing an action. I will start with "rock".[0m
Calling function play_game with parameters {'action': 'paper'}
> {'output_1': 'Opponent played: scissors, You played: paper. Result: You lose'}

[1m[30mObservation: The opponent played scissors in the first round, and I played paper, resulting in a loss.[0m
[1m[32mThoughts: The opponent might have a tendency to play scissors. To counter this, I should play rock in the next round to try and win.[0m
[1m[34mSubtask identified: Play rock to counter the opponent's potential scissors.[0m
Calling function play_game with parame

[{'output_1': 'Opponent played: scissors, You played: paper. Result: You lose'},
 {'output_1': 'Opponent played: scissors, You played: rock. Result: You win'},
 {'output_1': 'Opponent played: paper, You played: rock. Result: You lose'},
 {'output_1': 'Opponent played: paper, You played: rock. Result: You lose'},
 {'output_1': 'Opponent played: paper, You played: scissors. Result: You win'}]
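
To compare runs, the returned list can be tallied by result. This is a small sketch, assuming every entry has the `output_1` text format shown above; `tally` is a hypothetical helper and the list is copied from the hard-mode run.

```python
from collections import Counter

hard_mode_outputs = [
    {'output_1': 'Opponent played: scissors, You played: paper. Result: You lose'},
    {'output_1': 'Opponent played: scissors, You played: rock. Result: You win'},
    {'output_1': 'Opponent played: paper, You played: rock. Result: You lose'},
    {'output_1': 'Opponent played: paper, You played: rock. Result: You lose'},
    {'output_1': 'Opponent played: paper, You played: scissors. Result: You win'},
]

def tally(outputs: list) -> Counter:
    """Count results by reading the text after 'Result: ' in each output."""
    return Counter(o["output_1"].split("Result: ")[1] for o in outputs)

print(tally(hard_mode_outputs))
```

Without reflection, the agent finishes hard mode with two wins and three losses, so it has not yet picked up the opponent's rule.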

# Now add in Reflection Wrapper

In [26]:
class ReflectionWrapper(Agent):
    ''' This class takes an Agent and imbues it with additional functions.
    You can do everything the base Agent can do with this Wrapper '''
    
    def __init__(self, agent: Agent):
        # Initialize the parent Agent
        super().__init__(**agent.__dict__)  # Inherit all of the attributes of the passed agent
        # Initialize the functions
        self.assign_functions(list(agent.function_map.values()))
       
    ### Override the run function to do reflection
    def run(self, task: str = '', overall_task: str = '', num_subtasks: int = 0, *args, **kwargs):
        from termcolor import colored
        
        # Add in new reflection global context
        if 'Reflection: <reflection>' not in self.global_context: 
            self.global_context += 'Reflection: <reflection>'
        if 'reflection' not in self.shared_variables: 
            self.shared_variables['reflection'] = ''
            
        # fall back to max_subtasks when num_subtasks is not given (0)
        total_runs = num_subtasks
        if total_runs == 0:
            total_runs = self.max_subtasks
        
        # run up to total_runs subtasks, reflecting after each one
        for _ in range(total_runs):
            super().run(task, overall_task, num_subtasks = 1, *args, **kwargs)
            res = strict_json('Based on what has happened so far, reflect on useful findings to help you with another similar task in the future',
                f'''Task: ```{self.task}```
Past Reflection: ```{self.shared_variables["reflection"]}```
Subtasks Completed: ```{self.subtasks_completed}```''',
                output_format = {'Reflection': 'type: str'},
                llm = self.llm)
            self.shared_variables['reflection'] = res['Reflection']
            print(colored(f'New Reflection: {res["Reflection"]}\n\n', 'red', attrs = ['bold']))
            
            # end when task completed
            if self.task_completed:
                break
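
Stripped of the agent machinery, the wrapper's `run` is a loop that executes one step, then rewrites a single reflection string from the task, the previous reflection, and everything completed so far. The sketch below shows only that control flow, with plain functions standing in for the agent and the LLM; `run_with_reflection`, `fake_step`, and `fake_reflect` are hypothetical names and do not exist in agentjo.

```python
from typing import Callable, List

def run_with_reflection(step: Callable[[str], str],
                        reflect: Callable[[str, str, List[str]], str],
                        task: str, max_steps: int) -> str:
    """Run one step at a time, refreshing the reflection after each step."""
    reflection = ""
    completed: List[str] = []
    for _ in range(max_steps):
        # The step sees the latest reflection, as the agent does via context.
        completed.append(step(reflection))
        # The new reflection replaces the old one rather than accumulating.
        reflection = reflect(task, reflection, completed)
    return reflection

# Stand-ins for the agent step and the LLM reflection call.
def fake_step(reflection: str) -> str:
    return f"played a round (knew: {reflection or 'nothing'})"

def fake_reflect(task: str, past: str, completed: List[str]) -> str:
    return f"{len(completed)} round(s) observed"

final = run_with_reflection(fake_step, fake_reflect, "Play 5 rounds", 5)
print(final)
```

Because the past reflection is fed back into each reflection call, earlier findings can be carried forward or revised, and the stored value stays a single string.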

In [27]:
def opp_strategy(action_sequence: list):
    ''' Chooses a strategy based on the action_sequence played '''
    import numpy as np
    
    if len(action_sequence) == 0:
        return np.random.choice(["rock", "paper", "scissors"])
        
    if action_sequence[-1][1] == "rock":
        return "paper"
    elif action_sequence[-1][1] == "paper":
        return "scissors"
    else:
        return "rock"

In [28]:
agent = ReflectionWrapper(Agent('Scissors Paper Stone Agent', 'You are an agent meant to win the rock, scissors, paper game',
              shared_variables = {"action_sequence": []},
              global_context = "Action Sequence of (Opponent Action, Your Action): <action_sequence>",
              llm = llm)).assign_functions([play_game])

In [29]:
agent.run("Play the game 5 times, try to win as many times as possible and figure out the opponent's strategy", num_subtasks = 5)

[1m[30mObservation: The task is to play the game 5 times and try to win as many times as possible while figuring out the opponent's strategy. No games have been played yet.[0m
[1m[32mThoughts: To start the task, I need to play the first game. Since there is no information about the opponent's strategy yet, I will choose an action randomly.[0m
[1m[34mSubtask identified: Play the first game by choosing an action randomly.[0m
Calling function play_game with parameters {'action': 'rock'}
> {'output_1': 'Opponent played: scissors, You played: rock. Result: You win'}

[1m[31mNew Reflection: Playing rock resulted in a win against scissors. This suggests that the opponent might have a tendency to play scissors. In future games, it might be useful to start with rock and observe if the opponent continues to play scissors or changes their strategy.

[0m
[1m[30mObservation: Playing rock resulted in a win against scissors. This suggests that the opponent might have a tendency to play 

In [30]:
agent.reset()
agent.run("Play the game 5 times, try to win as many times as possible and figure out the opponent's strategy", num_subtasks = 5)

[1m[30mObservation: The opponent's strategy remains unpredictable, as they have used each move (scissors, paper, and rock) without a discernible pattern. Notably, the opponent has not repeated the same move consecutively.[0m
[1m[32mThoughts: To maximize my chances of winning, I should continue to vary my moves to avoid predictability. I will start by playing one round and then analyze the result to decide the next move.[0m
[1m[34mSubtask identified: Play the first round of the game by choosing rock.[0m
Calling function play_game with parameters {'action': 'rock'}
> {'output_1': 'Opponent played: scissors, You played: rock. Result: You win'}

[1m[31mNew Reflection: The opponent's strategy appears to be random, as they have not repeated the same move consecutively and have used each move (scissors, paper, and rock) without a discernible pattern. To improve my chances of winning in future games, I should continue to vary my moves to avoid predictability. Additionally, tracking 