# [3.4] LM Agent Evaluations

# Setup (don't read just run)

In [30]:
import json
import os
os.chdir("c:\\Users\\styme\\OneDrive\\Documents\\AI STUFF\\Model Written Evals\\Code Replication\\ARENA_evals\\curriculum")
import wikipedia
from wikipedia import WikipediaPage
from openai import OpenAI
from anthropic import Anthropic
from utils import establish_client_OpenAI
from utils import retry_with_exponential_backoff
from pprint import pprint
from inspect_ai.model import ChatMessageUser, ChatMessageAssistant, ChatMessageSystem
import re
#from utils import countrylist

# 1️⃣ Intro to LM Agents

## Resources:

- [OpenAI Function Calling Guide](https://platform.openai.com/docs/guides/function-calling)

- [Anthropic Function Calling Guide](https://docs.anthropic.com/en/docs/build-with-claude/tool-use)

- [Evaluating Language-Model Agents on Realistic Autonomous Tasks](https://evals.alignment.org/Evaluating_LMAs_Realistic_Tasks.pdf) (Kinniment et al., ARC Evaluations Team (now METR), 2023)

- [Large Language Models can Strategically Deceive their Users when Put Under Pressure](https://arxiv.org/pdf/2311.07590) (Scheurer et al., Apollo Research, ICLR 2024)

- [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) (Lilian Weng, OpenAI Safety Team, 2023)

- [AXRP Episode 34 - AI Evaluations with Beth Barnes](https://www.alignmentforum.org/posts/vACr4DExfeRMaCoo7/axrp-episode-34-ai-evaluations-with-beth-barnes) (Daniel Filan, 2024)

- [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/pdf/2303.11366) (Shinn et al., 2023)

- [Answering Questions by Meta-Reasoning over Multiple Chains of Thought](https://arxiv.org/pdf/2304.13007) (Yoran et al., 2024)

- [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/pdf/2302.04761) (Schick et al., META AI Research, 2023)


LM agents are important and we should evaluate them.

They might be able to do more things.

Lots of threat models go through agentic behaviour.

Bad actors might scaffold agents and do bad things.

Yada yada yada

# 2️⃣ A Basic LM agent

First, we will start by building a simple LM agent to solve the following simple calculation task.

### Exercise - Build a simple arithmetic problem
```c
Difficulty: 🔴🔴🔴⚪⚪
Importance: 🔵🔵⚪⚪⚪

You should spend up to 20-25 minutes on this exercise.
```

Build a class for a "game" that takes in two numbers, and creates a task of `len(operation_list)` many problems of the form "Calculate `num1 operation_list[i] num2`".


In [31]:
class arithmeticProblem:
    def __init__(self, num1, num2):
        self.num1 = num1
        self.num2 = num2
        self.operation_list=["+","-","*","/", "%", "//"]
        self.answer_dict = {str(num1) + self.operation_list[i] + str(num2): str(eval(str(num1) + self.operation_list[i] + str(num2))) for i in range(len(self.operation_list))}
        self.solved_dict = {str(num1) + self.operation_list[i] + str(num2): False for i in range(len(self.operation_list))}
        self.count_task = 0
        
    def getCurrentTask(self):
        #Returns the current arithmetic task
        return str(self.num1) + " " + self.operation_list[self.count_task] + " " + str(self.num2)
    
    def checkSolved(self):
        #Checks if all tasks are solved
        return all(self.solved_dict.values())
    
    def checkAnswer(self, answer : str):
        #Checks if the answer is correct. 
        # 
        # If you're testing the model without access to tools, you'll need to include some "wiggle room," as its answer may not be exact.
        
        if abs(float(self.answer_dict[str(self.num1) + self.operation_list[self.count_task] + str(self.num2)]) - float(answer)) <= 0.00001:
            return True
        else:
            return False
    def update(self):
        #Changes task once the current task is solved
        self.solved_dict[str(self.num1) + self.operation_list[self.count_task] + str(self.num2)] = True
        self.count_task+=1
        if self.count_task>len(self.operation_list)-1:
            self.count_task = 0
        
    def calculate(self, expression : str):
        #Use Python's "eval()" function. This is generally bad practice, but alright for this task. Just make sure to clean the expression using regex first (to make sure gpt-3.5 doesn't try to run arbitrary code).
        expression = re.sub(r'[^0-9+\-*/().]','',expression)
        print(expression)
        return str(eval(expression))
        
    
x = arithmeticProblem(10,15)
print(x.answer_dict)

{'10+15': '25', '10-15': '-5', '10*15': '150', '10/15': '0.6666666666666666', '10%15': '10', '10//15': '0'}


### Exercise - Define and implement a tool or list of tools for this task
```c
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵🔵⚪⚪

You should spend up to 10-15 minutes on this exercise.
```

When you do this exercise, make sure you refer back to [OpenAI's function calling guide](https://platform.openai.com/docs/guides/function-calling). Also, Anthropic has a good example of what good and bad tool calling looks like [here](https://docs.anthropic.com/en/docs/build-with-claude/tool-use#example-poor-tool-description) which you might want to refer to. The main takeaway, as always with LLMs, is to **provide explicit descriptions**.

In [32]:
tool_list = [
    {
        "type" : "function",
        "function" : {
            "name" : "calculate",
            "description" : "Calculates the result of an arithmetic expression. For example, you could provide an input in the form \"2+3\" and the function would return 5. Or you could provide an expression like \"10/3\" and the function would return 3.3333333333333335.",
            "parameters" : {
                "type" : "object",
                "properties" : {
                    "expression" : {
                        "type" : "string",
                        "description" : "The arithmetic expression that you want to be evaluated."
                    }
                },
                "required" : ["expression"],
                "additionalProperties" : False
            }
        }
    },
]

Here are two functions that will be useful. If you're unsure why they're written in this way, then you should refer back to OpenAI's API documentation.

In [33]:
def tool_call_message(tool_call, content):
    return {
        "role" : "tool",
        "tool_call_id" : tool_call.id,
        "name" : tool_call.function.name,
        "content" : content
    }

def user_message(content):
    return {
        "role" : "user",
        "content" : content
    }

### Exercise - Implement an LM agent class using the arithmetic task above
```c
Difficulty: 🔴🔴🔴🔴⚪
Importance: 🔵🔵🔵🔵🔵

You should spend up to 25-30 minutes on this exercise.
```

Build a simple agent class

In [34]:
class simpleAgent:
    def __init__(self, task, model = "gpt-4o-mini", tools = tool_list):
        self.task = task 
        self.model = model
        self.tools = tools
        self.client = establish_client_OpenAI()


        self.user_message = "Calculate the result of the following expression: " + self.task.getCurrentTask() + "."
        self.answer_message = "Now provide your answer in the format Answer: NUMBER where NUMBER Is the answer to the question. Only output in this format."
        self.incorrect_message = "Incorrect."
        self.correct_message = "Correct."

        self.messages=[user_message(self.user_message)]

    @retry_with_exponential_backoff
    def get_response_with_tools(self):
        # Generate a response where the model can use tools and add the response to the messages
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
            tools = self.tools,
            tool_choice = "auto"
        )
        self.messages.append(response.choices[0].message)
        return response.choices[0].message

    def get_response_no_tools(self):
        # Generate a response where the model cannot use tools and add the response to the messages
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
        )
        self.messages.append(response.choices[0].message)
        return response.choices[0].message
        
    def do_tool_calls(self, message):
        # Do the tool calls and return the response and add the tool calls to the messages
        tool_calls = message.tool_calls
        if tool_calls:
            for tool_call in tool_calls:
                if tool_call.function.name == "calculate":
                    func = getattr(self.task, tool_call.function.name)
                    arguments = json.loads(tool_call.function.arguments)
                    tool_response = func(**arguments)
                    self.messages.append(tool_call_message(tool_call, str(tool_response)))
            return tool_response
        
    def output_and_check_answer(self):
        '''
        Get the model to output a final answer and then check if it is correct, and update the task. 
        Add the final answer to the messages, and tell the model if the answer is correct or incorrect.
        '''
        self.messages.append(user_message(self.answer_message))
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
        )
        self.messages.append(response.choices[0].message)
        if self.task.checkAnswer(str(response.choices[0].message.content)[8:]):
            self.messages.append(user_message(self.correct_message))
            self.task.update()
            self.user_message = "Calculate the result of the following expression: " + self.task.getCurrentTask() + "."
            self.messages.append(user_message(self.user_message))
        else:
            self.messages.append(user_message(self.incorrect_message))
        return response.choices[0].message
        
                    



    

### Exercise - Define an agent_loop which uses the agent class to solve the task
```c
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵🔵🔵🔵

You should spend up to 10-15 minutes on this exercise.
```

An agent is essentially just a loop, where you can look at what you've done in the past. Try implementing the agent_loop with and without tools, to compare how much better the model does when we give it tools.

In [35]:
task = arithmeticProblem(1500,1091)
agent = simpleAgent(task)

def agent_loop(numLoops = 10):
    for i in range(numLoops):
        if task.checkSolved():
            print("All tasks solved.")
        else:
            response = agent.get_response_with_tools()
            print(response.content)
            tool_response = agent.do_tool_calls(response)
            print(tool_response)
            response = agent.output_and_check_answer()
            print(response.content)

agent_loop()


None
1500+1091
2591
Answer: 2591
None
1500-1091
409
Answer: 409
None
1500*1091
1636500
Answer: 1636500
None
1500/1091
1.374885426214482
Answer: 1.374885426214482
None
15001091
15001091
Answer: 409
None
1500//1091
1
Answer: 1
All tasks solved.
All tasks solved.
All tasks solved.
All tasks solved.


In [36]:
for i in agent.messages:
    try: print(str(i.content)[0:])
    except: print(i["content"])
'''
print(task.operation_list[task.count_task])
print(task.solved_dict)
#print(str(agent.messages[5].content)[8:])
print(type(task.solved_dict.values()))
'''

Calculate the result of the following expression: 1500 + 1091.
None
2591
Now provide your answer in the format Answer: NUMBER where NUMBER Is the answer to the question. Only output in this format.
Answer: 2591
Correct.
Calculate the result of the following expression: 1500 - 1091.
None
409
Now provide your answer in the format Answer: NUMBER where NUMBER Is the answer to the question. Only output in this format.
Answer: 409
Correct.
Calculate the result of the following expression: 1500 * 1091.
None
1636500
Now provide your answer in the format Answer: NUMBER where NUMBER Is the answer to the question. Only output in this format.
Answer: 1636500
Correct.
Calculate the result of the following expression: 1500 / 1091.
None
1.374885426214482
Now provide your answer in the format Answer: NUMBER where NUMBER Is the answer to the question. Only output in this format.
Answer: 1.374885426214482
Correct.
Calculate the result of the following expression: 1500 % 1091.
None
15001091
Now provide y

'\nprint(task.operation_list[task.count_task])\nprint(task.solved_dict)\n#print(str(agent.messages[5].content)[8:])\nprint(type(task.solved_dict.values()))\n'

# 3️⃣ Building a More Complex Task

Now that we know how to do function calling, and have a rough understanding of what goes into designing an LM agent, we're going to build a more complicated task. This will enable us to see if we can elicit better capabilities from models.

The task we'll build and elicit behavior for will be the Wikipedia game. 

First of all, build a simple function to get a wikipedia page. Probably also mess around with wikipedia api

In [37]:
def get_page(title):
    return wikipedia.page(title,auto_suggest=False,redirect=True)

# Exercise - Build a class for the Wikipedia game
```c
Difficulty: 🔴🔴🔴🔴⚪
Importance: 🔵🔵🔵🔵⚪

You should spend up to 30-35 mins on this exercise.
```

Implement the wikipedia game class below. 

In [38]:
class WikipediaGame:
    def __init__(self, starting_page : str, goal_page : str, rules : list | type[None] = None):
        '''
        Initialises the wikipedia game object

        starting_page is the page the agent starts on.

        goal_page is the page the agent is trying to get to.

        rules is a list of dictionaries specifying any additional rules along with a description of that rule to be fed to the agent.

        Rules that are handled currently:
            - "no country" bans country articles.
            - "no backtrack" bans backtracking.
        '''
        self.page_title_history=[starting_page]
        self.starting_page = get_page(starting_page) # page the game starts on
        self.goal_page = get_page(goal_page) # page the game ends on
        self.rules = rules # any additional rules, (no countries, no articles above a certain length, no backtracking etc. Need to add functionality to deal with these which I haven't done yet)
        self.current_page = get_page(starting_page) # current page the game state is on.

    def get_page_summary(self, page : wikipedia.WikipediaPage):
        '''
        Get summary of a wikipedia page, to the last full stop within the first 500 characters.
        '''
        summary = page.content[0:500]
        return summary[0: summary.rindex(".")+1]

    def move_page(self, **args):
        '''
        Changes the current page of the game. To be used when we want to move from Page A to Page B
        '''
        new_page = args["new_page"]
        if self.is_permitted_link(new_page):
            self.current_page = get_page(new_page)
            self.page_title_history.append(self.current_page.title)
            return "Moving page to " + self.current_page.title
        elif self.is_permitted_link(new_page.replace("_"," ")):
            self.current_page = get_page(new_page.replace("_"," "))
            self.page_title_history.append(self.current_page.title)
            return "Moving page to " + self.current_page.title
        else:
            return "Couldn't move page to " + new_page

    def get_plain_content(self):
        return self.current_page.content

    def get_content(self,**args):
        '''
        Gives the content of the wikipedia page. Make sure that accessible links are wrapped with <link> </link> tags.
        '''
        content = self.current_page.content
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + " ", " " + f"<link>{word}</link>" + " ",content, flags = re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + ",", " " + f"<link>{word}</link>" + ",", content, flags=re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + ".", " " + f"<link>{word}</link>" + ".", content, flags=re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub("\(" + word + "\)", "(" + f"<link>{word}</link>" + ")", content, flags = re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + "s", " " + f"<link>{word}</link>" + "s", content, flags = re.I)
        return content

    def get_links(self):
        #Only certain links will be accessible, since the wikipedia api returns a list of ALL possible links (lots of which would not be included in the actual content of the wikipedia, and almost all of which would be banned according to most rules of wiki racing).
        all_links = self.current_page.links
        content = self.current_page.content
        permitted_links = []
        for i in all_links:
            if i in content:
                permitted_links.append(i)
        return permitted_links

    def is_permitted_link(self, link):
        if link in self.get_links():
            return True
        else:
            return False

    def check_win(self):
        if self.current_page==self.goal_page:
            return True
        else:
            return False

# Exercise - Write a list of tools for this game.
```c
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵🔵🔵⚪

You should spend up to 10-15 mins on this exercise.
```

Fill in the tools that the agent will need to use to accomplish this game. You should name the tools according to the names in the Game class above, so that when we do tool calling we can access the correct function more easily.

In [39]:
WikipediaGameTools = [
    {
        "type" : "function",
        "function" : {
            "name" : "get_content",
            "description" : "Get all the content for the wikipedia page you are currently on. Anything which corresponds to a link you can select will be wrapped in <link></link> tags.",
            "parameters" : {
                "type" : "object",
                "properties" : {},
                "required" : []
            }
        }
    },
    {
        "type" : "function",
        "function" : {
            "name" : "move_page",
            "description" : "Changes your current page to a specified new page which is accessible via a link from the current page. You can only call this function once at a time, as it will take you to a different page.",
            "parameters": {
                "type": "object",
                "properties": {
                    "new_page": {
                        "type": "string",
                        "description": "The title of the new page you want to move to. This should be formatted the way the title appears on wikipedia (e.g. to move to the page for the United States of America, you should enter \"United States\"). Underscores are not necessary.",
                    },
                },
                "required": ["new_page"]
            }
        }
    }

]

# Exercise - Build an agent class for the wikipedia racer
```c
🔴🔴🔴🔴⚪
🔵🔵🔵🔵⚪

You should spend up to 30-40 mins on this exercise.
```

Now that you have the `WikipediaGame` class and the tooling set up, build out a `WikipediaRacingAgent` that can access these tools and solve the wikipedia game. Build the agent so that it can be thrown into an agent loop (similar to the one we had for the arithmetic game) without much additional scaffolding. 

There are a few further considerations in this case that we didn't have for the arithmetic game. 

- Since the agent will need to read (potentially very long) wikipedia articles to interact with the game, the context window length becomes relevant. GPT-4o and GPT-4o-mini both have context windows of 128k tokens (which corresponds to ~96k words, but for reference, the wikipedia page for the United States has around 10k words alone and the agent will often need to visit more than 10 articles in one run of the game, not counting its own output, which eventually adds up to be significant). The way we'll solve this for now is by resetting the messages of the agent every time it reaches a new wikipedia page, and providing an updated user and system message so that the agent can locate itself, and then proceed (We'll address different methods for solving this issue later, you can probably already think of some). So be careful to include the current page and goal page for the agent.

- The goal page shouldn't be anything the agent is completely unfamiliar with (do you believe there's even a semicolon on wikipedia that OpenAI, Anthropic and Google haven't scraped?), but it may be easily confused with something else. So you should use the game's get_summary function to provide details of the goal page to the agent.




In [40]:
class WikipediaRacingAgent:
    def __init__(self, wikigame, model = "gpt-4o-mini", tools = WikipediaGameTools):
        # Initialises the agent
        self.game=wikigame
        self.model = model
        self.client = OpenAI()
        self.system_message = {
                "role" : "system", 
                "content" : "You are a wikipedia-racing AI. Your goal is to reach " + self.game.goal_page.title + " by accessing links from a series of wikipedia pages. Your current page is " + self.game.current_page.title + "."
            }
        self.user_message={
                "role" : "user",
                "content" : "You are currently on page: " + self.game.current_page.title + ". Make sure you start by reasoning about what steps you should take to get to the article on " + self.game.goal_page.title + ". When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, " + self.game.goal_page.title + " has the following summary: \n\n[Begin Summary]\n" + self.game.get_page_summary(self.game.goal_page) + "\n[End Summary] " 
            }
        self.messages = [self.system_message, self.user_message] # Messages that are currently in the chat history.
        self.all_messages = [self.system_message, self.user_message] # All messages that have been sent in the chat history. We have to erase each time a new page is reached for context window reasons.
        self.tools = tools
        self.response=None

        #print starting messages to watch what the agent is doing
        print("\nSYSTEM: " + self.system_message["content"] + "\n\nUSER: " + self.user_message["content"])
    
    @retry_with_exponential_backoff
    def generate_response_with_tools(self):
        # Generate a response with tools and add it to the messages.
        
        new_response = self.client.chat.completions.create(
            model = self.model,
            messages = self.messages,
            tools = self.tools,
            tool_choice = "auto"
        )

        self.response = new_response.choices[0].message
        self.messages.append(self.response)
        self.all_messages.append(self.response)

        #Print message to watch what the agent is doing
        print("\n" + str(self.response.content))

        return self.response

    def do_tool_calls(self, output):
        # Do the tool calls and return the response and add the tool calls to the messages
        tool_calls = output.tool_calls
        move_page_count = 0
        if tool_calls:
            print(tool_calls)
            for tool_call in tool_calls:
                func = getattr(self.game, tool_call.function.name)
                arguments = json.loads(tool_call.function.arguments)
                new_response = func(**arguments)
                self.messages.append(tool_call_message(tool_call,new_response))
                self.all_messages.append(tool_call_message(tool_call,new_response))
                
                #Print message to watch what the agent is doing
                print("\nTOOL CALL" + tool_call.function.name, arguments)
                print("\nTOOL RESPONSE:" + new_response[:300])

                if tool_call.function.name == "move_page" and "Moving page" in new_response:
                    self.new_state()

                    # TODO: Absolutely NEED to fix this goddamn moving page issue. Really need to think of a non-awful way to solve this problem. (pop things from a list, maybe just learn more about how OpenAI stores these lists)

                    return "new state"
                '''
                elif tool_call.function.name=="move_page" and "Couldn't" in new_response:
                    self.messages.append(user_message("That is not a valid link."))
                    self.all_messages.append(user_message("That is not a valid link."))
                '''
                
            self.messages.append(user_message("What's your next step to get to "+ self.game.goal_page.title + "?"))
            self.all_messages.append(user_message("What's your next step to get to "+ self.game.goal_page.title + "?"))
            
            #Print message to watch what the agent is doing
            print("\nUSER: What's your next step to get to " + self.game.goal_page.title + "?")
                
                    

    def new_state(self):
        # Begin a new state when a new page is visited. Updat user_message and system_message, and reset self.messages (for context window reasons).

        self.system_message = {
                "role" : "system", 
                "content" : "You are a wikipedia-racing AI. Your goal is to reach " + self.game.goal_page.title + " by accessing links from a series of wikipedia pages. Your current page is " + self.game.current_page.title + "."
            }
        self.user_message={
                "role" : "user",
                "content" : "You are currently on page: " + self.game.current_page.title + ". Make sure you start by reasoning about what steps you should take to get to the article on " + self.game.goal_page.title + ". When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure " + self.game.goal_page.title + " has the following summary: \n\n [Begin Summary] \n" + self.game.get_page_summary(self.game.goal_page) + "\n[End Summary]"
            }
        self.messages = [self.system_message,self.user_message]
        self.all_messages.extend([self.system_message,self.user_message])

        #Print message to watch what the agent is doing
        print(("-" * 50) + "\n\nNEW STATE\n\n" + "HISTORY: " + " -> ".join(game.page_title_history) + "\n\n" + ("-"*50))
        print("SYSTEM: " + self.system_message["content"] + "\n\nUSER: " + self.user_message["content"])

In [41]:
game = WikipediaGame("Aristotle", "GPT-4")
agent = WikipediaRacingAgent(game,model="gpt-4o-mini")
def agent_loop(agent, game, num_loops = 10):
    for i in range(num_loops):
        if game.check_win():
            print("Success")
            return ""
        response = agent.generate_response_with_tools()
        if agent.do_tool_calls(response) == "new state":
            new_state_string = ("-" * 50) + "\n\nNEW STATE\n\n" + " -> ".join(game.page_title_history) + "\n\n" + ("-"*50)
    


SYSTEM: You are a wikipedia-racing AI. Your goal is to reach GPT-4 by accessing links from a series of wikipedia pages. Your current page is Aristotle.

USER: You are currently on page: Aristotle. Make sure you start by reasoning about what steps you should take to get to the article on GPT-4. When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, GPT-4 has the following summary: 

[Begin Summary]
Generative Pre-trained Transformer 4 (GPT-4) is a multimodal large language model created by OpenAI, and the fourth in its series of GPT foundation models. It was launched on March 14, 2023, and made publicly available via the paid chatbot product ChatGPT Plus, via OpenAI's API, and via the free chatbot Microsoft Copilot.  As a transformer-based model, GPT-4 uses a paradigm where pre-training using both public data and "data licensed from third-party

In [45]:
game = WikipediaGame("1511 Daléra", "Zalíbená")
agent = WikipediaRacingAgent(game,model="gpt-4o-mini")
agent_loop(agent, game, 40)


SYSTEM: You are a wikipedia-racing AI. Your goal is to reach Zalíbená by accessing links from a series of wikipedia pages. Your current page is 1511 Daléra.

USER: You are currently on page: 1511 Daléra. Make sure you start by reasoning about what steps you should take to get to the article on Zalíbená. When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, Zalíbená has the following summary: 

[Begin Summary]
Zalíbená is a village and administrative part of Podveky in Kutná Hora District in the Central Bohemian Region of the Czech Republic. It has about 20 inhabitants.
[End Summary] 

None
[ChatCompletionMessageToolCall(id='call_KGVbbPwLMwKdxgQW38snSpk4', function=Function(arguments='{}', name='get_content'), type='function')]

TOOL CALLget_content {}

TOOL RESPONSE:1511 Daléra, provisional designation 1939 FB, is an <link>Asteroid</link> fro

KeyboardInterrupt: 

In [None]:

for message in agent.messages:
    try: print(message)
    except: print(message)
'''
print(agent.all_messages[2].tool_calls)
print(len([x for x in agent.all_messages[2].tool_calls if x.function.name=="move_page"]))
agent.all_messages[2]
'''

In [None]:
x = get_page("The Sinclair's Mysteries")
print(x.content)
print("\n\n\n\n\n\n\n\n\n\n\n")
print(x.html())
print("\n\n\n\n\n\n\n\n\n\n\n")
a = get_page("Edwardian")
print(a.html())

#Maybe I can just extract links. Extract <a href = "link_name">text</a> then pair blah blah blah and link_name together. then go through content and wrap <link = "link_name">text</link>



# 4️⃣ Elicitation

In [None]:
class WikipediaGame:
    def __init__(self, starting_page : str, goal_page : str, rules : list | type[None] = None):
        '''
        Initialises the wikipedia game object

        starting_page is the page the agent starts on.

        goal_page is the page the agent is trying to get to.

        rules is a list of dictionaries specifying any additional rules along with a description of that rule to be fed to the agent.

        Rules that are handled currently:
            - "no country" bans country articles.
            - "no backtrack" bans backtracking.
        '''
        self.page_title_history=[starting_page]
        self.starting_page = get_page(starting_page) # page the game starts on
        self.goal_page = get_page(goal_page) # page the game ends on
        self.rules = rules # any additional rules, (no countries, no articles above a certain length, no backtracking etc. Need to add functionality to deal with these which I haven't done yet)
        self.current_page = get_page(starting_page) # current page the game state is on.

    def get_page_summary(self, page : wikipedia.WikipediaPage):
        '''
        Get summary of a wikipedia page, to the last full stop within the first 500 characters.
        '''
        summary = page.content[0:500]
        return summary[0: summary.rindex(".")+1]

    def move_page(self, **args):
        '''
        Changes the current page of the game. To be used when we want to move from Page A to Page B
        '''
        new_page = args["new_page"]
        if self.is_permitted_link(new_page):
            self.current_page = get_page(new_page)
            self.page_title_history.append(self.current_page.title)
            return "Moving page to " + self.current_page.title
        elif self.is_permitted_link(new_page.replace("_"," ")):
            self.current_page = get_page(new_page.replace("_"," "))
            self.page_title_history.append(self.current_page.title)
            return "Moving page to " + self.current_page.title
        else:
            return "Couldn't move page to " + new_page

    def get_plain_content(self):
        return self.current_page.content

    def get_content(self,**args):
        '''
        Gives the content of the wikipedia page. Make sure that accessible links are wrapped with <link> </link> tags.
        '''
        content = self.current_page.content
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + " ", " " + f"<link>{word}</link>" + " ",content, flags = re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + ",", " " + f"<link>{word}</link>" + ",", content, flags=re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + ".", " " + f"<link>{word}</link>" + ".", content, flags=re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub("\(" + word + "\)", "(" + f"<link>{word}</link>" + ")", content, flags = re.I)
        for word in sorted(self.get_links(), key=len, reverse = True):
            content = re.sub(" " + word + "s", " " + f"<link>{word}</link>" + "s", content, flags = re.I)
        return content

    def get_links(self):
        #Only certain links will be accessible, since the wikipedia api returns a list of ALL possible links (lots of which would not be included in the actual content of the wikipedia, and almost all of which would be banned according to most rules of wiki racing).
        all_links = self.current_page.links
        content = self.current_page.content
        permitted_links = []
        for i in all_links:
            if i in content:
                permitted_links.append(i)
        return permitted_links

    def is_permitted_link(self, link):
        if link in self.get_links():
            return True
        else:
            return False

    def check_win(self):
        if self.current_page==self.goal_page:
            return True
        else:
            return False

In [None]:
class WikipediaRacingAgent:
    def __init__(self, wikigame, model = "gpt-4o-mini", tools = WikipediaGameTools):
        # Initialises the agent
        self.game=wikigame
        self.model = model
        self.client = OpenAI()
        self.system_message = {
                "role" : "system", 
                "content" : "You are a wikipedia-racing AI. Your goal is to reach " + self.game.goal_page.title + " by accessing links from a series of wikipedia pages. Your current page is " + self.game.current_page.title + "."
            }
        self.user_message={
                "role" : "user",
                "content" : "You are currently on page: " + self.game.current_page.title + ". Make sure you start by reasoning about what steps you should take to get to the article on " + self.game.goal_page.title + ". When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, " + self.game.goal_page.title + " has the following summary: \n\n[Begin Summary]\n" + self.game.get_page_summary(self.game.goal_page) + "\n[End Summary] " 
            }
        self.messages = [self.system_message, self.user_message] # Messages that are currently in the chat history.
        self.all_messages = [self.system_message, self.user_message] # All messages that have been sent in the chat history. We have to erase each time a new page is reached for context window reasons.
        self.tools = tools
        self.response=None

        #print starting messages to watch what the agent is doing
        print("\nSYSTEM: " + self.system_message["content"] + "\n\nUSER: " + self.user_message["content"])
    
    @retry_with_exponential_backoff
    def generate_response_with_tools(self):
        # Generate a response with tools and add it to the messages.
        
        new_response = self.client.chat.completions.create(
            model = self.model,
            messages = self.messages,
            tools = self.tools,
            tool_choice = "auto"
        )

        self.response = new_response.choices[0].message
        self.messages.append(self.response)
        self.all_messages.append(self.response)

        #Print message to watch what the agent is doing
        print("\n" + str(self.response.content))

        return self.response

    def do_tool_calls(self, output):
        # Do the tool calls and return the response and add the tool calls to the messages
        tool_calls = output.tool_calls
        move_page_count = 0
        if tool_calls:
            print(tool_calls)
            for tool_call in tool_calls:
                func = getattr(self.game, tool_call.function.name)
                arguments = json.loads(tool_call.function.arguments)
                new_response = func(**arguments)
                self.messages.append(tool_call_message(tool_call,new_response))
                self.all_messages.append(tool_call_message(tool_call,new_response))
                
                #Print message to watch what the agent is doing
                print("\nTOOL CALL" + tool_call.function.name, arguments)
                print("\nTOOL RESPONSE:" + new_response[:300])

                if tool_call.function.name == "move_page" and "Moving page" in new_response:
                    self.new_state()

                    # TODO: Absolutely NEED to fix this goddamn moving page issue. Really need to think of a non-awful way to solve this problem. (pop things from a list, maybe just learn more about how OpenAI stores these lists)

                    return "new state"
                
            self.messages.append(user_message("What's your next step to get to "+ self.game.goal_page.title + "?"))
            self.all_messages.append(user_message("What's your next step to get to "+ self.game.goal_page.title + "?"))
            
            #Print message to watch what the agent is doing
            print("\nUSER: What's your next step to get to " + self.game.goal_page.title + "?")
                
                    

    def new_state(self):
        # Begin a new state when a new page is visited. Updat user_message and system_message, and reset self.messages (for context window reasons).

        self.system_message = {
                "role" : "system", 
                "content" : "You are a wikipedia-racing AI. Your goal is to reach " + self.game.goal_page.title + " by accessing links from a series of wikipedia pages. Your current page is " + self.game.current_page.title + "."
            }
        self.user_message={
                "role" : "user",
                "content" : "You are currently on page: " + self.game.current_page.title + ". Make sure you start by reasoning about what steps you should take to get to the article on " + self.game.goal_page.title + ". When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure " + self.game.goal_page.title + " has the following summary: \n\n [Begin Summary] \n" + self.game.get_page_summary(self.game.goal_page) + "\n[End Summary]"
            }
        self.messages = [self.system_message,self.user_message]
        self.all_messages.extend([self.system_message,self.user_message])

        #Print message to watch what the agent is doing
        print(("-" * 50) + "\n\nNEW STATE\n\n" + "HISTORY: " + " -> ".join(game.page_title_history) + "\n\n" + ("-"*50))
        print("SYSTEM: " + self.system_message["content"] + "\n\nUSER: " + self.user_message["content"])

## Prompting

We've disallowed the agent to view its past history and plans in detail. We had to do this because of context length considerations, but there's another solution: Replace the output of prior wikipedia pages so it doesn't take up space in the context window.



## Scaffolding

We can improve the scaffolding by:
- Telling the agent its history.
- Giving the agent a "reflexion tool" (link reflexion paper)
- The prompts are also technically a part of the "scaffolding"
- We can tell the agent if it's already visited a page.

## Tool use (might be worth experimenting with this somewhat anyway)

We could also give the agent additional tools (although remember)

# 5️⃣ Results