# **AI_Agents_Course Module_3**

[Link to the Course](https://www.coursera.org/learn/ai-agents-python)

**Instructor: Dr. Jules White**

**Codes Edited by: Houshyar Jafari Asl**

"""
OLLAMA LLM IN COLAB (LOCAL ALTERNATIVE TO OPENAI)

This code sets up a free, local LLM (Llama3) in Google Colab using:
1. Ollama - Runs the model locally
2. LiteLLM - Provides OpenAI-like API interface

HOW IT WORKS:
1. First run: Downloads Llama3 (4.7GB, one-time)
2. Starts Ollama server with CPU
3. Uses LiteLLM to send/receive messages like OpenAI API

ADVANTAGES:
- No API keys needed
- Free to use
- Works offline after setup

NOTE:
- Colab may disconnect after ~1 hour
- Responses slightly slower than GPT-4
"""

# The GAME Framework: Designing AI Agents

The starting point of an agent should be thinking through its design. While much of our focus has been on implementing code, taking a step back to structure an agent’s architecture before writing a single line is crucial. The GAME framework provides a methodology for systematically defining an agent’s goals, actions, memory, and environment, allowing us to approach the design in a logical and modular fashion. By thinking through how these components interact within the agent loop, we can sketch out the agent’s behavior and dependencies before diving into code implementation. This structured approach not only improves clarity but also makes the transition from design to coding significantly smoother and more efficient.

The **GAME** framework provides a structured way to design AI agents, ensuring modularity and adaptability. It breaks agent design into four essential components:

* **G - Goals / Instructions**: What the agent is trying to accomplish and its instructions on how to try to achieve its goals.
* **A - Actions**: The tools the agent can use to achieve its goals.
* **M - Memory**: How the agent retains information across interactions, which determines what information it will have available in each iteration of the agent loop.
* **E - Environment**: The agent’s interface to the external world where it executes actions and gets feedback on the results of those actions.

Goals and instructions are grouped together under “G” because they work in tandem to shape the agent’s behavior. Goals specify what the agent is trying to achieve, serving as the high-level objectives that define the desired outcomes of the agent’s operation. Instructions, on the other hand, provide the how, detailing the specific steps, strategies, and constraints that guide the agent toward fulfilling its goals effectively. Together, they form the foundation that ensures the agent not only understands its purpose but also follows a structured approach to accomplishing its tasks.

One important discussion is the relationship between Actions and the Environment. Actions define **what** the agent can do—they are abstract descriptions of potential choices available to the agent. The Environment, on the other hand, determines **how** those actions are carried out, providing concrete implementations that execute within the real-world context of the agent. This distinction allows us to separate high-level decision-making from the execution details, making the agent more modular and adaptable.

You can think of Actions as an “interface” specifying the available capabilities, while the Environment acts as the “implementation” that brings those capabilities to life. For example, an agent might have an action called `read_file()`, which is simply a placeholder in the Actions layer. The Environment then provides the actual logic, handling file I/O operations and error handling to ensure the action is executed correctly. This separation ensures flexibility—agents can be designed to operate across different environments by simply swapping out implementations while keeping their decision logic intact.
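A minimal sketch of this separation, assuming hypothetical names (`ActionSpec`, `FileEnvironment`, `LocalDiskEnvironment`, `InMemoryEnvironment`, `run_read_file`) that are not part of the course framework: the action is only a description, and two interchangeable environments supply the behavior.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ActionSpec:
    """Abstract description of a capability: the 'what' (hypothetical name)."""
    name: str
    description: str


class FileEnvironment(Protocol):
    """Anything that can carry out read_file: the 'how' (hypothetical name)."""
    def read_file(self, file_name: str) -> str: ...


class LocalDiskEnvironment:
    """Implementation backed by the real file system."""
    def read_file(self, file_name: str) -> str:
        with open(file_name, "r", encoding="utf-8") as f:
            return f.read()


class InMemoryEnvironment:
    """Implementation backed by a dict, handy for tests and simulations."""
    def __init__(self, files: Dict[str, str]):
        self.files = files

    def read_file(self, file_name: str) -> str:
        if file_name not in self.files:
            raise FileNotFoundError(file_name)
        return self.files[file_name]


READ_FILE = ActionSpec(name="read_file", description="Read the contents of a file")


def run_read_file(env: FileEnvironment, file_name: str) -> str:
    """The caller only knows the action name; the environment does the work."""
    return getattr(env, READ_FILE.name)(file_name)


# Swapping the environment changes how the action runs, not what it means.
fake_env = InMemoryEnvironment({"main.py": "print('hello')"})
content = run_read_file(fake_env, "main.py")
```

Passing a `LocalDiskEnvironment()` instead would read from disk without any change to `run_read_file`.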

**Motivating Example: The Proactive Coder**

To illustrate how the GAME framework applies in practice, consider an AI agent designed to proactively enhance a codebase. This **Proactive Coder** agent will scan a repository, analyze patterns in the code, and propose potential new features that it could implement with a small number of changes. If the user approves a feature, the agent will generate the initial implementation and suggest refinements.

Using the GAME framework, we break down the agent design:

* Goals:
  * Goals (What to achieve):
    * Identify potential enhancements
    * Make sure that the enhancements are helpful and relevant
    * Make sure that the enhancements are small and self-contained so that they can be implemented by the agent with minimal risk
    * Ensure that the changes do not break existing interfaces
    * Ensure that the agent only implements features that the user agrees to
  * Instructions (How to achieve it):
    * Pick a random file in the code base and read through it
    * Read some related files to the original file
    * Read at most 5 files
    * Propose three feature ideas that are implementable in 2-3 functions and require minimal editing of the existing code
    * Ask the user to select which feature to implement
    * List the files that will need to be edited and provide a list of proposed changes for each
    * Go file by file implementing the changes until they are all edited
* Actions:
  * List project files
  * Read project file
  * Ask user to select a feature
  * Edit project file
* Memory:
  * We will use a simple conversational memory and store the complete contents of files in the conversation for reference

# Simulating GAME Agents in a Conversation

**Testing Agent Designs Through Conversation Simulation**

Before we write a single line of code for our agent, we should test whether our GAME design is actually feasible. One powerful technique is to simulate the agent’s decision-making process through conversation with an LLM in a chat interface (e.g., ChatGPT). This approach helps us identify potential problems with our design early, when they’re easiest to fix. Let’s explore how to conduct these simulations effectively.

**Why Simulate First?**

Think of agent simulation like a dress rehearsal for a play. Before investing in costumes and sets, you want to make sure the script makes sense and the actors can perform their roles effectively. Similarly, before implementing an agent, we want to verify that:

1. The goals are achievable with the planned actions
2. The memory requirements are reasonable
3. The actions available are sufficient to solve the problem
4. The agent can make appropriate decisions with the available information

**Setting Up Your Simulation**

When starting a conversation with an LLM to simulate your agent, begin by establishing the framework. We can do this with a simple prompt in a chat interface. The prompt should clearly outline the agent’s goals, actions, and the simulation process. Here’s a template you can use:

In [None]:
I'd like to simulate an AI agent that I'm designing. The agent will be built using these components:

Goals: [List your goals]
Actions: [List available actions]

At each step, your output must be an action to take.

Stop and wait and I will type in the result of
the action as my next message.

Ask me for the first task to perform.

For a Proactive Coder agent, you might use the following prompt to kick off a simulation in ChatGPT:

In [None]:
I'd like to simulate an AI agent that I'm designing. The agent will be built using these components:

Goals:
* Find potential code enhancements
* Ensure changes are small and self-contained
* Get user approval before making changes
* Maintain existing interfaces

Actions available:
* list_project_files()
* read_project_file(filename)
* ask_user_approval(proposal)
* edit_project_file(filename, changes)

At each step, your output must be an action to take.

Stop and wait and I will type in the result of
the action as my next message.

Ask me for the first task to perform.

Take a moment to open up ChatGPT and try out this prompt. You can use the same prompt in any chat interface that supports LLMs. What worked? What didn’t?
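If you expect to run many simulations with slightly different designs, it can help to generate the prompt from the design instead of editing it by hand. The helper below is a small sketch; the function name `build_simulation_prompt` is an assumption, and the wording simply mirrors the template above.

```python
from typing import List


def build_simulation_prompt(goals: List[str], actions: List[str]) -> str:
    """Fill the simulation template with a design's goals and actions."""
    goal_lines = "\n".join(f"* {g}" for g in goals)
    action_lines = "\n".join(f"* {a}" for a in actions)
    return (
        "I'd like to simulate an AI agent that I'm designing. "
        "The agent will be built using these components:\n\n"
        f"Goals:\n{goal_lines}\n\n"
        f"Actions available:\n{action_lines}\n\n"
        "At each step, your output must be an action to take.\n\n"
        "Stop and wait and I will type in the result of\n"
        "the action as my next message.\n\n"
        "Ask me for the first task to perform."
    )


prompt = build_simulation_prompt(
    goals=[
        "Find potential code enhancements",
        "Ensure changes are small and self-contained",
        "Get user approval before making changes",
        "Maintain existing interfaces",
    ],
    actions=[
        "list_project_files()",
        "read_project_file(filename)",
        "ask_user_approval(proposal)",
        "edit_project_file(filename, changes)",
    ],
)
print(prompt)
```

Changing one goal or action and regenerating the prompt makes it easy to compare how the simulated agent behaves across design variants.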

**Understanding Agent Reasoning**

When you begin simulating your agent’s behavior, you’re essentially conducting a series of experiments to understand how well it can reason with the tools and goals you’ve provided. Start by presenting a simple scenario – perhaps a small Python project with just a few files. Watch how the agent approaches the task. Does it immediately jump to reading files, or does it first list the available files to get an overview? These initial decisions reveal a lot about whether your goals and actions enable systematic problem-solving.

As you observe the agent’s decisions, you’ll notice that the way you present information significantly impacts its reasoning. For instance, when you return the results of list_project_files(), you might first try returning just the filenames:

In [None]:
["main.py", "utils.py", "data_processor.py"]

Then experiment with providing more context:

In [None]:
{
    "files": ["main.py", "utils.py", "data_processor.py"],
    "total_files": 3,
    "directory": "/project"
}

You might discover that the additional metadata helps the agent make more informed decisions about which files to examine next. This kind of experimentation with result formats helps you understand how much context your agent needs to reason effectively.

**Evolving Your Tools and Goals**

The simulation process often reveals that your initial tool descriptions aren’t as clear as you thought. For example, you might start with a simple description for read_project_file():

In [None]:
read_project_file(filename) -> Returns the content of the specified file

Through simulation, you might find the agent using it incorrectly, leading you to enhance the description:

In [None]:
read_project_file(filename) -> Returns the content of a Python file from the project directory.
The filename should be one previously returned by list_project_files().

Similarly, your goals might evolve. You might start with “Find potential code enhancements” but discover through simulation that the agent needs more specific guidance. This might lead you to refine the goal to “Identify opportunities to improve error handling and input validation in functions.”

**Understanding Memory Through Chat**

One of the most enlightening aspects of simulation is realizing that the chat format naturally mimics the list-based memory we use in our agent loop. Each exchange between you and the LLM represents an iteration of the agent loop and a new memory entry: the agent’s actions and the environment’s responses accumulate just as they would in our implemented memory system. This helps you gauge how much history the agent can accumulate while still maintaining context and making good decisions.
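To make the parallel concrete, here is a rough sketch of a simulated chat written out as a list of memory entries. The helper `record_iteration` and the sample contents are illustrative assumptions; the `type`/`content` keys follow the message format used elsewhere in this module.

```python
import json
from typing import Dict, List

# The first entry is the task, just like the first message typed into the chat.
memory: List[Dict[str, str]] = [
    {"type": "user", "content": "Find a small enhancement for this project."},
]


def record_iteration(memory: List[Dict[str, str]],
                     agent_action: str,
                     environment_result: dict) -> None:
    """One loop iteration adds two entries: the agent's action and the result."""
    memory.append({"type": "assistant", "content": agent_action})
    memory.append({"type": "user", "content": json.dumps(environment_result)})


record_iteration(memory, "list_project_files()",
                 {"files": ["main.py", "utils.py", "data_processor.py"]})
record_iteration(memory, 'read_project_file("utils.py")',
                 {"content": "def helper(): pass"})

# A crude size estimate shows how quickly history grows when file contents are stored.
total_chars = sum(len(m["content"]) for m in memory)
```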

**Learning from Failures**

Introducing controlled chaos into your simulation provides valuable insights. Try returning error messages instead of successful results:

In [None]:
{"error": "FileNotFoundError: main.py does not exist"}

Or return malformed data:

In [None]:
{"cont3nt": "def broken_func(): pass"}

Watch how the agent handles these situations. Does it try alternative approaches? Does it give up too easily? Does it maintain its goal focus despite errors? These observations help you design better error handling and recovery strategies.

**Preventing Runaway Agents**

The simulation environment provides a safe space to test termination conditions. You can experiment with different criteria for when the agent should conclude its task. Perhaps it should stop after examining a certain number of files, or after making a specific number of improvement suggestions. The chat format lets you quickly try different approaches without worrying about infinite loops or resource consumption.
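A stopping rule like the ones described here could be sketched as follows. The function name `should_stop` and the default limits are assumptions chosen for illustration, and actions are represented as the plain strings the simulated agent would output.

```python
from typing import List


def should_stop(actions_taken: List[str],
                max_files_read: int = 5,
                max_steps: int = 20) -> bool:
    """Stop once the agent has read enough files or taken too many steps."""
    files_read = sum(1 for a in actions_taken if a.startswith("read_project_file"))
    return files_read >= max_files_read or len(actions_taken) >= max_steps


# One listing followed by five file reads hits the file budget.
history = ["list_project_files()"] + [
    f'read_project_file("f{i}.py")' for i in range(5)
]
stop_now = should_stop(history)
stop_early = should_stop(history[:3])
```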

**Rapid Iteration and Improvement**

The true power of simulation lies in its speed. You can test dozens of scenarios in the time it would take to implement a single feature. Want to see how the agent handles a project with 100 files? Just tell it that’s what list_project_files() returned. Curious about how it would handle deeply nested function calls? Paste in some complex code and see how it analyzes it.

**Learning from the Agent**

At the end of your simulation sessions, ask the agent to reflect on its experience. What tools did it wish it had? Were any instructions unclear? Which goals were too vague? The LLM can often provide surprisingly insightful suggestions about how to improve your GAME design.

For example, the agent might suggest: “The ask_user_approval() action would be more effective if it could include code snippets showing the proposed changes. This would help users make more informed decisions about the suggested improvements.”

**Building Your Example Library**

As you conduct these simulations, you’re building a valuable library of examples. When you see the agent make a particularly good decision, save that exchange. When it makes a poor choice, save that too. These examples become invaluable when you move to implementation – they can be used to craft better prompts and test cases.

Keep a record of exchanges like this:

Good Example:

In [None]:
Agent: "Before modifying utils.py, I should read its contents to understand the current error handling patterns."
Action: read_project_file("utils.py")
Result: [file contents]
Agent: "I notice these functions lack input validation. I'll propose focused improvements for each function."

Poor Example:

In [None]:
Agent: "I'll start editing all the files to add error handling."
Action: edit_project_file("utils.py", {...})
[Missing analysis and user approval steps]

These examples help you understand what patterns to encourage or discourage in your implemented agent.

Through this iterative process of simulation, observation, and refinement, you develop a deep understanding of how your agent will behave in the real world. This understanding is invaluable when you move to implementation, helping you build agents that are more robust, more capable, and better aligned with your goals.

Remember, the time spent in simulation is an investment that pays off in better design decisions and fewer implementation surprises. When you finally start coding, you’re not just hoping your design will work; you’ve already seen it work across the scenarios you simulated.

# Building a Simple Agent Framework 1

We are designing our agents in terms of GAME. Ideally, we would like our code to reflect how we design the agent, so that we can easily translate our design into an implementation. Also, we can see that the GAME components are what change from one agent to another while the core loop stays the same. We would like to design a framework that allows us to reuse as much as possible while making it easy to change the GAME pieces without affecting the GAME rules (e.g., the agent loop).

At first, it will appear that we are adding complexity to the agent — and we are. However, this complexity is necessary to create a framework that is flexible and reusable. The goal is to create a framework that allows us to build agents quickly and easily without changing the core loop. We are going to look at each of the individual GAME component implementations and then how they fit into the overall framework at the end.

**G - Goals Implementation**

First, let’s create a simple goal class that defines what our agent is trying to accomplish:

In [None]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str

Goals will describe what we are trying to achieve and how to achieve it. By encapsulating them into objects, we can move away from large “walls of text” that represent the instructions for our agent. Additionally, we can add priority to our goals, which will help us decide which goal to pursue first and how to sort or format them when combining them into a prompt.

We broadly use the term “goal” to encompass both “what” the agent is trying to achieve and “how” it should approach the task. This duality is crucial for guiding the agent’s behavior effectively. An important type of goal can be examples that show the agent how to reason in certain situations. We can also build goals that define core rules that are common across all agents in our system or that give it special instructions on how to solve certain types of tasks.
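As a rough illustration of how goal objects might be combined into a prompt, the sketch below sorts goals by priority and renders them as one block of text. The `format_goals` helper is hypothetical, and it assumes that a lower priority number means the goal should appear first; the framework itself does not fix that convention here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str


def format_goals(goals: List[Goal]) -> str:
    """Render goals as one prompt section, lowest priority number first (assumed order)."""
    ordered = sorted(goals, key=lambda g: g.priority)
    return "\n\n".join(f"{g.name}:\n{g.description}" for g in ordered)


goals = [
    Goal(priority=2, name="explain", description="Explain what you found in plain language."),
    Goal(priority=1, name="safety", description="Never edit a file without user approval."),
]
goal_text = format_goals(goals)
```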

Now, let’s take a look at how we might create a goal related to file management for our agent:

In [None]:
from game.core import Goal

# Define a simple file management goal
file_management_goal = Goal(
    priority=1,
    name="file_management",
    description="""Manage files in the current directory by:
    1. Listing files when needed
    2. Reading file contents when needed
    3. Searching within files when information is required
    4. Providing helpful explanations about file contents"""
)

**A - Actions Implementation with JSON Schemas**

Actions define what the agent can do. Think of them as the agent’s toolkit. Each action is a discrete capability that can be executed in the environment. The action system has two main parts: the Action class and the ActionRegistry.

The actions are the interface between our agent and its environment. These are descriptions of what the agent can do to affect the environment. We have previously built out actions using Python functions, but let’s encapsulate the parts of an action into an object:

In [None]:
from typing import Any, Callable, Dict

class Action:
    def __init__(self,
                 name: str,
                 function: Callable,
                 description: str,
                 parameters: Dict,
                 terminal: bool = False):
        self.name = name
        self.function = function
        self.description = description
        self.terminal = terminal
        self.parameters = parameters

    def execute(self, **args) -> Any:
        """Execute the action's function"""
        return self.function(**args)

At first, it may not appear that this is much different from the previous implementation. However, later, we will see that this makes it much easier to create different agents by simply swapping out the actions without having to modify the core loop.

When the agent provides a response, it is going to return JSON. However, we need a way to look up the actual object associated with the action indicated by the JSON. To do this, we will create an `ActionRegistry` that will allow us to register actions and look them up by name:

In [None]:
from typing import List, Optional

class ActionRegistry:
    def __init__(self):
        self.actions = {}

    def register(self, action: Action):
        self.actions[action.name] = action

    def get_action(self, name: str) -> Optional[Action]:
        """Return the action registered under this name, or None if unknown"""
        return self.actions.get(name, None)

    def get_actions(self) -> List[Action]:
        """Get all registered actions"""
        return list(self.actions.values())

Here is an example of how we might define some actions for a file management agent:

In [None]:
import os

def list_files() -> list:
    """List all files in the current directory."""
    return os.listdir('.')

def read_file(file_name: str) -> str:
    """Read and return the contents of a file."""
    with open(file_name, 'r') as f:
        return f.read()

def search_in_file(file_name: str, search_term: str) -> list:
    """Search for a term in a file and return matching lines."""
    results = []
    with open(file_name, 'r') as f:
        # Iterate lazily and number lines starting from 1
        for line_number, line in enumerate(f, start=1):
            if search_term in line:
                results.append((line_number, line.strip()))
    return results

# Create and populate the action registry
registry = ActionRegistry()

registry.register(Action(
    name="list_files",
    function=list_files,
    description="List all files in the current directory",
    parameters={
        "type": "object",
        "properties": {},
        "required": []
    },
    terminal=False
))

registry.register(Action(
    name="read_file",
    function=read_file,
    description="Read the contents of a specific file",
    parameters={
        "type": "object",
        "properties": {
            "file_name": {
                "type": "string",
                "description": "Name of the file to read"
            }
        },
        "required": ["file_name"]
    },
    terminal=False
))

registry.register(Action(
    name="search_in_file",
    function=search_in_file,
    description="Search for a term in a specific file",
    parameters={
        "type": "object",
        "properties": {
            "file_name": {
                "type": "string",
                "description": "Name of the file to search in"
            },
            "search_term": {
                "type": "string",
                "description": "Term to search for"
            }
        },
        "required": ["file_name", "search_term"]
    },
    terminal=False
))

**M - Memory Implementation**

Almost every agent needs to remember what happens from one loop iteration to the next. This is where the Memory component comes in. It allows the agent to store and retrieve information about its interactions, which is critical for context and decision-making. We can create a simple class to represent the memory:

In [None]:
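from typing import Dict, List, Optional

class Memory:
    def __init__(self):
        self.items = []  # Basic conversation history

    def add_memory(self, memory: dict):
        """Add memory to working memory"""
        self.items.append(memory)

    def get_memories(self, limit: Optional[int] = None) -> List[Dict]:
        """Get the last `limit` memories (all of them if limit is None)"""
        if limit is None:
            return list(self.items)
        # Slice from the end; guard limit <= 0 because items[-0:] would return everything
        return self.items[-limit:] if limit > 0 else []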

Originally, we just used a simple list of messages. Is it worth wrapping the list in this additional class? Yes, because it allows us to add additional functionality later without changing the core loop. For example, we might want to store the memory in a database and dynamically change what memories the agent sees at each loop iteration based on some analysis of the state of the memory. With this simple interface, we can create subclasses that implement different memory strategies without changing the core loop.

One thing to note is that our memory always has to be represented as a list of messages in the prompt. Because of this, we provide a simple interface to the memory that returns the last N messages in the correct format. This allows us to keep the memory class agnostic to how it is used. We can change how we store the memory (e.g., in a database) without changing how we access it in the agent loop. Even if we store the memory in a complicated graph structure, we are still going to need to pass the memories to the LLM as a list and format them as messages.
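As one example of a different memory strategy behind the same interface, the sketch below always keeps the original task visible while trimming older history. `PinnedTaskMemory` is a hypothetical subclass, and the `Memory` base shown here is a condensed stand-in with the same two methods, included only so the example runs on its own.

```python
from typing import Dict, List, Optional


class Memory:
    """Condensed stand-in for the Memory class (same two methods)."""
    def __init__(self):
        self.items: List[Dict] = []

    def add_memory(self, memory: Dict) -> None:
        self.items.append(memory)

    def get_memories(self, limit: Optional[int] = None) -> List[Dict]:
        if limit is None:
            return list(self.items)
        return self.items[-limit:] if limit > 0 else []


class PinnedTaskMemory(Memory):
    """Hypothetical strategy: return the first entry (the task) plus the newest entries."""
    def get_memories(self, limit: Optional[int] = None) -> List[Dict]:
        if limit is None or limit >= len(self.items):
            return list(self.items)
        if limit <= 0:
            return []
        if limit == 1:
            return [self.items[0]]
        # Keep the task, then fill the remaining slots with the most recent entries.
        return [self.items[0]] + self.items[-(limit - 1):]


memory = PinnedTaskMemory()
for i in range(5):
    memory.add_memory({"type": "user", "content": f"t{i}"})

window = [m["content"] for m in memory.get_memories(limit=3)]
```

Because the agent loop only calls `add_memory` and `get_memories`, it would not need to change to use a subclass like this.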

**E - Environment Implementation**

In our original implementation, we hardcoded our “environment” interface as a series of if/else statements and function calls. We would like to have a more modular interface that allows us to execute actions without needing to know how they are implemented or have conditional logic in the loop. This is where the Environment component comes in. It serves as a bridge between the agent and the outside world, executing actions and returning results.

In [None]:
import time
import traceback
from typing import Any

class Environment:
    def execute_action(self, action: Action, args: dict) -> dict:
        """Execute an action and return the result."""
        try:
            result = action.execute(**args)
            return self.format_result(result)
        except Exception as e:
            return {
                "tool_executed": False,
                "error": str(e),
                "traceback": traceback.format_exc()
            }

    def format_result(self, result: Any) -> dict:
        """Format the result with metadata."""
        return {
            "tool_executed": True,
            "result": result,
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z")
        }

# Building a Simple Agent Framework 2

Now, we are going to put the components together into a reusable agent class. This class will encapsulate the GAME components and provide a simple interface for running the agent loop. The agent will be responsible for constructing prompts, executing actions, and managing memory. We can create different agents simply by changing the goals, actions, and environment without modifying the core loop.

Let’s take a look at our agent class:

In [None]:
import json
from typing import Callable, List

class Agent:
    def __init__(self,
                 goals: List[Goal],
                 agent_language: AgentLanguage,
                 action_registry: ActionRegistry,
                 generate_response: Callable[[Prompt], str],
                 environment: Environment):
        """
        Initialize an agent with its core GAME components
        """
        self.goals = goals
        self.generate_response = generate_response
        self.agent_language = agent_language
        self.actions = action_registry
        self.environment = environment

    def construct_prompt(self, goals: List[Goal], memory: Memory, actions: ActionRegistry) -> Prompt:
        """Build prompt with memory context"""
        return self.agent_language.construct_prompt(
            actions=actions.get_actions(),
            environment=self.environment,
            goals=goals,
            memory=memory
        )

    def get_action(self, response):
        invocation = self.agent_language.parse_response(response)
        action = self.actions.get_action(invocation["tool"])
        return action, invocation

    def should_terminate(self, response: str) -> bool:
        action_def, _ = self.get_action(response)
        # An unknown tool name yields None; treat that as "do not terminate"
        return action_def is not None and action_def.terminal

    def set_current_task(self, memory: Memory, task: str):
        memory.add_memory({"type": "user", "content": task})

    def update_memory(self, memory: Memory, response: str, result: dict):
        """
        Update memory with the agent's decision and the environment's response.
        """
        new_memories = [
            {"type": "assistant", "content": response},
            {"type": "user", "content": json.dumps(result)}
        ]
        for m in new_memories:
            memory.add_memory(m)

    def prompt_llm_for_action(self, full_prompt: Prompt) -> str:
        response = self.generate_response(full_prompt)
        return response

    def run(self, user_input: str, memory=None, max_iterations: int = 50) -> Memory:
        """
        Execute the GAME loop for this agent with a maximum iteration limit.
        """
        memory = memory or Memory()
        self.set_current_task(memory, user_input)

        for _ in range(max_iterations):
            # Construct a prompt that includes the Goals, Actions, and the current Memory
            prompt = self.construct_prompt(self.goals, memory, self.actions)

            print("Agent thinking...")
            # Generate a response from the agent
            response = self.prompt_llm_for_action(prompt)
            print(f"Agent Decision: {response}")

            # Determine which action the agent wants to execute
            action, invocation = self.get_action(response)

            # Execute the action in the environment
            result = self.environment.execute_action(action, invocation["args"])
            print(f"Action Result: {result}")

            # Update the agent's memory with information about what happened
            self.update_memory(memory, response, result)

            # Check if the agent has decided to terminate
            if self.should_terminate(response):
                break

        return memory

Now, let’s walk through how the GAME components work together in this agent architecture, explaining each part of the agent loop.

**Step 1: Constructing the Prompt**

When the agent loop begins, it first constructs a prompt using the `construct_prompt` method:

In [None]:
def construct_prompt(self, goals: List[Goal], memory: Memory, actions: ActionRegistry) -> Prompt:
    """Build prompt with memory context"""
    return self.agent_language.construct_prompt(
        actions=actions.get_actions(),
        environment=self.environment,
        goals=goals,
        memory=memory
    )

This method leverages the `AgentLanguage` component to build a structured prompt containing:

* The agent’s goals (what it’s trying to accomplish)
* Available actions (tools the agent can use)
* Current memory context (conversation history and relevant information)
* Environment details (constraints and context for operation)

We are going to discuss the `AgentLanguage` in more detail later. For now, what you need to know is that it is responsible for formatting the prompt that is sent to the LLM and parsing the response from the LLM. Most of the time, we are going to use function calling, so the parsing will just be reading the returned tool calls. However, the `AgentLanguage` can be changed to allow us to also take the same agent and implement it without function calling.
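To give a feel for what such a component could look like without function calling, here is a rough sketch that asks the model for a JSON object and parses it. Everything in it is an assumption made for illustration: the `Prompt` container, the `JsonActionLanguage` class, the prompt wording, and the fallback to a `terminate` tool when the reply is not valid JSON.

```python
import json
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Dict, List


@dataclass
class Prompt:
    """Hypothetical prompt container: a list of chat messages."""
    messages: List[Dict[str, str]] = field(default_factory=list)


class JsonActionLanguage:
    """Hypothetical AgentLanguage that requests plain JSON instead of tool calls."""

    def construct_prompt(self, actions, environment, goals, memory) -> Prompt:
        goal_text = "\n".join(g.description for g in goals)
        tool_text = json.dumps(
            [{"name": a.name, "description": a.description, "args": a.parameters}
             for a in actions]
        )
        system = (
            f"Goals:\n{goal_text}\n\nAvailable tools:\n{tool_text}\n\n"
            'Respond only with JSON of the form {"tool": "<name>", "args": {...}}.'
        )
        history = [{"role": m["type"], "content": m["content"]}
                   for m in memory.get_memories()]
        return Prompt(messages=[{"role": "system", "content": system}] + history)

    def parse_response(self, response: str) -> Dict[str, Any]:
        try:
            parsed = json.loads(response)
            return {"tool": parsed["tool"], "args": parsed.get("args", {})}
        except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
            # Assumed fallback: treat unparseable output as a final message.
            return {"tool": "terminate", "args": {"message": response}}


# Lightweight stand-ins for Goal, Action, and Memory objects.
goals = [SimpleNamespace(description="Read files and answer questions about them.")]
actions = [SimpleNamespace(
    name="read_file",
    description="Read a file",
    parameters={"type": "object",
                "properties": {"file_name": {"type": "string"}},
                "required": ["file_name"]},
)]
memory = SimpleNamespace(
    get_memories=lambda: [{"type": "user", "content": "What is in main.py?"}]
)

language = JsonActionLanguage()
prompt = language.construct_prompt(actions=actions, environment=None,
                                   goals=goals, memory=memory)
invocation = language.parse_response(
    '{"tool": "read_file", "args": {"file_name": "main.py"}}'
)
```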

**Step 2: Generating a Response**

Next, the agent sends this prompt to the language model:

In [None]:
def prompt_llm_for_action(self, full_prompt: Prompt) -> str:
    response = self.generate_response(full_prompt)
    return response

The `generate_response` function is a simple Python function provided during initialization. This abstraction allows the framework to work with different language models without changing the core loop. We will use LiteLLM to call the LLM, but you could easily swap this out for any other LLM provider.
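A `generate_response` function built on LiteLLM might look roughly like the sketch below. The `Prompt` container, the factory `make_generate_response`, the `max_tokens` value, and the `fake_completion` stand-in are assumptions for illustration; the only LiteLLM call used is `litellm.completion(model=..., messages=...)`, imported lazily so the example can run without the library or a model server.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Callable, Dict, List, Optional


@dataclass
class Prompt:
    """Hypothetical prompt container: a list of chat messages."""
    messages: List[Dict[str, str]] = field(default_factory=list)


def make_generate_response(model: str,
                           completion_fn: Optional[Callable] = None
                           ) -> Callable[[Prompt], str]:
    """Return a generate_response function bound to one model name."""
    def generate_response(prompt: Prompt) -> str:
        fn = completion_fn
        if fn is None:
            # Imported lazily so the sketch does not require LiteLLM to be installed.
            from litellm import completion as fn
        response = fn(model=model, messages=prompt.messages, max_tokens=1024)
        return response.choices[0].message.content
    return generate_response


def fake_completion(model, messages, max_tokens):
    """Offline stand-in that mimics the shape of a completion response."""
    text = f"{model} saw {len(messages)} message(s)"
    return SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(content=text))]
    )


# With a local Ollama server, a model string such as "ollama/llama3" would be
# passed without completion_fn; here the fake keeps the example self-contained.
generate_response = make_generate_response("ollama/llama3",
                                           completion_fn=fake_completion)
reply = generate_response(Prompt(messages=[{"role": "user", "content": "hi"}]))
```

Because the agent only receives a callable, switching providers means passing a different `generate_response` and leaving the loop untouched.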

**Step 3: Parsing the Response**

Once the language model returns a response, the agent parses it to identify the intended action. The parsing will generally just extract the tool calls from the response; however, the agent language decides how this is done. Once the response is parsed, the agent can look up the action in the `ActionRegistry`:

In [None]:
def get_action(self, response):
    invocation = self.agent_language.parse_response(response)
    action = self.actions.get_action(invocation["tool"])
    return action, invocation

The `action` is the interface definition of what the agent “can” do. The `invocation` is the specific parameters that the agent has chosen to use for this action. The `ActionRegistry` allows the agent to look up the action by name, and the `invocation` provides the arguments needed to execute it. We could also add validation at this step to ensure that the invocation parameters match the action’s expected parameters.
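The validation mentioned above could be sketched as a shallow check of the invocation arguments against the action's `parameters` schema. The `validate_args` helper is hypothetical, and it covers only required keys, unexpected keys, and top-level types; it is not a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Mapping from JSON Schema type names to Python types (top-level checks only).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_args(parameters: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the args look valid."""
    problems = []
    properties = parameters.get("properties", {})
    for name in parameters.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = _JSON_TYPES.get(properties[name].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(
                f"argument {name} should be of type {properties[name]['type']}"
            )
    return problems


read_file_schema = {
    "type": "object",
    "properties": {"file_name": {"type": "string", "description": "Name of the file to read"}},
    "required": ["file_name"],
}

ok = validate_args(read_file_schema, {"file_name": "main.py"})
missing = validate_args(read_file_schema, {})
wrong = validate_args(read_file_schema, {"file_name": 3, "mode": "r"})
```

Returning the problems as text makes it possible to feed them back to the LLM as the action result, so it can correct its next invocation.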

**Step 4: Executing the Action**

The agent then executes the chosen action in the environment:

In [None]:
# Execute the action in the environment
result = self.environment.execute_action(action, invocation["args"])

The `Environment` handles the actual execution of the action, which might involve:

* Making API calls
* Reading/writing files
* Querying databases
* Processing data

Actions are defined in the `ActionRegistry` but executed within the context of the `Environment`, which provides access to resources and handles the mechanics of execution.
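The sketch below illustrates what this buys us: the same action can be run by one environment and merely recorded by another. `SimpleAction` and the condensed `Environment` are stand-ins for the classes defined earlier (the timestamp is omitted), and `DryRunEnvironment` is a hypothetical variant added for illustration.

```python
import traceback
from typing import Any, Callable, Dict


class SimpleAction:
    """Condensed stand-in for Action: a name plus a callable."""
    def __init__(self, name: str, function: Callable):
        self.name = name
        self.function = function

    def execute(self, **args) -> Any:
        return self.function(**args)


class Environment:
    """Condensed version of the Environment class (timestamp omitted)."""
    def execute_action(self, action, args: Dict) -> Dict:
        try:
            return {"tool_executed": True, "result": action.execute(**args)}
        except Exception as e:
            return {"tool_executed": False, "error": str(e),
                    "traceback": traceback.format_exc()}


class DryRunEnvironment(Environment):
    """Hypothetical variant that records requested actions without running them."""
    def __init__(self):
        self.calls = []

    def execute_action(self, action, args: Dict) -> Dict:
        self.calls.append((action.name, args))
        return {"tool_executed": False,
                "result": f"dry run: {action.name} not executed"}


def read_file(file_name: str) -> str:
    with open(file_name, "r") as f:
        return f.read()


action = SimpleAction("read_file", read_file)

# A failing action comes back as data the agent can reason about, not as a crash.
failed = Environment().execute_action(action, {"file_name": "definitely_missing_file.txt"})

dry = DryRunEnvironment()
skipped = dry.execute_action(action, {"file_name": "main.py"})
```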

**Step 5: Updating Memory**

After execution, the agent updates its memory with both its decision and the result:

In [None]:
def update_memory(self, memory: Memory, response: str, result: dict):
    """
    Update memory with the agent's decision and the environment's response.
    """
    new_memories = [
        {"type": "assistant", "content": response},
        {"type": "user", "content": json.dumps(result)}
    ]
    for m in new_memories:
        memory.add_memory(m)

This creates a continuous record of the agent’s reasoning and actions, which becomes part of the context of future loop iterations. The memory serves both as a record of past actions and as context for future prompt construction.

**Step 6: Termination Check**

Finally, the agent checks if it should terminate the loop:

In [None]:
def should_terminate(self, response: str) -> bool:
    action_def, _ = self.get_action(response)
    # An unknown tool name yields None; treat that as "do not terminate"
    return action_def is not None and action_def.terminal

This allows certain actions (like a “terminate” action) to signal that the agent has finished its work.
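A terminal action could be defined as in the sketch below. The `terminate` function, its `message` parameter, and the description text are assumptions for illustration; the `Action` class is repeated in condensed form so the example runs on its own.

```python
from typing import Any, Callable, Dict


class Action:
    """Condensed copy of the Action class so this example is self-contained."""
    def __init__(self, name: str, function: Callable, description: str,
                 parameters: Dict, terminal: bool = False):
        self.name = name
        self.function = function
        self.description = description
        self.parameters = parameters
        self.terminal = terminal

    def execute(self, **args) -> Any:
        return self.function(**args)


def terminate(message: str) -> str:
    """Return the final message; the loop ends because the action is terminal."""
    return message


terminate_action = Action(
    name="terminate",
    function=terminate,
    description="End the session and report the final result to the user",
    parameters={
        "type": "object",
        "properties": {
            "message": {"type": "string", "description": "Final summary for the user"}
        },
        "required": ["message"],
    },
    terminal=True,  # this flag is what should_terminate inspects
)

final_message = terminate_action.execute(message="All files reviewed.")
```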

**The Flow of Information Through the Loop**

To better understand how these components interact, let’s trace how information flows through a single iteration of the loop:

1. The `Memory` provides context about what the user has asked the agent to do and past decisions and results from the agent loop
2. The `Goals` define what the agent is trying to accomplish and rules on how to accomplish it
3. The `ActionRegistry` defines what the agent can do and helps lookup the action to execute by name
4. The `AgentLanguage` formats Memory, Actions, and Goals into a prompt for the LLM
5. The LLM generates a response choosing an action
6. The `AgentLanguage` parses the response into an action invocation, which will typically be extracted from tool calls
7. The `Environment` executes the action with the given arguments
8. The result is stored back in `Memory`
9. The loop repeats with the updated memory until the agent calls a terminal tool or reaches the maximum number of iterations
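
The steps above can be traced end to end with a stripped-down, self-contained loop. The sketch below is an illustration under simplifying assumptions, not the framework itself: `make_scripted_llm` is a hypothetical stand-in that replays canned decisions instead of calling a model, plain dicts and lists replace the `Memory`, `ActionRegistry`, and `Environment` classes, goals are omitted, and prompt formatting is reduced to passing the memory list.

```python
import json
from typing import Callable, Dict, List

# Stand-in action registry: name -> (function, terminal flag).
actions: Dict[str, tuple] = {
    "list_files": (lambda: ["a.py", "b.py"], False),
    "terminate": (lambda message: message, True),
}

def make_scripted_llm(decisions: List[dict]) -> Callable[[List[dict]], str]:
    """Return a fake LLM that replays the given decisions in order."""
    remaining = iter(decisions)

    def scripted_llm(messages: List[dict]) -> str:
        return json.dumps(next(remaining))

    return scripted_llm

def run(task: str, llm: Callable[[List[dict]], str], max_iterations: int = 5) -> List[dict]:
    memory = [{"type": "user", "content": task}]          # step 1: memory holds the task
    for _ in range(max_iterations):
        response = llm(memory)                            # steps 4-5: prompt the "LLM"
        invocation = json.loads(response)                 # step 6: parse the response
        function, terminal = actions[invocation["tool"]]  # step 3: look up by name
        result = function(**invocation["args"])           # step 7: execute the action
        # step 8: store the decision and the result
        memory.append({"type": "assistant", "content": response})
        memory.append({"type": "environment", "content": json.dumps(result)})
        if terminal:                                      # step 9: stop on a terminal tool
            break
    return memory

llm = make_scripted_llm([
    {"tool": "list_files", "args": {}},
    {"tool": "terminate", "args": {"message": "Found 2 files."}},
])
history = run("What files are here?", llm)
for item in history:
    print(item["type"], "->", item["content"])
```

Replacing the scripted function with a real model call changes nothing else in the loop, which is the property the framework relies on.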

**Creating Specialized Agents**

The beauty of this framework is that we can create entirely different agents by changing the GAME components without modifying the core loop:

In [None]:
# A research agent
research_agent = Agent(
    goals=[Goal("Find and summarize information on topic X")],
    agent_language=ResearchLanguage(),
    action_registry=ActionRegistry([SearchAction(), SummarizeAction(), ...]),
    generate_response=openai_call,
    environment=WebEnvironment()
)

# A coding agent
coding_agent = Agent(
    goals=[Goal("Write and debug Python code for task Y")],
    agent_language=CodingLanguage(),
    action_registry=ActionRegistry([WriteCodeAction(), TestCodeAction(), ...]),
    generate_response=anthropic_call,
    environment=DevEnvironment()
)

Each agent operates using the same fundamental loop but exhibits completely different behaviors based on its GAME components.

# Building a Simple Agent Framework 3

Let’s go back to the file agent that we built earlier. The original implementation uses direct function calls and a lot of conditional logic in the agent loop. Let’s redo the implementation using our new framework.

**Define the Goals**

First, let’s define some goals for our file explorer agent:

In [None]:
# Define clear goals for the agent
goals = [
    Goal(
        priority=1,
        name="Explore Files",
        description="Explore files in the current directory by listing and reading them"
    ),
    Goal(
        priority=2,
        name="Terminate",
        description="Terminate the session when tasks are complete with a helpful summary"
    )
]

**Create Actions Using the Framework**

Next, let’s convert our tool functions into properly structured Actions in our ActionRegistry:

In [None]:
import os
from typing import List

def list_files() -> List[str]:
    """List files in the current directory."""
    return os.listdir(".")

def read_file(file_name: str) -> str:
    """Read a file's contents."""
    try:
        with open(file_name, "r") as file:
            return file.read()
    except FileNotFoundError:
        return f"Error: {file_name} not found."
    except Exception as e:
        return f"Error: {str(e)}"

def terminate(message: str) -> str:
    """Terminate the agent loop and provide a summary message."""
    return message

# Create and register the actions
action_registry = ActionRegistry()

action_registry.register(Action(
    name="list_files",
    function=list_files,
    description="Returns a list of files in the directory.",
    parameters={},
    terminal=False
))

action_registry.register(Action(
    name="read_file",
    function=read_file,
    description="Reads the content of a specified file in the directory.",
    parameters={
        "type": "object",
        "properties": {
            "file_name": {"type": "string"}
        },
        "required": ["file_name"]
    },
    terminal=False
))

action_registry.register(Action(
    name="terminate",
    function=terminate,
    description="Terminates the conversation. Prints the provided message for the user.",
    parameters={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
        },
        "required": ["message"]
    },
    terminal=True
))

**Create and Run the Agent**

Now we can put it all together:

In [None]:
# Create the agent
file_explorer_agent = Agent(
    goals=goals,
    agent_language=agent_language,
    action_registry=action_registry,
    generate_response=generate_response,
    environment=environment
)

# Run the agent
user_input = input("What would you like me to do? ")
final_memory = file_explorer_agent.run(user_input, max_iterations=10)

# Print the final conversation if desired
for item in final_memory.get_memories():
    print(f"\n{item['type'].upper()}: {item['content']}")

**Complete Implementation**

Here’s the full implementation using the GAME framework:

In [None]:
import os
from typing import List

def main():
    # Define the agent's goals
    goals = [
        Goal(
            priority=1,
            name="Explore Files",
            description="Explore files in the current directory by listing and reading them"
        ),
        Goal(
            priority=2,
            name="Terminate",
            description="Terminate the session when tasks are complete with a helpful summary"
        )
    ]

    # Define tool functions
    def list_files() -> List[str]:
        """List files in the current directory."""
        return os.listdir(".")

    def read_file(file_name: str) -> str:
        """Read a file's contents."""
        try:
            with open(file_name, "r") as file:
                return file.read()
        except FileNotFoundError:
            return f"Error: {file_name} not found."
        except Exception as e:
            return f"Error: {str(e)}"

    def terminate(message: str) -> str:
        """Terminate the agent loop and provide a summary message."""
        return message

    # Create action registry and register actions
    action_registry = ActionRegistry()

    action_registry.register(Action(
        name="list_files",
        function=list_files,
        description="Returns a list of files in the directory.",
        parameters={},
        terminal=False
    ))

    action_registry.register(Action(
        name="read_file",
        function=read_file,
        description="Reads the content of a specified file in the directory.",
        parameters={
            "type": "object",
            "properties": {
                "file_name": {"type": "string"}
            },
            "required": ["file_name"]
        },
        terminal=False
    ))

    action_registry.register(Action(
        name="terminate",
        function=terminate,
        description="Terminates the conversation. Prints the provided message for the user.",
        parameters={
            "type": "object",
            "properties": {
                "message": {"type": "string"},
            },
            "required": ["message"]
        },
        terminal=True
    ))

    # Define the agent language and environment
    agent_language = AgentFunctionCallingActionLanguage()
    environment = Environment()

    # Create the agent
    file_explorer_agent = Agent(
        goals=goals,
        agent_language=agent_language,
        action_registry=action_registry,
        generate_response=generate_response,
        environment=environment
    )

    # Run the agent
    user_input = input("What would you like me to do? ")
    final_memory = file_explorer_agent.run(user_input, max_iterations=10)

    # Print every memory item (the last one holds the termination message)
    for item in final_memory.get_memories():
        print(f"\nMemory: {item['content']}")

if __name__ == "__main__":
    main()

**Key Differences and Benefits**

By converting our agent to the GAME framework, we gain several benefits:

1. **Better Organization**: Each component has a clear purpose and is separated from others.
2. **Reusability**: We can swap out components (like the actions or environment) without changing the core logic.
3. **Extensibility**: New goals and actions can be added easily.
4. **Standard Interface**: Using the Agent class gives us a consistent way to interact with different agents.
5. **Memory Management**: The framework handles memory updates automatically.

This structure also makes it easier to understand and maintain the code, especially as your agent grows in complexity.

**Using the Agent**

Once implemented, you can use your file explorer agent like this:

In [None]:
What would you like me to do? Tell me what Python files are in this directory and summarize how they fit together.

Agent thinking...
Agent Decision: I'll help you explore the Python files in this directory.

{"tool": "list_files", "args": {}}

Action Result: {'tool_executed': True, 'result': ['file1.py', 'file2.py', 'main.py', ...], 'timestamp': '2025-03-02T12:34:56+0000'}

{"tool": "read_file", "args": {"file_name": "file1.py"}}

Action Result: {'tool_executed': True, 'result': '# This is file1.py\n\ndef hello_world():\n    print("Hello, World!")\n\nif __name__ == "__main__":\n    hello_world()', 'timestamp': '2025-03-02T12:34:58+0000'}

[Additional file readings...]

{"tool": "terminate", "args": {"message": "I've explored all Python files in this directory. Here's a summary: file1.py contains a simple hello_world function, file2.py implements a calculator class, and main.py imports both files and uses their functionality."}}

...

This structured approach makes it much easier to develop, maintain, and extend your agents over time.

# Try Out the Agent Framework

In [3]:
!pip install litellm

# Install Ollama
!curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama server in the background (it must be running before `ollama pull`)
!nohup ollama serve > /dev/null 2>&1 &

# Give the server time to start
import time
time.sleep(10)

# Pull the Llama3 model
!ollama pull llama3

import os
import json
import traceback
from dataclasses import dataclass, field
from typing import List, Callable, Dict, Any
from litellm import completion

@dataclass
class Prompt:
    messages: List[Dict] = field(default_factory=list)
    tools: List[Dict] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

def generate_response(prompt: Prompt) -> str:
    """Call LLM to get response"""
    messages = prompt.messages
    tools = prompt.tools

    result = None

    if not tools:
        response = completion(
            model="ollama/llama3",
            messages=messages,
            max_tokens=1024,
            api_base="http://localhost:11434"
        )
        result = response.choices[0].message.content
    else:
        response = completion(
            model="ollama/llama3",
            messages=messages,
            tools=tools,
            max_tokens=1024,
            api_base="http://localhost:11434"
        )

        if response.choices[0].message.tool_calls:
            tool = response.choices[0].message.tool_calls[0]
            result = {
                "tool": tool.function.name,
                "args": json.loads(tool.function.arguments),
            }
            result = json.dumps(result)
        else:
            result = response.choices[0].message.content

    return result

@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str

class Action:
    def __init__(self,
                 name: str,
                 function: Callable,
                 description: str,
                 parameters: Dict,
                 terminal: bool = False):
        self.name = name
        self.function = function
        self.description = description
        self.terminal = terminal
        self.parameters = parameters

    def execute(self, **args) -> Any:
        """Execute the action's function"""
        return self.function(**args)

class ActionRegistry:
    def __init__(self):
        self.actions = {}

    def register(self, action: Action):
        self.actions[action.name] = action

    def get_action(self, name: str) -> "Action | None":
        return self.actions.get(name, None)

    def get_actions(self) -> List[Action]:
        """Get all registered actions"""
        return list(self.actions.values())

class Memory:
    def __init__(self):
        self.items = []  # Basic conversation history

    def add_memory(self, memory: dict):
        """Add memory to working memory"""
        self.items.append(memory)

    def get_memories(self, limit: int = None) -> List[Dict]:
        """Get formatted conversation history for prompt"""
        return self.items[:limit]

    def copy_without_system_memories(self):
        """Return a copy of the memory without system memories"""
        filtered_items = [m for m in self.items if m["type"] != "system"]
        memory = Memory()
        memory.items = filtered_items
        return memory

class Environment:
    def execute_action(self, action: Action, args: dict) -> dict:
        """Execute an action and return the result."""
        try:
            result = action.execute(**args)
            return self.format_result(result)
        except Exception as e:
            return {
                "tool_executed": False,
                "error": str(e),
                "traceback": traceback.format_exc()
            }

    def format_result(self, result: Any) -> dict:
        """Format the result with metadata."""
        return {
            "tool_executed": True,
            "result": result,
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z")
        }

class AgentLanguage:
    def __init__(self):
        pass

    def construct_prompt(self,
                         actions: List[Action],
                         environment: Environment,
                         goals: List[Goal],
                         memory: Memory) -> Prompt:
        raise NotImplementedError("Subclasses must implement this method")

    def parse_response(self, response: str) -> dict:
        raise NotImplementedError("Subclasses must implement this method")

class AgentFunctionCallingActionLanguage(AgentLanguage):
    def __init__(self):
        super().__init__()

    def format_goals(self, goals: List[Goal]) -> List:
        sep = "\n-------------------\n"
        goal_instructions = "\n\n".join([f"{goal.name}:{sep}{goal.description}{sep}" for goal in goals])
        return [
            {"role": "system", "content": goal_instructions}
        ]

    def format_memory(self, memory: Memory) -> List:
        """Convert memory items into chat messages for the prompt"""
        items = memory.get_memories()
        mapped_items = []
        for item in items:
            content = item.get("content", None)
            if not content:
                content = json.dumps(item, indent=4)

            if item["type"] == "assistant":
                mapped_items.append({"role": "assistant", "content": content})
            elif item["type"] == "environment":
                mapped_items.append({"role": "assistant", "content": content})
            else:
                mapped_items.append({"role": "user", "content": content})

        return mapped_items

    def format_actions(self, actions: List[Action]) -> List[Dict]:
        """Convert actions into function-calling tool descriptions"""
        tools = [
            {
                "type": "function",
                "function": {
                    "name": action.name,
                    "description": action.description[:1024],
                    "parameters": action.parameters,
                },
            } for action in actions
        ]

        return tools

    def construct_prompt(self,
                         actions: List[Action],
                         environment: Environment,
                         goals: List[Goal],
                         memory: Memory) -> Prompt:

        prompt = []
        prompt += self.format_goals(goals)
        prompt += self.format_memory(memory)

        tools = self.format_actions(actions)

        return Prompt(messages=prompt, tools=tools)

    def adapt_prompt_after_parsing_error(self,
                                         prompt: Prompt,
                                         response: str,
                                         traceback: str,
                                         error: Any,
                                         retries_left: int) -> Prompt:

        return prompt

    def parse_response(self, response: str) -> dict:
        """Parse LLM response into structured format"""
        try:
            data = json.loads(response)
            if "tool" not in data:
                return {
                    "tool": "terminate",
                    "args": {"message": response}
                }
            return data
        except Exception as e:
            return {
                "tool": "terminate",
                "args": {"message": response}
            }

class Agent:
    def __init__(self,
                 goals: List[Goal],
                 agent_language: AgentLanguage,
                 action_registry: ActionRegistry,
                 generate_response: Callable[[Prompt], str],
                 environment: Environment):
        self.goals = goals
        self.generate_response = generate_response
        self.agent_language = agent_language
        self.actions = action_registry
        self.environment = environment

    def construct_prompt(self, goals: List[Goal], memory: Memory, actions: ActionRegistry) -> Prompt:
        """Build prompt with memory context"""
        return self.agent_language.construct_prompt(
            actions=actions.get_actions(),
            environment=self.environment,
            goals=goals,
            memory=memory
        )

    def get_action(self, response):
        invocation = self.agent_language.parse_response(response)
        action = self.actions.get_action(invocation["tool"])
        return action, invocation

    def should_terminate(self, response: str) -> bool:
        action_def, _ = self.get_action(response)
        if action_def is None:
            return True
        return action_def.terminal

    def set_current_task(self, memory: Memory, task: str):
        memory.add_memory({"type": "user", "content": task})

    def update_memory(self, memory: Memory, response: str, result: dict):
        """
        Update memory with the agent's decision and the environment's response.
        """
        new_memories = [
            {"type": "assistant", "content": response},
            {"type": "environment", "content": json.dumps(result)}
        ]
        for m in new_memories:
            memory.add_memory(m)

    def prompt_llm_for_action(self, full_prompt: Prompt) -> str:
        response = self.generate_response(full_prompt)
        return response

    def run(self, user_input: str, memory=None, max_iterations: int = 50) -> Memory:
        """
        Execute the GAME loop for this agent with a maximum iteration limit.
        """
        memory = memory or Memory()
        self.set_current_task(memory, user_input)

        for _ in range(max_iterations):
            # Construct a prompt that includes the Goals, Actions, and the current Memory
            prompt = self.construct_prompt(self.goals, memory, self.actions)

            print("Agent thinking...")
            # Generate a response from the agent
            response = self.prompt_llm_for_action(prompt)
            print(f"Agent Decision: {response}")

            # Determine which action the agent wants to execute
            action, invocation = self.get_action(response)

            # Execute the action in the environment
            result = self.environment.execute_action(action, invocation["args"])
            print(f"Action Result: {result}")

            # Update the agent's memory with information about what happened
            self.update_memory(memory, response, result)

            # Check if the agent has decided to terminate
            if self.should_terminate(response):
                print("Agent decided to terminate.")
                break

        return memory

# Create a sample file for the agent to read
with open("sample.py", "w") as f:
    f.write("""# Sample Python file
def hello():
    print("Hello World!")
""")

# Define the agent's goals
goals = [
    Goal(priority=1, name="Gather Information", description="Read each file in the project"),
    Goal(priority=1, name="Terminate", description="Call the terminate call when you have read all the files "
                                               "and provide the content of the README in the terminate message")
]

# Define the agent's language
agent_language = AgentFunctionCallingActionLanguage()

def read_project_file(name: str) -> str:
    with open(name, "r") as f:
        return f.read()

def list_project_files() -> List[str]:
    return sorted([file for file in os.listdir(".") if file.endswith(".py")])

def write_readme(content: str) -> str:
    with open("README.md", "w") as f:
        f.write(content)
    return "README.md created successfully"

# Define the action registry and register some actions
action_registry = ActionRegistry()
action_registry.register(Action(
    name="list_project_files",
    function=list_project_files,
    description="Lists all files in the project.",
    parameters={},
    terminal=False
))
action_registry.register(Action(
    name="read_project_file",
    function=read_project_file,
    description="Reads a file from the project.",
    parameters={
        "type": "object",
        "properties": {
            "name": {"type": "string"}
        },
        "required": ["name"]
    },
    terminal=False
))
action_registry.register(Action(
    name="write_readme",
    function=write_readme,
    description="Writes a README file with the given content.",
    parameters={
        "type": "object",
        "properties": {
            "content": {"type": "string"}
        },
        "required": ["content"]
    },
    terminal=False
))
action_registry.register(Action(
    name="terminate",
    function=lambda message: f"{message}\nTerminating...",
    description="Terminates the session and prints the message to the user.",
    parameters={
        "type": "object",
        "properties": {
            "message": {"type": "string"}
        },
        "required": ["message"]
    },
    terminal=True
))

# Define the environment
environment = Environment()

# Create an agent instance
agent = Agent(goals, agent_language, action_registry, generate_response, environment)

# Run the agent with user input
user_input = "Write a README for this project."
final_memory = agent.run(user_input)

# Print the final memory
print("\nFinal Memory:")
for mem in final_memory.get_memories():
    print(mem)

# Print the generated README if it exists
if os.path.exists("README.md"):
    print("\nGenerated README:")
    with open("README.md", "r") as f:
        print(f.read())

>>> Cleaning up old version at /usr/local/lib/ollama
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
######################################################################## 100.0%
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
Agent thinking...
Agent Decision: {"tool": "terminate", "args": {"message": "Congratulations! We have read all the files in the project. The content of the README file is: \n\nWelcome to this project!\nThe purpose of this project is to gather information from each file.\nIt uses a few functions to achieve this:\n* list_project_files: Lists all files in the project.\n* read_project_file: Reads a 

# Building a Simple Agent Framework 4

We’ve discussed Goals, Actions, Memory, and Environment, but there’s another crucial component we need to explore: the AgentLanguage. This component serves as the translator between our structured agent components and the language model’s input/output format. Think of it as a diplomatic interpreter that ensures clear communication between two different worlds: our agent’s structured GAME components and the LLM’s text-based interface.

As we have already seen, there are multiple ways that we can prompt the LLM for a next action. For example, we can have the LLM generate a standard completion with text that we parse or use function calling to extract an action. There are also many different ways that we could represent memories to the LLM, from concatenating them into a single string to including them as individual message entries in ChatML. The AgentLanguage allows us to create reusable strategies for handling these concerns and plug them into the agent.

For example, we might define an AgentLanguage that always constructs a system message explaining the agent’s role, followed by a user message containing the agent’s current observations, memory, and a request for the next action. Alternatively, we could use function calling to directly extract structured actions, bypassing the need for parsing. Each of these choices influences how the LLM reasons and responds, shaping the agent’s behavior.
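
The first strategy described above can be sketched in a few lines. This is a hypothetical helper, not one of the course’s `AgentLanguage` classes: the function name `build_single_message_prompt`, the wording of the request, and the numbered memory layout are all assumptions chosen for illustration.

```python
from typing import Dict, List

def build_single_message_prompt(role_description: str,
                                memories: List[Dict[str, str]],
                                request: str = "What is the next action?") -> List[Dict[str, str]]:
    """Build a system message plus one user message that concatenates all memories."""
    history_lines = [
        f"{index}. [{memory['type']}] {memory['content']}"
        for index, memory in enumerate(memories, start=1)
    ]
    user_content = "Memory so far:\n" + "\n".join(history_lines) + "\n\n" + request
    return [
        {"role": "system", "content": role_description},
        {"role": "user", "content": user_content},
    ]

messages = build_single_message_prompt(
    "You are a file explorer agent.",
    [
        {"type": "user", "content": "Summarize the project."},
        {"type": "environment", "content": "['main.py']"},
    ],
)
print(messages[1]["content"])
```

Compared with sending each memory as its own chat message, this layout gives the model a single block of context, at the cost of losing the per-message role information.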

**The Role of AgentLanguage**

The AgentLanguage component has two primary responsibilities:

1. **Prompt Construction**: Transforming our GAME components into a format the LLM can understand
2. **Response Parsing**: Interpreting the LLM’s response to determine what action the agent should take

Let’s look at how this works in practice, starting with the base abstract class:

In [None]:
class AgentLanguage:
    def construct_prompt(self,
                        actions: List[Action],
                        environment: Environment,
                        goals: List[Goal],
                        memory: Memory) -> Prompt:
        raise NotImplementedError("Subclasses must implement this method")

    def parse_response(self, response: str) -> dict:
        raise NotImplementedError("Subclasses must implement this method")

This abstract class defines the interface that all agent languages must implement. Let’s examine two different implementations to understand how we can adapt our agent’s communication style.

**Where this Fits in the Agent Loop**

Let’s examine how the AgentLanguage component integrates with each stage of the loop, transforming raw data into meaningful communication and back again.

Consider this portion of our agent’s run method:

In [None]:
def run(self, user_input: str, memory=None, max_iterations: int = 50) -> Memory:
    memory = memory or Memory()
    self.set_current_task(memory, user_input)

    for _ in range(max_iterations):
        # 1. Build prompt using AgentLanguage
        prompt = self.construct_prompt(self.goals, memory, self.actions)

        # 2. Get LLM response
        response = self.prompt_llm_for_action(prompt)

        # 3. Parse response using AgentLanguage
        action, invocation = self.get_action(response)

        # 4. Execute action in environment
        result = self.environment.execute_action(action, invocation["args"])

        # 5. Update memory
        self.update_memory(memory, response, result)

        if self.should_terminate(response):
            break

    return memory

At two crucial points in this loop, the AgentLanguage acts as an interpreter between our structured world and the LLM’s text-based world:

**Stage 1: Constructing the Prompt**

When the agent needs to decide its next action, the AgentLanguage takes our GAME components and transforms them into a format the LLM can understand. This transformation involves several steps:

In [None]:
def construct_prompt(self, goals: List[Goal], memory: Memory, actions: ActionRegistry):
    # The AgentLanguage decides how to present each component to the LLM
    prompt = []

    # Transform goals into instructions
    prompt += self.format_goals(goals)

    # Transform available actions into tool descriptions
    tools = self.format_actions(actions.get_actions())

    # Transform memory into conversation context
    prompt += self.format_memory(memory)

    return Prompt(messages=prompt, tools=tools)

For example, when using function calling, this might produce:

In [None]:
{
    "messages": [
        {"role": "system", "content": "Your goal is to process all files..."},
        {"role": "user", "content": "Please analyze file.txt"},
        {"role": "assistant", "content": "I'll read the file..."}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "read_file",
                "description": "Reads a file from the system",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "file_path": {"type": "string"}
                    }
                }
            }
        }
    ]
}

**Stage 2: Parsing the Response**

After the LLM generates a response, the AgentLanguage must interpret it to determine what action the agent should take:

In [None]:
def get_action(self, response):
    # AgentLanguage parses the LLM's response into a structured format
    invocation = self.agent_language.parse_response(response)

    # The parsed response is used to look up the actual action
    action = self.actions.get_action(invocation["tool"])
    return action, invocation

For instance, when using JSON action format, the AgentLanguage might receive this response from the LLM that mixes the agent’s chatty response with a markdown block containing the specification for the action:

In [None]:
Let me analyze the contents of the file.

```action
{
    "tool": "read_file",
    "args": {
        "file_path": "file.txt"
    }
}
```

The AgentLanguage would then parse this to extract the JSON and convert it into a structured action:

In [None]:
{
    "tool": "read_file",
    "args": {
        "file_path": "file.txt"
    }
}

The AgentLanguage ensures that regardless of how the LLM prefers to communicate (function calling, JSON blocks, or natural language), the agent’s core loop remains unchanged. It’s like having different translators for different languages – the meaning stays the same, but the way it’s expressed adapts to the audience.

**Two Example Agent Languages**

Let’s look at two example implementations of the AgentLanguage component, each with a different approach to prompting and parsing. The first lets the LLM mix free-form text with a JSON action block, like what we used in our very first agents. The second is a more structured approach that leverages LLM function calling.

**JSON Action Language**

This language allows the LLM to output text and specify actions in special ```action markdown blocks. This is similar to what we did in our first agent examples:

In [None]:
class AgentJsonActionLanguage(AgentLanguage):
    action_format = """
<Stop and think step by step. Insert your thoughts here.>

```action
{
    "tool": "tool_name",
    "args": {...fill in arguments...}
}
```"""

    def format_actions(self, actions: List[Action]) -> List:
        # Convert actions to a description the LLM can understand
        action_descriptions = [
            {
                "name": action.name,
                "description": action.description,
                "args": action.parameters
            }
            for action in actions
        ]

        return [{
            "role": "system",
            "content": f"""
Available Tools: {json.dumps(action_descriptions, indent=4)}

{self.action_format}
"""
        }]

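    def parse_response(self, response: str) -> dict:
        """Extract and parse the action block"""
        try:
            start_marker = "```action"
            end_marker = "```"

            stripped_response = response.strip()
            start_index = stripped_response.find(start_marker)
            if start_index == -1:
                raise ValueError("No ```action block found in the response")

            # The last fence must come after the opening marker;
            # otherwise the action block was never closed
            content_start = start_index + len(start_marker)
            end_index = stripped_response.rfind(end_marker)
            if end_index < content_start:
                raise ValueError("The ```action block is not closed")

            json_str = stripped_response[content_start:end_index].strip()
            return json.loads(json_str)
        except Exception as e:
            print(f"Failed to parse response: {str(e)}")
            raise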
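
To make the parsing step concrete, here is a minimal standalone sketch of the same extraction idea applied to a sample response. The function name `extract_action`, the `FENCE` constant, and the sample text are illustrative assumptions rather than part of the framework, and this version stops at the first closing fence instead of the last one:

```python
import json

# Three backticks, built programmatically so this example can sit inside a fenced block.
FENCE = "`" * 3


def extract_action(response: str) -> dict:
    """Return the JSON object found in the first action block of a response."""
    start_marker = FENCE + "action"
    start = response.find(start_marker)
    if start == -1:
        raise ValueError("no action block found")

    body_start = start + len(start_marker)
    end = response.find(FENCE, body_start)
    if end == -1:
        raise ValueError("action block is not closed")

    return json.loads(response[body_start:end].strip())


# Hypothetical LLM reply: free-form reasoning followed by one action block.
sample_response = (
    "I should find out which files exist before reading any of them.\n\n"
    f"{FENCE}action\n"
    '{"tool": "list_project_files", "args": {}}\n'
    f"{FENCE}"
)

action = extract_action(sample_response)
print(action["tool"], action["args"])
```

The reasoning text before the block is ignored by the parser, which is what lets this language keep the LLM's step-by-step thoughts while still producing a machine-readable action.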

**Function Calling Language**

This next language uses the LLM’s function calling capabilities to directly specify actions. This approach helps alleviate the burden of parsing free-form text. The downside is that we don’t necessarily get to see the LLM’s reasoning, but the upside is that it simplifies getting valid JSON as output.

In [None]:
class AgentFunctionCallingActionLanguage(AgentLanguage):
    def format_actions(self, actions: List[Action]) -> List:
        """Convert actions to function descriptions"""
        return [
            {
                "type": "function",
                "function": {
                    "name": action.name,
                    "description": action.description[:1024],
                    "parameters": action.parameters,
                },
            }
            for action in actions
        ]

    def construct_prompt(self,
                        actions: List[Action],
                        environment: Environment,
                        goals: List[Goal],
                        memory: Memory) -> Prompt:
        prompt = []
        prompt += self.format_goals(goals)
        prompt += self.format_memory(memory)

        tools = self.format_actions(actions)

        return Prompt(messages=prompt, tools=tools)

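    def parse_response(self, response: str) -> dict:
        """Parse the function call response"""
        try:
            return json.loads(response)
        except Exception:
            # Not a JSON function call: treat the raw text as the final
            # message and end the run with the terminate tool
            return {
                "tool": "terminate",
                "args": {"message": response}
            }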
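
The fallback in `parse_response` is easy to miss, so here is a small standalone sketch of its behavior. It assumes, as the code above suggests, that a function call reaches the parser as a JSON string with `tool` and `args` keys; the name `parse_function_call` is hypothetical:

```python
import json


def parse_function_call(response: str) -> dict:
    """Return the parsed tool call, or wrap plain text in a terminate action."""
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        # Plain text means the LLM answered directly instead of calling a tool.
        return {"tool": "terminate", "args": {"message": response}}


tool_call = parse_function_call(
    '{"tool": "read_project_file", "args": {"name": "main.py"}}'
)
plain_text = parse_function_call("Here is your README.")

print(tool_call["tool"])
print(plain_text["tool"], "->", plain_text["args"]["message"])
```

One caveat of this design: a reply that happens to be valid JSON but is not an object (a bare number or a quoted string, for example) passes through `json.loads` unwrapped, so a stricter parser might also check that the result is a dictionary with a `tool` key.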

**The Power of Swappable Languages**

The ability to swap agent languages gives us remarkable flexibility in how our agent communicates. Consider these scenarios:

In [None]:
# Create an agent that uses the JSON action language (free text plus an action block) for simple tasks
simple_agent = Agent(
    goals=goals,
    agent_language=AgentJsonActionLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=env
)

# Create an agent that uses function calling for complex tasks
complex_agent = Agent(
    goals=goals,
    agent_language=AgentFunctionCallingActionLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=env
)

The same agent can behave differently just by changing its language implementation. This separation of concerns means we can:

1. Experiment with different prompt formats without changing the agent’s logic
2. Support different LLM providers with their own communication styles, allowing us to adjust prompting style to match the LLM’s strengths
3. Add new response formats without modifying existing code
4. Handle errors and retry logic at the language level
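
As an illustration of the last point, retry logic can be written once around the generate and parse steps without touching the agent loop. The sketch below is an assumption about how this could look, not the framework's implementation; `generate_and_parse` and its parameters are hypothetical names:

```python
import json
from typing import Callable


def generate_and_parse(generate: Callable[[str], str],
                       parse: Callable[[str], dict],
                       prompt: str,
                       max_attempts: int = 3) -> dict:
    """Ask the LLM again, with feedback, until its reply can be parsed."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            return parse(raw)
        except ValueError as error:  # json.JSONDecodeError is a ValueError
            last_error = error
            prompt = (
                f"{prompt}\n\nYour previous reply could not be parsed "
                f"({error}). Reply with valid JSON only."
            )
    raise RuntimeError(
        f"no parseable response after {max_attempts} attempts"
    ) from last_error


# Stand-in for an LLM: the first reply is malformed, the second is valid.
replies = iter([
    "not json",
    '{"tool": "terminate", "args": {"message": "done"}}',
])
result = generate_and_parse(lambda prompt: next(replies), json.loads, "Pick a tool.")
print(result["tool"])
```

Because the retry lives next to the parser, swapping in a different language implementation changes what counts as a parse failure without changing how the agent reacts to one.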

**Wrapping Up**

The AgentLanguage component is crucial because it:

1. **Centralizes Communication Logic**: All prompt construction and response parsing is in one place
2. **Enables Experimentation**: We can try different prompt strategies by creating new language implementations
3. **Improves Reliability**: Structured response formats and error handling make the agent more robust
4. **Supports Evolution**: As LLM capabilities change, we can adapt our communication approach without changing the agent’s core logic

By separating the “how to communicate” from the “what to do,” we create agents that can evolve and improve their interaction with LLMs while maintaining their core functionality. This flexibility is essential as language model capabilities continue to advance and new communication patterns emerge.

# Putting It All Together: Building a Simple README Agent

Now that we understand all the components of our framework, let’s see how they work together by building a simple but practical agent. We’ll create an agent that can analyze Python files in a project and write a README file. This example will demonstrate how our modular design makes it straightforward to assemble an agent from well-defined components.

**Understanding Our Agent’s Purpose**

Before we dive into the code, let’s understand what we want our agent to do. Our README agent will:

1. Look for Python files in a project directory
2. Read the contents of each file
3. Analyze what it finds
4. Generate a README based on its analysis

This task is perfect for demonstrating our framework because it requires the agent to make decisions about which files to read, process information iteratively, and produce a final output.

**Defining the Goals**

Let’s start by defining what our agent should achieve. We use goals to give the agent its purpose and guide its decision-making:

In [None]:
goals = [
    Goal(
        priority=1,
        name="Gather Information",
        description="Read each file in the project"
    ),
    Goal(
        priority=1,
        name="Terminate",
        description="Call the terminate call when you have read all the files "
                   "and provide the content of the README in the terminate message"
    )
]

Notice how we break the task into two clear goals. The first goal drives the agent to explore the project’s files, while the second ensures it knows when to stop and produce output. We give both goals equal priority (priority=1) since they’re sequential steps in the process.
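
To see how goals like these could end up in front of the LLM, here is a rough sketch that turns them into a single system message. `DemoGoal` and `format_goals` are stand-ins written for this example; the framework's own `Goal` class and formatting may differ:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DemoGoal:
    """Stand-in for the framework's Goal class (assumed fields)."""
    priority: int
    name: str
    description: str


def format_goals(goals: List[DemoGoal]) -> List[Dict[str, str]]:
    """Render goals as one system message, ordered by priority."""
    # sorted() is stable, so goals with equal priority keep their list order.
    ordered = sorted(goals, key=lambda goal: goal.priority)
    text = "\n\n".join(f"{goal.name}:\n{goal.description}" for goal in ordered)
    return [{"role": "system", "content": text}]


demo_goals = [
    DemoGoal(1, "Gather Information", "Read each file in the project"),
    DemoGoal(1, "Terminate", "Call terminate when you have read all the files"),
]

messages = format_goals(demo_goals)
print(messages[0]["content"])
```

With equal priorities, the order in which the goals appear in the list is the order the LLM reads them, which is why listing the sequential steps in order matters.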

**Creating the Actions**

Next, we define what our agent can do by creating its available actions. We need three basic capabilities:

In [None]:
import os
from typing import List

def read_project_file(name: str) -> str:
    with open(name, "r") as f:
        return f.read()

def list_project_files() -> List[str]:
    return sorted([file for file in os.listdir(".")
                  if file.endswith(".py")])

# Register these actions with clear descriptions
action_registry = ActionRegistry()
action_registry.register(Action(
    name="list_project_files",
    function=list_project_files,
    description="Lists all Python files in the project.",
    parameters={},
    terminal=False
))

action_registry.register(Action(
    name="read_project_file",
    function=read_project_file,
    description="Reads a file from the project.",
    parameters={
        "type": "object",
        "properties": {
            "name": {"type": "string"}
        },
        "required": ["name"]
    },
    terminal=False
))

action_registry.register(Action(
    name="terminate",
    function=lambda message: f"{message}\nTerminating...",
    description="Terminates the session and prints the message to the user.",
    parameters={
        "type": "object",
        "properties": {
            "message": {"type": "string"}
        },
        "required": ["message"]
    },
    terminal=True
))

Each action is carefully designed with:

* A clear name that describes its purpose
* A function that implements the action
* A description that helps the LLM understand when to use it
* A schema defining its parameters
* A terminal flag indicating if it ends the agent’s execution
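
The sketch below shows how these pieces might fit together when a parsed tool call is dispatched. `DemoAction` and `DemoRegistry` are simplified stand-ins written for this example, so their method names (`execute`, `get_action`) are assumptions and not necessarily the framework's API:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class DemoAction:
    """Stand-in for Action: a named, described, schema-backed callable."""
    name: str
    function: Callable[..., Any]
    description: str
    parameters: Dict[str, Any]
    terminal: bool = False

    def execute(self, **args: Any) -> Any:
        return self.function(**args)


class DemoRegistry:
    """Stand-in for ActionRegistry: looks actions up by name."""

    def __init__(self) -> None:
        self._actions: Dict[str, DemoAction] = {}

    def register(self, action: DemoAction) -> None:
        self._actions[action.name] = action

    def get_action(self, name: str) -> Optional[DemoAction]:
        return self._actions.get(name)


registry = DemoRegistry()
registry.register(DemoAction(
    name="terminate",
    function=lambda message: f"{message}\nTerminating...",
    description="Terminates the session and prints the message to the user.",
    parameters={"type": "object",
                "properties": {"message": {"type": "string"}},
                "required": ["message"]},
    terminal=True,
))

# A parsed LLM response names the tool and supplies its arguments.
invocation = {"tool": "terminate", "args": {"message": "README ready"}}
action = registry.get_action(invocation["tool"])
result = action.execute(**invocation["args"])
print(result)
print("terminal:", action.terminal)
```

The name is the lookup key, the schema tells the LLM which arguments to send, and the terminal flag is what the loop checks to decide whether to stop.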

**Choosing the Agent Language**

For our README agent, we’ll use the function calling language implementation because it provides the most reliable way to structure the agent’s actions:

In [None]:
agent_language = AgentFunctionCallingActionLanguage()

This choice means our agent will use the LLM’s built-in function calling capabilities to select actions. The AgentLanguage will:

1. Format our goals as system messages
2. Present our actions as function definitions
3. Maintain conversation history in the memory
4. Parse function calls from the LLM’s responses

**Setting Up the Environment**

Our environment is simple since we’re just working with local files:

In [None]:
environment = Environment()

We use the default Environment implementation because our actions are straightforward file operations. For more complex agents, we might need to customize the environment to handle specific execution contexts or error cases.
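
As a rough picture of what an environment contributes, the sketch below runs a tool function and converts either its return value or its exception into a dictionary the agent can store in memory. The function name `execute_action` and the result keys are assumptions made for this example, not a description of the default `Environment`:

```python
import time
import traceback
from typing import Any, Callable, Dict


def execute_action(function: Callable[..., Any],
                   args: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool and report success or failure as plain data."""
    try:
        result = function(**args)
    except Exception as error:
        # Failures become data too, so the LLM can read them and adapt.
        return {
            "tool_executed": False,
            "error": str(error),
            "traceback": traceback.format_exc(),
        }
    return {
        "tool_executed": True,
        "result": result,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
    }


def read_missing_file(name: str) -> str:
    """Hypothetical tool that always fails, to show the error path."""
    raise FileNotFoundError(f"{name} does not exist")


succeeded = execute_action(lambda name: name.upper(), {"name": "main.py"})
failed = execute_action(read_missing_file, {"name": "ghost.py"})

print(succeeded["tool_executed"], succeeded["result"])
print(failed["tool_executed"], failed["error"])
```

Returning errors instead of raising them keeps the agent loop alive: a missing file becomes one more observation rather than a crash.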

**Assembling and Running the Agent**

Now we can bring all these components together:

In [None]:
# Create an agent instance with our components
agent = Agent(
    goals=goals,
    agent_language=agent_language,
    action_registry=action_registry,
    generate_response=generate_response,
    environment=environment
)

# Run the agent with our task
user_input = "Write a README for this project."
final_memory = agent.run(user_input)

When we run this agent, several things happen:

1. The agent receives the user’s request for a README
2. It uses list_project_files to discover what files exist
3. It uses read_project_file to examine each relevant file
4. When it has gathered enough information, it uses terminate to provide the README content

**Understanding the Flow**

Let’s walk through a typical execution:

1. First Iteration:

 * Agent constructs prompt with goals and available actions
 * LLM decides to list files first (logical starting point)
 * Environment executes list_project_files
 * Memory stores the list of files

2. Middle Iterations:

 * Agent includes file list in context
 * LLM chooses files to read based on their names
 * Environment executes read_project_file for each chosen file
 * Memory accumulates file contents

3. Final Iteration:

 * Agent determines it has enough information
 * LLM generates README content
 * Agent uses terminate action to deliver the result
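
The walkthrough above can be simulated without an LLM. In this sketch a scripted list of replies plays the LLM's part and an in-memory dictionary plays the project directory; every name here (`FILES`, `TOOLS`, `scripted_llm`, `run`) is a stand-in invented for the example, and the memory format is a simplification:

```python
import json
from typing import Callable, Dict, List

# In-memory stand-in for the project directory (hypothetical contents).
FILES = {"main.py": "print('hello')", "utils.py": "def helper(): pass"}

TOOLS: Dict[str, Callable[..., object]] = {
    "list_project_files": lambda: sorted(FILES),
    "read_project_file": lambda name: FILES[name],
    "terminate": lambda message: f"{message}\nTerminating...",
}

# Scripted replies that play the role of the LLM, one per iteration.
SCRIPTED_REPLIES = iter([
    '{"tool": "list_project_files", "args": {}}',
    '{"tool": "read_project_file", "args": {"name": "main.py"}}',
    '{"tool": "read_project_file", "args": {"name": "utils.py"}}',
    '{"tool": "terminate", "args": {"message": "# Demo Project"}}',
])


def scripted_llm(memory: List[dict]) -> str:
    """Ignore the memory and return the next scripted reply."""
    return next(SCRIPTED_REPLIES)


def run(user_input: str, max_iterations: int = 10) -> List[dict]:
    memory = [{"role": "user", "content": user_input}]
    for _ in range(max_iterations):
        response = scripted_llm(memory)                # LLM picks an action
        call = json.loads(response)                    # language parses it
        result = TOOLS[call["tool"]](**call["args"])   # environment executes it
        memory.append({"role": "assistant", "content": response})
        memory.append({"role": "user", "content": json.dumps({"result": result})})
        if call["tool"] == "terminate":                # terminal action ends the loop
            break
    return memory


memory = run("Write a README for this project.")
for entry in memory:
    print(entry["role"], entry["content"])
```

Each iteration adds two entries to memory, the decision and its result, so the next decision always has the full history of what has been listed and read.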

**Making It Work**

The modular design means we could easily modify this agent to:

* Handle different file types by adding new actions
* Generate different documentation by changing the goals
* Work with remote files by modifying the environment
* Use different LLM providers by changing the agent language
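
For instance, supporting more file types could start with a listing function that accepts the extensions to include. This is a sketch under the assumption that such a function would be registered as a new action; `list_files_by_extension` is a hypothetical name, and the temporary directory exists only to make the example self-contained:

```python
import os
import tempfile
from typing import List, Tuple


def list_files_by_extension(directory: str,
                            extensions: Tuple[str, ...]) -> List[str]:
    """Return the sorted names of files in directory with a matching extension."""
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
        and name.endswith(extensions)  # str.endswith accepts a tuple of suffixes
    )


# Build a throwaway project directory to exercise the function.
with tempfile.TemporaryDirectory() as project_dir:
    for name in ("a.py", "notes.md", "data.csv"):
        with open(os.path.join(project_dir, name), "w") as handle:
            handle.write("")
    found = list_files_by_extension(project_dir, (".py", ".md"))

print(found)
```

Nothing in the agent loop has to change to use it: the new function only needs its own `Action` entry with a description and parameter schema.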

This example demonstrates how our framework’s separation of concerns makes it easy to create focused, task-specific agents. Each component has a clear responsibility, making the code easy to understand and modify. The GAME architecture lets us think about each aspect of the agent’s behavior independently while ensuring they work together seamlessly.

Remember, this is just a starting point. With this foundation, we can build more sophisticated agents by:

* Adding more complex actions
* Implementing smarter memory management
* Creating specialized environments
* Developing custom agent languages for specific needs

The key is that our framework makes these extensions possible without having to change the core agent loop or other components. This modularity is what makes our framework both powerful and practical.

**The Complete Code for Our Agent**

In [None]:
import os
from typing import List


def main():
    # Define the agent's goals
    goals = [
        Goal(priority=1, name="Gather Information", description="Read each file in the project"),
        Goal(priority=1, name="Terminate", description="Call the terminate call when you have read all the files "
                                                       "and provide the content of the README in the terminate message")
    ]

    # Define the agent's language
    agent_language = AgentFunctionCallingActionLanguage()

    def read_project_file(name: str) -> str:
        with open(name, "r") as f:
            return f.read()

    def list_project_files() -> List[str]:
        return sorted([file for file in os.listdir(".") if file.endswith(".py")])

    # Define the action registry and register some actions
    action_registry = ActionRegistry()
    action_registry.register(Action(
        name="list_project_files",
        function=list_project_files,
        description="Lists all Python files in the project.",
        parameters={},
        terminal=False
    ))
    action_registry.register(Action(
        name="read_project_file",
        function=read_project_file,
        description="Reads a file from the project.",
        parameters={
            "type": "object",
            "properties": {
                "name": {"type": "string"}
            },
            "required": ["name"]
        },
        terminal=False
    ))
    action_registry.register(Action(
        name="terminate",
        function=lambda message: f"{message}\nTerminating...",
        description="Terminates the session and prints the message to the user.",
        parameters={
            "type": "object",
            "properties": {
                "message": {"type": "string"}
            },
            "required": ["message"]
        },
        terminal=True
    ))

    # Define the environment
    environment = Environment()

    # Create an agent instance
    agent = Agent(goals, agent_language, action_registry, generate_response, environment)

    # Run the agent with user input
    user_input = "Write a README for this project."
    final_memory = agent.run(user_input)

    # Print the final memory
    print(final_memory.get_memories())


if __name__ == "__main__":
    main()