## GAME framework

* Agent definition
    * Goals & instructions
    * Actions
    * Memory
        * results of past actions
    * Environment
        * tools for interacting with the outside world and receiving feedback from it


### Motivating Example: The Proactive Coder

To illustrate how the GAME framework applies in practice, consider an AI agent designed to proactively enhance a codebase. This Proactive Coder agent will scan a repository, analyze patterns in the code, and propose potential new features that it could implement with a small number of changes. If the user approves a feature, the agent will generate the initial implementation and suggest refinements.

Using the GAME framework, we break down the agent design:

#### Goals

**Goals (What to achieve):**

* Identify potential enhancements
* Make sure that the enhancements are helpful and relevant
* Make sure that the enhancements are small and self-contained so that they can be implemented by the agent with minimal risk
* Ensure that the changes do not break existing interfaces
* Ensure that the agent only implements features that the user agrees to

**Instructions (How to achieve it):**

* Pick a random file in the code base and read through it
* Read some related files to the original file
* Read at most 5 files
* Propose three feature ideas that are implementable in 2–3 functions and require minimal editing of the existing code
* Ask the user to select which feature to implement
* List the files that will need to be edited and provide a list of proposed changes for each
* Go file by file implementing the changes until they are all edited

#### Actions

* List project files
* Read project file
* Ask user to select a feature
* Edit project file

#### Memory

We will use a simple conversational memory and store the complete contents of files in the conversation for reference.

#### Environment

We will provide simple implementations of the actions in Python to run locally, but could later change to an implementation that works in GitHub Actions.



## Simulating agents

* To iterate rapidly, we can simulate an agent by chatting with an LLM and typing in the tool outcomes ourselves (the user acts as the environment).
* This lets us iterate on the agent definition before writing any code.

#### Testing Agent Designs Through Conversation Simulation

Before writing code, test whether the GAME design is feasible by simulating the agent’s decision-making in a chat interface (e.g., ChatGPT). Early simulation exposes design flaws when they’re easiest to fix.

##### Why Simulate First

Like a dress rehearsal, simulation ensures:

* Goals are achievable with planned actions
* Memory requirements are reasonable
* Available actions are sufficient
* The agent can decide effectively with the given information

##### Setting Up a Simulation

Start a chat with a prompt defining goals and actions. For example:

```
I'd like to simulate an AI agent that I'm designing.

Goals:
* Find potential code enhancements
* Ensure changes are small and self-contained
* Get user approval before making changes
* Maintain existing interfaces

Actions available:
* list_project_files()
* read_project_file(filename)
* ask_user_approval(proposal)
* edit_project_file(filename, changes)

At each step, output an action to take. 
Stop and wait; I will type in the result of the action as my next message.
Ask me for the first task to perform.
```
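
When the goals and actions change often, it can help to generate this prompt rather than retype it. The following is a minimal sketch, assuming goals and actions are kept as plain strings; `build_simulation_prompt` is a hypothetical helper, not part of any framework.

```python
def build_simulation_prompt(goals, actions):
    """Assemble a chat prompt that asks an LLM to role-play the agent."""
    lines = ["I'd like to simulate an AI agent that I'm designing.", "", "Goals:"]
    lines += [f"* {goal}" for goal in goals]
    lines += ["", "Actions available:"]
    lines += [f"* {action}" for action in actions]
    lines += [
        "",
        "At each step, output an action to take.",
        "Stop and wait; I will type in the result of the action as my next message.",
        "Ask me for the first task to perform.",
    ]
    return "\n".join(lines)


prompt = build_simulation_prompt(
    goals=[
        "Find potential code enhancements",
        "Ensure changes are small and self-contained",
        "Get user approval before making changes",
        "Maintain existing interfaces",
    ],
    actions=[
        "list_project_files()",
        "read_project_file(filename)",
        "ask_user_approval(proposal)",
        "edit_project_file(filename, changes)",
    ],
)
print(prompt)
```

Editing one list entry and regenerating the prompt makes it easy to compare how a single wording change affects the simulated agent.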

##### Observing Agent Reasoning

Begin with a small project and watch the agent’s approach. Does it list files first or immediately read one? Adjust how you present results—extra metadata (e.g., file counts) can improve its choices.

##### Evolving Tools and Goals

Simulation reveals unclear tool descriptions or vague goals. Refine them, e.g.:

```
read_project_file(filename) -> Returns the content of a Python file. 
Filename must be from list_project_files().
```

Goals may also need tightening, such as specifying “improve error handling and input validation.”

##### Understanding Memory

Each chat exchange mirrors an agent-loop memory entry. Monitoring how much history the model can handle clarifies memory needs.

##### Learning from Failures

Introduce errors or malformed data to test recovery:

```
{"error": "FileNotFoundError: main.py does not exist"}
{"cont3nt": "def broken_func(): pass"}
```

Observe if the agent stays goal-focused.
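
Faulty results like these can be produced from a clean result instead of being typed by hand. This is a small sketch under that assumption; `inject_fault` and its two fault kinds are hypothetical names chosen for illustration.

```python
import json


def inject_fault(result: dict, kind: str) -> dict:
    """Return a deliberately broken variant of a tool result."""
    if kind == "missing_file":
        # Replace the whole result with an error payload.
        return {"error": "FileNotFoundError: main.py does not exist"}
    if kind == "garbled_keys":
        # Corrupt each key by swapping its first "e" for "3".
        return {key.replace("e", "3", 1): value for key, value in result.items()}
    return result  # unknown kind: pass the result through unchanged


clean = {"content": "def broken_func(): pass"}
for kind in ("missing_file", "garbled_keys"):
    # Paste each printed line into the chat as the result of the agent's last action.
    print(json.dumps(inject_fault(clean, kind)))
```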

##### Preventing Runaway Agents

Experiment with stop criteria—e.g., limit number of files read or improvements proposed—without risking infinite loops.
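
One possible stop criterion, sketched below, is a cap on the number of distinct files the agent may read, in the spirit of the "read at most 5 files" instruction. `ReadBudget` is a hypothetical helper; how it is wired into an action is left open.

```python
class ReadBudget:
    """Tracks distinct files read and refuses reads beyond a fixed cap."""

    def __init__(self, max_files: int = 5):
        self.max_files = max_files
        self.files_read = set()

    def allow(self, filename: str) -> bool:
        if filename in self.files_read:
            return True  # re-reading a file does not consume budget
        if len(self.files_read) >= self.max_files:
            return False
        self.files_read.add(filename)
        return True


budget = ReadBudget(max_files=2)
decisions = [budget.allow(name) for name in ["a.py", "b.py", "c.py", "a.py"]]
print(decisions)  # [True, True, False, True]
```

In a simulation, the same rule can be applied by hand: once the cap is reached, answer further read requests with an error result and watch whether the agent moves on.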

##### Rapid Iteration

Simulation lets you test many scenarios quickly, such as large projects or complex code, before implementation.

##### Gathering Insights

At the end, ask the agent for feedback: missing tools, unclear instructions, vague goals. Its reflections can guide design improvements.

##### Building an Example Library

Save good and bad exchanges to refine prompts and create test cases:

* *Good:* Agent reads a file before editing and proposes specific improvements.
* *Poor:* Agent edits files without prior analysis or user approval.

Through repeated simulation and refinement, you gain a clear picture of real-world agent behavior, leading to more robust and well-aligned implementations.


## Modular AI Agent Design

#### Building a Simple Agent Framework

Design agents around the **GAME** concept so the code mirrors the design: the **core loop** stays the same while **Goals, Actions, Memory, Environment** vary. The added structure makes the framework flexible and reusable.

##### Goals

Use a data class to capture what the agent should achieve and how:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str
```

Encapsulating goals (including examples or core rules) allows prioritization and easier prompt construction:

```python
file_management_goal = Goal(
    priority=1,
    name="file_management",
    description="""Manage files by:
    1. Listing files
    2. Reading contents
    3. Searching within files
    4. Explaining contents"""
)
```
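
To illustrate how the `priority` field could feed prompt construction, here is a minimal sketch that orders goals before rendering them as text. It assumes that a lower `priority` number means a more important goal, which the notes do not state; `goals_to_text` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str


def goals_to_text(goals):
    """Render goals as one instruction string, most important first.

    Assumption: a lower `priority` number means a more important goal.
    """
    ordered = sorted(goals, key=lambda goal: goal.priority)
    return "\n\n".join(f"{goal.name}:\n{goal.description}" for goal in ordered)


goals = [
    Goal(priority=2, name="terminate", description="Stop with a summary when done."),
    Goal(priority=1, name="file_management", description="List and read files."),
]
text = goals_to_text(goals)
print(text)
```

Because the data class is frozen, goals cannot be mutated after creation, so the same list can be shared safely between agents.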

##### Actions

Actions are the agent’s toolkit. Wrap each capability in an object:

```python
class Action:
    def __init__(self, name, function, description, parameters, terminal=False):
        self.name = name
        self.function = function
        self.description = description
        self.parameters = parameters
        self.terminal = terminal
    def execute(self, **args):
        return self.function(**args)
```

Maintain a registry for lookup:

```python
class ActionRegistry:
    def __init__(self): self.actions = {}
    def register(self, action): self.actions[action.name] = action
    def get_action(self, name): return self.actions.get(name)
    def get_actions(self): return list(self.actions.values())
```

Example actions:

```python
import os

def list_files():
    return os.listdir('.')

def read_file(file_name):
    # Use a context manager so the file handle is closed promptly.
    with open(file_name) as f:
        return f.read()

def search_in_file(file_name, search_term):
    # Parameter names must match the schema keys below, because
    # Action.execute forwards the arguments as keyword arguments.
    with open(file_name) as f:
        return [(i + 1, line.strip()) for i, line in enumerate(f) if search_term in line]

registry = ActionRegistry()
registry.register(Action("list_files", list_files, "List all files", {}, False))
registry.register(Action("read_file", read_file, "Read a file",
    {"type":"object","properties":{"file_name":{"type":"string"}},"required":["file_name"]}, False))
registry.register(Action("search_in_file", search_in_file, "Search term in file",
    {"type":"object","properties":{"file_name":{"type":"string"}, "search_term":{"type":"string"}},
     "required":["file_name","search_term"]}, False))
```
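
The sketch below shows how a parsed invocation of the form `{"tool": ..., "args": ...}` might be dispatched through the registry. The classes repeat the definitions above so the block runs on its own; `count_words` is a hypothetical action chosen because it needs no file system.

```python
class Action:
    def __init__(self, name, function, description, parameters, terminal=False):
        self.name = name
        self.function = function
        self.description = description
        self.parameters = parameters
        self.terminal = terminal

    def execute(self, **args):
        return self.function(**args)


class ActionRegistry:
    def __init__(self):
        self.actions = {}

    def register(self, action):
        self.actions[action.name] = action

    def get_action(self, name):
        return self.actions.get(name)

    def get_actions(self):
        return list(self.actions.values())


def count_words(text):  # hypothetical action, for illustration only
    return len(text.split())


registry = ActionRegistry()
registry.register(Action(
    "count_words", count_words, "Count the words in a string",
    {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]},
))

# An invocation as it might come back from the response parser.
invocation = {"tool": "count_words", "args": {"text": "goals actions memory environment"}}
action = registry.get_action(invocation["tool"])
result = action.execute(**invocation["args"])
print(result)  # 4

# Unknown tool names return None rather than raising, so callers should check.
missing = registry.get_action("delete_everything")
print(missing)  # None
```

Since `execute` forwards arguments as keyword arguments, the property names in the parameter schema need to match the function's parameter names exactly.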

##### Memory

Provide a simple interface now, allowing later expansion:

```python
class Memory:
    def __init__(self): self.items = []
    def add_memory(self, memory: dict): self.items.append(memory)
    def get_memories(self, limit=None): return self.items[:limit]
```

Wrapping a list enables future changes (e.g., database-backed memory) without altering the core loop.
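
As one illustration of such a swap, the following sketch keeps the same two methods but retains only the newest entries. `WindowedMemory` and its `max_items` parameter are hypothetical; this is one possible variant, not a prescribed one.

```python
from collections import deque


class WindowedMemory:
    """Same interface as Memory, but keeps only the newest `max_items` entries."""

    def __init__(self, max_items: int = 20):
        self.items = deque(maxlen=max_items)

    def add_memory(self, memory: dict):
        # When the deque is full, appending silently drops the oldest entry.
        self.items.append(memory)

    def get_memories(self, limit=None):
        return list(self.items)[:limit]


memory = WindowedMemory(max_items=2)
for i in range(3):
    memory.add_memory({"type": "user", "content": f"step {i}"})

contents = [m["content"] for m in memory.get_memories()]
print(contents)  # ['step 1', 'step 2']
```

Any code that only calls `add_memory` and `get_memories` can use this class in place of `Memory` without modification.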

##### Environment

Bridge between agent and world—executes actions and returns results:

```python
import time
import traceback
from typing import Any

class Environment:
    def execute_action(self, action: Action, args: dict) -> dict:
        try:
            result = action.execute(**args)
            return self.format_result(result)
        except Exception as e:
            return {"tool_executed": False, "error": str(e), "traceback": traceback.format_exc()}

    def format_result(self, result: Any) -> dict:
        return {
            "tool_executed": True,
            "result": result,
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z")
        }
```

This modular design keeps the **agent loop** untouched while swapping in different GAME components.
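
To make the result format concrete, the sketch below runs one succeeding and one failing action through the environment. `Action` is trimmed to the two fields needed here, and `divide` is a hypothetical action chosen because it is easy to make fail.

```python
import time
import traceback
from typing import Any


class Action:
    def __init__(self, name, function):
        self.name = name
        self.function = function

    def execute(self, **args):
        return self.function(**args)


class Environment:
    def execute_action(self, action: Action, args: dict) -> dict:
        try:
            return self.format_result(action.execute(**args))
        except Exception as e:
            # Failures become data the agent can read, not crashes.
            return {"tool_executed": False, "error": str(e),
                    "traceback": traceback.format_exc()}

    def format_result(self, result: Any) -> dict:
        return {"tool_executed": True, "result": result,
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S%z")}


def divide(a, b):  # hypothetical action, for illustration only
    return a / b


env = Environment()
ok = env.execute_action(Action("divide", divide), {"a": 6, "b": 3})
bad = env.execute_action(Action("divide", divide), {"a": 6, "b": 0})
print(ok["tool_executed"], ok["result"])
print(bad["tool_executed"], bad["error"])
```

Both outcomes are plain dictionaries with a `tool_executed` flag, so the loop can store them in memory the same way and let the LLM decide how to react to an error.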


#### Building a Simple Agent Framework (Part 2)

Design a reusable **Agent** that encapsulates GAME components (Goals, Actions, Memory, Environment) and a pluggable **AgentLanguage** + **LLM caller**. Swap GAME pieces without touching the core loop.

##### Agent Class (condensed)

```python
import json

class Agent:
    def __init__(self, goals, agent_language, action_registry, generate_response, environment):
        self.goals = goals
        self.agent_language = agent_language
        self.actions = action_registry
        self.generate_response = generate_response
        self.environment = environment

    def construct_prompt(self, goals, memory, actions):
        return self.agent_language.construct_prompt(
            actions=actions.get_actions(),
            environment=self.environment,
            goals=goals,
            memory=memory
        )

    def get_action(self, response):
        inv = self.agent_language.parse_response(response)
        return self.actions.get_action(inv["tool"]), inv

    def should_terminate(self, response):
        action_def, _ = self.get_action(response)
        # An unknown tool name yields None; treat it as non-terminal.
        return action_def is not None and action_def.terminal

    def set_current_task(self, memory, task): memory.add_memory({"type":"user","content":task})

    def update_memory(self, memory, response, result):
        for m in [{"type":"assistant","content":response},
                  {"type":"user","content":json.dumps(result)}]:
            memory.add_memory(m)

    def prompt_llm_for_action(self, prompt): return self.generate_response(prompt)

    def run(self, user_input, memory=None, max_iterations=50):
        memory = memory or Memory()
        self.set_current_task(memory, user_input)
        for _ in range(max_iterations):
            prompt = self.construct_prompt(self.goals, memory, self.actions)
            response = self.prompt_llm_for_action(prompt)
            action, inv = self.get_action(response)
            result = self.environment.execute_action(action, inv["args"])
            self.update_memory(memory, response, result)
            if self.should_terminate(response): break
        return memory
```

##### Loop Steps

1. **Construct Prompt** → `AgentLanguage.construct_prompt(goals, actions, memory, environment)`
2. **Generate Response** → `generate_response(prompt)` (LLM-agnostic)
3. **Parse Response** → `AgentLanguage.parse_response` → `{tool, args}`
4. **Execute Action** → `Environment.execute_action(action, args)`
5. **Update Memory** → append agent decision + env result
6. **Terminate?** → stop on terminal action or iteration cap
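
The six steps can be exercised without a real model by scripting the LLM replies. The following is a stripped-down sketch, not the `Agent` class itself: `run_loop` and `scripted_llm` are hypothetical helpers, actions are a plain dictionary, and the prompt is a simple dictionary standing in for whatever the agent language would build.

```python
import json


def run_loop(task, actions, generate_response, max_iterations=10):
    """`actions` maps a tool name to a (function, is_terminal) pair."""
    memory = [{"type": "user", "content": task}]
    for _ in range(max_iterations):
        prompt = {"messages": list(memory), "tools": sorted(actions)}  # 1. construct prompt
        response = generate_response(prompt)                           # 2. generate response
        invocation = json.loads(response)                              # 3. parse response
        function, terminal = actions.get(invocation["tool"], (None, False))
        try:                                                           # 4. execute action
            if function is None:
                raise KeyError(f"unknown tool {invocation['tool']}")
            result = {"tool_executed": True, "result": function(**invocation["args"])}
        except Exception as e:
            result = {"tool_executed": False, "error": str(e)}
        memory.append({"type": "assistant", "content": response})      # 5. update memory
        memory.append({"type": "user", "content": json.dumps(result)})
        if terminal:                                                   # 6. terminate?
            break
    return memory


def scripted_llm(replies):
    """Stand-in for a real LLM call: returns canned replies in order."""
    remaining = iter(replies)
    return lambda prompt: next(remaining)


actions = {
    "list_files": (lambda: ["main.py", "utils.py"], False),
    "terminate": (lambda message: message, True),
}
script = [
    json.dumps({"tool": "list_files", "args": {}}),
    json.dumps({"tool": "terminate", "args": {"message": "Found 2 files."}}),
]
memory = run_loop("Explore the project", actions, scripted_llm(script))
for item in memory:
    print(item["type"], "->", item["content"])
```

A scripted caller like this also makes the loop easy to unit test, since the sequence of decisions is fixed and only the loop mechanics are under test.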

##### What Each Part Contributes

* **Memory**: rolling context of tasks, decisions, results.
* **Goals**: “what/how” rules guiding decisions.
* **ActionRegistry**: capability surface + lookup by name.
* **AgentLanguage**: prompt formatting + response parsing (with/without function calling).
* **Environment**: safe, uniform action execution + result formatting.

##### Specialized Agents (examples)

```python
# Research agent
research_agent = Agent(
  goals=[Goal(priority=1, name="research", description="Find & summarize topic X")],
  agent_language=ResearchLanguage(),
  action_registry=ActionRegistry(),   # e.g., SearchAction(), SummarizeAction()
  generate_response=openai_call,
  environment=WebEnvironment()
)

# Coding agent
coding_agent = Agent(
  goals=[Goal(priority=1, name="coding", description="Write & debug code for task Y")],
  agent_language=CodingLanguage(),
  action_registry=ActionRegistry(),   # e.g., WriteCodeAction(), TestCodeAction()
  generate_response=anthropic_call,
  environment=DevEnvironment()
)
```

Same loop, different GAME components → different behaviors without changing core logic.


#### Building a Simple Agent Framework (Part 3)

Use the GAME framework to rebuild the **file explorer agent** with clear, swappable components and an unchanged core loop.

##### Goals (what & when to stop)

```python
goals = [
    Goal(priority=1, name="Explore Files",
         description="List and read files in the current directory"),
    Goal(priority=2, name="Terminate",
         description="End session with a helpful summary when done")
]
```

##### Actions (capabilities via registry)

```python
import os
from typing import List

def list_files() -> List[str]:
    return os.listdir(".")

def read_file(file_name: str) -> str:
    try:
        with open(file_name, "r") as f:
            return f.read()
    except FileNotFoundError:
        return f"Error: {file_name} not found."
    except Exception as e:
        return f"Error: {e}"

def terminate(message: str) -> str:
    return message

action_registry = ActionRegistry()
action_registry.register(Action("list_files", list_files,
    "List files in the directory.", parameters={}, terminal=False))
action_registry.register(Action("read_file", read_file,
    "Read a file's contents.",
    parameters={"type":"object","properties":{"file_name":{"type":"string"}},
               "required":["file_name"]}, terminal=False))
action_registry.register(Action("terminate", terminate,
    "Finish with a summary.",
    parameters={"type":"object","properties":{"message":{"type":"string"}},
               "required":["message"]}, terminal=True))
```

##### Agent wiring (language, env, loop)

```python
agent_language = AgentFunctionCallingActionLanguage()
environment = Environment()

file_explorer_agent = Agent(
    goals=goals,
    agent_language=agent_language,
    action_registry=action_registry,
    generate_response=generate_response,  # your LLM caller
    environment=environment
)
```

##### Run

```python
user_input = input("What would you like me to do? ")
final_memory = file_explorer_agent.run(user_input, max_iterations=10)

for item in final_memory.get_memories():
    print(f"\n{item['type'].upper()}: {item['content']}")
```

##### Minimal main (all-in-one)

```python
def main():
    # goals, actions, registry as above...
    agent_language = AgentFunctionCallingActionLanguage()
    environment = Environment()
    agent = Agent(goals, agent_language, action_registry, generate_response, environment)
    mem = agent.run(input("What would you like me to do? "), max_iterations=10)
    for m in mem.get_memories(): print(f"\nMemory: {m['content']}")

if __name__ == "__main__":
    main()
```

##### Loop flow (one iteration)

1. **Prompt**: `AgentLanguage.construct_prompt(goals, actions, memory, environment)`
2. **LLM**: `generate_response(prompt)` → tool call `{tool, args}`
3. **Lookup**: `ActionRegistry.get_action(tool)`
4. **Execute**: `Environment.execute_action(action, args)`
5. **Memory**: append decision + result
6. **Stop?**: terminal action or iteration cap

##### Example interaction (abridged)

```
Agent Decision → {"tool": "list_files", "args": {}}
Action Result  → {"tool_executed": true, "result": ["file1.py","main.py"], ...}

Agent Decision → {"tool": "read_file", "args": {"file_name": "file1.py"}}
Action Result  → {"tool_executed": true, "result": "...file contents...", ...}

Agent Decision → {"tool": "terminate",
                  "args": {"message": "Summary of files and roles"}}
```

##### Benefits

* **Organization**: clear separation of GAME parts
* **Reusability**: swap goals/actions/env without touching the loop
* **Extensibility**: add new tools and rules painlessly
* **Consistency**: one Agent API for many agents
* **Memory**: standardized logging of decisions and outcomes


#### AgentLanguage (translator between GAME and the LLM)

Keeps the **agent loop** generic by handling:

* **Prompt construction**: Goals + Actions + Memory (+ Environment) → LLM-ready prompt
* **Response parsing**: LLM output → `{tool, args}` action invocation

##### Base interface

```python
class AgentLanguage:
    def construct_prompt(self, actions, environment, goals, memory) -> Prompt:
        raise NotImplementedError
    def parse_response(self, response: str) -> dict:
        raise NotImplementedError
```
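
The `Prompt` type used in the interface is not defined in these notes. A minimal sketch, assuming it only needs to carry chat messages and tool specifications, could look like this; the field names are assumptions chosen to match the function-calling prompt layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Prompt:
    """Container passed from AgentLanguage.construct_prompt to the LLM caller."""
    messages: List[dict] = field(default_factory=list)  # chat messages with role/content
    tools: List[dict] = field(default_factory=list)     # tool specs; empty when unused


prompt = Prompt(messages=[{"role": "user", "content": "Analyze file.txt"}])
print(prompt.messages, prompt.tools)
```

Using `default_factory` gives every `Prompt` its own lists, so one prompt's messages never leak into another.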

##### Where it fits in the loop

1. Build prompt → `AgentLanguage.construct_prompt(...)`
2. Get LLM response → `generate_response(prompt)`
3. Parse to action → `AgentLanguage.parse_response(response)`
4. Execute action → `Environment.execute_action(...)`
5. Update memory → decisions + results

##### Prompt construction (concept)

```python
def construct_prompt(self, actions, environment, goals, memory):
    messages = []
    messages += self.format_goals(goals)      # instructions
    messages += self.format_memory(memory)    # context/history
    tools = self.format_actions(actions)      # tool specs
    return Prompt(messages=messages, tools=tools)

##### Example prompt (function-calling style)

```json
{
  "messages": [
    {"role": "system", "content": "You process files."},
    {"role": "user", "content": "Analyze file.txt"}
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Reads a file",
      "parameters": {"type":"object","properties":{"file_path":{"type":"string"}}}
    }
  }]
}
```

---

#### Two concrete AgentLanguage implementations

##### JSON Action Language (natural text + `action` block)

````python
import json

class AgentJsonActionLanguage(AgentLanguage):
    # Appended to the tool list so the model knows the expected reply format.
    action_format = """
<Think step by step>

```action
{"tool": "tool_name", "args": {...}}
```
"""

    def format_actions(self, actions):
        desc = [{"name": a.name, "description": a.description, "args": a.parameters} for a in actions]
        return [{"role": "system",
                 "content": f"Available Tools: {json.dumps(desc, indent=4)}\n\n{self.action_format}"}]

    def parse_response(self, response: str) -> dict:
        # Extract the JSON between the ```action marker and the last ``` fence.
        start, end = "```action", "```"
        s = response.strip()
        js = s[s.find(start) + len(start): s.rfind(end)].strip()
        return json.loads(js)
````

##### Function Calling Language (structured tool calls)

```python
class AgentFunctionCallingActionLanguage(AgentLanguage):
    def format_actions(self, actions):
        return [{"type":"function","function":{
            "name": a.name, "description": a.description[:1024], "parameters": a.parameters
        }} for a in actions]

    def construct_prompt(self, actions, environment, goals, memory):
        prompt = []
        prompt += self.format_goals(goals)
        prompt += self.format_memory(memory)
        tools = self.format_actions(actions)
        return Prompt(messages=prompt, tools=tools)

    def parse_response(self, response: str) -> dict:
        try:
            return json.loads(response)  # expects {"tool": "...", "args": {...}}
        except Exception:
            return {"tool": "terminate", "args": {"message": response}}
```
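
Both language classes call `format_goals` and `format_memory`, which these notes do not show. The sketch below is one plausible shape for them, written as standalone functions; the exact message layout is an assumption, and in the framework `format_memory` would likely receive a `Memory` object and call `get_memories()` rather than take a list.

```python
import json
from collections import namedtuple

# Stand-in for the Goal data class, so the sketch runs on its own.
Goal = namedtuple("Goal", ["priority", "name", "description"])


def format_goals(goals):
    """Combine all goals into a single system message."""
    text = "\n\n".join(f"{goal.name}:\n{goal.description}" for goal in goals)
    return [{"role": "system", "content": text}]


def format_memory(memory_items):
    """Map memory entries ({"type", "content"}) to chat messages."""
    messages = []
    for item in memory_items:
        # Agent decisions keep the assistant role; tasks and tool results
        # are both stored with type "user" and are sent back as user messages.
        role = "assistant" if item.get("type") == "assistant" else "user"
        content = item.get("content") or json.dumps(item)
        messages.append({"role": role, "content": content})
    return messages


messages = format_goals([Goal(1, "Explore Files", "List and read files")]) + format_memory([
    {"type": "user", "content": "What is in this directory?"},
    {"type": "assistant", "content": '{"tool": "list_files", "args": {}}'},
])
print(json.dumps(messages, indent=2))
```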

---

#### Why this separation helps

* **Centralized communication logic**: one place for prompt/parse rules
* **Swappable strategies**: try NL, JSON blocks, or function-calling without touching the loop
* **LLM portability**: adapt to providers’ styles and strengths
* **Reliability**: structured outputs + localized error handling
* **Future-proofing**: evolve prompting/formatting as LLMs change

---

#### Quick usage examples

```python
# Natural language + action blocks
simple_agent = Agent(
    goals=goals,
    agent_language=AgentJsonActionLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=env
)

# Function calling for stricter structure
complex_agent = Agent(
    goals=goals,
    agent_language=AgentFunctionCallingActionLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=env
)
```
