

# **Building a Simple Agent Framework (The Orchestrator Pattern)**

In this lecture, you should focus on **how the full GAME framework comes together in a unified orchestrator loop** and why this design makes agents flexible, reusable, and easy to extend.

---

### **1. The Orchestrator Is the Glue**

* The `Agent` class acts as the **Orchestrator**, coordinating the four GAME components and two supporting interfaces:

  * **Goals** (G) – What the agent is trying to achieve.
  * **Actions** (A) – The tools it can use, stored in an `ActionRegistry`.
  * **Memory** (M) – Context from prior interactions.
  * **Environment** (E) – The “body” that actually executes actions.
  * **AgentLanguage** – Formats prompts and parses responses for the LLM.
  * **`generate_response`** – The LLM interface itself.
* The orchestrator ensures these parts work in harmony without you rewriting the core loop for every new agent.

---

### **2. Six Steps in the Orchestrator Loop**

1. **Construct Prompt** — Combines Goals, Memory, Actions, and Environment into a structured prompt via `AgentLanguage`.
2. **Generate Response** — Uses `generate_response()` to call your chosen LLM.
3. **Parse Response** — Maps the LLM’s output to an actionable `tool_name` + parameters.
4. **Execute Action** — Delegates to the Environment, which calls the registered Action with the given arguments.
5. **Update Memory** — Adds the agent’s decision and the execution result for future context.
6. **Check Termination** — Ends if the selected action has `terminal=True`.

---

### **3. Benefits of the Orchestrator Design**

* **Separation of Concerns** — Each component does one job; the orchestrator just coordinates.
* **Reusability** — The same orchestrator loop can drive agents for coding, research, file management, or entirely different domains.
* **Swappability** — Swap in different Goals, Actions, Memory strategies, or Environments without breaking the rest of the system.
* **LLM-Agnostic** — Works with OpenAI, Anthropic, or local models by changing only the `generate_response` function.
* **Low Maintenance** — Adding new tools means registering them; you never touch the orchestrator loop.
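
The LLM-agnostic point can be illustrated with a minimal sketch. The names `TOOLS`, `run_loop`, `make_scripted_llm`, and `keyword_llm` are hypothetical and not part of the notebook, and neither backend calls a real model. The loop only assumes that `generate_response` maps a prompt string to a decision string.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical tool table: name -> (function, terminal flag).
TOOLS: Dict[str, Tuple[Callable[[], Any], bool]] = {
    "list_files": (lambda: ["file1.txt", "file2.txt"], False),
    "terminate": (lambda: "done", True),
}

def make_scripted_llm(script: List[str]) -> Callable[[str], str]:
    """Build a generate_response stand-in that replays fixed decisions."""
    decisions = iter(script)

    def generate_response(prompt: str) -> str:
        # Once the script runs out, fall back to the terminal tool.
        return next(decisions, "use terminate")

    return generate_response

def keyword_llm(prompt: str) -> str:
    """A second stand-in: list files once, then stop."""
    return "use terminate" if "file1.txt" in prompt else "use list_files"

def run_loop(generate_response: Callable[[str], str], max_iterations: int = 5) -> List[str]:
    """The same loop drives any backend with the prompt -> decision signature."""
    history: List[str] = []
    for _ in range(max_iterations):
        prompt = f"Tools: {sorted(TOOLS)} | History: {history}"
        tool_name = generate_response(prompt).split()[-1]
        function, terminal = TOOLS[tool_name]
        history.append(f"{tool_name} -> {function()}")
        if terminal:
            break
    return history

scripted = run_loop(make_scripted_llm(["use list_files", "use list_files"]))
keyword = run_loop(keyword_llm)
print(scripted)
print(keyword)
```

Swapping the backend changes only the argument passed to `run_loop`. A real provider would be wrapped in a function with the same signature.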

---

### **4. Key Abstractions to Notice**

* **`AgentLanguage`** — The “translator” between your components and the LLM; controls prompt formatting & response parsing.
* **`ActionRegistry`** — A labeled, structured “tool shed” where every action is documented and callable by name.
* **`Environment`** — The body that handles operational details, freeing the orchestrator (brain) from implementation clutter.
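
As a sketch of the "translator" role, an `AgentLanguage` could ask the model for JSON and parse it defensively. `JsonAgentLanguage` and the `parse_error` key are assumptions made for illustration. They are not part of the notebook's `AgentLanguage`.

```python
import json
from typing import Any, Dict

class JsonAgentLanguage:
    """Hypothetical AgentLanguage variant that expects a JSON reply."""

    def parse_response(self, response: str) -> Dict[str, Any]:
        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = None
        # Accept only a JSON object with a string "tool". Anything else becomes
        # a parse-error invocation the orchestrator can write back to memory.
        if not isinstance(data, dict) or not isinstance(data.get("tool"), str):
            return {
                "tool": None,
                "args": {},
                "parse_error": 'Expected JSON like {"tool": "...", "args": {...}}',
            }
        args = data.get("args")
        return {"tool": data["tool"], "args": args if isinstance(args, dict) else {}}

language = JsonAgentLanguage()
good = language.parse_response('{"tool": "read_file", "args": {"path": "notes.txt"}}')
bad = language.parse_response("I think I should list the files first.")
print(good)
print(bad)
```

Returning `tool: None` instead of raising keeps the orchestrator's existing "unknown action" branch in charge of recovery.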

---

### **5. The Orchestrated Flow**

Mentally trace the data path:

**User Input → Goals → Prompt → LLM → Parsed Action → Environment Execution → Result → Memory Update → Next Loop (or Terminate)**



In [None]:
from typing import List, Dict, Any, Callable

# --- GAME Components (Simplified) ---

class Goal:
    def __init__(self, priority: int, name: str, description: str):
        self.priority = priority
        self.name = name
        self.description = description


class Action:
    def __init__(self, name: str, function: Callable, description: str, parameters: Dict, terminal: bool = False):
        self.name = name
        self.function = function
        self.description = description
        self.parameters = parameters
        self.terminal = terminal

    def execute(self, **kwargs) -> Any:
        return self.function(**kwargs)


class ActionRegistry:
    def __init__(self):
        self._actions: Dict[str, Action] = {}

    def register(self, action: Action):
        self._actions[action.name] = action

    def get_action(self, name: str) -> Action:
        return self._actions.get(name)

    def get_actions(self) -> List[Action]:
        return list(self._actions.values())


class Memory:
    def __init__(self):
        self.items = []

    def add_memory(self, memory: dict):
        self.items.append(memory)

    def get_memories(self, limit: int = None) -> List[dict]:
        return self.items[-limit:] if limit else self.items


class Environment:
    def execute_action(self, action: Action, args: dict) -> dict:
        try:
            result = action.execute(**args)
            return {"tool_executed": True, "result": result}
        except Exception as e:
            return {"tool_executed": False, "error": str(e)}


class AgentLanguage:
    """Mock AgentLanguage for demonstration purposes"""
    def construct_prompt(self, actions: List[Action], environment: Environment, goals: List[Goal], memory: Memory) -> str:
        return f"Goals: {[g.name for g in goals]} | Tools: {[a.name for a in actions]} | Memory: {memory.items}"

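    def parse_response(self, response: str) -> dict:
        # Simulated parsing: expects "use <tool_name>" and takes the last word.
        # An empty response yields tool=None, which the agent treats as unknown.
        words = response.split()
        tool = words[-1] if words else None
        return {"tool": tool, "args": {}}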

# --- Orchestrator ---

class Agent:
    def __init__(self, goals, agent_language, action_registry, generate_response, environment):
        self.goals = goals
        self.agent_language = agent_language
        self.actions = action_registry
        self.generate_response = generate_response
        self.environment = environment

    def construct_prompt(self, goals, memory, actions):
        return self.agent_language.construct_prompt(
            actions=actions.get_actions(),
            environment=self.environment,
            goals=goals,
            memory=memory
        )

    def get_action(self, response):
        invocation = self.agent_language.parse_response(response)
        action = self.actions.get_action(invocation["tool"])
        return action, invocation

    def should_terminate(self, response):
        action_def, _ = self.get_action(response)
        # Wrap in bool() so the method never returns None for an unknown action
        return bool(action_def and action_def.terminal)

    def run(self, user_input: str, memory=None, max_iterations: int = 5):
        memory = memory or Memory()
        memory.add_memory({"type": "user", "content": user_input})

        for _ in range(max_iterations):
            prompt = self.construct_prompt(self.goals, memory, self.actions)
            print(f"Prompt to LLM: {prompt}")

            response = self.generate_response(prompt)
            print(f"Agent Decision: {response}")

            action, invocation = self.get_action(response)
            if not action:
                print("Unknown action.")
                break

            result = self.environment.execute_action(action, invocation["args"])
            print(f"Action Result: {result}")

            memory.add_memory({"type": "assistant", "content": response})
            memory.add_memory({"type": "system", "content": str(result)})

            if self.should_terminate(response):
                print("Agent terminating.")
                break


# --- Example usage ---
def mock_generate_response(prompt):
    # Always chooses the list_files tool for demo
    return "use list_files"

def list_files():
    return ["file1.txt", "file2.txt"]

# Setup components
goals = [Goal(1, "File Management", "Manage and read files.")]
registry = ActionRegistry()
registry.register(Action("list_files", list_files, "List all files", {"type": "object", "properties": {}}, terminal=False))
environment = Environment()
language = AgentLanguage()

# Create and run agent
agent = Agent(goals, language, registry, mock_generate_response, environment)
agent.run("List all files in the directory")




# What makes the updated template below a solid starting point

* **Orchestrator pattern (brain)**: One small loop that just coordinates (prompt → decision → execute → memory → repeat). You never rewrite this.
* **Separation of concerns (GAME)**:

  * Goals = *what & how* (prioritized)
  * Actions = *what you can do* (registry + schemas)
  * Memory = *what happened* (uniform `{role, content}`)
  * Environment = *how it’s executed* (uniform results)
* **LLM-agnostic boundary**: `generate_response(prompt)` is a single swap point (mock, OpenAI, Anthropic, local).
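* **Uniform envelopes**: Environment always returns

  * success → `{"tool_executed": True, "result": ...}`
  * failure → `{"tool_executed": False, "error": "...", "hint": "...", "retryable": bool}`

  Because every result has the same shape, the model can read `error` and `hint` from memory and choose a different action or corrected arguments on the next turn. In the template below, `Environment` returns only `error`, the orchestrator adds `hint` for invalid arguments, and `retryable` is left as an extension.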
* **Schema-driven actions**: Each tool has a JSON-schema-like `parameters` and optional validation before execution.
* **Error-aware memory updates**: Unknown tools, invalid args, and tool errors are all written back to memory—so the model can self-correct next turn.
* **Termination & guardrails**: `terminal=True` on actions + `max_iterations`. Simple and reliable.
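
The failure envelope above carries `hint` and `retryable`. Here is a minimal sketch of how an environment might fill them in. The helper `execute_with_envelope` and the toy `read_txt_file` tool are hypothetical and not in the notebook. The mapping from exception type to hint is an illustrative heuristic, not a rule.

```python
from typing import Any, Callable, Dict

def execute_with_envelope(fn: Callable[..., Any], args: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool and always return the same envelope shape."""
    try:
        return {"tool_executed": True, "result": fn(**args)}
    except FileNotFoundError as e:
        return {"tool_executed": False, "error": str(e),
                "hint": "Call list_txt_files first to see valid names.",
                "retryable": True}
    except TypeError as e:
        # Usually a wrong or missing argument name, though a TypeError
        # raised inside the tool itself would land here as well.
        return {"tool_executed": False, "error": str(e),
                "hint": "Check the argument names against the tool's parameters.",
                "retryable": True}
    except Exception as e:
        return {"tool_executed": False, "error": str(e), "hint": "", "retryable": False}

def read_txt_file(name: str) -> str:
    """Toy tool backed by an in-memory 'file system'."""
    files = {"000.txt": "hello"}
    if name not in files:
        raise FileNotFoundError(f"No such file: {name}")
    return files[name]

ok = execute_with_envelope(read_txt_file, {"name": "000.txt"})
missing = execute_with_envelope(read_txt_file, {"name": "999.txt"})
bad_args = execute_with_envelope(read_txt_file, {"path": "000.txt"})
print(ok)
print(missing)
print(bad_args)
```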

# What to be sure to include in future agents

* **Invariant interfaces (don’t break these):**

  * `AgentLanguage.construct_prompt(...)` and `.parse_response(...)`
  * `ActionRegistry.get_action(name)` and `Action.execute(**kwargs)`
  * `Environment.execute_action(action, args)` → uniform envelope
* **Consistent memory schema**: always `{role, content}`; pick one set of roles (user/assistant/tool) and stick to it.
* **Deterministic vs. cognitive split**:

  * Deterministic (Python): sorting, filtering, validation, truncation, safety checks
  * Cognitive (LLM): planning, tradeoffs, summarization, choosing tools
* **Just-in-time hints in errors**: return `hint`/`next_tool` in error payloads (e.g., “Call `list_txt_files` first”), not buried in the big system prompt.
* **Observability hooks**: a place to log `on_step_start`, `on_action`, `on_result`, `on_terminate` (even if no-ops now).
* **Deterministic prompt construction**: sort goals/actions, cap memory window, avoid noisy text; keep the prompt compact and stable.
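
One way to sketch the observability hooks is a small container of callbacks that default to no-ops. The `Hooks` dataclass and `run_steps` loop are assumptions for illustration. The notebook's `Agent` does not define them, and the loop fakes tool execution to keep the example short.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

def _noop(*args: Any, **kwargs: Any) -> None:
    """Default hook: do nothing."""

@dataclass
class Hooks:
    on_step_start: Callable[[int], None] = _noop
    on_action: Callable[[str, Dict[str, Any]], None] = _noop
    on_result: Callable[[Dict[str, Any]], None] = _noop
    on_terminate: Callable[[int], None] = _noop

def run_steps(decisions: List[Dict[str, Any]], hooks: Optional[Hooks] = None) -> None:
    """Stand-in for the orchestrator loop that fires a hook at each stage."""
    if hooks is None:
        hooks = Hooks()
    for step, decision in enumerate(decisions):
        hooks.on_step_start(step)
        hooks.on_action(decision["tool"], decision.get("args", {}))
        # Simulated execution: a real loop would call the Environment here.
        result = {"tool_executed": True, "result": f"ran {decision['tool']}"}
        hooks.on_result(result)
        if decision.get("terminal"):
            hooks.on_terminate(step)
            break

events: List[str] = []
hooks = Hooks(
    on_action=lambda tool, args: events.append(f"action:{tool}"),
    on_terminate=lambda step: events.append(f"terminate:{step}"),
)
run_steps(
    [{"tool": "list_txt_files"}, {"tool": "terminate", "terminal": True}],
    hooks,
)
print(events)
```

Because unset hooks fall back to `_noop`, the loop never needs to check whether a hook exists before calling it.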

# Quick checklist before you “ship” an agent

* [ ] Actions have clear names, good descriptions, and correct schemas.
* [ ] Registry exports tools (if using function calling) and validates args.
* [ ] Environment enforces safety (paths, sizes, timeouts) and returns uniform envelopes.
* [ ] Memory uses consistent roles; context window is bounded (sliding window OK).
* [ ] Orchestrator handles: unknown tool, invalid args, tool failure, termination.
* [ ] `generate_response` is injectable (easy to mock in tests).
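
For the "validates args" item, a required-key check can be extended with basic type checks. This is a sketch under stated assumptions: the standalone `validate_args` function and the `_JSON_TYPES` table are hypothetical, and they cover only a small subset of JSON Schema. A full validator library would be the more complete option.

```python
from typing import Any, Dict, Tuple

# Map JSON Schema type names to Python types (a deliberately small subset).
_JSON_TYPES: Dict[str, Any] = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

def validate_args(schema: Dict[str, Any], args: Dict[str, Any]) -> Tuple[bool, str]:
    """Check required keys, unexpected keys, and top-level value types."""
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            return False, f"Missing required arg: {key}"
    for key, value in args.items():
        if key not in properties:
            return False, f"Unexpected arg: {key}"
        type_name = properties[key].get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # No declared or known type: accept the value as-is.
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if not isinstance(value, expected) or is_stray_bool:
            return False, f"Arg {key!r} should be of type {type_name}"
    return True, "ok"

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "max_lines": {"type": "integer"}},
    "required": ["name"],
}
print(validate_args(schema, {"name": "000.txt"}))
print(validate_args(schema, {}))
print(validate_args(schema, {"name": "000.txt", "max_lines": "ten"}))
```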

# Nice future upgrades (when you need them)

* Summarized/Vector recall memory strategies.
* `return_schema` on actions for stricter post-validation.
* `requires_approval`, `idempotent`, `timeout_s` in action `metadata`.
* Function calling integration (so the LLM emits structured tool calls directly).
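
As one possible shape for the `metadata` upgrade, an environment could consult `requires_approval` before running a tool. Everything here is hypothetical: `GuardedAction`, `execute_guarded`, and the `approve` callback are illustrative names, and a real approval step would likely prompt a human, not call a function that returns a constant.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class GuardedAction:
    """Hypothetical action record with a free-form metadata dict."""
    name: str
    fn: Callable[..., Any]
    metadata: Dict[str, Any] = field(default_factory=dict)

def execute_guarded(
    action: GuardedAction,
    args: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> Dict[str, Any]:
    """Ask for approval when the metadata requires it, then run the tool."""
    if action.metadata.get("requires_approval") and not approve(action.name, args):
        return {"tool_executed": False, "error": f"Approval denied for {action.name}"}
    try:
        return {"tool_executed": True, "result": action.fn(**args)}
    except Exception as e:
        return {"tool_executed": False, "error": str(e)}

deleted: List[str] = []

def delete_file(name: str) -> str:
    """Toy destructive tool that only records what it was asked to delete."""
    deleted.append(name)
    return f"deleted {name}"

def deny_all(name: str, args: Dict[str, Any]) -> bool:
    return False

def allow_all(name: str, args: Dict[str, Any]) -> bool:
    return True

delete_action = GuardedAction("delete_file", delete_file, {"requires_approval": True})
list_action = GuardedAction("list_txt_files", lambda: ["000.txt", "001.txt"])

denied = execute_guarded(delete_action, {"name": "000.txt"}, approve=deny_all)
listed = execute_guarded(list_action, {}, approve=deny_all)
approved = execute_guarded(delete_action, {"name": "000.txt"}, approve=allow_all)
print(denied, listed, approved, sep="\n")
```

The denial comes back in the same envelope as any other failure, so the orchestrator loop needs no special case for it.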



In [None]:
# Orchestrator (Agent) — Updated Minimal Template
# Includes:
# - Consistent memory roles (user/assistant/tool)
# - Unknown/invalid action handling recorded in memory
# - Tool failure recovery (continue loop)
# - Optional arg validation via registry schema
# - Uniform environment result envelope

from typing import List, Dict, Any, Optional, Callable
from dataclasses import dataclass

# ---------------- G: Goals ----------------
@dataclass(frozen=True)
class Goal:
    priority: int
    name: str
    description: str

# --------------- A: Actions + Registry ---------------
class Action:
    def __init__(self, name: str, fn: Callable, description: str, parameters: Dict, terminal: bool=False):
        self.name = name
        self.fn = fn
        self.description = description
        self.parameters = parameters  # JSON Schema-like dict
        self.terminal = terminal
    def execute(self, **kwargs):
        return self.fn(**kwargs)

class ActionRegistry:
    def __init__(self):
        self._actions: Dict[str, Action] = {}
    def register(self, action: Action):
        if action.name in self._actions:
            raise ValueError(f"Action already registered: {action.name}")
        self._actions[action.name] = action
    def get_action(self, name: str) -> Optional[Action]:
        return self._actions.get(name)
    def get_actions(self) -> List[Action]:
        return list(self._actions.values())
    def validate_args(self, action: Action, args: Dict[str, Any]) -> tuple[bool, str]:
        schema = action.parameters or {"type":"object","properties":{},"required":[]}
        required = schema.get("required", [])
        for key in required:
            if key not in args:
                return False, f"Missing required arg: {key}"
        # (Extend with type checks or jsonschema as needed)
        return True, "ok"

# ---------------- M: Memory ----------------
class Memory:
    def __init__(self):
        self.items: List[Dict[str, Any]] = []  # each: {role, content}
    def add_memory(self, m: Dict[str, Any]):
        # Expect keys: role, content
        self.items.append(m)
    def get_memories(self, limit: Optional[int]=None) -> List[Dict[str, Any]]:
        return self.items[-limit:] if limit else self.items

# ---------------- E: Environment ----------------
class Environment:
    def execute_action(self, action: Action, args: Dict[str, Any]) -> Dict[str, Any]:
        try:
            result = action.execute(**args)
            return {"tool_executed": True, "result": result}
        except Exception as e:
            return {"tool_executed": False, "error": str(e)}

# ------------- AgentLanguage (prompt + parse) -------------
class AgentLanguage:
    def construct_prompt(self, actions: List[Action], environment: Environment, goals: List[Goal], memory: Memory) -> Dict[str, Any]:
        return {
            "goals": [g.description for g in sorted(goals, key=lambda g: g.priority)],
            "tools": [
                {"name": a.name, "description": a.description, "parameters": a.parameters}
                for a in actions
            ],
            "memory": memory.get_memories(8)
        }
    def parse_response(self, response: Dict[str, Any]) -> Dict[str, Any]:
        # Expect structured: {"tool": "name", "args": {...}}
        return response

# ---------------- Orchestrator (Agent) ----------------
class Agent:
    def __init__(self, goals, agent_language, action_registry, generate_response, environment):
        self.goals = goals
        self.agent_language = agent_language
        self.actions = action_registry
        self.generate_response = generate_response  # Callable[prompt_dict] -> response_dict
        self.environment = environment

    def construct_prompt(self, goals, memory, actions):
        return self.agent_language.construct_prompt(
            actions=actions.get_actions(),
            environment=self.environment,
            goals=goals,
            memory=memory
        )

    def prompt_llm_for_action(self, full_prompt):
        return self.generate_response(full_prompt)

    def get_action(self, response):
        invocation = self.agent_language.parse_response(response)  # -> {tool, args}
        action = self.actions.get_action(invocation.get("tool"))
        return action, invocation

    def should_terminate(self, response):
        action_def, _ = self.get_action(response)
        return bool(action_def and action_def.terminal)

    def update_memory(self, memory, response, result):
        memory.add_memory({"role": "assistant", "content": response})
        memory.add_memory({"role": "tool", "content": result})

    def run(self, user_input: str, memory: Optional[Memory]=None, max_iterations: int=5, verbose: bool=True) -> Memory:
        memory = memory or Memory()
        memory.add_memory({"role": "user", "content": user_input})

        for _ in range(max_iterations):
            prompt = self.construct_prompt(self.goals, memory, self.actions)
            if verbose:
                print(f"Prompt → {prompt}")

            response = self.prompt_llm_for_action(prompt)
            if verbose:
                print(f"Decision ← {response}")

            action, invocation = self.get_action(response)
            if not action:
                err = {"tool_executed": False, "error": f"Unknown action: {invocation.get('tool')}"}
                memory.add_memory({"role": "tool", "content": err})
                if verbose:
                    print(f"Error ← {err}")
                # Keep looping so the model can choose a registered tool next turn
                continue

            # Optional: validate args before executing
            ok, msg = self.actions.validate_args(action, invocation.get("args", {}))
            if not ok:
                err = {"tool_executed": False, "error": f"Invalid args: {msg}", "hint": action.parameters}
                memory.add_memory({"role": "tool", "content": err})
                if verbose:
                    print(f"Arg Error ← {err}")
                continue

            result = self.environment.execute_action(action, invocation.get("args", {}))
            if verbose:
                print(f"Result ← {result}")

            # Record the decision and then its result, in that order, so the
            # model sees what it chose and what happened (including errors)
            self.update_memory(memory, response, result)

            # If the tool failed, let the model recover on the next turn
            if not result.get("tool_executed", False):
                continue

            if self.should_terminate(response):
                if verbose:
                    print("Terminate signal: stopping loop.")
                break

        return memory

# ---------------- Example wiring ----------------
# Toy action and mock LLM for quick sanity check

def list_txt_files():
    return ["000.txt", "001.txt"]

registry = ActionRegistry()
registry.register(Action(
    name="list_txt_files",
    fn=list_txt_files,
    description="Return .txt file names",
    parameters={"type":"object","properties":{},"required":[]},
))

language = AgentLanguage()

def mock_llm(prompt_dict):
    # Always choose our single tool
    return {"tool": "list_txt_files", "args": {}}

env = Environment()
goals = [Goal(1, "list", "List available text files.")]

# Create agent and run one pass
if __name__ == "__main__":
    agent = Agent(goals, language, registry, mock_llm, env)
    _ = agent.run("Show me the files", verbose=True)
