<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/084_Capability_Architectural_Pattern.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, focus on **three core ideas** that tie everything together:

---

### 1. **The Capability Pattern Is the Big Picture**

* This is the *architectural* layer for extending your agent without touching its core `Agent` class.
* Capabilities are **modular behaviors** you can “snap in” like LEGO bricks.
* They hook into multiple points in the **agent loop lifecycle** — from initialization to shutdown — without changing the loop itself.

💡 **Why it matters:**
This is the bridge between *just adding tools* (which only do one-off actions) and *changing how your agent thinks, acts, and reacts* at a structural level.

---

### 2. **Lifecycle Hooks = Full Control of the Agent’s Behavior**

The Capability class exposes a clear set of “hooks” you can override:

* `init()` → setup state when agent starts
* `process_prompt()` → alter the prompt before sending to LLM
* `process_response()` → modify/validate the LLM’s reply
* `process_action()` → change the parsed action
* `process_result()` → modify what happens after an action runs
* `process_new_memories()` → adjust what’s stored in memory
* `end_agent_loop()` → reflect or log after each loop iteration
* `should_terminate()` + `terminate()` → clean shutdown control

💡 **Why it matters:**
You now have *every phase of the agent’s thinking and doing cycle* as a customizable extension point.
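As a rough sketch of what such a base class could look like, the block below defines every hook as a pass-through no-op, so a subclass only overrides the phases it cares about. The constructor arguments (`name`, `description`) and most hook signatures follow the code shown later in this notebook; the signatures of `process_new_memories`, `should_terminate`, and `terminate` are not shown in the source and are assumptions, as is the `ShoutingCapability` example.

```python
from typing import Any, List


class Capability:
    """Hypothetical base class: each hook passes its input through unchanged."""

    def __init__(self, name: str, description: str):
        self.name = name
        self.description = description

    def init(self, agent, action_context) -> None:
        """Called once before the loop starts."""

    def start_agent_loop(self, agent, action_context) -> bool:
        """Called before each iteration; True is assumed to mean 'go ahead'."""
        return True

    def process_prompt(self, agent, action_context, prompt):
        return prompt

    def process_response(self, agent, action_context, response: str) -> str:
        return response

    def process_action(self, agent, action_context, action: dict) -> dict:
        return action

    def process_result(self, agent, action_context, response: str,
                       action_def, action: dict, result: Any) -> Any:
        return result

    def process_new_memories(self, agent, action_context,
                             memories: List[dict]) -> List[dict]:
        # Signature assumed; the notebook does not show it.
        return memories

    def end_agent_loop(self, agent, action_context) -> None:
        """Called at the end of each iteration."""

    def should_terminate(self, agent, action_context, response: str) -> bool:
        # Signature assumed; the notebook does not show it.
        return False

    def terminate(self, agent, action_context) -> None:
        """Called once for final cleanup."""


class ShoutingCapability(Capability):
    """Toy subclass (hypothetical): overrides a single hook."""

    def __init__(self):
        super().__init__(name="Shouting",
                         description="Upper-cases every LLM reply")

    def process_response(self, agent, action_context, response: str) -> str:
        return response.upper()
```

Because every default returns its input, a capability that overrides one hook stays invisible in all the other phases.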

---

### 3. **Example: Time Awareness**

The `TimeAwareCapability` shows how a capability:

* Stores baseline data in memory at startup (`init`)
* Keeps that data fresh every prompt (`process_prompt`)
* Influences downstream decisions without altering the agent core

The **EnhancedTimeAwareCapability** then builds on it, adding:

* Timestamps when actions execute (`process_action`)
* Duration tracking for performance insight (`process_result`)

💡 **Why it matters:**
It’s a pattern you can use for *anything* — logging, monitoring, enforcing safety rules, tracking plan adherence — all without touching the core agent logic.

---

If I were you, I’d approach this lecture in three steps:

1. **Memorize the lifecycle hooks** (they’re your Swiss Army knife for agent customization).
2. **Understand the difference between tools vs. capabilities**:

   * Tools = *What* the agent can do.
   * Capabilities = *How* the agent thinks and behaves during its loop.
3. **Practice building one “tiny” capability** so you can see the full loop in action.






## 🚀 Extending the Agent Loop with **Capabilities**

While **tools** give our agent *specific actions* it can perform, sometimes we need to influence **how the agent itself thinks and behaves** throughout its execution loop.

That’s where the **Capability Pattern** comes in — a design approach that lets us **extend the agent’s core behavior** in fundamental ways, without cluttering or rewriting the main loop logic.

---

### 💡 The Core Idea

The Capability pattern:

* Encapsulates **specific adaptations** of the agent loop inside a class.
* Lets you **plug in** these classes to modify the agent’s behavior.
* Works **without touching** the main `Agent` class code.
* Has a **lifecycle** that runs from just before the agent starts until just after it stops.

You can use a Capability to:

* Open database connections
* Log every prompt sent to the LLM
* Inject metadata into the agent’s responses
* Apply safety checks or business rules
* …and much more

---

### 🧩 Composability

By stacking multiple capabilities together, you can **compose highly specialized agents** from the same core — just like adding layers of middleware.

---

### ⏳ Example: Time Awareness

Let’s start with something simple but *surprisingly powerful*:

> Making our agent **aware of the current time**.

An agent that understands time can:

* Schedule meetings sensibly
* Respect deadlines
* Handle time-sensitive tasks more intelligently

---

## 🛠 The Capability Pattern in Action

A Capability can hook into **multiple points** in the agent loop lifecycle.
Looking at our `Agent` class, we’ll soon see exactly **where** and **how** these interaction points occur.




```python
from functools import reduce

def run(self, user_input: str, memory=None, action_context_props=None):

    # ... existing code ...

    # Initialize capabilities
    for capability in self.capabilities:
        capability.init(self, action_context)

    while True:
        # Start of loop capabilities
        can_start_loop = reduce(lambda a, c: c.start_agent_loop(self, action_context),
                                self.capabilities, False)

        # ... existing code ...

        # Construct prompt with capability modifications
        prompt = reduce(lambda p, c: c.process_prompt(self, action_context, p),
                        self.capabilities, base_prompt)

        # ... existing code ...

        # Process response with capabilities
        response = reduce(lambda r, c: c.process_response(self, action_context, r),
                          self.capabilities, response)

        # ... existing code ...

        # Process action with capabilities
        action = reduce(lambda a, c: c.process_action(self, action_context, a),
                        self.capabilities, action)

        # ... existing code ...

        # Process result with capabilities
        result = reduce(lambda r, c: c.process_result(self, action_context, response,
                                                      action_def, action, r),
                        self.capabilities, result)

        # ... existing code ...

        # End of loop capabilities
        for capability in self.capabilities:
            capability.end_agent_loop(self, action_context)
```



### What’s new

* **Lifecycle hooks wired in:** The agent now calls a set of capability methods at key points: `init → start_agent_loop → process_prompt → process_response → process_action → process_result → end_agent_loop`. This is the middleware-style extension point you’ve been building toward.
* **Functional chaining with `reduce`:** Each phase uses `functools.reduce` to **pipe** a value through all capabilities in order.

  * `base_prompt -> process_prompt(...) -> ...`
  * `response -> process_response(...) -> ...`
  * `action -> process_action(...) -> ...`
  * `result -> process_result(...) -> ...`
    This pattern means each capability can modify (or observe) the object and pass it along.
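To make the piping concrete, here is a minimal sketch showing that `reduce` over a list of stages is equivalent to a plain loop that feeds each stage the previous stage's output. The two stage functions and the prompt text are stand-ins invented for illustration; real capabilities would receive `agent` and `action_context` as well.

```python
from functools import reduce


def add_time(prompt: str) -> str:
    # Stand-in for a capability that prepends time information
    return "Current time: 09:00\n" + prompt


def add_footer(prompt: str) -> str:
    # Stand-in for a capability that appends an instruction
    return prompt + "\nAnswer in JSON."


stages = [add_time, add_footer]
base_prompt = "Schedule a meeting."

# reduce threads the value through every stage, left to right
piped = reduce(lambda value, stage: stage(value), stages, base_prompt)

# The same thing written as an explicit loop
looped = base_prompt
for stage in stages:
    looped = stage(looped)

assert piped == looped
print(piped)
```

With an empty list of stages, `reduce` simply returns the seed, which is why the seed has to be a valid value for that phase.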

### What to focus on

1. **Data flow (and types) through each phase**

   * Make sure each reducer’s **seed** is correct (e.g., `base_prompt` for the prompt phase, the raw `response` for response phase, etc.).
   * Ensure each capability returns the **same kind of object** it received (Prompt→Prompt, str→str, dict→dict). One bad return type will break the chain.

2. **Order of capabilities**

   * The order in `self.capabilities` is significant. E.g., a `TimeAwareCapability` that **prepends** time to the system message should likely run **before** a `LoggingCapability` that records the final prompt.

3. **Side effects vs pure transforms**

   * Prefer **pure functions** in `process_*` hooks (change the value, not global state).
   * Use `init` and `end_agent_loop` for side effects (e.g., logging, metrics, progress notes) so you don’t surprise other capabilities.

4. **Idempotency & safety**

   * If the loop retries, your capability methods shouldn’t duplicate content (e.g., don’t prepend “Current time:” twice). Add guards or markers.

5. **Performance**

   * Every phase now calls **all** capabilities. Expensive work (extra LLM calls, I/O) should be limited to the right hooks (e.g., progress reflection at `end_agent_loop` and maybe not every iteration).

6. **Error handling**

   * Consider wrapping each reducer segment with try/except so one misbehaving capability doesn’t crash the agent.
   * Optionally give capabilities a way to signal “skip” or “fail-soft.”

7. **Start-of-loop flag**

   * `can_start_loop` is computed but not used here. Decide what it means:

     * If any capability returns `False`, should you `continue` or `break`?
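     * Or should it be an **AND**/**OR** across capabilities? (Right now the lambda ignores its accumulator `a`, so the result is simply the last capability's return value, and an empty capability list yields the seed `False`. Think through the intended logic.)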

8. **Consistency of `action_context`**

   * You’re passing the same `action_context` into each hook—perfect. Make sure anything capabilities read from it (e.g., timezone, auth, config) is stable or versioned per iteration.

9. **Extensibility expectations**

   * This wiring assumes capabilities won’t conflict (e.g., two capabilities both rewriting the same field in incompatible ways). Document “who owns what” or set a priority scheme if needed.
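Points 1 and 6 above (type stability and error handling) can be combined in one small helper. The sketch below is one possible "fail-soft" replacement for a `reduce` segment: a capability that raises, or that returns a different type than it received, is skipped and the previous value is kept. The helper name `safe_pipe` and the three stand-in capability classes are hypothetical, not part of the notebook's agent.

```python
import logging
from typing import Any, Callable, Iterable

logger = logging.getLogger("capabilities")


def safe_pipe(value: Any, capabilities: Iterable[Any],
              call: Callable[[Any, Any], Any]) -> Any:
    """Pipe value through capabilities, skipping any that misbehave."""
    for capability in capabilities:
        name = type(capability).__name__
        try:
            candidate = call(capability, value)
        except Exception:
            logger.exception("%s raised; keeping previous value", name)
            continue
        if not isinstance(candidate, type(value)):
            logger.warning("%s returned %s; keeping previous value",
                           name, type(candidate).__name__)
            continue
        value = candidate
    return value


# Stand-in capabilities for demonstration only
class Upper:
    def process_response(self, agent, ctx, response):
        return response.upper()


class Broken:
    def process_response(self, agent, ctx, response):
        raise RuntimeError("boom")


class WrongType:
    def process_response(self, agent, ctx, response):
        return None  # breaks the str -> str contract


cleaned = safe_pipe("ok", [Broken(), WrongType(), Upper()],
                    lambda c, r: c.process_response(None, None, r))
print(cleaned)
```

Whether silently skipping a capability is acceptable depends on what it does; a safety check, for example, should probably fail closed rather than be skipped.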

### Small nits / suggestions

* **`start_agent_loop` reducer:** If you intend gatekeeping, consider:

  ```python
  can_start = all(c.start_agent_loop(self, action_context) for c in self.capabilities)
  if not can_start:
      continue  # or break
  ```
* **Type hints & immutability:** If `Prompt` is mutable, a capability could mutate in place and also return it—be consistent. Immutable patterns reduce surprises.
* **Telemetry:** Add a lightweight `TracingCapability` early to log each phase’s input/output once while you develop other capabilities.
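A tracing capability along these lines could be as small as the sketch below. It records each value it sees and passes it through unchanged. To stay self-contained it does not subclass the notebook's `Capability` base class, and the `events` list is an assumption made for easy inspection; a real version would probably use the `logging` module.

```python
class TracingCapability:
    """Records (phase, repr(value)) pairs and returns every value unchanged."""

    def __init__(self):
        self.events = []

    def _trace(self, phase, value):
        self.events.append((phase, repr(value)))
        return value

    def process_prompt(self, agent, action_context, prompt):
        return self._trace("prompt", prompt)

    def process_response(self, agent, action_context, response):
        return self._trace("response", response)

    def process_action(self, agent, action_context, action):
        return self._trace("action", action)

    def process_result(self, agent, action_context, response,
                       action_def, action, result):
        return self._trace("result", result)


tracer = TracingCapability()
tracer.process_prompt(None, None, "hi")
tracer.process_action(None, None, {"tool": "search"})
print(tracer.events)
```

Placed first in the capability list it sees each phase's input before other capabilities change it; placed last it sees the final value.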





## 🔄 How Capability Hooks Map to the Agent’s Execution Cycle

The **Capability pattern** hooks into each stage of the agent’s lifecycle, giving you precise control over behavior without touching the agent’s core loop.

---

### 1️⃣ **Initialization Phase**

`init()` → Runs **once** when the agent starts.

* Set up **initial state** or preload memory with important context.
* Example: In `TimeAwareCapability`, this is where we **first tell the agent the current time**.

---

### 2️⃣ **Loop Start Phase**

`start_agent_loop()` → Runs **before each iteration** of the loop.

* Check conditions or prep for the next iteration.
* Example: Skip running until **enough time has passed** since the last loop.
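The "enough time has passed" example could be sketched as follows. This assumes that `start_agent_loop` returns a boolean and that the agent actually acts on it, which the `run` method shown earlier does not yet do. The class name, the `min_interval_seconds` parameter, and the injectable `clock` are all hypothetical.

```python
import time


class ThrottleCapability:
    """Allows a new iteration only if enough time has passed since the last one."""

    def __init__(self, min_interval_seconds: float, clock=time.monotonic):
        self.min_interval_seconds = min_interval_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._last_start = None      # time of the last allowed iteration

    def start_agent_loop(self, agent, action_context) -> bool:
        now = self._clock()
        too_soon = (self._last_start is not None
                    and now - self._last_start < self.min_interval_seconds)
        if too_soon:
            return False
        self._last_start = now
        return True


# Demonstration with a fake clock that returns 0.0, 1.0, then 5.0
ticks = iter([0.0, 1.0, 5.0])
throttle = ThrottleCapability(2.0, clock=lambda: next(ticks))
decisions = [throttle.start_agent_loop(None, None) for _ in range(3)]
print(decisions)
```

`time.monotonic` is used as the default clock because it cannot jump backwards when the system time is adjusted.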

---

### 3️⃣ **Prompt Construction Phase**

`process_prompt()` → Runs **just before** sending a prompt to the LLM.

* Modify or enrich the outgoing prompt.
* Example: Add **current time** info to every prompt.

---

### 4️⃣ **Response Processing Phase**

`process_response()` → Runs **after** the LLM responds but **before parsing**.

* Validate or clean the raw text.
* Example: Strip extraneous formatting, check for missing fields.
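As an illustration of stripping extraneous formatting, the sketch below removes a Markdown code fence that an LLM has wrapped around its reply, so that the parser downstream sees only the payload. The class name is hypothetical, and the assumption that replies sometimes arrive fenced is mine rather than the notebook's.

```python
FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


class StripFencesCapability:
    """Removes a surrounding code fence (and optional language tag) from a reply."""

    def process_response(self, agent, action_context, response: str) -> str:
        text = response.strip()
        fenced = (len(text) >= 2 * len(FENCE)
                  and text.startswith(FENCE) and text.endswith(FENCE))
        if not fenced:
            return text
        body = text[len(FENCE):-len(FENCE)]
        # Drop the opening line if it is empty or just a language tag like "json"
        first_line, newline, rest = body.partition("\n")
        if newline and (not first_line.strip() or first_line.strip().isalnum()):
            body = rest
        return body.strip()


raw_reply = FENCE + 'json\n{"tool": "search"}\n' + FENCE
cleaned_reply = StripFencesCapability().process_response(None, None, raw_reply)
print(cleaned_reply)
```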

---

### 5️⃣ **Action Processing Phase**

`process_action()` → Runs **after parsing** into an action, **before execution**.

* Inject metadata or validate arguments.
* Example: Attach **execution timestamps** or ensure all required parameters are present.
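A parameter check in `process_action` might look like the sketch below. The action shape (a dict with `"tool"` and `"args"` keys), the `validation_error` field, and the class name are assumptions for illustration; the notebook only tells us that an action is a dict.

```python
class RequiredArgsCapability:
    """Flags actions that are missing required arguments."""

    def __init__(self, required: dict):
        # Maps a tool name to the argument names it requires
        self.required = required

    def process_action(self, agent, action_context, action: dict) -> dict:
        needed = self.required.get(action.get("tool"), [])
        supplied = action.get("args", {})
        missing = [name for name in needed if name not in supplied]
        if not missing:
            return action
        # Return a copy so the original action dict is left untouched
        return {**action,
                "validation_error": "missing arguments: " + ", ".join(missing)}


checker = RequiredArgsCapability({"schedule_meeting": ["start", "attendees"]})
incomplete = {"tool": "schedule_meeting", "args": {"start": "09:00"}}
checked = checker.process_action(None, None, incomplete)
print(checked["validation_error"])
```

How the agent reacts to `validation_error` (retry, ask the user, abort) is left open here; the capability only annotates the action.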

---

### 6️⃣ **Result Processing Phase**

`process_result()` → Runs **after the action executes**.

* Enrich, format, or analyze results.
* Example: Append **additional context** or transform into a structured format.

---

### 7️⃣ **Memory Update Phase**

`process_new_memories()` → Runs when **adding new memories**.

* Modify, enrich, or filter what gets stored.
* Example: Filter out noisy logs or tag memories with categories.
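The filter-and-tag idea could be sketched like this. The signature of `process_new_memories` is not shown in this notebook, so the one used here (a list of memory dicts in, a list out) is an assumption, as are the `"debug"` memory type and the `category` tag. The `"type"` and `"content"` keys match the memory dict used by `TimeAwareCapability`.

```python
from typing import List


class MemoryFilterCapability:
    """Drops noisy memories and tags the rest with a coarse category."""

    def __init__(self, noisy_types=("debug",)):
        self.noisy_types = set(noisy_types)

    def process_new_memories(self, agent, action_context,
                             memories: List[dict]) -> List[dict]:
        kept = []
        for memory in memories:
            if memory.get("type") in self.noisy_types:
                continue  # filter out noise
            content = str(memory.get("content", ""))
            category = "error" if "error" in content.lower() else "general"
            kept.append({**memory, "category": category})  # tag a copy
        return kept


incoming = [
    {"type": "debug", "content": "cache hit"},
    {"type": "environment", "content": "Error: calendar API timed out"},
    {"type": "assistant", "content": "I will retry in a minute."},
]
stored = MemoryFilterCapability().process_new_memories(None, None, incoming)
print([m["category"] for m in stored])
```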

---

### 8️⃣ **Loop End Phase**

`end_agent_loop()` → Runs **at the end** of each iteration.

* Perform cleanup or logging.
* Example: Save progress reports or **trigger adaptive strategy changes**.

---

### 9️⃣ **Termination Phase**

* `should_terminate()` → Decide if the agent should **stop running**.
* `terminate()` → Perform **final cleanup** before shutdown.
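The notebook does not show how the agent calls these two hooks, so the following is only a sketch under assumed signatures: `should_terminate` receives the latest response and returns a boolean, and `terminate` runs once after the loop ends. The stop phrase, the class name, and the `run_until_done` driver are hypothetical.

```python
class StopPhraseCapability:
    """Asks the agent to stop once the LLM emits an agreed stop phrase."""

    def __init__(self, stop_phrase: str = "TASK COMPLETE"):
        self.stop_phrase = stop_phrase
        self.closed = False

    def should_terminate(self, agent, action_context, response: str) -> bool:
        return self.stop_phrase in response

    def terminate(self, agent, action_context) -> None:
        # Final cleanup would go here (close connections, flush logs, ...)
        self.closed = True


def run_until_done(responses, capabilities) -> int:
    """Toy driver: stops when any capability votes to terminate."""
    steps = 0
    for response in responses:
        steps += 1
        if any(c.should_terminate(None, None, response) for c in capabilities):
            break
    for c in capabilities:
        c.terminate(None, None)
    return steps


stopper = StopPhraseCapability()
steps_taken = run_until_done(["thinking", "TASK COMPLETE", "never reached"],
                             [stopper])
print(steps_taken, stopper.closed)
```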

---

💡 **Key Insight:**
Each hook gets both:

* **`agent`** → the agent instance
* **`action_context`** → full access to dependencies, memory, tools, and more

This pattern turns your agent loop into a **customizable pipeline**, where you can snap in new behavior like LEGO bricks — and the `reduce()` calls ensure each capability is applied **in sequence**.



```python
from typing import Callable, List, Optional

class Agent:
    def __init__(self,
                 goals: List[Goal],
                 agent_language: AgentLanguage,
                 action_registry: ActionRegistry,
                 generate_response: Callable[[Prompt], str],
                 environment: Environment,
                 capabilities: Optional[List[Capability]] = None,
                 max_iterations: int = 10,
                 max_duration_seconds: int = 180):
        """
        Initialize an agent with its core GAME components and capabilities.

        Goals, Actions, Memory, and Environment (GAME) form the core of the agent,
        while capabilities provide ways to extend and modify the agent's behavior.

        Args:
            goals: What the agent aims to achieve
            agent_language: How the agent formats and parses LLM interactions
            action_registry: Available tools the agent can use
            generate_response: Function to call the LLM
            environment: Manages tool execution and results
            capabilities: List of capabilities that extend agent behavior
            max_iterations: Maximum number of action loops
            max_duration_seconds: Maximum runtime in seconds
        """
        self.goals = goals
        self.generate_response = generate_response
        self.agent_language = agent_language
        self.actions = action_registry
        self.environment = environment
        # Default is None rather than [] so agents never share one mutable list
        self.capabilities = capabilities or []
        self.max_iterations = max_iterations
        self.max_duration_seconds = max_duration_seconds
```

This design lets us compose an agent with exactly the capabilities it needs. For example, we might create an agent that’s both time-aware and able to log its actions:

```python
agent = Agent(
    goals=[
        Goal(name="scheduling",
             description="Schedule meetings considering current time and availability")
    ],
    agent_language=JSONAgentLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=PythonEnvironment(),
    capabilities=[
        TimeAwareCapability(),
        LoggingCapability(log_level="INFO"),
        MetricsCapability(metrics_server="prometheus:9090")
    ]
)
```



## 🔄 How Capabilities Work Together

Each capability in the list gets a **chance to participate in every phase** of the agent’s execution.

* Within each phase, the capabilities are applied **in list order** using Python’s `functools.reduce()`.

💡 **Example:**

* `TimeAwareCapability` adds **time information** to a prompt.
* Then, `LoggingCapability` logs that **time-enhanced prompt** before it goes to the LLM.
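The effect of ordering can be checked with two stand-in capabilities. In this sketch, `AddTime` and `RecordPrompt` are simplified, hypothetical versions of `TimeAwareCapability` and `LoggingCapability` that work on plain strings instead of `Prompt` objects.

```python
from functools import reduce


class AddTime:
    def process_prompt(self, agent, action_context, prompt: str) -> str:
        return "Current time: 09:00\n" + prompt


class RecordPrompt:
    def __init__(self):
        self.seen = []

    def process_prompt(self, agent, action_context, prompt: str) -> str:
        self.seen.append(prompt)  # record what arrives, pass it on unchanged
        return prompt


def build_prompt(capabilities, base_prompt: str) -> str:
    return reduce(lambda p, c: c.process_prompt(None, None, p),
                  capabilities, base_prompt)


# Recorder placed after the time capability: it sees the enriched prompt
recorder_after = RecordPrompt()
build_prompt([AddTime(), recorder_after], "Schedule a meeting.")

# Recorder placed before it: it sees only the base prompt
recorder_before = RecordPrompt()
build_prompt([recorder_before, AddTime()], "Schedule a meeting.")

print(recorder_after.seen)
print(recorder_before.seen)
```

Both orderings send the same final prompt to the LLM; only what the recorder observes differs.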

---

## 🧱 Why This Works

This architecture enables **complex behaviors** by **composing simple, focused capabilities**, each responsible for **one aspect** of the agent’s behavior.

It’s similar to **middleware in web frameworks**:

* Each piece can **modify** the request/response cycle
* The **core application** doesn’t need to know about these modifications.

---

## ⏰ Implementing Time Awareness

The `TimeAwareCapability` is responsible for:

* Informing the agent about the **current time**
* Ensuring this information **persists** throughout its decision-making process




```python
from datetime import datetime
from zoneinfo import ZoneInfo

class TimeAwareCapability(Capability):
    def __init__(self):
        super().__init__(
            name="Time Awareness",
            description="Allows the agent to be aware of time"
        )

    def init(self, agent, action_context: ActionContext) -> None:
        """Set up time awareness at the start of agent execution."""
        # Get timezone from context or use default
        time_zone_name = action_context.get("time_zone", "America/Chicago")
        timezone = ZoneInfo(time_zone_name)

        # Get current time in specified timezone
        current_time = datetime.now(timezone)

        # Format time in both machine and human-readable formats
        iso_time = current_time.strftime("%Y-%m-%dT%H:%M:%S%z")
        human_time = current_time.strftime("%H:%M %A, %B %d, %Y")

        # Store time information in memory
        memory = action_context.get_memory()
        memory.add_memory({
            "type": "system",
            "content": (f"Right now, it is {human_time} (ISO: {iso_time}).\n"
                        f"You are in the {time_zone_name} timezone.\n"
                        "Please consider the day/time, if relevant, when responding.")
        })

    def process_prompt(self, agent, action_context: ActionContext,
                       prompt: Prompt) -> Prompt:
        """Update time information in each prompt."""
        time_zone_name = action_context.get("time_zone", "America/Chicago")
        current_time = datetime.now(ZoneInfo(time_zone_name))

        # Add current time to system message
        system_msg = (f"Current time: "
                      f"{current_time.strftime('%H:%M %A, %B %d, %Y')} "
                      f"({time_zone_name})\n\n")

        # Copy the messages so the incoming prompt is not mutated in place
        messages = [dict(m) for m in prompt.messages]

        # Add to existing system message or create new one
        if messages and messages[0]["role"] == "system":
            messages[0]["content"] = system_msg + messages[0]["content"]
        else:
            messages.insert(0, {
                "role": "system",
                "content": system_msg
            })

        return Prompt(messages=messages)
```

```python
# Now we can use this capability when creating our agent:

agent = Agent(
    goals=[Goal(name="task", description="Complete the assigned task")],
    agent_language=JSONAgentLanguage(),
    action_registry=registry,
    generate_response=llm.generate,
    environment=PythonEnvironment(),
    capabilities=[
        TimeAwareCapability()
    ]
)

# Our agent now consistently knows the current time, enabling it to make time-aware decisions.
# For example, if we ask it to schedule a meeting, it might respond:

agent.run("Schedule a team meeting for today")

# Agent response might include:
# "Since it's already 5:30 PM on Friday, I recommend scheduling the meeting
# for Monday morning instead. Would you like me to look for available times on Monday?"
```



## ⏱ How Time Awareness Changes Agent Behavior

The `TimeAwareCapability` modifies agent behavior in two main phases:

1. **`init()`**

   * Runs **once** when the agent starts.
   * Establishes **baseline time awareness** by adding the current time to memory.

2. **`process_prompt()`**

   * Runs **before each prompt** is sent to the LLM.
   * Updates the current time, ensuring the agent always has **fresh time data** for decision-making.
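Because `process_prompt` runs on every iteration, it is worth making sure the time header is replaced rather than stacked if the same messages pass through it twice. The helper below is one possible guard, written as a standalone function over plain message dicts. The function name and the marker convention (header ends at the first blank line) are assumptions for illustration.

```python
MARKER = "Current time: "


def with_time_header(messages, time_text: str):
    """Return a copy of messages whose system message has exactly one time header."""
    header = f"{MARKER}{time_text}\n\n"
    copied = [dict(m) for m in messages]  # never mutate the caller's messages
    if copied and copied[0]["role"] == "system":
        content = copied[0]["content"]
        if content.startswith(MARKER):
            # Drop the stale header: everything up to the first blank line
            _, _, content = content.partition("\n\n")
        copied[0]["content"] = header + content
    else:
        copied.insert(0, {"role": "system", "content": header})
    return copied


original = [{"role": "system", "content": "You are helpful."}]
once = with_time_header(original, "09:00")
twice = with_time_header(once, "09:05")
print(twice[0]["content"])
```

Applying the helper twice leaves a single, up-to-date header, and the original message list is left unchanged.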

---

### 🌊 Ripple Effect on Behavior

* These modifications **flow through** the agent’s decision-making process.
* The **core `Agent` class stays untouched** — all the customization is handled by the **capability pattern**.

---

## 🚀 Extending Time Awareness

We can enhance this capability to handle **richer time-related features**.

```python
from typing import Any

class EnhancedTimeAwareCapability(TimeAwareCapability):
    def process_action(self, agent, action_context: ActionContext, action: dict) -> dict:
        """Stamp the action with the time at which it is about to execute."""
        action["execution_time"] = datetime.now(
            ZoneInfo(action_context.get("time_zone", "America/Chicago"))
        ).isoformat()
        return action

    def process_result(self, agent, action_context: ActionContext,
                       response: str, action_def: Action,
                       action: dict, result: Any) -> Any:
        """Add duration information to dict results."""
        if isinstance(result, dict) and "execution_time" in action:
            started = datetime.fromisoformat(action["execution_time"])
            # Reuse the start time's tzinfo so both datetimes are timezone-aware
            finished = datetime.now(started.tzinfo)
            result["action_duration"] = (finished - started).total_seconds()
        return result
```

---

### 📊 Benefits of the Enhanced Version

* **Tracks execution time** for each action.
* **Calculates duration** for more accurate performance metrics.
* Builds a **richer temporal awareness** into the agent’s operations.

