# Module 8 — Agents & Automation

**Controlled Decision-Making with LLMs**

---

## What This Module Covers

| Group | Topic | Key Skill |
|-------|-------|-----------|
| 1 | Agent Fundamentals | Understand what makes a system an agent vs a pipeline |
| 2 | Building & Running the Triage Agent | Implement a complete bounded agent with real tools |
| 3 | Control & Guardrails | Apply bounded loops and policy enforcement |
| 4 | Failure Modes & Auditability | Handle failures and build audit trails |
| 5 | Automation vs Autonomy | Know when agents are appropriate and when they are not |

---

## Learning Objectives

By the end of this module, you will be able to:

1. **Explain** the difference between a fixed pipeline and an LLM-driven agent
2. **Implement** the propose-validate-execute pattern for safe agent design
3. **Define** explicit allowed-action sets and enforce them in code
4. **Build** a triage agent that classifies, retrieves, or refuses based on input
5. **Apply** guardrails including bounded loops, input validation, and audit logging
6. **Evaluate** when agents are appropriate vs when a simple pipeline is safer

---

## Prerequisites

This module builds directly on:

| Module | Concepts Used Here |
|--------|-------------------|
| Module 3 | LLM behavior, hallucination risk, structured output |
| Module 6 | LLM API clients, JSON parsing, retry logic |
| Module 7 | RAG retrieval pattern, grounded answers, refusal as a feature |

**Module 8 is where retrieval meets decision-making.**

---

## Setup

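No API key or live LLM service is needed for this module. The only dependency is the `mock_toolkit` package, which the setup cell installs if it is not already available. All LLM calls are mocked using keyword-based logic defined in **`mock_toolkit.py`**, so the notebook runs deterministically in any environment.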

In [None]:
import json
import os

# Install mock_toolkit package if not available locally (e.g. running in Colab)
try:
    import mock_toolkit
except ImportError:
    !pip install -q git+https://github.com/jonbowden/notebook-repository.git#subdirectory=packages/codevision-mock-toolkit

from mock_toolkit import (
    MockLLMClient, ALLOWED_ACTIONS, REFUSAL_TEXT, KNOWLEDGE_BASE,
    classify_only, retrieve_and_answer, refuse, decide_action,
)

llm_client = MockLLMClient()

print("Setup complete.")
print("All LLM calls in this module use MockLLMClient from mock_toolkit.")
print("No API key or network connection is required.")

---

# Group 1: Agent Fundamentals

**What makes a system an agent vs a pipeline**

| Section | Topic |
|---------|-------|
| 8.1 | Pipelines vs Agents |
| 8.2 | The Agent Pattern |

---

## 8.1 Pipelines vs Agents

Most LLM systems are **pipelines**: a fixed sequence of steps executed in order. The code determines what happens and when. The LLM only fills in content at predetermined slots.

An **agent** is different. The LLM itself chooses which action to take next. The code provides a menu of allowed actions; the LLM picks from that menu based on context.

### Comparison

| Property | Pipeline | Agent |
|----------|----------|-------|
| Control flow | Fixed steps, hard-coded order | LLM chooses the next action |
| Determinism | Same input always runs same steps | Same input may choose different actions |
| Decision-making | None — code decides everything | LLM decides which tool to use |
| Predictability | High | Lower (bounded by allowed actions) |
| Use case | ETL, formatting, summarisation | Triage, routing, multi-step reasoning |

### The Agent Loop

```
+-------------------+
|   User Input      |
+--------+----------+
         |
         v
+--------+----------+
|  LLM Proposes     |  <--- LLM reads input and picks an action
|  Action           |
+--------+----------+
         |
         v
+--------+----------+
|  Code Validates   |  <--- Code checks action is in ALLOWED_ACTIONS
|  Action           |
+--------+----------+
         |
         v
+--------+----------+
|  System Executes  |  <--- Code runs the chosen tool
|  Tool             |
+--------+----------+
         |
         v
+-------------------+
|   Result / Stop   |
+-------------------+
```

> **Key Insight:** The LLM never executes code directly. It proposes an action label. The code decides whether to honour that proposal.

In [None]:
# Demonstrate the difference between a pipeline and an agent
# These use simplified tool functions (the full versions from mock_toolkit are reviewed in 8.3)

def demo_classify(text):
    """Classify topic using keyword matching."""
    text_lower = text.lower()
    if any(kw in text_lower for kw in ["rate", "inflation", "monetary"]):
        return "Interest Rates"
    if any(kw in text_lower for kw in ["loan", "mortgage", "lending"]):
        return "Lending & Credit"
    return "Out of Domain"

def demo_retrieve(text):
    """Retrieve a grounded answer from a small knowledge base."""
    text_lower = text.lower()
    if "rate" in text_lower or "inflation" in text_lower:
        return "The central bank raised rates by 25bp to combat inflation."
    if "loan" in text_lower or "lending" in text_lower:
        return "Lending standards tightened due to higher funding costs."
    return "No relevant documents found."


# --- PIPELINE: always runs all three steps, regardless of input ---
def pipeline(user_input):
    topic   = demo_classify(user_input)
    context = demo_retrieve(user_input)
    answer  = f"Based on retrieved context: {context}"

    return (f"[classify]   {topic}\n"
            f"          [retrieve]   {context}\n"
            f"          [generate]   {answer}")


# --- AGENT: picks only the step the input actually needs ---
def agent(user_input):
    text_lower = user_input.lower()
    if any(kw in text_lower for kw in ["rate", "inflation", "monetary"]):
        return f"[retrieve_and_answer]  {demo_retrieve(user_input)}"
    if any(kw in text_lower for kw in ["classify", "label", "topic"]):
        return f"[classify_only]  {demo_classify(user_input)}"
    return "[refuse]  I am not authorised to act on this request."


inputs = [
    "Why did the central bank raise rates?",     # needs retrieval
    "Classify this sentence about loans.",        # only needs classification
    "Delete all customer records.",               # should be refused
]

print("=== Pipeline: always runs classify → retrieve → generate ===")
for inp in inputs:
    print(f"  Input:  {inp}")
    print(f"  Output: {pipeline(inp)}")
    print()

print("=== Agent: picks the right action for each input ===")
for inp in inputs:
    print(f"  Input:  {inp}")
    print(f"  Output: {agent(inp)}")
    print()

print("Notice:")
print("  - 'Classify this sentence about loans' only needed a label,")
print("    but the pipeline also retrieved and generated a full answer.")
print("  - 'Delete all customer records' should be refused, but the")
print("    pipeline classified, searched, and generated anyway.")
print("  - The agent routes each input to only the step it needs.")

---

## 8.2 The Agent Pattern

Every safe agent follows the same three-step discipline:

```
LLM Proposes  -->  Code Validates  -->  System Executes
```

### Why This Separation Matters

| Step | Who Does It | Why |
|------|-------------|-----|
| **Propose** | LLM | LLMs are good at understanding intent and choosing labels |
| **Validate** | Code | Code enforces hard constraints the LLM cannot override |
| **Execute** | System | Side-effects (DB writes, API calls) only happen after validation |

This pattern ensures that:
- The LLM cannot call arbitrary functions
- Unexpected LLM output does not trigger unintended behaviour
- Every executed action was explicitly approved by the code

### Allowed Actions

An agent's power is defined by its **allowed action set**. Keeping this set small and explicit is the primary safety mechanism. If an agent can call any function, a confused LLM response could trigger irreversible operations, exfiltrate data, or cause cascading failures.

| Safe Actions | Unsafe Actions |
|-------------|----------------|
| Read-only retrieval | Delete records |
| Classify text | Send emails to customers |
| Refuse the request | Execute financial transactions |
| Summarise a document | Modify user permissions |
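
One way to keep the allowed set explicit is to derive it from a registry of tool functions, so an action can only run if a tool was deliberately registered for it. The sketch below is a minimal illustration under that assumption; `TOOL_REGISTRY`, `ALLOWED`, `run_action`, and the two stub tools are hypothetical names and are not part of `mock_toolkit`.

```python
# Hypothetical read-only tools; stand-ins for real implementations.
def classify_text(text: str) -> str:
    return "Interest Rates" if "rate" in text.lower() else "Out of Domain"

def refuse_request(text: str) -> str:
    return "I am not authorised to act on this request."

# The registry is the single source of truth: only registered names can run.
TOOL_REGISTRY = {
    "classify_only": classify_text,
    "refuse": refuse_request,
}
ALLOWED = frozenset(TOOL_REGISTRY)  # immutable set of allowed action names

def run_action(proposed_action: str, text: str) -> str:
    """Execute a proposed action only if it is registered; otherwise refuse."""
    if proposed_action not in ALLOWED:
        proposed_action = "refuse"
    return TOOL_REGISTRY[proposed_action](text)

print(run_action("classify_only", "Why did rates rise?"))
print(run_action("delete_records", "Delete all customer records."))
```

Because the allowed set is built from the registry, adding a new capability requires writing and registering a function. A label invented by the LLM has nothing to dispatch to.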

### Refusal Is a Feature

Refusal is not a failure — it is a **deliberate, safe action**. An agent that refuses clearly is more trustworthy than one that always attempts an answer.

| Without Refusal | With Refusal |
|----------------|--------------|
| Agent attempts unsafe actions | Agent declines clearly |
| Users may receive hallucinated output | Users receive honest limitation |
| No audit signal for off-topic requests | Refusals logged for review |

> **Key Insight:** The LLM is a decision engine, not an executor. Code is the gatekeeper. Every action must be explicitly allowed, and refusal is always one of those actions.

In [None]:
# Demonstrate the three-step pattern explicitly

# Note: this rebinds the ALLOWED_ACTIONS name imported from mock_toolkit in the Setup cell.
ALLOWED_ACTIONS = {"retrieve_and_answer", "classify_only", "refuse"}

def step1_propose(user_input):
    """LLM proposes an action (mocked as keyword matching)."""
    if "rate" in user_input.lower():
        return "retrieve_and_answer"
    if "classify" in user_input.lower():
        return "classify_only"
    # Simulate an LLM proposing an action outside the allowed set
    if "delete" in user_input.lower():
        return "delete_records"   # NOT in ALLOWED_ACTIONS
    return "refuse"

def step2_validate(proposed_action):
    """Code validates the proposed action against the allowed set."""
    if proposed_action in ALLOWED_ACTIONS:
        return proposed_action
    return "refuse"   # fallback: refuse anything not explicitly allowed

def step3_execute(validated_action, user_input):
    """System executes the validated action."""
    if validated_action == "retrieve_and_answer":
        return "Retrieved: The central bank raised rates to combat inflation."
    if validated_action == "classify_only":
        return "Classified: Topic = Interest Rates"
    return "Refused: I am not authorised to act on this request."

test_inputs = [
    "Why did the central bank raise rates?",
    "Classify this financial sentence.",
    "Delete all customer records.",
]

print("=== Three-Step Agent Pattern ===")
for inp in test_inputs:
    proposal  = step1_propose(inp)
    validated = step2_validate(proposal)
    result    = step3_execute(validated, inp)

    print(f"Input     : {inp}")
    print(f"Proposed  : {proposal}")
    print(f"Validated : {validated}")
    print(f"Result    : {result}")
    print()

---

# Group 2: Building & Running the Triage Agent

**Review the mock toolkit, assemble the agent, and run it**

| Section | Topic |
|---------|-------|
| 8.3 | The Mock Toolkit |
| 8.4 | Assembling the Triage Agent |
| 8.5 | Running the Agent |

---

## 8.3 The Mock Toolkit

The agent needs three components: a mock LLM client, three tools, and a decision function. These are defined in **`mock_toolkit.py`** — open that file to review the source code.

### Why Mock?

| Benefit | Explanation |
|---------|-------------|
| No API key needed | Notebook runs in any Colab environment |
| Deterministic output | Same input always produces same result — essential for grading |
| Teaches the interface | The mock has the same `.chat()` method a real client would have |
| Swappable | Replace `MockLLMClient` with a real client that exposes the same `.chat()` method and the agent code needs no changes |

### What's Inside `mock_toolkit.py`

| Component | Purpose |
|-----------|---------|
| `MockLLMClient` | Keyword-based mock with `.chat()` interface — swap for a real LLM client in production |
| `ALLOWED_ACTIONS` | The set of actions the agent is permitted to take |
| `classify_only(text)` | Returns a topic label or "Out of Domain" |
| `retrieve_and_answer(question)` | Returns a grounded answer from a small knowledge base |
| `refuse()` | Returns the standard refusal message |
| `decide_action(llm_client, input)` | Asks the LLM, parses JSON, validates against ALLOWED_ACTIONS |

> **Key Insight:** Build your tools once, test them independently, then wire them into the agent. The agent itself should contain no domain logic — it only routes.
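
The "swappable" property can be made explicit with a structural type. The sketch below is illustrative only: `ChatClient`, `KeywordClient`, `CannedClient`, and `raw_proposal` are hypothetical names, and the `{"action": ...}` JSON shape is an assumption rather than the documented output of `MockLLMClient`.

```python
from typing import Protocol

class ChatClient(Protocol):
    """Anything with a .chat(prompt) -> str method can drive the agent."""
    def chat(self, prompt: str) -> str: ...

class KeywordClient:
    """Keyword-based stand-in, similar in spirit to a mock client."""
    def chat(self, prompt: str) -> str:
        if "rate" in prompt.lower():
            return '{"action": "retrieve_and_answer"}'
        return '{"action": "refuse"}'

class CannedClient:
    """Always returns the same response; handy for unit tests."""
    def __init__(self, response: str) -> None:
        self.response = response

    def chat(self, prompt: str) -> str:
        return self.response

def raw_proposal(client: ChatClient, user_input: str) -> str:
    """Calling code depends only on the .chat() interface."""
    return client.chat(user_input)

for client in (KeywordClient(), CannedClient('{"action": "classify_only"}')):
    print(type(client).__name__, "->", raw_proposal(client, "Why did rates rise?"))
```

Neither client inherits from `ChatClient`; having a matching `.chat()` method is enough, which is what allows a real client to be dropped in later.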

In [None]:
# Quick verification of every component (imported in Setup cell above)

print("Toolkit loaded from mock_toolkit.py\n")

print("MockLLMClient:")
print(f'  .chat("rates")    → {llm_client.chat("rates")}')
print(f'  .chat("classify") → {llm_client.chat("classify")}')
print(f'  .chat("hello")    → {llm_client.chat("hello")}')
print()

print("Tools:")
print(f"  classify_only('Bank raised rates')    → {classify_only('Bank raised rates')}")
print(f"  classify_only('Football is great')    → {classify_only('Football is great')}")
print(f"  retrieve_and_answer('Why raise rates?') → {retrieve_and_answer('Why raise rates?')}")
print(f"  refuse() → {refuse()}")
print()

print("Decision function:")
for q in ["Why did rates rise?", "Classify this loan.", "Delete everything.", "Approve a loan."]:
    print(f"  decide_action('{q}') → {decide_action(llm_client, q)}")
print()

print(f"Allowed actions: {sorted(ALLOWED_ACTIONS)}")
print(f"Knowledge base entries: {len(KNOWLEDGE_BASE)}")
print("\nReview the source: open mock_toolkit.py")

---

## 8.4 Assembling the Triage Agent

The triage agent brings together the decision step and the three tools into a single function. The pattern is called **dispatch**: the agent dispatches the user input to the appropriate tool based on the LLM's decision.

### Dispatch Pattern

```
user_input
     |
     v
 decide_action()           <-- LLM picks from ALLOWED_ACTIONS
     |
     +-- retrieve_and_answer --> grounded answer from knowledge base
     +-- classify_only       --> topic label
     +-- refuse              --> standard refusal message
```

Each branch is a separate, independently testable function. The agent itself contains no domain logic — it only routes.
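
Because the router contains no domain logic, its branches can be tested without any LLM by injecting a stub decision function. The following is a sketch of that idea; `make_router` and `stub_tools` are hypothetical helpers, not part of `mock_toolkit`.

```python
from typing import Callable, Dict

def make_router(decide: Callable[[str], str],
                tools: Dict[str, Callable[[str], str]],
                fallback: str = "refuse") -> Callable[[str], str]:
    """Build a router that dispatches to the tool named by decide()."""
    def route(user_input: str) -> str:
        action = decide(user_input)
        tool = tools.get(action, tools[fallback])  # unknown action -> fallback
        return tool(user_input)
    return route

# Stub tools report which branch ran, so routing can be checked in isolation.
stub_tools = {
    "retrieve_and_answer": lambda text: "RETRIEVE",
    "classify_only": lambda text: "CLASSIFY",
    "refuse": lambda text: "REFUSE",
}

# Each stub decision function forces exactly one branch.
assert make_router(lambda _: "retrieve_and_answer", stub_tools)("q") == "RETRIEVE"
assert make_router(lambda _: "classify_only", stub_tools)("q") == "CLASSIFY"
assert make_router(lambda _: "refuse", stub_tools)("q") == "REFUSE"
assert make_router(lambda _: "delete_records", stub_tools)("q") == "REFUSE"
print("All routing branches behave as expected.")
```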

In [None]:
def triage_agent(user_input):
    """Route user input to the appropriate tool based on LLM decision.
    
    This is the core agent function. It:
    1. Asks the LLM to propose an action
    2. Validates the action against ALLOWED_ACTIONS
    3. Dispatches to the correct tool
    """
    action = decide_action(llm_client, user_input)

    if action == "retrieve_and_answer":
        return retrieve_and_answer(user_input)
    if action == "classify_only":
        return classify_only(user_input)
    return refuse()


print("triage_agent() assembled successfully.")
print()
print("Components:")
print("  - decide_action()       -> LLM decision step")
print("  - retrieve_and_answer() -> mock RAG retrieval")
print("  - classify_only()       -> topic classification")
print("  - refuse()              -> safe refusal")

---

## 8.5 Running the Agent

Three representative queries demonstrate the three dispatch paths. Each shows the action chosen and the result produced.

In [None]:
# Query 1: retrieve_and_answer path
query_1 = "Why did the central bank raise rates?"
action_1 = decide_action(llm_client, query_1)
result_1 = triage_agent(query_1)

print("=== Query 1: Factual question in knowledge base ===")
print(f"Input  : {query_1}")
print(f"Action : {action_1}")
print(f"Result : {result_1}")

In [None]:
# Query 2: classify_only path
query_2 = "Classify this sentence about loans and lending standards."
action_2 = decide_action(llm_client, query_2)
result_2 = triage_agent(query_2)

print("=== Query 2: Classification request ===")
print(f"Input  : {query_2}")
print(f"Action : {action_2}")
print(f"Result : {result_2}")

In [None]:
# Query 3: refuse path
query_3 = "Should we approve a loan for customer X?"
action_3 = decide_action(llm_client, query_3)
result_3 = triage_agent(query_3)

print("=== Query 3: Request outside allowed scope ===")
print(f"Input  : {query_3}")
print(f"Action : {action_3}")
print(f"Result : {result_3}")
print()
print("The agent correctly refuses to make a credit approval decision.")

---

# Group 3: Control & Guardrails

**Apply bounded loops and policy enforcement**

| Section | Topic |
|---------|-------|
| 8.6 | Controlled Loops |
| 8.7 | Why Infinite Loops Are Dangerous |
| 8.8 | Guardrails as Code |

---

## 8.6 Controlled Loops

Some agent tasks require multiple steps: classify first, then retrieve, then summarise. A **controlled loop** executes these steps while keeping a hard ceiling on how many LLM calls can be made.

### Why Bounded Loops Matter

| Without Bounds | With Bounds |
|---------------|-------------|
| Loop could run indefinitely | Maximum steps enforced in code |
| Cost escalates without limit | Budget per request is predictable |
| Errors compound across iterations | Loop terminates before damage spreads |
| No audit trail of steps | Each step is logged to the audit record |
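
To make the bound concrete, here is a minimal sketch of a genuinely multi-step loop in which some actions are non-terminal. The action names, `bounded_loop`, and `scripted_proposer` (a stand-in for the LLM that replays a fixed list) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

TERMINAL = {"finish", "refuse"}
NON_TERMINAL = {"classify", "retrieve"}

def bounded_loop(propose: Callable[[List[str]], str],
                 max_steps: int = 3) -> Tuple[str, List[dict]]:
    """Run propose() until a terminal action or until max_steps is reached."""
    history: List[str] = []
    audit: List[dict] = []
    for step in range(1, max_steps + 1):
        action = propose(history)
        if action not in TERMINAL | NON_TERMINAL:
            action = "refuse"              # unknown proposals are refused
        audit.append({"step": step, "action": action})
        history.append(action)
        if action in TERMINAL:
            return action, audit
    return "stopped_at_max_steps", audit   # hard ceiling reached

def scripted_proposer(script: List[str]) -> Callable[[List[str]], str]:
    """Stand-in for an LLM: replays a fixed list, repeating the last item."""
    def propose(history: List[str]) -> str:
        return script[min(len(history), len(script) - 1)]
    return propose

# A well-behaved plan finishes within the budget.
print(bounded_loop(scripted_proposer(["classify", "retrieve", "finish"])))
# A proposer that never finishes is cut off by max_steps.
print(bounded_loop(scripted_proposer(["retrieve"]), max_steps=3))
```

The second call shows the point of the bound: the proposer never chooses a terminal action, yet the loop still stops after three logged steps.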

In [None]:
def agent_loop(user_input, max_steps=3):
    """Run the triage agent in a bounded loop.
    
    In a multi-step agent the loop would continue until a terminal
    action is chosen or the max_steps ceiling is reached.
    
    For the triage agent every action is terminal, so the loop
    always completes in one step. The structure is identical to
    what a real multi-step agent would use.
    """
    audit_log = []
    result = None

    for step in range(1, max_steps + 1):
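        action = decide_action(llm_client, user_input)

        # Dispatch on the action already decided, so each step makes exactly
        # one LLM call and the logged action is the one that actually ran.
        if action == "retrieve_and_answer":
            result = retrieve_and_answer(user_input)
        elif action == "classify_only":
            result = classify_only(user_input)
        else:
            result = refuse()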

        log_entry = {
            "step": step,
            "input": user_input,
            "action": action,
            "output": result,
        }
        audit_log.append(log_entry)

        print(f"  Step {step}: action={action}")
        print(f"           output={result}")

        # Terminal actions stop the loop immediately
        if action in {"retrieve_and_answer", "classify_only", "refuse"}:
            print(f"  Loop terminated at step {step} (terminal action reached).")
            break

    else:
        print(f"  Loop terminated: max_steps={max_steps} reached.")

    return result, audit_log


print("=== Controlled Loop Demo ===")
final_result, log = agent_loop("Why did the central bank raise rates?", max_steps=3)
print()
print(f"Final result : {final_result}")
print(f"Audit entries: {len(log)}")

---

## 8.7 Why Infinite Loops Are Dangerous

Without a step limit, an agent that repeatedly calls an LLM will:

1. **Exhaust compute budget** — each LLM call has a monetary cost
2. **Compound errors** — a confused agent doubles down on bad decisions
3. **Block resources** — threads, connections, and memory are held
4. **Produce unbounded logs** — storage and monitoring costs spike

### Safe Demonstration of Unbounded Behaviour

The cell below illustrates a loop with no step cap. A safety counter stops it after a few iterations so the cell is safe to run, and no actual LLM calls are made.

> **Key Insight:** Termination conditions must be defined in code, not trusted to the LLM. The LLM cannot reliably decide when to stop looping.
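
A step limit can be combined with a wall-clock budget so that slow steps cannot hold resources indefinitely. The sketch below assumes a hypothetical `run_with_budgets` helper and a `step_fn` callback that returns `True` when the task is finished; the clock is injectable so the behaviour can be tested without waiting.

```python
import time
from typing import Callable

def run_with_budgets(step_fn: Callable[[int], bool],
                     max_steps: int = 5,
                     max_seconds: float = 2.0,
                     clock: Callable[[], float] = time.monotonic) -> str:
    """Call step_fn until it reports done, or until a budget runs out.

    step_fn(step) returns True when the task is finished.
    """
    deadline = clock() + max_seconds
    for step in range(1, max_steps + 1):
        if clock() >= deadline:
            return f"stopped: time budget exhausted before step {step}"
        if step_fn(step):
            return f"done at step {step}"
    return f"stopped: max_steps={max_steps} reached"

# A task that never reports completion is cut off by the step budget.
print(run_with_budgets(lambda step: False, max_steps=3))
# A task that finishes on its second step stops early.
print(run_with_budgets(lambda step: step == 2))
```

Note that the deadline is only checked between steps, so it does not interrupt a step that is already running. A per-call timeout on the LLM client would be needed for that.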

In [None]:
# Safe proof-of-concept: what would happen without a bound
# We use a safety counter to stop after 5 iterations
# In a real unbounded loop this would continue indefinitely

SAFETY_LIMIT = 5   # only for demonstration; a truly unbounded loop would have no such limit

def unsafe_loop_demo(user_input):
    """Demonstrates what an unbounded loop looks like.
    
    The SAFETY_LIMIT exists only to make this cell safe to run.
    Without it, the loop would run until the process is killed.
    """
    iteration = 0
    while True:   # <-- no max_steps
        iteration += 1
        print(f"  Iteration {iteration}: calling LLM (simulated)...")

        if iteration >= SAFETY_LIMIT:
            print(f"  [SAFETY LIMIT] Stopping after {SAFETY_LIMIT} iterations for demo.")
            print("  Without this limit the loop would have continued indefinitely.")
            break

print("=== Demonstration: Unbounded Loop (safe, uses counter) ===")
unsafe_loop_demo("some input")
print()
print("Lesson: always set max_steps when building agent loops.")
print("Never rely on the LLM to terminate the loop on its own.")

---

## 8.8 Guardrails as Code

Guardrails are pre-conditions checked **before** the agent runs. They reject inputs that would cause the agent to fail, produce poor output, or behave ambiguously.

### Guardrail Examples

| Guardrail | Condition | Reason |
|-----------|-----------|--------|
| Minimum length | Input < 8 characters | Too short to classify meaningfully |
| Low-confidence markers | Contains "maybe" or "not sure" | Ambiguous input leads to poor decisions |
| Profanity / PII filter | Contains sensitive patterns | Compliance requirement |
| Maximum length | Input > 2000 characters | Protects prompt budget |

> **Key Insight:** Guardrails should be applied before the LLM is called. Rejecting bad input early is faster, cheaper, and safer than letting the agent attempt to handle it.
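
Guardrails compose well when each one is a small function that returns either a rejection message or `None`. The sketch below shows that shape. `Guardrail`, `first_violation`, and the check functions are hypothetical, and the email regex is a deliberately crude stand-in for a real PII filter.

```python
import re
from typing import Callable, List, Optional

# Each guardrail returns a rejection message, or None if the input passes.
Guardrail = Callable[[str], Optional[str]]

MIN_CHARS = 8  # same minimum length as in the table above
# Crude stand-in for a PII check: anything that looks like an email address.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def too_short(text: str) -> Optional[str]:
    if len(text.strip()) < MIN_CHARS:
        return f"Guardrail: input too short (minimum {MIN_CHARS} characters)."
    return None

def contains_email(text: str) -> Optional[str]:
    if EMAIL_PATTERN.search(text):
        return "Guardrail: input appears to contain an email address."
    return None

GUARDRAILS: List[Guardrail] = [too_short, contains_email]

def first_violation(text: str) -> Optional[str]:
    """Return the first guardrail message triggered, or None if all pass."""
    for check in GUARDRAILS:
        message = check(text)
        if message is not None:
            return message
    return None

print(first_violation("Why did the central bank raise rates?"))
print(first_violation("hi"))
print(first_violation("Contact me at jane@example.com about rates."))
```

With this layout, adding a guardrail means appending one function to `GUARDRAILS`, and each check can be unit-tested on its own.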

In [None]:
def triage_agent_with_guardrails(user_input):
    """Triage agent with input guardrails applied before the LLM is called.
    
    Guardrail 1: reject inputs shorter than 8 characters.
    Guardrail 2: reject inputs containing uncertainty markers.
    Otherwise: delegate to the standard triage agent.
    """
    # Guardrail 1: minimum length
    if len(user_input.strip()) < 8:
        return "Guardrail: input too short to process (minimum 8 characters)."

    # Guardrail 2: low-confidence markers indicate the user is uncertain
    low_confidence_markers = ["maybe", "not sure", "i think", "possibly"]
    if any(marker in user_input.lower() for marker in low_confidence_markers):
        return "Guardrail: input contains uncertainty markers. Please rephrase as a specific question."

    # All guardrails passed: delegate to triage agent
    return triage_agent(user_input)


# Demonstrate guardrails triggering and passing
guardrail_tests = [
    "hi",                                          # too short
    "Maybe the bank raised rates?",                # low confidence marker
    "Not sure why inflation is high.",             # low confidence marker
    "Why did the central bank raise rates?",       # passes all guardrails
]

print("=== Guardrail Demonstrations ===")
for text in guardrail_tests:
    result = triage_agent_with_guardrails(text)
    print(f"  Input : {text}")
    print(f"  Result: {result}")
    print()

---

# Group 4: Failure Modes & Auditability

**Handle failures and build audit trails**

| Section | Topic |
|---------|-------|
| 8.9 | Logging Agent Decisions |
| 8.10 | Failure Mode Demonstrations |

---

## 8.9 Logging Agent Decisions

Every agent decision should be logged. The audit trail serves three purposes:

| Purpose | What to Log |
|---------|-------------|
| **Debugging** | Input, action chosen, output produced |
| **Compliance** | Timestamp, step number, refusal reason |
| **Improvement** | Patterns of refusal, unexpected actions |

An auditable agent makes the same decisions as a non-auditable one but records evidence that decisions were made safely.
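
The compliance row in the table calls for a timestamp and a refusal reason. A minimal sketch of such a record follows; `make_audit_record` and the `refusal_reason` field are hypothetical, and the example values are illustrative only.

```python
import json
from datetime import datetime, timezone
from typing import Optional

def make_audit_record(step: int, user_input: str, action: str,
                      output: str, refusal_reason: Optional[str] = None) -> dict:
    """Build one JSON-serialisable audit entry with a UTC timestamp."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "input": user_input,
        "action_chosen": action,
        "output": output,
        "refusal_reason": refusal_reason,  # None unless the action was a refusal
    }

record = make_audit_record(
    step=1,
    user_input="Should we approve a loan for customer X?",
    action="refuse",
    output="I am not authorised to act on this request.",
    refusal_reason="credit decisions are outside the allowed scope",
)
print(json.dumps(record, indent=2))
```

Using a timezone-aware UTC timestamp avoids ambiguity when logs from several machines are merged.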

In [None]:
def auditable_triage_agent(user_input, step_number=1):
    """Triage agent that returns both the result and a structured audit record."""
    action = decide_action(llm_client, user_input)

    if action == "retrieve_and_answer":
        output = retrieve_and_answer(user_input)
    elif action == "classify_only":
        output = classify_only(user_input)
    else:
        output = refuse()

    audit_record = {
        "step": step_number,
        "input": user_input,
        "action_chosen": action,
        "output": output,
    }

    return output, audit_record


# Run three queries and collect audit records
audit_queries = [
    "Why did the central bank raise rates?",
    "Classify this statement about monetary policy.",
    "Should we approve a loan for customer X?",
]

all_audit_records = []

print("=== Auditable Agent Runs ===")
for i, query in enumerate(audit_queries, start=1):
    result, record = auditable_triage_agent(query, step_number=i)
    all_audit_records.append(record)
    print(f"Step {i}: {record['action_chosen']} -> {result}")

print()
print("=== Full Audit Log (JSON) ===")
print(json.dumps(all_audit_records, indent=2))

---

## 8.10 Failure Mode Demonstrations

Three failure scenarios show how the agent degrades safely — and where mocks fall short:

| Scenario | Input | Expected Behaviour |
|----------|-------|-------------------|
| Off-topic | Sports question | Agent refuses |
| Malformed LLM response | Non-JSON string from LLM | Fallback to refuse |
| Keyword collision | "I rate this service poorly" | Mock picks wrong action — limitation of keyword matching vs real LLM |
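
The malformed-response scenario relies on parsing that never raises and never lets an unvalidated value through. One possible shape for that logic is sketched below; `parse_action`, `ALLOWED`, and the `{"action": ...}` JSON layout are assumptions, since the actual parsing code lives in `mock_toolkit.py`.

```python
import json

ALLOWED = {"retrieve_and_answer", "classify_only", "refuse"}

def parse_action(raw_response: str) -> str:
    """Extract a validated action from raw LLM text; refuse on any problem."""
    try:
        payload = json.loads(raw_response)
    except (json.JSONDecodeError, TypeError):
        return "refuse"                      # not JSON at all
    if not isinstance(payload, dict):
        return "refuse"                      # JSON, but not an object
    action = payload.get("action")
    if not isinstance(action, str) or action not in ALLOWED:
        return "refuse"                      # missing, wrong type, or not allowed
    return action

samples = [
    '{"action": "retrieve_and_answer"}',     # well-formed and allowed
    "I choose: retrieve_and_answer",         # free text, not JSON
    '["retrieve_and_answer"]',               # valid JSON, wrong shape
    '{"action": "delete_records"}',          # valid JSON, action not allowed
    '{"tool": "classify_only"}',             # valid JSON, missing key
]
for raw in samples:
    print(f"{raw!r:40} -> {parse_action(raw)}")
```

Only the first sample resolves to a non-refusal action. Every other path ends in `refuse`, so a parsing problem can never widen what the agent is allowed to do.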

In [None]:
# Failure Mode 1: Off-topic input -> agent refuses
print("=== Failure Mode 1: Off-topic input ===")
query = "Who won the championship football match last night?"
result, record = auditable_triage_agent(query, step_number=1)
print(f"Input  : {query}")
print(f"Action : {record['action_chosen']}")
print(f"Output : {result}")
print("[CORRECT] Agent refuses because sports is not in the knowledge domain.")
print()

In [None]:
# Failure Mode 2: Malformed LLM response -> fallback to refuse

class BrokenMockLLMClient:
    """Simulates an LLM that returns malformed output."""
    def chat(self, prompt):
        return "I choose: retrieve_and_answer"  # not valid JSON


broken_client = BrokenMockLLMClient()

print("=== Failure Mode 2: Malformed LLM response ===")
raw_response = broken_client.chat("any prompt")
print(f"Raw LLM response: {raw_response}")

# decide_action with the broken client
malformed_action = decide_action(broken_client, "Why did the central bank raise rates?")
print(f"Action resolved : {malformed_action}")
print("[CORRECT] decide_action falls back to 'refuse' when JSON parsing fails.")
print()

In [None]:
# Failure Mode 3: Keyword collision -> mock makes the wrong decision
print("=== Failure Mode 3: Keyword collision ===")
query = "I rate this service very poorly."
result, record = auditable_triage_agent(query, step_number=1)
print(f"Input  : {query}")
print(f"Action : {record['action_chosen']}")
print(f"Output : {result}")
print()
print("[BUG] The mock matched 'rate' and retrieved interest rate data,")
print("but the user was complaining about service quality.")
print("A real LLM is far more likely to recognise the difference; keyword matching cannot.")
print("This is one reason production agents use a real LLM for the decision step.")

---

# Group 5: Automation vs Autonomy

**Know when agents are appropriate and when they are not**

| Section | Topic |
|---------|-------|
| 8.11 | When NOT to Use Agents |

---

## 8.11 When NOT to Use Agents

Agents are powerful but they are not the right tool for every problem. The most important engineering decision is knowing when to use a pipeline instead.

### Automation vs Autonomy

| Property | Automation | Autonomy |
|----------|-----------|----------|
| Who decides? | Code decides every step | LLM decides the next step |
| Predictability | High | Lower |
| Appropriate for | ETL, formatting, summarisation | Triage, routing, multi-step reasoning |
| Risk | Low | Moderate (bounded by guardrails) |
| Auditability | Easy | Requires deliberate logging |

### When to Use Agents vs Pipelines

| Use an Agent | Use a Pipeline |
|-------------|----------------|
| The correct next step depends on the content of the input | The correct next step is always the same regardless of input |
| There are multiple possible tools and the LLM can choose | There is only one tool or one sequence of steps |
| The task requires multi-step reasoning | The task is single-step |
| Routing between departments or workflows | Formatting, summarisation, translation |

### Tasks Agents Should NOT Do

| Task | Why Not |
|------|--------|
| Approve or reject financial transactions | Irreversible; requires human accountability |
| Delete records from a database | Irreversible; no safe undo |
| Send emails to customers | Irreversible; reputation risk |
| Modify user permissions or roles | Security-critical; needs human review |
| Make medical or legal recommendations | Regulated domain; liability |
| Execute trades on behalf of clients | Financial regulation; fiduciary duty |

> **Key Insight:** Enterprise organisations typically prefer **bounded automation** over autonomy. An agent that can refuse is more trustworthy than an agent that always acts. The goal is not maximum autonomy but maximum reliability within a defined, auditable scope.

### Decision Framework

```
Is the correct next step always the same?
         |
     YES |                    NO
         v                    v
   Use a Pipeline       Is the action reversible?
                                  |
                        YES |          NO
                            v          v
                    Agent may be    Require human
                    appropriate     approval first
                            |
                    Is the action in a regulated domain?
                            |
                    YES |       NO
                        v       v
                    Require   Agent with
                    human     guardrails
                    review    is appropriate
```
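
The framework above can be read as three questions asked in order. As a sketch, it maps onto a small function; `recommend` and its parameter names are hypothetical, and the returned strings simply echo the labels in the diagram.

```python
def recommend(next_step_always_same: bool,
              action_reversible: bool,
              regulated_domain: bool) -> str:
    """Walk the decision framework top to bottom and return its recommendation."""
    if next_step_always_same:
        return "Use a pipeline"
    if not action_reversible:
        return "Require human approval first"
    if regulated_domain:
        return "Require human review"
    return "Agent with guardrails is appropriate"

# Fixed formatting job: the next step never varies.
print(recommend(next_step_always_same=True, action_reversible=True, regulated_domain=False))
# Deleting records: content-dependent but irreversible.
print(recommend(next_step_always_same=False, action_reversible=False, regulated_domain=False))
# Read-only triage of support questions.
print(recommend(next_step_always_same=False, action_reversible=True, regulated_domain=False))
```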

---

# Module Summary

## Key Takeaways

| Concept | Remember |
|---------|----------|
| **Pipelines vs Agents** | Pipelines have fixed steps; agents let the LLM choose the step |
| **Propose-Validate-Execute** | The LLM proposes; code validates; system executes |
| **Allowed Actions** | Keep the set small and explicit; anything not listed is refused |
| **Decision Step** | Always parse LLM output safely; fall back to refuse on any error |
| **Refusal is a Feature** | Refusing is a safe, auditable action, not a failure |
| **Bounded Loops** | Always set max_steps; never trust the LLM to terminate a loop |
| **Guardrails** | Validate input before calling the LLM; reject bad input early |
| **Audit Trails** | Log every step: input, action chosen, output produced |

## The Agent Mental Model

> **An agent is not an autonomous system. It is a decision router with guardrails.**
>
> The LLM reads, proposes, and labels. The code validates, dispatches, and logs.
> Nothing executes without the code's permission.

## What's Next

You now have all the components required for the capstone:

- Module 5: Embeddings and vector retrieval
- Module 6: LLM API clients and structured output
- Module 7: RAG pipelines and grounded generation
- Module 8: Agents, guardrails, and auditability

The capstone will ask you to combine these into a bounded, auditable, production-grade agent system.

---

# Practice Exercises

## Exercise 1: Extend the Allowed Action Set

Add a fourth action called `summarise` to `ALLOWED_ACTIONS`. Implement a `summarise(text)` tool function that returns the first sentence of the input as a summary. Update `triage_agent` to dispatch to this new tool. Update `MockLLMClient` so that inputs containing the word "summarise" or "summary" return the new action.

## Exercise 2: Add a Third Guardrail

Extend `triage_agent_with_guardrails` to add a third guardrail: reject any input longer than 200 characters with the message `"Guardrail: input too long (maximum 200 characters)."`. Test it with an input that exceeds the limit.

## Exercise 3: Build an Auditable Loop

Write a function `run_batch(queries, max_steps=3)` that accepts a list of query strings, runs each through `auditable_triage_agent`, collects all audit records, and prints a summary table showing: query (truncated to 40 chars), action chosen, and first 40 chars of output.

## Exercise 4: Simulate a Regulated Task Rejection

Create a `compliance_guardrail(user_input)` function that checks whether the input contains any of the following high-risk phrases: `"approve loan"`, `"delete record"`, `"send email"`, `"modify permission"`. If any are found, return a compliance refusal message before the agent is ever called. Demonstrate it with two inputs: one that triggers it and one that passes through to the agent.