# Debug Drill 10: The Runaway Agent

**Symptom:** Your colleague deployed a customer support agent that can issue refunds. A customer asked for a $25 refund, but the agent issued $2,500 by calling the refund tool 100 times in a loop.

**Your task:** Add guardrails, stop conditions, and write a postmortem.

**Time:** 15 minutes

In [None]:
import time
from typing import Dict, List, Any

In [None]:
# Simulated tools
refund_log = []

def issue_refund(order_id: str, amount: float, reason: str) -> Dict:
    """Tool to issue a refund"""
    refund_log.append({"order_id": order_id, "amount": amount, "reason": reason})
    return {"status": "success", "refund_id": f"REF-{len(refund_log)}", "amount": amount}

def lookup_order(order_id: str) -> Dict:
    """Tool to look up order details"""
    return {"order_id": order_id, "total": 49.99, "status": "delivered"}

def send_email(to: str, subject: str, body: str) -> Dict:
    """Tool to send email"""
    return {"status": "sent", "to": to}

In [None]:
# ===== COLLEAGUE'S CODE (CONTAINS BUGS - NO GUARDRAILS) =====

def agent_loop_buggy(user_request: str, max_iterations: int = 1000):  # BUG: way too high!
    """
    Agent with NO guardrails:
    - No max tool calls
    - No spending limits
    - No human approval
    - No stop condition checking
    """
    iteration = 0
    total_refunded = 0
    
    while iteration < max_iterations:
        # Simulated "confused" agent behavior.
        # In a real agent, an LLM would decide which action to take here.
        
        # BUG: Agent keeps trying to "help more" by issuing more refunds
        if "refund" in user_request.lower():
            result = issue_refund("ORD-123", 25.00, "customer request")
            total_refunded += 25.00
            print(f"Iteration {iteration}: Issued ${result['amount']} refund")
        
        # BUG: No proper stop condition - just keeps going
        iteration += 1
        
        # Demo safety valve: stop after 11 iterations instead of running all 1000
        if iteration > 10:
            break
    
    return {"iterations": iteration, "total_refunded": total_refunded}

# Run the buggy agent
refund_log.clear()
print("Running buggy agent...")
result = agent_loop_buggy("I need a refund for my order")
print(f"\nResult: {result}")
print(f"Total refunds issued: {len(refund_log)}")
print(f"Total amount refunded: ${sum(r['amount'] for r in refund_log):.2f}")
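
One way to confirm the damage after a run like this is to audit the refund log against order totals. The sketch below is only an illustration, not part of the drill's solution: `find_over_refunded_orders`, `sample_log`, and `sample_totals` are hypothetical names, and the sample data assumes the 11 refunds of $25 produced by the demo run, issued against the $49.99 order that `lookup_order` returns.

```python
from collections import defaultdict
from typing import Dict, List


def find_over_refunded_orders(
    log: List[Dict], order_totals: Dict[str, float]
) -> Dict[str, float]:
    """Return {order_id: total_refunded} for orders refunded beyond their total."""
    refunded: Dict[str, float] = defaultdict(float)
    for entry in log:
        refunded[entry["order_id"]] += entry["amount"]
    # Orders missing from order_totals are treated as having a total of 0
    return {
        order_id: amount
        for order_id, amount in refunded.items()
        if amount > order_totals.get(order_id, 0.0)
    }


# Hypothetical sample data mirroring the demo run
sample_log = [
    {"order_id": "ORD-123", "amount": 25.00, "reason": "customer request"}
    for _ in range(11)
]
sample_totals = {"ORD-123": 49.99}

flagged = find_over_refunded_orders(sample_log, sample_totals)
print(flagged)  # {'ORD-123': 275.0}
```

A check like this only detects the problem after the money has gone out, so it complements the guardrails rather than replacing them.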

## Your Investigation

**Q1:** List all the missing guardrails in the buggy code.

In [None]:
# TODO: List missing guardrails
# 1. 
# 2. 
# 3. 
# 4. 

**Q2:** What's the business risk of each missing guardrail?

In [None]:
# TODO: Describe risks
# No max iterations: 
# No spending limit: 
# No human approval: 
# No stop condition: 

## Fix the Bug

**Q3:** Build a safe agent with proper guardrails.

In [None]:
# TODO: Define guardrail configuration

GUARDRAILS = {
    "max_iterations": 5,           # Maximum agent loop iterations
    "max_refund_single": 50.00,    # Max single refund without approval
    "max_refund_total": 100.00,    # Max total refunds per session
    "require_approval_above": 100.00,  # Human approval threshold
    "allowed_tools": ["issue_refund", "lookup_order", "send_email"],
    "forbidden_actions": ["delete_account", "change_password"]
}

class AgentState:
    def __init__(self):
        self.iterations = 0
        self.total_refunded = 0.0
        self.actions_taken = []
        self.task_complete = False
        self.needs_human = False
        self.error = None

In [None]:
def check_guardrails(state: AgentState, action: str, params: Dict) -> tuple:
    """
    Check if action is allowed. Returns (allowed, reason).
    """
    # Reject forbidden or unknown actions before any other check
    if action in GUARDRAILS["forbidden_actions"]:
        return False, f"Action '{action}' is forbidden"
    if action not in GUARDRAILS["allowed_tools"]:
        return False, f"Action '{action}' not in allowed tools"

    # Check iteration limit
    if state.iterations >= GUARDRAILS["max_iterations"]:
        return False, f"Max iterations ({GUARDRAILS['max_iterations']}) reached"

    # Check refund limits
    if action == "issue_refund":
        amount = params.get("amount", 0)

        # Human approval threshold. Checked first: it is higher than the
        # single refund limit, so it would otherwise never be reached.
        if amount > GUARDRAILS["require_approval_above"]:
            state.needs_human = True
            return False, f"Amount ${amount} requires human approval"

        # Single refund limit
        if amount > GUARDRAILS["max_refund_single"]:
            return False, f"Refund ${amount} exceeds single limit ${GUARDRAILS['max_refund_single']}"

        # Total refund limit
        if state.total_refunded + amount > GUARDRAILS["max_refund_total"]:
            return False, f"Would exceed total refund limit ${GUARDRAILS['max_refund_total']}"

    return True, "OK"


def agent_loop_safe(user_request: str) -> Dict:
    """
    Agent WITH guardrails.
    """
    state = AgentState()
    
    while not state.task_complete and state.error is None:
        state.iterations += 1
        
        # Simulate agent deciding on action
        if "refund" in user_request.lower() and state.total_refunded == 0:
            action = "issue_refund"
            params = {"order_id": "ORD-123", "amount": 25.00, "reason": "customer request"}
        else:
            # Task complete - no more actions needed
            state.task_complete = True
            continue
        
        # CHECK GUARDRAILS
        allowed, reason = check_guardrails(state, action, params)
        
        if not allowed:
            print(f"BLOCKED: {reason}")
            state.error = reason
            break
        
        # Execute action
        if action == "issue_refund":
            result = issue_refund(**params)
            state.total_refunded += params["amount"]
            state.actions_taken.append({"action": action, "result": result})
            print(f"Iteration {state.iterations}: Issued ${params['amount']} refund")
            
            # STOP CONDITION: Refund issued, task complete
            state.task_complete = True
    
    return {
        "iterations": state.iterations,
        "total_refunded": state.total_refunded,
        "task_complete": state.task_complete,
        "error": state.error,
        "actions": state.actions_taken
    }
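
The guardrails above cap refunds per call and per session, but nothing ties a refund to the value of the order itself. A possible extra check, sketched here on the assumption that the order total comes from something like `lookup_order`, rejects any refund whose running sum would exceed that total. `check_order_cap` is a hypothetical helper, not part of the code above.

```python
from typing import Tuple


def check_order_cap(
    order_total: float, already_refunded: float, amount: float
) -> Tuple[bool, str]:
    """Hypothetical guardrail: cumulative refunds may not exceed the order total."""
    if amount <= 0:
        return False, "Refund amount must be positive"
    new_total = already_refunded + amount
    # Round to cents so float noise does not trip the comparison
    if round(new_total, 2) > round(order_total, 2):
        return False, (
            f"Refunds would total ${new_total:.2f}, "
            f"above order total ${order_total:.2f}"
        )
    return True, "OK"


first = check_order_cap(49.99, 0.00, 25.00)    # first $25 refund fits
second = check_order_cap(49.99, 25.00, 25.00)  # a second one would exceed $49.99
print(first)
print(second)
```

With the $49.99 order from `lookup_order`, the first $25 refund passes and a second is rejected, which would have stopped the runaway loop after one call even without a session limit.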

In [None]:
# Test the safe agent
refund_log.clear()
print("Running SAFE agent...")
result = agent_loop_safe("I need a refund for my order")
print(f"\nResult: {result}")
print(f"Total refunds issued: {len(refund_log)}")
print(f"Total amount refunded: ${sum(r['amount'] for r in refund_log):.2f}")

## Self-Check

In [None]:
# Verify guardrails work
assert result['iterations'] <= GUARDRAILS['max_iterations'], "Should respect iteration limit"
assert result['total_refunded'] <= GUARDRAILS['max_refund_total'], "Should respect refund limit"
assert result['task_complete'], "Should complete task"
assert len(refund_log) == 1, "Should only issue ONE refund"

print("PASS: Agent has proper guardrails!")
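
The self-check above only exercises the happy path. As a rough sketch of a negative-path test, the block below simulates a runaway agent that requests a $25 refund 100 times against a small budget object. `RefundBudget` and its fields are hypothetical stand-ins that mirror `max_iterations` and `max_refund_total`; they are not part of the code above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RefundBudget:
    """Hypothetical session budget: caps both call count and total amount."""
    max_calls: int = 5
    max_total: float = 100.00
    calls: int = 0
    total: float = 0.0
    blocked: List[str] = field(default_factory=list)

    def try_refund(self, amount: float) -> bool:
        """Record the refund and return True if allowed, else log why it was blocked."""
        if self.calls >= self.max_calls:
            self.blocked.append("call limit")
            return False
        if self.total + amount > self.max_total:
            self.blocked.append("total limit")
            return False
        self.calls += 1
        self.total += amount
        return True


budget = RefundBudget()

# Simulated runaway agent: asks for a $25 refund 100 times
granted = sum(budget.try_refund(25.00) for _ in range(100))

print(f"Granted: {granted}, total: ${budget.total:.2f}, blocked: {len(budget.blocked)}")
```

Under these assumed limits, four refunds go through before the total cap stops the rest, so the worst case is bounded at $100 however long the loop runs.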

## Postmortem

Write 3 bullets:
1. **Root cause:** 
2. **How we detected it:** 
3. **Prevention for next time:** 