In [None]:
# AI Agents: Complete Guide

This notebook provides a comprehensive learning resource for AI Agents, covering 80 topics with detailed explanations, decision-making frameworks (when/why to use), and complete working code examples.

## Table of Contents: 80 AI Agent Topics

### Part 1: Agent Fundamentals (6 topics)
1. What are AI Agents?
2. Agent Architectures (Reactive, Deliberative, Hybrid)
3. Agent Components (Perception, Planning, Action, Memory)
4. Types of Agents (Goal-based, Utility-based, Learning agents)
5. Environment Types (deterministic/stochastic, episodic/sequential)
6. Agent Lifecycle & Event Loops

### Part 2: Architectures & Patterns (6 topics)
7. Reactive Agents / Behavior-based
8. Deliberative (Planning) Agents
9. Hybrid Architectures (Sense-Plan-Act + Reactive)
10. Modular Agent Design
11. Agent-Oriented Programming
12. Agent Communication Patterns

### Part 3: Reasoning & Planning (6 topics)
13. Symbolic Planning (PDDL, STRIPS)
14. Heuristic Search (A*, Best-First)
15. Task & Motion Planning
16. Probabilistic Planning (POMDPs)
17. Hierarchical Task Networks (HTN)
18. Real-time Planning & Replanning

### Part 4: Learning & Adaptation (6 topics)
19. Reinforcement Learning agents
20. Imitation Learning & Behavioral Cloning
21. Offline RL vs Online RL
22. Meta-learning for agents
23. Continual Learning for Agents
24. Multi-task & Transfer Learning

### Part 5: Language-Capable Agents (6 topics)
25. LLM-based Agents (Planner/Executor)
26. ReAct & Thought-Action loops
27. Tool-Enabled Agents (APIs, Search, Databases)
28. Agents with Retrieval (RAG + agents)
29. Conversational Agents and Dialog Management
30. Safety & Alignment for LLM Agents

### Part 6: Tools & Integrations (6 topics)
31. Tool design patterns (idempotent, side-effect control)
32. Tool authorization & sandboxing
33. Tool orchestration & workflows
34. Observability & tool-level logging
35. External knowledge sources (APIs, KB, DBs)
36. Agent testing & sandboxing strategies

### Part 7: Memory & State Management (6 topics)
37. Short-term vs Long-term memory
38. Vector memory & embeddings for agents
39. Memory retrieval strategies (LRU, recency+relevance)
40. Memory condensation & summarization
41. Privacy & retention policies for memory
42. Grounding memory with sources & citations

### Part 8: Multi-Agent Systems (6 topics)
43. Coordination & Negotiation
44. Emergent behavior in MAS
45. Communication protocols and ACLs
46. Distributed planning & consensus
47. Market-based & auction mechanisms
48. Safety in multi-agent contexts

### Part 9: Evaluation & Metrics (6 topics)
49. Task success metrics & rewards
50. Efficiency metrics (latency, cost)
51. Groundedness & hallucination rate
52. Human preference & UX metrics
53. Robustness to distribution shift
54. Interpretability & auditability

### Part 10: Safety, Ethics & Governance (6 topics)
55. Safety layers (validators, simulators)
56. Ethical considerations (bias, fairness)
57. Governance and access control
58. Red-team testing for agents
59. Fail-safe & graceful degradation
60. Consent & user control over agents

### Part 11: Deployment & Scaling (6 topics)
61. Edge vs Cloud agents
62. Autoscaling agent services
63. Caching & partial result reuse
64. Monitoring & alerting for agents
65. Cost-optimization strategies
66. Versioning & rollbacks for agent policies

### Part 12: Advanced Patterns (6 topics)
67. Agent-of-Agents / Meta-agents
68. Self-reflective agents (self-debugging)
69. Curriculum learning for agents
70. Human-in-the-loop & escalation policies
71. Hybrid symbolic-LLM agents
72. Agents for complex simulations

### Part 13: Frameworks & Tooling (8 topics)
73. LangChain agents patterns
74. AutoGen / Colang / Taskmatrix tools
75. Microsoft/Anthropic/OpenAI agent SDKs
76. Benchmarks & agent eval toolkits
77. Debugging agents (traces, replays)
78. Security tools & policy enforcement
79. Example reference projects & templates
80. Future directions & research challenges

In [None]:
## Topic 1: What are AI Agents?

### Definition
An AI agent is a software entity that:
- **Perceives** its environment (sensors, inputs, API responses, user queries)
- **Maintains state** (memory, internal models, conversation history)
- **Reasons or plans** what to do next (symbolic planners, RL policies, LLM-based reasoning)
- **Acts** to achieve goals (calls tools, APIs, writes outputs, updates databases)
- **Learns** and adapts over time (optional, but often valuable)

### Core Components
1. **Perception Layer**: Sensors, API clients, event listeners, input parsing
2. **Memory**: Short-term (current context), long-term (history, learned patterns)
3. **Reasoning/Planning Engine**: Decides next action (rule-based, heuristic search, LLM, RL policy)
4. **Tool Executor**: Runs tools, APIs, handles side effects safely
5. **Tools/Actuators**: External capabilities (web search, calculator, database, API calls, file I/O)

### Why Use AI Agents?
✅ **Automate multi-step workflows**: Chain tool calls together (search → summarize → email)  
✅ **Complex reasoning**: Decompose tasks, plan sequences, adapt to errors  
✅ **Maintain context**: Remember history, manage conversation state  
✅ **Safe external tool access**: Control what tools agents can call, audit all actions  
✅ **Scalable orchestration**: Coordinate multiple systems (APIs, databases, services)  
✅ **Error recovery**: Agents can retry, backtrack, or escalate when stuck  
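
The error-recovery point can be sketched as a small wrapper that retries a failing tool and then escalates. This is only an illustration under stated assumptions: `call_with_recovery` and `flaky_search` are hypothetical names, and escalation is reduced to returning a message.

```python
from typing import Callable

def call_with_recovery(tool: Callable[[str], str], arg: str,
                       max_retries: int = 2) -> str:
    """Retry a failing tool, then escalate instead of crashing."""
    last_error = None
    for _ in range(1 + max_retries):
        try:
            return tool(arg)
        except Exception as exc:  # Demo only; catch narrower errors in real code
            last_error = exc
    return f"ESCALATED after {1 + max_retries} attempts: {last_error}"

# Hypothetical flaky tool: fails on the first call, succeeds on the second.
calls = {"n": 0}

def flaky_search(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("search backend timed out")
    return f"results for {query}"

print(call_with_recovery(flaky_search, "PTO policy"))
```

A real agent would likely add backoff between attempts and route the escalation to a person or a fallback tool.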

### When NOT to Use Agents
❌ Simple single-turn queries (direct LLM call is faster)  
❌ Real-time latency-critical systems (<100ms response required)  
❌ No external tool access needed (overhead of orchestration not justified)  
❌ Fully deterministic logic (traditional software is simpler, auditable)  
❌ Highly sensitive operations without proven safety (use with caution or on-premise only)  

### Key Risks & Mitigations
| Risk | Mitigation |
|------|-----------|
| Hallucinations | Use retrieval, tool verification, validators |
| Untrusted tool calls | Sandbox tools, require explicit authorization |
| Privacy leakage | Enforce memory retention policies, redaction |
| Cost explosion | Cache results, limit tool calls, use cheaper models |
| Infinite loops | Set max iterations, timeouts, circuit breakers |
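
The "Infinite loops" mitigation in the table can be sketched as a bounded loop. This is a rough illustration, not a full circuit breaker: `run_with_limits` and its `step` callable are hypothetical stand-ins for one plan/execute cycle of an agent.

```python
import time
from typing import Callable, Optional

def run_with_limits(step: Callable[[int], Optional[str]],
                    max_iterations: int = 5,
                    timeout_s: float = 2.0) -> str:
    """Call `step` until it returns an answer, or stop at a limit.

    `step` returns None while the agent still needs more work.
    """
    deadline = time.monotonic() + timeout_s
    for i in range(max_iterations):
        if time.monotonic() > deadline:
            return "stopped: timeout"
        answer = step(i)
        if answer is not None:
            return answer
    return "stopped: max iterations reached"

# A step that never finishes is cut off instead of looping forever.
print(run_with_limits(lambda i: None))
# A step that finishes on its third cycle returns normally.
print(run_with_limits(lambda i: "done" if i == 2 else None))
```

The timeout is only checked between steps, so a single step that hangs would still need its own timeout.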

### Real-World Examples
- **Customer Support Bot**: Routes to appropriate tools (FAQ search, ticket creation, escalation)
- **Research Assistant**: Searches papers, summarizes findings, cites sources
- **Code Assistant**: Reads codebase, runs tests, suggests fixes
- **Personal Assistant**: Schedules meetings, sends emails, summarizes news
- **DevOps Automation**: Deploys code, monitors systems, triggers alerts

In [None]:
# Minimal Self-Contained AI Agent Example (no external LLM required)
from dataclasses import dataclass
from typing import Callable, Dict, List, Any, Tuple
import shlex
import re

# ============================================================================
# 1. Tool Wrapper & Registry
# ============================================================================

@dataclass
class Tool:
    """Represents a tool an agent can call."""
    name: str
    func: Callable[..., Any]
    description: str = ""
    
    def __repr__(self):
        return f"Tool({self.name}: {self.description})"

# ============================================================================
# 2. Simple Agent with Planning, Execution, and Memory
# ============================================================================

class SimpleAgent:
    """A minimal agent that perceives, plans, executes, and remembers."""
    
    def __init__(self, agent_name: str = "Agent"):
        self.name = agent_name
        self.tools: Dict[str, Tool] = {}
        self.memory: List[str] = []  # Short-term trace of interactions
        self.long_term_memory: Dict[str, Any] = {}  # Learnings from past interactions
    
    def add_tool(self, tool: Tool) -> None:
        """Register a tool the agent can use."""
        self.tools[tool.name] = tool
    
    def list_tools(self) -> List[str]:
        """List available tools."""
        return list(self.tools.keys())
    
    def remember(self, note: str) -> None:
        """Add to short-term memory (interaction trace)."""
        self.memory.append(note)
        if len(self.memory) > 20:  # Keep only recent 20 items
            self.memory.pop(0)
    
    def learn(self, key: str, value: Any) -> None:
        """Store learnings in long-term memory."""
        self.long_term_memory[key] = value
    
    def plan(self, instruction: str) -> Dict[str, str]:
        """
        Minimal planner: analyzes instruction and chooses action.
        In a real system, this could be an LLM or symbolic planner.
        Here we use keyword matching for demo purposes.
        """
        text = instruction.lower()
        
        # Simple heuristics to choose tools
        if any(kw in text for kw in ['calculate', 'compute', 'math', '+', '-', '*', '/']):
            return {'action': 'use_tool', 'tool': 'calculator', 'input': instruction}
        
        if any(kw in text for kw in ['search', 'find', 'look up', 'what is', 'who is']):
            # Strip the command word regardless of case ("Search ..." as well as "search ...")
            query = re.sub(r'\b(search|find)\s+', '', instruction, flags=re.IGNORECASE).strip()
            return {'action': 'use_tool', 'tool': 'knowledge_base', 'input': query}
        
        if any(kw in text for kw in ['remember', 'note', 'save', 'store']):
            return {'action': 'remember', 'input': instruction}
        
        # Default: try knowledge base
        return {'action': 'use_tool', 'tool': 'knowledge_base', 'input': instruction}
    
    def execute(self, plan: Dict[str, str]) -> str:
        """Execute the plan (either use tool or remember)."""
        action = plan.get('action', 'unknown')
        
        if action == 'remember':
            self.learn('last_note', plan['input'])
            return f"✓ Remembered: {plan['input']}"
        
        if action == 'use_tool':
            tool_name = plan.get('tool')
            tool_input = plan.get('input', '')
            
            if tool_name not in self.tools:
                return f"✗ Tool '{tool_name}' not available. Available: {self.list_tools()}"
            
            try:
                tool = self.tools[tool_name]
                result = tool.func(tool_input)
                return str(result)
            except Exception as e:
                return f"✗ Tool error: {e}"
        
        return "✗ Unknown action"
    
    def act(self, instruction: str) -> str:
        """
        Main agent loop: perceive → plan → execute → remember.
        This is the core agent cycle.
        """
        # PERCEIVE: process input
        self.remember(f"INPUT: {instruction}")
        
        # PLAN: decide what to do
        plan = self.plan(instruction)
        self.remember(f"PLAN: {plan}")
        
        # EXECUTE: carry out the plan
        outcome = self.execute(plan)
        
        # REMEMBER: store interaction for learning
        self.remember(f"OUTPUT: {outcome}")
        
        return outcome

# ============================================================================
# 3. Tool Implementations
# ============================================================================

def calculator_tool(expression: str) -> float:
    """
    Calculator tool: evaluates the arithmetic found in the input text.
    Characters outside a small whitelist are dropped before evaluation.
    Demo only: a production tool should use a real expression parser.
    """
    # Keep only arithmetic characters (this also drops spaces and words)
    safe_chars = set('0123456789+-*/().%')
    filtered = ''.join(ch for ch in expression if ch in safe_chars)
    
    # Require at least one digit; reject '**' so huge powers cannot stall eval
    if not any(ch.isdigit() for ch in filtered) or '**' in filtered:
        raise ValueError(f"Invalid expression: {expression}")
    
    try:
        # The whitelist admits no letters, so no names or attributes can reach eval
        result = eval(filtered, {"__builtins__": {}})
        return float(result)
    except Exception as e:
        raise ValueError(f"Calculation failed: {e}") from e

def knowledge_base_tool(query: str) -> str:
    """
    Simulated knowledge base: local search over a small company KB.
    In production, this would query a real database or vector store.
    """
    kb = {
        'vacation days': '15 days PTO per year, accrues 1.25 days/month',
        'health insurance': 'Employee pays 20% premium; benefits start after 30 days',
        'remote work': 'Up to 3 days/week remote with manager approval',
        'holidays': '10 paid holidays (New Year, Memorial Day, July 4, Labor Day, Thanksgiving 2 days, Christmas 3 days)',
        '401k': 'Eligible after 90 days; 50% match up to 6% salary; 4-year vesting',
        'sick leave': '8 days per year; no rollover',
        'maternity leave': '12 weeks paid; 6 weeks paid paternity'
    }
    
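    q = query.lower()
    
    # Prefer a full-key match so "maternity leave" is not answered by "sick leave"
    for key, value in kb.items():
        if key in q:
            return f"📖 Found in KB: {value}"
    
    # Fall back to whole-word overlap between the query and a key
    query_words = set(re.findall(r'[a-z0-9]+', q))
    for key, value in kb.items():
        if query_words & set(key.split()):
            return f"📖 Found in KB: {value}"
    
    return f"📖 No match found in KB for '{query}'. Try: {list(kb.keys())}"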

# ============================================================================
# 4. Demo: Agent in Action
# ============================================================================

print("="*70)
print("DEMO: SimpleAgent")
print("="*70)

# Create and configure agent
agent = SimpleAgent(agent_name="CompanyAssistant")
agent.add_tool(Tool("calculator", calculator_tool, "Evaluates math expressions"))
agent.add_tool(Tool("knowledge_base", knowledge_base_tool, "Searches company policies"))

print(f"\nAgent Name: {agent.name}")
print(f"Available Tools: {agent.list_tools()}\n")

# Test interactions
test_queries = [
    "How many vacation days do I get per year?",
    "Calculate 15 * 12 + 3.5",
    "What is the remote work policy?",
    "Remember that my manager is Alice",
]

for query in test_queries:
    print(f"\n👤 User: {query}")
    result = agent.act(query)
    print(f"🤖 Agent: {result}")

# Show memory trace
print("\n" + "="*70)
print("AGENT MEMORY TRACE (last 10 items):")
print("="*70)
for i, item in enumerate(agent.memory[-10:], 1):
    print(f"  {i}. {item}")

# Show long-term learnings
print("\n" + "="*70)
print("LONG-TERM LEARNINGS:")
print("="*70)
for key, value in agent.long_term_memory.items():
    print(f"  {key}: {value}")

In [None]:
## Topic 2: Agent Architectures

### Architecture Types

#### 1. Reactive Agents (Reflex Agents)
**How it works**: Map percepts directly to actions via condition-action rules.  
**Latency**: Very fast (milliseconds)  
**Complexity**: Simple, no state needed  
**When to use**: 
- Real-time control (robotics, games)
- Microservices with immediate responses
- Safety-critical systems (fast fallback)

**Pros**: Fast, predictable, easy to understand  
**Cons**: No planning, no learning, limited to reflex behavior

#### 2. Deliberative Agents (Planning Agents)
**How it works**: Build internal model, use search/optimization to plan action sequences.  
**Latency**: Slower (seconds to minutes), depends on planning complexity  
**Complexity**: High, requires domain modeling  
**When to use**:
- Multi-step tasks (assembly, scheduling)
- Constraint satisfaction (resource allocation)
- Complex reasoning (diagnosis, strategy)

**Pros**: Handles long-term goals, can reason about consequences  
**Cons**: Planning overhead, requires good world model

#### 3. Hybrid Architectures
**How it works**: Combine reactive + deliberative: planner sets goals, reactive layer handles immediate responses and safety.  
**Latency**: Mixed (fast reactions + slower planning in parallel)  
**Complexity**: Moderate, but well-structured  
**When to use**: Most modern assistants (balance responsiveness + reasoning)

**Pros**: Responsive yet thoughtful, can interrupt planner for safety  
**Cons**: More complex, harder to debug interactions

#### 4. LLM-Based Agent Architecture (Modern Split)
**Planner (LLM)**: Reads state, emits plan (tool calls, subtasks, reasoning)  
**Executor (Deterministic)**: Carries out tool calls, validates results, handles side effects  
**Integration Loop**: Executor feeds results back to LLM for next decision

```
User Query
    ↓
[LLM Planner] → "I should call web_search, then summarize"
    ↓
[Tool Executor] → Runs web_search, collects results
    ↓
[LLM Planner] → "Now I'll summarize" → emits answer
    ↓
User Answer (with citations)
```

**When to use**: Flexible, language-based reasoning required  
**Pros**: Uses LLM's reasoning, auditable tool calls, repeatable  
**Cons**: Latency (multiple LLM round-trips), cost (per-query LLM calls)
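
The integration loop in the diagram can be sketched with a mocked planner. This is a simplified illustration: `mock_planner`, `TOOLS`, and `run_agent` are hypothetical names, and the planner is a hard-coded stand-in for an LLM call.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool registry; a real system would wrap APIs here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": lambda q: f"3 articles about {q}",
    "summarize": lambda text: f"Summary of: {text}",
}

def mock_planner(query: str, observations: List[str]) -> Optional[Tuple[str, str]]:
    """Stand-in for an LLM call: pick the next (tool, input), or None when done."""
    if not observations:
        return ("web_search", query)
    if len(observations) == 1:
        return ("summarize", observations[0])
    return None  # Enough information to answer

def run_agent(query: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        decision = mock_planner(query, observations)
        if decision is None:
            break
        tool_name, tool_input = decision
        # The executor, not the planner, touches the outside world.
        observations.append(TOOLS[tool_name](tool_input))
    return observations[-1] if observations else "No answer"

print(run_agent("agent architectures"))
```

Each tool result is appended to `observations` and passed back to the planner, which is the feedback step the diagram shows; `max_steps` bounds the number of round-trips.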

### Decision Framework: Which Architecture?

| Scenario | Best Choice | Why |
|----------|------------|-----|
| Stock trading bot (10ms latency) | Reactive | Speed critical |
| Travel planner ("book flight + hotel") | Deliberative | Multi-step, constraints |
| Customer support chatbot | Hybrid/LLM-based | Needs reactions + reasoning |
| Robot navigating obstacles | Reactive layer + Planner | Safety + goal-seeking |
| Code review assistant | LLM planner + tools | Language understanding + tool calls |
| Game AI | Hybrid (planner for strategy, reactive for combat) | Balanced |

### Architectural Considerations
- **Error handling**: How to recover if a tool fails? Retry? Escalate? Replan?
- **Authorization**: Who can call which tools? Fine-grained access control?
- **Observability**: Log every decision for debugging and audit trails
- **Memory management**: What context to pass to the planner? Token limits?
- **Feedback loops**: How does the agent learn from mistakes?
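
The authorization and observability questions can be combined in one small sketch: a gate that checks a per-role allowlist and records every attempt. `ToolGate`, the role name, and the tool names are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Set

class ToolGate:
    """Hypothetical gate that checks permissions and logs every tool call."""

    def __init__(self, permissions: Dict[str, Set[str]]):
        self.permissions = permissions  # role -> allowed tool names
        self.tools: Dict[str, Callable[[str], str]] = {}
        self.audit_log: List[dict] = []

    def register(self, name: str, func: Callable[[str], str]) -> None:
        self.tools[name] = func

    def call(self, role: str, tool: str, arg: str) -> str:
        allowed = tool in self.permissions.get(role, set())
        # Log before acting so denied attempts are also visible in the trail.
        self.audit_log.append({"role": role, "tool": tool, "arg": arg, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"role '{role}' may not call '{tool}'")
        return self.tools[tool](arg)

gate = ToolGate({"support_bot": {"faq_search"}})
gate.register("faq_search", lambda q: f"FAQ hit for {q}")
gate.register("delete_user", lambda uid: f"deleted {uid}")

print(gate.call("support_bot", "faq_search", "refunds"))
try:
    gate.call("support_bot", "delete_user", "42")
except PermissionError as err:
    print(f"blocked: {err}")
print(gate.audit_log)
```

Roles without an entry get an empty allowlist, so the default is to deny.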

### Common Pitfalls
❌ **Too much context** → LLM gets confused, slower  
❌ **Too little context** → Agent misses important info  
❌ **Unsafe tool permissions** → Agent calls dangerous tools  
❌ **No error handling** → Agent stuck in infinite loop  
❌ **Poor observability** → Can't debug why agent failed  

### Implementation Pattern: Modular Stack
```
Perception Layer    ← User inputs, sensor data, API responses
    ↓
Memory/Context      ← Short-term & long-term memory
    ↓
Reasoning Engine    ← Planner, decision logic
    ↓
Tool/Action Layer   ← Executors, API clients
    ↓
Output/Feedback     ← User responses, logging, learning
```

Each layer is replaceable: swap planners, tool sets, memory backends, etc.
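
One way to make a layer replaceable is to code the agent against an interface rather than a concrete planner. The sketch below assumes Python 3.8+ for `typing.Protocol`; `Planner`, `KeywordPlanner`, `AlwaysSearchPlanner`, and `Agent` are hypothetical names.

```python
from typing import List, Protocol

class Planner(Protocol):
    """Interface for the reasoning layer: choose a tool name for an input."""
    def next_action(self, context: List[str], user_input: str) -> str: ...

class KeywordPlanner:
    def next_action(self, context: List[str], user_input: str) -> str:
        # Route anything containing a digit to the calculator.
        if any(ch.isdigit() for ch in user_input):
            return "calculator"
        return "knowledge_base"

class AlwaysSearchPlanner:
    def next_action(self, context: List[str], user_input: str) -> str:
        return "knowledge_base"

class Agent:
    def __init__(self, planner: Planner):
        self.planner = planner
        self.context: List[str] = []  # Minimal memory layer

    def handle(self, user_input: str) -> str:
        action = self.planner.next_action(self.context, user_input)
        self.context.append(user_input)
        return action

# Swapping the planner changes behavior without touching Agent.
print(Agent(KeywordPlanner()).handle("2 + 2"))
print(Agent(AlwaysSearchPlanner()).handle("2 + 2"))
```

The same pattern would apply to the memory and tool layers.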

In [None]:
# Extended Agent: Demonstrating Different Architectures

# ============================================================================
# 1. REACTIVE AGENT (Condition-Action Rules)
# ============================================================================

class ReactiveAgent:
    """Ultra-fast agent: percept → action via rules. No planning."""
    
    def __init__(self):
        self.rules = []
    
    def add_rule(self, condition, action):
        """condition: lambda percept -> bool, action: lambda percept -> str"""
        self.rules.append((condition, action))
    
    def act(self, percept: str) -> str:
        """Check rules in order, execute first match."""
        for condition, action in self.rules:
            if condition(percept):
                return action(percept)
        return "No rule matched"

# Setup reactive agent
reactive = ReactiveAgent()
reactive.add_rule(
    lambda p: 'temperature' in p.lower(),
    lambda p: "🌡️  I'll monitor temperature"
)
reactive.add_rule(
    lambda p: 'danger' in p.lower(),
    lambda p: "⚠️  ALERT: Safety protocol activated!"
)

print("REACTIVE AGENT (rules-based, ~1ms latency):")
print(f"  Input: 'Temperature is rising'  → {reactive.act('Temperature is rising')}")
print(f"  Input: 'Danger detected'        → {reactive.act('Danger detected')}\n")

# ============================================================================
# 2. DELIBERATIVE AGENT (Planning with Search)
# ============================================================================

class DeliberativeAgent:
    """Planner agent: builds a plan via backward chaining from the goal."""
    
    def __init__(self):
        self.domain = {}  # {action_name: (preconditions, effects)}
        self.state = set()  # Facts that currently hold
    
    def add_action(self, name, preconditions, effects):
        """Register a STRIPS-like action with precondition and effect lists."""
        self.domain[name] = (list(preconditions), list(effects))
    
    def plan(self, goal, state, depth=0, max_depth=5):
        """Simple backward-chaining planner; returns a list of actions or None."""
        if depth > max_depth:
            return None  # Depth limit
        
        if goal in state:
            return []  # Goal already holds
        
        # Try each action whose effects include the goal
        for action_name, (preconds, effects) in self.domain.items():
            if goal not in effects:
                continue
            steps = []
            for precond in preconds:
                # Every precondition becomes a sub-goal
                sub_plan = self.plan(precond, state, depth + 1, max_depth)
                if sub_plan is None:
                    break  # This action is unreachable; try the next one
                steps += [s for s in sub_plan if s not in steps]
            else:
                return steps + [action_name]
        
        return None  # No plan found
    
    def act_deliberatively(self, goal: str) -> str:
        """Plan for the goal and describe the resulting action sequence."""
        steps = self.plan(goal, self.state)
        if steps is None:
            return f"No plan found for '{goal}'"
        numbered = [f"step{i}: {s}" for i, s in enumerate(steps, 1)]
        return f"📋 Plan to achieve '{goal}':\n  " + "\n  ".join(numbered)

deliberative = DeliberativeAgent()
# Toy domain: each action's effect satisfies the next action's precondition
deliberative.add_action("search", [], ["raw data"])
deliberative.add_action("analyze", ["raw data"], ["analysis"])
deliberative.add_action("report", ["analysis"], ["gather market data"])
print("DELIBERATIVE AGENT (planning-based, ~100ms latency):")
print(deliberative.act_deliberatively("gather market data"))
print()

# ============================================================================
# 3. HYBRID AGENT (Reactive + Planner)
# ============================================================================

class HybridAgent:
    """Combines fast reactions with deliberative planning."""
    
    def __init__(self):
        self.reactive_rules = []
        self.planner_goals = {}
    
    def add_reactive_rule(self, condition, action):
        self.reactive_rules.append((condition, action))
    
    def add_goal(self, goal_name, plan_steps):
        self.planner_goals[goal_name] = plan_steps
    
    def act(self, percept: str) -> str:
        # REACTIVE: Check for immediate threats/opportunities
        for condition, action in self.reactive_rules:
            if condition(percept):
                return f"🚨 REACTIVE: {action(percept)}"
        
        # DELIBERATIVE: Use planner for non-urgent goals
        for goal_name, steps in self.planner_goals.items():
            if goal_name in percept.lower():
                return f"🎯 PLANNING: {goal_name}\n  Steps: " + " → ".join(steps)
        
        return "⚠️ No reactive rule or planned goal matched"

hybrid = HybridAgent()
hybrid.add_reactive_rule(
    lambda p: 'emergency' in p.lower(),
    lambda p: "Activating emergency protocol!"
)
hybrid.add_goal('book flight', ['search flights', 'check price', 'confirm booking'])

print("HYBRID AGENT (reactive + planner, ~50ms latency):")
print(f"  Input: 'emergency situation'  → {hybrid.act('emergency situation')}")
print(f"  Input: 'book flight to NYC'  → {hybrid.act('book flight to NYC')}\n")

# ============================================================================
# 4. LLM-BASED AGENT (Planner + Executor Split)
# ============================================================================

class LLMBasedAgent:
    """
    Mimics modern LLM-based agents: LLM as planner,
    separate executor for tool calls.
    """
    
    def __init__(self):
        self.tools = {}
        self.conversation_history = []
    
    def register_tool(self, name: str, func):
        self.tools[name] = func
    
    def llm_planner_mock(self, query: str) -> Dict[str, Any]:
        """Simulates LLM deciding what tools to call.
        In reality, this would be an API call to GPT-4, Claude, etc.
        """
        # Heuristic: emulate what an LLM might decide
        if 'weather' in query.lower():
            return {'thought': 'Need to get weather data', 'tool': 'get_weather', 'input': 'current'}
        if 'calculate' in query.lower():
            return {'thought': 'This needs arithmetic', 'tool': 'calculator', 'input': query}
        return {'thought': 'Direct answer', 'tool': None, 'input': None}
    
    def execute_tool(self, tool_name: str, tool_input: str) -> str:
        """Execute the tool chosen by planner."""
        if tool_name not in self.tools:
            return f"❌ Tool '{tool_name}' not found"
        return self.tools[tool_name](tool_input)
    
    def interact(self, user_query: str) -> str:
        """
        Full LLM agent loop:
        1. User query
        2. LLM planner decides tool
        3. Executor runs tool
        4. LLM generates answer
        """
        self.conversation_history.append(('user', user_query))
        
        # Step 1: Plan
        plan = self.llm_planner_mock(user_query)
        
        # Step 2: Execute tool if needed
        if plan['tool']:
            tool_result = self.execute_tool(plan['tool'], plan['input'])
        else:
            tool_result = "No tool needed"
        
        # Step 3: Generate answer
        answer = f"💭 Thought: {plan['thought']}\n🔧 Action: {plan['tool'] or 'direct answer'}\n📝 Result: {tool_result}"
        
        self.conversation_history.append(('assistant', answer))
        return answer

llm_agent = LLMBasedAgent()
llm_agent.register_tool('get_weather', lambda q: "🌤️  Sunny, 72°F")
# Reuse calculator_tool from the Topic 1 cell: raw eval() fails on the capitalized "Calculate ..."
llm_agent.register_tool('calculator', lambda q: f"Result: {calculator_tool(q)}")

print("LLM-BASED AGENT (planner + executor, ~500ms with LLM latency):")
print(llm_agent.interact("Calculate 25 * 4"))
print()
print(llm_agent.interact("What is the weather?"))