# 107 LangGraph: Looping Workflows - Retry and Iteration

**Workshop**: LangGraph 101  
**Duration**: ~25 minutes  
**Difficulty**: Intermediate

## Learning Objectives

By completing this notebook, you will:
- Understand how to create **self-referencing edges** that loop back to previous nodes
- Implement **retry logic** with counter-based termination to prevent infinite loops
- Build **iteration patterns** for pagination, batching, and data processing
- Design **termination conditions** that safely exit loops
- Apply looping workflows to **SCM list pagination** for handling large datasets
- Understand when to use **loops vs sequential chains** in workflow design
- Recognize and prevent **infinite loop pitfalls** with best practices

## Prerequisites

- **Knowledge**: Completed Notebooks 103, 104, 105, and 106
- **Concepts**: Understanding of TypedDict, StateGraph, conditional edges, and router functions  
- **Setup**: None required for this notebook

## Table of Contents

1. [Introduction](#1-introduction)
2. [Your LangGraph Journey So Far](#2-your-langgraph-journey-so-far)
3. [Network Admin Analogies for Loops](#3-network-admin-analogies-for-loops)
4. [Understanding Looping Workflows](#4-understanding-looping-workflows)
5. [Infinite Loop Prevention](#5-infinite-loop-prevention---critical)
6. [Building a Retry Loop](#6-building-a-retry-loop)
7. [Exercises: Practice Looping Patterns](#7-exercises-practice-looping-patterns)
8. [What's Next](#8-whats-next)
9. [Summary](#9-summary)

---

## 1. Introduction

Welcome to Notebook 107! You've mastered sequential workflows and conditional routing. Now learn the most powerful pattern: **looping workflows**.

### What are Looping Workflows?

Looping workflows let graphs cycle back to previous nodes, enabling:
- **Retry logic**: Try operation up to N times before failing
- **Pagination**: Fetch data in chunks until complete
- **Polling**: Check status repeatedly until condition met
- **Batch processing**: Process items one at a time
- **Iterative refinement**: Improve results through multiple passes

Real-world SCM examples:
- Paginate through 1000+ address objects (200 per API call)
- Poll HA sync status every 5 seconds until synced
- Retry failed API calls up to 3 times with backoff

⚠️ **WARNING**: Loops can create infinite cycles! Every loop MUST have:
1. Counter tracking iterations
2. Maximum iteration limit
3. Clear termination condition
4. Safety validation
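
The four requirements above can be sketched in plain Python before any graph is involved. This is a minimal illustration using the HA sync polling scenario; `poll_until` and the scripted `statuses` feed are hypothetical names, and no real SCM call or sleep is performed.

```python
from typing import Callable

def poll_until(check: Callable[[], bool], max_iterations: int) -> dict:
    """Call check() until it returns True or max_iterations is reached."""
    # 4. Safety validation: reject limits that are missing or unreasonably large
    if not 0 < max_iterations <= 100:
        raise ValueError("max_iterations must be between 1 and 100")

    count = 0                      # 1. Counter tracking iterations
    done = False
    while count < max_iterations:  # 2. Maximum iteration limit
        count += 1
        done = check()
        if done:                   # 3. Clear termination condition
            break
    return {"iterations": count, "done": done}

# Hypothetical status feed: the third poll reports "synced"
statuses = iter(["pending", "pending", "synced"])
poll_result = poll_until(lambda: next(statuses) == "synced", max_iterations=5)
print(poll_result)  # {'iterations': 3, 'done': True}
```

The rest of the notebook expresses the same four pieces as state fields, nodes, and a router.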

Let's build safe, production-ready loops!

---

## 2. Your LangGraph Journey So Far

Quick recap of how notebooks 103-106 prepared you for looping:

**Notebook 103: State & Reducers**
- State is workflow memory (TypedDict)
- Reducers merge state updates  
- **For Loops**: State tracks counters and termination flags

**Notebook 104: Sequential Graphs**
- Linear workflows: A → B → C → END
- Edges connect nodes
- **For Loops**: Edges can point backwards!

**Notebook 105: Graph Visualization**
- Mermaid diagrams show workflow
- **For Loops**: Visualization reveals cycles

**Notebook 106: Conditional Routing**
- Branch based on state conditions
- Router functions with `Literal` return types  
- **For Loops**: Conditional edges create loops!

**What Loops Add**:
- **Linear** (104): A → B → C
- **Branching** (106): A → (B OR C)
- **Looping** (107): A → B → [check] → (back to B OR forward)

Loops combine all previous concepts!

---

## 3. Network Admin Analogies for Loops

You already know these loop patterns from networking!

### TCP Retransmission = Retry Logic

TCP retransmits on timeout:
```
Send SYN → Timeout? → Retry count < max? → YES: Resend SYN (loop)
                                         → NO: Connection failed
```

LangGraph retry loop:
```python
def should_retry(state) -> Literal["retry", "success", "failed"]:
    if state["success"]:
        return "success"
    elif state["retry_count"] < 3:
        return "retry"  # Loop back!
    return "failed"
```
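
To see how this router behaves, it helps to call it directly with a few hand-built states. The sketch below repeats the function so it runs on its own; `RetryState` is a hypothetical name for a state with just the two fields the router reads.

```python
from typing import Literal, TypedDict

class RetryState(TypedDict):
    success: bool
    retry_count: int

def should_retry(state: RetryState) -> Literal["retry", "success", "failed"]:
    if state["success"]:
        return "success"
    elif state["retry_count"] < 3:
        return "retry"  # Loop back!
    return "failed"

decisions = [
    should_retry({"success": False, "retry_count": 1}),  # attempts remain
    should_retry({"success": False, "retry_count": 3}),  # limit reached
    should_retry({"success": True, "retry_count": 2}),   # connected
]
print(decisions)  # ['retry', 'failed', 'success']
```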

### BGP Route Advertisement = Pagination

BGP sends 100K routes in 1K chunks:
```
Send chunk 1 → chunk 2 → ... → chunk 100 → Done
    ↑______________|  (loop until all sent)
```

SCM API pagination:
```python
def should_paginate(state) -> Literal["fetch_more", "done"]:
    if state["offset"] < state["total"]:
        return "fetch_more"  # Loop for next page!
    return "done"
```

### IP TTL = Loop Counter

TTL prevents routing loops:
```
TTL=64 → Router decrements → TTL=63 → ... → TTL=0 → Drop packet
```

Loop counter prevents infinite loops:
```python
if state["count"] >= MAX_ITERATIONS:
    return "terminate"  # Safety exit!
```

| Network Concept | LangGraph Loop | Safety Mechanism |
|-----------------|----------------|------------------|
| TCP Retransmit | Retry loop | max_retries |
| BGP Chunks | Pagination | total_count |
| IP TTL | Iteration counter | MAX_ITERATIONS |

---

## 4. Understanding Looping Workflows

### Anatomy of a Loop

Every loop has these components:

```
START → initialize (set counter=0)
          ↓
        process (do work, increment counter)
          ↓
        check (decision point)
          ↓
    counter < max? → YES → process (LOOP BACK!)
          ↓
         NO
          ↓
        END
```

### Four Essential Elements

**1. Loop Counter in State**
```python
class LoopState(TypedDict):
    retry_count: int     # Tracks iterations
    max_retries: int     # Maximum allowed
```

**2. Processing Node** (increments counter)
```python
def process(state: LoopState) -> dict:
    return {"retry_count": state["retry_count"] + 1}
```

**3. Router Function** (loop decision)
```python
def should_continue(state) -> Literal["loop", "exit"]:
    if state["retry_count"] < state["max_retries"]:
        return "loop"  # Continue
    return "exit"      # Stop
```

**4. Conditional Edge** (creates loop)
```python
graph.add_conditional_edges(
    source="check",
    path=should_continue,
    path_map={
        "loop": "process",  # ← Points backwards!
        "exit": END
    }
)
```

### When to Use Loops

| Scenario | Pattern |
|----------|---------|
| Fixed sequence | Sequential (no loop) |
| One decision | Conditional routing (no loop) |
| Retry with limit | Loop (hybrid) |
| Unknown iterations | Loop (condition) |
| Batch processing | Loop (count) |

---

## 5. Infinite Loop Prevention - CRITICAL!

⚠️ **WARNING**: Infinite loops are the #1 pitfall! Here's how to prevent them.

### The Five Safety Rules

**Rule 1: Always Use a Counter**
```python
class SafeState(TypedDict):
    iteration_count: int     # REQUIRED!
    max_iterations: int      # REQUIRED!
```

**Rule 2: Always Increment Counter**
```python
def process(state) -> dict:
    return {"iteration_count": state["iteration_count"] + 1}  # MUST DO!
```

**Rule 3: Always Check Maximum FIRST**
```python
def router(state) -> Literal["retry", "exit", "max_reached"]:
    # Check max FIRST!
    if state["iteration_count"] >= state["max_iterations"]:
        return "max_reached"  # Safety exit!
    # Then check success
    if state["success"]:
        return "exit"
    return "retry"
```
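
One consequence of checking the maximum first is worth seeing concretely: a success that arrives on the final allowed iteration is still routed to `max_reached`. The sketch below repeats the router and adds a hypothetical `classify_exit` helper that reads the `success` flag after the loop, which is one way to report the real outcome.

```python
from typing import Literal

def router(state: dict) -> Literal["retry", "exit", "max_reached"]:
    if state["iteration_count"] >= state["max_iterations"]:
        return "max_reached"
    if state["success"]:
        return "exit"
    return "retry"

def classify_exit(state: dict) -> str:
    """Hypothetical post-loop helper: report the outcome from the success flag."""
    return "succeeded" if state["success"] else "gave up"

# Success on the third of three allowed iterations
last_try_success = {"iteration_count": 3, "max_iterations": 3, "success": True}
print(router(last_try_success))         # max_reached
print(classify_exit(last_try_success))  # succeeded
```

The loop still terminates either way; the point is that code after the loop should inspect the success flag rather than infer failure from the exit path alone.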

**Rule 4: Always Have Multiple Exits**
```python
path_map={
    "retry": "process",        # Loop
    "exit": END,               # Success exit
    "max_reached": "error"     # Safety exit
}
```

**Rule 5: Validate Before Running**
```python
assert "iteration_count" in state, "Missing counter!"
assert 0 < state["max_iterations"] < 100, "max_iterations must be between 1 and 99"
```
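
Since `assert` statements are skipped when Python runs with `-O`, a small validation function that raises is a sturdier variant of Rule 5. This is a sketch only; `validate_loop_state` and its `max_allowed` default of 100 are assumptions for illustration, not part of LangGraph.

```python
def validate_loop_state(state: dict, max_allowed: int = 100) -> None:
    """Raise ValueError if the state cannot drive a bounded loop."""
    for field in ("iteration_count", "max_iterations"):
        if field not in state:
            raise ValueError(f"Missing required field: {field}")
    if state["iteration_count"] != 0:
        raise ValueError("iteration_count must start at 0")
    if not 0 < state["max_iterations"] <= max_allowed:
        raise ValueError(f"max_iterations must be between 1 and {max_allowed}")

validate_loop_state({"iteration_count": 0, "max_iterations": 3})  # passes silently

try:
    validate_loop_state({"iteration_count": 0, "max_iterations": 5000})
except ValueError as exc:
    validation_error = str(exc)
    print(validation_error)  # max_iterations must be between 1 and 100
```

Calling a check like this on the initial state, just before invoking the graph, catches a bad limit before the first iteration runs.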

### Safety Checklist

Before deploying any loop:
- [ ] Counter field exists (`iteration_count`)
- [ ] Maximum field exists (`max_iterations`)
- [ ] Counter initializes to 0
- [ ] Counter increments each iteration
- [ ] Router checks max FIRST
- [ ] Multiple exit paths exist
- [ ] Maximum is reasonable (< 100)
- [ ] Exit conditions are achievable

Remember: **Every loop is potentially infinite until proven safe!**

---

## 6. Building a Retry Loop

Now let's build a complete retry loop workflow! We'll create an SCM API connection retry system that tries up to 3 times.

### The Pattern

```
START
  ↓
initialize (retry_count = 0)
  ↓
attempt_connection (increment count, try API call)
  ↓
check_result (router: success? retry? max_reached?)
  ↓
  ├─→ success → END
  ├─→ max_reached → END (safety exit)
  └─→ retry → attempt_connection (LOOP!)
```

Let's build it step by step!

### 6.1 Define State for Looping Workflow

First, define state with ALL required fields for a safe loop:
- Counter field (`retry_count`)
- Maximum field (`max_retries`)
- Success flag (`connected`)
- Result tracking (`result`)

In [None]:
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END
from IPython.display import Image, display
import random
import time

class ConnectionRetryState(TypedDict):
    """State for API connection retry workflow."""
    retry_count: int          # Current attempt number (starts at 0)
    max_retries: int          # Maximum attempts allowed
    connected: bool           # Did connection succeed?
    result: str               # Description of what happened
    error_message: str        # Error details if failed
    backoff_seconds: float    # Exponential backoff delay

print("✅ ConnectionRetryState defined!")
print("\nSafety features:")
print("  - retry_count: Tracks iterations")
print("  - max_retries: Hard limit for termination")
print("  - connected: Success condition for exit")
print("  - backoff_seconds: Exponential delay between retries")

### 6.2 Create Initialization Node

Initialize the workflow and set counter to 0:

In [None]:
def initialize_retry(state: ConnectionRetryState) -> dict:
    """Node: Initialize retry workflow.
    
    Sets retry_count to 0 to start the loop.
    """
    print("🚀 Initializing retry workflow...")
    print(f"   Max retries allowed: {state['max_retries']}")
    
    return {
        "retry_count": 0,
        "connected": False,
        "result": "Initialized",
        "error_message": "",
        "backoff_seconds": 0.0
    }

print("✅ initialize_retry node defined")

### 6.3 Create Connection Attempt Node

This node:
1. Increments the counter (CRITICAL!)
2. Simulates API connection attempt  
3. Updates success flag

In [None]:
def attempt_connection(state: ConnectionRetryState) -> dict:
    """Node: Attempt to connect to SCM API with exponential backoff.
    
    Increments retry_count and simulates connection attempt.
    Implements exponential backoff: 1s, 2s, 4s, 8s...
    70% success rate for demo purposes.
    """
    # ⚠️ CRITICAL: Increment counter!
    new_count = state["retry_count"] + 1

    # Calculate exponential backoff delay (no wait before the first attempt)
    if new_count > 1:
        backoff = 2 ** (new_count - 2)  # 1s, 2s, 4s, 8s... before attempts 2, 3, 4, 5
        print(f"\n⏱️  Waiting {backoff}s before retry (exponential backoff)...")
        time.sleep(backoff)
    else:
        backoff = 0

    print(f"\n🔄 Attempt {new_count} of {state['max_retries']}...")
    
    # Simulate connection attempt (70% success rate)
    # In production, this would be a real SCM API call with try/except
    success = random.random() < 0.7
    
    if success:
        print("   ✅ Connection successful!")
        return {
            "retry_count": new_count,
            "connected": True,
            "result": f"Connected successfully on attempt {new_count}",
            "backoff_seconds": backoff
        }
    else:
        print("   ❌ Connection failed")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"Connection timeout on attempt {new_count}",
            "backoff_seconds": backoff
        }

print("✅ attempt_connection node defined")
print("\n⚠️ Notice: Counter MUST increment each time!")
print("💡 Exponential backoff: 1s, 2s, 4s, 8s between retries")
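
The backoff rule in `attempt_connection` is easy to check by hand. The sketch below isolates it in a hypothetical `backoff_delay` helper and lists the wait before each of the first five attempts.

```python
def backoff_delay(attempt: int) -> int:
    """Seconds to wait before the given attempt (1-based), per the rule above."""
    return 2 ** (attempt - 2) if attempt > 1 else 0

delays = [backoff_delay(n) for n in range(1, 6)]
print(delays)           # [0, 1, 2, 4, 8]
print(sum(delays))      # 15
print(sum(delays[:3]))  # 3
```

With `max_retries=3`, the worst case therefore spends 3 seconds waiting in total, which keeps the demo cells quick to run.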

### 6.4 Create Loop Decision Function

This is THE KEY! This router decides whether to:
- Exit (success)
- Loop back (retry)
- Exit (max attempts reached)

⚠️ **CRITICAL**: Check maximum FIRST!

In [None]:
def should_retry(state: ConnectionRetryState) -> Literal["success", "retry", "max_reached"]:
    """Router: Decide whether to retry connection attempt.
    
    Returns:
        "success" - Connected! Exit loop.
        "max_reached" - Hit maximum retries. Exit loop.
        "retry" - Try again. Loop back!
    """
    # ⚠️ CRITICAL: Check max FIRST to prevent infinite loop!
    if state["retry_count"] >= state["max_retries"]:
        print(f"\n🛑 Maximum retries ({state['max_retries']}) reached!")
        return "max_reached"

    # Then check success condition
    if state["connected"]:
        print("\n✅ Success! Exiting loop.")
        return "success"

    # Otherwise, retry
    print(f"   → Will retry ({state['retry_count']} of {state['max_retries']} attempts used)...")
    return "retry"

print("✅ should_retry router function defined")
print("\n⚠️ Notice: max_retries check comes FIRST!")
print("   This guarantees loop termination!")

### 6.5 Build the Looping Graph

Now assemble all pieces into a looping workflow:

In [None]:
# Step 1: Create the graph
retry_graph = StateGraph(ConnectionRetryState)

# Step 2: Add all nodes
retry_graph.add_node("initialize", initialize_retry)
retry_graph.add_node("attempt", attempt_connection)
retry_graph.add_node("check", lambda state: state)  # Router with passthrough

# Step 3: Set entry point
retry_graph.set_entry_point("initialize")

# Step 4: Add sequential edges
retry_graph.add_edge("initialize", "attempt")
retry_graph.add_edge("attempt", "check")

# Step 5: ⭐ Add conditional edge that creates the LOOP!
retry_graph.add_conditional_edges(
    source="check",
    path=should_retry,
    path_map={
        "success": END,              # Exit on success
        "max_reached": END,          # Exit on max retries (safety!)
        "retry": "attempt"           # LOOP BACK to attempt! ←← THE LOOP!
    }
)

# Step 6: Compile
retry_app = retry_graph.compile()

print("✅ Looping retry graph built and compiled!")
print("\n💡 The magic: path_map['retry'] = 'attempt' creates the loop!")
print("   Flow: attempt → check → [decision] → back to attempt OR END")

### 6.6 Visualize the Looping Graph

Let's see what a loop looks like visually:

In [None]:
# Visualize the looping graph
display(Image(retry_app.get_graph().draw_mermaid_png()))

print("\n💡 Notice in the diagram:")
print("   - The edge from 'check' that points BACK to 'attempt'")
print("   - This creates a cycle in the graph")
print("   - The loop continues until success OR max_reached")

### 6.7 Test the Looping Workflow - Successful Connection

Let's test! We'll set max=3 and see it retry until success:

In [None]:
# Set random seed for reproducible demo (70% success rate means usually succeeds)
random.seed(42)

print("="*70)
print("TEST 1: Retry Loop with Success")
print("="*70)

result = retry_app.invoke({
    "retry_count": 0,
    "max_retries": 3,
    "connected": False,
    "result": "",
    "error_message": "",
    "backoff_seconds": 0.0
})

print("\n" + "="*70)
print("FINAL RESULT")
print("="*70)
print(f"Connected: {result['connected']}")
print(f"Total attempts: {result['retry_count']}")
print(f"Result: {result['result']}")
print(f"Last backoff delay: {result['backoff_seconds']}s")

### 6.8 Test the Looping Workflow - Max Attempts Reached

Now test what happens when we DON'T succeed (safety exit):

In [None]:
# Set seed that causes failures
random.seed(99)

print("="*70)
print("TEST 2: Max Retries Reached (Safety Exit)")
print("="*70)

result = retry_app.invoke({
    "retry_count": 0,
    "max_retries": 3,
    "connected": False,
    "result": "",
    "error_message": "",
    "backoff_seconds": 0.0
})

print("\n" + "="*70)
print("FINAL RESULT")
print("="*70)
print(f"Connected: {result['connected']}")
print(f"Total attempts: {result['retry_count']}")
print(f"Error: {result['error_message']}")
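print(f"Last backoff delay: {result['backoff_seconds']}s")
if result["connected"]:
    print("\n✅ Connection succeeded on this run; try another seed to see the safety exit.")
else:
    print("\n⚠️ Loop exited safely via max_reached path!")
    print("   This prevents infinite loops!")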
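
Random seeds make the two tests repeatable, but which seed produces which outcome is not obvious from the code. A scripted sequence of outcomes is one alternative. The sketch below is a plain-Python stand-in for the compiled graph, not LangGraph itself; `make_attempt` and `run` are hypothetical helpers, and the router is the same max-first logic as `should_retry` without the print statements.

```python
from typing import Iterator

def make_attempt(outcomes: Iterator[bool]):
    """Build an attempt node whose results come from a fixed script."""
    def attempt(state: dict) -> dict:
        return {"retry_count": state["retry_count"] + 1, "connected": next(outcomes)}
    return attempt

def should_retry(state: dict) -> str:
    if state["retry_count"] >= state["max_retries"]:
        return "max_reached"
    if state["connected"]:
        return "success"
    return "retry"

def run(outcomes: list, max_retries: int = 3) -> tuple:
    """Stand-in for the graph: attempt -> check -> loop back or exit."""
    attempt = make_attempt(iter(outcomes))
    state = {"retry_count": 0, "max_retries": max_retries, "connected": False}
    while True:
        state = {**state, **attempt(state)}
        route = should_retry(state)
        if route != "retry":
            return route, state

route_ok, state_ok = run([False, True])
route_fail, state_fail = run([False, False, False])
print(route_ok, state_ok["retry_count"])      # success 2
print(route_fail, state_fail["retry_count"])  # max_reached 3
```

In the notebook itself, a scripted node of this kind could presumably be registered in place of `attempt_connection` when a test needs a guaranteed failure path.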

### 6.9 Production Pattern: Real SCM CRUD Operations with Retry Logic

Now let's see what the retry logic looks like with **real SCM SDK CRUD operations** and **production error handling**!

**Key differences from simulation:**
- Use `try/except` to catch real exceptions
- Handle specific SCM error types (`InvalidObjectError`, `NameNotUniqueError`, etc.)
- Wrap actual pan-scm-sdk CRUD operations in retry logic
- Return detailed error information

**Production Pattern 1: Address Object Creation with Retry**
```python
import time

from scm.client import ScmClient
from scm.exceptions import InvalidObjectError, NameNotUniqueError

def attempt_address_creation(state: ConnectionRetryState) -> dict:
    """Node: Attempt to create SCM address object with retry logic."""
    new_count = state["retry_count"] + 1

    # Exponential backoff
    if new_count > 1:
        backoff = 2 ** (new_count - 2)
        print(f"⏱️  Waiting {backoff}s before retry...")
        time.sleep(backoff)
    else:
        backoff = 0

    print(f"🔄 Attempt {new_count} of {state['max_retries']}...")

    try:
        # Initialize SCM client
        client = ScmClient(
            client_id=state.get("client_id"),
            client_secret=state.get("client_secret"),
            tsg_id=state.get("tsg_id")
        )

        # Real CRUD operation: Create address object
        address_response = client.address.create({
            "name": state.get("address_name", "Test-Address"),
            "ip_netmask": "10.1.1.0/24",
            "folder": "Texas",
            "description": f"Created on attempt {new_count}"
        })

        # The SDK returns a Pydantic response model, so use attribute access
        print(f"   ✅ Address created successfully: {address_response.id}")
        return {
            "retry_count": new_count,
            "connected": True,
            "result": f"Created address '{address_response.name}' (ID: {address_response.id}) on attempt {new_count}",
            "backoff_seconds": backoff
        }

    except NameNotUniqueError as e:
        # Address name already exists - don't retry
        print(f"   ❌ Name conflict: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"NameNotUniqueError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except InvalidObjectError as e:
        # Invalid configuration - don't retry
        print(f"   ❌ Invalid object: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"InvalidObjectError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except ConnectionError as e:
        # Network issue - retry makes sense
        print(f"   ❌ Connection error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"ConnectionError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except TimeoutError as e:
        # Timeout - retry makes sense
        print(f"   ❌ Timeout error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"TimeoutError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except Exception as e:
        # Unexpected error
        print(f"   ❌ Unexpected error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"Error on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }
```

**Production Pattern 2: Address Object Update with Retry**
```python
def attempt_address_update(state: ConnectionRetryState) -> dict:
    """Node: Attempt to update SCM address object with retry logic."""
    new_count = state["retry_count"] + 1

    # Exponential backoff
    if new_count > 1:
        backoff = 2 ** (new_count - 2)
        print(f"⏱️  Waiting {backoff}s before retry...")
        time.sleep(backoff)
    else:
        backoff = 0

    print(f"🔄 Update attempt {new_count} of {state['max_retries']}...")

    try:
        client = ScmClient(
            client_id=state.get("client_id"),
            client_secret=state.get("client_secret"),
            tsg_id=state.get("tsg_id")
        )

        # Fetch-then-modify pattern (best practice)
        address_id = state.get("address_id")
        existing_address = client.address.get(address_id)

        # Update fields (the SDK returns a Pydantic model, so use attribute access)
        existing_address.description = f"Updated on attempt {new_count}"
        existing_address.ip_netmask = "10.1.2.0/24"

        # Submit update
        updated_address = client.address.update(existing_address)

        print(f"   ✅ Address updated successfully: {updated_address.id}")
        return {
            "retry_count": new_count,
            "connected": True,
            "result": f"Updated address '{updated_address.name}' on attempt {new_count}",
            "backoff_seconds": backoff
        }

    except ConnectionError as e:
        print(f"   ❌ Connection error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"ConnectionError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except TimeoutError as e:
        print(f"   ❌ Timeout error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"TimeoutError on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }

    except Exception as e:
        print(f"   ❌ Unexpected error: {str(e)}")
        return {
            "retry_count": new_count,
            "connected": False,
            "error_message": f"Error on attempt {new_count}: {str(e)}",
            "backoff_seconds": backoff
        }
```

**Benefits of wrapping CRUD operations in retry logic:**
- ✅ **Handles transient network failures** - ConnectionError, TimeoutError
- ✅ **Distinguishes retriable vs non-retriable errors** - Don't retry InvalidObjectError or NameNotUniqueError
- ✅ **Logs detailed error information** - Know exactly what failed and why
- ✅ **Returns actionable error messages** - State includes full error context
- ✅ **Integrates with pan-scm-sdk properly** - Uses real create(), update(), get() operations
- ✅ **Production-ready** - Same pattern used in enterprise SCM automation

**When to retry:**
- ✅ `ConnectionError` - Network issues (transient)
- ✅ `TimeoutError` - Request timeout (transient)
- ✅ HTTP 429 - Rate limiting (transient)
- ✅ HTTP 500-503 - Server errors (transient)

**When NOT to retry:**
- ❌ `InvalidObjectError` - Bad configuration (permanent)
- ❌ `NameNotUniqueError` - Duplicate name (permanent)
- ❌ `ObjectNotPresentError` - Missing object (permanent)
- ❌ HTTP 401/403 - Authentication/authorization (permanent)
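
For the router to honor this split, the state has to record which kind of failure occurred; a router that only reads `connected` cannot tell a name conflict from a timeout. The sketch below shows one way to do that. The `retryable` state field, the `classify_error` helper, and the `permanent_failure` exit path are all assumptions added for illustration, and the two exception classes are local stand-ins so the example runs without pan-scm-sdk installed.

```python
from typing import Literal

# Stand-ins for the SDK exceptions (assumption: the real ones come from scm.exceptions)
class InvalidObjectError(Exception): ...
class NameNotUniqueError(Exception): ...

PERMANENT_ERRORS = (InvalidObjectError, NameNotUniqueError)

def classify_error(exc: Exception) -> dict:
    """Return a state update recording whether a retry is worthwhile."""
    return {
        "connected": False,
        "retryable": not isinstance(exc, PERMANENT_ERRORS),  # hypothetical field
        "error_message": f"{type(exc).__name__}: {exc}",
    }

def should_retry(state: dict) -> Literal["success", "retry", "max_reached", "permanent_failure"]:
    if state["retry_count"] >= state["max_retries"]:
        return "max_reached"
    if state["connected"]:
        return "success"
    if not state.get("retryable", True):
        return "permanent_failure"  # hypothetical extra exit path
    return "retry"

base = {"retry_count": 1, "max_retries": 3}
transient = {**base, **classify_error(TimeoutError("request timed out"))}
permanent = {**base, **classify_error(NameNotUniqueError("Test-Address exists"))}
print(should_retry(transient))  # retry
print(should_retry(permanent))  # permanent_failure
```

In a graph, the extra return value would need its own entry in `path_map`, for example pointing at `END` or at an error-handling node.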

**Note:** For this workshop, we use simulations to avoid requiring API credentials. But in production SCM automation, use these patterns!

**Reference:** See `docs/examples/address_objects.py` for more SCM SDK CRUD patterns.

### 6.10 Debugging Looping Workflows

Loops can be tricky! Here are common issues and how to debug them.

#### Problem 1: Infinite Loop

**Symptom:** Notebook hangs, loops many more times than expected, or LangGraph raises `GraphRecursionError` once its recursion limit (25 steps by default) is exceeded

**Common Causes:**
1. Counter not incrementing
2. Maximum check missing or incorrect
3. Router always returns "retry" path
4. Exit condition never satisfied

**Debugging steps:**
```python
def debug_router(state: ConnectionRetryState) -> Literal["success", "retry", "max_reached"]:
    """Router with debug output."""
    # Print current state
    print(f"DEBUG: retry_count={state['retry_count']}, max={state['max_retries']}")
    print(f"DEBUG: connected={state['connected']}")
    
    # Check max FIRST
    if state["retry_count"] >= state["max_retries"]:
        print("DEBUG: Exiting via max_reached")
        return "max_reached"
    
    if state["connected"]:
        print("DEBUG: Exiting via success")
        return "success"
    
    print("DEBUG: Continuing loop (retry)")
    return "retry"
```

**Prevention:**
- ✅ Always test with `max_retries=2` first (fails fast)
- ✅ Add print statements in processing node and router
- ✅ Verify counter increments each iteration
- ✅ Check max limit BEFORE other conditions

#### Problem 2: Loop Never Starts

**Symptom:** Exits immediately without retrying

**Common Causes:**
1. Counter initialized to value >= max
2. Exit condition already true
3. Wrong conditional edge configuration

**Debugging steps:**
```python
# Check initial state
print(f"Initial state: retry_count={state['retry_count']}, max={state['max_retries']}")
assert state["retry_count"] == 0, "Counter should start at 0"
assert state["max_retries"] > 0, "Max should be positive"
```

#### Problem 3: Counter Not Incrementing

**Symptom:** Loops more than `max_retries` times

**Common Cause:** The node never increments the counter, or increments it but does not return the new value

**Fix:**
```python
def process(state):
    new_count = state["retry_count"] + 1  # MUST increment
    
    # ... do work ...
    
    return {
        "retry_count": new_count  # MUST return updated count!
    }
```

#### Problem 4: Wrong Exit Path

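**Symptom:** Exits through the wrong path (e.g., "max_reached" when it should be "success")

**Common Cause:** The operation succeeds on the final allowed attempt. Because the router checks the maximum first, `retry_count >= max_retries` matches before `connected` is examined, so check the `connected` flag in the final state as well as the exit path.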

**Debugging:**
```python
def debug_router(state):
    # Check conditions in order
    if state["retry_count"] >= state["max_retries"]:
        print(f"EXIT: max_reached (count={state['retry_count']} >= max={state['max_retries']})")
        return "max_reached"
    
    if state["connected"]:
        print(f"EXIT: success (connected={state['connected']})")
        return "success"
    
    print(f"CONTINUE: retry (count={state['retry_count']} < max={state['max_retries']}, not connected)")
    return "retry"
```

#### Debugging Checklist

Before running any loop:
- [ ] Counter field exists in state
- [ ] Counter initializes to 0
- [ ] Max field exists and is reasonable (< 100)
- [ ] Processing node increments counter
- [ ] Processing node returns incremented counter
- [ ] Router checks max FIRST
- [ ] Router has multiple exit paths
- [ ] Conditional edge maps all router return values
- [ ] Test with small max (2-3) first

#### Quick Debug Template

```python
def safe_loop_node(state):
    """Template for safe loop nodes."""
    # 1. Increment counter
    new_count = state["iteration_count"] + 1
    print(f"DEBUG: Iteration {new_count} of {state['max_iterations']}")
    
    # 2. Do work
    result = do_work()
    
    # 3. Return updated state
    return {
        "iteration_count": new_count,
        "result": result
    }

def safe_loop_router(state):
    """Template for safe loop routers."""
    # 1. Check max FIRST
    if state["iteration_count"] >= state["max_iterations"]:
        print(f"DEBUG: Max reached ({state['iteration_count']} >= {state['max_iterations']})")
        return "exit"
    
    # 2. Check success condition
    if state["success"]:
        print("DEBUG: Success!")
        return "exit"
    
    # 3. Continue loop
    print(f"DEBUG: Continue ({state['iteration_count']} < {state['max_iterations']})")
    return "continue"
```

**Pro tip:** Use `assert` statements to validate loop state before running:
```python
assert 0 <= state["retry_count"] < state["max_retries"], "Invalid counter state"
```

---

## 7. Exercises: Practice Looping Patterns

Time to practice! You'll build four different looping patterns commonly used in SCM automation.

### Real-World SCM Pagination Example

Before we start, note that the **pan-scm-sdk handles pagination automatically**! In production:

```python
from scm.client import ScmClient
from scm.config.objects import Address

# The SDK paginates automatically (200 objects per page)
client = ScmClient(client_id="...", client_secret="...", tsg_id="...")
address_service = Address(client, max_limit=200)

# This fetches ALL addresses, handling pagination internally
all_addresses = address_service.list(folder="Texas")  # Auto-paginates!
print(f"Total addresses: {len(all_addresses)}")
```

**See `docs/examples/address_objects.py` lines 130-141** for production pagination examples.

### Why Learn Manual Pagination?

Even though the SDK handles it, you should understand the pattern because:
1. **Other APIs** may not auto-paginate
2. **Custom processing** may require manual control
3. **Understanding loops** makes you better at debugging
4. **It's a fundamental pattern** in workflow automation

### The Four Exercises

| Exercise | Pattern | SCM Use Case |
|----------|---------|--------------|
| **7.1** | Pagination | Fetch all address objects (200/page) |
| **7.2** | Exponential Backoff | Retry API calls with delays |
| **7.3** | Polling | Check HA sync status every 5s |
| **7.4** | Batch Processing | Create 50 objects one at a time |

Let's start!

---

### 7.1 Exercise: SCM Address Object List Pagination

Build a pagination loop to fetch all SCM address objects.

### The Challenge

SCM API returns address objects in pages:
- Each request returns max 200 objects
- Response includes `total` count and current `offset`
- Need to loop until all objects fetched
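
Before translating this into graph nodes, it can help to see the same loop as ordinary Python. The sketch below runs against a simulated endpoint; `fake_list_addresses` and `fetch_all` are hypothetical names, and the response shape (`data`, `offset`, `total`) is an assumption modeled on the description above rather than the real SCM API.

```python
def fake_list_addresses(offset: int, limit: int, total: int = 450) -> dict:
    """Simulated paged endpoint: returns up to `limit` names starting at `offset`."""
    end = min(offset + limit, total)
    return {
        "data": [f"address_{i}" for i in range(offset, end)],
        "offset": offset,
        "total": total,
    }

def fetch_all(limit: int = 200, max_pages: int = 10) -> dict:
    addresses, offset, page_count = [], 0, 0
    while page_count < max_pages:        # safety limit
        page = fake_list_addresses(offset, limit)
        page_count += 1                  # counter increments every iteration
        addresses.extend(page["data"])
        offset += len(page["data"])
        if offset >= page["total"]:      # termination condition
            break
    return {"addresses": addresses, "page_count": page_count}

fetched = fetch_all()
print(fetched["page_count"], len(fetched["addresses"]))  # 3 450
```

Each piece maps onto the exercise: the loop body becomes `fetch_page`, and the two exit checks become the router's `max_pages_reached` and `complete` paths.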

### Requirements

**1. Define State** with fields:
```python
class PaginationState(TypedDict):
    offset: int           # Current position in list
    limit: int            # Objects per page (200)
    total_count: int      # Total objects available
    page_count: int       # Number of pages fetched
    max_pages: int        # Safety limit
    addresses: list       # Accumulated results
    complete: bool        # All objects fetched?
```

**2. Create Nodes**:
- `initialize`: Set offset=0, page_count=0
- `fetch_page`: Simulate API call, increment counters, add to list
- `check_pagination`: Router with lambda passthrough

**3. Create Router** that returns:
- `"fetch_more"`: If offset < total_count AND page_count < max_pages
- `"complete"`: If all objects fetched
- `"max_pages_reached"`: If hit safety limit

**4. Build Graph** with loop:
```
START → initialize → fetch_page → check → [decision]
                        ↑___________| (loop back!)
```

**5. Test** with:
- Total of 450 objects (should fetch 3 pages)
- Limit of 200 per page
- Max of 10 pages (safety)

**Expected**: Should loop 3 times and exit via "complete" path.
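
The expected page count follows from a ceiling division, which is a quick way to sanity-check any pagination test. A short worked example for the numbers above:

```python
import math

total_count, limit = 450, 200
pages_needed = math.ceil(total_count / limit)               # 2.25 rounds up to 3
last_page_size = total_count - (pages_needed - 1) * limit   # 450 - 400
print(pages_needed, last_page_size)  # 3 50
```

Since 3 is well under the `max_pages` limit of 10, the safety exit should not trigger in this test.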

Try it below!

In [None]:
# Your code here!
# Build the SCM address object list pagination workflow

# Step 1: Define PaginationState


# Step 2: Create initialize_pagination node


# Step 3: Create fetch_page node (simulates API call)
#         Remember to:
#         - Increment page_count
#         - Increment offset by limit
#         - Simulate adding objects to addresses list


# Step 4: Create should_fetch_more router function
#         Check max_pages FIRST!


# Step 5: Build graph


# Step 6: Add nodes


# Step 7: Set entry point and add edges


# Step 8: Add conditional edge with loop


# Step 9: Compile


# Step 10: Visualize


# Step 11: Test with 450 total objects (should loop 3 times)


# ============================================================================
# HINTS (uncomment if you need help)
# ============================================================================

# Hint 1: State Definition
# class PaginationState(TypedDict):
#     offset: int
#     limit: int
#     total_count: int
#     page_count: int
#     max_pages: int
#     addresses: list
#     complete: bool

# Hint 2: Initialize Node
# def initialize_pagination(state: PaginationState) -> dict:
#     return {
#         "offset": 0,
#         "page_count": 0,
#         "addresses": [],
#         "complete": False
#     }

# Hint 3: Fetch Page Node (increment counters!)
# def fetch_page(state: PaginationState) -> dict:
#     new_page_count = state["page_count"] + 1
#     new_offset = state["offset"] + state["limit"]
#     
#     # Simulate fetching objects
#     objects_to_fetch = min(state["limit"], state["total_count"] - state["offset"])
#     new_addresses = state["addresses"] + [f"address_{state['offset'] + i}" for i in range(objects_to_fetch)]
#     
#     return {
#         "offset": new_offset,
#         "page_count": new_page_count,
#         "addresses": new_addresses
#     }

# Hint 4: Router (check max_pages FIRST!)
# def should_fetch_more(state: PaginationState) -> Literal["fetch_more", "complete", "max_pages_reached"]:
#     if state["page_count"] >= state["max_pages"]:
#         return "max_pages_reached"
#     if state["offset"] >= state["total_count"]:
#         return "complete"
#     return "fetch_more"

# Hint 5: Build and test
# pagination_graph = StateGraph(PaginationState)
# pagination_graph.add_node("initialize", initialize_pagination)
# pagination_graph.add_node("fetch_page", fetch_page)
# pagination_graph.add_node("check", lambda state: state)
# pagination_graph.set_entry_point("initialize")
# pagination_graph.add_edge("initialize", "fetch_page")
# pagination_graph.add_edge("fetch_page", "check")
# pagination_graph.add_conditional_edges(
#     source="check",
#     path=should_fetch_more,
#     path_map={
#         "fetch_more": "fetch_page",  # Loop back!
#         "complete": END,
#         "max_pages_reached": END
#     }
# )
# pagination_app = pagination_graph.compile()
# 
# # Test
# result = pagination_app.invoke({
#     "offset": 0,
#     "limit": 200,
#     "total_count": 450,
#     "page_count": 0,
#     "max_pages": 10,
#     "addresses": [],
#     "complete": False
# })

---

### 7.2 Exercise: Retry with Exponential Backoff

Build a retry workflow with exponential backoff delays for handling transient API failures.

### The Challenge

When API calls fail temporarily (network glitches, rate limits), you should:
- Retry with increasing delays: 1s, 2s, 4s, 8s
- Give the system time to recover
- Avoid overwhelming the API with rapid retries

### Requirements

**1. Define State** with fields:
```python
class RetryBackoffState(TypedDict):
    retry_count: int      # Current attempt
    max_retries: int      # Max attempts (4)
    success: bool         # Operation succeeded?
    result: str           # Result message
    total_wait: float     # Total time waited
```

**2. Create Nodes**:
- `initialize_retry`: Set retry_count=0, total_wait=0
- `attempt_operation`: 
  - If retry_count > 0: wait `2 ** (retry_count - 1)` seconds
  - Increment retry_count
  - Simulate operation (50% success rate)
  - Track total_wait
- `check_retry`: Passthrough node (`lambda state: state`) that the router's conditional edge attaches to

**3. Create Router** that returns:
- `"success"`: If operation succeeded
- `"max_reached"`: If retry_count >= max_retries
- `"retry"`: Otherwise (loop back!)

**4. Build Graph** with loop back to attempt_operation

**5. Test** with:
- max_retries = 4
- Should see delays: 1s, 2s, 4s (total 7s if all fail)

### Expected Output

```
Attempt 1/4...
   ❌ Failed
⏱️  Waiting 1s before retry...

Attempt 2/4...
   ❌ Failed
⏱️  Waiting 2s before retry...

Attempt 3/4...
   ✅ Success!

Total attempts: 3
Total wait time: 3s
```

### Why Exponential Backoff?

**Bad (fixed 1s delay):**
```
Fail → Wait 1s → Fail → Wait 1s → Fail → Wait 1s → Fail
Total: 3s, hammers API every second
```

**Good (exponential):**
```
Fail → Wait 1s → Fail → Wait 2s → Fail → Wait 4s → Fail
Total: 7s, gives system time to recover
```
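
The two schedules above can be reproduced with a few lines of arithmetic. This is a minimal sketch; `backoff_delays` and its `base` parameter are illustrative names, not part of the exercise.

```python
def backoff_delays(max_attempts: int, base: float = 1.0) -> list[float]:
    """Waits inserted between attempts: base * 2**0, base * 2**1, ..."""
    # N attempts need only N - 1 waits: there is no wait before the first attempt.
    return [base * 2 ** i for i in range(max_attempts - 1)]


exponential = backoff_delays(max_attempts=4)
fixed = [1.0] * 3  # Fixed 1s delay between the same four attempts

print(f"Exponential: {exponential}, total {sum(exponential)}s")
print(f"Fixed:       {fixed}, total {sum(fixed)}s")
```

Four attempts produce three waits, so `max_retries = 4` gives delays of 1s, 2s, and 4s for a 7s total when every attempt fails.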

### Real SCM Use Cases

- **Rate Limiting**: SCM returns 429 → retry with backoff
- **Commit Jobs**: Job submission fails → retry with delays
- **Network Issues**: Transient connection errors → backoff and retry

Try it below!

In [None]:
# Your code here for Exercise 7.2!
# Build exponential backoff retry workflow

# Step 1: Define RetryBackoffState


# Step 2: Create nodes (initialize, attempt_operation, check)


# Step 3: Create router function


# Step 4: Build graph with conditional loop


# Step 5: Test with max_retries=4


# ============================================================================
# HINTS (uncomment if you need help)
# ============================================================================

# Hint 1: The delay formula
# delay = 2 ** (retry_count - 1)  # 1, 2, 4, 8, 16...

# Hint 2: Attempt node structure (requires `import random` and `import time`)
# def attempt_operation(state):
#     new_count = state["retry_count"] + 1
#     
#     # Calculate and apply backoff
#     if new_count > 1:
#         delay = 2 ** (new_count - 2)
#         print(f"⏱️  Waiting {delay}s...")
#         time.sleep(delay)
#         new_wait = state["total_wait"] + delay
#     else:
#         new_wait = state["total_wait"]  # No wait before the first attempt
#     
#     # Simulate operation (50% success)
#     success = random.random() < 0.5
#     
#     return {
#         "retry_count": new_count,
#         "success": success,
#         "total_wait": new_wait
#     }

---

### 7.3 Exercise: HA Sync Status Polling

Build a polling loop to check HA (High Availability) sync status until firewalls are synchronized.

### The Challenge

After pushing configuration changes to an HA pair, you need to:
- Poll sync status every 5 seconds
- Continue until both firewalls report "synchronized"
- Timeout after 60 seconds (12 polls) if not synced

### Requirements

**1. Define State** with fields:
```python
class HAPollingState(TypedDict):
    poll_count: int       # Current poll number
    max_polls: int        # Max polls (12 = 60s)
    ha_status: str        # Status: "out-of-sync" | "synchronizing" | "synchronized"
    synced: bool          # Are firewalls synced?
    elapsed_time: float   # Total time elapsed (seconds)
```

**2. Create Nodes**:
- `initialize_polling`: Set poll_count=0, elapsed_time=0
- `check_ha_status`:
  - Wait 5 seconds (except first poll)
  - Increment poll_count
  - Simulate checking HA status:
    - 20% chance "synchronized" 
    - 30% chance "synchronizing"
    - 50% chance "out-of-sync"
  - Update elapsed_time
- `check_result`: Passthrough node (`lambda state: state`) that the router's conditional edge attaches to

**3. Create Router** that returns:
- `"synced"`: If ha_status == "synchronized"
- `"timeout"`: If poll_count >= max_polls
- `"poll_again"`: Otherwise (loop back!)

**4. Build Graph** with loop back to check_ha_status

**5. Test** with:
- max_polls = 12 (60 second timeout)
- 5 second delay between polls

### Expected Output

```
Poll 1/12: Checking HA status...
   Status: out-of-sync

⏱️  Waiting 5s before next poll...

Poll 2/12: Checking HA status...
   Status: synchronizing

⏱️  Waiting 5s before next poll...

Poll 3/12: Checking HA status...
   Status: synchronized
   ✅ HA pair synchronized!

Total polls: 3
Total time: 10s
```

### Why Polling Loops?

Many SCM operations are **asynchronous**:
- Configuration commits
- HA synchronization  
- Job completion
- Device status updates

You must **poll and wait** for completion.

### Real SCM Use Cases

- **Commit Jobs**: Poll job status until "completed"
- **HA Sync**: Poll HA status after config push
- **Device Connection**: Poll device until "connected"
- **License Activation**: Poll license status until "active"

### Production Pattern: SCM Commit Job Polling

Here's how you would poll an SCM commit job in production:

```python
from scm.client import ScmClient
from langgraph.graph import StateGraph, END  # Used by the graph wiring shown below
from typing import TypedDict, Literal
import time

class JobPollingState(TypedDict):
    job_id: str           # Job ID returned from commit
    poll_count: int       # Current poll number
    max_polls: int        # Max polls before timeout
    job_status: str       # Raw SCM status: "PEND" | "ACT" | "FIN" | other (treated as failed)
    completed: bool       # Job finished successfully?
    elapsed_time: float   # Total time elapsed

def poll_commit_job(state: JobPollingState) -> dict:
    """Node: Poll SCM commit job status."""
    new_count = state["poll_count"] + 1
    
    # Wait between polls (not on first poll)
    if new_count > 1:
        print("⏱️  Waiting 5s before next poll...")
        time.sleep(5)
        new_elapsed = state["elapsed_time"] + 5
    else:
        new_elapsed = 0
    
    print(f"\nPoll {new_count}/{state['max_polls']}: Checking job {state['job_id']}...")
    
    try:
        # Initialize client
        client = ScmClient(
            client_id=state.get("client_id"),
            client_secret=state.get("client_secret"),
            tsg_id=state.get("tsg_id")
        )
        
        # Get job status
        job_response = client.get_job_status(state["job_id"])
        status = job_response["data"][0]["status_str"]
        
        print(f"   Status: {status}")
        
        # Check completion
        if status == "FIN":
            print("   ✅ Job completed successfully!")
            completed = True
        elif status in ["PEND", "ACT"]:
            print("   🔄 Job still running...")
            completed = False
        else:
            print(f"   ❌ Job failed with status: {status}")
            completed = False
        
        return {
            "poll_count": new_count,
            "job_status": status,
            "completed": completed,
            "elapsed_time": new_elapsed
        }
    
    except Exception as e:
        print(f"   ❌ Error checking job status: {e}")
        return {
            "poll_count": new_count,
            "job_status": "error",
            "completed": False,
            "elapsed_time": new_elapsed
        }

def should_continue_polling(state: JobPollingState) -> Literal["continue", "success", "timeout", "failed"]:
    """Router: Decide whether to continue polling."""
    # Check max polls FIRST (safety!)
    if state["poll_count"] >= state["max_polls"]:
        print(f"\n🛑 Timeout: Maximum polls ({state['max_polls']}) reached!")
        return "timeout"
    
    # Check job completion
    if state["job_status"] == "FIN":
        print("\n✅ Job completed successfully!")
        return "success"
    
    # Check job failure
    if state["job_status"] not in ["PEND", "ACT", "FIN", ""]:
        print(f"\n❌ Job failed with status: {state['job_status']}")
        return "failed"
    
    # Continue polling
    return "continue"

# Example usage in production workflow:
# 1. Submit commit
# job_id = client.commit(folders=["Texas"], description="Update security rules")
# 
# 2. Build polling graph
# polling_graph = StateGraph(JobPollingState)
# polling_graph.add_node("poll", poll_commit_job)
# polling_graph.add_node("check", lambda state: state)
# polling_graph.set_entry_point("poll")
# polling_graph.add_edge("poll", "check")
# polling_graph.add_conditional_edges(
#     source="check",
#     path=should_continue_polling,
#     path_map={
#         "continue": "poll",     # Loop back!
#         "success": END,
#         "timeout": END,
#         "failed": END
#     }
# )
# 
# 3. Run polling loop
# result = polling_graph.compile().invoke({
#     "job_id": job_id,
#     "poll_count": 0,
#     "max_polls": 60,  # 5 minutes max (60 * 5s = 300s)
#     "job_status": "",
#     "completed": False,
#     "elapsed_time": 0
# })
```

**Key differences from HA polling:**
- Real SCM API call: `client.get_job_status(job_id)`
- Multiple status values: PEND (pending), ACT (active), FIN (finished)
- Longer timeout: 60 polls = 5 minutes (commits can take time)
- Error handling: Catches API exceptions
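
The status handling in the router can also be pulled into a small pure function, which makes it easy to test without an SCM connection. This is a sketch only: `classify_job_status` is a hypothetical helper, and it assumes the status codes used in the example above (PEND, ACT, FIN) with every other value treated as a failure.

```python
from typing import Literal

Outcome = Literal["continue", "success", "failed"]

# Status codes taken from the example above; anything else is treated as a failure.
IN_PROGRESS = {"PEND", "ACT"}
FINISHED = "FIN"


def classify_job_status(status: str) -> Outcome:
    """Map a raw job status string to a routing decision."""
    if status == FINISHED:
        return "success"
    # An empty status means no poll has run yet, so keep polling
    if status in IN_PROGRESS or status == "":
        return "continue"
    return "failed"


for status in ["PEND", "ACT", "FIN", "error", ""]:
    print(f"{status!r:8} -> {classify_job_status(status)}")
```

The timeout check stays in the router itself, since it depends on `poll_count` rather than on the job status.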

**Reference:** See SCM API documentation for `get_job_status()` details.

### Polling Best Practices

1. **Fixed delay** between polls (don't hammer API)
2. **Maximum timeout** (don't poll forever)
3. **Exponential backoff** for long operations (5s → 10s → 20s)
4. **Log each poll** for debugging
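
For practice 3, one way to generate the growing delays is a small generator. The sketch below is an illustration under assumptions: `poll_delays` is a hypothetical name, and the 60-second cap is an arbitrary choice added so the delay cannot grow without bound.

```python
from itertools import islice
from typing import Iterator


def poll_delays(initial: float = 5.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield waits between polls: 5s, 10s, 20s, ... never exceeding `cap` seconds."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


first_six = list(islice(poll_delays(), 6))
print(first_six)
```

Inside a polling node you would take the next value from the generator (or recompute it from `poll_count`) instead of sleeping for a fixed 5 seconds.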

Try it below!

In [None]:
# Your code here for Exercise 7.3!
# Build HA polling workflow

# Step 1: Define HAPollingState


# Step 2: Create nodes (initialize, check_ha_status, check_result)


# Step 3: Create router function


# Step 4: Build graph with conditional loop


# Step 5: Test with max_polls=12


# ============================================================================
# HINTS (uncomment if you need help)
# ============================================================================

# Hint 1: Status simulation
# rand = random.random()
# if rand < 0.2:
#     status = "synchronized"
# elif rand < 0.5:
#     status = "synchronizing"  
# else:
#     status = "out-of-sync"

# Hint 2: Check HA status node structure
# def check_ha_status(state):
#     new_count = state["poll_count"] + 1
#     
#     # Wait 5s between polls (not on first poll)
#     if new_count > 1:
#         print("⏱️  Waiting 5s...")
#         time.sleep(5)
#         new_elapsed = state["elapsed_time"] + 5
#     else:
#         new_elapsed = 0
#     
#     print(f"\nPoll {new_count}/{state['max_polls']}: Checking HA status...")
#     
#     # Simulate status check (20% success)
#     rand = random.random()
#     if rand < 0.2:
#         status = "synchronized"
#         synced = True
#         print("   ✅ Status: synchronized")
#     elif rand < 0.5:
#         status = "synchronizing"
#         synced = False
#         print("   🔄 Status: synchronizing")
#     else:
#         status = "out-of-sync"
#         synced = False
#         print("   ❌ Status: out-of-sync")
#     
#     return {
#         "poll_count": new_count,
#         "ha_status": status,
#         "synced": synced,
#         "elapsed_time": new_elapsed
#     }

# Hint 3: Router checks max_polls FIRST
# def should_poll_again(state) -> Literal["synced", "timeout", "poll_again"]:
#     if state["poll_count"] >= state["max_polls"]:
#         return "timeout"
#     if state["synced"]:
#         return "synced"
#     return "poll_again"

---

### 7.4 Exercise: Batch Address Object Creation with Error Recovery

Build a batch processing loop to create 50 address objects one at a time, with error recovery.

### The Challenge

Create many address objects sequentially:
- Process objects one at a time (avoid overwhelming API)
- Track successes and failures separately
- Continue processing even if some fail (resilient)
- Maximum 100 iterations (safety limit, should never hit)

### Requirements

**1. Define State** with fields:
```python
class BatchProcessingState(TypedDict):
    objects_to_create: list    # Full list of objects to process (indexed by current_index)
    objects_created: list      # Successfully created objects
    failed_objects: list       # Objects that failed
    current_index: int         # Current position in list
    max_iterations: int        # Safety limit (100)
    all_processed: bool        # All objects processed?
```

**2. Create Nodes**:
- `initialize_batch`: Set current_index=0, create list of 50 objects
- `create_single_object`:
  - Get object at current_index
  - Simulate creation (80% success rate)
  - If success: add to objects_created
  - If failure: add to failed_objects  
  - Increment current_index
- `check_batch`: Passthrough node (`lambda state: state`) that the router's conditional edge attaches to

**3. Create Router** that returns:
- `"continue"`: If current_index < len(objects_to_create) AND current_index < max_iterations
- `"complete"`: If all objects processed
- `"max_iterations_reached"`: If hit safety limit

**4. Build Graph** with loop back to create_single_object

**5. Test** with:
- 50 address objects to create
- 80% success rate (expect ~40 created, ~10 failed)
- max_iterations = 100 (safety)

### Expected Output

```
Creating object 1/50: address_0
   ✅ Created successfully

Creating object 2/50: address_1
   ❌ Creation failed

Creating object 3/50: address_2
   ✅ Created successfully

...

Batch complete!
Total processed: 50
Successfully created: 42
Failed: 8
Success rate: 84%
```

### Why Batch Processing Loops?

**Instead of this (risky):**
```python
# No error handling: the first failure aborts the loop,
# and the remaining objects are never created
for obj in objects:
    client.address.create(obj)  # Crash on first error
```

**Use this (resilient):**
```python
# Process one at a time with error tracking
for obj in objects:
    try:
        client.address.create(obj)
        successes.append(obj)
    except Exception as e:
        failures.append((obj, e))
        # Continue processing!
```
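
The resilient version can be tried end to end without an SCM tenant. The sketch below swaps the real client for `FakeAddressClient`, a hypothetical stand-in that rejects a fixed set of names; both it and `create_batch` are illustrative and not part of any SCM SDK.

```python
class FakeAddressClient:
    """Hypothetical stand-in for an SCM client; rejects names in `bad_names`."""

    def __init__(self, bad_names: set[str]) -> None:
        self.bad_names = bad_names

    def create(self, name: str) -> None:
        if name in self.bad_names:
            raise ValueError(f"rejected {name}")


def create_batch(client: FakeAddressClient, names: list[str]) -> tuple[list[str], list[tuple[str, str]]]:
    """Create every object, recording failures instead of stopping at the first one."""
    successes: list[str] = []
    failures: list[tuple[str, str]] = []
    for name in names:
        try:
            client.create(name)
            successes.append(name)
        except Exception as exc:  # Record the failure and keep going
            failures.append((name, str(exc)))
    return successes, failures


client = FakeAddressClient(bad_names={"address_1", "address_3"})
created, failed = create_batch(client, [f"address_{i}" for i in range(5)])
print(f"Created: {created}")
print(f"Failed:  {failed}")
```

Keeping the error message next to each failed name gives you enough detail to retry or report only the objects that need attention.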

### Real SCM Use Cases

- **Bulk Object Creation**: Create 100+ address/service objects
- **Configuration Migration**: Move 500 rules from old to new firewall
- **Batch Updates**: Update properties on many objects
- **Cleanup Operations**: Delete old unused objects
- **Batch Validation**: Validate many objects, report all errors

### Batch Processing Best Practices

1. **Track successes AND failures** separately
2. **Continue on individual failures** (don't crash)
3. **Log each operation** for debugging
4. **Return detailed results** (counts, lists)
5. **Add delays** if API has rate limits
6. **Use safety counter** (max_iterations) to prevent infinite loops
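
One more limit is worth keeping in mind for long loops. LangGraph enforces its own step budget through the `recursion_limit` config key, which defaults to 25 steps as far as I know. A 50-object batch that visits two nodes per object would likely exceed that default and stop with a `GraphRecursionError` before your own `max_iterations` check ever fires. The sketch below estimates a budget; `estimate_recursion_limit` and its parameters are hypothetical, and the headroom of 5 is an arbitrary margin.

```python
def estimate_recursion_limit(iterations: int, nodes_per_iteration: int = 2,
                             setup_nodes: int = 1, headroom: int = 5) -> int:
    """Rough step budget for a loop: setup nodes + per-iteration nodes + headroom."""
    return setup_nodes + iterations * nodes_per_iteration + headroom


# 50 objects, two nodes per pass (create_single_object + check), one initialize node
limit = estimate_recursion_limit(iterations=50)
config = {"recursion_limit": limit}
print(config)

# Usage, assuming `batch_app` is your compiled graph and `initial_state` its input:
# result = batch_app.invoke(initial_state, config=config)
```

The same estimate applies to the polling example: 60 polls across two nodes needs a budget well above the default.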

Try it below!

In [None]:
# Your code here for Exercise 7.4!
# Build batch processing workflow

# Step 1: Define BatchProcessingState


# Step 2: Create nodes (initialize, create_single_object, check)


# Step 3: Create router function


# Step 4: Build graph with conditional loop


# Step 5: Test with 50 objects


# ============================================================================
# HINTS (uncomment if you need help)
# ============================================================================

# Hint 1: Initialize batch
# def initialize_batch(state: BatchProcessingState) -> dict:
#     # Create list of 50 objects to process
#     objects = [f"address_{i}" for i in range(50)]
#     return {
#         "objects_to_create": objects,
#         "objects_created": [],
#         "failed_objects": [],
#         "current_index": 0,
#         "all_processed": False
#     }

# Hint 2: Create single object with error handling
# def create_single_object(state: BatchProcessingState) -> dict:
#     current_idx = state["current_index"]
#     obj_name = state["objects_to_create"][current_idx]
#     
#     print(f"\nCreating object {current_idx + 1}/{len(state['objects_to_create'])}: {obj_name}")
#     
#     # Simulate creation (80% success rate)
#     success = random.random() < 0.8
#     
#     if success:
#         print("   ✅ Created successfully")
#         new_created = state["objects_created"] + [obj_name]
#         new_failed = state["failed_objects"]
#     else:
#         print("   ❌ Creation failed")
#         new_created = state["objects_created"]
#         new_failed = state["failed_objects"] + [obj_name]
#     
#     return {
#         "current_index": current_idx + 1,
#         "objects_created": new_created,
#         "failed_objects": new_failed
#     }

# Hint 3: Router checks max_iterations FIRST
# def should_continue_batch(state) -> Literal["continue", "complete", "max_iterations_reached"]:
#     # Safety check FIRST
#     if state["current_index"] >= state["max_iterations"]:
#         return "max_iterations_reached"
#     
#     # Check if all processed
#     if state["current_index"] >= len(state["objects_to_create"]):
#         return "complete"
#     
#     return "continue"

# Hint 4: Final results
# print("\n" + "="*70)
# print("BATCH PROCESSING COMPLETE")
# print("="*70)
# print(f"Total objects: {len(result['objects_to_create'])}")
# print(f"Successfully created: {len(result['objects_created'])}")
# print(f"Failed: {len(result['failed_objects'])}")
# print(f"Success rate: {len(result['objects_created']) / len(result['objects_to_create']) * 100:.1f}%")

---

## 8. What's Next: LLM Integration Begins!

Congratulations! You've completed the **Graph Patterns Foundation** (notebooks 103-107). You now know:
- ✅ Single-node graphs (103)
- ✅ Complex state structures (104)
- ✅ Sequential workflows (105)
- ✅ Conditional routing (106)
- ✅ Looping workflows (107)

### Two-Phase Workshop Structure

**Phase 1: Graph Patterns (Complete!)** ✅
- Notebooks 103-107
- No LLM integration
- Pure workflow mechanics
- No API keys required

**Phase 2: LLM Integration (Starting Next!)** 🚀
- Notebooks 108-111
- AI-powered agents
- Tool calling and memory
- API keys required

### Coming in Notebook 108: Your First LLM Integration

Everything you've built so far uses **deterministic routing logic**: routers that always return the same path for the same state. But the real power comes from integrating **Large Language Models (LLMs)**!

In Notebook 108, you'll learn:

**1. LangChain + LangGraph Together**
- How LangChain provides LLM tools
- How LangGraph orchestrates LLM workflows
- The `ChatOpenAI` and `HumanMessage` patterns

**2. Your First AI Bot**
```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(model="gpt-4o")

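# Same graph structure you know! A node still returns a state update (a dict).
def process(state):
    response = llm.invoke(state["messages"])
    return {"messages": state["messages"] + [response]}

graph.add_node("process", process)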
```

**3. The Memory Problem**
- Why simple bots can't remember conversations
- How state management becomes critical
- Preview of conversation memory (109)

### Why This Progression?

**Learning graph patterns FIRST** means:
- ✅ You understand the foundation before adding complexity
- ✅ You can debug LLM issues vs. graph issues separately
- ✅ You will appreciate how reducers simplify LLM state management
- ✅ Lower cost (no API calls during graph practice)

**Now adding LLMs** builds on solid foundation:
- You know: State management, nodes, edges, routing, loops
- You add: LLM integration, conversation memory, tool calling, ReAct patterns

### The LLM Integration Path (108-111)

**108: First LLM Integration (Simple Bot)**
- Connect ChatOpenAI to graphs
- HumanMessage basics
- Discover the memory problem

**109: Conversational Memory**
- AIMessage and Union types
- Manual conversation history
- Token cost management

**110: ReAct Agents with Tools**
- `add_messages` reducer
- Tool calling with `@tool`
- Intelligent decision-making

**111: Human-in-the-Loop**
- Interactive collaboration
- Configuration drafting
- Production deployment patterns

Get ready to add AI intelligence to your SCM automation workflows!

---

## 9. Summary

### Key Concepts Mastered

**1. Looping Workflows**
- Self-referencing edges create cycles
- Enable retry, pagination, polling, batch processing
- Combine state, conditional routing, and counters

**2. The Loop Pattern**
```python
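from typing import Literal, TypedDict

from langgraph.graph import END, StateGraph

# State with counters
class LoopState(TypedDict):
    counter: int
    max_iterations: int

# Processing node (increments counter)
def process(state: LoopState) -> dict:
    return {"counter": state["counter"] + 1}

# Router (checks max FIRST!)
def router(state: LoopState) -> Literal["loop", "exit"]:
    if state["counter"] >= state["max_iterations"]:
        return "exit"
    return "loop"

# Graph wiring: process -> check -> (loop back or exit)
graph = StateGraph(LoopState)
graph.add_node("process", process)
graph.add_node("check", lambda state: state)  # Passthrough node for routing
graph.set_entry_point("process")
graph.add_edge("process", "check")

# Conditional edge (creates loop)
graph.add_conditional_edges(
    source="check",
    path=router,
    path_map={"loop": "process", "exit": END}
)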
```

**3. Safety Rules**
- Always use counter in state
- Always increment counter
- Always check maximum FIRST
- Always have multiple exit paths
- Always validate loop design
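
The first three rules can be checked without LangGraph by tracing the pattern in plain Python. This is a hand-rolled sketch, not how LangGraph executes a graph: `run_loop` and its `hard_stop` guard are hypothetical names used only to show what happens when a node forgets to increment the counter.

```python
from typing import Callable


def process(state: dict) -> dict:
    """Loop body: increments the counter on every pass."""
    return {**state, "counter": state["counter"] + 1}


def router(state: dict) -> str:
    """Checks the maximum first, exactly as in the loop pattern."""
    if state["counter"] >= state["max_iterations"]:
        return "exit"
    return "loop"


def run_loop(state: dict, step: Callable[[dict], dict] = process, hard_stop: int = 1000) -> dict:
    """Repeat step -> router until the router returns 'exit'."""
    for _ in range(hard_stop):
        state = step(state)
        if router(state) == "exit":
            return state
    raise RuntimeError("router never returned 'exit'")


final = run_loop({"counter": 0, "max_iterations": 3})
print(final)

# A node that forgets to increment the counter never satisfies the router
try:
    run_loop({"counter": 0, "max_iterations": 3}, step=lambda s: dict(s))
except RuntimeError as exc:
    print(f"Caught: {exc}")
```

The well-behaved loop stops after exactly `max_iterations` passes, while the broken one is only stopped by the outer guard.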

**4. Network Analogies**
- TCP retransmission = Retry loops
- BGP route chunks = Pagination
- IP TTL = Loop counter
- All prevent infinite cycles!

**5. When to Loop**

| Use Case | Pattern |
|----------|---------|
| Retry with limit | Hybrid loop (count + condition) |
| Pagination | Condition loop (until all fetched) |
| Polling | Condition loop (until ready) |
| Batch processing | Count loop (N times) |

### Production Best Practices

**For Retry Loops**:
- Max 3-5 attempts
- Exponential backoff between attempts
- Log each attempt
- Different exit paths for success/failure

**For Pagination**:
- Track total vs fetched
- Safety limit (max_pages)
- Handle empty results
- Accumulate results in state list

**For All Loops**:
- Counter initialized to 0
- Incremented in loop body
- Checked against maximum FIRST
- Multiple exit conditions
- Logged for debugging

### Your LangGraph Toolkit

You can now build:
- **Sequential**: Linear workflows (105)
- **Branching**: Conditional routing (106)
- **Looping**: Retry and iteration (107)
- **Combined**: Complex workflows with all patterns

### Next: LLM-Powered Agents

Starting with Notebook 108, add intelligence:
- LLMs make routing decisions
- Tools for LLMs to use
- ReAct pattern (reason + act)
- Adaptive, intelligent automation

You're ready to build production SCM automation! 🎉