# 🔍 StateDiff: Advanced Agent State Tracking & Debugging

Welcome to the comprehensive StateDiff tutorial! This notebook teaches you how to track, analyze, and debug agent state changes with precision.

## 🎯 Learning Objectives

By the end of this notebook, you'll master:
- **State snapshot management**: When and how to capture agent states
- **Diff analysis**: Understanding complex state changes and their implications
- **Debugging workflows**: Using StateDiff to solve real agent problems
- **Performance optimization**: Cost tracking and state management at scale
- **Production patterns**: Best practices for deploying StateDiff in live systems

---

## 🚀 Setup & Installation

In [None]:
# Install Argentum if needed
import sys
import subprocess

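try:
    import argentum
    print("✅ Argentum already installed")
except ImportError:
    print("📦 Installing Argentum...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "argentum-agent"])
    # Refresh the import system, then bind the module name so that
    # argentum.__version__ is available later in this cell
    import importlib
    importlib.invalidate_caches()
    import argentum
    print("✅ Installation complete!")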

# Import required modules
from argentum import StateDiff
from datetime import datetime
import json
import random
import time
import copy

# getattr guards against builds that do not expose __version__
print(f"🔍 StateDiff Tutorial - Argentum v{getattr(argentum, '__version__', 'unknown')}")
print(f"📅 Started: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

## 📖 Chapter 1: StateDiff Fundamentals

Let's start with the basics: capturing and comparing agent states.
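Before using the library, it can help to see what comparing two snapshots means in plain Python. The sketch below is not Argentum's implementation: `shallow_diff` is a hypothetical helper that compares two flat dictionaries and reports changes with `added` and `from`/`to` keys, the same key names the tutorial code reads from `get_changes`. The `removed` key is an assumption added for completeness.

```python
from typing import Any, Dict


def shallow_diff(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Compare two flat state dicts key by key (illustrative only)."""
    changes: Dict[str, Dict[str, Any]] = {}
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            changes[key] = {"added": after[key]}
        elif key not in after:
            # "removed" is an assumed key name, not taken from the tutorial
            changes[key] = {"removed": before[key]}
        elif before[key] != after[key]:
            changes[key] = {"from": before[key], "to": after[key]}
    return changes


before = {"status": "received", "intent": None}
after = {"status": "processing", "intent": "schedule_meeting", "confidence": 0.8}
result = shallow_diff(before, after)
for field, change in result.items():
    print(field, change)
```

Unchanged keys are omitted from the result, so an empty dictionary means the two states are equal at the top level.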

In [None]:
print("📖 Chapter 1: StateDiff Fundamentals")
print("="*50)

# Create a StateDiff instance
diff = StateDiff()

# Example: Simple agent that processes a user request
print("🤖 Simulating agent processing user request: 'Schedule a meeting with the team'")

# Initial state
initial_state = {
    "request": "Schedule a meeting with the team",
    "status": "received",
    "extracted_entities": [],
    "intent": None,
    "confidence": 0.0,
    "next_actions": [],
    "context": {
        "user_id": "user_123",
        "session_id": "sess_456"
    }
}

diff.snapshot("request_received", initial_state)
print("📸 Snapshot 1: Request received")

# After NLP processing
nlp_processed_state = {
    "request": "Schedule a meeting with the team",
    "status": "processing",
    "extracted_entities": [
        {"type": "action", "value": "schedule", "confidence": 0.95},
        {"type": "object", "value": "meeting", "confidence": 0.98},
        {"type": "participants", "value": "team", "confidence": 0.87}
    ],
    "intent": "schedule_meeting",
    "confidence": 0.8,
    "next_actions": ["get_team_members", "check_availability"],
    "context": {
        "user_id": "user_123",
        "session_id": "sess_456",
        "nlp_model": "spacy_v3",
        "processing_time_ms": 150
    }
}

diff.snapshot("nlp_complete", nlp_processed_state)
print("📸 Snapshot 2: NLP processing complete")

# After calendar integration
calendar_state = {
    "request": "Schedule a meeting with the team",
    "status": "scheduling",
    "extracted_entities": [
        {"type": "action", "value": "schedule", "confidence": 0.95},
        {"type": "object", "value": "meeting", "confidence": 0.98},
        {"type": "participants", "value": "team", "confidence": 0.87}
    ],
    "intent": "schedule_meeting",
    "confidence": 0.95,
    "next_actions": ["send_invitations"],
    "context": {
        "user_id": "user_123",
        "session_id": "sess_456",
        "nlp_model": "spacy_v3",
        "processing_time_ms": 150,
        "calendar_integration": True,
        "available_slots": ["2024-11-09 14:00", "2024-11-09 15:30"]
    },
    "team_members": [
        {"name": "Alice", "email": "alice@company.com", "available": True},
        {"name": "Bob", "email": "bob@company.com", "available": True},
        {"name": "Charlie", "email": "charlie@company.com", "available": False}
    ]
}

diff.snapshot("calendar_checked", calendar_state)
print("📸 Snapshot 3: Calendar availability checked")

# Analyze the differences
print("\n🔍 Analyzing State Changes:")
print("-" * 30)

# Request → NLP
nlp_changes = diff.get_changes("request_received", "nlp_complete")
print("\n1️⃣ Request → NLP Processing:")
for field, change in nlp_changes.items():
    if "added" in change:
        added = change["added"]
        if isinstance(added, list) and added:
            print(f"   ➕ {field}: Added {len(added)} items")
            if field == "extracted_entities":
                for entity in added:
                    print(f"      • {entity['type']}: {entity['value']} ({entity['confidence']:.2f})")
        else:
            # Empty lists and non-list values are printed as-is
            print(f"   ➕ {field}: {added}")
    elif "from" in change and "to" in change:
        print(f"   🔄 {field}: {change['from']} → {change['to']}")

# NLP → Calendar
calendar_changes = diff.get_changes("nlp_complete", "calendar_checked")
print("\n2️⃣ NLP → Calendar Integration:")
for field, change in calendar_changes.items():
    if "added" in change:
        if isinstance(change["added"], list) and len(change["added"]) > 0:
            print(f"   ➕ {field}: Added {len(change['added'])} items")
        else:
            print(f"   ➕ {field}: {change['added']}")
    elif "from" in change and "to" in change:
        print(f"   🔄 {field}: {change['from']} → {change['to']}")

# Overall transformation
overall_changes = diff.get_changes("request_received", "calendar_checked")
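print("\n🌟 Overall Transformation:")
print(f"   📊 Total fields changed: {len(overall_changes)}")
# Read the summary values from the captured states instead of hardcoding them
conf_start, conf_end = initial_state["confidence"], calendar_state["confidence"]
print(f"   📈 Confidence: {conf_start} → {conf_end} ({conf_end - conf_start:+.2f})")
print(f"   🎯 Status: {initial_state['status']} → {calendar_state['status']}")
print(f"   ✨ Intent discovered: {calendar_state['intent']}")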

## 🔬 Chapter 2: Advanced Diff Analysis

Learn to work with complex nested states and understand subtle changes that impact agent behavior.
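One common way to compare nested states is to flatten them into dotted paths first, so that each leaf value can be compared on its own. Chapter 3 matches on field names such as `user_profile.subscription`, which suggests a similar convention, but whether Argentum flattens states this way internally is an assumption. `flatten` and `changed_paths` below are hypothetical helpers written for illustration.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel for paths that exist on only one side


def flatten(state: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn a nested dict into {'a.b.c': value} pairs; lists stay as leaf values."""
    flat: Dict[str, Any] = {}
    for key, value in state.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def changed_paths(before: Dict[str, Any], after: Dict[str, Any]) -> List[str]:
    """Return the sorted dotted paths whose values differ between two states."""
    a, b = flatten(before), flatten(after)
    return sorted(p for p in a.keys() | b.keys() if a.get(p, _MISSING) != b.get(p, _MISSING))


before = {"session": {"tokens": 0, "cost": 0.0}, "reasoning": {"uncertainty_level": 1.0}}
after = {"session": {"tokens": 1250, "cost": 0.0}, "reasoning": {"uncertainty_level": 0.3}}
changed = changed_paths(before, after)
print(changed)
```

Treating lists as leaf values keeps the sketch short; a fuller implementation would also need a policy for comparing list elements.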

In [None]:
print("🔬 Chapter 2: Advanced Diff Analysis")
print("="*40)

# Create a new StateDiff for complex state tracking
complex_diff = StateDiff()

# Simulate a complex agent with multiple subsystems
print("🤖 Complex Agent: Multi-modal AI Assistant")
print("📋 Scenario: Processing a request with images, text, and context")

# Initial complex state
complex_initial = {
    "session": {
        "user_id": "user_789",
        "conversation_turn": 1,
        "total_tokens_used": 0,
        "cost_so_far": 0.0
    },
    "input_processing": {
        "text": "Analyze this product image and tell me about pricing trends",
        "attachments": ["product_image.jpg"],
        "modalities": ["text", "vision"],
        "preprocessing_status": "pending"
    },
    "knowledge_base": {
        "facts_retrieved": [],
        "confidence_scores": {},
        "search_queries": [],
        "last_updated": None
    },
    "reasoning": {
        "current_hypothesis": None,
        "evidence_for": [],
        "evidence_against": [],
        "reasoning_chain": [],
        "uncertainty_level": 1.0
    },
    "output_generation": {
        "draft_response": None,
        "response_type": "pending",
        "formatting": {},
        "fact_checking_status": "not_started"
    }
}

complex_diff.snapshot("complex_start", complex_initial)
print("📸 Captured initial complex state")

# After vision processing
vision_processed = copy.deepcopy(complex_initial)
vision_processed["session"]["conversation_turn"] = 1
vision_processed["session"]["total_tokens_used"] = 1250  # Vision model tokens
vision_processed["session"]["cost_so_far"] = 0.025
vision_processed["input_processing"]["preprocessing_status"] = "vision_complete"
vision_processed["input_processing"]["vision_analysis"] = {
    "objects_detected": ["laptop", "coffee_cup", "desk"],
    "text_in_image": "MacBook Pro 16-inch - $2,399",
    "confidence": 0.92,
    "processing_time_ms": 850
}
vision_processed["knowledge_base"]["search_queries"] = ["MacBook Pro pricing", "laptop market trends"]
vision_processed["reasoning"]["current_hypothesis"] = "User asking about laptop pricing trends"
vision_processed["reasoning"]["uncertainty_level"] = 0.3

complex_diff.snapshot("vision_processed", vision_processed)
print("📸 Vision processing complete")

# After knowledge retrieval
knowledge_enriched = copy.deepcopy(vision_processed)
knowledge_enriched["session"]["total_tokens_used"] = 2100
knowledge_enriched["session"]["cost_so_far"] = 0.048
knowledge_enriched["knowledge_base"]["facts_retrieved"] = [
    "MacBook Pro 16-inch MSRP: $2,399 (as of 2024)",
    "Average laptop prices increased 15% in 2024",
    "Premium laptop market growing at 8% annually",
    "Apple typically refreshes MacBook Pro every 18 months"
]
knowledge_enriched["knowledge_base"]["confidence_scores"] = {
    "pricing_accuracy": 0.95,
    "trend_reliability": 0.82,
    "market_data_freshness": 0.78
}
knowledge_enriched["knowledge_base"]["last_updated"] = "2024-11-08T10:30:00Z"
knowledge_enriched["reasoning"]["evidence_for"] = [
    "Image shows current MSRP pricing",
    "Market data confirms upward trend",
    "User specifically asked about trends"
]
knowledge_enriched["reasoning"]["reasoning_chain"] = [
    "Detected MacBook Pro in image",
    "Extracted price information: $2,399",
    "Retrieved relevant market trend data",
    "Price aligns with current market rates"
]
knowledge_enriched["reasoning"]["uncertainty_level"] = 0.15

complex_diff.snapshot("knowledge_enriched", knowledge_enriched)
print("📸 Knowledge retrieval complete")

# Final response ready
response_ready = copy.deepcopy(knowledge_enriched)
response_ready["session"]["total_tokens_used"] = 2850
response_ready["session"]["cost_so_far"] = 0.067
response_ready["output_generation"]["draft_response"] = "I can see a MacBook Pro 16-inch in your image priced at $2,399, which matches the current MSRP. Laptop pricing trends show a 15% increase in 2024, with premium laptops like MacBook Pro maintaining strong market position."
response_ready["output_generation"]["response_type"] = "informational_with_analysis"
response_ready["output_generation"]["formatting"] = {
    "include_price_chart": True,
    "highlight_key_trends": True,
    "confidence_indicator": "high"
}
response_ready["output_generation"]["fact_checking_status"] = "verified"
response_ready["reasoning"]["uncertainty_level"] = 0.08

complex_diff.snapshot("response_ready", response_ready)
print("📸 Response generation complete")

# Advanced analysis
print("\n🔬 Advanced Diff Analysis:")
print("-" * 35)

# Sequential analysis
sequence = complex_diff.get_sequence_changes()
print("\n📈 Sequential Processing Analysis:")
for i, step in enumerate(sequence, 1):
    print(f"\n{i}. {step['from']} → {step['to']}")
    
    # Count different types of changes
    additions = sum(1 for change in step['changes'].values() if 'added' in change)
    modifications = sum(1 for change in step['changes'].values() if 'from' in change)
    
    print(f"   📊 {additions} additions, {modifications} modifications")
    
    # Highlight key changes
    key_changes = []
    for field, change in step['changes'].items():
        if 'uncertainty_level' in field and 'from' in change:
            uncertainty_drop = change['from'] - change['to']
            key_changes.append(f"Uncertainty ↓{uncertainty_drop:.2f}")
        elif 'total_tokens_used' in field and 'from' in change:
            token_increase = change['to'] - change['from']
            key_changes.append(f"Tokens +{token_increase}")
        elif 'facts_retrieved' in field and 'added' in change:
            facts_added = len(change['added']) if isinstance(change['added'], list) else 1
            key_changes.append(f"Facts +{facts_added}")
    
    if key_changes:
        print(f"   🎯 Key: {', '.join(key_changes)}")

# Cost and performance analysis
print("\n💰 Cost & Performance Analysis:")
start_cost = complex_initial["session"]["cost_so_far"]
final_cost = response_ready["session"]["cost_so_far"]
start_tokens = complex_initial["session"]["total_tokens_used"]
final_tokens = response_ready["session"]["total_tokens_used"]

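print(f"   💸 Total cost: ${start_cost:.3f} → ${final_cost:.3f} (+${final_cost - start_cost:.3f})")
print(f"   🔤 Total tokens: {start_tokens:,} → {final_tokens:,} (+{final_tokens - start_tokens:,})")
# Read uncertainty from the captured states instead of hardcoding it
start_uncertainty = complex_initial["reasoning"]["uncertainty_level"]
end_uncertainty = response_ready["reasoning"]["uncertainty_level"]
print(f"   📉 Uncertainty: {start_uncertainty:.2f} → {end_uncertainty:.2f} ({end_uncertainty - start_uncertainty:+.2f})")
print(f"   🎯 Processing efficiency: {(final_tokens / final_cost):.0f} tokens/$")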

# Quality indicators
print("\n✨ Quality Indicators:")
vision_confidence = vision_processed["input_processing"]["vision_analysis"]["confidence"]
pricing_confidence = knowledge_enriched["knowledge_base"]["confidence_scores"]["pricing_accuracy"]
final_uncertainty = response_ready["reasoning"]["uncertainty_level"]

print(f"   👁️  Vision analysis: {vision_confidence:.1%} confidence")
print(f"   💰 Pricing accuracy: {pricing_confidence:.1%} confidence")
print(f"   🧠 Final uncertainty: {final_uncertainty:.1%} (excellent)")
print(f"   ✅ Fact checking: {response_ready['output_generation']['fact_checking_status']}")

## 🐛 Chapter 3: Debugging Real Agent Problems

Use StateDiff to identify and solve common agent issues that occur in production.
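The first problem in this chapter, confidence oscillation, comes down to counting direction reversals in a series of values. The sketch below isolates that idea from the library. `count_oscillations` is a hypothetical helper, and the sample values are the confidence levels used in the scenario that follows.

```python
from typing import Sequence


def count_oscillations(values: Sequence[float]) -> int:
    """Count how often consecutive changes flip sign (rise then fall, or fall then rise)."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    # A negative product means two neighbouring deltas point in opposite directions
    return sum(1 for prev, cur in zip(deltas, deltas[1:]) if prev * cur < 0)


confidences = [0.6, 0.8, 0.4, 0.7, 0.3, 0.5]
flips = count_oscillations(confidences)
print(f"{flips} direction reversals")
```

A steadily rising or falling series yields zero, so a high count relative to the number of steps is a reasonable signal that the agent keeps revising its own conclusion.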

In [None]:
print("🐛 Chapter 3: Debugging Real Agent Problems")
print("="*45)

# Problem 1: Confidence Oscillation
print("\n🎯 Problem 1: Agent Confidence Oscillation")
print("Scenario: Agent confidence keeps fluctuating, causing indecisive behavior")

oscillation_diff = StateDiff()

# Simulate oscillating confidence
confidence_states = [
    ("start", 0.6, "Initial analysis", ["fact1", "fact2"]),
    ("more_data", 0.8, "Found supporting evidence", ["fact1", "fact2", "fact3_supporting"]),
    ("contradiction", 0.4, "Found contradictory evidence", ["fact1", "fact2", "fact3_supporting", "fact4_contradictory"]),
    ("resolution_attempt", 0.7, "Tried to resolve contradiction", ["fact1", "fact2", "fact3_supporting", "fact4_weight_adjusted"]),
    ("new_contradiction", 0.3, "Found another contradiction", ["fact1", "fact2", "fact3_supporting", "fact4_weight_adjusted", "fact5_contradictory"]),
    ("final_state", 0.5, "Gave up, defaulting to middle ground", ["fact1", "fact2", "fact3_supporting", "fact4_weight_adjusted", "fact5_contradictory"])
]

for label, confidence, description, evidence in confidence_states:
    state = {
        "confidence": confidence,
        "description": description,
        "evidence": evidence,
        "decision_pending": confidence < 0.8
    }
    oscillation_diff.snapshot(label, state)
    print(f"   📸 {label}: {confidence:.1f} confidence - {description}")

# Analyze confidence pattern
print("\n🔍 Confidence Pattern Analysis:")
sequence = oscillation_diff.get_sequence_changes()
confidence_changes = []

for step in sequence:
    if 'confidence' in step['changes'] and 'from' in step['changes']['confidence']:
        from_conf = step['changes']['confidence']['from']
        to_conf = step['changes']['confidence']['to']
        change = to_conf - from_conf
        confidence_changes.append(change)
        direction = "↗️" if change > 0 else "↘️"
        print(f"   {direction} {step['from']} → {step['to']}: {change:+.1f} ({from_conf:.1f} → {to_conf:.1f})")

# Diagnostic insights
oscillations = sum(1 for i in range(1, len(confidence_changes)) if confidence_changes[i] * confidence_changes[i-1] < 0)
print(f"\n🚨 Diagnostic: {oscillations} confidence oscillations detected")
print("💡 Root cause: Agent lacks evidence reconciliation strategy")
print("🔧 Recommendation: Implement evidence weighting and conflict resolution")

# Problem 2: Memory Leak Detection
print("\n🧠 Problem 2: Agent Memory Leak")
print("Scenario: Agent accumulates too much state, causing performance degradation")

memory_diff = StateDiff()

# Simulate growing memory usage
memory_states = []
base_state = {
    "conversation_history": [],
    "cached_computations": {},
    "temporary_data": {},
    "performance_metrics": {"response_time_ms": 100}
}

for turn in range(1, 6):
    state = copy.deepcopy(base_state)
    
    # Accumulate conversation history (grows linearly)
    state["conversation_history"] = [f"turn_{i}" for i in range(1, turn + 1)]
    
    # Accumulate cached computations (grows quadratically due to poor cleanup)
    state["cached_computations"] = {f"cache_key_{i}_{j}": f"cached_value_{i}_{j}" 
                                   for i in range(1, turn + 1) 
                                   for j in range(1, i + 1)}
    
    # Temporary data that should be cleaned but isn't
    state["temporary_data"] = {f"temp_{i}": f"temporary_data_that_should_be_cleaned_{i}" 
                              for i in range(1, turn * 3 + 1)}
    
    # Performance degrades with memory usage
    memory_usage = len(state["conversation_history"]) + len(state["cached_computations"]) + len(state["temporary_data"])
    state["performance_metrics"]["response_time_ms"] = 100 + (memory_usage * 10)
    state["performance_metrics"]["memory_usage_items"] = memory_usage
    
    memory_diff.snapshot(f"turn_{turn}", state)
    print(f"   📸 Turn {turn}: {memory_usage} items in memory, {state['performance_metrics']['response_time_ms']}ms response time")

# Analyze memory growth pattern
print("\n📈 Memory Growth Analysis:")
memory_sequence = memory_diff.get_sequence_changes()

for i, step in enumerate(memory_sequence):
    turn_num = i + 2  # Starting from turn 2
    memory_changes = []
    
    for field, change in step['changes'].items():
        if 'added' in change and isinstance(change['added'], (list, dict)):
            added_count = len(change['added'])
            memory_changes.append(f"{field}: +{added_count}")
        elif 'memory_usage_items' in field and 'from' in change:
            growth = change['to'] - change['from']
            memory_changes.append(f"Total: +{growth} items")
        elif 'response_time_ms' in field and 'from' in change:
            slowdown = change['to'] - change['from']
            memory_changes.append(f"Latency: +{slowdown}ms")
    
    if memory_changes:
        print(f"   📊 Turn {turn_num}: {', '.join(memory_changes)}")

# Calculate growth rates
# NOTE: _snapshots is a private attribute; this assumes it maps each label
# to the stored state dict and may break in other library versions.
initial_items = memory_diff._snapshots["turn_1"]["performance_metrics"]["memory_usage_items"]
final_items = memory_diff._snapshots["turn_5"]["performance_metrics"]["memory_usage_items"]
growth_rate = (final_items - initial_items) / initial_items

print(f"\n🚨 Memory leak detected: {growth_rate:.1%} growth over 5 turns")
print("💡 Root cause: Temporary data and cache not being cleaned up")
print("🔧 Recommendations:")
print("   • Implement automatic cache expiration")
print("   • Add temporary data cleanup after each turn")
print("   • Set maximum conversation history length")
print("   • Monitor memory usage with alerts")

# Problem 3: State Corruption Detection
print("\n🔒 Problem 3: State Corruption Detection")
print("Scenario: Agent state becomes inconsistent due to race conditions or bugs")

corruption_diff = StateDiff()

# Normal state
normal_state = {
    "user_profile": {
        "name": "John Doe",
        "preferences": {"language": "en", "timezone": "UTC-5"},
        "subscription": "premium"
    },
    "session_data": {
        "authenticated": True,
        "permissions": ["read", "write", "admin"],
        "session_id": "sess_123"
    },
    "business_logic": {
        "current_operation": "data_analysis",
        "operation_state": "in_progress",
        "resources_allocated": ["cpu_core_1", "memory_block_A"]
    }
}

corruption_diff.snapshot("normal_state", normal_state)
print("   📸 Normal state captured")

# Corrupted state (simulating race condition)
corrupted_state = copy.deepcopy(normal_state)
# Inconsistency 1: User downgraded but still has admin permissions
corrupted_state["user_profile"]["subscription"] = "basic"
# Inconsistency 2: Operation marked complete but resources still allocated
corrupted_state["business_logic"]["operation_state"] = "completed"
# Inconsistency 3: Session expired but user data still present
corrupted_state["session_data"]["authenticated"] = False
corrupted_state["session_data"]["session_id"] = None
# Inconsistency 4: Null user name but preferences still exist
corrupted_state["user_profile"]["name"] = None

corruption_diff.snapshot("corrupted_state", corrupted_state)
print("   📸 Corrupted state captured")

# Detect corruption patterns
corruption_changes = corruption_diff.get_changes("normal_state", "corrupted_state")

print("\n🔍 State Corruption Analysis:")
inconsistencies = []

for field, change in corruption_changes.items():
    if 'from' in change and 'to' in change:
        old_val, new_val = change['from'], change['to']
        
        # Detect specific inconsistency patterns
        if field == "user_profile.subscription" and new_val == "basic":
            # Check if admin permissions still exist
            current_permissions = corrupted_state["session_data"]["permissions"]
            if "admin" in current_permissions:
                inconsistencies.append("Basic user with admin permissions")
        
        elif field == "business_logic.operation_state" and new_val == "completed":
            # Check if resources are still allocated
            if corrupted_state["business_logic"]["resources_allocated"]:
                inconsistencies.append("Completed operation with allocated resources")
        
        elif field == "session_data.authenticated" and new_val is False:
            # Check if user profile data still exists
            if corrupted_state["user_profile"]["preferences"]:
                inconsistencies.append("Unauthenticated session with active user data")
        
        elif field == "user_profile.name" and new_val is None:
            # Check if preferences still exist
            if corrupted_state["user_profile"]["preferences"]:
                inconsistencies.append("Null user name with active preferences")
        
        print(f"   🔄 {field}: {old_val} → {new_val}")

print(f"\n🚨 State inconsistencies detected: {len(inconsistencies)}")
for i, inconsistency in enumerate(inconsistencies, 1):
    print(f"   {i}. {inconsistency}")

print("\n🔧 Corruption remediation recommendations:")
print("   • Implement state validation checks after each update")
print("   • Add invariant assertions for business logic")
print("   • Use atomic transactions for related state changes")
print("   • Add automated corruption detection in production")

## ⚡ Chapter 4: Performance Optimization

Learn how to use StateDiff efficiently in production environments with cost tracking and optimization strategies.
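To make the cost figures in this chapter concrete, the sketch below aggregates per-operation token and cost records by hand. `summarize_costs` is a hypothetical helper, not part of Argentum. Its first three output keys mirror the ones this chapter reads from `get_cost_report()`, though the real report may be computed differently, and `tokens_per_dollar` is an added illustration. The records reuse the token and cost values from the trading scenario.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def summarize_costs(records: List[Dict[str, Number]]) -> Dict[str, float]:
    """Aggregate per-operation records that carry 'tokens_used' and 'cost'."""
    total_tokens = sum(r["tokens_used"] for r in records)
    total_cost = sum(r["cost"] for r in records)
    count = len(records)
    return {
        "total_tokens": total_tokens,
        "total_cost": total_cost,
        # Guard against division by zero for empty or zero-cost runs
        "cost_per_snapshot": total_cost / count if count else 0.0,
        "tokens_per_dollar": total_tokens / total_cost if total_cost > 0 else 0.0,
    }


records = [
    {"tokens_used": 850, "cost": 0.017},
    {"tokens_used": 320, "cost": 0.006},
    {"tokens_used": 1100, "cost": 0.022},
    {"tokens_used": 450, "cost": 0.009},
    {"tokens_used": 280, "cost": 0.005},
]
summary = summarize_costs(records)
print(f"{summary['total_tokens']:,} tokens, ${summary['total_cost']:.3f} total")
```

Keeping the aggregation separate from the snapshot store makes it easy to check a library-generated report against numbers you control.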

In [None]:
print("⚡ Chapter 4: Performance Optimization")
print("="*40)

# Enable cost tracking for performance analysis
perf_diff = StateDiff(track_costs=True)

print("🎯 Scenario: High-frequency trading agent with performance requirements")
print("📊 Tracking state changes with cost attribution")

# Simulate high-frequency operations
trading_operations = [
    ("market_open", {
        "portfolio": {"AAPL": 100, "GOOGL": 50, "MSFT": 75},
        "cash_balance": 50000.0,
        "market_data": {"last_update": "09:30:00", "data_points": 1000},
        "risk_metrics": {"var_95": 2500.0, "beta": 1.1},
        "active_orders": []
    }, 850, 0.017),
    
    ("price_update", {
        "portfolio": {"AAPL": 100, "GOOGL": 50, "MSFT": 75},
        "cash_balance": 50000.0,
        "market_data": {"last_update": "09:30:15", "data_points": 1200},
        "risk_metrics": {"var_95": 2650.0, "beta": 1.12},
        "active_orders": [],
        "price_changes": {"AAPL": 0.5, "GOOGL": -1.2, "MSFT": 0.8}
    }, 320, 0.006),
    
    ("signal_generated", {
        "portfolio": {"AAPL": 100, "GOOGL": 50, "MSFT": 75},
        "cash_balance": 50000.0,
        "market_data": {"last_update": "09:30:30", "data_points": 1400},
        "risk_metrics": {"var_95": 2650.0, "beta": 1.12},
        "active_orders": [],
        "price_changes": {"AAPL": 0.5, "GOOGL": -1.2, "MSFT": 0.8},
        "trading_signals": [
            {"symbol": "AAPL", "action": "BUY", "quantity": 25, "confidence": 0.78}
        ]
    }, 1100, 0.022),
    
    ("order_placed", {
        "portfolio": {"AAPL": 100, "GOOGL": 50, "MSFT": 75},
        "cash_balance": 46250.0,  # Cash reduced
        "market_data": {"last_update": "09:30:45", "data_points": 1600},
        "risk_metrics": {"var_95": 2780.0, "beta": 1.15},
        "active_orders": [
            {"order_id": "ORD_001", "symbol": "AAPL", "quantity": 25, "status": "pending"}
        ],
        "price_changes": {"AAPL": 0.5, "GOOGL": -1.2, "MSFT": 0.8},
        "trading_signals": []
    }, 450, 0.009),
    
    ("order_filled", {
        "portfolio": {"AAPL": 125, "GOOGL": 50, "MSFT": 75},  # Portfolio updated
        "cash_balance": 46250.0,
        "market_data": {"last_update": "09:31:00", "data_points": 1800},
        "risk_metrics": {"var_95": 2850.0, "beta": 1.18},
        "active_orders": [],
        "price_changes": {"AAPL": 0.5, "GOOGL": -1.2, "MSFT": 0.8},
        "trading_signals": [],
        "execution_history": [
            {"order_id": "ORD_001", "fill_price": 150.0, "timestamp": "09:30:58"}
        ]
    }, 280, 0.005)
]

# Process each operation with cost tracking
for label, state, tokens_used, cost in trading_operations:
    cost_context = {
        "operation": label,
        "tokens_used": tokens_used,
        "cost": cost,
        "agent_id": "trading_agent_prod",
        "model": "gpt-4-turbo"
    }
    
    perf_diff.snapshot(label, state, cost_context=cost_context)
    print(f"   📸 {label}: {tokens_used} tokens, ${cost:.3f}")

# Performance analysis
print("\n📊 Performance Analysis:")
print("-" * 25)

# Analyze state change efficiency
sequence = perf_diff.get_sequence_changes()
total_cost = 0.0
total_tokens = 0
significant_changes = 0

for step in sequence:
    if 'cost_impact' in step['changes']:
        cost_impact = step['changes']['cost_impact']
        step_cost = cost_impact.get('estimated_cost', 0)
        step_tokens = cost_impact.get('tokens_used', 0)
        
        total_cost += step_cost
        total_tokens += step_tokens
        
        # Count meaningful changes (excluding metadata updates)
        meaningful_changes = len([k for k in step['changes'].keys() 
                                if not k.startswith('market_data.last_update') 
                                and k != 'cost_impact'])
        
        if meaningful_changes > 0:
            significant_changes += 1
            efficiency = step_tokens / step_cost if step_cost > 0 else 0
            print(f"   🔄 {step['from']} → {step['to']}:")
            print(f"      💰 ${step_cost:.3f}, {step_tokens} tokens ({efficiency:.0f} tokens/$)")
            print(f"      📝 {meaningful_changes} meaningful state changes")

# Cost report
cost_report = perf_diff.get_cost_report()
if cost_report:
    print(f"\n💰 Cost Summary:")
    print(f"   Total cost: ${cost_report['total_cost']:.3f}")
    print(f"   Total tokens: {cost_report['total_tokens']:,}")
    print(f"   Average cost per snapshot: ${cost_report['cost_per_snapshot']:.3f}")
    print(f"   Cost efficiency: {cost_report['total_tokens'] / cost_report['total_cost']:.0f} tokens/$")

# Optimization recommendations
print("\n⚡ Optimization Recommendations:")
avg_tokens_per_change = total_tokens / significant_changes if significant_changes > 0 else 0

if avg_tokens_per_change > 800:
    print("   🔥 High token usage per state change")
    print("      • Consider reducing state granularity")
    print("      • Batch related state changes")
    print("      • Use delta compression for large states")

if total_cost / len(trading_operations) > 0.015:
    print("   💸 High cost per operation")
    print("      • Optimize model selection for different operations")
    print("      • Use cheaper models for routine updates")
    print("      • Implement caching for repeated computations")

print("\n🎯 Production Optimization Strategies:")
strategies = {
    "Selective Snapshots": "Only capture state at decision points, not every update",
    "Lazy Diff Computing": "Compute diffs only when needed, not proactively",
    "State Compression": "Compress large state objects before storage",
    "Batch Processing": "Process multiple state changes in batches",
    "Async Operations": "Move diff computation to background threads",
    "Memory Management": "Implement automatic cleanup of old snapshots",
    "Cost Budgeting": "Set per-operation cost limits with fallback strategies"
}

for strategy, description in strategies.items():
    print(f"   📈 {strategy}: {description}")

# Memory efficiency demonstration
print("\n🧠 Memory Efficiency Demo:")
print(f"   📊 {len(perf_diff._snapshots)} snapshots stored")
print(f"   🔍 {len(sequence)} transitions analyzed")
print(f"   ⚡ {significant_changes} transitions with significant state changes")

# Cleanup demonstration
initial_snapshots = len(perf_diff._snapshots)
# In production, you might keep only the last N snapshots
# perf_diff.clear()  # Uncomment to demonstrate cleanup
print(f"   🧹 Cleanup: {initial_snapshots} snapshots currently held; clearing them would free memory for new operations")

## 🚀 Chapter 5: Production Deployment Patterns

Learn how to deploy StateDiff in production environments with proper monitoring, alerting, and integration patterns.
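One safeguard a long-running deployment usually needs is a cap on how many snapshots are retained. The sketch below shows one way to do that with `collections.OrderedDict`, independent of Argentum. `BoundedSnapshotStore` and its methods are hypothetical names used for illustration, and the eviction policy (drop the oldest label first) is an assumption.

```python
import copy
from collections import OrderedDict
from typing import Any, Dict, List


class BoundedSnapshotStore:
    """Keep at most max_snapshots labelled states, evicting the oldest first."""

    def __init__(self, max_snapshots: int = 100) -> None:
        if max_snapshots < 1:
            raise ValueError("max_snapshots must be at least 1")
        self.max_snapshots = max_snapshots
        self._snapshots: "OrderedDict[str, Dict[str, Any]]" = OrderedDict()

    def add(self, label: str, state: Dict[str, Any]) -> List[str]:
        """Store a deep copy of state and return the labels evicted to make room."""
        self._snapshots.pop(label, None)  # re-adding a label makes it the newest
        self._snapshots[label] = copy.deepcopy(state)
        evicted: List[str] = []
        while len(self._snapshots) > self.max_snapshots:
            old_label, _ = self._snapshots.popitem(last=False)
            evicted.append(old_label)
        return evicted

    def labels(self) -> List[str]:
        return list(self._snapshots)


store = BoundedSnapshotStore(max_snapshots=2)
store.add("a", {"step": 1})
store.add("b", {"step": 2})
evicted = store.add("c", {"step": 3})
print(evicted, store.labels())
```

Storing a deep copy means later mutations of the live state cannot silently change an earlier snapshot, at the cost of extra memory per capture.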

In [None]:
print("🚀 Chapter 5: Production Deployment Patterns")
print("="*50)

# Production configuration example
print("🏭 Production Configuration Example")
print("-" * 35)

class ProductionStateDiffManager:
    """Production-ready StateDiff manager with monitoring and optimization."""
    
    def __init__(self, agent_id: str, enable_monitoring: bool = True):
        self.agent_id = agent_id
        self.enable_monitoring = enable_monitoring
        
        # Initialize StateDiff with production settings
        self.diff = StateDiff(track_costs=True)
        
        # Production metrics
        self.metrics = {
            "snapshots_taken": 0,
            "diffs_computed": 0,
            "total_cost": 0.0,
            "avg_snapshot_size": 0,
            "performance_warnings": 0
        }
        
        # Configuration
        self.config = {
            "max_snapshots": 100,          # Keep last 100 snapshots
            "snapshot_size_limit_mb": 10,  # Max 10MB per snapshot
            "cost_alert_threshold": 1.0,   # Alert if cost > $1
            "enable_compression": True,     # Compress large states
            "batch_size": 10              # Process diffs in batches
        }
        
        print(f"✅ Initialized production StateDiff for agent: {agent_id}")
    
    def capture_state(self, label: str, state: dict, operation_context: dict = None):
        """Capture state with production safeguards."""
        try:
            # Estimate state size (default=str keeps non-JSON values such as datetimes from raising)
            state_size_mb = len(json.dumps(state, default=str).encode('utf-8')) / (1024 * 1024)
            
            # Size check
            if state_size_mb > self.config["snapshot_size_limit_mb"]:
                self.metrics["performance_warnings"] += 1
                print(f"⚠️  Large state warning: {state_size_mb:.1f}MB for {label}")
                
                if self.config["enable_compression"]:
                    state = self._compress_state(state)
                    print(f"   🗜️  State compressed for {label}")
            
            # Cost context (copied so the caller's dict is not mutated)
            cost_context = dict(operation_context or {})
            cost_context.update({
                "agent_id": self.agent_id,
                "timestamp": datetime.now().isoformat(),
                "state_size_mb": state_size_mb
            })
            
            # Capture snapshot
            self.diff.snapshot(label, state, cost_context=cost_context)
            self.metrics["snapshots_taken"] += 1
            self.metrics["avg_snapshot_size"] = (
                (self.metrics["avg_snapshot_size"] * (self.metrics["snapshots_taken"] - 1) + state_size_mb) / 
                self.metrics["snapshots_taken"]
            )
            
            # Cleanup old snapshots
            self._cleanup_old_snapshots()
            
            if self.enable_monitoring:
                print(f"   📸 {label}: {state_size_mb:.2f}MB captured")
                
        except Exception as e:
            print(f"❌ Failed to capture state {label}: {e}")
            self.metrics["performance_warnings"] += 1
    
    def analyze_changes(self, from_label: str, to_label: str) -> dict:
        """Analyze changes with performance monitoring."""
        start_time = time.time()
        
        try:
            changes = self.diff.get_changes(from_label, to_label)
            self.metrics["diffs_computed"] += 1
            
            analysis_time = time.time() - start_time
            
            # Performance warning for slow analysis
            if analysis_time > 0.1:  # 100ms threshold
                self.metrics["performance_warnings"] += 1
                print(f"⚠️  Slow diff analysis: {analysis_time:.3f}s for {from_label} → {to_label}")
            
            # Add analysis metadata
            changes["_analysis_metadata"] = {
                "analysis_time_ms": analysis_time * 1000,
                "agent_id": self.agent_id,
                "timestamp": datetime.now().isoformat()
            }
            
            return changes
            
        except Exception as e:
            print(f"❌ Failed to analyze changes {from_label} → {to_label}: {e}")
            return {"error": str(e)}
    
    def get_production_metrics(self) -> dict:
        """Get production monitoring metrics."""
        cost_report = self.diff.get_cost_report()
        
        return {
            "agent_id": self.agent_id,
            "runtime_metrics": self.metrics,
            "cost_metrics": cost_report,
            "configuration": self.config,
            "health_status": self._get_health_status()
        }
    
    def _compress_state(self, state: dict) -> dict:
        """Compress large state objects (simplified implementation)."""
        # In production, use proper compression like gzip or custom algorithms
        compressed = copy.deepcopy(state)
        
        # Example: Compress large arrays
        for key, value in compressed.items():
            if isinstance(value, list) and len(value) > 100:
                # Keep first 50 and last 50 items with summary
                compressed[key] = {
                    "_compressed": True,
                    "first_items": value[:50],
                    "last_items": value[-50:],
                    "total_count": len(value),
                    "compression_ratio": len(value) / 100
                }
        
        return compressed
    
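    def _cleanup_old_snapshots(self):
        """Remove the oldest snapshots to bound memory use.

        Note: this reaches into the private `_snapshots` and `_sequence`
        attributes and assumes `_snapshots` preserves insertion order.
        """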
        if len(self.diff._snapshots) > self.config["max_snapshots"]:
            # Keep only the most recent snapshots
            snapshots_to_remove = len(self.diff._snapshots) - self.config["max_snapshots"]
            
            for label in list(self.diff._snapshots.keys())[:snapshots_to_remove]:
                del self.diff._snapshots[label]
                if label in self.diff._sequence:
                    self.diff._sequence.remove(label)
            
            if self.enable_monitoring:
                print(f"   🧹 Cleaned up {snapshots_to_remove} old snapshots")
    
    def _get_health_status(self) -> str:
        """Determine system health status."""
        if self.metrics["performance_warnings"] > 10:
            return "degraded"
        elif self.metrics["avg_snapshot_size"] > 5.0:  # 5MB average
            return "warning"
        else:
            return "healthy"

# Demonstrate production usage
print("\n🔄 Production Usage Demonstration:")
prod_manager = ProductionStateDiffManager("customer_service_agent_v2")

# Simulate production workload
production_scenario = [
    ("customer_inquiry", {
        "customer_id": "CUST_12345",
        "inquiry_type": "billing_question",
        "message": "Why was I charged twice this month?",
        "sentiment": "frustrated",
        "priority": "high",
        "session_data": {"ip": "192.168.1.100", "user_agent": "Chrome/119.0"}
    }, 450, 0.009),
    
    ("context_retrieved", {
        "customer_id": "CUST_12345",
        "inquiry_type": "billing_question",
        "message": "Why was I charged twice this month?",
        "sentiment": "frustrated",
        "priority": "high",
        "session_data": {"ip": "192.168.1.100", "user_agent": "Chrome/119.0"},
        "customer_context": {
            "subscription_tier": "premium",
            "account_status": "active",
            "recent_charges": [99.99, 99.99],  # Duplicate charge!
            "payment_method": "**** 1234",
            "last_interaction": "2024-10-15"
        }
    }, 1200, 0.024),
    
    ("issue_identified", {
        "customer_id": "CUST_12345",
        "inquiry_type": "billing_question",
        "message": "Why was I charged twice this month?",
        "sentiment": "frustrated",
        "priority": "high",
        "session_data": {"ip": "192.168.1.100", "user_agent": "Chrome/119.0"},
        "customer_context": {
            "subscription_tier": "premium",
            "account_status": "active",
            "recent_charges": [99.99, 99.99],
            "payment_method": "**** 1234",
            "last_interaction": "2024-10-15"
        },
        "issue_analysis": {
            "issue_type": "duplicate_charge",
            "confidence": 0.95,
            "resolution_strategy": "automatic_refund",
            "estimated_resolution_time": "5_minutes"
        }
    }, 800, 0.016),
    
    ("resolution_complete", {
        "customer_id": "CUST_12345",
        "inquiry_type": "billing_question",
        "message": "Why was I charged twice this month?",
        "sentiment": "satisfied",  # Improved!
        "priority": "resolved",
        "session_data": {"ip": "192.168.1.100", "user_agent": "Chrome/119.0"},
        "customer_context": {
            "subscription_tier": "premium",
            "account_status": "active",
            "recent_charges": [99.99],  # Duplicate removed
            "payment_method": "**** 1234",
            "last_interaction": "2024-11-08"
        },
        "issue_analysis": {
            "issue_type": "duplicate_charge",
            "confidence": 0.95,
            "resolution_strategy": "automatic_refund",
            "estimated_resolution_time": "5_minutes"
        },
        "resolution": {
            "action_taken": "refund_processed",
            "refund_amount": 99.99,
            "transaction_id": "REF_789012",
            "customer_notified": True,
            "resolution_time_seconds": 180
        }
    }, 600, 0.012)
]

# Process the scenario
for label, state, tokens, cost in production_scenario:
    operation_context = {
        "operation": label,
        "tokens_used": tokens,
        "cost": cost,
        "model": "gpt-4-turbo"
    }
    
    prod_manager.capture_state(label, state, operation_context)

# Analyze key transitions
print("\n🔍 Production Analysis:")
key_analysis = prod_manager.analyze_changes("customer_inquiry", "resolution_complete")

# Extract key insights
sentiment_change = None
priority_change = None
charges_change = None

for field, change in key_analysis.items():
    if field == "sentiment" and "from" in change:
        sentiment_change = f"{change['from']} → {change['to']}"
    elif field == "priority" and "from" in change:
        priority_change = f"{change['from']} → {change['to']}"
    elif "recent_charges" in field and "removed" in change:
        charges_change = f"Removed duplicate charge: {change['removed']}"

print(f"   🎯 Customer sentiment: {sentiment_change}")
print(f"   📋 Priority status: {priority_change}")
print(f"   💰 Billing correction: {charges_change}")

# Get production metrics
metrics = prod_manager.get_production_metrics()
print(f"\n📊 Production Metrics:")
print(f"   🏥 Health status: {metrics['health_status']}")
print(f"   📸 Snapshots taken: {metrics['runtime_metrics']['snapshots_taken']}")
print(f"   🔍 Diffs computed: {metrics['runtime_metrics']['diffs_computed']}")
print(f"   📏 Avg snapshot size: {metrics['runtime_metrics']['avg_snapshot_size']:.2f}MB")
print(f"   ⚠️  Performance warnings: {metrics['runtime_metrics']['performance_warnings']}")

if metrics['cost_metrics']:
    print(f"   💰 Total cost: ${metrics['cost_metrics']['total_cost']:.3f}")
    print(f"   🔤 Total tokens: {metrics['cost_metrics']['total_tokens']:,}")

print("\n🚀 Production Deployment Checklist:")
checklist = [
    "✅ State size monitoring and compression",
    "✅ Automatic snapshot cleanup",
    "✅ Performance warning system",
    "✅ Cost tracking and reporting",
    "✅ Error handling and recovery",
    "✅ Health status monitoring",
    "✅ Production metrics collection"
]

for item in checklist:
    print(f"   {item}")

print("\n🎯 Ready for production deployment!")
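
The `_cleanup_old_snapshots` helper above trims history by reaching into the private `_snapshots` and `_sequence` attributes of `StateDiff`, which could change between releases. As a hedged alternative, the sketch below keeps a bounded history in a wrapper-owned `OrderedDict` instead. `BoundedSnapshotStore` and its methods are hypothetical names used for illustration only; they are not part of Argentum.

```python
from collections import OrderedDict
import copy


class BoundedSnapshotStore:
    """Hypothetical helper: keeps at most `max_snapshots` labeled states."""

    def __init__(self, max_snapshots: int = 100):
        if max_snapshots < 1:
            raise ValueError("max_snapshots must be at least 1")
        self.max_snapshots = max_snapshots
        self._states = OrderedDict()  # label -> deep-copied state, oldest first

    def add(self, label: str, state: dict) -> list:
        """Store a snapshot and return the labels evicted to make room."""
        # Re-using a label moves it to the newest position.
        self._states.pop(label, None)
        self._states[label] = copy.deepcopy(state)

        evicted = []
        while len(self._states) > self.max_snapshots:
            old_label, _ = self._states.popitem(last=False)  # drop the oldest
            evicted.append(old_label)
        return evicted

    def labels(self) -> list:
        """Return the retained labels, oldest first."""
        return list(self._states)


store = BoundedSnapshotStore(max_snapshots=2)
store.add("customer_inquiry", {"priority": "high"})
store.add("context_retrieved", {"priority": "high"})
evicted = store.add("resolution_complete", {"priority": "resolved"})
print(evicted)         # ['customer_inquiry']
print(store.labels())  # ['context_retrieved', 'resolution_complete']
```

Returning the evicted labels lets the caller log or archive what was dropped, which is useful when a later diff refers to a label that is no longer retained.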

## 🎓 Chapter 6: Best Practices & Common Patterns

Learn proven patterns and avoid common pitfalls when using StateDiff in production systems.

In [None]:
print("🎓 Chapter 6: Best Practices & Common Patterns")
print("="*50)

print("📚 Best Practices Summary:")
print("-" * 25)

best_practices = {
    "📸 Snapshot Strategy": [
        "Take snapshots at decision points, not every state change",
        "Use descriptive labels that indicate the operation or milestone",
        "Include context about why the snapshot was taken",
        "Balance detail with performance - not every field needs tracking"
    ],

    "🔍 Diff Analysis": [
        "Focus on meaningful changes that affect agent behavior",
        "Use sequence analysis to understand progression patterns", 
        "Look for unexpected changes that might indicate bugs",
        "Correlate state changes with performance metrics"
    ],

    "⚡ Performance": [
        "Enable cost tracking only when needed",
        "Implement snapshot cleanup for long-running agents",
        "Use compression for large state objects",
        "Batch diff computations when possible"
    ],

    "🏭 Production": [
        "Monitor snapshot sizes and computation times",
        "Set up alerts for unusual state change patterns",
        "Implement fallback strategies for debugging failures",
        "Export metrics to your monitoring infrastructure"
    ],

    "🐛 Debugging": [
        "Use StateDiff to isolate when problems first appear",
        "Compare successful vs failed execution paths",
        "Track confidence and uncertainty changes over time",
        "Validate state consistency with business rules"
    ]
}

for category, practices in best_practices.items():
    print(f"\n{category}:")
    for practice in practices:
        print(f"   • {practice}")

print("\n❌ Common Pitfalls to Avoid:")
print("-" * 30)

pitfalls = [
    "Taking too many snapshots (hurts performance)",
    "Not cleaning up old snapshots (memory leaks)",
    "Ignoring snapshot size limits (storage bloat)", 
    "Over-analyzing trivial state changes (noise)",
    "Not correlating changes with business outcomes",
    "Forgetting to handle diff computation errors",
    "Not considering the cost of state tracking itself",
    "Using StateDiff as the only debugging tool"
]

for pitfall in pitfalls:
    print(f"   ⚠️  {pitfall}")

print("\n🔧 Integration Patterns:")
print("-" * 22)

integration_patterns = {
    "Testing Pipeline": "Use StateDiff in unit tests to verify agent behavior",
    "CI/CD Integration": "Run state diff analysis on staging deployments",
    "A/B Testing": "Compare state evolution between different agent versions",
    "Monitoring Stack": "Export StateDiff metrics to Prometheus/Grafana",
    "Alerting System": "Trigger alerts on unexpected state change patterns",
    "Debug Dashboard": "Build real-time state visualization tools",
    "Audit Trail": "Use StateDiff for compliance and audit requirements"
}

for pattern, description in integration_patterns.items():
    print(f"   🔗 {pattern}: {description}")

# Demonstrate a complete debugging workflow
print("\n🔬 Complete Debugging Workflow Example:")
print("-" * 40)

debug_diff = StateDiff()

# Scenario: Agent giving inconsistent recommendations
print("🎯 Debugging: Agent giving inconsistent investment recommendations")

debug_states = [
    ("user_query", {
        "query": "Should I invest in tech stocks?",
        "user_profile": {"risk_tolerance": "moderate", "age": 35, "income": 75000},
        "market_data": {"tech_index": 12450, "volatility": 0.24}
    }),
    
    ("analysis_v1", {
        "query": "Should I invest in tech stocks?",
        "user_profile": {"risk_tolerance": "moderate", "age": 35, "income": 75000},
        "market_data": {"tech_index": 12450, "volatility": 0.24},
        "analysis": {
            "recommendation": "buy",
            "confidence": 0.7,
            "reasoning": "Tech stocks align with moderate risk profile"
        }
    }),
    
    ("updated_data", {
        "query": "Should I invest in tech stocks?",
        "user_profile": {"risk_tolerance": "moderate", "age": 35, "income": 75000},
        "market_data": {"tech_index": 12450, "volatility": 0.26},  # Slight increase
        "analysis": {
            "recommendation": "buy",
            "confidence": 0.7,
            "reasoning": "Tech stocks align with moderate risk profile"
        }
    }),
    
    ("analysis_v2", {
        "query": "Should I invest in tech stocks?",
        "user_profile": {"risk_tolerance": "moderate", "age": 35, "income": 75000},
        "market_data": {"tech_index": 12450, "volatility": 0.26},
        "analysis": {
            "recommendation": "hold",  # Changed!
            "confidence": 0.5,          # Dropped!
            "reasoning": "Increased volatility suggests caution"
        }
    })
]

for label, state in debug_states:
    debug_diff.snapshot(label, state)
    print(f"   📸 {label}")

# Debugging analysis
print("\n🔍 Debugging Analysis:")

# Check what changed between consistent and inconsistent recommendations
changes = debug_diff.get_changes("analysis_v1", "analysis_v2")

print("   🔄 Changes detected:")
critical_changes = []
for field, change in changes.items():
    if "recommendation" in field:
        critical_changes.append(f"Recommendation: {change['from']} → {change['to']}")
    elif "confidence" in field:
        confidence_drop = change['from'] - change['to']
        critical_changes.append(f"Confidence dropped by {confidence_drop:.1f}")
    elif "volatility" in field:
        vol_increase = change['to'] - change['from']
        critical_changes.append(f"Volatility increased by {vol_increase:.2f}")

for change in critical_changes:
    print(f"      • {change}")

# Root cause analysis
print("\n🎯 Root Cause Analysis:")
# Compare volatility at the two analysis points: analysis_v1 (index 1) and analysis_v2 (index 3)
vol_before = debug_states[1][1]["market_data"]["volatility"]
vol_after = debug_states[3][1]["market_data"]["volatility"]
volatility_change = vol_after - vol_before
print(f"   📊 Volatility change: {volatility_change:+.2f} (from {vol_before} to {vol_after})")
print("   🤖 Agent response: Recommendation changed from 'buy' to 'hold'")
print(f"   ❓ Question: Is a {abs(volatility_change):.2f} volatility change significant enough to change the recommendation?")

print("\n💡 Debugging Insights:")
print("   ✅ State tracking revealed the exact trigger point")
print("   ✅ Minimal data change caused major recommendation shift")
print("   ✅ Confidence drop indicates agent uncertainty")
print("   🔧 Recommendation: Implement volatility thresholds and confidence ranges")

print("\n🏆 StateDiff Mastery Achieved!")
print("You now have the skills to:")
mastery_skills = [
    "Track agent state evolution with precision",
    "Debug complex agent behaviors systematically", 
    "Optimize performance in production environments",
    "Implement monitoring and alerting systems",
    "Follow best practices for state management",
    "Integrate StateDiff into your development workflow"
]

for skill in mastery_skills:
    print(f"   🎯 {skill}")

print("\n🚀 Ready to build better, more debuggable AI agents!")
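
The debugging workflow above ends by recommending volatility thresholds so that a 0.02 move does not flip the recommendation. One way to sketch that idea is a hysteresis band: the answer only changes once volatility leaves a dead zone. The function name `stable_recommendation` and the threshold values 0.30 and 0.22 are hypothetical, chosen only for illustration; real thresholds would have to come from your own risk model.

```python
def stable_recommendation(previous: str, volatility: float,
                          hold_above: float = 0.30,
                          buy_below: float = 0.22) -> str:
    """Hypothetical hysteresis rule: switch only when volatility leaves the band."""
    if volatility >= hold_above:
        return "hold"
    if volatility <= buy_below:
        return "buy"
    # Strictly between buy_below and hold_above: keep the previous answer.
    return previous


print(stable_recommendation("buy", 0.24))   # buy
print(stable_recommendation("buy", 0.26))   # buy (the 0.02 move stays inside the band)
print(stable_recommendation("buy", 0.31))   # hold
print(stable_recommendation("hold", 0.26))  # hold
```

With this rule, the `analysis_v1` to `analysis_v2` transition in the example would keep the "buy" recommendation, and a snapshot diff would show only the volatility field changing.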

## 🎯 Summary & Next Steps

Congratulations! You've completed the comprehensive StateDiff tutorial. Here's what you've learned:

### 🏆 Key Concepts Mastered
- **State Snapshots**: Capturing agent states at critical decision points
- **Diff Analysis**: Understanding complex state changes and their implications
- **Debugging Workflows**: Systematic approaches to identifying agent problems
- **Performance Optimization**: Cost tracking and efficient state management
- **Production Patterns**: Real-world deployment strategies and monitoring
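
To make the snapshot and diff concepts concrete, here is a minimal, library-independent sketch of a nested-dictionary diff. It is not Argentum's implementation: `flatten` and `simple_diff` are hypothetical helpers, and the `added` / `removed` / `changed` output shape is an assumption chosen for readability.

```python
def flatten(state: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {'a.b': value}; lists and empty dicts are leaves."""
    flat = {}
    for key, value in state.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def simple_diff(before: dict, after: dict) -> dict:
    """Return added, removed, and changed leaf paths between two states."""
    old, new = flatten(before), flatten(after)
    return {
        "added": {p: new[p] for p in new.keys() - old.keys()},
        "removed": {p: old[p] for p in old.keys() - new.keys()},
        "changed": {
            p: {"from": old[p], "to": new[p]}
            for p in old.keys() & new.keys()
            if old[p] != new[p]
        },
    }


before = {"sentiment": "frustrated",
          "customer_context": {"recent_charges": [99.99, 99.99]}}
after = {"sentiment": "satisfied",
         "customer_context": {"recent_charges": [99.99]},
         "resolution": {"refund_amount": 99.99}}

result = simple_diff(before, after)
print(result["changed"]["sentiment"])  # {'from': 'frustrated', 'to': 'satisfied'}
print(result["added"])                 # {'resolution.refund_amount': 99.99}
```

Treating lists as leaf values keeps the sketch short; a fuller diff would compare list elements individually.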

### 🛠️ Practical Skills Developed
- Tracking confidence oscillations and decision inconsistencies
- Detecting memory leaks and performance degradation
- Identifying state corruption and business logic violations
- Implementing production-ready monitoring and alerting
- Following best practices for scalable state tracking
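
As one illustration of the leak-detection skill listed above, the sketch below records an approximate size for each snapshot and flags sustained growth. `snapshot_size_bytes` and `is_growing` are hypothetical helpers, the window of three steps is an arbitrary choice, and the size estimate reuses the JSON-length approach from the production manager (with `default=str` added here so non-serializable values do not raise).

```python
import json


def snapshot_size_bytes(state: dict) -> int:
    """Approximate size as the UTF-8 length of the JSON encoding."""
    return len(json.dumps(state, default=str).encode("utf-8"))


def is_growing(sizes: list, window: int = 3) -> bool:
    """True if each of the last `window` steps strictly increased in size."""
    if len(sizes) < window + 1:
        return False
    recent = sizes[-(window + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


history = []
sizes = []
for step in range(5):
    # Simulated leak: the agent appends to its history and never prunes it.
    history.append({"step": step, "note": "x" * 50})
    sizes.append(snapshot_size_bytes({"history": history}))

print(is_growing(sizes))             # True
print(is_growing([10, 12, 11, 13]))  # False
```

A strict-increase check is deliberately simple; noisy workloads may need a slope or moving-average test instead.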

### 📚 Continue Your Journey
1. **03_context_decay.ipynb** - Learn advanced memory management strategies
2. **06_cost_alerts.ipynb** - Master cost monitoring and optimization
3. **Production Integration** - Apply these concepts to your own agent systems

### 🤝 Community & Resources
- **GitHub**: [https://github.com/MarsZDF/argentum](https://github.com/MarsZDF/argentum)
- **Documentation**: [https://argentum-agent.readthedocs.io](https://argentum-agent.readthedocs.io)
- **Report Issues**: Use GitHub Issues for bugs and feature requests

---

*Now you're equipped to build agents that are not just intelligent, but also transparent, debuggable, and production-ready! 🚀*