---
title: "Action Validation & Error Recovery"
description: "Handling invalid actions gracefully in a multiplayer world"
author: "Eric Zou"
date: "12/13/2025"
categories:
  - Agents
  - Infrastructure
  - Robustness
---


# When Actions Go Wrong

In post 017, we set up turn-based and time-based synchronization. Agents can move and speak, and everything works... until it doesn't.

What happens when:
- An AI agent outputs `[MOVE: DIAGONAL]` (not a valid direction)?
- A human types `move left` instead of `[MOVE: LEFT]`?
- An agent tries to move outside the grid boundaries?
- The LLM returns malformed JSON or empty responses?
- Network latency causes a human's action to arrive late?

In a real multiplayer game, you can't just crash. You need **graceful error handling** and **recovery strategies**.

This post explores how to make our 2D chat world robust against invalid actions.


## The Problem Space

Invalid actions can come from multiple sources:

1. **AI Agents**: LLMs are creative—sometimes too creative. They might invent new commands or format things incorrectly.
2. **Human Players**: Typos, misunderstandings, or testing edge cases.
3. **System Failures**: Network issues, API timeouts, parsing errors.

We need to handle all of these without breaking the simulation.
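One case from the opening list, a move past the grid edge, is handled silently by the `Agent.move` method in the next cell, which clamps coordinates to the grid. Here is a minimal sketch, assuming you wanted the caller to know when clamping happened. The `try_move` helper is hypothetical and not part of the notebook.

```python
from typing import Tuple

GRID_SIZE = 20  # same grid size the post uses


def try_move(x: int, y: int, dx: int, dy: int,
             grid_size: int = GRID_SIZE) -> Tuple[int, int, bool]:
    """Return the clamped position plus a flag saying whether clamping occurred."""
    target_x, target_y = x + dx, y + dy
    new_x = max(0, min(grid_size - 1, target_x))
    new_y = max(0, min(grid_size - 1, target_y))
    was_clamped = (new_x, new_y) != (target_x, target_y)
    return new_x, new_y, was_clamped


# Moving right from the right edge is clamped; moving left is not.
print(try_move(19, 5, 1, 0))   # (19, 5, True)
print(try_move(19, 5, -1, 0))  # (18, 5, False)
```

The flag could feed a log entry or a user-facing hint instead of leaving the agent wondering why it did not move.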


In [1]:
import os
import re
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
from enum import Enum
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv("../../.env")
client = OpenAI()

GRID_SIZE = 20

@dataclass
class Agent:
    name: str
    x: int
    y: int
    color: str
    history: list = field(default_factory=list)
    
    def move(self, dx, dy):
        self.x = max(0, min(GRID_SIZE-1, self.x + dx))
        self.y = max(0, min(GRID_SIZE-1, self.y + dy))

class ActionError(Exception):
    """Base exception for action-related errors"""
    pass

class InvalidDirectionError(ActionError):
    """Raised when a move direction is invalid"""
    pass

class ParseError(ActionError):
    """Raised when action cannot be parsed"""
    pass


## Validating Actions

First, let's create a robust action parser that validates inputs before executing them.


In [2]:
VALID_DIRECTIONS = {"UP", "DOWN", "LEFT", "RIGHT", "NORTH", "SOUTH", "EAST", "WEST"}

def parse_action(content: str) -> Tuple[Optional[str], Optional[str]]:
    """
    Parse an action string, extracting move direction and message.
    Returns (direction, message) where either can be None.
    Raises ParseError if content is completely invalid.
    """
    if not content or not content.strip():
        raise ParseError("Empty action content")
    
    content = content.strip()
    direction = None
    message = None
    
    # Try to find [MOVE: DIRECTION] pattern
    move_match = re.search(r'\[MOVE:\s*(\w+)\]', content, re.IGNORECASE)
    if move_match:
        raw_direction = move_match.group(1).upper()
        # Normalize direction names
        direction_map = {
            "NORTH": "UP", "SOUTH": "DOWN", 
            "EAST": "RIGHT", "WEST": "LEFT"
        }
        direction = direction_map.get(raw_direction, raw_direction)
        
        # Validate direction
        if direction not in VALID_DIRECTIONS:
            raise InvalidDirectionError(f"Invalid direction: {raw_direction}. Valid: {VALID_DIRECTIONS}")
    
    # Extract message (remove MOVE command if present)
    message = re.sub(r'\[MOVE:\s*\w+\]', '', content, flags=re.IGNORECASE).strip()
    if not message:
        message = None
    
    # If no move and no message, that's an error
    if not direction and not message:
        raise ParseError(f"Action contains neither valid move nor message: {content}")
    
    return direction, message

# Test the parser
test_cases = [
    "[MOVE: UP] Hello world",
    "[MOVE: LEFT]",
    "Just a message",
    "[MOVE: DIAGONAL] Invalid!",
    "[MOVE: NORTH] Going north",
    "",
    "   ",
]

print("Testing action parser:")
for test in test_cases:
    try:
        direction, message = parse_action(test)
        print(f"  '{test}' -> direction={direction}, message={message}")
    except ActionError as e:
        print(f"  '{test}' -> ERROR: {type(e).__name__}: {e}")


Testing action parser:
  '[MOVE: UP] Hello world' -> direction=UP, message=Hello world
  '[MOVE: LEFT]' -> direction=LEFT, message=None
  'Just a message' -> direction=None, message=Just a message
  '[MOVE: DIAGONAL] Invalid!' -> ERROR: InvalidDirectionError: Invalid direction: DIAGONAL. Valid: {'EAST', 'NORTH', 'LEFT', 'SOUTH', 'DOWN', 'UP', 'WEST', 'RIGHT'}
  '[MOVE: NORTH] Going north' -> direction=UP, message=Going north
  '' -> ERROR: ParseError: Empty action content
  '   ' -> ERROR: ParseError: Empty action content
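The tests above don't cover a response with more than one `[MOVE: ...]` tag. `parse_action` reads only the first tag (via `re.search`) while `re.sub` strips all of them, so a second move would be dropped silently. A small sketch of how that could be detected, assuming the same tag pattern; `find_move_tags` is a hypothetical helper:

```python
import re
from typing import List

# Same tag pattern that parse_action uses
MOVE_PATTERN = re.compile(r'\[MOVE:\s*(\w+)\]', re.IGNORECASE)


def find_move_tags(content: str) -> List[str]:
    """Return every direction token found in [MOVE: ...] tags, upper-cased, in order."""
    return [token.upper() for token in MOVE_PATTERN.findall(content)]


tags = find_move_tags("[MOVE: UP] hi [move: left] bye")
print(tags)  # ['UP', 'LEFT']
if len(tags) > 1:
    print(f"Ambiguous action: {len(tags)} move tags found")
```

Whether to reject such an action, execute only the first move, or execute all of them is a design choice the post does not settle.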


## Error Recovery Strategies

When an action fails, we have several recovery options:

1. **Silent Failure**: Ignore the invalid action (agent does nothing)
2. **Default Action**: Fall back to a safe default (e.g., stay in place, say nothing)
3. **Retry**: Ask the agent to try again (for AI agents)
4. **Partial Execution**: Execute the valid parts, skip the invalid parts
5. **Log & Continue**: Record the error but don't break the simulation
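The next cell selects a strategy with a plain string (`"silent"`, `"partial"`, or `"default"`), so a typo such as `"parital"` would quietly fall through to an `else` branch. One possible guard, sketched here as an assumption rather than the notebook's approach, is an `Enum`. The names `RecoveryStrategy` and `to_strategy` are hypothetical. Retry and logging from the list above are handled by separate functions later in the post, so they are left out.

```python
from enum import Enum


class RecoveryStrategy(Enum):
    """Hypothetical enum mirroring the three strategy strings used in the post."""
    SILENT = "silent"
    PARTIAL = "partial"
    DEFAULT = "default"


def to_strategy(name: str) -> RecoveryStrategy:
    """Convert a string to a RecoveryStrategy, failing loudly on typos."""
    try:
        return RecoveryStrategy(name.lower())
    except ValueError:
        valid = [s.value for s in RecoveryStrategy]
        raise ValueError(
            f"Unknown recovery strategy {name!r}; expected one of {valid}"
        ) from None


print(to_strategy("partial"))  # RecoveryStrategy.PARTIAL
```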


In [3]:
@dataclass
class ActionResult:
    """Result of attempting to execute an action"""
    success: bool
    direction: Optional[str] = None
    message: Optional[str] = None
    error: Optional[str] = None
    error_type: Optional[str] = None

def execute_action_safely(agent: Agent, content: str, shared_transcript: List[str], 
                          recovery_strategy: str = "partial") -> ActionResult:
    """
    Safely execute an action with error recovery.
    
    recovery_strategy options:
    - "silent": Ignore errors, do nothing
    - "partial": Execute valid parts, skip invalid parts
    - "default": Fall back to safe defaults
    """
    try:
        direction, message = parse_action(content)
        
        # Execute move if valid; parse_action has already normalized
        # NORTH/SOUTH/EAST/WEST to UP/DOWN/RIGHT/LEFT
        if direction:
            deltas = {"UP": (0, 1), "DOWN": (0, -1), "LEFT": (-1, 0), "RIGHT": (1, 0)}
            agent.move(*deltas[direction])
        
        # Execute message if valid
        if message:
            shared_transcript.append(f"{agent.name}: {message}")
        
        return ActionResult(
            success=True,
            direction=direction,
            message=message
        )
        
    except InvalidDirectionError as e:
        if recovery_strategy == "silent":
            return ActionResult(success=False, error=str(e), error_type="InvalidDirection")
        elif recovery_strategy == "partial":
            # Try to extract and execute message even if move failed
            message = re.sub(r'\[MOVE:\s*\w+\]', '', content, flags=re.IGNORECASE).strip()
            if message:
                shared_transcript.append(f"{agent.name}: {message}")
            return ActionResult(success=False, direction=None, message=message, 
                              error=str(e), error_type="InvalidDirection")
        else:  # default
            return ActionResult(success=False, error=str(e), error_type="InvalidDirection")
            
    except ParseError as e:
        if recovery_strategy == "silent":
            return ActionResult(success=False, error=str(e), error_type="ParseError")
        elif recovery_strategy == "default":
            # Default: agent says they're confused
            shared_transcript.append(f"{agent.name}: [confused] I'm not sure what to do.")
            return ActionResult(success=False, error=str(e), error_type="ParseError", 
                              message="[confused] I'm not sure what to do.")
        else:
            return ActionResult(success=False, error=str(e), error_type="ParseError")
    
    except Exception as e:
        # Catch-all for unexpected errors
        return ActionResult(success=False, error=f"Unexpected error: {str(e)}", 
                          error_type="UnexpectedError")

# Test error recovery
print("Testing error recovery:")
test_agent = Agent("TestBot", 10, 10, "blue")
transcript = []

test_actions = [
    "[MOVE: UP] Hello!",
    "[MOVE: DIAGONAL] Invalid direction",
    "Just a message",
    "[MOVE: INVALID]",
    "",  # Empty
]

for action in test_actions:
    result = execute_action_safely(test_agent, action, transcript, recovery_strategy="partial")
    print(f"\nAction: '{action}'")
    print(f"  Result: {result}")
    print(f"  Agent position: ({test_agent.x}, {test_agent.y})")
    print(f"  Transcript length: {len(transcript)}")


Testing error recovery:

Action: '[MOVE: UP] Hello!'
  Result: ActionResult(success=True, direction='UP', message='Hello!', error=None, error_type=None)
  Agent position: (10, 11)
  Transcript length: 1

Action: '[MOVE: DIAGONAL] Invalid direction'
  Result: ActionResult(success=False, direction=None, message='Invalid direction', error="Invalid direction: DIAGONAL. Valid: {'EAST', 'NORTH', 'LEFT', 'SOUTH', 'DOWN', 'UP', 'WEST', 'RIGHT'}", error_type='InvalidDirection')
  Agent position: (10, 11)
  Transcript length: 2

Action: 'Just a message'
  Result: ActionResult(success=True, direction=None, message='Just a message', error=None, error_type=None)
  Agent position: (10, 11)
  Transcript length: 3

Action: '[MOVE: INVALID]'
  Result: ActionResult(success=False, direction=None, message='', error="Invalid direction: INVALID. Valid: {'EAST', 'NORTH', 'LEFT', 'SOUTH', 'DOWN', 'UP', 'WEST', 'RIGHT'}", error_type='InvalidDirection')
  Agent position: (10, 11)
  Transcript length: 3

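Action: ''
  Result: ActionResult(success=False, direction=None, message=None, error='Empty action content', error_type='ParseError')
  Agent position: (10, 11)
  Transcript length: 3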

In [9]:
def get_agent_action_safe(agent: Agent, all_agents: List[Agent], shared_transcript: List[str], 
                          max_retries: int = 1) -> ActionResult:
    """
    Get an action from an agent with retry logic for invalid responses.
    """
    others = [a for a in all_agents if a != agent]
    others_loc = "\n".join([f"- {a.name}: ({a.x}, {a.y})" for a in others])
    
    system_prompt = f"""
You have just joined an online multiplayer chatroom as an avatar in a 2D grid. Discuss any topic, including those beyond the grid.

You are {agent.name}, positioned at ({agent.x}, {agent.y}) in a 20x20 grid.

Other avatars currently visible:
{others_loc}

Recent chat messages:
{chr(10).join(shared_transcript[-3:]) if shared_transcript else "No messages yet."}

You can do BOTH:
1. Move your avatar using [MOVE: DIRECTION] (UP, DOWN, LEFT, RIGHT)
2. Chat about anything - the grid, your position, or any topic you want

You can move and speak in the same turn. Format: [MOVE: DIRECTION] followed by your message, or just speak without moving.

Keep your response short (1-2 sentences).
"""
    
    for attempt in range(max_retries + 1):
        try:
            response = client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "system", "content": system_prompt}]
            )
            # content may be None when no text is returned; treat it as empty
            content = (response.choices[0].message.content or "").strip()
            
            if not content:
                if attempt < max_retries:
                    continue  # Retry on empty response
                return ActionResult(success=False, error="Empty response from LLM", 
                                  error_type="EmptyResponse")
            
            # Try to execute the action
            result = execute_action_safely(agent, content, shared_transcript, 
                                         recovery_strategy="partial")
            return result
            
        except Exception as e:
            if attempt < max_retries:
                continue  # Retry on exception
            return ActionResult(success=False, error=f"LLM API error: {str(e)}", 
                              error_type="APIError")
    
    return ActionResult(success=False, error="Max retries exceeded", error_type="MaxRetries")

print("Testing safe agent action:")
test_agents = [Agent("Alice", 5, 5, "red"), Agent("Bob", 15, 15, "blue")]
test_transcript = []
# Test with a real agent
for i in range(10):
    result = get_agent_action_safe(test_agents[0], test_agents, test_transcript)
    print(f"Result: {result}")
    print(f"Alice position: ({test_agents[0].x}, {test_agents[0].y})")

    result = get_agent_action_safe(test_agents[1], test_agents, test_transcript)
    print(f"Result: {result}")
    print(f"Bob position: ({test_agents[1].x}, {test_agents[1].y})")

print(f"Transcript: {test_transcript}")


Testing safe agent action:
Result: ActionResult(success=True, direction=None, message="Hey everyone! It's Alice here at position (5, 5) on the grid. What's up, Bob?", error=None, error_type=None)
Alice position: (5, 5)
Result: ActionResult(success=True, direction='LEFT', message="Hey Alice! I'm at (15, 15) right now, taking a little stroll. How's the view from your corner of the grid?", error=None, error_type=None)
Bob position: (14, 15)
Result: ActionResult(success=True, direction='RIGHT', message="The view's pretty good here, Bob! I might head your way to check out your corner of the grid. What's the weather like over there in column 15?", error=None, error_type=None)
Alice position: (6, 5)
Result: ActionResult(success=True, direction='LEFT', message="Hey Alice, I'm now at (14, 15), and it's a lovely day here, clear skies and perfect for grid exploration. Can't wait to see you around!", error=None, error_type=None)
Bob position: (13, 15)
Result: ActionResult(success=True, direction='

## Error Logging & Monitoring

For a production system, we need to track errors to understand what's going wrong:


In [5]:
@dataclass
class ErrorLog:
    """Log entry for action errors"""
    agent_name: str
    timestamp: float
    action_content: str
    error_type: str
    error_message: str
    recovery_strategy: str
    final_result: ActionResult

class ActionLogger:
    """Tracks action errors for monitoring and debugging"""
    
    def __init__(self):
        self.errors: List[ErrorLog] = []
        self.stats = {
            "total_actions": 0,
            "successful_actions": 0,
            "failed_actions": 0,
            "error_types": {}
        }
    
    def log_action(self, agent: Agent, content: str, result: ActionResult, 
                   recovery_strategy: str, timestamp: Optional[float] = None):
        """Log an action attempt"""
        import time
        if timestamp is None:
            timestamp = time.time()
        
        self.stats["total_actions"] += 1
        
        if result.success:
            self.stats["successful_actions"] += 1
        else:
            self.stats["failed_actions"] += 1
            error_type = result.error_type or "Unknown"
            self.stats["error_types"][error_type] = self.stats["error_types"].get(error_type, 0) + 1
            
            self.errors.append(ErrorLog(
                agent_name=agent.name,
                timestamp=timestamp,
                action_content=content,
                error_type=error_type,
                error_message=result.error or "",
                recovery_strategy=recovery_strategy,
                final_result=result
            ))
    
    def get_error_summary(self) -> dict:
        """Get summary statistics of errors"""
        return {
            "total_actions": self.stats["total_actions"],
            "success_rate": (self.stats["successful_actions"] / self.stats["total_actions"] 
                           if self.stats["total_actions"] > 0 else 0),
            "error_types": self.stats["error_types"],
            "recent_errors": [e.error_type for e in self.errors[-10:]]
        }
    
    def print_summary(self):
        """Print a human-readable error summary"""
        summary = self.get_error_summary()
        print("=== Action Error Summary ===")
        print(f"Total actions: {summary['total_actions']}")
        print(f"Success rate: {summary['success_rate']:.1%}")
        print(f"Error types: {summary['error_types']}")
        if self.errors:
            print(f"\nRecent errors ({len(self.errors)} total):")
            for error in self.errors[-5:]:
                print(f"  {error.agent_name}: {error.error_type} - {error.error_message[:50]}")

# Test logging
logger = ActionLogger()

test_agent = Agent("TestBot", 10, 10, "blue")
test_transcript = []

test_actions = [
    "[MOVE: UP] Hello!",
    "[MOVE: DIAGONAL] Invalid",
    "Just chatting",
    "[MOVE: INVALID]",
    "[MOVE: RIGHT] Moving right",
]

for action in test_actions:
    result = execute_action_safely(test_agent, action, test_transcript, 
                                   recovery_strategy="partial")
    logger.log_action(test_agent, action, result, "partial")

logger.print_summary()


=== Action Error Summary ===
Total actions: 5
Success rate: 60.0%
Error types: {'InvalidDirection': 2}

Recent errors (2 total):
  TestBot: InvalidDirection - Invalid direction: DIAGONAL. Valid: {'EAST', 'NORT
  TestBot: InvalidDirection - Invalid direction: INVALID. Valid: {'EAST', 'NORTH
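The summary above aggregates across all agents. To see whether one agent is responsible for most failures, the log entries can be grouped by name. This is a sketch under assumptions: `LoggedError` is a trimmed-down stand-in for `ErrorLog` so the example runs on its own, and `errors_by_agent` is a hypothetical helper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LoggedError:
    """Minimal stand-in for ErrorLog with only the fields used here."""
    agent_name: str
    error_type: str


def errors_by_agent(errors: List[LoggedError]) -> Dict[str, Counter]:
    """Group error-type counts under each agent's name."""
    grouped: Dict[str, Counter] = {}
    for entry in errors:
        grouped.setdefault(entry.agent_name, Counter())[entry.error_type] += 1
    return grouped


sample = [
    LoggedError("Alice", "InvalidDirection"),
    LoggedError("Alice", "InvalidDirection"),
    LoggedError("Bob", "ParseError"),
]
for name, counts in errors_by_agent(sample).items():
    print(name, dict(counts))
```

With the real logger, the same function could be called on `logger.errors`, since `ErrorLog` also has `agent_name` and `error_type` fields.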


## Handling Human Input

Humans are even more unpredictable than AI. We need flexible parsing for natural language:


In [6]:
def parse_human_input(user_input: str) -> Tuple[Optional[str], Optional[str]]:
    """
    Parse human input with flexible natural language understanding.
    Handles both structured commands and natural language.
    """
    if not user_input or not user_input.strip():
        return None, None
    
    user_input = user_input.strip()
    direction = None
    message = None
    
    # First, try structured format [MOVE: DIRECTION]
    move_match = re.search(r'\[MOVE:\s*(\w+)\]', user_input, re.IGNORECASE)
    if move_match:
        raw_direction = move_match.group(1).upper()
        direction_map = {
            "NORTH": "UP", "SOUTH": "DOWN", 
            "EAST": "RIGHT", "WEST": "LEFT"
        }
        direction = direction_map.get(raw_direction, raw_direction)
        if direction in VALID_DIRECTIONS:
            message = re.sub(r'\[MOVE:\s*\w+\]', '', user_input, flags=re.IGNORECASE).strip()
            return direction, message if message else None
    
    # Try natural language patterns; \b anchors each phrase to whole words,
    # so inputs like "ago up" or "move leftovers" are not read as moves
    natural_patterns = [
        (r'\bmove\s+(up|north)\b', 'UP'),
        (r'\bmove\s+(down|south)\b', 'DOWN'),
        (r'\bmove\s+(left|west)\b', 'LEFT'),
        (r'\bmove\s+(right|east)\b', 'RIGHT'),
        (r'\bgo\s+(up|north)\b', 'UP'),
        (r'\bgo\s+(down|south)\b', 'DOWN'),
        (r'\bgo\s+(left|west)\b', 'LEFT'),
        (r'\bgo\s+(right|east)\b', 'RIGHT'),
        (r'\bhead\s+(up|north)\b', 'UP'),
        (r'\bhead\s+(down|south)\b', 'DOWN'),
        (r'\bhead\s+(left|west)\b', 'LEFT'),
        (r'\bhead\s+(right|east)\b', 'RIGHT'),
    ]
    
    for pattern, dir_val in natural_patterns:
        match = re.search(pattern, user_input, re.IGNORECASE)
        if match:
            direction = dir_val
            # Remove the movement phrase from message
            message = re.sub(pattern, '', user_input, flags=re.IGNORECASE).strip()
            return direction, message if message else None
    
    # No movement found, treat entire input as message
    return None, user_input

# Test human input parsing
human_inputs = [
    "[MOVE: UP] Hello everyone!",
    "move left",
    "go right and say hi",
    "I'm just chatting here",
    "head north to explore",
    "move diagonal",  # Invalid, should be treated as message
]

print("Testing human input parsing:")
for inp in human_inputs:
    direction, message = parse_human_input(inp)
    print(f"  '{inp}'")
    print(f"    -> direction={direction}, message={message}")


Testing human input parsing:
  '[MOVE: UP] Hello everyone!'
    -> direction=UP, message=Hello everyone!
  'move left'
    -> direction=LEFT, message=None
  'go right and say hi'
    -> direction=RIGHT, message=and say hi
  'I'm just chatting here'
    -> direction=None, message=I'm just chatting here
  'head north to explore'
    -> direction=UP, message=to explore
  'move diagonal'
    -> direction=None, message=move diagonal
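Typos are one of the human failure modes listed earlier, and the parser above treats a misspelled direction as a plain chat message. One option, sketched here as an assumption and not part of the notebook, is to offer a "did you mean" hint using the standard library's `difflib`. The helper name `suggest_direction` and the `0.6` cutoff are illustrative choices.

```python
import difflib
from typing import Optional

CANONICAL_DIRECTIONS = ["UP", "DOWN", "LEFT", "RIGHT"]


def suggest_direction(token: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the closest canonical direction to token, or None if nothing is close."""
    matches = difflib.get_close_matches(
        token.upper(), CANONICAL_DIRECTIONS, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


print(suggest_direction("lef"))    # LEFT
print(suggest_direction("rigth"))  # RIGHT
print(suggest_direction("xyz"))    # None
```

A suggestion is safer shown to the player as a prompt than executed automatically, since a wrong guess would move the avatar somewhere the player did not intend.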


## Putting It All Together

Let's create a robust simulation round that handles errors gracefully:


In [None]:
def run_robust_round(agents: List[Agent], shared_transcript: List[str], 
                    logger: Optional[ActionLogger] = None, is_human: Optional[dict] = None):
    """
    Run a simulation round with robust error handling.
    
    is_human: dict mapping agent names to boolean (True if human, False if AI)
    """
    if is_human is None:
        is_human = {agent.name: False for agent in agents}
    
    if logger is None:
        logger = ActionLogger()
    
    for agent in agents:
        try:
            if is_human.get(agent.name, False):
                # For humans, we'd get input from UI/network
                # Here we simulate with a test input
                user_input = "[MOVE: UP] Hello from human!"
                direction, message = parse_human_input(user_input)
                
                # Execute safely
                if direction:
                    if direction == "UP": agent.move(0, 1)
                    elif direction == "DOWN": agent.move(0, -1)
                    elif direction == "LEFT": agent.move(-1, 0)
                    elif direction == "RIGHT": agent.move(1, 0)
                
                if message:
                    shared_transcript.append(f"{agent.name}: {message}")
                
                result = ActionResult(success=True, direction=direction, message=message)
                logger.log_action(agent, user_input, result, "human_input")
            else:
                # AI agent
                result = get_agent_action_safe(agent, agents, shared_transcript, max_retries=1)
                logger.log_action(agent, "LLM response", result, "partial")
                
        except Exception as e:
            # Catch-all for unexpected errors
            result = ActionResult(success=False, error=str(e), error_type="UnexpectedError")
            logger.log_action(agent, "Unknown", result, "silent")
            print(f"Warning: Unexpected error for {agent.name}: {e}")

# Run a test simulation
print("=== Robust Simulation Test ===\n")

test_agents = [
    Agent("Alice", 5, 5, "red"),
    Agent("Bob", 15, 15, "blue"),
    Agent("Charlie", 10, 10, "green")
]

test_transcript = []
test_logger = ActionLogger()

print("Running 5 rounds with error handling...")
for round_num in range(5):
    print(f"\nRound {round_num + 1}:")
    run_robust_round(test_agents, test_transcript, test_logger)
    print(f"  Positions: ", end="")
    for a in test_agents:
        print(f"{a.name} ({a.x}, {a.y}) ", end="")
    print(f"\n  Total messages so far: {len(test_transcript)}")

print("\n" + "="*50)
test_logger.print_summary()
print(f"\nFinal transcript ({len(test_transcript)} messages):")
for msg in test_transcript[-10:]:  # Show last 10
    print(f"  {msg}")


=== Robust Simulation Test ===

Running 5 rounds with error handling...

Round 1:
  Positions: Alice (6, 5) Bob (14, 15) Charlie (10, 10) 
  Total messages so far: 3

Round 2:
  Positions: Alice (7, 5) Bob (13, 15) Charlie (10, 11) 
  Total messages so far: 6

Round 3:
  Positions: Alice (6, 5) Bob (14, 15) Charlie (10, 12) 
  Total messages so far: 9

Round 4:
  Positions: Alice (7, 5) Bob (14, 16) Charlie (10, 13) 
  Total messages so far: 12

Round 5:
  Positions: Alice (6, 5) Bob (13, 16) Charlie (11, 13) 
  Total messages so far: 15

=== Action Error Summary ===
Total actions: 15
Success rate: 100.0%
Error types: {}

Final transcript (15 messages):
  Charlie: I think it's interesting to consider the grid as a representation of our world - full of possibilities within boundaries. What do you guys think lies beyond our current grid world?
  Alice: I'm at (6, 5) now! The grid feels like a canvas—there's potential for new stories beyond just our moves. What if the grid could evolve with us?
  B

## Best Practices

Based on what we've built, here are key principles for robust action handling:

1. **Validate Early**: Check inputs before execution, not after
2. **Fail Gracefully**: Never crash the simulation—always have a fallback
3. **Log Everything**: Track errors to understand patterns and improve
4. **Be Flexible**: Accept multiple input formats (structured and natural language)
5. **Partial Success**: Execute valid parts of actions even if some parts fail
6. **Retry Strategically**: For AI agents, retry on transient failures but not on parse errors
7. **User Feedback**: In a real game, inform users when their actions fail
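Principle 6 can be made concrete. `get_agent_action_safe` in this post retries immediately on any exception. The sketch below is an alternative under stated assumptions: it retries only on a designated transient error and waits longer between attempts. `TransientError` and `call_with_retry` are hypothetical names, and which real exceptions count as transient depends on the client library in use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def call_with_retry(fn: Callable[[], T], max_retries: int = 2,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying only on TransientError with exponential backoff."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # out of retries: let the caller decide what to do
            sleep(base_delay * (2 ** attempt))


# Demo: a call that fails twice, then succeeds.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("simulated timeout")
    return "[MOVE: UP] Hello!"


delays = []  # record the waits instead of actually sleeping
print(call_with_retry(flaky, max_retries=2, sleep=delays.append))
print(delays)  # [0.5, 1.0]
```

Any other exception, such as a parse error, propagates on the first attempt, which matches the "not on parse errors" half of the principle.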

## Conclusion

Error handling might not be the most exciting feature, but it's essential for a production system. When humans and AI interact in real-time, things will go wrong. The question isn't *if*—it's *how gracefully* you handle it.

With robust validation, flexible parsing, and comprehensive logging, our 2D chat world can stay running even when agents (human or AI) do unexpected things.

I probably won't use these tools directly in later experiments because of the overhead, but these investigations gave me a better understanding of the generation failure modes we might encounter and how to handle them in practice.

Next steps: Add more sophisticated recovery strategies, implement user-facing error messages, and use error logs to improve prompt engineering.
