# Error Handling in Lionherd

Lionherd provides an exception hierarchy rooted at `LionherdError`. Every exception carries structured error context and a flag that tells callers whether a retry is worthwhile.

## Overview

- **Semantic exceptions**: `NotFoundError` and `ExistsError` replace generic `ValueError`
- **Structured context**: `.details` dict for debugging information
- **Retry semantics**: `.retryable` flag for retry strategies
- **Exception chaining**: `.__cause__` preservation for root cause analysis
- **Serialization**: `.to_dict()` for logging and monitoring

In [1]:
from lionherd_core import Node, Pile, concurrency
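# Note: ConnectionError and TimeoutError imported here shadow Python's
# built-in exceptions of the same name for the rest of this notebook.
from lionherd_core.errors import (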
    ConfigurationError,
    ConnectionError,
    ExecutionError,
    ExistsError,
    LionherdError,
    NotFoundError,
    TimeoutError,
    ValidationError,
)

## Exception Hierarchy

All lionherd exceptions inherit from `LionherdError`:

```text
LionherdError (base)
├── NotFoundError (semantic)
├── ExistsError (semantic)
├── ValidationError
├── ConfigurationError
├── ExecutionError
├── ConnectionError
└── TimeoutError
```
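
To make the hierarchy concrete, here is a minimal sketch of how a base class with a per-type retry default could be laid out. It is not Lionherd's actual source: the `Sketch*` class names and the `default_retryable` class attribute are assumptions made for illustration. Only the `.message`, `.details`, `.retryable`, and `.to_dict()` names are taken from this notebook.

```python
from typing import Any, Dict, Optional


class SketchError(Exception):
    """Hypothetical stand-in for a base error class (not lionherd's real one)."""

    default_retryable = False  # assumed per-class default, overridden by subclasses

    def __init__(
        self,
        message: str,
        *,
        details: Optional[Dict[str, Any]] = None,
        retryable: Optional[bool] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.details = details or {}
        # Fall back to the class default unless the caller passes an explicit value.
        self.retryable = self.default_retryable if retryable is None else retryable

    def to_dict(self) -> Dict[str, Any]:
        return {
            "error": type(self).__name__,
            "message": self.message,
            "retryable": self.retryable,
            "details": self.details,
        }


class SketchNotFoundError(SketchError):
    """Missing item: inherits the non-retryable default."""


class SketchTimeoutError(SketchError):
    """Timeout: retryable by default."""

    default_retryable = True


try:
    raise SketchTimeoutError("Timeout", details={"seconds": 30})
except SketchError as exc:  # catching the base class catches every subclass
    caught = exc.to_dict()
    print(caught)
```

The point of a single base class is that callers can write one `except` clause for all library errors and still branch on the concrete type or on `.retryable` afterwards.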

## NotFoundError: Semantic Exception for Missing Items

As of v1.0.0-alpha4, `NotFoundError` replaces the generic `ValueError` for missing items.

In [2]:
# Basic usage - Pile now raises NotFoundError directly (v1.0.0-alpha4)
pile = Pile[Node]()
node1 = Node(content={"value": "test"})
pile.add(node1)

# Try to access non-existent item
from uuid import uuid4

missing_id = uuid4()

try:
    item = pile[missing_id]  # Raises NotFoundError directly
except NotFoundError as e:
    print(f"Caught: {e.message}")
    print(f"Details: {e.details if e.details else 'No details'}")
    print(f"Retryable: {e.retryable}")

Caught: Item 7fd260d4-dbec-4266-9233-c3af4076c6fa not found in pile
Details: No details
Retryable: False


In [3]:
# NotFoundError is not retryable by default
error = NotFoundError("User not found")
print(f"Error message: {error.message}")
print(f"Retryable: {error.retryable}")  # False
print(f"Details: {error.details}")  # {}

Error message: User not found
Retryable: False
Details: {}


## ExistsError: Semantic Exception for Duplicates

`ExistsError` replaces generic `ValueError` for duplicate items.

In [4]:
# Duplicate prevention
pile = Pile[Node]()
node1 = Node(content={"value": "test"})
pile.add(node1)

# Try to add duplicate
try:
    if node1.id in pile:
        raise ExistsError(
            "Item already exists in pile",
            details={"item_id": str(node1.id), "content": str(node1.content)},
        )
except ExistsError as e:
    print(f"Error: {e.message}")
    print(f"Retryable: {e.retryable}")  # False
    print(f"Details: {e.details}")

Error: Item already exists in pile
Retryable: False
Details: {'item_id': '9a1df565-42b5-465d-a155-f2432f0db39c', 'content': "{'value': 'test'}"}


## Exception Metadata: .retryable, .details, .__cause__

All lionherd exceptions provide structured metadata for error handling.

In [5]:
# Create error with full metadata
error = ExecutionError(
    "Task execution failed",
    details={"task_name": "process_data", "input_size": 1000, "timestamp": "2025-11-11T10:00:00Z"},
    retryable=True,  # Explicit here; ExecutionError already defaults to True
)

print(f"Message: {error.message}")
print(f"Retryable: {error.retryable}")
print(f"Details: {error.details}")
print(f"Cause: {error.__cause__}")  # None (no chained exception)

Message: Task execution failed
Retryable: True
Details: {'task_name': 'process_data', 'input_size': 1000, 'timestamp': '2025-11-11T10:00:00Z'}
Cause: None


In [6]:
# Default retryable values by exception type
exceptions = [
    NotFoundError("Item missing"),
    ExistsError("Item exists"),
    ValidationError("Validation failed"),
    ConfigurationError("Config error"),
    ExecutionError("Execution failed"),
    ConnectionError("Connection failed"),
    TimeoutError("Timeout"),
]

for exc in exceptions:
    print(f"{exc.__class__.__name__:20s} retryable={exc.retryable}")

NotFoundError        retryable=False
ExistsError          retryable=False
ValidationError      retryable=False
ConfigurationError   retryable=False
ExecutionError       retryable=True
ConnectionError      retryable=True
TimeoutError         retryable=True


## Retry Logic Using .retryable Flag

Use the `.retryable` flag to implement retry strategies.

In [7]:
async def retry_operation(operation, max_attempts=3):
    """Retry operation if error is retryable."""
    for attempt in range(max_attempts):
        try:
            return await operation()
        except LionherdError as e:
            print(f"Attempt {attempt + 1}: {e.message}")

            # Check if error is retryable
            if not e.retryable:
                print("  → Error is not retryable, failing immediately")
                raise

            # Don't retry on last attempt
            if attempt == max_attempts - 1:
                print("  → Max attempts reached, failing")
                raise

            # Exponential backoff
            delay = 2**attempt
            print(f"  → Error is retryable, waiting {delay}s before retry")
            await concurrency.sleep(delay)


# Test with retryable error
attempt_count = 0


async def flaky_operation():
    global attempt_count
    attempt_count += 1

    if attempt_count < 3:
        # Fail first 2 attempts
        raise ExecutionError(
            "Transient failure", details={"attempt": attempt_count}, retryable=True
        )

    return "Success!"


# Run retry logic
result = await retry_operation(flaky_operation)
print(f"\nFinal result: {result}")

Attempt 1: Transient failure
  → Error is retryable, waiting 1s before retry
Attempt 2: Transient failure
  → Error is retryable, waiting 2s before retry

Final result: Success!


In [8]:
# Test with non-retryable error
async def non_retryable_operation():
    raise NotFoundError(
        "Item not found",
        details={"item_id": "123"},
        retryable=False,  # Explicit here; NotFoundError already defaults to False
    )


try:
    result = await retry_operation(non_retryable_operation)
except NotFoundError as e:
    print(f"\nFailed immediately (non-retryable): {e.message}")

Attempt 1: Item not found
  → Error is not retryable, failing immediately

Failed immediately (non-retryable): Item not found
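
The same policy can be sketched synchronously with an upper bound on the delay and an injectable sleep function, which makes the backoff schedule easy to test without waiting. Everything named here (`TransientError`, `retry_sync`, `base_delay`, `max_delay`, and the `sleep` parameter) is hypothetical and stands in for the Lionherd types used above.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for an error that carries a .retryable flag."""

    def __init__(self, message: str, retryable: bool = True) -> None:
        super().__init__(message)
        self.message = message
        self.retryable = retryable


def retry_sync(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation` while it raises retryable errors, with capped backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError as exc:
            # Re-raise immediately for non-retryable errors or on the last attempt.
            if not exc.retryable or attempt == max_attempts - 1:
                raise
            # Exponential backoff (base_delay * 2**attempt), capped at max_delay.
            sleep(min(base_delay * 2**attempt, max_delay))
    raise ValueError("max_attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:  # fail the first two attempts
        raise TransientError("Transient failure")
    return "Success!"


delays: List[float] = []
result = retry_sync(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Capping the delay keeps a long retry budget from producing multi-minute waits, and passing `sleep` as a parameter lets tests assert on the schedule directly.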


## Exception Chaining with .__cause__

Raise with `from e` so the original exception is preserved in `.__cause__` for root cause analysis.

In [9]:
# Multi-layer exception chain
async def database_layer():
    """Simulate a database layer: a cache miss (KeyError) is re-raised as NotFoundError."""
    cache = {}
    user_id = 123

    try:
        return cache[user_id]  # KeyError
    except KeyError as e:
        raise NotFoundError(
            f"User {user_id} not in cache", details={"user_id": user_id}
        ) from e  # Preserve KeyError as __cause__


async def service_layer():
    """Simulate service layer that wraps database errors."""
    try:
        return await database_layer()
    except NotFoundError as e:
        raise ExecutionError(
            "User fetch failed", details={"service": "user_service", "reason": "not_found"}
        ) from e  # Preserve NotFoundError as __cause__


# Execute and inspect exception chain
try:
    await service_layer()
except ExecutionError as e:
    print("Exception chain:")
    current = e
    level = 0

    while current:
        indent = "  " * level
        exc_type = type(current).__name__
        message = getattr(current, "message", str(current))
        print(f"{indent}{exc_type}: {message}")

        if hasattr(current, "details"):
            print(f"{indent}  Details: {current.details}")

        current = current.__cause__
        level += 1

Exception chain:
ExecutionError: User fetch failed
  Details: {'service': 'user_service', 'reason': 'not_found'}
  NotFoundError: User 123 not in cache
    Details: {'user_id': 123}
    KeyError: 123
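
When only the innermost error matters, for example to tag a log entry, the walk above can be reduced to a small helper. This is a sketch using built-in exception types as stand-ins for the Lionherd ones; `root_cause` and `chain_names` are hypothetical helper names, not part of the library.

```python
from typing import List


def root_cause(exc: BaseException) -> BaseException:
    """Follow __cause__ links until reaching an exception with no cause."""
    current = exc
    while current.__cause__ is not None:
        current = current.__cause__
    return current


def chain_names(exc: BaseException) -> List[str]:
    """Return the type names along the __cause__ chain, outermost first."""
    names = []
    current = exc
    while current is not None:
        names.append(type(current).__name__)
        current = current.__cause__
    return names


try:
    try:
        try:
            {}[123]  # raises KeyError
        except KeyError as e:
            raise LookupError("User 123 not in cache") from e
    except LookupError as e:
        raise RuntimeError("User fetch failed") from e
except RuntimeError as e:
    root = root_cause(e)
    names = chain_names(e)
    print(names, repr(root))
```

Note that `__cause__` is only set by an explicit `raise ... from ...`. An exception raised inside an `except` block without `from` is recorded in `__context__` instead, so this helper would not follow it.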


## ExceptionGroup Aggregation (Python 3.11+)

Collect every error from a batch operation and raise them together in a built-in `ExceptionGroup`, instead of stopping at the first failure.

In [10]:
def batch_add_items(pile, items):
    """Add multiple items to pile, collecting all errors."""
    errors = []

    for item in items:
        try:
            # Validate item doesn't exist
            if item.id in pile:
                raise ExistsError(
                    f"Item {item.id} already exists", details={"item_id": str(item.id)}
                )

            # Add item
            pile.add(item)

        except (ExistsError, ValidationError) as e:
            # Collect error, continue processing
            errors.append(e)

    # Raise all errors together
    if errors:
        raise ExceptionGroup("Batch operation failed", errors)


# Test batch operation
pile = Pile[Node]()
node1 = Node(content={"value": "item1"})
node2 = Node(content={"value": "item2"})
node3 = Node(content={"value": "item3"})

# Add first item successfully
pile.add(node1)

# Try to add batch with duplicates
try:
    batch_add_items(pile, [node1, node2, node1, node3])  # node1 appears twice
except ExceptionGroup as eg:
    print(f"Batch failed with {len(eg.exceptions)} errors:\n")

    for i, exc in enumerate(eg.exceptions, 1):
        print(f"{i}. {type(exc).__name__}: {exc.message}")
        if exc.details:
            print(f"   Details: {exc.details}")
        print()

Batch failed with 2 errors:

1. ExistsError: Item 60dcf464-0d3e-463b-a5b0-ddaaf4967eaa already exists
   Details: {'item_id': '60dcf464-0d3e-463b-a5b0-ddaaf4967eaa'}

2. ExistsError: Item 60dcf464-0d3e-463b-a5b0-ddaaf4967eaa already exists
   Details: {'item_id': '60dcf464-0d3e-463b-a5b0-ddaaf4967eaa'}



## Serialization with .to_dict()

Convert exceptions to dictionaries for logging and monitoring.

In [11]:
# Create error with rich metadata
error = ConnectionError(
    "API request failed",
    details={
        "url": "https://api.example.com/users",
        "method": "GET",
        "status_code": 503,
        "retry_after": 60,
    },
    retryable=True,
)

# Serialize for logging
error_dict = error.to_dict()
print("Serialized error:")
print(error_dict)

Serialized error:
{'error': 'ConnectionError', 'message': 'API request failed', 'retryable': True, 'details': {'url': 'https://api.example.com/users', 'method': 'GET', 'status_code': 503, 'retry_after': 60}}


In [12]:
# Structured logging example
import json


def log_error(error):
    """Log error in structured JSON format."""
    log_entry = {"timestamp": "2025-11-11T10:00:00Z", "level": "ERROR", **error.to_dict()}
    print(json.dumps(log_entry, indent=2))


# Test logging
error = NotFoundError(
    "User not found", details={"user_id": 123, "search_method": "by_id", "database": "users_db"}
)

log_error(error)

{
  "timestamp": "2025-11-11T10:00:00Z",
  "level": "ERROR",
  "error": "NotFoundError",
  "message": "User not found",
  "retryable": false,
  "details": {
    "user_id": 123,
    "search_method": "by_id",
    "database": "users_db"
  }
}
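
The JSON dump above works because every value in `details` is already a JSON type. If `details` holds something like a `UUID`, `json.dumps` raises `TypeError` unless it is given a fallback. A minimal sketch, using a hand-written dict shaped like the `.to_dict()` output shown earlier (the `to_json` helper is hypothetical):

```python
import json
from uuid import UUID

# Hand-written payload in the shape of the .to_dict() output above,
# but with a UUID object left unconverted inside details.
payload = {
    "error": "NotFoundError",
    "message": "User not found",
    "retryable": False,
    "details": {"item_id": UUID("12345678-1234-5678-1234-567812345678")},
}


def to_json(entry: dict) -> str:
    """Serialize a log entry, converting non-JSON values with str()."""
    return json.dumps(entry, default=str, sort_keys=True)


try:
    json.dumps(payload)  # no fallback: UUID is not JSON serializable
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

line = to_json(payload)
print(line)
```

Converting values to strings when the error is created, as the earlier cells do with `str(node1.id)`, avoids the problem entirely. The `default=str` fallback is a safety net for details that slip through.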


## Migration from ValueError

Version 1.0.0-alpha4 replaced `ValueError` with semantic exceptions.

In [13]:
# BEFORE (v0.x): Generic ValueError
def old_get_item(pile, uuid):
    if uuid not in pile:
        raise ValueError(f"Item {uuid} not found")
    return pile[uuid]


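# AFTER (v1.0.0-alpha4+): Semantic exception with context
def new_get_item(pile, uuid):
    try:
        return pile[uuid]
    except KeyError as e:
        # Only reached for containers that raise KeyError. Pile already raises
        # NotFoundError itself, so that error propagates unchanged, which is
        # why the output below shows empty details.
        raise NotFoundError(
            f"Item {uuid} not found in pile", details={"uuid": str(uuid), "pile_size": len(pile)}
        ) from e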


# Test new pattern
pile = Pile[Node]()
missing_id = uuid4()

try:
    item = new_get_item(pile, missing_id)
except NotFoundError as e:
    print(f"Error: {e.message}")
    print(f"Details: {e.details}")
    print(f"Retryable: {e.retryable}")

Error: Item 77724693-74de-4214-a3e3-15fc39193652 not found in pile
Details: {}
Retryable: False


In [14]:
# BEFORE: Generic ValueError for duplicates
def old_add_unique(pile, item):
    if item.id in pile:
        raise ValueError("Item already exists")
    pile.add(item)


# AFTER: Semantic ExistsError
def new_add_unique(pile, item):
    if item.id in pile:
        raise ExistsError(
            "Item already exists in pile",
            details={"item_id": str(item.id), "content": str(item.content)},
        )
    pile.add(item)


# Test new pattern
pile = Pile[Node]()
node = Node(content={"value": "test"})
pile.add(node)

try:
    new_add_unique(pile, node)  # Duplicate
except ExistsError as e:
    print(f"Error: {e.message}")
    print(f"Details: {e.details}")
    print(f"Retryable: {e.retryable}")

Error: Item already exists in pile
Details: {'item_id': 'bc5f7eac-f320-4116-932b-7a23af65ec10', 'content': "{'value': 'test'}"}
Retryable: False


## Summary

### Key Takeaways

1. **Use semantic exceptions**: `NotFoundError`/`ExistsError` instead of `ValueError`
2. **Leverage .retryable flag**: Implement retry strategies based on error type
3. **Add structured context**: Use `.details` dict for debugging information
4. **Preserve exception chains**: Use `from e` to maintain `.__cause__`
5. **Aggregate batch errors**: Use `ExceptionGroup` for batch operations
6. **Structured logging**: Use `.to_dict()` for JSON logging

### Exception Types Summary

| Exception | Retryable | Use Case |
|-----------|-----------|----------|
| `NotFoundError` | No | Item missing from collection |
| `ExistsError` | No | Duplicate item insertion |
| `ValidationError` | No | Invalid input/schema |
| `ConfigurationError` | No | Invalid configuration |
| `ExecutionError` | Yes | Runtime execution failure |
| `ConnectionError` | Yes | Network/API failure |
| `TimeoutError` | Yes | Operation timeout |

### Best Practices

1. **Single-lookup pattern**: Prefer `try/except` around one lookup over a membership check followed by a second lookup
2. **Exception transformation**: Transform low-level errors to domain exceptions
3. **Structured details**: Use JSON-serializable types in `.details`
4. **Chain preservation**: Always use `from e` for exception chaining
5. **Error aggregation**: Collect all errors in batch operations
6. **Monitoring integration**: Log `.to_dict()` for structured monitoring
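
The first two practices can be illustrated together with a plain `dict`. This is a sketch: `MissingItemError`, `get_double`, and `get_single` are hypothetical names, and the `dict` stands in for any container that raises `KeyError` on a miss.

```python
class MissingItemError(Exception):
    """Hypothetical domain exception standing in for NotFoundError."""


def get_double(store: dict, key):
    """Double lookup: one membership test, then one indexing operation."""
    if key not in store:
        raise MissingItemError(f"Item {key!r} not found")
    return store[key]


def get_single(store: dict, key):
    """Single lookup: index once and translate the low-level error."""
    try:
        return store[key]
    except KeyError as e:
        # `from e` keeps the original KeyError available as __cause__.
        raise MissingItemError(f"Item {key!r} not found") from e


store = {"a": 1}
found = get_single(store, "a")

try:
    get_single(store, "b")
except MissingItemError as exc:
    message = str(exc)
    cause = exc.__cause__

print(found, message, type(cause).__name__)
```

Besides saving a lookup, the single-lookup form leaves no gap between the check and the access in which another thread or task could remove the item.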