# Tutorial: Flow-Based State Tracking

This tutorial explores the **Flow-based state management** pattern used by Executor.

**Core Concept**: Map each `EventStatus` value 1:1 to a Flow progression, so that finding the events in a given status is an O(1) progression lookup.

**Why This Matters**:
- Traditional approach: linear scan through all events, O(n) per query
- Flow approach: direct progression lookup, O(1) to locate the matching IDs
- Enables real-time monitoring at scale (10k+ events)

**Prerequisites**: Basic understanding of Flow, Progression, and Event.

## Introduction

**Problem**: Track the status of 10,000 events in real time.

**Naive Solution** (O(n)):
```python
completed = [e for e in events if e.status == EventStatus.COMPLETED]  # Scan all
```

**Flow Solution** (O(1) lookup, then O(1) per returned item):
```python
completed_ids = flow.get_progression("completed").order  # Direct access
completed = [flow.items[uid] for uid in completed_ids]  # O(1) per item; `uid` avoids shadowing builtin `id`
```

**Key Insight**: Flow progressions act as **pre-computed indices** for event status.
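
The idea can be illustrated without lionherd_core at all. The following is a minimal plain-Python sketch, not the library's implementation: `events`, `index`, and `set_status` are hypothetical names, and statuses are plain strings for brevity. It keeps one ordered list of IDs per status and updates that list on every transition, so a status query becomes a dictionary lookup.

```python
from uuid import UUID, uuid4

# Hypothetical stand-ins: event ID -> current status string.
events: dict[UUID, str] = {uuid4(): "pending" for _ in range(5)}

# Pre-computed index: status -> ordered list of event IDs.
index: dict[str, list[UUID]] = {"pending": list(events), "completed": []}


def set_status(uid: UUID, new_status: str) -> None:
    """Change an event's status and keep the index in sync."""
    index[events[uid]].remove(uid)  # leave the old bucket
    events[uid] = new_status
    index[new_status].append(uid)  # join the new bucket


# Complete the first two events.
for uid in list(events)[:2]:
    set_status(uid, "completed")

scanned = [uid for uid, status in events.items() if status == "completed"]  # O(n) scan
looked_up = index["completed"]  # O(1) lookup of the bucket
print(scanned == looked_up)  # True
```

The trade-off in this sketch is that the work moves from reads to writes: every status change has to update the index, which pays off when queries are far more frequent than transitions.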

In [None]:
# Setup
import asyncio
import time  # used by the benchmark in Part 7 and the transition log in Part 9
from typing import ClassVar

from lionherd_core.base import Event, EventStatus, Executor, Flow, Processor, Progression

## Part 1: Flow Basics

Flow combines two Pile instances: `items`, which stores the events, and `progressions`, which stores named, ordered sequences of item UUIDs.
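
To make the two-container layout concrete, here is a rough sketch using plain dictionaries. `MiniItem` and `MiniFlow` are hypothetical names invented for illustration; they only mimic the shape described above and do not reproduce Pile or Flow behavior.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class MiniItem:
    name: str
    id: UUID = field(default_factory=uuid4)


@dataclass
class MiniFlow:
    """Hypothetical layout: objects live in one container, orderings in another."""

    items: dict[UUID, MiniItem] = field(default_factory=dict)
    progressions: dict[str, list[UUID]] = field(default_factory=dict)

    def add_item(self, item: MiniItem, progressions: tuple[str, ...] = ()) -> None:
        self.items[item.id] = item  # store the object once
        for name in progressions:
            self.progressions.setdefault(name, []).append(item.id)  # store only the ID

    def resolve(self, name: str) -> list[MiniItem]:
        """Turn a progression's IDs back into item objects."""
        return [self.items[uid] for uid in self.progressions[name]]


mini = MiniFlow()
mini.add_item(MiniItem("a"), progressions=("queue",))
mini.add_item(MiniItem("b"))
mini.add_item(MiniItem("c"), progressions=("queue",))
print([item.name for item in mini.resolve("queue")])  # ['a', 'c']
```

Because progressions hold only IDs, an item is stored once no matter how many orderings refer to it.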

In [None]:
# Create simple events
class SimpleEvent(Event):
    name: str = "event"

    async def _invoke(self):
        return "done"


# Create Flow
flow = Flow[Event, Progression](item_type=Event, name="event_flow")

print("Flow structure:")
print(f"  items: {type(flow.items)} (stores events)")
print(f"  progressions: {type(flow.progressions)} (stores ordered UUIDs)")

# Add events to items pile
events = [SimpleEvent(name=f"event_{i}") for i in range(3)]
for event in events:
    flow.add_item(event)

print("\nFlow contents:")
print(f"  Items in pile: {len(flow.items)}")
print(f"  Progressions: {len(flow.progressions)}")

## Part 2: EventStatus Enum

EventStatus defines all possible event states.

In [None]:
# Inspect EventStatus enum
print("EventStatus values:")
for status in EventStatus:
    print(f"  {status.value}")

# Event starts as PENDING
event = SimpleEvent(name="test")
print(f"\nNew event status: {event.status} ({event.status.value})")

# Status changes during lifecycle
result = await event.invoke()
print(f"After invoke: {event.status} ({event.status.value})")

## Part 3: 1:1 Mapping Architecture

Executor creates one progression per EventStatus value.
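
As a standalone illustration of deriving one bucket per enum member, consider the sketch below. `DemoStatus` is a hypothetical enum; the real `EventStatus` members may differ, and this is not how Executor builds its progressions internally.

```python
from enum import Enum


class DemoStatus(Enum):
    """Hypothetical statuses; the real EventStatus members may differ."""

    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"


# One empty bucket per enum member, keyed by the member's value.
buckets: dict[str, list[str]] = {status.value: [] for status in DemoStatus}

print(sorted(buckets))  # ['completed', 'failed', 'pending']
print(set(buckets) == {status.value for status in DemoStatus})  # True
```

Deriving the buckets from the enum, rather than listing them by hand, means a new member added to the enum gets its bucket automatically and the mapping stays 1:1.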

In [None]:
class SimpleProcessor(Processor):
    event_type: ClassVar[type[Event]] = SimpleEvent


class SimpleExecutor(Executor):
    processor_type: ClassVar[type[Processor]] = SimpleProcessor


# Create executor
executor = SimpleExecutor(processor_config={"queue_capacity": 10, "capacity_refresh_time": 0.5})

# Inspect Flow progressions
print("Executor Flow progressions:")
for prog in executor.states.progressions:
    print(f"  {prog.name}: {len(prog)} events")

# Verify 1:1 mapping
print(f"\nEventStatus count: {len(list(EventStatus))}")
print(f"Progression count: {len(executor.states.progressions)}")
print(f"1:1 mapping: {len(list(EventStatus)) == len(executor.states.progressions)}")

# Progression names match EventStatus values
status_values = {s.value for s in EventStatus}
prog_names = {p.name for p in executor.states.progressions}
print(f"Names match: {status_values == prog_names}")

## Part 4: Progression Updates

When an event's status changes, Executor moves the event from its old progression to the one matching the new status.
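
The move can be sketched in isolation as "remove from the old bucket, then add to the new one". `StatusIndex`, its `where` reverse map, and the `"processing"` status are assumptions made for this example, not part of the Executor API.

```python
class StatusIndex:
    """Hypothetical index that keeps each ID in exactly one status bucket."""

    def __init__(self, statuses):
        # A dict preserves insertion order and gives O(1) add/remove per ID.
        self.buckets = {status: {} for status in statuses}
        self.where = {}  # reverse map: ID -> current status

    def move(self, uid, status):
        old = self.where.get(uid)
        if old is not None:
            del self.buckets[old][uid]  # leave the old bucket first
        self.buckets[status][uid] = None
        self.where[uid] = status

    def membership_count(self, uid):
        """Number of buckets that currently contain this ID."""
        return sum(uid in bucket for bucket in self.buckets.values())


idx = StatusIndex(["pending", "processing", "completed"])
idx.move("evt-1", "pending")
idx.move("evt-1", "processing")
idx.move("evt-1", "completed")
print(idx.membership_count("evt-1"), list(idx.buckets["completed"]))  # 1 ['evt-1']
```

The reverse map is a design choice of this sketch: it avoids searching every bucket to find where an ID currently lives.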

In [None]:
# Add events to executor
test_executor = SimpleExecutor(processor_config={"queue_capacity": 5, "capacity_refresh_time": 0.5})

event1 = SimpleEvent(name="event_1")
event2 = SimpleEvent(name="event_2")

await test_executor.append(event1)
await test_executor.append(event2)

print("After append:")
print(f"  Event status: {event1.status}")
print(f"  Pending progression: {len(test_executor.states.get_progression('pending'))} events")
print(f"  Event in pending: {event1.id in test_executor.states.get_progression('pending')}")

# Process events
await test_executor.start()
await test_executor.forward()
await asyncio.sleep(0.05)

print("\nAfter processing:")
print(f"  Event status: {event1.status}")
print(f"  Completed progression: {len(test_executor.states.get_progression('completed'))} events")
print(f"  Event in completed: {event1.id in test_executor.states.get_progression('completed')}")
print(f"  Event in pending: {event1.id in test_executor.states.get_progression('pending')}")

# Invariant: Event exists in exactly ONE progression
count = sum(1 for prog in test_executor.states.progressions if event1.id in prog)
print(f"\nEvent in {count} progression(s) (should be 1)")

## Part 5: Multi-Progression Use Cases

While Executor enforces a single-progression invariant, Flow itself supports M:N relationships: one event can belong to several progressions.
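
One thing M:N membership makes possible is combining views, for example "high priority and owned by team A". The sketch below uses plain lists of event names; `views` and `in_all` are hypothetical helpers and not Flow methods.

```python
# Hypothetical tag-style progressions: name -> ordered list of event names.
views = {
    "high_priority": ["deploy", "urgent_api_call"],
    "api_calls": ["urgent_api_call", "fetch_users"],
    "team_a": ["urgent_api_call", "deploy"],
}


def in_all(*names):
    """Items present in every named view, in the order of the first view."""
    others = [set(views[name]) for name in names[1:]]
    return [item for item in views[names[0]] if all(item in other for other in others)]


print(in_all("high_priority", "team_a"))  # ['deploy', 'urgent_api_call']
print(in_all("high_priority", "api_calls", "team_a"))  # ['urgent_api_call']
```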

In [None]:
# Create custom Flow with multiple progressions
custom_flow = Flow[Event, Progression](name="multi_prog")

# Add progressions for different views
custom_flow.add_progression(Progression(name="high_priority"))
custom_flow.add_progression(Progression(name="api_calls"))
custom_flow.add_progression(Progression(name="team_a"))

# Event can be in multiple progressions (cross-cutting concerns)
event = SimpleEvent(name="urgent_api_call")
custom_flow.add_item(event, progressions=["high_priority", "api_calls", "team_a"])

print("Multi-progression membership:")
for prog in custom_flow.progressions:
    if event.id in prog:
        print(f"  ✓ {prog.name}")

print("\nUse cases for M:N:")
print("  - Cross-cutting concerns (priority, team, type)")
print("  - Multiple views of same data")
print("  - Tag-based organization")

print("\nExecutor's single-progression invariant:")
print("  - Enforces mutually exclusive status")
print("  - Event cannot be both PENDING and COMPLETED")
print("  - Simplifies state machine logic")

## Part 6: Querying Patterns

Common patterns for querying events by status.

In [None]:
# Create executor with events
query_executor = SimpleExecutor(
    processor_config={"queue_capacity": 5, "capacity_refresh_time": 0.5}
)

# Add 10 events
for i in range(10):
    await query_executor.append(SimpleEvent(name=f"task_{i}"))

await query_executor.start()
await query_executor.forward()
await asyncio.sleep(0.05)

# Pattern 1: Get all events of a status
completed = query_executor.get_events_by_status(EventStatus.COMPLETED)
print(f"Pattern 1 - All completed: {len(completed)} events")

# Pattern 2: Count events per status
counts = query_executor.status_counts()
print(f"\nPattern 2 - Status counts: {counts}")

# Pattern 3: Direct progression access
pending_prog = query_executor.states.get_progression("pending")
pending_events = [query_executor.states.items[uid] for uid in pending_prog.order]
print(f"\nPattern 3 - Pending events: {len(pending_events)}")

# Pattern 4: Filter by predicate
completed_events = query_executor.completed_events
fast_events = [e for e in completed_events if e.execution.duration < 0.01]
print(f"\nPattern 4 - Fast completed: {len(fast_events)} events")

# Pattern 5: Check membership
event = completed_events[0] if completed_events else None
if event:
    in_completed = event.id in query_executor.states.get_progression("completed")
    print(f"\nPattern 5 - Event in completed: {in_completed}")

## Part 7: Performance Analysis

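Compare an O(n) scan with an O(1) progression lookup. Both benchmarks below also collect the matching events, so the measured gap depends on what fraction of all events is in the queried status.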
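
The effect is easiest to see when only a small share of events match. The following is a library-free sketch under that assumption: the data is synthetic (every 100th entry is "completed"), and `scan` and `lookup` are hypothetical stand-ins for the two query styles. Timings vary by machine, so no expected numbers are shown.

```python
import timeit

n = 10_000
# Synthetic data: every 100th event is completed, the rest are pending.
statuses = ["completed" if i % 100 == 0 else "pending" for i in range(n)]
completed_index = [i for i, s in enumerate(statuses) if s == "completed"]  # built once


def scan():
    return [i for i, s in enumerate(statuses) if s == "completed"]  # touches all n entries


def lookup():
    return list(completed_index)  # touches only the k matches


assert scan() == lookup()  # both styles must return the same events
print(f"scan:   {timeit.timeit(scan, number=50):.4f}s")
print(f"lookup: {timeit.timeit(lookup, number=50):.4f}s")
```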

In [None]:
import time

# Create executor with many events
perf_executor = SimpleExecutor(
    processor_config={"queue_capacity": 100, "capacity_refresh_time": 0.5}
)

# Add 1000 events
print("Adding 1000 events...")
for i in range(1000):
    await perf_executor.append(SimpleEvent(name=f"event_{i}"))

await perf_executor.start()
await perf_executor.forward()
await asyncio.sleep(2.0)  # Let them process

print(f"Status: {perf_executor.status_counts()}")

# Benchmark 1: O(n) scan (naive approach)
all_events = list(perf_executor.states.items)

start = time.perf_counter()
for _ in range(100):  # 100 iterations
    completed_naive = [e for e in all_events if e.status == EventStatus.COMPLETED]
scan_time = (time.perf_counter() - start) * 1000  # ms

print(f"\nO(n) scan (100 iterations): {scan_time:.2f}ms")
print(f"  Per query: {scan_time / 100:.3f}ms")

# Benchmark 2: O(1) progression lookup
start = time.perf_counter()
for _ in range(100):  # 100 iterations
    completed_flow = perf_executor.get_events_by_status(EventStatus.COMPLETED)
flow_time = (time.perf_counter() - start) * 1000  # ms

print(f"\nO(1) progression (100 iterations): {flow_time:.2f}ms")
print(f"  Per query: {flow_time / 100:.3f}ms")

# Speedup
speedup = scan_time / flow_time if flow_time > 0 else float("inf")
print(f"\nSpeedup (scan time / flow time): {speedup:.1f}x")
print(f"Savings: {scan_time - flow_time:.2f}ms per 100 queries")

## Part 8: Serialization and Audit

Serialize Flow state for persistence and audit trails.
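
For a sense of what a JSON round trip of this kind of state involves, here is a small sketch. The `snapshot` layout is invented for illustration and is not the schema produced by `to_dict()`; the point it shows is that UUIDs must be stringified for `json` and that progression order survives the round trip.

```python
import json
from uuid import UUID, uuid4

ids = [uuid4() for _ in range(3)]

# Hypothetical snapshot layout, not the real to_dict() schema.
snapshot = {
    "name": "demo_flow",
    "items": {str(uid): {"name": f"event_{i}"} for i, uid in enumerate(ids)},
    "progressions": {
        "pending": [str(ids[2])],
        "completed": [str(ids[0]), str(ids[1])],
    },
}

payload = json.dumps(snapshot)  # UUIDs were stringified above; json cannot encode UUID objects
restored = json.loads(payload)

# Order inside each progression survives the round trip.
restored_completed = [UUID(s) for s in restored["progressions"]["completed"]]
print(restored_completed == ids[:2])  # True
```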

In [None]:
# Create executor with events
audit_executor = SimpleExecutor(
    processor_config={"queue_capacity": 5, "capacity_refresh_time": 0.5},
    name="audit_exec",
)

# Add and process events
for i in range(8):
    await audit_executor.append(SimpleEvent(name=f"audit_event_{i}"))

await audit_executor.start()
await audit_executor.forward()
await asyncio.sleep(0.05)

print(f"Executor state: {audit_executor}")

# Serialize Flow state
state_snapshot = audit_executor.states.to_dict()

print("\nSerialized snapshot:")
print(f"  Flow name: {state_snapshot['name']}")
print(f"  Items: {len(state_snapshot['items'])} events")
print(f"  Progressions: {len(state_snapshot['progressions'])}")

# Show status counts
print("\nStatus distribution:")
for status in EventStatus:
    count = len(audit_executor.get_events_by_status(status))
    if count > 0:
        print(f"    {status.value}: {count} events")

# Serialization format verification
print("\nSerialization format:")
print(f"  Snapshot keys: {list(state_snapshot.keys())}")
print("  Can be saved to JSON/disk: ✓")
print("  Can be transmitted over network: ✓")
print("  Can be used for audit trails: ✓")

# Note: Restore would use Flow.from_dict(state_snapshot)
# But requires all events to have complete execution state
print("\nFor production use:")
print("  - Save state_snapshot to database/file")
print("  - Restore with Flow.from_dict(snapshot)")
print("  - Use for crash recovery and audit")

## Part 9: Advanced Patterns

Production patterns for Flow-based state management.

In [None]:
# Pattern 1: State transitions with logging
class LoggingExecutor(SimpleExecutor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.transition_log = []

    async def _update_progression(self, event, force_status=None):
        """Override to log transitions."""
        old_status = event.status
        await super()._update_progression(event, force_status)
        new_status = force_status if force_status else event.execution.status

        if old_status != new_status:
            self.transition_log.append(
                {
                    "event_id": str(event.id),
                    "from": old_status.value,
                    "to": new_status.value,
                    "timestamp": time.time(),
                }
            )


logging_executor = LoggingExecutor(
    processor_config={"queue_capacity": 5, "capacity_refresh_time": 0.5}
)

for i in range(3):
    await logging_executor.append(SimpleEvent(name=f"logged_{i}"))

await logging_executor.start()
await logging_executor.forward()
await asyncio.sleep(0.05)

print("Pattern 1 - Transition log:")
for transition in logging_executor.transition_log:
    print(f"  {transition['from']} → {transition['to']}")

# Pattern 2: Conditional cleanup
print("\nPattern 2 - Conditional cleanup:")
cleanup_executor = SimpleExecutor(
    processor_config={"queue_capacity": 5, "capacity_refresh_time": 0.5}
)

for i in range(10):
    await cleanup_executor.append(SimpleEvent(name=f"cleanup_{i}"))

await cleanup_executor.start()
await cleanup_executor.forward()
await asyncio.sleep(0.05)

print(f"Before cleanup: {cleanup_executor}")

# Clean up only completed (keep failed for retry)
removed = await cleanup_executor.cleanup_events([EventStatus.COMPLETED])
print(f"Cleaned up {removed} completed events")
print(f"After cleanup: {cleanup_executor}")

# Pattern 3: Status-based routing
print("\nPattern 3 - Status-based routing:")


def route_events(executor):
    """Route events based on status."""
    failed = executor.failed_events
    completed = executor.completed_events

    # Route failed to retry queue
    if failed:
        print(f"  Routing {len(failed)} failed events to retry queue")

    # Route completed to archive
    if completed:
        print(f"  Routing {len(completed)} completed events to archive")

    return {"retry": failed, "archive": completed}


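routes = route_events(cleanup_executor)
# len(routes) is always 2 (the dict keys), so count only destinations that received events
non_empty = sum(1 for routed in routes.values() if routed)
print(f"Routed events to {non_empty} of {len(routes)} destinations")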

## Conclusion

**Flow-Based State Tracking**:

1. **Architecture**: 1:1 mapping between EventStatus enum and Flow progressions
2. **Performance**: O(1) status queries via direct progression lookup
3. **Invariant**: Within Executor, each event exists in exactly ONE status progression at a time
4. **Flexibility**: Flow supports M:N for custom use cases
5. **Serialization**: Full state capture for persistence and audit
6. **Observability**: Real-time monitoring via progression counts

**Design Benefits**:
- **Scalability**: O(1) progression lookup keeps status queries cheap as event counts grow (10k+ events)
- **Clarity**: Progression names match domain (pending, completed, failed)
- **Audit**: Serialize state for compliance and debugging
- **Flexibility**: Add custom progressions for cross-cutting concerns

**Key Takeaways**:
- Use Flow progressions as **pre-computed indices**
- Enforce **single-progression invariant** for state machines
- Leverage **M:N relationships** for tags and views
- Serialize Flow for **crash recovery** and **audit trails**

**Next Steps**:
- Build event processor: `notebooks/tutorials/event_processing_system.ipynb`
- API reference: `notebooks/references/processor_executor.ipynb`
- Flow deep dive: `notebooks/references/flow.ipynb`