# Pile - Thread-Safe Typed Collection

**Pile** is lionherd's foundational collection for managing Element instances with:
- **Thread safety**: RLock-based synchronization
- **Type validation**: Flexible constraints with Union support
- **Rich queries**: Type-dispatched `__getitem__` interface
- **Progression order**: Insertion order preserved

This notebook demonstrates core patterns for working with Pile collections.

In [1]:
from typing import Union

from lionherd_core.base import Element, Pile, Progression


# Create test elements
class Task(Element):
    """Simple task element."""

    title: str = "Untitled"
    priority: int = 0


class Event(Element):
    """Event element."""

    event_type: str = "info"
    severity: int = 1

## 1. Construction with Type Validation

Pile supports flexible type constraints:
- **No constraint** (default): Any Element subclass
- **Single type**: Enforce specific Element type
- **Union types**: Multi-type collections
- **Strict mode**: Exact type match (no subclasses)
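
The difference between the permissive and strict modes can be sketched with the standard library alone. This is a hedged illustration, not Pile's actual implementation: `normalize_types` and `is_allowed` are hypothetical helpers, and the plain classes below only stand in for Element subclasses.

```python
from typing import Union, get_args, get_origin


class Task:
    pass


class UrgentTask(Task):
    pass


class Event:
    pass


def normalize_types(item_type) -> set:
    """Flatten None, a single type, a Union, or a set of types into a set (hypothetical helper)."""
    if item_type is None:
        return set()
    if isinstance(item_type, (set, frozenset, list, tuple)):
        return set(item_type)
    if get_origin(item_type) is Union:
        return set(get_args(item_type))
    return {item_type}


def is_allowed(item, item_type=None, strict: bool = False) -> bool:
    """Return True if item satisfies the constraint (hypothetical helper)."""
    allowed = normalize_types(item_type)
    if not allowed:
        return True  # no constraint: anything goes
    if strict:
        return type(item) in allowed  # exact type match, subclasses rejected
    return isinstance(item, tuple(allowed))  # subclasses accepted
```

With these definitions, `is_allowed(UrgentTask(), Task)` is accepted while `is_allowed(UrgentTask(), Task, strict=True)` is rejected, which mirrors the behaviour described for `strict_type`.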

In [2]:
# No type constraint - accepts any Element subclass
pile_any = Pile()
pile_any.add(Task(title="Review PR", priority=1))
pile_any.add(Event(event_type="alert", severity=3))
print(f"Mixed pile: {len(pile_any)} items")

# Single type constraint
tasks = Pile(item_type=Task)
tasks.add(Task(title="Write tests", priority=2))
# tasks.add(Event(...))  # Would raise TypeError
print(f"Task pile: {len(tasks)} items")

# Union type - multiple allowed types
pile_union = Pile(item_type=Union[Task, Event])
pile_union.add(Task(title="Deploy", priority=3))
pile_union.add(Event(event_type="success", severity=1))
print(f"Union pile: {len(pile_union)} items, types: {pile_union.item_type}")

Mixed pile: 2 items
Task pile: 1 items
Union pile: 2 items, types: {<class '__main__.Event'>, <class '__main__.Task'>}


## 2. Core Operations: Add, Remove, Get

Basic CRUD operations with type safety and thread-safety guarantees.

In [3]:
pile = Pile()

# Add items
task1 = Task(title="Implement feature", priority=2)
task2 = Task(title="Fix bug", priority=3)
pile.add(task1)
pile.add(task2)
print(f"Added {len(pile)} items")

# Get by UUID
retrieved = pile.get(task1.id)
print(f"Retrieved: {retrieved.title}")

# Check membership
print(f"Contains task1: {task1.id in pile}")
print(f"Contains task2: {task2 in pile}")

# Remove item
removed = pile.remove(task1.id)
print(f"Removed: {removed.title}")
print(f"Remaining: {len(pile)} items")

# Include/exclude (idempotent set operations)
added = pile.include(task1)  # Add if not present
print(f"Included (was new): {added}")
added_again = pile.include(task1)  # No-op if present
print(f"Included again (idempotent): {added_again}")

Added 2 items
Retrieved: Implement feature
Contains task1: True
Contains task2: True
Removed: Implement feature
Remaining: 1 items
Included (was new): True
Included again (idempotent): False


## 3. Rich Query Interface: Type-Dispatched `__getitem__`

Pile's `__getitem__` supports multiple query modes:
- **UUID/str**: Get single item by ID
- **int**: Get by index (progression order)
- **slice**: Get multiple items
- **callable**: Filter by predicate (returns new Pile)
- **Progression**: Filter by custom order (returns new Pile)
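
To make "type-dispatched" concrete, here is a minimal stand-alone sketch of a `__getitem__` that branches on the key's type. `MiniPile` is a hypothetical toy class written for this illustration (items are plain dicts with an `"id"` key); it is not how lionherd implements Pile and it omits the Progression mode.

```python
from uuid import UUID, uuid4


class MiniPile:
    """Toy ordered collection keyed by UUID (illustrative only)."""

    def __init__(self, items=()):
        self._items = {}  # UUID -> item
        self._order = []  # UUIDs in insertion order
        for item in items:
            self._items[item["id"]] = item
            self._order.append(item["id"])

    def __getitem__(self, key):
        if isinstance(key, UUID):  # lookup by ID
            return self._items[key]
        if isinstance(key, str):  # lookup by ID given as a string
            return self._items[UUID(key)]
        if isinstance(key, int):  # positional lookup in insertion order
            return self._items[self._order[key]]
        if isinstance(key, slice):  # several items, returned as a list
            return [self._items[uid] for uid in self._order[key]]
        if callable(key):  # predicate filter, returned as a new collection
            return MiniPile(item for item in self if key(item))
        raise TypeError(f"Unsupported key type: {type(key).__name__}")

    def __iter__(self):
        return (self._items[uid] for uid in self._order)

    def __len__(self):
        return len(self._order)


items = [
    {"id": uuid4(), "title": title, "priority": priority}
    for title, priority in [("A", 1), ("B", 3), ("C", 2)]
]
mini = MiniPile(items)
high = mini[lambda item: item["priority"] >= 2]
```

The order of the `isinstance` checks matters only where types overlap; here each branch handles a disjoint key type, and anything unrecognised raises `TypeError` rather than silently returning nothing.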

In [4]:
# Create pile with tasks
tasks = Pile(
    [
        Task(title="Task A", priority=1),
        Task(title="Task B", priority=3),
        Task(title="Task C", priority=2),
        Task(title="Task D", priority=3),
        Task(title="Task E", priority=1),
    ]
)

# Query by UUID
task = tasks[next(iter(tasks.keys()))]
print(f"By UUID: {task.title}")

# Query by index
first = tasks[0]
last = tasks[-1]
print(f"By index - First: {first.title}, Last: {last.title}")

# Query by slice (returns list)
middle = tasks[1:3]
print(f"By slice: {[t.title for t in middle]}")

# Query by callable (returns new Pile)
high_priority = tasks[lambda t: t.priority >= 3]
print(f"High priority tasks: {len(high_priority)} items")
for task in high_priority:
    print(f"  - {task.title} (priority={task.priority})")

# Query by Progression (custom order)
custom_order = [tasks[1].id, tasks[3].id]  # Select specific items
prog = Progression(order=custom_order)
filtered = tasks[prog]
print(f"\nBy progression: {len(filtered)} items")
for task in filtered:
    print(f"  - {task.title}")

By UUID: Task A
By index - First: Task A, Last: Task E
By slice: ['Task B', 'Task C']
High priority tasks: 2 items
  - Task B (priority=3)
  - Task D (priority=3)

By progression: 2 items
  - Task B
  - Task D


## 4. Iteration and Collection Methods

Pile implements Python's collection protocols, so it can be iterated and measured like a built-in container.

In [5]:
tasks = Pile(
    [
        Task(title="Task 1", priority=1),
        Task(title="Task 2", priority=2),
        Task(title="Task 3", priority=3),
    ]
)

# Iterate (progression order)
print("Iteration:")
for task in tasks:
    print(f"  {task.title}")

# Keys (UUIDs)
print(f"\nUUIDs: {list(tasks.keys())[:2]}...")  # Show first 2

# Values (items)
print(f"Values: {[t.title for t in tasks.values()]}")

# List conversion
tasks_list = tasks.to_list()
print(f"As list: {len(tasks_list)} items")

# Size checks
print(f"\nLen: {len(tasks)}")
print(f"Size: {tasks.size()}")
print(f"Empty: {tasks.is_empty()}")

Iteration:
  Task 1
  Task 2
  Task 3

UUIDs: [UUID('a8bc8a5e-6c52-4bdd-a979-399640e1381f'), UUID('e155f455-bdad-41d6-a13e-e1507f492217')]...
Values: ['Task 1', 'Task 2', 'Task 3']
As list: 3 items

Len: 3
Size: 3
Empty: False


## 5. Concurrency: Thread Safety and Async

Pile supports both synchronous multi-threading and asynchronous operations:
- **Thread safety**: RLock synchronization for concurrent threads
- **Async operations**: Separate async lock for coroutine-based concurrency
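
A reentrant lock matters when one locked method calls another on the same object. The sketch below assumes that situation; `SafeBag` is a hypothetical toy class used only to show the behaviour of `threading.RLock` and says nothing about Pile's internals.

```python
import threading


class SafeBag:
    """Toy thread-safe container (illustrative only)."""

    def __init__(self):
        self._lock = threading.RLock()
        self._items = []

    def add(self, item):
        with self._lock:
            self._items.append(item)

    def add_many(self, items):
        # Holds the lock, then calls add(), which acquires it again.
        # A plain threading.Lock would deadlock here; RLock lets the
        # owning thread re-enter.
        with self._lock:
            for item in items:
                self.add(item)

    def __len__(self):
        with self._lock:
            return len(self._items)


bag = SafeBag()
workers = [threading.Thread(target=bag.add_many, args=(range(100),)) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
total = len(bag)
```

Eight workers each add 100 items, so `total` ends up at 800 regardless of how the threads interleave.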

### Synchronous Multi-Threading

Thread-safe operations using Python's threading module.

In [6]:
import threading

# Create pile for concurrent access
pile_sync = Pile()


# Worker function that adds items
def worker(worker_id):
    for i in range(5):
        task = Task(title=f"Worker-{worker_id} Task-{i}", priority=worker_id)
        pile_sync.add(task)


# Spawn 10 threads concurrently adding items
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]

# Start all threads
for t in threads:
    t.start()

# Wait for all threads to complete
for t in threads:
    t.join()

print(f"Total items after concurrent adds: {len(pile_sync)}")  # 50 (thread-safe)
print("All items added successfully without race conditions")

Total items after concurrent adds: 50
All items added successfully without race conditions


The same pattern, followed by indexed reads once all writers have finished:

In [7]:
import threading

# Create pile for concurrent access
pile = Pile()


# Worker function that adds items
def worker(worker_id):
    for i in range(5):
        task = Task(title=f"Task from worker {worker_id}-{i}", priority=worker_id)
        pile.add(task)


# Spawn 10 threads concurrently adding items
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]

# Start all threads
for t in threads:
    t.start()

# Wait for all threads to complete
for t in threads:
    t.join()

print(f"Total items after concurrent adds: {len(pile)}")  # 50 (thread-safe)
print(f"Sample tasks: {[pile[i].title for i in range(3)]}")

Total items after concurrent adds: 50
Sample tasks: ['Task from worker 1-0', 'Task from worker 1-1', 'Task from worker 1-2']


### Async Operations

Async operations use a separate async lock for coroutine-based concurrency.

In [8]:
from lionherd_core.libs.concurrency import gather


async def demo_async():
    pile = Pile()

    # Create tasks
    tasks = [Task(title=f"Task {i}", priority=i) for i in range(5)]

    # Concurrent add operations
    await gather(*[pile.add_async(task) for task in tasks])
    print(f"Added {len(pile)} items concurrently")

    # Concurrent get operations
    results = await gather(*[pile.get_async(task.id) for task in tasks[:3]])
    print(f"Retrieved: {[r.title for r in results]}")

    # Async context manager (manual lock control)
    async with pile as p:
        # Lock held during context
        print(f"Inside context: {len(p._items)} items")

    return pile


# Run async demo
pile = await demo_async()
print(f"Final pile size: {len(pile)}")

Added 5 items concurrently
Retrieved: ['Task 0', 'Task 1', 'Task 2']
Inside context: 5 items
Final pile size: 5
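
The async context manager used above can be pictured as a thin wrapper around an `asyncio.Lock`. The following is a rough sketch under that assumption: `AsyncBag` is a hypothetical toy class, and it uses the standard library's `asyncio.gather` rather than lionherd's `gather`.

```python
import asyncio


class AsyncBag:
    """Toy container guarded by an asyncio.Lock (illustrative only)."""

    def __init__(self):
        self._items = []
        self._async_lock = asyncio.Lock()

    async def add_async(self, item):
        async with self._async_lock:
            self._items.append(item)

    async def __aenter__(self):
        # Entering the context acquires the lock...
        await self._async_lock.acquire()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # ...and leaving it releases the lock, even on error.
        self._async_lock.release()


async def main():
    bag = AsyncBag()
    await asyncio.gather(*(bag.add_async(i) for i in range(5)))
    async with bag as b:
        # asyncio.Lock is not reentrant: awaiting b.add_async() inside
        # this block would wait forever on the lock we already hold.
        held_inside = b._async_lock.locked()
        count = len(b._items)
    held_after = bag._async_lock.locked()
    return count, held_inside, held_after


# asyncio.run() is for standalone scripts; in a notebook, use `await main()`.
result = asyncio.run(main())
```

The practical consequence of the non-reentrant lock in this sketch is that code inside `async with` should touch the underlying state directly instead of calling other lock-taking coroutines.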


## 6. Serialization: to_dict / from_dict

Pile preserves progression order and type constraints through serialization.

**Modes**:
- `python`: Python objects (UUID, datetime)
- `json`: JSON-safe strings
- `db`: Database column naming (metadata → node_metadata)
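
As a rough picture of how the three modes might differ, consider the stand-alone sketch below. The `to_dict` function and the `record` layout are hypothetical and written only for this illustration; in particular, whether lionherd's `db` mode also stringifies values is an assumption here, not something the notebook states.

```python
import json
from datetime import datetime, timezone
from uuid import UUID, uuid4


def to_dict(record: dict, mode: str = "python") -> dict:
    """Hypothetical serializer illustrating the python / json / db modes."""
    out = dict(record)  # shallow copy so the input is left untouched
    if mode in ("json", "db"):
        # JSON-safe strings instead of UUID / datetime objects
        # (applying this to "db" as well is an assumption).
        out["id"] = str(out["id"])
        out["created_at"] = out["created_at"].isoformat()
    if mode == "db":
        # Rename to the database column name: metadata -> node_metadata.
        out["node_metadata"] = out.pop("metadata")
    return out


record = {
    "id": uuid4(),
    "created_at": datetime.now(timezone.utc),
    "metadata": {"source": "demo"},
}

py = to_dict(record, "python")
js = to_dict(record, "json")
db = to_dict(record, "db")
encoded = json.dumps(js)  # works; json.dumps(py) would raise TypeError on the UUID
```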

In [9]:
# Create pile with type constraint
original = Pile(
    items=[Task(title="Task A", priority=1), Task(title="Task B", priority=2)],
    item_type=Task,
    strict_type=False,
)

# Serialize to JSON mode
data = original.to_dict(mode="json")
print("Serialized data:")
print(f"  ID (str): {data['id'][:8]}...")
print(f"  Items: {len(data['items'])}")
print(f"  Item type: {data['item_type']}")
print(f"  Strict: {data['strict_type']}")

# Deserialize
restored = Pile.from_dict(data)
print("\nRestored:")
print(f"  Length: {len(restored)}")
print(f"  Type constraint: {restored.item_type}")
print(f"  Strict mode: {restored.strict_type}")
print(f"  Tasks: {[t.title for t in restored]}")

# Verify round-trip preservation
print("\nRound-trip checks:")
print(f"  Length preserved: {len(original) == len(restored)}")
print(f"  Type constraint preserved: {original.item_type == restored.item_type}")
print(f"  Order preserved: {[t.title for t in original] == [t.title for t in restored]}")

Serialized data:
  ID (str): f91264db...
  Items: 2
  Item type: ['__main__.Task']
  Strict: False

Restored:
  Length: 2
  Type constraint: {<class '__main__.Task'>}
  Strict mode: False
  Tasks: ['Task A', 'Task B']

Round-trip checks:
  Length preserved: True
  Type constraint preserved: True
  Order preserved: True


## 7. Type Filtering and Validation

Filter heterogeneous collections by type and control validation strictness.

In [10]:
# Create mixed pile
mixed = Pile(item_type={Task, Event})
mixed.add(Task(title="Task 1", priority=1))
mixed.add(Event(event_type="warning", severity=2))
mixed.add(Task(title="Task 2", priority=3))
mixed.add(Event(event_type="error", severity=3))

print(f"Mixed pile: {len(mixed)} items")

# Filter by type (returns new Pile)
tasks_only = mixed.filter_by_type(Task)
events_only = mixed.filter_by_type(Event)

print("\nFiltered by type:")
print(f"  Tasks: {len(tasks_only)} items")
for task in tasks_only:
    print(f"    - {task.title}")

print(f"  Events: {len(events_only)} items")
for event in events_only:
    print(f"    - {event.event_type}")


# Strict vs permissive validation
class HighPriorityTask(Task):
    """Task subclass."""

    urgent: bool = True


# Permissive mode (allows subclasses)
permissive = Pile(item_type=Task, strict_type=False)
permissive.add(Task(title="Normal task"))
permissive.add(HighPriorityTask(title="Urgent task"))  # Subclass allowed
print(f"\nPermissive pile: {len(permissive)} items (allows subclasses)")

# Strict mode (exact type only)
strict = Pile(item_type=Task, strict_type=True)
strict.add(Task(title="Normal task"))
try:
    strict.add(HighPriorityTask(title="Urgent task"))  # Rejected
except TypeError as e:
    print(f"Strict pile rejected subclass: {str(e)[:50]}...")

Mixed pile: 4 items

Filtered by type:
  Tasks: 2 items
    - Task 1
    - Task 2
  Events: 2 items
    - warning
    - error

Permissive pile: 2 items (allows subclasses)
Strict pile rejected subclass: Item type <class '__main__.HighPriorityTask'> not ...


## Common Pitfalls

Learn from common mistakes when working with Pile collections.

In [11]:
# Pitfall 1: Attempting to mutate read-only properties
print("Pitfall 1: Mutating Read-Only Properties\n")

pile_test = Pile()
task = Task(title="Test task", priority=1)
pile_test.add(task)

# ❌ WRONG: Try to modify items directly
try:
    pile_test.items[task.id] = task  # MappingProxyType is read-only
    print("  ❌ Modified items directly (should have failed)")
except TypeError as e:
    print(f"  ❌ Error: {str(e)[:50]}...")

# ❌ WRONG: Try to modify progression directly
original_len = len(pile_test._progression.order)
pile_test.progression.append(task.id)  # Modifies copy, not original
print(
    f"  ❌ progression.append() modified copy (original len: {original_len}, still {len(pile_test._progression.order)})"
)

# ✓ CORRECT: Use Pile methods
pile_test.add(Task(title="Another task", priority=2))
print(f"  ✓ Used pile.add() correctly (now {len(pile_test)} items)\n")


# Pitfall 2: Type validation confusion with subclasses
print("Pitfall 2: Type Validation with Subclasses\n")


class UrgentTask(Task):
    urgent: bool = True


# ❌ POTENTIAL ISSUE: Default allows subclasses
permissive_pile = Pile(item_type=Task)  # strict_type=False by default
permissive_pile.add(UrgentTask(title="Urgent", urgent=True))
print("  ⚠️  Default mode allowed subclass (strict_type=False)")

# ✓ CORRECT: Use strict_type=True for exact type matching
strict_pile = Pile(item_type=Task, strict_type=True)
strict_pile.add(Task(title="Normal task"))
try:
    strict_pile.add(UrgentTask(title="Urgent", urgent=True))
    print("  ❌ strict_type=True should have rejected subclass")
except TypeError:
    print("  ✓ strict_type=True correctly rejected subclass\n")


# Pitfall 3: Concurrent include() not atomic (check-then-act race)
print("Pitfall 3: Concurrent include() Not Atomic\n")

import threading
import time

race_pile = Pile()
shared_task = Task(title="Shared task", priority=1)
results = []


def try_include(worker_id):
    time.sleep(0.001)  # Small delay to increase race window
    result = race_pile.include(shared_task)
    results.append((worker_id, result))


# ❌ POTENTIAL RACE: Both threads might see "not present" and add
results.clear()
threads = [threading.Thread(target=try_include, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"  ⚠️  Without external lock: {len([r for r in results if r[1]])} thread(s) reported 'added'")
print("     (Race condition possible - check-then-act not atomic)")

# ✓ CORRECT: Use external lock for concurrent include/exclude
protected_pile = Pile()
lock = threading.Lock()
results.clear()


def safe_include(worker_id):
    with lock:
        result = protected_pile.include(shared_task)
    results.append((worker_id, result))


threads = [threading.Thread(target=safe_include, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("  ✓ With external lock: exactly 1 thread reported 'added' (atomic)")

Pitfall 1: Mutating Read-Only Properties

  ❌ Error: 'mappingproxy' object does not support item assign...
  ❌ progression.append() modified copy (original len: 1, still 1)
  ✓ Used pile.add() correctly (now 2 items)

Pitfall 2: Type Validation with Subclasses

  ⚠️  Default mode allowed subclass (strict_type=False)
  ✓ strict_type=True correctly rejected subclass

Pitfall 3: Concurrent include() Not Atomic

  ⚠️  Without external lock: 1 thread(s) reported 'added'
     (Race condition possible - check-then-act not atomic)
  ✓ With external lock: exactly 1 thread reported 'added' (atomic)


## Summary

**Pile** provides a powerful foundation for managing Element collections:

**Key Features**:
- Thread-safe operations with RLock synchronization
- Flexible type validation (single/Union/strict modes)
- Rich query interface via type-dispatched `__getitem__`
- Async support with independent lock
- Progression order preservation
- Full serialization/deserialization support

**Common Patterns**:
- `pile[lambda x: condition]` - Filter by predicate
- `pile[progression]` - Custom ordering
- `pile.filter_by_type(T)` - Type-based filtering
- `pile.include(item)` - Idempotent add
- `Pile(item_type=Union[A, B])` - Multi-type collections

**Performance**:
- O(1) add, get, contains
- O(n) remove (progression linear scan)
- O(1) index access (progression optimization)
- Thread-safe with minimal contention
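
These costs are what one would expect from a mapping paired with an ordered list of keys. The sketch below assumes that layout purely for illustration; `OrderedStore` is a hypothetical class, and Pile's real data structures are in the source file referenced below.

```python
class OrderedStore:
    """Toy dict-plus-list store showing where each cost comes from (illustrative only)."""

    def __init__(self):
        self._items = {}  # key -> value, O(1) lookup
        self._order = []  # keys in insertion order, O(1) positional access

    def add(self, key, value):
        if key in self._items:
            raise ValueError(f"{key!r} already present")
        self._items[key] = value  # O(1) on average
        self._order.append(key)  # O(1) amortised

    def include(self, key, value) -> bool:
        """Idempotent add: True if inserted, False if already present."""
        if key in self._items:
            return False
        self.add(key, value)
        return True

    def get(self, key):
        return self._items[key]  # O(1) on average

    def at(self, index):
        return self._items[self._order[index]]  # O(1)

    def remove(self, key):
        value = self._items.pop(key)  # O(1) on average
        self._order.remove(key)  # O(n): linear scan of the order list
        return value

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._order)


store = OrderedStore()
for name, number in [("a", 1), ("b", 2), ("c", 3)]:
    store.add(name, number)
first_include = store.include("d", 4)
second_include = store.include("d", 4)
removed = store.remove("b")
```

In this layout the only linear-time step is the list scan inside `remove`, which is the trade-off for keeping insertion order cheaply available by index.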

See `src/lionherd_core/base/pile.py` for full implementation details.