# Sub-Workflow Basics

## Overview

This notebook demonstrates **sub-workflow composition**: how a parent workflow uses `WorkflowExecutor` to embed a sub-workflow, dispatch work to it, and collect its results.

### Text Processing Pipeline:

```
Parent Workflow:
  TextProcessingOrchestrator (start)
      |
      | Dispatches TextProcessingRequest x6 (concurrent)
      v
  [ Sub-Workflow: WorkflowExecutor(TextProcessor) ]
      |
      | Each sub-workflow: counts words & characters
      v
  TextProcessingOrchestrator (collect_result)
      |
      | When all results collected
      v
  AllTasksCompleted event
```

### Key Concepts:

1. **WorkflowExecutor**: Embed complete workflows as executors
2. **Sub-Workflow Isolation**: Each sub-workflow processes independently
3. **Result Collection**: Parent aggregates outputs from multiple sub-workflows
4. **Yield Output**: Sub-workflows signal completion via `ctx.yield_output()`
5. **Concurrent Processing**: Multiple sub-workflows run in parallel
6. **Custom Events**: AllTasksCompleted signals batch completion

### What You'll Learn:

- Create sub-workflows with `WorkflowBuilder`
- Embed sub-workflows using `WorkflowExecutor`
- Dispatch concurrent tasks to sub-workflows
- Collect and aggregate sub-workflow results
- Use custom events for workflow coordination
- Build reusable workflow components

## Prerequisites

- Agent Framework installed: `pip install agent-framework`
- No external services required

## Setup and Imports

In [None]:
import asyncio
from dataclasses import dataclass
from typing import Any

from agent_framework import (
    Executor,
    WorkflowBuilder,
    WorkflowContext,
    WorkflowEvent,
    WorkflowExecutor,
    handler,
)
from typing_extensions import Never


## Define Message Types

### Message Flow:

```
TextProcessingRequest
    ↓ (to sub-workflow)
Sub-workflow processes
    ↓
TextProcessingResult
    ↓ (yielded back to parent)
Parent collects results
```

In [None]:
@dataclass
class TextProcessingRequest:
    """Request to process a text string."""
    text: str
    task_id: str


@dataclass
class TextProcessingResult:
    """Result of text processing."""
    task_id: str
    text: str
    word_count: int
    char_count: int


class AllTasksCompleted(WorkflowEvent):
    """Event triggered when all processing tasks are complete."""
    def __init__(self, results: list[TextProcessingResult]):
        super().__init__(results)


print("✓ Message types defined")

## Create Sub-Workflow Executor

### TextProcessor

**Demonstrates:**
- Simple text analysis (word count, character count)
- Sub-workflow completion with `yield_output()`
- Task identification for result correlation

In [None]:
class TextProcessor(Executor):
    """Processes text strings - counts words and characters."""

    def __init__(self):
        super().__init__(id="text_processor")

    @handler
    async def process_text(
        self, 
        request: TextProcessingRequest, 
        ctx: WorkflowContext[Never, TextProcessingResult]
    ) -> None:
        """Process a text string and return statistics."""
        text_preview = f"'{request.text[:50]}{'...' if len(request.text) > 50 else ''}'"
        print(f"🔍 Sub-workflow processing text (Task {request.task_id}): {text_preview}")

        # Simple text processing
        word_count = len(request.text.split()) if request.text.strip() else 0
        char_count = len(request.text)

        print(f"📊 Task {request.task_id}: {word_count} words, {char_count} characters")

        # Create result
        result = TextProcessingResult(
            task_id=request.task_id,
            text=request.text,
            word_count=word_count,
            char_count=char_count,
        )

        print(f"✅ Sub-workflow completed task {request.task_id}")
        # Signal completion by yielding the result
        await ctx.yield_output(result)


print("✓ TextProcessor executor created")

## Create Parent Workflow Executor

### TextProcessingOrchestrator

**Demonstrates:**
- Concurrent task dispatching to sub-workflows
- Result collection and aggregation
- Custom event emission when batch completes
- Summary statistics calculation

In [None]:
class TextProcessingOrchestrator(Executor):
    """Orchestrates multiple text processing tasks using sub-workflows."""

    results: list[TextProcessingResult] = []
    expected_count: int = 0

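    def __init__(self):
        super().__init__(id="text_orchestrator")
        # Reset per-instance state so results never leak between orchestrator instances
        self.results = []
        self.expected_count = 0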

    @handler
    async def start_processing(
        self, 
        texts: list[str], 
        ctx: WorkflowContext[TextProcessingRequest]
    ) -> None:
        """Start processing multiple text strings."""
        print(f"📄 Starting processing of {len(texts)} text strings")
        print("=" * 60)

        self.expected_count = len(texts)

        # Send each text to a sub-workflow
        for i, text in enumerate(texts):
            task_id = f"task_{i + 1}"
            request = TextProcessingRequest(text=text, task_id=task_id)
            print(f"📤 Dispatching {task_id} to sub-workflow")
            await ctx.send_message(request, target_id="text_processor_workflow")

    @handler
    async def collect_result(
        self, 
        result: TextProcessingResult, 
        ctx: WorkflowContext
    ) -> None:
        """Collect results from sub-workflows."""
        print(f"📥 Collected result from {result.task_id}")
        self.results.append(result)

        # Check if all results are collected
        if len(self.results) == self.expected_count:
            print("\n🎉 All tasks completed!")
            await ctx.add_event(AllTasksCompleted(self.results))

    def get_summary(self) -> dict[str, Any]:
        """Get a summary of all processing results."""
        total_words = sum(result.word_count for result in self.results)
        total_chars = sum(result.char_count for result in self.results)
        avg_words = total_words / len(self.results) if self.results else 0
        avg_chars = total_chars / len(self.results) if self.results else 0

        return {
            "total_texts": len(self.results),
            "total_words": total_words,
            "total_characters": total_chars,
            "average_words_per_text": round(avg_words, 2),
            "average_characters_per_text": round(avg_chars, 2),
        }


print("✓ TextProcessingOrchestrator executor created")

## Build Workflows

### Sub-Workflow:

```
TextProcessor (start) → yield_output(result)
```

### Parent Workflow:

```
TextProcessingOrchestrator (start)
    ↓
WorkflowExecutor(processing_workflow)
    ↓
TextProcessingOrchestrator (collect_result)
```

In [None]:
# Step 1: Create the text processing sub-workflow
print("🚀 Setting up sub-workflow...")
text_processor = TextProcessor()

processing_workflow = (
    WorkflowBuilder()
    .set_start_executor(text_processor)
    .build()
)

print("✓ Sub-workflow created")

# Step 2: Create the parent workflow
print("🔧 Setting up parent workflow...")
orchestrator = TextProcessingOrchestrator()
workflow_executor = WorkflowExecutor(processing_workflow, id="text_processor_workflow")

main_workflow = (
    WorkflowBuilder()
    .set_start_executor(orchestrator)
    .add_edge(orchestrator, workflow_executor)
    .add_edge(workflow_executor, orchestrator)
    .build()
)

print("✓ Parent workflow created")

## Prepare Test Data

Testing with 6 different text samples:
1. Simple sentence
2. Complex sentence
3. Short text
4. Long multi-sentence text
5. Empty string (edge case)
6. Text with extra spaces (edge case)
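
The two edge cases behave differently under the counting logic that `TextProcessor` uses (`str.split()` for words, `len()` for characters). A minimal standalone sketch; `count_words_and_chars` is a hypothetical helper that mirrors the handler's two lines and is not part of the notebook:

```python
def count_words_and_chars(text: str) -> tuple[int, int]:
    """Mirror TextProcessor: whitespace-separated words, raw character length."""
    # str.split() with no arguments collapses runs of whitespace and
    # ignores leading/trailing whitespace, so padding never adds words.
    word_count = len(text.split()) if text.strip() else 0
    # len() counts every character, including the padding spaces.
    char_count = len(text)
    return word_count, char_count


empty = count_words_and_chars("")
padded = count_words_and_chars("   Spaces   around   text   ")
print(empty)   # (0, 0)
print(padded)  # (3, 28)
```

Padding therefore inflates the character count but never the word count.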

In [None]:
test_texts = [
    "Hello world! This is a simple test.",
    "Python is a powerful programming language used for many applications.",
    "Short text.",
    "This is a longer text with multiple sentences. It contains more words and characters. We use it to test our text processing workflow.",
    "",  # Empty string
    "   Spaces   around   text   ",
]

print(f"✓ Prepared {len(test_texts)} test texts")

## Run Workflow

Watch as:
1. Orchestrator dispatches 6 tasks concurrently
2. Each sub-workflow processes independently
3. Results are collected as they complete
4. AllTasksCompleted event fires when all done

In [None]:
print(f"\n🧪 Testing with {len(test_texts)} text strings")
print("=" * 60)

await main_workflow.run(test_texts)

## Display Results

In [None]:
print("\n📊 Processing Results:")
print("=" * 60)

# Sort results by task_id for consistent display
sorted_results = sorted(orchestrator.results, key=lambda r: r.task_id)

for result in sorted_results:
    preview = result.text[:30] + "..." if len(result.text) > 30 else result.text
    preview = preview.replace("\n", " ").strip() or "(empty)"
    print(f"✅ {result.task_id}: '{preview}' -> {result.word_count} words, {result.char_count} chars")
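
Sorting on `task_id` as a string is fine for the six tasks here, but lexicographic order puts `task_10` before `task_2` once a batch reaches ten items. A small sketch of a numeric sort key, assuming IDs keep the `task_<n>` shape generated by the orchestrator (`task_number` is a hypothetical helper):

```python
def task_number(task_id: str) -> int:
    """Extract the numeric suffix from an ID shaped like 'task_12'."""
    return int(task_id.rsplit("_", 1)[1])


ids = ["task_10", "task_2", "task_1"]
print(sorted(ids))                   # ['task_1', 'task_10', 'task_2']
print(sorted(ids, key=task_number))  # ['task_1', 'task_2', 'task_10']
```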

## Display Summary Statistics

In [None]:
summary = orchestrator.get_summary()

print("\n📈 Summary:")
print("=" * 60)
print(f"📄 Total texts processed: {summary['total_texts']}")
print(f"📝 Total words: {summary['total_words']}")
print(f"🔤 Total characters: {summary['total_characters']}")
print(f"📊 Average words per text: {summary['average_words_per_text']}")
print(f"📏 Average characters per text: {summary['average_characters_per_text']}")

print("\n🏁 Processing complete!")

## Expected Output Pattern

```
🚀 Setting up sub-workflow...
✓ Sub-workflow created
🔧 Setting up parent workflow...
✓ Parent workflow created

🧪 Testing with 6 text strings
============================================================
📄 Starting processing of 6 text strings
============================================================
📤 Dispatching task_1 to sub-workflow
📤 Dispatching task_2 to sub-workflow
📤 Dispatching task_3 to sub-workflow
📤 Dispatching task_4 to sub-workflow
📤 Dispatching task_5 to sub-workflow
📤 Dispatching task_6 to sub-workflow
🔍 Sub-workflow processing text (Task task_1): 'Hello world! This is a simple test.'
📊 Task task_1: 7 words, 35 characters
✅ Sub-workflow completed task task_1
📥 Collected result from task_1
🔍 Sub-workflow processing text (Task task_2): 'Python is a powerful programming language used for...'
📊 Task task_2: 10 words, 69 characters
✅ Sub-workflow completed task task_2
📥 Collected result from task_2
🔍 Sub-workflow processing text (Task task_3): 'Short text.'
📊 Task task_3: 2 words, 11 characters
✅ Sub-workflow completed task task_3
📥 Collected result from task_3
🔍 Sub-workflow processing text (Task task_4): 'This is a longer text with multiple sentences. It ...'
📊 Task task_4: 23 words, 133 characters
✅ Sub-workflow completed task task_4
📥 Collected result from task_4
🔍 Sub-workflow processing text (Task task_5): ''
📊 Task task_5: 0 words, 0 characters
✅ Sub-workflow completed task task_5
📥 Collected result from task_5
🔍 Sub-workflow processing text (Task task_6): '   Spaces   around   text   '
📊 Task task_6: 3 words, 28 characters
✅ Sub-workflow completed task task_6
📥 Collected result from task_6

🎉 All tasks completed!

📊 Processing Results:
============================================================
✅ task_1: 'Hello world! This is a simple ...' -> 7 words, 35 chars
✅ task_2: 'Python is a powerful programmi...' -> 10 words, 69 chars
✅ task_3: 'Short text.' -> 2 words, 11 chars
✅ task_4: 'This is a longer text with mul...' -> 23 words, 133 chars
✅ task_5: '(empty)' -> 0 words, 0 chars
✅ task_6: 'Spaces   around   text' -> 3 words, 28 chars

📈 Summary:
============================================================
📄 Total texts processed: 6
📝 Total words: 45
🔤 Total characters: 276
📊 Average words per text: 7.5
📏 Average characters per text: 46.0

🏁 Processing complete!
```

## Key Takeaways

### 1. WorkflowExecutor Pattern

```python
# Create sub-workflow
sub_workflow = WorkflowBuilder().set_start_executor(processor).build()

# Embed in parent as executor
workflow_executor = WorkflowExecutor(sub_workflow, id="sub_workflow_id")

# Use in parent graph
parent = (
    WorkflowBuilder()
    .set_start_executor(orchestrator)
    .add_edge(orchestrator, workflow_executor)
    .add_edge(workflow_executor, orchestrator)  # Results back to parent
    .build()
)
```

**Benefits:**
- Workflow reusability
- Clear separation of concerns
- Independent testing
- Modular composition

### 2. Sub-Workflow Completion

```python
# In sub-workflow executor
@handler
async def process(self, request: Request, ctx: WorkflowContext[Never, Result]) -> None:
    result = do_processing(request)
    
    # Signal completion and return result to parent
    await ctx.yield_output(result)
```

**Key Points:**
- `yield_output()` sends result to parent workflow
- Sub-workflow completes after yielding
- Parent receives result via handler
- Multiple yields create multiple results

### 3. Concurrent Task Dispatching

```python
@handler
async def start_processing(self, items: list[str], ctx) -> None:
    for i, item in enumerate(items):
        request = ProcessingRequest(data=item, id=f"task_{i}")
        # Awaiting send_message only hands the message off; it does not wait for the result
        await ctx.send_message(request, target_id="sub_workflow")
```

**Execution Model:**
- Messages sent without blocking
- Sub-workflows process in parallel
- Results arrive asynchronously
- Parent handles each result independently
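
The same dispatch-then-collect shape can be reproduced with plain `asyncio`, which may help when reasoning about ordering. This is only an analogy under stated assumptions: `fake_sub_workflow` and its delays are invented for illustration, and Agent Framework's own scheduling is not modeled here.

```python
import asyncio


async def fake_sub_workflow(task_id: str, delay: float) -> str:
    """Hypothetical stand-in for a sub-workflow: waits, then returns its task ID."""
    await asyncio.sleep(delay)
    return task_id


async def main() -> list[str]:
    # Start all three tasks before waiting on any of them.
    delays = {"task_1": 0.03, "task_2": 0.01, "task_3": 0.02}
    pending = [asyncio.create_task(fake_sub_workflow(t, d)) for t, d in delays.items()]

    arrival_order = []
    # as_completed yields results in completion order, not dispatch order.
    for finished in asyncio.as_completed(pending):
        arrival_order.append(await finished)
    return arrival_order


arrival_order = asyncio.run(main())
print(arrival_order)
```

Inside a notebook cell, where an event loop is already running, `await main()` replaces `asyncio.run(main())`. Because arrival order follows completion time, the collecting side should not assume results come back in the order they were dispatched.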

### 4. Result Collection Pattern

```python
class Orchestrator(Executor):
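    results: list[Result] = []
    expected_count: int = 0

    def __init__(self):
        super().__init__(id="orchestrator")
        self.results = []  # fresh list per instance
        self.expected_count = 0
    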
    @handler
    async def start(self, items: list, ctx) -> None:
        self.expected_count = len(items)
        for item in items:
            await ctx.send_message(item, target_id="sub_workflow")
    
    @handler
    async def collect_result(self, result: Result, ctx) -> None:
        self.results.append(result)
        
        if len(self.results) == self.expected_count:
            # All complete!
            await ctx.add_event(AllTasksCompleted(self.results))
```

**Best Practices:**
- Track expected result count
- Use task IDs for correlation
- Emit custom events for batch completion
- Handle partial failures gracefully
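
One detail that is easy to miss in collectors like this is where the results list lives. In a plain Python class, a list assigned in the class body is a single object shared by every instance, so a second collector would start with the first one's results. Whether this affects an `Executor` subclass depends on how the base class handles class-level fields in your installed version; resetting the list in `__init__` is a cheap safeguard. A framework-independent sketch (`SharedCollector` and `IsolatedCollector` are hypothetical names):

```python
class SharedCollector:
    results: list[str] = []  # one list object, shared by all instances

    def collect(self, item: str) -> None:
        self.results.append(item)


class IsolatedCollector:
    def __init__(self) -> None:
        self.results: list[str] = []  # fresh list per instance

    def collect(self, item: str) -> None:
        self.results.append(item)


first, second = SharedCollector(), SharedCollector()
first.collect("task_1")
print(second.results)  # ['task_1'], leaked from `first`

third, fourth = IsolatedCollector(), IsolatedCollector()
third.collect("task_1")
print(fourth.results)  # []
```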

### 5. Custom Events

```python
class AllTasksCompleted(WorkflowEvent):
    def __init__(self, results: list[Result]):
        super().__init__(results)

# In executor
await ctx.add_event(AllTasksCompleted(self.results))
```

**Use Cases:**
- Signaling batch completion
- Triggering downstream processing
- Workflow observability
- State machine transitions

### 6. Message Type Design

```python
@dataclass
class ProcessingRequest:
    """Input to sub-workflow."""
    text: str
    task_id: str  # For result correlation

@dataclass
class ProcessingResult:
    """Output from sub-workflow."""
    task_id: str  # Matches request
    data: Any
```

**Correlation Strategy:**
- Include task_id in both request and result
- Parent can match results to original requests
- Essential for concurrent processing
- Enables out-of-order result handling
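
The correlation step itself is a dictionary lookup keyed on `task_id`. A minimal sketch with simplified, hypothetical `Request` and `Result` types, independent of Agent Framework:

```python
from dataclasses import dataclass


@dataclass
class Request:
    task_id: str
    text: str


@dataclass
class Result:
    task_id: str
    word_count: int


requests = [Request("task_1", "one two"), Request("task_2", "three")]
# Results arriving in a different order than the requests were sent.
results = [Result("task_2", 1), Result("task_1", 2)]

# Index requests by task_id so each result can find its origin in O(1).
by_id = {req.task_id: req for req in requests}
paired = {res.task_id: (by_id[res.task_id].text, res.word_count) for res in results}
print(paired["task_1"])  # ('one two', 2)
print(paired["task_2"])  # ('three', 1)
```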

### 7. Sub-Workflow Isolation

**Independence:**
- Sub-workflows don't know they're nested
- Can be tested standalone
- Can be reused in different contexts
- State is isolated from parent

**Communication:**
- Parent → Sub: Send message to WorkflowExecutor
- Sub → Parent: yield_output() from sub-workflow
- No direct state sharing
- All communication via messages

### 8. Production Patterns

#### Error Handling
```python
@handler
async def collect_result(self, result: Result, ctx) -> None:
    if isinstance(result, ErrorResult):
        self.errors.append(result)
    else:
        self.results.append(result)
    
    if len(self.results) + len(self.errors) == self.expected_count:
        # Report completion with error summary
        await ctx.add_event(BatchCompleted(
            results=self.results,
            errors=self.errors
        ))
```
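
`ErrorResult` and `BatchCompleted` are not defined anywhere in this notebook. One possible shape, shown as a framework-free sketch with hypothetical names and fields, is a small dataclass for failures plus a collector that counts successes and failures toward the same expected total:

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Result:
    task_id: str
    word_count: int


@dataclass
class ErrorResult:
    task_id: str
    message: str  # hypothetical field: why the task failed


@dataclass
class BatchCollector:
    expected_count: int
    results: list[Result] = field(default_factory=list)
    errors: list[ErrorResult] = field(default_factory=list)

    def collect(self, item: Union[Result, ErrorResult]) -> bool:
        """Store the item; return True once every task is accounted for."""
        if isinstance(item, ErrorResult):
            self.errors.append(item)
        else:
            self.results.append(item)
        return len(self.results) + len(self.errors) == self.expected_count


collector = BatchCollector(expected_count=2)
print(collector.collect(Result("task_1", 7)))                 # False
print(collector.collect(ErrorResult("task_2", "bad input")))  # True
```

Counting failures toward the total keeps the batch from waiting forever on a task that will never produce a normal result.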

#### Timeout Handling
```python
# Bound the whole run with asyncio; parent and sub-workflows share the 30s budget
try:
    await asyncio.wait_for(main_workflow.run(test_texts), timeout=30)
except asyncio.TimeoutError:
    print("Workflow did not finish within 30s")
```

#### Resource Management
```python
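# Limit concurrent sub-workflows: send one batch, then refill as results arrive
@handler
async def start(self, items: list, ctx) -> None:
    batch_size = 10
    self.expected_count = len(items)
    self.pending = list(items)
    for _ in range(min(batch_size, len(self.pending))):
        await ctx.send_message(self.pending.pop(0), target_id="sub_workflow")

@handler
async def collect_result(self, result: Result, ctx) -> None:
    self.results.append(result)
    if self.pending:  # a slot just freed up, so dispatch the next item
        await ctx.send_message(self.pending.pop(0), target_id="sub_workflow")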
```
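
For comparison, plain `asyncio` code usually caps in-flight work with a semaphore. The sketch below is a stdlib analogy, not Agent Framework API: `process` is a hypothetical stand-in for a sub-workflow, and the `peak` counter exists only to show that the cap holds.

```python
import asyncio


async def run_limited(items: list[str], limit: int) -> tuple[list[str], int]:
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def process(item: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # at most `limit` coroutines pass this point
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # simulated work
            in_flight -= 1
            return item.upper()

    # gather preserves input order in its result list.
    results = await asyncio.gather(*(process(item) for item in items))
    return list(results), peak


results, peak = asyncio.run(run_limited(["a", "b", "c", "d", "e"], limit=2))
print(results, peak)  # ['A', 'B', 'C', 'D', 'E'] 2
```

In a notebook cell, `await run_limited(...)` replaces the `asyncio.run(...)` call.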