# Generators With Practical Implementation

---

## Table of Contents
1. [Introduction](#introduction)
2. [What are Generators?](#what-are-generators)
3. [Generator Functions](#generator-functions)
4. [Generator Expressions](#generator-expressions)
5. [Generator vs Iterator vs List](#generator-vs-iterator)
6. [Generator Methods](#generator-methods)
7. [Practical Examples](#practical-examples)
8. [Advanced Generator Patterns](#advanced-patterns)
9. [Generator Pipelines](#generator-pipelines)
10. [Best Practices](#best-practices)
11. [Summary](#summary)

---

## 1. Introduction <a id='introduction'></a>

**Generators** are a simple and powerful tool for creating iterators in Python. They allow you to iterate over data without storing the entire dataset in memory.

**Key Points:**
- Generator functions use `yield` instead of `return`; calling one returns a generator object
- They generate values lazily (on-demand)
- They are memory efficient and suitable for large datasets
- Simpler syntax compared to iterator classes
- Automatically implement the iterator protocol

**Real-Life Analogy:**
Think of a generator like a vending machine that dispenses items one at a time. Instead of carrying an entire box of snacks (storing all data in memory), you get one snack each time you press a button (call `next()`).

---

## 2. What are Generators? <a id='what-are-generators'></a>

A **generator** is a special type of iterator that generates values on-the-fly and can only be iterated once.

### Key Characteristics:

1. **Lazy Evaluation**: Values are computed only when needed
2. **Memory Efficient**: They don't store the entire sequence in memory
3. **Stateful**: Remember execution state between calls
4. **One-time Use**: Once exhausted, cannot be reused
5. **Simple Syntax**: Easier than writing iterator classes
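
The one-time-use behaviour can be sketched as follows. The `squares` generator function is a hypothetical helper introduced only for this illustration:

```python
def squares(n):
    """Yield the squares of 0..n-1 (hypothetical helper for illustration)."""
    for i in range(n):
        yield i * i

gen = squares(4)
first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # the same generator object is now exhausted

print(first_pass)         # [0, 1, 4, 9]
print(second_pass)        # []

# To iterate again, call the generator function again for a fresh object.
fresh_pass = list(squares(4))
print(fresh_pass)         # [0, 1, 4, 9]
```

Exhaustion does not raise an error in a `for` loop or in `list()`; the second pass is simply empty, which is worth keeping in mind when a generator is passed to more than one consumer.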

### Two Ways to Create Generators:

| Type | Syntax | Use Case |
|------|--------|----------|
| **Generator Function** | Uses `yield` keyword | Complex logic, multiple values |
| **Generator Expression** | Like list comprehension with `()` | Simple transformations |

In [None]:
# Simple generator function example

def count_up_to(n):
    """Generator that counts from 1 to n"""
    count = 1
    while count <= n:
        yield count  # Pause here and return value
        count += 1

# Using the generator
print("Using generator:")
counter = count_up_to(5)
print(f"Type: {type(counter)}")
print()

for num in counter:
    print(num, end=' ')
print()

In [None]:
# Comparing list vs generator

import sys

# List - stores all values in memory
# (sys.getsizeof counts only the list object itself, not the integers it references)
list_numbers = [x for x in range(10000)]
print(f"List size: {sys.getsizeof(list_numbers)} bytes")

# Generator - generates values on demand
gen_numbers = (x for x in range(10000))
print(f"Generator size: {sys.getsizeof(gen_numbers)} bytes")

print(f"\nMemory saved: {sys.getsizeof(list_numbers) - sys.getsizeof(gen_numbers)} bytes")

---

## 3. Generator Functions <a id='generator-functions'></a>

A **generator function** is a function that contains at least one `yield` statement.

### How `yield` Works:

1. When `yield` is encountered, the function pauses and returns the value
2. The function's state is saved (local variables, execution position)
3. On the next `next()` call, execution resumes right after the `yield`
4. Function terminates when it reaches the end or a `return` statement
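
The steps above can be observed with `inspect.getgeneratorstate` from the standard library. This is a small sketch; `two_steps` is a hypothetical generator written only to show the state changes:

```python
import inspect

def two_steps():
    """Hypothetical generator used only to observe state changes."""
    x = 10
    yield x          # first pause
    x += 5           # runs only when the generator is resumed
    yield x          # second pause; x kept its value across the pause

gen = two_steps()
states = [inspect.getgeneratorstate(gen)]    # created, body not started yet

first = next(gen)                            # runs up to the first yield
states.append(inspect.getgeneratorstate(gen))

second = next(gen)                           # resumes right after the first yield
leftover = list(gen)                         # runs to the end of the body
states.append(inspect.getgeneratorstate(gen))

print(first, second, leftover)   # 10 15 []
print(states)                    # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```

Note that calling `two_steps()` runs none of the body; the first line executes only on the first `next()`.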

### Difference: `return` vs `yield`

| `return` | `yield` |
|----------|--------|
| Terminates function | Pauses function |
| Returns single value | Returns multiple values (one at a time) |
| Function ends | Function can resume |
| State is lost | State is preserved |

In [None]:
# Example 1: Simple generator

def simple_generator():
    """Demonstrates yield behavior"""
    print("Starting generator")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("Generator finished")

# Using the generator
gen = simple_generator()
print("Created generator\n")

print(f"First call: {next(gen)}")
print(f"Second call: {next(gen)}")
print(f"Third call: {next(gen)}")
print("\nTrying to call again:")
try:
    next(gen)
except StopIteration:
    print("StopIteration raised - generator exhausted")

In [None]:
# Example 2: Fibonacci generator

def fibonacci(n):
    """Generate first n Fibonacci numbers"""
    a, b = 0, 1
    count = 0
    
    while count < n:
        yield a
        a, b = b, a + b
        count += 1

print("First 10 Fibonacci numbers:")
for num in fibonacci(10):
    print(num, end=' ')
print()

In [None]:
# Example 3: Range generator (like built-in range)

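def my_range(start, stop, step=1):
    """Generator that mimics range() for integer arguments"""
    if step == 0:
        raise ValueError("step must not be zero")
    current = start
    # Count up for a positive step, down for a negative one
    while (current < stop) if step > 0 else (current > stop):
        yield current
        current += step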

print("Using custom range generator:")
for i in my_range(0, 10, 2):
    print(i, end=' ')
print()

In [None]:
# Example 4: Infinite generator

def infinite_sequence():
    """Generate infinite sequence of numbers"""
    num = 0
    while True:
        yield num
        num += 1

# Take only the first 10 values; the generator itself never ends
print("First 10 numbers from infinite generator:")
gen = infinite_sequence()
for i in range(10):
    print(next(gen), end=' ')
print()

In [None]:
# Example 5: Generator with code before and after the loop

def countdown(n):
    """Countdown from n to 1"""
    print("Starting countdown!")
    while n > 0:
        yield n
        n -= 1
    print("Blast off!")

for num in countdown(5):
    print(num)

---

## 4. Generator Expressions <a id='generator-expressions'></a>

**Generator expressions** are a concise way to create generators, similar to list comprehensions but using parentheses `()` instead of square brackets `[]`.

### Syntax:

```python
# List comprehension (creates entire list in memory)
list_comp = [x**2 for x in range(10)]

# Generator expression (generates values on-demand)
gen_exp = (x**2 for x in range(10))
```

### When to Use Generator Expressions:

- Simple transformations
- One-line logic
- Memory-efficient iteration
- Pipeline operations
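
A minimal sketch of the "pipeline" and "on-demand" points: two generator expressions are chained, and a hypothetical `traced_square` helper records its calls so that the laziness becomes visible.

```python
calls = []

def traced_square(x):
    """Hypothetical helper that records each call so laziness is visible."""
    calls.append(x)
    return x * x

# Neither expression computes anything when it is created.
squares = (traced_square(x) for x in range(1_000_000))
evens = (s for s in squares if s % 2 == 0)

calls_before = len(calls)
first_three = [next(evens) for _ in range(3)]

print(calls_before)   # 0
print(first_three)    # [0, 4, 16]
print(calls)          # [0, 1, 2, 3, 4]
```

Only five of the one million inputs were ever squared, because each `next()` pulls just enough values through the chain to produce one result.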

In [None]:
# Comparing list comprehension vs generator expression

# List comprehension - creates entire list
squares_list = [x**2 for x in range(10)]
print("List comprehension:")
print(f"Type: {type(squares_list)}")
print(f"Values: {squares_list}")
print()

# Generator expression - creates generator
squares_gen = (x**2 for x in range(10))
print("Generator expression:")
print(f"Type: {type(squares_gen)}")
print(f"Values: {list(squares_gen)}")  # Convert to list to see values
print("Note: Generator is now exhausted!")

In [None]:
# Generator expression with filtering

# Even numbers squared
even_squares = (x**2 for x in range(20) if x % 2 == 0)

print("Even numbers squared:")
for value in even_squares:
    print(value, end=' ')
print()

In [None]:
# Using generator expressions with built-in functions

numbers = range(1, 11)

# Sum of squares
sum_squares = sum(x**2 for x in numbers)
print(f"Sum of squares: {sum_squares}")

# Maximum squared value
max_square = max(x**2 for x in numbers)
print(f"Maximum square: {max_square}")

# Any even number?
has_even = any(x % 2 == 0 for x in numbers)
print(f"Has even number: {has_even}")

# All positive?
all_positive = all(x > 0 for x in numbers)
print(f"All positive: {all_positive}")

In [None]:
# Generator expression vs list comprehension performance

import time

n = 1000000

# List comprehension
start = time.perf_counter()
sum_list = sum([x**2 for x in range(n)])
list_time = time.perf_counter() - start

# Generator expression
start = time.perf_counter()
sum_gen = sum(x**2 for x in range(n))
gen_time = time.perf_counter() - start

print(f"List comprehension time: {list_time:.4f} seconds")
print(f"Generator expression time: {gen_time:.4f} seconds")
# The generator's main benefit is lower memory use; it is not always faster
print(f"Time ratio (list / generator): {list_time/gen_time:.2f}")

---

## 5. Generator vs Iterator vs List <a id='generator-vs-iterator'></a>

Understanding the differences helps choose the right tool.
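
One difference worth sketching is reusability. A generator object can be consumed once, but an iterable class can return a fresh generator on every pass by writing `__iter__` as a generator function. The `Squares` class here is a hypothetical example, not a standard-library type:

```python
class Squares:
    """Hypothetical reusable iterable: each __iter__ call returns a new generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # This method contains yield, so calling it creates a fresh
        # generator every time a loop (or list()) starts iterating.
        for i in range(self.n):
            yield i ** 2

squares = Squares(4)
first_pass = list(squares)
second_pass = list(squares)   # works again, unlike a bare generator object

print(first_pass)    # [0, 1, 4, 9]
print(second_pass)   # [0, 1, 4, 9]
```

This keeps the short syntax of `yield` while allowing multiple passes, at the cost of recomputing the values on each pass.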

### Comparison:

| Feature | List | Iterator Class | Generator |
|---------|------|----------------|----------|
| **Memory** | Stores all items | Generates on-demand | Generates on-demand |
| **Syntax** | `[...]` | Class with `__iter__` & `__next__` | Function with `yield` |
| **Reusable** | Yes | Depends on implementation | No (exhaustible) |
| **Performance** | Fast access, high memory | Memory efficient | Memory efficient |
| **Complexity** | Simple | Complex | Simple |
| **Use Case** | Small datasets | Complex iteration logic | Large datasets, pipelines |

In [None]:
# Comparing all three approaches

# 1. List - stores everything in memory
def squares_list(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result

# 2. Iterator class
class SquaresIterator:
    def __init__(self, n):
        self.n = n
        self.current = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current >= self.n:
            raise StopIteration
        result = self.current ** 2
        self.current += 1
        return result

# 3. Generator function
def squares_generator(n):
    for i in range(n):
        yield i ** 2

# Test all three
n = 5

print("List:")
print(squares_list(n))

print("\nIterator:")
for val in SquaresIterator(n):
    print(val, end=' ')
print()

print("\nGenerator:")
for val in squares_generator(n):
    print(val, end=' ')
print()

print("\nCode comparison:")
print("- List: Simple but uses more memory")
print("- Iterator: Memory efficient but complex code")
print("- Generator: Memory efficient AND simple code!")

---

## 6. Generator Methods <a id='generator-methods'></a>

Generators have special methods for advanced control:

### Generator Methods:

| Method | Description | Use Case |
|--------|-------------|----------|
| `__next__()` or `next()` | Get next value | Manual iteration |
| `send(value)` | Send value to generator | Two-way communication |
| `throw(exception)` | Raise exception in generator | Error handling |
| `close()` | Stop generator | Cleanup resources |

In [None]:
# Example 1: send() method

def echo_generator():
    """Generator that echoes values sent to it"""
    while True:
        received = yield
        print(f"Received: {received}")

gen = echo_generator()
next(gen)  # Prime the generator

print("Sending values to generator:")
gen.send("Hello")
gen.send(42)
gen.send([1, 2, 3])
gen.close()  # Close the generator

In [None]:
# Example 2: Accumulator with send()

def accumulator():
    """Generator that accumulates sent values"""
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value

acc = accumulator()
next(acc)  # Prime the generator

print("Accumulating values:")
print(f"Send 10: {acc.send(10)}")
print(f"Send 20: {acc.send(20)}")
print(f"Send 30: {acc.send(30)}")
print(f"Current total: {acc.send(None)}")

In [None]:
# Example 3: throw() method

def resilient_generator():
    """Generator that handles exceptions"""
    count = 0
    while True:
        try:
            yield count
            count += 1
        except ValueError:
            print("ValueError caught! Continuing...")
        except Exception as e:
            print(f"Other exception: {e}")
            break

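gen = resilient_generator()

print(next(gen))
print(next(gen))
# Pass an exception instance; the generator catches it and yields again
gen.throw(ValueError("Test error"))
print(next(gen))
try:
    gen.throw(RuntimeError("Fatal error"))  # Generator breaks out of its loop
except StopIteration:
    print("Generator finished after RuntimeError")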

In [None]:
# Example 4: close() method

def file_reader(filename):
    """Generator that reads file and closes properly"""
    print(f"Opening {filename}")
    try:
        with open(filename, 'r') as f:
            for line in f:
                yield line.strip()
    finally:
        print(f"Closing {filename}")

# Create a test file
with open('test.txt', 'w') as f:
    f.write("Line 1\nLine 2\nLine 3\n")

# Use generator
reader = file_reader('test.txt')
print(next(reader))
print(next(reader))
reader.close()  # Cleanup happens here

---

## 7. Practical Examples <a id='practical-examples'></a>

Let's explore real-world applications of generators.

In [None]:
# Example 1: Reading Large Files

def read_large_file(file_path, chunk_size=1024):
    """Read large file in chunks (memory efficient)"""
    with open(file_path, 'r') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Create a test file
with open('large_file.txt', 'w') as f:
    for i in range(100):
        f.write(f"Line {i}\n")

# Process file in chunks
print("Reading file in chunks:")
chunk_count = 0
for chunk in read_large_file('large_file.txt', chunk_size=50):
    chunk_count += 1
    if chunk_count <= 3:
        print(f"Chunk {chunk_count}: {chunk[:30]}...")

print(f"\nTotal chunks processed: {chunk_count}")

In [None]:
# Example 2: Pagination Generator

def paginate(data, page_size=3):
    """Yield data in pages"""
    for i in range(0, len(data), page_size):
        yield data[i:i + page_size]

# Sample data
items = list(range(1, 21))

print("Paginated data:")
for page_num, page in enumerate(paginate(items, page_size=5), 1):
    print(f"Page {page_num}: {page}")

In [None]:
# Example 3: Data Stream Simulator

import random
import time

def data_stream(count=10):
    """Simulate streaming data (like sensor readings)"""
    for i in range(count):
        # Simulate data arriving over time
        time.sleep(0.1)
        yield {
            'id': i,
            'temperature': random.randint(15, 35),
            'humidity': random.randint(30, 70),
            'timestamp': time.time()
        }

print("Streaming sensor data:")
for data in data_stream(5):
    print(f"ID: {data['id']}, Temp: {data['temperature']}°C, Humidity: {data['humidity']}%")

In [None]:
# Example 4: Infinite ID Generator

def id_generator(prefix="ID"):
    """Generate unique IDs infinitely"""
    counter = 1
    while True:
        yield f"{prefix}-{counter:04d}"
        counter += 1

# Create ID generator
ids = id_generator("USER")

print("Generating unique IDs:")
for _ in range(10):
    print(next(ids), end=' ')
print()

In [None]:
# Example 5: Prime Number Generator

def prime_generator():
    """Generate infinite sequence of prime numbers"""
    def is_prime(n):
        if n < 2:
            return False
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                return False
        return True
    
    num = 2
    while True:
        if is_prime(num):
            yield num
        num += 1

# Get first 20 primes
primes = prime_generator()
print("First 20 prime numbers:")
for _ in range(20):
    print(next(primes), end=' ')
print()

In [None]:
# Example 6: Moving Average Calculator

from collections import deque

def moving_average(data, window_size):
    """Calculate moving average"""
    # deque(maxlen=...) drops the oldest value automatically
    window = deque(maxlen=window_size)
    for value in data:
        window.append(value)
        if len(window) == window_size:
            yield sum(window) / window_size

# Sample data
prices = [10, 12, 13, 15, 14, 16, 18, 20, 19, 21]

print("Stock prices:")
print(prices)
print("\n3-day moving average:")
for avg in moving_average(prices, 3):
    print(f"{avg:.2f}", end=' ')
print()

---

## 8. Advanced Generator Patterns <a id='advanced-patterns'></a>

Let's explore advanced generator patterns and techniques.

In [None]:
# Pattern 1: Generator Delegation (yield from)

def generator1():
    yield 1
    yield 2

def generator2():
    yield 3
    yield 4

def combined_generator():
    """Delegate to other generators using yield from"""
    yield from generator1()
    yield from generator2()
    yield 5

print("Combined generator output:")
for value in combined_generator():
    print(value, end=' ')
print()

In [None]:
# Pattern 2: Generator Flattening

def flatten(nested_list):
    """Flatten nested lists recursively"""
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)  # Recursive flattening
        else:
            yield item

# Nested list
nested = [1, [2, 3, [4, 5]], 6, [7, [8, 9]]]

print("Original nested list:")
print(nested)
print("\nFlattened:")
print(list(flatten(nested)))

In [None]:
# Pattern 3: Generator Tree Traversal

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def inorder_traversal(node):
    """In-order traversal using generator"""
    if node:
        yield from inorder_traversal(node.left)
        yield node.value
        yield from inorder_traversal(node.right)

# Create a binary tree
#       4
#      / \
#     2   6
#    / \ / \
#   1  3 5  7
tree = TreeNode(4,
    TreeNode(2, TreeNode(1), TreeNode(3)),
    TreeNode(6, TreeNode(5), TreeNode(7))
)

print("In-order traversal:")
for value in inorder_traversal(tree):
    print(value, end=' ')
print()

In [None]:
# Pattern 4: Stateful Generator

def running_statistics():
    """Calculate running mean and variance"""
    count = 0
    total = 0
    sum_squares = 0
    
    while True:
        value = yield
        if value is None:
            break
        count += 1
        total += value
        sum_squares += value ** 2
        mean = total / count
        variance = (sum_squares / count) - (mean ** 2)
        print(f"Value: {value}, Mean: {mean:.2f}, Variance: {variance:.2f}")

# Use the generator
stats = running_statistics()
next(stats)  # Prime the generator

print("Calculating running statistics:")
for value in [10, 20, 30, 40, 50]:
    stats.send(value)

In [None]:
# Pattern 5: Coroutine Pattern

def coroutine(func):
    """Decorator to prime a coroutine"""
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # Prime it
        return gen
    return wrapper

@coroutine
def grep(pattern):
    """Coroutine that prints lines containing pattern (case-sensitive)"""
    print(f"Looking for pattern: {pattern}")
    while True:
        line = yield
        if pattern in line:
            print(f"Found: {line}")

# Use the coroutine
python_matcher = grep("python")

lines = [
    "I love python",
    "Java is good",
    "Python is great",
    "JavaScript rocks"
]

for line in lines:
    python_matcher.send(line)

---

## 9. Generator Pipelines <a id='generator-pipelines'></a>

**Generator pipelines** chain multiple generators together for efficient data processing.

### Benefits:
- Memory efficient (no intermediate lists)
- Composable (easy to add/remove stages)
- Lazy evaluation (process only what's needed)
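
A small sketch of the lazy-evaluation benefit: the pipeline is fed by an infinite source, yet only the values needed for the requested results are processed. The stage names (`naturals`, `only_multiples_of`, `doubled`) are hypothetical; `count` and `islice` come from the standard `itertools` module.

```python
from itertools import count, islice

processed = []

def naturals():
    """Infinite source: 1, 2, 3, ..."""
    yield from count(1)

def only_multiples_of(numbers, k):
    """Keep only the multiples of k."""
    for n in numbers:
        processed.append(n)      # record what the pipeline actually touched
        if n % k == 0:
            yield n

def doubled(numbers):
    """Double every incoming number."""
    for n in numbers:
        yield n * 2

pipeline = doubled(only_multiples_of(naturals(), 3))
first_four = list(islice(pipeline, 4))   # stop after four results

print(first_four)        # [6, 12, 18, 24]
print(len(processed))    # 12
```

Building the same pipeline with lists would never finish, because the first stage would try to materialize an infinite sequence.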

In [None]:
# Example 1: Simple Pipeline

def read_numbers(filename):
    """Stage 1: Read numbers from file"""
    with open(filename, 'r') as f:
        for line in f:
            yield int(line.strip())

def filter_even(numbers):
    """Stage 2: Filter even numbers"""
    for num in numbers:
        if num % 2 == 0:
            yield num

def square(numbers):
    """Stage 3: Square the numbers"""
    for num in numbers:
        yield num ** 2

# Create test file
with open('numbers.txt', 'w') as f:
    for i in range(1, 11):
        f.write(f"{i}\n")

# Build pipeline
pipeline = square(filter_even(read_numbers('numbers.txt')))

print("Pipeline output (even numbers squared):")
for value in pipeline:
    print(value, end=' ')
print()

In [None]:
# Example 2: Log File Processing Pipeline

def read_log_file(filename):
    """Read log file line by line"""
    with open(filename, 'r') as f:
        for line in f:
            yield line.strip()

def filter_errors(lines):
    """Filter lines containing ERROR"""
    for line in lines:
        if 'ERROR' in line:
            yield line

def extract_message(lines):
    """Extract message from log line"""
    for line in lines:
        parts = line.split(' - ')
        if len(parts) >= 2:
            yield parts[1]

# Create sample log file
with open('app.log', 'w') as f:
    f.write("INFO - Application started\n")
    f.write("ERROR - Connection failed\n")
    f.write("INFO - Processing data\n")
    f.write("ERROR - Invalid input\n")
    f.write("INFO - Task completed\n")

# Build pipeline
error_messages = extract_message(filter_errors(read_log_file('app.log')))

print("Error messages from log:")
for msg in error_messages:
    print(f"  - {msg}")

In [None]:
# Example 3: Data Transformation Pipeline

def data_source():
    """Generate sample data"""
    data = [
        {'name': 'Alice', 'age': 25, 'score': 85},
        {'name': 'Bob', 'age': 30, 'score': 92},
        {'name': 'Charlie', 'age': 22, 'score': 78},
        {'name': 'David', 'age': 28, 'score': 95},
    ]
    for item in data:
        yield item

def filter_by_age(data, min_age):
    """Filter by minimum age"""
    for item in data:
        if item['age'] >= min_age:
            yield item

def add_grade(data):
    """Add grade based on score"""
    for item in data:
        score = item['score']
        if score >= 90:
            item['grade'] = 'A'
        elif score >= 80:
            item['grade'] = 'B'
        else:
            item['grade'] = 'C'
        yield item

def format_output(data):
    """Format for display"""
    for item in data:
        yield f"{item['name']} (Age: {item['age']}) - Score: {item['score']}, Grade: {item['grade']}"

# Build pipeline
pipeline = format_output(
    add_grade(
        filter_by_age(data_source(), min_age=25)
    )
)

print("Filtered and processed data:")
for line in pipeline:
    print(f"  {line}")

---

## 10. Best Practices <a id='best-practices'></a>

### 1. Use Generators for Large Data
- Prefer generators over lists when dealing with large datasets
- Use generator expressions for simple transformations

### 2. Name Generators Clearly
- Use descriptive names that indicate they generate values
- Consider adding `_gen` suffix for clarity

### 3. Document Generator Behavior
- Document what values are yielded
- Specify if generator is finite or infinite
- Document any state or side effects

### 4. Handle Cleanup Properly
- Use try-finally for cleanup in generators
- Consider using context managers with generators
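
As a sketch of combining both points, `contextlib.closing` from the standard library calls `close()` on the generator when the `with` block exits, so a `finally` clause runs even if the loop stops early. The `tracked_lines` generator and the `events` list are hypothetical names used only to make the order of operations visible:

```python
from contextlib import closing

events = []

def tracked_lines(lines):
    """Hypothetical generator that records when its cleanup runs."""
    events.append("open")
    try:
        for line in lines:
            yield line
    finally:
        events.append("cleanup")

# closing() calls gen.close() on exit, so the finally block runs
# even though the loop stops before the generator is exhausted.
with closing(tracked_lines(["a", "b", "c"])) as gen:
    for line in gen:
        events.append(line)
        if line == "b":
            break

print(events)   # ['open', 'a', 'b', 'cleanup']
```

Without an explicit `close()`, cleanup of a half-consumed generator happens only when the object is garbage collected, which is harder to reason about.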

### 5. Don't Mix Return and Yield
- A generator may contain `return value`, but the value is not yielded; it is attached to the `StopIteration` exception as its `value` attribute
- Keep it simple: use `yield` for values and a bare `return` to exit early
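
To make the interaction between `return` and `StopIteration` concrete, here is a minimal sketch. The `averaged` generator is a hypothetical example:

```python
def averaged(values):
    """Yield each value, then return the average (hypothetical example)."""
    total = 0
    count = 0
    for v in values:
        total += v
        count += 1
        yield v
    # Not yielded: carried on the StopIteration that ends the generator.
    return total / count if count else None

gen = averaged([2, 4, 6])
seen = []
returned = None
while True:
    try:
        seen.append(next(gen))
    except StopIteration as stop:
        returned = stop.value
        break

print(seen)                        # [2, 4, 6]
print(returned)                    # 4.0
print(list(averaged([2, 4, 6])))   # [2, 4, 6]
```

A `for` loop or `list()` silently discards the returned value, which is why mixing the two styles can surprise readers.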

### 6. Use yield from for Delegation
- Prefer `yield from` over manual iteration when delegating to sub-generators
- It is more readable and also forwards `send()`, `throw()`, and `close()` to the sub-generator
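
Beyond saving a loop, `yield from` passes sent values through to the sub-generator and evaluates to its return value. A minimal sketch, with hypothetical generators `running_total` and `session`:

```python
def running_total():
    """Hypothetical sub-generator: adds up sent values until None arrives."""
    total = 0
    while True:
        value = yield total
        if value is None:
            return total          # becomes the value of `yield from`
        total += value

def session():
    """Delegating generator: send() calls pass straight through to running_total."""
    result = yield from running_total()
    yield f"final={result}"

gen = session()
outputs = [next(gen)]             # prime: runs to the first yield in running_total
outputs.append(gen.send(5))
outputs.append(gen.send(7))
outputs.append(gen.send(None))    # ends running_total; session yields the summary

print(outputs)   # [0, 5, 12, 'final=12']
```

A manual `for item in sub: yield item` loop would re-yield the values but would not forward `send()` or capture the return value.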

### 7. Consider Memory vs Speed Trade-offs
- Generators save memory but may be slower than lists for small data
- Use lists when you need random access or multiple iterations

### 8. Prime Coroutines
- Call `next()` once to prime a coroutine before sending it values
- Use a decorator to automate priming

### 9. Build Pipelines for Complex Processing
- Chain generators for multi-stage data processing
- Keep each stage focused on a single task

### 10. Test Edge Cases
- Test with empty input
- Test generator exhaustion
- Test cleanup code
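
These three checks can be sketched with plain `assert` statements. The `batches` and `with_cleanup` generators are hypothetical subjects under test, not part of any library:

```python
def batches(items, size):
    """Hypothetical generator under test: yields lists of up to `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# Empty input: the generator should yield nothing rather than fail.
assert list(batches([], 3)) == []

# Exhaustion: once finished, next() with a default returns the default.
gen = batches([1, 2, 3, 4], 3)
assert list(gen) == [[1, 2, 3], [4]]
assert next(gen, "done") == "done"

# Cleanup: close() must run finally blocks in a suspended generator.
cleaned = []

def with_cleanup():
    try:
        yield 1
        yield 2
    finally:
        cleaned.append(True)

g = with_cleanup()
next(g)          # suspend inside the try block
g.close()        # raises GeneratorExit at the yield, triggering finally
assert cleaned == [True]
```

The same assertions translate directly into test functions for a framework such as `unittest` or `pytest`.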

In [None]:
# Best Practice Example: Well-designed generator

def process_data_gen(filename, chunk_size=1000):
    """Process large data file efficiently.
    
    Yields processed data chunks from the file.
    
    Args:
        filename: Path to data file
        chunk_size: Number of lines to process at once
    
    Yields:
        dict: Processed data chunk
    
    Note:
        This is a finite generator that exhausts when file ends.
    """
    try:
        with open(filename, 'r') as file:
            chunk = []
            for line in file:
                # Process line
                data = line.strip()
                chunk.append(data)
                
                # Yield when chunk is full
                if len(chunk) >= chunk_size:
                    yield {'data': chunk, 'count': len(chunk)}
                    chunk = []
            
            # Yield remaining data
            if chunk:
                yield {'data': chunk, 'count': len(chunk)}
    
    finally:
        # Runs on normal completion, on error, and on early close()
        print("Processing finished; cleanup ran")

# Example usage
print("Good practices demonstrated:")
print("1. Clear, descriptive name with _gen suffix")
print("2. Comprehensive docstring")
print("3. Proper resource management (with statement)")
print("4. Cleanup in finally block")
print("5. Chunks data for memory efficiency")

---

## 11. Summary <a id='summary'></a>

### Key Takeaways:

1. **Generators** are functions that use `yield` to produce values lazily

2. **Two Types**:
   - **Generator Functions**: Use `yield` keyword
   - **Generator Expressions**: Like list comprehensions with `()`

3. **yield vs return**:
   - `yield`: Pauses function, preserves state, can yield multiple values
   - `return`: Terminates function, returns single value

4. **Benefits**:
   - Memory efficient (lazy evaluation)
   - Simple syntax (compared to iterator classes)
   - Can represent infinite sequences
   - Composable (pipelines)

5. **Generator Methods**:
   - `next()`: Get next value
   - `send()`: Send values to generator
   - `throw()`: Raise exception in generator
   - `close()`: Stop generator

6. **Advanced Features**:
   - `yield from`: Delegate to sub-generators
   - Coroutines: Two-way communication with generators
   - Pipelines: Chain generators for complex processing

### When to Use Generators:

```python
# Large datasets
def process_huge_file(filename):
    with open(filename) as f:  # Close the file even if iteration stops early
        for line in f:
            yield process(line)

# Infinite sequences
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Data pipelines
pipeline = transform3(transform2(transform1(data_source())))

# On-demand computation
results = (expensive_calculation(x) for x in data)
```

### Comparison Summary:

| Feature | List | Generator |
|---------|------|----------|
| **Memory** | High | Low |
| **Speed** | Fast indexing and repeated passes | Low startup cost, sequential access only |
| **Reusable** | Yes | No |
| **Syntax** | `[x for x in data]` | `(x for x in data)` or `yield` |
| **Use Case** | Small data, multiple passes | Large data, single pass |

### Common Use Cases:

- Processing large files line by line
- Generating infinite sequences
- Creating data processing pipelines
- Streaming data from databases/APIs
- Implementing iterators simply
- Pagination and batch processing
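
As a sketch of the streaming use case, the generator below requests one page at a time and yields individual rows. `fetch_page` is a hypothetical stand-in that returns fake rows; a real version would call a database or API client instead:

```python
def fetch_page(page, page_size=3, total=8):
    """Stand-in for a database or API call (hypothetical, returns fake rows)."""
    start = page * page_size
    return list(range(start, min(start + page_size, total)))

def iter_rows():
    """Yield rows one by one, requesting the next page only when needed."""
    page = 0
    while True:
        rows = fetch_page(page)
        if not rows:          # an empty page marks the end of the data
            return
        yield from rows
        page += 1

rows = list(iter_rows())
print(rows)   # [0, 1, 2, 3, 4, 5, 6, 7]
```

The caller sees a flat stream of rows and never needs to know about page boundaries, and a consumer that stops early avoids fetching the remaining pages.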

### Next Steps:

In the next notebook, we'll explore **Closures and Decorators**, which are powerful tools for modifying and enhancing function behavior!