# Chapter 14: Generators & Iterators

Master lazy evaluation, memory-efficient iteration, and generator patterns



### What are Iterators? (Slide 20)


<p><strong>Iterator</strong> - An object that implements the iterator protocol</p>
<p><strong>Iterator Protocol:</strong></p>
<ul>
<li><code>__iter__()</code> - Returns the iterator object</li>
<li><code>__next__()</code> - Returns the next item</li>
<li>Raises <code>StopIteration</code> when done</li>
</ul>
<p><strong>Iterable vs Iterator:</strong></p>
<ul>
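<li><strong>Iterable</strong> - Can be looped over; has <code>__iter__()</code>, which returns an iterator</li>
<li><strong>Iterator</strong> - Produces values one at a time; has <code>__next__()</code>, and its <code>__iter__()</code> returns itself</li>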
</ul>
<p><strong>Benefits:</strong></p>
<ul>
<li>Memory efficient - generates values on demand</li>
<li>Works with infinite sequences</li>
<li>Lazy evaluation</li>
</ul>
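
The iterable/iterator distinction can be checked directly. The following is a minimal sketch using the standard `collections.abc` classes; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is iterable, but it is not an iterator
numbers_iter = iter(numbers)  # iter() asks the iterable for a fresh iterator

list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
iter_is_iterator = isinstance(numbers_iter, Iterator)

# An iterator returns itself from __iter__, so it also works in a for loop
iter_returns_self = iter(numbers_iter) is numbers_iter

print(list_is_iterable, list_is_iterator, iter_is_iterator, iter_returns_self)
# True False True True
```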


### Built-in Iterators (Slide 21)


In [1]:
# Lists are iterable
numbers = [1, 2, 3, 4, 5]

# Get iterator
iterator = iter(numbers)
print(type(iterator))  # <class 'list_iterator'>

# Manual iteration
print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3

# for loop uses iterators internally
for num in numbers:
    print(num)  # The loop calls iter(numbers) once, then next() until StopIteration

# StopIteration when exhausted
try:
    while True:
        print(next(iterator))
except StopIteration:
    print("Iterator exhausted")


<class 'list_iterator'>
1
2
3
1
2
3
4
5
4
5
Iterator exhausted


> **Note:** for loops automatically handle StopIteration
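
To make the note concrete, here is roughly what a `for` loop does behind the scenes, written out as a `while` loop. This is a sketch of the behaviour, not the interpreter's actual implementation.

```python
numbers = [1, 2, 3]
collected = []

iterator = iter(numbers)       # the for loop calls iter() once
while True:
    try:
        item = next(iterator)  # then next() once per pass
    except StopIteration:
        break                  # StopIteration ends the loop silently
    collected.append(item)     # the loop body runs here

print(collected)  # [1, 2, 3]
```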


### Custom Iterator Class (Slide 22)


In [2]:
# Implement iterator protocol
class CountDown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # Return iterator (self)

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

# Use custom iterator
for num in CountDown(5):
    print(num)  # 5, 4, 3, 2, 1

# Manual usage
counter = CountDown(3)
print(next(counter))  # 3
print(next(counter))  # 2
print(next(counter))  # 1
# print(next(counter))  # StopIteration


5
4
3
2
1
3
2
1


> **Note:** An iterator must implement both `__iter__` and `__next__`
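
An iterator such as `CountDown` is exhausted after one pass. When an object should be loopable more than once, one common option is to make it an iterable whose `__iter__` returns a new iterator each time. The sketch below uses a hypothetical `CountDownRange` class (not part of the original example) and writes `__iter__` as a generator method.

```python
class CountDownRange:
    """Hypothetical re-iterable variant of a countdown."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator method: each call returns a new, independent iterator
        current = self.start
        while current > 0:
            yield current
            current -= 1

countdown = CountDownRange(3)
first_pass = list(countdown)
second_pass = list(countdown)  # works again because __iter__ starts over

print(first_pass, second_pass)  # [3, 2, 1] [3, 2, 1]
```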


### Generator Functions - Basics (Slide 23)


In [3]:
# Generator using yield
def count_up_to(n):
    count = 1
    while count <= n:
        yield count  # Pause and return value
        count += 1

# Generator object
gen = count_up_to(5)
print(type(gen))  # <class 'generator'>

# Use like iterator
print(next(gen))  # 1
print(next(gen))  # 2

# Use in for loop
for num in count_up_to(3):
    print(num)  # 1, 2, 3

# Generator pauses at yield
def simple_gen():
    print("Start")
    yield 1
    print("Middle")
    yield 2
    print("End")

for val in simple_gen():
    print(f"Got: {val}")


<class 'generator'>
1
2
1
2
3
Start
Got: 1
Middle
Got: 2
End


> **Note:** `yield` pauses the function and preserves its local state until the next value is requested
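
The pause-and-resume lifecycle can be observed with the standard library's `inspect.getgeneratorstate`. A small sketch, with `two_steps` as an illustrative generator:

```python
import inspect

def two_steps():
    yield 1
    yield 2

gen = two_steps()
states = [inspect.getgeneratorstate(gen)]      # before the first next()
next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the first yield
for _ in gen:                                  # drain the remaining values
    pass
states.append(inspect.getgeneratorstate(gen))  # the function body has returned

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```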


### Generator Expressions (Slide 24)


In [4]:
# List comprehension - creates entire list
squares_list = [x**2 for x in range(10)]
print(type(squares_list))  # <class 'list'>
print(squares_list)  # [0, 1, 4, 9, ...]

# Generator expression - lazy evaluation
squares_gen = (x**2 for x in range(10))
print(type(squares_gen))  # <class 'generator'>
print(squares_gen)  # <generator object ...>

# Use generator
for square in squares_gen:
    print(square)

# Memory efficient for large data
import sys
list_obj = [x for x in range(10000)]
gen_obj = (x for x in range(10000))

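print(sys.getsizeof(list_obj))  # ~85 KB for the list's own storage (the int objects are not counted)
print(sys.getsizeof(gen_obj))   # ~200 bytes, independent of how many items it will produce (varies by Python version)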


<class 'list'>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
<class 'generator'>
<generator object <genexpr> at 0x000002449FF0B440>
0
1
4
9
16
25
36
49
64
81
85176
200


> **Note:** Generators save memory for large datasets
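
The memory saving comes with a trade-off: a generator can be consumed only once. A short sketch of that behaviour, which also shows a generator expression passed directly as a function's only argument:

```python
squares = (x**2 for x in range(5))
first_total = sum(squares)   # consumes the generator: 0 + 1 + 4 + 9 + 16
second_total = sum(squares)  # nothing is left to consume

# No extra parentheses are needed when it is the only argument
direct_total = sum(x**2 for x in range(5))

print(first_total, second_total, direct_total)  # 30 0 30
```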


### Infinite Generators (Slide 25)


In [5]:
# Generators can be infinite
def infinite_counter():
    count = 0
    while True:
        yield count
        count += 1

# Use with limit
counter = infinite_counter()
for _ in range(5):
    print(next(counter))  # 0, 1, 2, 3, 4

# Fibonacci infinite generator
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Get first 10 fibonacci numbers
import itertools
fib = fibonacci()
first_10 = list(itertools.islice(fib, 10))
print(first_10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


0
1
2
3
4
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


> **Note:** Infinite sequences require lazy iteration (a generator, an iterator class, or `itertools.count`); a list cannot hold one
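
Besides `itertools.islice`, an infinite generator can be bounded by a condition with `itertools.takewhile`. A minimal sketch, with `powers_of_two` as an illustrative generator:

```python
import itertools

def powers_of_two():
    value = 1
    while True:
        yield value
        value *= 2

# takewhile stops at the first value that fails the predicate
below_100 = list(itertools.takewhile(lambda v: v < 100, powers_of_two()))

print(below_100)  # [1, 2, 4, 8, 16, 32, 64]
```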


### Generator with send() (Slide 26)


In [6]:
# Two-way communication with generators
def echo_generator():
    while True:
        received = yield  # Receive value
        print(f"Received: {received}")

gen = echo_generator()
next(gen)  # Prime the generator: run to the first yield before calling send()

gen.send("Hello")   # Received: Hello
gen.send("World")   # Received: World

# Accumulator example
def running_average():
    total = 0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)  # Prime
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0


Received: Hello
Received: World
10.0
15.0
20.0


> **Note:** `send()` resumes the generator, and the sent value becomes the result of the paused `yield` expression
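
Forgetting the priming `next()` call is a common mistake. One possible way to avoid it is a small decorator that primes the generator on creation. The `primed` helper and `running_total` generator below are hypothetical names for illustration, not a standard-library feature.

```python
import functools

def primed(generator_function):
    """Hypothetical helper: advance a new generator to its first yield."""
    @functools.wraps(generator_function)
    def wrapper(*args, **kwargs):
        gen = generator_function(*args, **kwargs)
        next(gen)  # run to the first yield so send() can be used at once
        return gen
    return wrapper

@primed
def running_total():
    total = 0
    while True:
        value = yield total
        total += value

totals = running_total()
results = [totals.send(5), totals.send(7), totals.send(3)]
totals.close()  # stop the generator explicitly when finished

print(results)  # [5, 12, 15]
```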


### yield from - Delegating (Slide 27)


In [7]:
# yield from delegates to sub-generator
def inner_gen():
    yield 1
    yield 2
    yield 3

def outer_gen():
    yield "start"
    yield from inner_gen()  # Delegate
    yield "end"

for value in outer_gen():
    print(value)
# Output: start, 1, 2, 3, end

# Flatten nested lists
def flatten(nested_list):
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)  # Recursive
        else:
            yield item

nested = [1, [2, 3, [4, 5]], 6, [7]]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 5, 6, 7]


start
1
2
3
end
[1, 2, 3, 4, 5, 6, 7]


> **Note:** yield from simplifies generator delegation
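
`yield from` also captures the sub-generator's `return` value, which a plain `for` loop over the sub-generator would discard. A minimal sketch with illustrative function names:

```python
def summing_inner():
    total = 0
    for value in (1, 2, 3):
        yield value
        total += value
    return total  # becomes the value of the `yield from` expression

def outer():
    subtotal = yield from summing_inner()
    yield f"subtotal={subtotal}"

values = list(outer())

print(values)  # [1, 2, 3, 'subtotal=6']
```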


### Generator Pipeline Pattern (Slide 28)


In [8]:
# Chain generators for data processing
def read_numbers(filename):
    with open(filename) as f:
        for line in f:
            yield int(line.strip())

def filter_even(numbers):
    for num in numbers:
        if num % 2 == 0:
            yield num

def square(numbers):
    for num in numbers:
        yield num ** 2

# Pipeline (memory efficient!)
# numbers = read_numbers('data.txt')
# evens = filter_even(numbers)
# squared = square(evens)
# result = list(squared)

# Or wrap the pipeline in a function
def process_file(filename):
    numbers = read_numbers(filename)
    evens = filter_even(numbers)
    return square(evens)

# Lazy - only processes when consumed
for value in process_file('data.txt'):
    print(value)


100
400
900
1600
3600
6400
10000


> **Note:** Pipelines process data without loading all into memory
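
The laziness of a pipeline can be observed without a file on disk. The sketch below feeds the same kind of stages from an in-memory list and records which lines are actually pulled. `traced` and `parse_numbers` are hypothetical helpers introduced for this illustration.

```python
pulled = []  # records which input lines were actually read

def traced(lines):
    """Hypothetical helper: log each line as the pipeline pulls it."""
    for line in lines:
        pulled.append(line)
        yield line

def parse_numbers(lines):
    for line in lines:
        yield int(line.strip())

def filter_even(numbers):
    for num in numbers:
        if num % 2 == 0:
            yield num

def square(numbers):
    for num in numbers:
        yield num ** 2

lines = ["10\n", "15\n", "20\n", "25\n", "30\n"]  # stands in for a file
pipeline = square(filter_even(parse_numbers(traced(lines))))

pulled_before = list(pulled)        # building the pipeline reads nothing
first = next(pipeline)              # 10 is even, so one line is enough
pulled_after_first = list(pulled)
second = next(pipeline)             # skips 15, then squares 20
pulled_after_second = list(pulled)

print(pulled_before, first, second)  # [] 100 400
```

Each `next()` call pulls only as many input lines as are needed to produce one output value.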


### itertools Module (Slide 29)


In [9]:
import itertools

# count - infinite counter
for i in itertools.count(10, 2):  # Start 10, step 2
    if i > 20:
        break
    print(i)  # 10, 12, 14, 16, 18, 20

# cycle - repeat infinitely
counter = 0
for color in itertools.cycle(['red', 'green', 'blue']):
    print(color)
    counter += 1
    if counter >= 7:
        break
# Output: red, green, blue, red, green, blue, red

# chain - combine iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = itertools.chain(list1, list2)
print(list(combined))  # [1, 2, 3, 4, 5, 6]

# islice - slice iterator
gen = (x for x in range(100))
first_10 = itertools.islice(gen, 10)
print(list(first_10))  # [0, 1, 2, ..., 9]


10
12
14
16
18
20
red
green
blue
red
green
blue
red
[1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


> **Note:** itertools provides powerful iterator utilities
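
One detail worth knowing about `islice`: it advances the underlying iterator rather than copying it, so a second slice continues where the first stopped. A short sketch:

```python
import itertools

letters = iter("abcdefgh")
first_three = list(itertools.islice(letters, 3))           # a, b, c
every_other = list(itertools.islice(letters, 0, None, 2))  # continues from d

print(first_three, every_other)  # ['a', 'b', 'c'] ['d', 'f', 'h']
```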


### itertools - Combinations (Slide 30)


In [10]:
import itertools

# product - Cartesian product
colors = ['red', 'blue']
sizes = ['S', 'M', 'L']
combos = itertools.product(colors, sizes)
print(list(combos))
# [('red', 'S'), ('red', 'M'), ('red', 'L'),
#  ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]

# combinations - unordered selections without repetition
items = [1, 2, 3, 4]
pairs = itertools.combinations(items, 2)
print(list(pairs))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

# permutations - all orderings
chars = ['A', 'B', 'C']
perms = itertools.permutations(chars, 2)
print(list(perms))
# [('A', 'B'), ('A', 'C'), ('B', 'A'),
#  ('B', 'C'), ('C', 'A'), ('C', 'B')]


[('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
[('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]


> **Note:** Great for combinatorial problems
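
The number of results from each function follows the usual counting formulas, which can help estimate the cost before materialising a large result. A sketch that checks the counts against `math.comb` and `math.perm` (both available from Python 3.8):

```python
import itertools
import math

items = [1, 2, 3, 4]
pair_count = len(list(itertools.combinations(items, 2)))
ordered_count = len(list(itertools.permutations(items, 2)))
product_count = len(list(itertools.product(items, repeat=2)))

print(pair_count, math.comb(4, 2))     # 6 6
print(ordered_count, math.perm(4, 2))  # 12 12
print(product_count, 4 ** 2)           # 16 16
```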


### Real-World - File Processing (Slide 31)


In [11]:
# Memory-efficient log file processing
def read_log_file(filename):
    """Generator for reading large log files"""
    with open(filename) as f:
        for line in f:
            yield line.strip()

def filter_errors(lines):
    """Filter only ERROR lines"""
    for line in lines:
        if 'ERROR' in line:
            yield line

def extract_timestamp(lines):
    """Extract timestamp from each line"""
    for line in lines:
        # Parse timestamp from line
        timestamp = line.split()[0]
        yield timestamp

# Process a huge file without loading it all into memory
def error_timestamps(log_file):
    lines = read_log_file(log_file)
    errors = filter_errors(lines)
    timestamps = extract_timestamp(errors)

    return list(timestamps)  # Only the matching timestamps are held in memory

# Memory use grows with the number of ERROR lines, not with the file size


> **Note:** Essential for big data processing
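
When only a count is needed, the result does not have to be collected into a list at all. The sketch below writes a small temporary log and counts its ERROR lines with `sum`. The log format (timestamp first, then level) is an assumption made for this example.

```python
import os
import tempfile

def read_log_file(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

def filter_errors(lines):
    for line in lines:
        if 'ERROR' in line:
            yield line

# Assumed sample data: timestamp, level, message
sample = (
    "2024-01-01T10:00:00 INFO service started\n"
    "2024-01-01T10:00:05 ERROR disk full\n"
    "2024-01-01T10:00:09 WARNING retrying\n"
    "2024-01-01T10:00:12 ERROR write failed\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # sum() consumes the pipeline one line at a time; nothing is stored
    error_count = sum(1 for _ in filter_errors(read_log_file(path)))
finally:
    os.remove(path)

print(error_count)  # 2
```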


### Real-World - Streaming Data (Slide 32)


In [12]:
# Process streaming data
def moving_average(values, window_size):
    """Calculate moving average"""
    window = []
    for value in values:
        window.append(value)
        if len(window) > window_size:
            window.pop(0)
        yield sum(window) / len(window)

# Simulate streaming sensor data
def sensor_readings():
    import random
    while True:
        yield random.randint(0, 100)

# Process in real-time
sensor = sensor_readings()
average = moving_average(sensor, window_size=5)

# Get first 10 averaged readings
import itertools
for avg in itertools.islice(average, 10):
    print(f"Average: {avg:.2f}")

# Never loads all data - processes stream!


Average: 49.00
Average: 74.00
Average: 73.67
Average: 58.00
Average: 61.80
Average: 63.40
Average: 55.20
Average: 43.80
Average: 52.80
Average: 39.40


> **Note:** Perfect for real-time data processing
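
For larger windows, `list.pop(0)` shifts every remaining element on each call. A possible alternative is `collections.deque` with `maxlen`, which drops the oldest value automatically. A sketch with fixed input so the result is reproducible:

```python
from collections import deque

def moving_average(values, window_size):
    window = deque(maxlen=window_size)  # oldest value is dropped automatically
    for value in values:
        window.append(value)
        yield sum(window) / len(window)

averages = list(moving_average([10, 20, 30, 40, 50], window_size=3))

print(averages)  # [10.0, 15.0, 20.0, 30.0, 40.0]
```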


### Generator State Management (Slide 33)


In [13]:
# Generators maintain state between yields
def stateful_generator():
    state = {'count': 0, 'total': 0}

    while True:
        value = yield state['total']
        state['count'] += 1
        state['total'] += value
        print(f"Count: {state['count']}, Total: {state['total']}")

gen = stateful_generator()
next(gen)  # Prime

gen.send(10)  # Count: 1, Total: 10
gen.send(20)  # Count: 2, Total: 30
gen.send(15)  # Count: 3, Total: 45

# Generator retains state!
# Often more compact to write than a class for simple state

# vs Class approach
class StatefulClass:
    def __init__(self):
        self.count = 0
        self.total = 0

    def add(self, value):
        self.count += 1
        self.total += value
        return self.total


Count: 1, Total: 10
Count: 2, Total: 30
Count: 3, Total: 45


> **Note:** Generators are lightweight stateful objects


### Generator Best Practices (Slide 34)


<p><strong>When to Use Generators:</strong></p>
<ul>
<li>Large datasets that don't fit in memory</li>
<li>Infinite sequences</li>
<li>Pipeline processing</li>
<li>Streaming data</li>
<li>On-demand computation</li>
</ul>
<p><strong>Do:</strong></p>
<ul>
<li>Use generator expressions for simple cases</li>
<li>Chain generators for data pipelines</li>
<li>Use <code>yield from</code> for delegation</li>
<li>Leverage itertools for common patterns</li>
<li>Document generator behavior</li>
</ul>
<p><strong>Don't:</strong></p>
<ul>
<li>Use generators for small, fixed datasets</li>
<li>Forget that generators are single-use (once exhausted, they produce nothing more)</li>
<li>Try to access by index (not supported)</li>
<li>Use when random access is needed</li>
</ul>
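<p><strong>Performance:</strong> Generators keep memory use low by producing one item at a time, at the cost of a small per-item overhead; for small inputs that fit comfortably in memory, a list is usually just as fast</p>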
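
The indexing limitation from the list above looks like this in practice. A minimal sketch, with `letters` as an illustrative generator function:

```python
def letters():
    yield from "abc"

gen = letters()
try:
    gen[0]  # generators do not support subscripting
    index_error = None
except TypeError as exc:
    index_error = type(exc).__name__

as_list = list(letters())  # materialise when random access is needed

print(index_error, as_list[1])  # TypeError b
```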
