# Chapter 4: Comprehensions and Generators

## Overview

This chapter explores Python's powerful features for working with sequences and iterators:

- **Comprehensions**: Concise syntax for creating lists, dictionaries, and sets
- **Generators**: Memory-efficient iteration using `yield`
- **Advanced patterns**: `yield from`, generator expressions, and `itertools`

---

## Item 27: Use Comprehensions Instead of map and filter

### Core Concept

List comprehensions are a clearer, more Pythonic way to derive a new list from a sequence than `map` and `filter` combined with `lambda` functions.

In [None]:
# Basic list comprehension example
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Traditional approach with loop
squares = []
for x in a:
    squares.append(x**2)
print("Loop approach:", squares)

# List comprehension - much clearer
squares = [x**2 for x in a]
print("Comprehension:", squares)

### Comprehensions vs map()

In [None]:
# Using map (needs a function argument, here a lambda, which is visually noisy)
alt = map(lambda x: x ** 2, a)
print("map result:", list(alt))

# List comprehension is clearer
squares = [x**2 for x in a]
print("Comprehension result:", squares)

### Filtering with Comprehensions

In [None]:
# List comprehension with filtering
even_squares = [x**2 for x in a if x % 2 == 0]
print("Even squares:", even_squares)

# Compare to map + filter (much harder to read)
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
print("map + filter:", list(alt))

# Verify they're equal
assert even_squares == list(map(lambda x: x**2, filter(lambda x: x % 2 == 0, a)))

### Dictionary and Set Comprehensions

In [None]:
# Dictionary comprehension
even_squares_dict = {x: x**2 for x in a if x % 2 == 0}
print("Dict comprehension:", even_squares_dict)

# Set comprehension
threes_cubed_set = {x**3 for x in a if x % 3 == 0}
print("Set comprehension:", threes_cubed_set)

### Comparison: Comprehensions vs map/filter for Complex Operations

In [None]:
# Comprehensions - clean and readable
even_squares_dict = {x: x**2 for x in a if x % 2 == 0}
threes_cubed_set = {x**3 for x in a if x % 3 == 0}

# map + filter equivalent (breaks across multiple lines, hard to read)
alt_dict = dict(map(lambda x: (x, x**2),
                   filter(lambda x: x % 2 == 0, a)))
alt_set = set(map(lambda x: x**3,
                 filter(lambda x: x % 3 == 0, a)))

print("Dict - comprehension:", even_squares_dict)
print("Dict - map/filter:   ", alt_dict)
print("\nSet - comprehension:", threes_cubed_set)
print("Set - map/filter:   ", alt_set)

### Enhanced Examples

In [None]:
# Example 1: Processing strings
words = ['hello', 'world', 'python', 'comprehensions']
uppercase = [word.upper() for word in words]
print("Uppercase:", uppercase)

# Example 2: Filtering by length
long_words = [word for word in words if len(word) > 5]
print("Long words:", long_words)

# Example 3: Creating dictionary from two lists
keys = ['a', 'b', 'c']
values = [1, 2, 3]
mapping = {k: v for k, v in zip(keys, values)}
print("Mapping:", mapping)

# Example 4: Set of unique lengths
lengths = {len(word) for word in words}
print("Unique lengths:", lengths)

### Things to Remember

✦ List comprehensions are clearer than `map` and `filter` built-in functions because they don't require `lambda` expressions

✦ List comprehensions allow you to easily skip items from the input list, a behavior that `map` doesn't support without help from `filter`

✦ Dictionaries and sets may also be created using comprehensions
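
One practical difference is worth keeping in mind when comparing the two styles: in Python 3, `map` and `filter` return lazy, single-use iterators, which is why the examples above wrap them in `list(...)`. A minimal sketch (the variable names are illustrative only):

```python
numbers = [1, 2, 3, 4]

# map returns a lazy iterator, not a list
lazy_squares = map(lambda x: x ** 2, numbers)
first_pass = list(lazy_squares)   # Consumes the iterator
second_pass = list(lazy_squares)  # Nothing left to consume

# A list comprehension builds a real list that can be reused freely
squares = [x ** 2 for x in numbers]

print("First pass:   ", first_pass)
print("Second pass:  ", second_pass)
print("Comprehension:", squares)
```

The comprehension result can be indexed, measured with `len`, and iterated repeatedly without any conversion step.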

---

## Item 28: Avoid More Than Two Control Subexpressions in Comprehensions

### Core Concept

Comprehensions support multiple levels of looping and conditions, but beyond two control subexpressions, readability suffers dramatically.

### Multiple Loops in Comprehensions

In [None]:
# Flattening a matrix (reasonable)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print("Flattened:", flat)

# Squaring each cell (still readable)
squared = [[x**2 for x in row] for row in matrix]
print("Squared:\n", squared)
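
The `for` clauses in a comprehension run left to right, in the same order as the equivalent nested loops. A small sketch, reusing the `matrix` values from above, that spells out the correspondence:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Comprehension: outer loop first, inner loop second
flat = [x for row in matrix for x in row]

# Equivalent nested loops, written in the same order
flat_loops = []
for row in matrix:
    for x in row:
        flat_loops.append(x)

assert flat == flat_loops
print(flat)
```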

### When Comprehensions Become Too Complex

In [None]:
# Three levels - too hard to read!
my_lists = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]]
]

# This is getting hard to follow
flat = [x for sublist1 in my_lists
          for sublist2 in sublist1
          for x in sublist2]
print("Three-level flatten:", flat)

# Better alternative: normal loops
flat_alternative = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat_alternative.extend(sublist2)
print("Loop alternative:", flat_alternative)

### Multiple Conditions

In [None]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Multiple conditions at same level (implicit AND)
b = [x for x in a if x > 4 if x % 2 == 0]
print("Multiple if:", b)

# Equivalent using 'and'
c = [x for x in a if x > 4 and x % 2 == 0]
print("Using 'and':", c)

assert b == c

### Conditions at Different Loop Levels (Avoid This!)

In [None]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Very hard to read: conditions at multiple levels
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
print("Complex filter:", filtered)

# This says: "Include rows that sum to >= 10, 
# and within those rows, include only numbers divisible by 3"

### When to Use Helper Functions

In [None]:
# Instead of complex comprehension, use helper function
def get_filtered_values(matrix):
    """Extract values divisible by 3 from rows that sum >= 10"""
    result = []
    for row in matrix:
        if sum(row) >= 10:
            # Keep the row even if it ends up empty, so the result
            # matches the comprehension version exactly
            result.append([x for x in row if x % 3 == 0])
    return result

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = get_filtered_values(matrix)
print("Using helper function:", result)

### Enhanced Examples: Good vs Bad Complexity

In [None]:
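# ✓ GOOD: Two control subexpressions (two loops)
matrix = [[1, 2, 3], [4, 5, 6]]
flat = [x for row in matrix for x in row]
print("Good (two loops):", flat)

# ✓ GOOD: Two control subexpressions (one loop, one condition)
evens = [x for x in flat if x % 2 == 0]
print("Good (loop + condition):", evens)

# ✗ BAD: Three levels of nested comprehensions
# Don't do this!
# nested = [[[x**2 for x in range(3)] for _ in range(2)] for _ in range(2)]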

# Better alternative for complex cases
def create_nested_structure():
    result = []
    for _ in range(2):
        inner = []
        for _ in range(2):
            innermost = [x**2 for x in range(3)]
            inner.append(innermost)
        result.append(inner)
    return result

nested = create_nested_structure()
print("Better approach:", nested)

### Things to Remember

✦ Comprehensions support multiple levels of loops and multiple conditions per loop level

✦ Comprehensions with more than two control subexpressions are very difficult to read and should be avoided

✦ Use helper functions or normal loops when complexity exceeds two subexpressions

---

## Item 29: Avoid Repeated Work in Comprehensions by Using Assignment Expressions

### Core Concept

The walrus operator (`:=`), available since Python 3.8, lets you assign a value inside a comprehension. This avoids redundant computation and improves both readability and performance.

### The Problem: Repeated Computation

In [None]:
# Setup: order fulfillment system
stock = {
    'nails': 125,
    'screws': 35,
    'wingnuts': 8,
    'washers': 24,
}

order = ['screws', 'wingnuts', 'clips']

def get_batches(count, size):
    return count // size

# Traditional loop approach
result = {}
for name in order:
    count = stock.get(name, 0)
    batches = get_batches(count, 8)
    if batches:
        result[name] = batches

print("Loop result:", result)

### Problematic: Repeated Expression in Comprehension

In [None]:
# BAD: get_batches called twice (redundant computation)
found = {name: get_batches(stock.get(name, 0), 8)
         for name in order
         if get_batches(stock.get(name, 0), 8)}

print("Repeated computation:", found)

# This is error-prone! Easy to get out of sync:
has_bug = {name: get_batches(stock.get(name, 0), 4)  # Different size!
           for name in order
           if get_batches(stock.get(name, 0), 8)}

print("Expected:", found)
print("Bug version:", has_bug)
print("Results differ!" if found != has_bug else "Results match")

### Solution: Assignment Expressions (Walrus Operator)

In [None]:
# GOOD: Compute once, use multiple times
found = {name: batches for name in order
         if (batches := get_batches(stock.get(name, 0), 8))}

print("Assignment expression:", found)

# The := operator:
# 1. Evaluates get_batches(stock.get(name, 0), 8)
# 2. Assigns result to 'batches'
# 3. Returns the value for the if condition
# 4. 'batches' is available in the value expression
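
To see the evaluation order described in the comments above, the sketch below swaps in a hypothetical tracing helper (`traced_batches` is not part of the original example) that records each call. It suggests that the computation runs once per item in the condition, and the value expression only reuses the result:

```python
stock = {'nails': 125, 'screws': 35, 'wingnuts': 8, 'washers': 24}
order = ['screws', 'wingnuts', 'clips']

calls = []  # Records every item the helper was called for

def traced_batches(name, size):
    """Hypothetical helper: like get_batches, but logs each call."""
    calls.append(name)
    return stock.get(name, 0) // size

found = {name: batches for name in order
         if (batches := traced_batches(name, 8))}

print("Result:", found)
print("Calls: ", calls)
```

Each name in `order` appears once in `calls`, including `'clips'`, which is evaluated but filtered out because its batch count is zero.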

### Placement Rules for Assignment Expressions

In [None]:
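# ✗ WRONG: Assigning in the value expression but reading in the condition.
# The condition is evaluated first, so 'tenth' is not yet defined
# (this raises NameError when 'tenth' has not been assigned earlier).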
try:
    result = {name: (tenth := count // 10)
              for name, count in stock.items() if tenth > 0}
except NameError as e:
    print(f"Error: {e}")

# ✓ CORRECT: Assignment in condition
result = {name: tenth for name, count in stock.items()
          if (tenth := count // 10) > 0}
print("Correct placement:", result)

### Variable Leakage Behavior

In [None]:
# A variable assigned with := is bound in the enclosing scope, so it "leaks"
half = [(last := count // 2) for count in stock.values()]
print(f'Last item of {half} is {last}')

# This is similar to regular for loops
for count in stock.values():
    pass
print(f'Last item is {count}')

# But loop variables in comprehensions DON'T leak
del count  # Remove the name left behind by the for loop above
half = [count // 2 for count in stock.values()]
print(half)  # Works
try:
    print(count)  # Fails: 'count' never left the comprehension
except NameError:
    print("'count' is not defined outside comprehension")

### Best Practice: Use in Condition Only

In [None]:
# Recommended: Assignment expressions in the condition part
result = {name: batches for name in order
          if (batches := get_batches(stock.get(name, 0), 8))}

print("Best practice:", result)

### Generator Expressions with Assignment Expressions

In [None]:
# Works the same way with generator expressions
found = ((name, batches) for name in order
         if (batches := get_batches(stock.get(name, 0), 8)))

print(next(found))
print(next(found))

### Enhanced Examples

In [None]:
# Example 1: Processing with validation
def validate_and_transform(value):
    """Expensive validation and transformation"""
    if value % 2 == 0:
        return value ** 2
    return None

numbers = [1, 2, 3, 4, 5, 6]
results = [transformed for x in numbers 
           if (transformed := validate_and_transform(x)) is not None]
print("Validated:", results)

# Example 2: String processing
words = ['hello', 'world', '', 'python', 'programming']
uppercase_long = [upper for word in words 
                  if (upper := word.upper()) and len(upper) > 5]
print("Long uppercase:", uppercase_long)

# Example 3: Mathematical computation
import math
values = [1, 4, 9, 16, 25]
roots_and_logs = [(val, root, math.log10(val)) 
                  for val in values 
                  if (root := math.sqrt(val)) > 2]
print("Roots and logs:", roots_and_logs)

### Things to Remember

✦ Assignment expressions make it possible for comprehensions and generator expressions to reuse the value from one condition elsewhere in the same comprehension, which can improve readability and performance

✦ Although it's possible to use an assignment expression outside of a comprehension or generator expression's condition, you should avoid doing so

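✦ A variable assigned with `:=` inside a comprehension is bound in the enclosing scope wherever the assignment appears; placing it in the condition ensures the variable is defined before the value expression reads it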
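
The scoping of `:=` targets can be checked directly. In the sketch below (all names are illustrative), the comprehension runs inside a function so that no earlier assignment can interfere. The `:=` target remains visible after the comprehension even though it was assigned in the condition, while the loop variable does not:

```python
def check_scopes():
    values = [10, 25, 40]
    big_halves = [half for v in values if (half := v // 2) > 10]

    # The := target is bound in the enclosing function scope
    leaked_half = half

    # The loop variable stays private to the comprehension
    try:
        v
        loop_var_visible = True
    except NameError:
        loop_var_visible = False

    return big_halves, leaked_half, loop_var_visible

print(check_scopes())
```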

---

## Item 30: Consider Generators Instead of Returning Lists

### Core Concept

Generators provide a memory-efficient way to produce sequences of values without storing the entire result in memory. They're especially useful for large or infinite sequences.

### Traditional Approach: Returning Lists

In [None]:
def index_words(text):
    """Find the index of each word in a string"""
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result

address = 'Four score and seven years ago...'
result = index_words(address)
print("Word indices:", result[:10])

### Problems with the List Approach

**Problem 1**: Code is dense and noisy
- Method calls (`result.append`) add bulk
- Separate lines for creating and returning the list

**Problem 2**: Memory usage
- All results must be stored before returning
- Can cause crashes with huge inputs

### Solution: Generator Function

In [None]:
def index_words_iter(text):
    """Generator version - much cleaner"""
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

# Generator doesn't run immediately
it = index_words_iter(address)
print("Iterator object:", it)

# Advance with next()
print("First index:", next(it))
print("Second index:", next(it))

# Or convert to list
result = list(index_words_iter(address))
print("All indices:", result[:10])
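
The memory difference between a list-building function and a generator can be illustrated with a rough measurement. The sketch below uses `sys.getsizeof`, which reports only the size of the container object itself (not the elements it references), so the numbers are indicative rather than a full accounting. The helper names are illustrative and not part of the original examples:

```python
import sys

def squares_list(n):
    """Builds and returns the full list before the caller sees anything."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result

def squares_iter(n):
    """Yields one value at a time; nothing is accumulated."""
    for i in range(n):
        yield i * i

as_list = squares_list(100_000)
as_gen = squares_iter(100_000)

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(f"list object:      {list_size} bytes")
print(f"generator object: {gen_size} bytes")
```

The generator object stays the same size no matter how large `n` is, because it only stores the state needed to resume the function.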

### Streaming Large Inputs

In [None]:
import itertools

# Create a test file first (written to the current working directory)
with open('address.txt', 'w') as f:
    f.write(address)

def index_file(handle):
    """Stream from file - bounded memory usage"""
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset

with open('address.txt', 'r') as f:
    it = index_file(f)
    results = itertools.islice(it, 0, 10)
    print("From file:", list(results))

### How Generators Work

In [None]:
def demo_generator():
    print("Starting")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("Done")

# Generator doesn't execute until iterated
gen = demo_generator()
print("Generator created\n")

print("First next():", next(gen))
print("Second next():", next(gen))
print("Third next():", next(gen))

try:
    next(gen)
except StopIteration:
    print("\nGenerator exhausted")

### Enhanced Examples

In [None]:
# Example 1: Fibonacci generator
def fibonacci(n):
    """Generate first n Fibonacci numbers"""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

print("First 10 Fibonacci:", list(fibonacci(10)))

# Example 2: Infinite sequence
def count_forever(start=0):
    """Infinite counter"""
    while True:
        yield start
        start += 1

counter = count_forever(1)
print("First 5 counts:", [next(counter) for _ in range(5)])

# Example 3: Processing pipeline
def read_numbers(filename):
    """Read numbers from file"""
    with open(filename) as f:
        for line in f:
            yield int(line.strip())

def filter_even(numbers):
    """Filter even numbers"""
    for num in numbers:
        if num % 2 == 0:
            yield num

def square(numbers):
    """Square numbers"""
    for num in numbers:
        yield num ** 2

# Create test file (written to the current working directory)
with open('numbers.txt', 'w') as f:
    f.write('\n'.join(map(str, range(1, 11))))

# Chain generators together (memory efficient!)
pipeline = square(filter_even(read_numbers('numbers.txt')))
print("Pipeline result:", list(pipeline))

### Gotcha: Stateful Iterators

In [None]:
# Important: generators are stateful and can't be reused
gen = fibonacci(5)
print("First use:", list(gen))
print("Second use:", list(gen))  # Empty!

# Need to create new generator for reuse
gen = fibonacci(5)
print("New generator:", list(gen))

### Things to Remember

✦ Using generators can be clearer than the alternative of having a function return a list of accumulated results

✦ The iterator returned by a generator produces the set of values passed to `yield` expressions within the generator function's body

✦ Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn't include all inputs and outputs

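✦ The iterator returned by a generator is stateful and single-use: once exhausted, it produces nothing further, so call the generator function again to iterate a second time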
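
The single-use behavior matters most when a function iterates over its argument more than once. The sketch below uses a hypothetical `percentages` helper and `visits` generator (neither is from the original text) to show the silent failure, along with one possible workaround of materializing the values first, which gives up the memory savings:

```python
def percentages(numbers):
    """Iterates over `numbers` twice: once to sum, once to scale."""
    total = sum(numbers)
    return [100 * value / total for value in numbers]

def visits():
    """Small generator standing in for a larger data source."""
    yield 15
    yield 35
    yield 50

from_generator = percentages(visits())      # Second pass sees nothing
from_list = percentages(list(visits()))     # A list can be traversed twice

print("Generator input:", from_generator)
print("List input:     ", from_list)
```

No exception is raised for the generator input; the result is simply empty, which makes this kind of bug easy to miss.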

---

## Practice Exercises

### Exercise 1: List Comprehensions
Create a list comprehension that generates perfect squares from 1 to 100 that are divisible by 3.

In [None]:
# Your solution here
result = [x**2 for x in range(1, 11) if x**2 % 3 == 0]
print(result)

### Exercise 2: Dictionary Comprehension
Create a dictionary mapping words to their lengths, only including words longer than 4 characters.

In [None]:
# Your solution here
words = ['cat', 'elephant', 'dog', 'giraffe', 'ant']
result = {word: len(word) for word in words if len(word) > 4}
print(result)

### Exercise 3: Generator Function
Write a generator that yields the prime numbers up to and including n.

In [None]:
# Your solution here
def primes_up_to(n):
    for num in range(2, n + 1):
        is_prime = True
        for divisor in range(2, int(num ** 0.5) + 1):
            if num % divisor == 0:
                is_prime = False
                break
        if is_prime:
            yield num

print("Primes up to 30:", list(primes_up_to(30)))

## Summary

### Key Takeaways

1. **Comprehensions are Pythonic**: Use list/dict/set comprehensions instead of `map`/`filter` with lambdas

2. **Keep comprehensions simple**: Limit to 2 control subexpressions maximum

3. **Use walrus operator wisely**: Assignment expressions (`:=`) avoid repeated computation in comprehensions

4. **Generators save memory**: Use generators for large sequences or streaming data

5. **Know the tradeoffs**: Generators are single-use; lists can be reused

### Pattern Summary

```python
# List comprehension with filter
[expression for item in iterable if condition]

# Dictionary comprehension
{key_expr: value_expr for item in iterable if condition}

# Set comprehension
{expression for item in iterable if condition}

# Generator expression
(expression for item in iterable if condition)

# Assignment expression in comprehension
[value for item in iterable if (value := transform(item))]

# Generator function
def generator():
    for item in iterable:
        yield process(item)
```
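
As a quick check, each pattern above can be instantiated with concrete values. The sketch below uses an arbitrary `data` list and a hypothetical `half_if_even` transform, both chosen only for illustration:

```python
data = [1, 2, 3, 4, 5, 6]

def half_if_even(n):
    """Hypothetical transform: returns n // 2 for even n, else None."""
    return n // 2 if n % 2 == 0 else None

as_list = [n * 10 for n in data if n > 3]
as_dict = {n: n * 10 for n in data if n > 3}
as_set = {n % 3 for n in data}
as_gen = (n * 10 for n in data if n > 3)   # Lazy: nothing computed yet
gen_values = list(as_gen)                  # Values are produced here
with_walrus = [h for n in data if (h := half_if_even(n))]

print(as_list)
print(as_dict)
print(as_set)
print(gen_values)
print(with_walrus)
```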

---

## Next Steps

In the next items, we'll explore:
- Defensive iteration patterns (Item 31)
- Generator expressions for large comprehensions (Item 32)
- Composing generators with `yield from` (Item 33)
- Advanced generator techniques
- The `itertools` module

Continue practicing these patterns - they're fundamental to writing Pythonic, efficient code!