# Chapter 9: Advanced Comprehensions

**Chapter 9 - Learning Python, 5th Edition**

Comprehensions are concise, readable expressions for building collections
from iterables. Python offers list, dict, and set comprehensions, plus
generator expressions, which share the same syntax but evaluate lazily.
Mastering them -- including nested forms, filtering, and the walrus operator --
is key to writing expressive, efficient Python.

## Key Concepts
- **List comprehensions**: `[expr for x in iterable if cond]`
- **Dict comprehensions**: `{k: v for x in iterable}`
- **Set comprehensions**: `{expr for x in iterable}`
- **Generator expressions**: `(expr for x in iterable)` -- lazy, memory-efficient
- **Walrus operator (`:=`)**: Assignment inside expressions (PEP 572)

## Section 1: List Comprehensions with Complex Conditions

List comprehensions can include multiple `if` clauses and arbitrary
expressions. They replace verbose `for`/`append` patterns with a single
readable expression.

In [None]:
# Basic filtering and transformation
numbers: list[int] = list(range(1, 21))

# Even squares
even_squares: list[int] = [n ** 2 for n in numbers if n % 2 == 0]
print(f"Even squares: {even_squares}")

# Multiple conditions: numbers divisible by 2 AND 3
div_2_and_3: list[int] = [n for n in numbers if n % 2 == 0 if n % 3 == 0]
print(f"Divisible by 2 and 3: {div_2_and_3}")

# Conditional expression (ternary) in the output
labels: list[str] = [f"{n} (even)" if n % 2 == 0 else f"{n} (odd)" for n in range(1, 8)]
print(f"Labels: {labels}")

# Complex filtering with function calls
words: list[str] = ["Python", "is", "a", "powerful", "language", "for", "data"]
long_lowercase_upper: list[str] = [
    w.upper() for w in words if len(w) > 3 if w[0].islower()
]
print(f"Long lowercase words, uppercased: {long_lowercase_upper}")

# Flattening with conditional: extract digits from mixed data
mixed: list[str] = ["abc", "12", "de3", "456", "fg"]
digits: list[str] = [ch for s in mixed for ch in s if ch.isdigit()]
print(f"All digits: {digits}")
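The claim that chained `if` clauses replace a `for`/`append` pattern can be checked directly. The following is a minimal sketch, not part of the original notebook, that rebuilds the divisible-by-2-and-3 example three ways; the names `comp`, `loop`, and `with_and` are illustrative.

```python
numbers: list[int] = list(range(1, 21))

# Comprehension with two chained `if` clauses
comp: list[int] = [n for n in numbers if n % 2 == 0 if n % 3 == 0]

# Equivalent explicit loop: each `if` clause becomes a nested `if` statement
loop: list[int] = []
for n in numbers:
    if n % 2 == 0:
        if n % 3 == 0:
            loop.append(n)

# Chained `if` clauses behave like one condition joined with `and`
with_and: list[int] = [n for n in numbers if n % 2 == 0 and n % 3 == 0]

print(comp)              # [6, 12, 18]
print(comp == loop)      # True
print(comp == with_and)  # True
```

Whether to chain `if` clauses or join them with `and` is mostly a readability choice; both forms filter the same elements.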

## Section 2: Nested Comprehensions (Matrix Operations)

Nested `for` clauses in comprehensions read left-to-right, matching the
order of equivalent nested `for` loops. This is powerful for matrix
operations, flattening, and Cartesian products.

In [None]:
# Matrix as a list of lists
matrix: list[list[int]] = [
    [1,  2,  3,  4],
    [5,  6,  7,  8],
    [9, 10, 11, 12],
]

# Flatten the matrix
flat: list[int] = [val for row in matrix for val in row]
print(f"Flattened: {flat}")

# Transpose: swap rows and columns (column count taken from the first row)
transposed: list[list[int]] = [
    [row[i] for row in matrix] for i in range(len(matrix[0]))
]
print("\nOriginal:")
for row in matrix:
    print(f"  {row}")
print("Transposed:")
for row in transposed:
    print(f"  {row}")

# Cartesian product
suits: list[str] = ["Hearts", "Diamonds"]
ranks: list[str] = ["Ace", "King", "Queen"]
deck: list[str] = [f"{rank} of {suit}" for suit in suits for rank in ranks]
print("\nCartesian product (partial deck):")
for card in deck:
    print(f"  {card}")

# Nested comprehension with condition: upper triangle of a 4x4 matrix
size = 4
upper_triangle: list[list[int]] = [
    [1 if j >= i else 0 for j in range(size)]
    for i in range(size)
]
print("\nUpper triangle:")
for row in upper_triangle:
    print(f"  {row}")
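To make the left-to-right reading order concrete, the sketch below (an illustration added here, with assumed names `flat_comp`, `flat_loop`, and `transposed_zip`) compares the flattening comprehension with its nested-loop equivalent. It also shows `zip(*matrix)` as one common alternative way to transpose.

```python
matrix: list[list[int]] = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# The `for` clauses appear in the same order as the nested loops below
flat_comp: list[int] = [val for row in matrix for val in row]

flat_loop: list[int] = []
for row in matrix:          # first (outer) `for` clause
    for val in row:         # second (inner) `for` clause
        flat_loop.append(val)

# Alternative transpose: zip(*matrix) pairs up the i-th items of every row
transposed_zip: list[list[int]] = [list(col) for col in zip(*matrix)]

print(flat_comp == flat_loop)  # True
print(transposed_zip)          # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
```

Note that `zip` stops at the shortest row, so this variant assumes a rectangular matrix.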

## Section 3: Dict Comprehensions

Dict comprehensions build dictionaries from iterables. Common uses include
inverting mappings, grouping data, filtering entries, and transforming
key-value pairs.

In [None]:
# Basic dict comprehension: word -> length
words: list[str] = ["python", "is", "elegant", "and", "powerful"]
word_lengths: dict[str, int] = {w: len(w) for w in words}
print(f"Word lengths: {word_lengths}")

# Inverting a mapping (swap keys and values)
status_codes: dict[int, str] = {200: "OK", 404: "Not Found", 500: "Server Error"}
inverted: dict[str, int] = {v: k for k, v in status_codes.items()}
print(f"\nInverted: {inverted}")

# Filtering: keep only successful status codes
all_codes: dict[int, str] = {
    200: "OK", 201: "Created", 301: "Moved",
    404: "Not Found", 500: "Server Error"
}
success_only: dict[int, str] = {k: v for k, v in all_codes.items() if k < 300}
print(f"Success codes: {success_only}")

# Grouping: first letter -> list of words
vocabulary: list[str] = [
    "apple", "avocado", "banana", "blueberry", "cherry", "coconut", "apricot"
]
grouped: dict[str, list[str]] = {}
for word in vocabulary:
    grouped.setdefault(word[0], []).append(word)
print(f"\nGrouped (imperative): {grouped}")

# Same grouping with a comprehension: build the unique keys first, then collect
# the matching words per key (this rescans vocabulary once for every letter)
letters = sorted(set(w[0] for w in vocabulary))
grouped_comp: dict[str, list[str]] = {
    letter: [w for w in vocabulary if w[0] == letter]
    for letter in letters
}
print(f"Grouped (comprehension): {grouped_comp}")

# Transforming values: convert config strings to appropriate types
raw_config: dict[str, str] = {
    "timeout": "30", "debug": "true", "retries": "5", "verbose": "false"
}
parsed: dict[str, int | bool] = {
    k: (v.lower() == "true" if v.lower() in ("true", "false") else int(v))
    for k, v in raw_config.items()
}
print(f"\nParsed config: {parsed}")
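One caveat of the inversion pattern is worth a small sketch: when values repeat, later keys overwrite earlier ones. The data below (`ext_to_kind`) is hypothetical and chosen only to show the collision; the `setdefault` fallback mirrors the grouping idiom used earlier.

```python
# Hypothetical mapping with a repeated value ("image")
ext_to_kind: dict[str, str] = {"jpg": "image", "png": "image", "txt": "text"}

# Plain inversion keeps only the last key seen for each value
inverted: dict[str, str] = {kind: ext for ext, kind in ext_to_kind.items()}
print(inverted)        # {'image': 'png', 'text': 'txt'}

# To keep every key, collect them into lists instead
inverted_multi: dict[str, list[str]] = {}
for ext, kind in ext_to_kind.items():
    inverted_multi.setdefault(kind, []).append(ext)
print(inverted_multi)  # {'image': ['jpg', 'png'], 'text': ['txt']}
```

The one-line inversion is therefore safe only when the original values are unique and hashable.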

## Section 4: Set Comprehensions

Set comprehensions use the same syntax as list comprehensions, except that
they are written with curly braces `{}` instead of square brackets. They
deduplicate results automatically and produce sets, which have no guaranteed
order.

In [None]:
# Basic deduplication
text = "the quick brown fox jumps over the lazy dog"
unique_lengths: set[int] = {len(w) for w in text.split()}
print(f"Unique word lengths: {sorted(unique_lengths)}")

# Extract unique characters (excluding spaces)
unique_chars: set[str] = {ch for ch in text if ch != " "}
print(f"Unique characters: {''.join(sorted(unique_chars))}")
print(f"Count: {len(unique_chars)}")

# Set operations on comprehension results: compare two teams' skills
team_a_skills: list[str] = ["python", "sql", "docker", "python", "aws"]
team_b_skills: list[str] = ["java", "sql", "kubernetes", "aws", "java"]

a_unique: set[str] = {s for s in team_a_skills}
b_unique: set[str] = {s for s in team_b_skills}

print(f"\nTeam A skills: {a_unique}")
print(f"Team B skills: {b_unique}")
print(f"Shared skills: {a_unique & b_unique}")
print(f"Only in A: {a_unique - b_unique}")
print(f"All skills: {a_unique | b_unique}")

# Deduplication with transformation: normalize email domains
emails: list[str] = [
    "Alice@Gmail.COM", "bob@gmail.com", "carol@Yahoo.com",
    "dave@GMAIL.COM", "eve@yahoo.com",
]
unique_domains: set[str] = {e.split("@")[1].lower() for e in emails}
print(f"\nUnique domains: {unique_domains}")
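Because sets are unordered, a set comprehension is not the right tool when first-seen order matters. A possible workaround, sketched below with the same email data, is `dict.fromkeys()`, which deduplicates while keeping insertion order. The sketch also notes that empty braces create a dict, not a set.

```python
emails: list[str] = [
    "Alice@Gmail.COM", "bob@gmail.com", "carol@Yahoo.com",
    "dave@GMAIL.COM", "eve@yahoo.com",
]

# Set comprehension: deduplicated, but iteration order is not guaranteed
unique_domains: set[str] = {e.split("@")[1].lower() for e in emails}

# dict.fromkeys() deduplicates and preserves first-seen order
domains_in_order: list[str] = list(
    dict.fromkeys(e.split("@")[1].lower() for e in emails)
)
print(domains_in_order)   # ['gmail.com', 'yahoo.com']

# Empty braces are a dict literal; use set() for an empty set
print(type({}).__name__)     # dict
print(type(set()).__name__)  # set
```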

## Section 5: Generator Expressions (Memory-Efficient Pipelines)

Generator expressions look like list comprehensions but use parentheses.
They produce values lazily, one at a time, consuming minimal memory.
This makes them ideal for processing large datasets or building pipelines.

In [None]:
import sys

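# Compare memory: list vs generator for a large range.
# sys.getsizeof() is shallow: for the list it counts the array of references,
# not the int objects themselves, so the real gap is larger than shown.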
list_comp: list[int] = [x ** 2 for x in range(10_000)]
gen_expr = (x ** 2 for x in range(10_000))

print(f"List size: {sys.getsizeof(list_comp):,} bytes")
print(f"Generator size: {sys.getsizeof(gen_expr):,} bytes")
print(f"type(gen_expr): {type(gen_expr)}")

# Generator expressions can be passed directly to functions
# (parentheses serve double duty when it's the sole argument)
total = sum(x ** 2 for x in range(1, 11))
print(f"\nSum of squares 1-10: {total}")

largest = max(len(w) for w in ["apple", "banana", "cherry", "date"])
print(f"Longest word length: {largest}")

# any() and all() short-circuit with generators
numbers: list[int] = [2, 4, 6, 8, 10, 11, 14]
all_even = all(n % 2 == 0 for n in numbers)
has_odd = any(n % 2 != 0 for n in numbers)
print(f"\nAll even? {all_even}")
print(f"Has odd?  {has_odd}")

# Chaining generators into a pipeline
raw_data: list[str] = [" Alice ", "BOB", " charlie", "  ", "Dave ", ""]

cleaned = (name.strip() for name in raw_data)       # Step 1: strip whitespace
non_empty = (name for name in cleaned if name)       # Step 2: remove blanks
normalized = (name.title() for name in non_empty)    # Step 3: title case

# Nothing executes until we consume the pipeline
result: list[str] = list(normalized)
print(f"\nPipeline result: {result}")
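Two consequences of lazy evaluation are easy to miss: a generator can be consumed only once, and its element expression does not run until a value is requested. The sketch below illustrates both; `traced` is a hypothetical helper introduced only to record when calls happen.

```python
# A generator expression can be consumed only once
squares = (x ** 2 for x in range(5))
first_pass = list(squares)    # [0, 1, 4, 9, 16]
second_pass = list(squares)   # [] because the generator is already exhausted

# Laziness: the element expression runs only when a value is requested
calls: list[int] = []

def traced(x: int) -> int:
    """Hypothetical helper that records each call before returning x * 10."""
    calls.append(x)
    return x * 10

lazy = (traced(x) for x in range(3))
calls_before = list(calls)    # [] since nothing has been computed yet
first_value = next(lazy)      # computes exactly one element
calls_after = list(calls)     # [0]

print(first_pass, second_pass)
print(calls_before, first_value, calls_after)
```

If the results are needed more than once, materialize them with `list()` or rebuild the generator.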

## Section 6: Comprehensions vs `map()`/`filter()` Performance

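List comprehensions and `map()`/`filter()` can express the same
transformations, though in Python 3 `map()` and `filter()` return lazy
iterators that must be wrapped in `list()` to get a list. Comprehensions are
generally preferred for readability. Performance differences are small, but
a comprehension evaluates an inline expression directly, whereas
`map()`/`filter()` with a `lambda` pays for a function call per element.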

In [None]:
import timeit

# Equivalent operations: square even numbers in 1..1000
N = 1000

def using_comprehension() -> list[int]:
    return [x ** 2 for x in range(N) if x % 2 == 0]

def using_map_filter() -> list[int]:
    return list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(N))))

def using_loop() -> list[int]:
    result: list[int] = []
    for x in range(N):
        if x % 2 == 0:
            result.append(x ** 2)
    return result

# Verify all three produce the same result
assert using_comprehension() == using_map_filter() == using_loop()

# Time each approach
runs = 5000
t_comp = timeit.timeit(using_comprehension, number=runs)
t_map = timeit.timeit(using_map_filter, number=runs)
t_loop = timeit.timeit(using_loop, number=runs)

print(f"Comprehension: {t_comp:.4f}s")
print(f"map/filter:    {t_map:.4f}s")
print(f"for loop:      {t_loop:.4f}s")

fastest = min(t_comp, t_map, t_loop)
print("\nRelative speeds:")
print(f"  Comprehension: {t_comp / fastest:.2f}x")
print(f"  map/filter:    {t_map / fastest:.2f}x")
print(f"  for loop:      {t_loop / fastest:.2f}x")
print("\nNote: the comprehension avoids the per-element lambda calls made by map/filter")
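The benchmark above pairs `map()`/`filter()` with lambdas, which is their least favorable use. As a small complementary sketch (not a benchmark), the example below shows the case where `map()` reads cleanly: applying a function that already has a name. The word list is illustrative.

```python
words: list[str] = ["python", "is", "elegant"]

# map() with an existing function needs no lambda
upper_map: list[str] = list(map(str.upper, words))
lengths: list[int] = list(map(len, words))

# Equivalent comprehensions
upper_comp: list[str] = [w.upper() for w in words]
lengths_comp: list[int] = [len(w) for w in words]

print(upper_map == upper_comp)  # True
print(lengths)                  # [6, 2, 7]
```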

## Section 7: Walrus Operator (`:=`) in Comprehensions (PEP 572)

The walrus operator `:=` (assignment expression) lets you compute a value,
assign it to a variable, and use it -- all in a single expression. In
comprehensions this avoids redundant computation when the same intermediate
result is needed in both the filter and the output.

In [None]:
import math

# Problem: we want to filter by a computed value AND include it in output.
# Without walrus, we compute twice:
numbers: list[float] = [2.5, 16.0, 7.3, 100.0, 3.1, 49.0, 0.5]

# Redundant: sqrt is called twice per element
without_walrus: list[str] = [
    f"{n} -> {math.sqrt(n):.2f}"
    for n in numbers
    if math.sqrt(n) > 3.0
]
print(f"Without walrus: {without_walrus}")

# With walrus: compute once, reuse
with_walrus: list[str] = [
    f"{n} -> {root:.2f}"
    for n in numbers
    if (root := math.sqrt(n)) > 3.0
]
print(f"With walrus:    {with_walrus}")

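# Practical example: parse and filter in one pass.
# The first two `if` clauses exist mainly to bind `parts` and `name`; they also
# act as filters, so a record with an empty name would be skipped.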
raw_records: list[str] = [
    "Alice:95", "Bob:42", "Carol:88", "Dave:31", "Eve:76"
]

passing_students: list[tuple[str, int]] = [
    (name, score)
    for record in raw_records
    if (parts := record.split(":"))
    if (name := parts[0])
    if (score := int(parts[1])) >= 50
]
print(f"\nPassing students: {passing_students}")

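# Walrus in a generator expression with any(): the target `even` is bound in the
# enclosing scope, and any() stops at the first match, so `even` holds that match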
data: list[int] = [1, 3, 5, 8, 11, 13]
if any((even := x) % 2 == 0 for x in data):
    print(f"\nFirst even number found: {even}")
else:
    print("\nNo even numbers")

# Collecting expensive computations with walrus
values: list[int] = list(range(1, 11))
expensive_results: list[tuple[int, int]] = [
    (x, result)
    for x in values
    if (result := x ** 3 + x ** 2 + x + 1) % 7 == 0
]
print(f"Values where x^3+x^2+x+1 is divisible by 7: {expensive_results}")
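The examples above rely on a scoping rule from PEP 572: a walrus target inside a comprehension is bound in the enclosing scope, not inside the comprehension. A minimal sketch of that rule is a running total, where the name `total` carries state from one element to the next; the input values are illustrative.

```python
values: list[int] = [1, 2, 3, 4]

# `total` lives in the enclosing scope, so each iteration sees the previous value
total = 0
running_totals: list[int] = [(total := total + v) for v in values]

print(running_totals)  # [1, 3, 6, 10]
print(total)           # 10, the walrus target is still visible after the comprehension
```

For plain running sums, `itertools.accumulate` does the same job; the walrus form is shown here only to illustrate the binding rule.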

## Summary

### Comprehension Syntax
```python
[expr for x in iterable if cond]         # List
{key: val for x in iterable if cond}     # Dict
{expr for x in iterable if cond}         # Set
(expr for x in iterable if cond)         # Generator
```

### Key Patterns
- **Nested loops** read left-to-right: `[... for x in A for y in B]`
- **Dict inversion**: `{v: k for k, v in d.items()}`
- **Generator expressions** are lazy and memory-efficient
- **Walrus operator** (`:=`) avoids redundant computation in filter+output
- **Comprehensions** are often slightly faster than `map`/`filter` with lambdas, since they avoid a function call per element

### When to Use What
- **List comp**: When you need the full result in memory
- **Generator expr**: When passing to `sum()`, `any()`, `all()`, or chaining
- **Dict/set comp**: When building mappings or deduplicating
- **`map()`/`filter()`**: When you already have a named function to apply