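# Chapter 13: Comprehensions

Comprehensions are one of Python's most useful features: a single expression can build a list, dictionary, or set.

## List Comprehensions

The most commonly used kind of comprehension.

### Basic Syntax

In [None]:
# Traditional approach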
squares = []
for x in range(10):
    squares.append(x ** 2)

# List comprehension (one line does the job)
squares = [x ** 2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

**Format**: `[expression for variable in iterable]`

### Comprehensions with a Condition

In [None]:
# Squares of even numbers only
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# Format: [expression for variable in iterable if condition]

### Multiple Conditions

In [None]:
# Numbers divisible by both 2 and 3
numbers = [x for x in range(50) if x % 2 == 0 if x % 3 == 0]
print(numbers)  # [0, 6, 12, 18, 24, 30, 36, 42, 48]

# Equivalent to
numbers = [x for x in range(50) if x % 2 == 0 and x % 3 == 0]

### if-else Expressions

In [None]:
# Negate odd numbers, leave even numbers unchanged
numbers = [x if x % 2 == 0 else -x for x in range(10)]
print(numbers)  # [0, -1, 2, -3, 4, -5, 6, -7, 8, -9]

# Format: [expr1 if condition else expr2 for variable in iterable]
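
The two positions of `if` do different jobs, which is easy to mix up. The sketch below is only an illustration (the names `labels` and `labelled` are made up here): an `if ... else` before `for` chooses which value to emit, while an `if` after the iterable decides whether an item is emitted at all.

```python
# A conditional expression (before `for`) picks the value for every item.
labels = ["even" if x % 2 == 0 else "odd" for x in range(5)]
print(labels)  # ['even', 'odd', 'even', 'odd', 'even']

# A trailing `if` filters items; both forms can be combined.
# Here: keep multiples of 3, then label each one.
labelled = [(x, "even" if x % 2 == 0 else "odd") for x in range(10) if x % 3 == 0]
print(labelled)  # [(0, 'even'), (3, 'odd'), (6, 'even'), (9, 'odd')]
```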

### Nested Loops

In [None]:
# Cartesian product
pairs = [(x, y) for x in [1, 2, 3] for y in ['a', 'b', 'c']]
print(pairs)
# [(1, 'a'), (1, 'b'), (1, 'c'),
#  (2, 'a'), (2, 'b'), (2, 'c'),
#  (3, 'a'), (3, 'b'), (3, 'c')]

# Equivalent to
pairs = []
for x in [1, 2, 3]:
    for y in ['a', 'b', 'c']:
        pairs.append((x, y))
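
Because the `for` clauses run in the same order as nested loops, the inner iterable can refer to the outer variable, and a trailing `if` can see both. A small sketch with made-up data:

```python
# The inner range depends on x, so only pairs with y < x are produced.
lower_pairs = [(x, y) for x in range(4) for y in range(x)]
print(lower_pairs)  # [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]

# A condition placed after both loops can use both variables.
unequal = [(x, y) for x in range(3) for y in range(3) if x != y]
print(unequal)  # [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
```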

### String Operations

In [None]:
# Convert to uppercase
words = ["hello", "world", "python"]
upper_words = [word.upper() for word in words]
print(upper_words)  # ['HELLO', 'WORLD', 'PYTHON']

# Extract the first letter
initials = [word[0] for word in words]
print(initials)  # ['h', 'w', 'p']

# Filter by length
long_words = [word for word in words if len(word) > 5]
print(long_words)  # ['python']

### Two-Dimensional Lists

In [None]:
# Build a 3x3 matrix
matrix = [[i * 3 + j for j in range(3)] for i in range(3)]
print(matrix)
# [[0, 1, 2],
#  [3, 4, 5],
#  [6, 7, 8]]

# Transpose the matrix
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)
# [[0, 3, 6],
#  [1, 4, 7],
#  [2, 5, 8]]
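
The transpose above hard-codes `range(3)`, so it only fits a matrix with three columns. One common alternative, shown here as a sketch rather than the only way to do it, is to let `zip(*matrix)` produce the columns:

```python
matrix = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# zip(*matrix) groups the i-th element of every row, i.e. it yields the columns
# as tuples; the comprehension turns each tuple back into a list.
transposed = [list(column) for column in zip(*matrix)]
print(transposed)  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

This form also works for rectangular matrices, as long as every row has the same length.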

## Dictionary Comprehensions

A concise way to build dictionaries.

### Basic Syntax

In [None]:
# Squares of numbers
squares = {x: x ** 2 for x in range(6)}
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Format: {key_expr: value_expr for variable in iterable}

### Building a Dictionary from a List

In [None]:
# Word lengths
words = ["apple", "banana", "cherry"]
word_lengths = {word: len(word) for word in words}
print(word_lengths)  # {'apple': 5, 'banana': 6, 'cherry': 6}

# From two lists
keys = ["a", "b", "c"]
values = [1, 2, 3]
d = {k: v for k, v in zip(keys, values)}
print(d)  # {'a': 1, 'b': 2, 'c': 3}
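
When the pairs need no transformation, `dict()` accepts the `zip` result directly. The sketch below also shows a detail worth knowing: `zip` stops at the shorter input, so a length mismatch drops items silently. The shortened `values` list is made up for the demonstration.

```python
keys = ["a", "b", "c"]
values = [1, 2]  # deliberately one item short

# dict() consumes an iterable of (key, value) pairs.
d = dict(zip(keys, values))
print(d)  # {'a': 1, 'b': 2}  ('c' has no partner and is dropped)
```

On Python 3.10 and later, `zip(keys, values, strict=True)` raises `ValueError` instead of truncating.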

### With a Condition

In [None]:
# Even numbers only
squares = {x: x ** 2 for x in range(10) if x % 2 == 0}
print(squares)  # {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}

### Inverting a Dictionary

In [None]:
original = {"a": 1, "b": 2, "c": 3}
reversed_dict = {v: k for k, v in original.items()}
print(reversed_dict)  # {1: 'a', 2: 'b', 3: 'c'}
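
Inverting like this assumes the values are unique (and hashable). If two keys share a value, the later one overwrites the earlier. The following sketch, with a made-up dictionary, shows the collision and one possible way to keep every key:

```python
original = {"a": 1, "b": 2, "c": 1}

# Naive inversion: 'a' and 'c' share the value 1, so the later key wins.
inverted = {v: k for k, v in original.items()}
print(inverted)  # {1: 'c', 2: 'b'}

# One option: map each value to the list of all keys that had it.
grouped = {
    value: [k for k, v in original.items() if v == value]
    for value in set(original.values())
}
print(grouped[1], grouped[2])  # ['a', 'c'] ['b']
```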

### Filtering a Dictionary

In [None]:
scores = {"张三": 85, "李四": 92, "王五": 78, "赵六": 95}

# Keep only scores >= 90
excellent = {name: score for name, score in scores.items() if score >= 90}
print(excellent)  # {'李四': 92, '赵六': 95}

### Transforming Values

In [None]:
# Celsius to Fahrenheit
celsius = {"Beijing": 10, "Shanghai": 15, "Guangzhou": 20}
fahrenheit = {city: temp * 9/5 + 32 for city, temp in celsius.items()}
print(fahrenheit)
# {'Beijing': 50.0, 'Shanghai': 59.0, 'Guangzhou': 68.0}

## Set Comprehensions

A concise way to build sets.

### Basic Syntax

In [None]:
# Set of squares
squares = {x ** 2 for x in range(10)}
print(squares)  # e.g. {0, 1, 64, 4, 36, 9, 16, 49, 81, 25} (sets are unordered)

# Format: {expression for variable in iterable}

### Removing Duplicates

In [None]:
# Extract the unique characters
text = "hello world"
unique_chars = {char for char in text if char != ' '}
print(unique_chars)  # e.g. {'o', 'd', 'e', 'h', 'l', 'r', 'w'} (order can differ between runs)
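
A set throws away the original order. If the order of first appearance matters, one possible approach (an alternative technique, not a set comprehension) is to use dictionary keys, which keep insertion order in Python 3.7+:

```python
text = "hello world"

# dict.fromkeys keeps only the first occurrence of each character, in order.
unique_in_order = list(dict.fromkeys(char for char in text if char != " "))
print(unique_in_order)  # ['h', 'e', 'l', 'o', 'w', 'r', 'd']

# Sorting the set is another way to get a reproducible result.
unique_sorted = sorted({char for char in text if char != " "})
print(unique_sorted)  # ['d', 'e', 'h', 'l', 'o', 'r', 'w']
```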

### Mathematical Operations

In [None]:
# Sums of every pair in the Cartesian product of two sets
a = {1, 2, 3}
b = {4, 5}
sums = {x + y for x in a for y in b}
print(sums)  # {5, 6, 7, 8}

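## Generator Expressions

Similar to list comprehensions, but they return a generator that produces values on demand, which saves memory.

### Basic Syntax

In [None]:
# List comprehension (builds the whole list immediately)
squares_list = [x ** 2 for x in range(10)]
print(type(squares_list))  # <class 'list'>

# Generator expression (produces values on demand)
squares_gen = (x ** 2 for x in range(10))
print(type(squares_gen))  # <class 'generator'>

# Iterate over the generator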
for num in squares_gen:
    print(num)
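
One consequence of on-demand generation is that a generator can be consumed only once. A short sketch (the names `first_pass` and `second_pass` are illustrative):

```python
squares_gen = (x ** 2 for x in range(5))

first_pass = list(squares_gen)   # consumes every value
second_pass = list(squares_gen)  # the generator is already exhausted

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []
```

If the values are needed again, either recreate the generator or store them in a list.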

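### Memory Advantage

In [None]:
# List comprehension: all one million results are stored in memory at once
big_list = [x ** 2 for x in range(1000000)]

# Generator expression: uses almost no memory, because values are computed one at a time
big_gen = (x ** 2 for x in range(1000000))

# Usage: pass the generator expression straight to a consumer such as sum()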
sum_of_squares = sum(x ** 2 for x in range(1000000))
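
To see the difference rather than take it on trust, `sys.getsizeof` gives a rough picture. Treat this as a sketch: it measures only the container object itself, and the exact byte counts depend on the Python version and platform.

```python
import sys

big_list = [x ** 2 for x in range(1_000_000)]
big_gen = (x ** 2 for x in range(1_000_000))

# getsizeof counts the list's internal array of references,
# not the integer objects those references point to.
list_size = sys.getsizeof(big_list)
gen_size = sys.getsizeof(big_gen)

print(f"list: {list_size} bytes")      # typically several megabytes
print(f"generator: {gen_size} bytes")  # small and independent of the range length
```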

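### When to Use a Generator?

In [None]:
# Iterating only once: use a generator
total = sum(x ** 2 for x in range(1000000))

# Need indexed or repeated access: use a list
squares = [x ** 2 for x in range(10)]
print(squares[5])  # indexing works
print(squares[3])  # and the data can be read again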
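
Generator expressions pair naturally with functions that consume an iterable in a single pass, such as `any`, `all`, and `max`. The sample data below is made up for illustration:

```python
numbers = [3, 8, 15, 22, 4]

# any() stops as soon as it finds a match (here at 15), so later items are never tested.
has_large = any(n > 10 for n in numbers)
all_positive = all(n > 0 for n in numbers)
largest_even = max(n for n in numbers if n % 2 == 0)

print(has_large, all_positive, largest_even)  # True True 22
```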

## Practical Examples

### Example 1: Data Cleaning

In [None]:
# Clean the data
raw_data = ["  apple  ", "BANANA", "  Orange", "grape  "]
clean_data = [item.strip().lower() for item in raw_data]
print(clean_data)  # ['apple', 'banana', 'orange', 'grape']

### Example 2: Filtering Files

In [None]:
import os

# Keep only Python files
files = os.listdir(".")
py_files = [f for f in files if f.endswith(".py")]
print(py_files)

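### Example 3: Extracting Numbers

In [None]:
import re

text = "Order no. 12345, price 99.9 yuan, quantity 3"
# split() cannot isolate numbers that are glued to punctuation, so match them with a regex
numbers = [float(s) for s in re.findall(r"\d+(?:\.\d+)?", text)]
print(numbers)  # [12345.0, 99.9, 3.0]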

### Example 4: Matrix Operations

In [None]:
# Multiply every element of the matrix by 2
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
doubled = [[x * 2 for x in row] for row in matrix]
print(doubled)
# [[2, 4, 6], [8, 10, 12], [14, 16, 18]]

# Extract the diagonal
diagonal = [matrix[i][i] for i in range(len(matrix))]
print(diagonal)  # [1, 5, 9]

### Example 5: Grade Statistics

In [None]:
students = [
    {"name": "张三", "score": 85},
    {"name": "李四", "score": 92},
    {"name": "王五", "score": 78},
    {"name": "赵六", "score": 95}
]

# Collect all scores
scores = [s["score"] for s in students]
print(f"Average: {sum(scores) / len(scores):.1f}")

# Students who passed
passed = [s["name"] for s in students if s["score"] >= 60]
print(f"Passed: {passed}")

# Top students (>= 90)
excellent = {s["name"]: s["score"] for s in students if s["score"] >= 90}
print(f"Excellent: {excellent}")

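### Example 6: Word Frequency

In [None]:
text = "hello world hello python world"
words = text.split()

# Count word frequencies (words.count() rescans the list for every distinct word; fine for small inputs)
word_count = {word: words.count(word) for word in set(words)}
print(word_count)
# e.g. {'hello': 2, 'world': 2, 'python': 1} (key order follows the set and may differ)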
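
For larger texts, the standard library's `collections.Counter` does the same counting in a single pass. This is an alternative to the comprehension, shown only for comparison:

```python
from collections import Counter

text = "hello world hello python world"

# Counter tallies each word while iterating over the list once.
word_count = Counter(text.split())

print(word_count["hello"])  # 2
print(dict(word_count))     # {'hello': 2, 'world': 2, 'python': 1}
```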

### Example 7: Applying the Cartesian Product

In [None]:
# Generate a full deck of playing cards
suits = ["♠", "♥", "♣", "♦"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [f"{rank}{suit}" for suit in suits for rank in ranks]
print(f"{len(deck)} cards in total")
print(deck[:5])  # ['A♠', '2♠', '3♠', '4♠', '5♠']

### Example 8: Grouping Data

In [None]:
# Group by parity
numbers = list(range(20))
grouped = {
    "even": [n for n in numbers if n % 2 == 0],
    "odd": [n for n in numbers if n % 2 != 0]
}
print(grouped)

### Example 9: Nested Dictionaries

In [None]:
# Student grade table
students = ["张三", "李四", "王五"]
subjects = ["Chinese", "Math", "English"]

# Initialize the grade dictionary
grades = {
    student: {subject: 0 for subject in subjects}
    for student in students
}
print(grades)
# {'张三': {'Chinese': 0, 'Math': 0, 'English': 0},
#  '李四': {'Chinese': 0, 'Math': 0, 'English': 0},
#  '王五': {'Chinese': 0, 'Math': 0, 'English': 0}}

### Example 10: Coordinate Grid

In [None]:
# Generate every coordinate in a 5x5 grid
grid = [(x, y) for x in range(5) for y in range(5)]
print(f"{len(grid)} coordinates in total")

# Boundary coordinates only
boundary = [(x, y) for x in range(5) for y in range(5)
            if x == 0 or x == 4 or y == 0 or y == 4]
print(f"{len(boundary)} points on the boundary")

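## Comprehensions vs. Traditional Loops

### Performance Comparison

In [None]:
import time

# Traditional approach
start = time.perf_counter()  # perf_counter() is the clock intended for timing short durations
squares = []
for x in range(1000000):
    squares.append(x ** 2)
print(f"Loop: {time.perf_counter() - start:.3f}s")

# List comprehension (usually faster: it avoids the repeated append lookup and call)
start = time.perf_counter()
squares = [x ** 2 for x in range(1000000)]
print(f"Comprehension: {time.perf_counter() - start:.3f}s")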
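
A single timed run is noisy. For a steadier comparison, the standard `timeit` module can repeat each version several times. The sketch below wraps both approaches in helper functions; the names `with_loop` and `with_comprehension` are made up here, and the actual numbers will vary by machine and Python version.

```python
import timeit

def with_loop():
    squares = []
    for x in range(10_000):
        squares.append(x ** 2)
    return squares

def with_comprehension():
    return [x ** 2 for x in range(10_000)]

# Each measurement runs the function 20 times; taking the minimum of 3 repeats
# reduces noise from other processes.
loop_time = min(timeit.repeat(with_loop, number=20, repeat=3))
comp_time = min(timeit.repeat(with_comprehension, number=20, repeat=3))
print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```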

### Readability

In [None]:
# Simple operations: a comprehension is clearer
squares = [x ** 2 for x in range(10)]

# Complex logic: a traditional loop is clearer
# result = []
# for item in data:
#     if complex_condition(item):
#         processed = complex_processing(item)
#         if another_condition(processed):
#             result.append(processed)

## Common Pitfalls

### Pitfall 1: Excessive Nesting

In [None]:
# Bad: too complex
# result = [[func(x) if x > 0 else other_func(x)
#            for x in row if x != 0]
#           for row in matrix if len(row) > 2]

# Better: split into several steps
matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
filtered_rows = [row for row in matrix if len(row) > 2]
result = [[x * 2 for x in row if x != 0] for row in filtered_rows]
print(result)

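### Pitfall 2: Side Effects

In [None]:
# Wrong: a comprehension should not be used for its side effects
# [print(x) for x in range(10)]  # not recommended: builds a throwaway list of None

# Right: use a for loop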
for x in range(10):
    print(x)

### Pitfall 3: Variable Scope

In [None]:
# In Python 3, a comprehension has its own scope
squares = [x ** 2 for x in range(5)]
# print(x)  # NameError in a fresh session: the comprehension's x does not leak out (Python 3)
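
The same rule means that an existing variable with the same name is left untouched by a comprehension, whereas a plain `for` loop rebinds it. A short sketch (the variable `after_comprehension` is only there to record the value):

```python
x = "outer"
squares = [x ** 2 for x in range(5)]

# The comprehension's x was local to it, so the outer x is unchanged.
after_comprehension = x
print(after_comprehension)  # outer

# A regular for loop, by contrast, assigns to x in the enclosing scope.
for x in range(5):
    pass
print(x)  # 4
```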

## Exercises

### Exercise 1: String Processing

Extract all uppercase letters from the string.

```python
text = "Hello World 123"
# Result: ['H', 'W']
```

### Exercise 2: Flattening a Nested List

```python
nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
# Result: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

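### Exercise 3: List of Primes

Generate all prime numbers from 1 to 100.

### Exercise 4: Merging Dictionaries

Merge two dictionaries, adding the values together when a key appears in both.

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}
# Result: {"a": 1, "b": 5, "c": 4}
```

### Exercise 5: Grouping

Group the words by their first letter.

```python
words = ["apple", "banana", "apricot", "cherry", "avocado"]
# Result: {'a': ['apple', 'apricot', 'avocado'],
#          'b': ['banana'], 'c': ['cherry']}
```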

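## Key Points

- ✅ List comprehension: `[expression for variable in iterable]`
- ✅ Dictionary comprehension: `{key: value for variable in iterable}`
- ✅ Set comprehension: `{expression for variable in iterable}`
- ✅ Generator expression: `(expression for variable in iterable)`
- ✅ An `if` condition can be added to filter items
- ✅ Comprehensions can be nested

**Remember**
- Comprehensions are more concise than traditional loops and usually somewhat faster
- Use comprehensions for simple operations and loops for complex logic
- Do not perform operations with side effects inside a comprehension
- Generator expressions save memory
- Excessive nesting hurts readability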