# Week 8 Lecture Note: Composite Data Types (Lists and Tuples)

## Motivation for Composite Data Types
- When calculating the average of a small number of values, a simple function can be used. For a large dataset, such as household incomes in Hong Kong (where the labor force is close to 4 million), a composite data type is needed to store a variable-sized sequence of data.

- Python has four built-in collection data types:
  - **List**: Ordered and mutable, allowing duplicates.
  - **Tuple**: Ordered and immutable, allowing duplicates.
  - **Set**: Unordered and unindexed, no duplicates.
  - **Dictionary**: Mutable and indexed by key, with no duplicate keys. Insertion order is preserved since Python 3.7.
  (This lecture focuses on lists and tuples.)

2. **Constructing Sequences**
    - **By Enumeration**:
        - Tuples are created by enclosing a comma-separated sequence in parentheses. A single-element tuple requires a comma after the element: `(0,)` is a tuple, while `(0)` is just the number 0. The unpacking operator `*` can be used to unpack iterables in tuple construction.
        - Lists are created by enclosing a comma-separated sequence in square brackets. For example, `[0]` is a list.
    - **By Comprehension**:
        - Comprehension is a concise way to create a list from an existing iterable. It consists of an output expression, one or more iterations, and optional conditional filtering. The syntax for list comprehension is `[output_expression for item in iterable if conditional_filtering]`.
        - For example, `[i + j for i in range(1, 10) for j in range(1, 5)]` generates a list of sums. There is no tuple comprehension syntax: parentheses alone produce a generator, so a tuple is built by passing a generator expression to the `tuple()` constructor, like `tuple(x**2 for x in range(10))`.
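
The two construction styles above can be sketched as follows. The variable names (`head`, `tail`, `sums`, `even_squares`) are illustrative and do not come from the lecture.

```python
# Enumeration with the unpacking operator *
head = [1, 2]
tail = (3, 4)
combined = (*head, *tail, 5)   # tuple built from two unpacked iterables
print(combined)                # (1, 2, 3, 4, 5)

# Nested comprehension: the first `for` is the outer loop
sums = [i + j for i in range(1, 10) for j in range(1, 5)]
print(len(sums))               # 36 (9 values of i times 4 values of j)
print(sums[:4])                # [2, 3, 4, 5]

# Comprehension with a conditional filter
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)            # [0, 4, 16, 36, 64]
```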

3. **Selecting Items in a Sequence**
    - **Traversal**:
        - `for` loops can be used to iterate over tuples and lists in order. The `reversed()` function can be used to iterate in reverse. The `zip()` function allows simultaneous traversal of multiple sequences; the resulting iterator stops as soon as the shortest input sequence is exhausted.
    - **Indexing**:
        - Positive indices start from 0, and negative indices represent offsets from the end of the sequence. Accessing elements beyond the valid index range raises an `IndexError`.
    - **Slicing**:
        - Slicing is used to select a range of items with the syntax `a[start:stop:step]`. If omitted, `step` defaults to 1. For a positive step, `start` defaults to 0 and `stop` defaults to the length of the sequence; for a negative step, the defaults flip so that the slice runs from the last item back to the first. Negative values can be used for all three parameters.
    - **Sorting**:
        - A sorting algorithm such as quicksort can be implemented by hand to sort a sequence. Python also has a built-in `sorted()` function, which uses the Timsort algorithm and is typically faster than a hand-written pure-Python sort.
4. **Mutation and Aliasing**
    - **Mutation**:
        - Mutable objects like lists can be changed after creation. When a variable of a mutable data type is assigned to another variable, changes to the data are reflected in both variables. For example, if `x = ['hi']` and `y = x`, then `y += ['bye']` will also change `x`.
        - Tuples, on the other hand, are immutable. For lists, subscriptions and slicings can be used as assignment targets to mutate the list. When assigning to an extended slice (one with a step other than 1), the right-hand side must contain exactly as many items as the slice selects.
    - **Aliasing**:
        - Aliasing occurs when one variable is assigned to another variable of a mutable data type. In this case, the two variables refer to the same object. For example, if `a = [10, 20, 30, 40]` and `b = a`, then `a is b` and `a == b` are both `True`.
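
Assignment to a subscription or a slice, as mentioned under Mutation above, might look like the following sketch. The list `nums` and its values are made up for illustration. A plain slice may be replaced by a sequence of any length, whereas an extended slice must be matched item for item.

```python
nums = [0, 1, 2, 3, 4, 5]

nums[0] = 10                  # subscription as an assignment target
print(nums)                   # [10, 1, 2, 3, 4, 5]

nums[1:3] = ['a', 'b', 'c']   # plain slice: the replacement may change the length
print(nums)                   # [10, 'a', 'b', 'c', 3, 4, 5]

nums[::2] = [0, 0, 0, 0]      # extended slice: 4 selected positions, 4 new values
print(nums)                   # [0, 'a', 0, 'c', 0, 4, 0]

try:
    nums[::2] = [1, 2]        # size mismatch: 2 values for 4 positions
except ValueError as err:
    print("ValueError:", err)
```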

5. **Different Methods to Operate on a Sequence**
    - **Membership Check**:
        - The `in` and `not in` operators are used to check if an element is in a tuple or list. They call the `__contains__` method.
    - **Common Attributes**:
        - Both tuples and lists have `count` (returns the number of occurrences of a value) and `index` (returns the index of the first occurrence of a value) methods.
    - **List-Specific Attributes**:
        - Lists have methods like `append`, `clear`, `copy`, `extend`, `insert`, `pop`, `remove`, `reverse`, and `sort`. All of these mutate the list except `copy`.
    - **Tuple-Specific Attributes**:
        - Tuples have no public methods of their own beyond `count` and `index`. A tuple can be copied using slicing, e.g., `b = a[::-1]` creates a reversed copy of tuple `a`.

In summary, this lecture covered the creation, access, mutation, and operation methods of tuples and lists in Python, which are important composite data types for handling data sequences. 


### 2. Constructing Sequences
#### By Enumeration
- Tuples are created by enclosing a comma-separated sequence in parentheses `()`.
- A single-element tuple requires a comma after the element, as in `(0,)`.
- Lists are created using square brackets `[]`.
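
A quick check of the single-element rule; the variable names are only for illustration.

```python
single = (0,)      # the trailing comma makes this a tuple
not_tuple = (0)    # parentheses only group here; this is the int 0
as_list = [0]      # a one-element list needs no trailing comma

print(type(single).__name__)      # tuple
print(type(not_tuple).__name__)   # int
print(type(as_list).__name__)     # list
```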

#### By Comprehension
- Comprehensions provide a concise way to create sequences.
- **List comprehension syntax**:
  ```python
  [output_expression for item in iterable if conditional_filtering]
  ```
- **Building a tuple this way requires the `tuple()` constructor around a generator expression**:
  ```python
  tuple(x**2 for x in range(10))
  ```

### 3. Selecting Items in a Sequence
#### Traversal
- `for` loops iterate over sequences in order.
- `reversed()` iterates in reverse order.
- `zip()` traverses multiple sequences in parallel, stopping at the end of the shortest.
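
The three traversal tools can be tried on a pair of small sequences. The data below (`names`, `scores`) is invented for the example, and `scores` is deliberately one item shorter to show the truncation.

```python
names = ['Ada', 'Bob', 'Cat']
scores = (90, 85)                 # one item shorter than names

for name in names:                # forward traversal
    print(name)

backward = list(reversed(names))  # reversed() returns an iterator
print(backward)                   # ['Cat', 'Bob', 'Ada']

pairs = list(zip(names, scores))  # stops after the shorter sequence ends
print(pairs)                      # [('Ada', 90), ('Bob', 85)]
```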

#### Indexing
- Positive indices start from `0`; negative indices count backward from the end.
- Accessing an out-of-range index raises `IndexError`.
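
A small illustration of both index directions and the out-of-range case, using a made-up tuple `letters`:

```python
letters = ('a', 'b', 'c', 'd')

first = letters[0]         # 'a'
last = letters[-1]         # 'd', the last item
also_first = letters[-4]   # 'a', same item as letters[0]
print(first, last, also_first)

try:
    letters[4]             # valid indices are -4..3
except IndexError as err:
    print("IndexError:", err)
```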

#### Slicing
- Syntax: `a[start:stop:step]`.
- Defaults for a positive step: `start=0`, `stop=len(a)`, `step=1`.
- Negative `start` and `stop` values count from the end, and a negative `step` walks the sequence in reverse.
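
The slicing rules can be checked on a ten-element list; the list `a` here is just sample data.

```python
a = list(range(10))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(a[2:5])      # [2, 3, 4]        stop is exclusive
print(a[:3])       # [0, 1, 2]        start defaults to 0
print(a[7:])       # [7, 8, 9]        stop defaults to len(a)
print(a[::3])      # [0, 3, 6, 9]     every third item
print(a[-3:])      # [7, 8, 9]        negative start counts from the end
print(a[8:2:-2])   # [8, 6, 4]        negative step walks backward
print(a[::-1])     # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(a[5:100])    # [5, 6, 7, 8, 9]  out-of-range bounds are clamped
```

Unlike indexing, slicing with out-of-range bounds does not raise `IndexError`.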

#### Sorting
- Quicksort can be implemented by hand for sorting.
- Python’s built-in `sorted()` uses Timsort.
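
As a rough sketch, one simple (not in-place) variant of quicksort is shown below next to `sorted()`. This is an illustrative version that picks the first item as the pivot, not necessarily the implementation presented in the lecture.

```python
def quicksort(seq):
    """Return a new sorted list using a simple, non-in-place quicksort."""
    if len(seq) <= 1:
        return list(seq)
    pivot, *rest = seq                         # first item is the pivot
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

data = (3, 1, 4, 1, 5, 9, 2, 6)
print(quicksort(data))   # [1, 1, 2, 3, 4, 5, 6, 9]
print(sorted(data))      # [1, 1, 2, 3, 4, 5, 6, 9]
print(data)              # (3, 1, 4, 1, 5, 9, 2, 6), the input is unchanged
```

Both functions return a new list and accept a tuple or a list as input.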

### 4. Mutation and Aliasing
#### Mutation
- Lists are mutable, so a change made through one reference is visible through every alias.
- Example:
  ```python
  x = ['hi']
  y = x
  y += ['bye']  # x changes too
  ```
- Tuples are immutable.
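
For contrast with the list example, the same steps with tuples behave differently. In this sketch (names `t` and `u` are illustrative), item assignment fails, and `+=` builds a new tuple and rebinds the name instead of changing the shared object.

```python
t = ('hi',)
u = t

try:
    t[0] = 'bye'         # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

u += ('bye',)            # creates a new tuple and rebinds u
print(t)                 # ('hi',)
print(u)                 # ('hi', 'bye')
print(t is u)            # False
```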

#### Aliasing
- Assigning a list to another variable creates an alias, not a copy.
- Example:
  ```python
  a = [10, 20, 30, 40]
  b = a
  print(a is b)  # True
  print(a == b)  # True
  ```
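
When an independent list is wanted instead of an alias, a shallow copy can be made. The sketch below extends the example with a third, hypothetical name `c`.

```python
a = [10, 20, 30, 40]
b = a            # alias: same object
c = a.copy()     # shallow copy: new object (a[:] or list(a) also work)

a.append(50)
print(b)               # [10, 20, 30, 40, 50], the alias sees the change
print(c)               # [10, 20, 30, 40], the copy does not
print(a is b, a == b)  # True True
print(a is c, a == c)  # False False
```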

### 5. Different Methods to Operate on a Sequence
#### Membership Check
- `in` and `not in` check whether an element exists in a sequence.
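
A brief illustration with sample data. Note that membership compares whole elements, so a value nested inside an inner list is not found by a check on the outer list.

```python
primes = (2, 3, 5, 7)
nested = [[1, 2], [3]]

print(5 in primes)               # True
print(4 not in primes)           # True
print(primes.__contains__(5))    # True, the method behind `in`
print([1, 2] in nested)          # True, matches a whole element
print(1 in nested)               # False, 1 is not itself an element
```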

#### Common Attributes
- `count(value)`: Returns the number of occurrences of `value`.
- `index(value)`: Returns the index of the first occurrence of `value`.
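
Both methods work the same way on tuples and lists. In this sketch the tuple `grades` is invented sample data; it also shows that `index` raises `ValueError` for a missing value, while `count` simply returns 0.

```python
grades = ('B', 'A', 'C', 'A')

print(grades.count('A'))   # 2
print(grades.index('A'))   # 1, the first occurrence
print(grades.count('F'))   # 0

try:
    grades.index('F')      # value not present
except ValueError as err:
    print("ValueError:", err)
```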

#### List-Specific Attributes
- `append()`, `clear()`, `copy()`, `extend()`, `insert()`, `pop()`, `remove()`, `reverse()`, `sort()`.
- All of these mutate the list in place except `copy()`.
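
The methods can be traced step by step on a small list. The starting values are arbitrary, and the comments show the list after each call.

```python
items = [3, 1, 2]

items.append(4)           # [3, 1, 2, 4]
items.extend([5, 6])      # [3, 1, 2, 4, 5, 6]
items.insert(0, 0)        # [0, 3, 1, 2, 4, 5, 6]
last = items.pop()        # returns 6; items is [0, 3, 1, 2, 4, 5]
items.remove(3)           # [0, 1, 2, 4, 5], removes the first 3
items.reverse()           # [5, 4, 2, 1, 0]
items.sort()              # [0, 1, 2, 4, 5]
snapshot = items.copy()   # independent shallow copy
items.clear()             # []

print(last)       # 6
print(snapshot)   # [0, 1, 2, 4, 5]
print(items)      # []
```

Apart from `pop()` and `copy()`, these methods return `None`, so their result should not be assigned back to the variable.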

#### Tuple-Specific Attributes
- Tuples have no public methods except `count` and `index`.
- A copy of a tuple (here, a reversed one) can be made using slicing:
  ```python
  a = (10, 20, 30, 40)
  b = a[::-1]  # create a reversed copy: (40, 30, 20, 10)
  ```

### Summary
- This lecture covered the creation, access, mutation, and operations of tuples and lists in Python.
- Lists are mutable, while tuples are immutable.
- Understanding these data types is essential for handling sequences efficiently.
