# Becoming a Junior Data Analyst | Python Programming

> Data Structures

## 郭耀仁

> Data structure takes data from ingredients to collections.

## Outline

- `list`
- `tuple`
- `dict`
- `set`
- Summary

## `list`

## `list` is Python's basic data structure

Use `[]` with `,` as the separator to collect multiple values into a single `list`.

```python
my_list = [val0, val1, val2, ...]
```

## How to record a movie's genres

![Imgur](https://i.imgur.com/3sGTb1H.png)

## Without a `list`

In [1]:
genre_0 = "Action"
genre_1 = "Adventure"
genre_2 = "Sci-Fi"

## With a `list`

In [2]:
genre = ["Action", "Adventure", "Sci-Fi"]
print(genre)
print(type(genre))

['Action', 'Adventure', 'Sci-Fi']
<class 'list'>


## Common `list` operations

- `len()`: returns the length
- `.append()`: adds an item to the end
- `.pop()`: removes and returns the last item
- `+`: concatenates another `list`
- Indexing
- Slicing

In [3]:
# len()
genre = ["Action", "Adventure", "Sci-Fi"]
print(len(genre))

3


In [4]:
# .append()
genre = ["Action", "Adventure"]
print(genre)
genre.append("Sci-Fi")
print(genre)

['Action', 'Adventure']
['Action', 'Adventure', 'Sci-Fi']


In [5]:
# .pop()
genre = ["Action", "Adventure", "Sci-Fi"]
print(genre)
third_genre = genre.pop()
print(third_genre)
print(genre)

['Action', 'Adventure', 'Sci-Fi']
Sci-Fi
['Action', 'Adventure']


In [6]:
# +
genre = ["Action", "Adventure"]
print(genre)
genre += ["Sci-Fi"]
print(genre)

['Action', 'Adventure']
['Action', 'Adventure', 'Sci-Fi']
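
Two details of the operations above are easy to miss: `.pop()` also accepts an index, and `+` builds a new `list` without changing its operands. The following is a minimal sketch; the names `first_genre` and `combined` and the extra `"Drama"` value are illustrative assumptions, not part of the movie data.

```python
genre = ["Action", "Adventure", "Sci-Fi"]

# .pop() takes an optional index; .pop(0) removes and returns the first item
first_genre = genre.pop(0)
print(first_genre)  # Action
print(genre)        # ['Adventure', 'Sci-Fi']

# + returns a new list and leaves the original list unchanged
combined = genre + ["Drama"]  # "Drama" is a made-up value for illustration
print(combined)     # ['Adventure', 'Sci-Fi', 'Drama']
print(genre)        # ['Adventure', 'Sci-Fi']
```

By contrast, `genre += [...]` and `.append()` modify the existing `list` in place.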


## Indexing

|From the start|0|1|2|
|-|-|-|-|
|Value|"Action"|"Adventure"|"Sci-Fi"|
|From the end|-3|-2|-1|

In [7]:
# Counting from the start
genre = ["Action", "Adventure", "Sci-Fi"]
print(genre[0])
print(genre[1])
print(genre[2])

Action
Adventure
Sci-Fi


In [8]:
# Counting from the end
genre = ["Action", "Adventure", "Sci-Fi"]
print(genre[-1])
print(genre[-2])
print(genre[-3])

Sci-Fi
Adventure
Action


## Slicing

- `start`: start index (inclusive)
- `stop`: stop index (exclusive)
- `step`: step size

```python
my_list[start:stop:step]
```

In [9]:
cast = ['Robert Downey Jr.', 'Chris Evans', 'Mark Ruffalo', 'Chris Hemsworth', 'Scarlett Johansson', 'Jeremy Renner', 'Don Cheadle', 'Paul Rudd', 'Benedict Cumberbatch', 'Chadwick Boseman', 'Brie Larson', 'Tom Holland', 'Karen Gillan', 'Zoe Saldana', 'Evangeline Lilly']
print(cast[0:3])
print(cast[:3])
print(cast[-3:])
print(cast[::2])
print(cast[::-1])

['Robert Downey Jr.', 'Chris Evans', 'Mark Ruffalo']
['Robert Downey Jr.', 'Chris Evans', 'Mark Ruffalo']
['Karen Gillan', 'Zoe Saldana', 'Evangeline Lilly']
['Robert Downey Jr.', 'Mark Ruffalo', 'Scarlett Johansson', 'Don Cheadle', 'Benedict Cumberbatch', 'Brie Larson', 'Karen Gillan', 'Evangeline Lilly']
['Evangeline Lilly', 'Zoe Saldana', 'Karen Gillan', 'Tom Holland', 'Brie Larson', 'Chadwick Boseman', 'Benedict Cumberbatch', 'Paul Rudd', 'Don Cheadle', 'Jeremy Renner', 'Scarlett Johansson', 'Chris Hemsworth', 'Mark Ruffalo', 'Chris Evans', 'Robert Downey Jr.']


## In-class exercises

## Referring to `cast` and https://www.imdb.com/title/tt4154796, use indexing to select the actor who plays Doctor Strange

## Referring to `cast` and https://www.imdb.com/title/tt4154796, select the six heroes from the first Avengers film

![Imgur](https://i.imgur.com/4mKOBMQ.jpg)

## `tuple`

## `tuple` resembles `list` in many ways

Use `()` with `,` as the separator to collect multiple values into a single `tuple`.

```python
my_tuple = (val0, val1, val2, ...)
```

In [10]:
genre = ("Action", "Adventure", "Sci-Fi")
print(genre)
print(type(genre))

('Action', 'Adventure', 'Sci-Fi')
<class 'tuple'>


## Common `tuple` operations

- `len()`: returns the length
- Indexing
- Slicing

In [11]:
genre = ("Action", "Adventure", "Sci-Fi")
# len()
print(len(genre))

3


In [12]:
# indexing
print(genre[1])

Adventure


In [13]:
# slicing
print(genre[:2])

('Action', 'Adventure')


## The biggest difference between `tuple` and `list` is immutability

Once a `tuple` is created, neither its contents nor its length can be changed.

In [14]:
# Changing the contents
genre = ["Action", "Adventure", "sci-fi"]
genre[2] = "Sci-Fi"
print(genre)

['Action', 'Adventure', 'Sci-Fi']


In [15]:
genre = ("Action", "Adventure", "sci-fi")
genre[2] = "Sci-Fi"

TypeError: 'tuple' object does not support item assignment

In [16]:
# Changing the length
genre = ["Action", "Adventure"]
genre.append("Sci-Fi")
print(genre)

['Action', 'Adventure', 'Sci-Fi']


In [17]:
genre = ("Action", "Adventure")
genre.append("Sci-Fi")
print(genre)

AttributeError: 'tuple' object has no attribute 'append'
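
A `tuple` cannot be changed in place, but it can still be used to build new objects. The sketch below shows two common workarounds; the names `longer_genre` and `genre_list` are chosen here only for illustration.

```python
genre = ("Action", "Adventure")

# + builds a new tuple; the original tuple is left untouched.
# The trailing comma matters: ("Sci-Fi",) is a one-element tuple,
# while ("Sci-Fi") is just a str in parentheses.
longer_genre = genre + ("Sci-Fi",)
print(longer_genre)  # ('Action', 'Adventure', 'Sci-Fi')
print(genre)         # ('Action', 'Adventure')

# list() makes a mutable copy that supports .append()
genre_list = list(genre)
genre_list.append("Sci-Fi")
print(genre_list)    # ['Action', 'Adventure', 'Sci-Fi']
```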

## (Preview) A Python function with multiple outputs returns them as a `tuple`

In [19]:
x = 0.25
print(x.as_integer_ratio())
print(type(x.as_integer_ratio()))

(1, 4)
<class 'tuple'>
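
The built-in `divmod()` is another function that returns two values at once. The sketch below also shows tuple unpacking, which assigns each element of the returned `tuple` to its own variable; the variable names are illustrative.

```python
# divmod() returns the quotient and the remainder together as a tuple
result = divmod(7, 2)
print(result)        # (3, 1)
print(type(result))  # <class 'tuple'>

# Tuple unpacking: one variable per element
quotient, remainder = divmod(7, 2)
print(quotient)      # 3
print(remainder)     # 1

numerator, denominator = (0.25).as_integer_ratio()
print(numerator, denominator)  # 1 4
```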


## `dict`

## `dict` is Python's flexible data structure for binding data to labels

Use `{}` with `key:value` pairs to bind each piece of data (value) to a label (key).

```python
my_dict = {
    "key0": val0,
    "key1": val1,
    "key2": val2,
    ...
}
```

![Imgur](https://i.imgur.com/4mKOBMQ.jpg)

In [20]:
the_avengers = {
    "Iron Man": "Tony Stark",
    "Captain America": "Steve Rogers",
    "Hulk": "Bruce Banner",
    "Thor": "Thor",
    "Black Widow": "Natasha Romanoff",
    "Hawkeye": "Clint Barton"
}
print(the_avengers)
print(type(the_avengers))

{'Iron Man': 'Tony Stark', 'Captain America': 'Steve Rogers', 'Hulk': 'Bruce Banner', 'Thor': 'Thor', 'Black Widow': 'Natasha Romanoff', 'Hawkeye': 'Clint Barton'}
<class 'dict'>


## Getting the labels and data of a `dict`

- `.keys()`: get the labels
- `.values()`: get the data
- `.items()`: get the labels and the data together

In [21]:
print(the_avengers.keys())
print(the_avengers.values())
print(the_avengers.items()) # six (key, value) tuples

dict_keys(['Iron Man', 'Captain America', 'Hulk', 'Thor', 'Black Widow', 'Hawkeye'])
dict_values(['Tony Stark', 'Steve Rogers', 'Bruce Banner', 'Thor', 'Natasha Romanoff', 'Clint Barton'])
dict_items([('Iron Man', 'Tony Stark'), ('Captain America', 'Steve Rogers'), ('Hulk', 'Bruce Banner'), ('Thor', 'Thor'), ('Black Widow', 'Natasha Romanoff'), ('Hawkeye', 'Clint Barton')])


## Index with `my_dict["key"]` to get a value

In [22]:
print(the_avengers["Iron Man"])
print(the_avengers["Captain America"])
print(the_avengers["Hulk"])
print(the_avengers["Thor"])
print(the_avengers["Black Widow"])
print(the_avengers["Hawkeye"])

Tony Stark
Steve Rogers
Bruce Banner
Thor
Natasha Romanoff
Clint Barton
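
Part of what makes a `dict` flexible is that keys can be added after creation and looked up safely. The following is a minimal sketch on a shortened copy of the dictionary; `"Ant-Man"` and the default `"Unknown"` are hypothetical values used only to show a missing key.

```python
the_avengers = {"Iron Man": "Tony Stark", "Thor": "Thor"}

# Assigning to a new key adds a key:value pair
the_avengers["Hulk"] = "Bruce Banner"
print(len(the_avengers))          # 3

# in checks whether a key exists
print("Hulk" in the_avengers)     # True
print("Ant-Man" in the_avengers)  # False

# the_avengers["Ant-Man"] would raise KeyError;
# .get() returns None (or a given default) instead
print(the_avengers.get("Ant-Man"))             # None
print(the_avengers.get("Ant-Man", "Unknown"))  # Unknown
```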


## `set`

## `set` is Python's data structure for storing unique values

Use `{}` with `,` as the separator to collect values into a `set`.

```python
my_set = {val0, val1, val2, ...}
```

In [23]:
primes = {2, 3, 5, 7, 11, 13}
print(primes)
print(type(primes))

{2, 3, 5, 7, 11, 13}
<class 'set'>


## `set` stores only unique values

In [24]:
primes = {2, 3, 3, 3, 5, 5}
print(primes)

{2, 3, 5}
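
A common use of this property is removing duplicates from a `list` with `set()`. The sketch below uses a hypothetical `genre_votes` list made up for illustration, and also shows that an empty `set` must be created with `set()` because `{}` is an empty `dict`.

```python
genre_votes = ["Action", "Sci-Fi", "Action", "Adventure", "Sci-Fi"]  # made-up data

# set() keeps one copy of each value
unique_genres = set(genre_votes)
print(len(genre_votes))       # 5
print(len(unique_genres))     # 3

# A set has no guaranteed order, so sort it for a stable display
print(sorted(unique_genres))  # ['Action', 'Adventure', 'Sci-Fi']

# {} creates an empty dict; use set() for an empty set
print(type({}))     # <class 'dict'>
print(type(set()))  # <class 'set'>
```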


## `set` supports set operations

- `|` union
- `&` intersection
- `-` difference
- `^` symmetric difference

In [25]:
primes = {2, 3, 5, 7, 11, 13}
odds = {1, 3, 5, 7, 9, 11, 13}

In [26]:
print(primes | odds)

{1, 2, 3, 5, 7, 9, 11, 13}


In [27]:
print(primes & odds)

{3, 5, 7, 11, 13}


In [28]:
print(primes - odds)

{2}


In [29]:
print(primes ^ odds)

{1, 2, 9}
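
Each operator also has an equivalent method, which some readers find easier to read. A minimal sketch comparing the two forms on the same `primes` and `odds` sets:

```python
primes = {2, 3, 5, 7, 11, 13}
odds = {1, 3, 5, 7, 9, 11, 13}

# Each method returns the same result as its operator
print(primes.union(odds) == (primes | odds))                 # True
print(primes.intersection(odds) == (primes & odds))          # True
print(primes.difference(odds) == (primes - odds))            # True
print(primes.symmetric_difference(odds) == (primes ^ odds))  # True

# Difference depends on the order of the operands
print(odds - primes)           # {1, 9}

# Subset check: 2 is prime but not odd, so primes is not a subset of odds
print(primes.issubset(odds))   # False
```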


## Summary

## Properties of data structures

- Can be nested
- Can be iterated (iterable)
- Most can be indexed (`set` is the exception)

In [30]:
# Can be nested
avengers_endgame = {
    "rating": 8.8,
    "genre": ["Action", "Adventure", "Sci-Fi"],
    "heroes": {
        "Iron Man": "Tony Stark",
        "Captain America": "Steve Rogers",
        "Hulk": "Bruce Banner",
        "Thor": "Thor",
        "Black Widow": "Natasha Romanoff",
        "Hawkeye": "Clint Barton"
    }
}
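
Values inside a nested structure are reached by chaining one index per level. The sketch below uses a trimmed copy of `avengers_endgame` (only two heroes) to keep the example short.

```python
# Trimmed copy of the nested dictionary, for illustration
avengers_endgame = {
    "rating": 8.8,
    "genre": ["Action", "Adventure", "Sci-Fi"],
    "heroes": {
        "Iron Man": "Tony Stark",
        "Hulk": "Bruce Banner"
    }
}

# One level: dict key
print(avengers_endgame["rating"])          # 8.8

# Two levels: dict key, then list index
print(avengers_endgame["genre"][0])        # Action
print(avengers_endgame["genre"][-1])       # Sci-Fi

# Two levels: dict key, then dict key
print(avengers_endgame["heroes"]["Hulk"])  # Bruce Banner
```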

In [31]:
# Can be iterated
genre = ["Action", "Adventure", "Sci-Fi"]
i = 0
while i < len(genre):
    print(genre[i])
    i += 1

Action
Adventure
Sci-Fi
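
The `while` loop above relies on indexing. A `for` loop, shown here only as a sketch that looks ahead of this lesson, visits the items directly and therefore also works on a `set`, which has no indexes. The names `visited` and `total` are illustrative.

```python
genre = ["Action", "Adventure", "Sci-Fi"]

# for takes each item in turn; no counter or index is needed
visited = []
for g in genre:
    print(g)
    visited.append(g)

# A set cannot be indexed, but it can still be iterated
total = 0
for p in {2, 3, 5}:
    total += p
print(total)  # 10
```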


In [32]:
# A set cannot be indexed
heroes = {"Iron Man", "Captain America", "Hulk", "Thor", "Black Widow", "Hawkeye"}
print(heroes[0])

TypeError: 'set' object is not subscriptable
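
A `set` has no positions, so membership is checked with `in`. When an item at a specific position is needed, one option is to convert the `set` to a sorted `list` first. A minimal sketch, with `heroes_sorted` as an illustrative name:

```python
heroes = {"Iron Man", "Captain America", "Hulk", "Thor", "Black Widow", "Hawkeye"}

# Membership test works on a set
print("Thor" in heroes)   # True

# sorted() returns a list (alphabetical here), which can be indexed
heroes_sorted = sorted(heroes)
print(heroes_sorted[0])   # Black Widow
print(heroes_sorted[-1])  # Thor
```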

## Homework

## Print the nicknames and character names of the 6 heroes

```python
avengers_endgame = {
    "rating": 8.8,
    "genre": ["Action", "Adventure", "Sci-Fi"],
    "heroes": {
        "Iron Man": "Tony Stark",
        "Captain America": "Steve Rogers",
        "Hulk": "Bruce Banner",
        "Thor": "Thor",
        "Black Widow": "Natasha Romanoff",
        "Hawkeye": "Clint Barton"
    }
}
```

## Example output

```
Tony Stark's nickname is Iron Man
Steve Rogers's nickname is Captain America
Bruce Banner's nickname is Hulk
Thor's nickname is Thor
Natasha Romanoff's nickname is Black Widow
Clint Barton's nickname is Hawkeye
```

In [34]:
avengers_endgame = {
    "rating": 8.8,
    "genre": ["Action", "Adventure", "Sci-Fi"],
    "heroes": {
        "Iron Man": "Tony Stark",
        "Captain America": "Steve Rogers",
        "Hulk": "Bruce Banner",
        "Thor": "Thor",
        "Black Widow": "Natasha Romanoff",
        "Hawkeye": "Clint Barton"
    }
}
heroes_dict = avengers_endgame['heroes']
heroes_dict_len = len(heroes_dict)
heroes_dict_keys = list(heroes_dict.keys())
i = 0
while i < heroes_dict_len:
    # The key is the nickname; the value is the character name
    dict_key = heroes_dict_keys[i]
    print(f"{heroes_dict[dict_key]}'s nickname is {dict_key}")
    i += 1

Tony Stark's nickname is Iron Man
Steve Rogers's nickname is Captain America
Bruce Banner's nickname is Hulk
Thor's nickname is Thor
Natasha Romanoff's nickname is Black Widow
Clint Barton's nickname is Hawkeye
