In pandas, `concat` and `merge` are two powerful tools for combining data. Each has its own characteristics and suits different situations.



### concat
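#### What it does
- `concat` stacks several pandas objects (such as Series or DataFrame) together. Think of it as gluing tables end to end, either vertically or horizontally.
- You choose the direction: stack rows on top of each other (`axis=0`, the default) or place columns side by side (`axis=1`).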

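#### When to use it
- When you have two or more datasets with the same structure (the same columns) and simply want to append one to another.
- When combining a series of time series or other ordered data.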

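#### Why use it
- `concat` keeps stitching simple: you do not specify join keys, because it aligns the inputs by their labels on the other axis (column names for `axis=0`, index labels for `axis=1`).
- It makes it quick to combine data from different sources. With `axis=0`, columns with the same name are lined up, and a column missing from some inputs is filled with `NaN` for those rows.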
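
As a minimal sketch of the points above, the following shows how `concat` lines up columns by name, and how `keys` and `join="inner"` change the result. The frames `a` and `b` and their values are made up for illustration and are not part of this notebook's data.

```python
import pandas as pd

# Hypothetical frames for illustration only.
a = pd.DataFrame({"date": ["2023-10-02", "2023-10-03"], "price": [10, 11]})
b = pd.DataFrame({"date": ["2023-10-04"], "price": [12], "volume": [500]})

# Rows are stacked; columns are matched by name, and a column missing
# from one frame is filled with NaN for that frame's rows.
stacked = pd.concat([a, b], ignore_index=True)

# keys= adds an outer index level recording which frame each row came from.
tagged = pd.concat([a, b], keys=["a", "b"])

# join="inner" keeps only the columns that every frame shares.
shared = pd.concat([a, b], join="inner", ignore_index=True)
```

`keys` is handy when the source of each row would otherwise be lost after stacking.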




In [3]:
import pandas as pd

In [107]:
df1 = pd.DataFrame({'date': ["2023-10-02", "2023-10-03", "2023-10-04", "2023-10-05"],
                    "price": [533, 529, 520, 528],
                    "stock_no": ["2330", "2330", "2330", "2330"]})

df2 = pd.DataFrame({'date': ["2023-10-05", "2023-10-06", "2023-10-11"],
                    "price": [103.5, 105.5, 106],
                    "stock_no": ["2317", "2317", "2317"]})

df3 = pd.DataFrame({'date': ["2023-10-02", "2023-10-02", "2023-10-02"],
                    "price": [533, 529, 520],
                    "stock": ["0050", "2330", "2317"]})

In [122]:
df1

,date,price,stock_no
0,2023-10-02,533,2330
1,2023-10-03,529,2330
2,2023-10-04,520,2330
3,2023-10-05,528,2330


In [123]:
df2

,date,price,stock_no
0,2023-10-05,103.5,2317
1,2023-10-06,105.5,2317
2,2023-10-11,106.0,2317


In [124]:
df3

,date,price,stock
0,2023-10-02,533,0050
1,2023-10-02,529,2330
2,2023-10-02,520,2317


In [128]:
pd.concat([df1, df2, df3])

,date,price,stock_no,stock
0,2023-10-02,533.0,2330,
1,2023-10-03,529.0,2330,
2,2023-10-04,520.0,2330,
3,2023-10-05,528.0,2330,
0,2023-10-05,103.5,2317,
1,2023-10-06,105.5,2317,
2,2023-10-11,106.0,2317,
0,2023-10-02,533.0,,0050
1,2023-10-02,529.0,,2330
2,2023-10-02,520.0,,2317


In [129]:
pd.concat([df1, df2, df3], ignore_index=True)

,date,price,stock_no,stock
0,2023-10-02,533.0,2330,
1,2023-10-03,529.0,2330,
2,2023-10-04,520.0,2330,
3,2023-10-05,528.0,2330,
4,2023-10-05,103.5,2317,
5,2023-10-06,105.5,2317,
6,2023-10-11,106.0,2317,
7,2023-10-02,533.0,,0050
8,2023-10-02,529.0,,2330
9,2023-10-02,520.0,,2317
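
The `NaN` values above appear because `df3` names its ticker column `stock` while the other frames use `stock_no`. One possible way to avoid that, sketched here with small stand-in frames (`left` and `right` are hypothetical names, with values shortened for illustration), is to rename the column before concatenating:

```python
import pandas as pd

# Small stand-ins for df1 and df3 (one row each, for illustration only).
left = pd.DataFrame({"date": ["2023-10-02"], "price": [533], "stock_no": ["2330"]})
right = pd.DataFrame({"date": ["2023-10-02"], "price": [520], "stock": ["2317"]})

# Rename the mismatched column so both frames share the same schema.
right_aligned = right.rename(columns={"stock": "stock_no"})

# With identical column names, nothing needs to be filled with NaN.
combined = pd.concat([left, right_aligned], ignore_index=True)
```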


In [130]:
pd.concat([df1, df2], axis=1)

,date,price,stock_no,date,price,stock_no
0,2023-10-02,533,2330,2023-10-05,103.5,2317
1,2023-10-03,529,2330,2023-10-06,105.5,2317
2,2023-10-04,520,2330,2023-10-11,106.0,2317
3,2023-10-05,528,2330,,,


In [136]:
df2.index = [2, 3, 4]

In [137]:
df2

,date,price,stock_no
2,2023-10-05,103.5,2317
3,2023-10-06,105.5,2317
4,2023-10-11,106.0,2317


In [141]:
pd.concat([df1, df2], axis=1, ignore_index=True, join="outer")

,0,1,2,3,4,5
0,2023-10-02,533.0,2330,,,
1,2023-10-03,529.0,2330,,,
2,2023-10-04,520.0,2330,2023-10-05,103.5,2317
3,2023-10-05,528.0,2330,2023-10-06,105.5,2317
4,,,,2023-10-11,106.0,2317


In [142]:
pd.concat([df1, df2], axis=1, ignore_index=True, join="inner")

,0,1,2,3,4,5
2,2023-10-04,520,2330,2023-10-05,103.5,2317
3,2023-10-05,528,2330,2023-10-06,105.5,2317
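
The two results above differ because `concat` with `axis=1` matches rows by index label, not by position. The sketch below illustrates this with two made-up frames (`x` and `y` are hypothetical), and shows one way to get a purely positional side-by-side layout by resetting the indexes first:

```python
import pandas as pd

# Hypothetical frames whose indexes only partly overlap.
x = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
y = pd.DataFrame({"b": [10, 20, 30]}, index=[2, 3, 4])

# Default (join="outer"): rows are matched by index label, so only
# label 2 has values from both frames.
by_label = pd.concat([x, y], axis=1)

# Resetting both indexes first places the rows side by side by position.
by_position = pd.concat(
    [x.reset_index(drop=True), y.reset_index(drop=True)], axis=1
)
```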


### merge
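#### What it does
- `merge` joins the rows of two DataFrames based on one or more key columns, much like a JOIN in SQL.
- It supports several join types: inner (`inner`, the default), left (`left`), right (`right`), and full outer (`outer`).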

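#### When to use it
- When you want to combine two DataFrames on one or more shared columns (keys). `merge` takes two frames at a time, so chain calls to combine more.
- When the analysis involves more complex relationships, especially when the data comes from different sources and the structures are not identical.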

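#### Why use it
- `merge` combines data the way a relational model does: rows are matched on key values rather than on position or index label.
- You control the details of the join, such as which columns act as keys (`on`, `left_on`, `right_on`), which rows are kept (`how`), and how overlapping column names are distinguished (`suffixes`).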
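
To make the join types concrete, here is a small sketch that runs the same merge with each `how` option and compares the row counts. The `prices` and `names` tables are hypothetical and their prices are made up for illustration.

```python
import pandas as pd

# Hypothetical tables for illustration only; prices are invented.
prices = pd.DataFrame({"stock_no": ["2330", "2317", "0050"],
                       "price": [528, 106, 125]})
names = pd.DataFrame({"stock_no": ["2330", "2317", "2454"],
                      "name": ["TSMC", "Hon Hai", "MediaTek"]})

# Same key, four join types: only the set of rows kept differs.
results = {
    how: pd.merge(prices, names, on="stock_no", how=how)
    for how in ["inner", "left", "right", "outer"]
}

for how, frame in results.items():
    print(how, len(frame))
```

Two keys appear in both tables, so `inner` keeps 2 rows, `left` and `right` keep 3 each, and `outer` keeps 4.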


In [148]:
pd.merge(df1, df2, on="date", how="outer", suffixes=("_df1", "_df2"))

,date,price_df1,stock_no_df1,price_df2,stock_no_df2
0,2023-10-02,533.0,2330,,
1,2023-10-03,529.0,2330,,
2,2023-10-04,520.0,2330,,
3,2023-10-05,528.0,2330,103.5,2317
4,2023-10-06,,,105.5,2317
5,2023-10-11,,,106.0,2317


In [154]:
pd.merge(df1, df3, on="date", how="outer", suffixes=("_df1", "_df3"))

,date,price_df1,stock_no,price_df3,stock
0,2023-10-02,533,2330,533.0,0050
1,2023-10-02,533,2330,529.0,2330
2,2023-10-02,533,2330,520.0,2317
3,2023-10-03,529,2330,,
4,2023-10-04,520,2330,,
5,2023-10-05,528,2330,,
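
In the result above, the single `2023-10-02` row of `df1` is repeated three times because `df3` has three rows with that date. The sketch below, using made-up frames (`left` and `right` are hypothetical), shows two `merge` options that may help spot this kind of one-to-many match: `indicator=True` and `validate`.

```python
import pandas as pd

# Hypothetical frames: unique keys on the left, a duplicated key on the right.
left = pd.DataFrame({"date": ["2023-10-02", "2023-10-03"], "price": [533, 529]})
right = pd.DataFrame({"date": ["2023-10-02", "2023-10-02"], "price": [529, 520]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
merged = pd.merge(left, right, on="date", how="outer",
                  suffixes=("_left", "_right"), indicator=True)

# validate= raises MergeError when the key relationship is not the expected one.
try:
    pd.merge(left, right, on="date", validate="one_to_one")
    one_to_one_ok = True
except pd.errors.MergeError:
    one_to_one_ok = False
```

Here the check fails because `date` is not unique in `right`, which is exactly the situation that multiplies rows.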


In [None]:
pd.merge(df2, df3, left_on="stock_no", right_on="stock", suffixes=("_df2", "_df3"))

In [158]:
pd.merge(df1, df3, left_on="stock_no", right_on="stock", suffixes=("_df1", "_df3"), how="outer")

,date_df1,price_df1,stock_no,date_df3,price_df3,stock
0,2023-10-02,533.0,2330,2023-10-02,529,2330
1,2023-10-03,529.0,2330,2023-10-02,529,2330
2,2023-10-04,520.0,2330,2023-10-02,529,2330
3,2023-10-05,528.0,2330,2023-10-02,529,2330
4,,,,2023-10-02,533,0050
5,,,,2023-10-02,520,2317
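
When the keys are given with `left_on` and `right_on`, both key columns are kept in the result, as the `stock_no` and `stock` columns above show. One possible cleanup, sketched with hypothetical `quotes` and `holdings` frames whose values are made up, is to fill the gaps in one key column from the other and then drop the redundant one:

```python
import pandas as pd

# Hypothetical frames whose key columns have different names.
quotes = pd.DataFrame({"stock_no": ["2330", "2317"], "price": [528, 106]})
holdings = pd.DataFrame({"stock": ["2330", "0050"], "shares": [1000, 2000]})

merged = pd.merge(quotes, holdings,
                  left_on="stock_no", right_on="stock", how="outer")

# Both key columns survive the merge; collapse them into a single column.
merged["stock_no"] = merged["stock_no"].fillna(merged["stock"])
tidy = merged.drop(columns="stock")
```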


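### Summary
- **`concat`** fits cases where you simply stack data together. When the inputs share the same structure and you only need to append them, it is the right choice.
- **`merge`** fits cases where rows must be matched on shared columns. Pick it when you need fine control over keys and join type.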
