#### Merging & Joining Data
Data is often split across multiple tables or files. Pandas lets you combine them much like SQL joins, with some extra flexibility.

In [2]:
# Sample DataFrames
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})

departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

---

##### Merge Like SQL: `pd.merge()`

**1. Inner Join (default)**

In [3]:
pd.merge(employees, departments, on="DeptID")

|   | EmpID | Name | DeptID | DeptName |
|---|---|---|---|---|
| 0 | 1 | Alice | 10 | HR |
| 1 | 2 | Bob | 20 | Engineering |


Returns only the rows whose `DeptID` appears in both DataFrames.

---

**2. Left Join**

In [5]:
pd.merge(employees, departments, on="DeptID", how="left")

|   | EmpID | Name | DeptID | DeptName |
|---|---|---|---|---|
| 0 | 1 | Alice | 10 | HR |
| 1 | 2 | Bob | 20 | Engineering |
| 2 | 3 | Charlie | 30 | NaN |


Keeps every row from the left DataFrame (`employees`); rows with no match get `NaN` in the columns that come from the right.

---

**3. Right Join**

In [6]:
pd.merge(employees, departments, on="DeptID", how="right")

|   | EmpID | Name | DeptID | DeptName |
|---|---|---|---|---|
| 0 | 1.0 | Alice | 10 | HR |
| 1 | 2.0 | Bob | 20 | Engineering |
| 2 | NaN | NaN | 40 | Marketing |


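Keeps every row from the right DataFrame (`departments`); rows with no match get `NaN` in the columns that come from the left. Note that `EmpID` becomes a float column because it now contains `NaN`.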

---

**4. Outer Join**

In [7]:
pd.merge(employees, departments, on="DeptID", how="outer")

|   | EmpID | Name | DeptID | DeptName |
|---|---|---|---|---|
| 0 | 1.0 | Alice | 10 | HR |
| 1 | 2.0 | Bob | 20 | Engineering |
| 2 | 3.0 | Charlie | 30 | NaN |
| 3 | NaN | NaN | 40 | Marketing |


Keeps every row from both DataFrames and fills the missing values with `NaN`.
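
When rows go unmatched, it can help to see which side each row came from. `pd.merge()` accepts `indicator=True`, which adds a `_merge` column for this purpose. The following is a minimal sketch that reuses the sample DataFrames from above; the variable names `merged` and `unmatched` are illustrative only.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# indicator=True adds a "_merge" column whose values are
# "both", "left_only", or "right_only"
merged = pd.merge(employees, departments, on="DeptID", how="outer", indicator=True)

# Keep only the rows that found no partner on the other side
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["DeptID", "_merge"]])
```

Filtering on `_merge` like this is one way to audit a join before deciding whether an inner, left, or right join is the right choice.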

---

##### Concatenating DataFrames
Use `pd.concat()` to stack datasets either vertically or horizontally.

**Vertical (rows)**

In [12]:
df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

pd.concat([df1, df2])

|   | Name |
|---|---|
| 0 | Alice |
| 1 | Bob |
| 0 | Charlie |
| 1 | David |
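
The result above keeps each frame's original index, so the labels `0` and `1` appear twice. If the original labels carry no meaning, passing `ignore_index=True` renumbers the rows. A short sketch, where the name `stacked` is just for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

# ignore_index=True discards the original labels and renumbers rows from 0
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)
```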


**Horizontal (columns)**

In [14]:
df1 = pd.DataFrame({"ID": [1, 2]})
df2 = pd.DataFrame({"Score": [90, 80]})

pd.concat([df1, df2], axis=1)

|   | ID | Score |
|---|---|---|
| 0 | 1 | 90 |
| 1 | 2 | 80 |


> Make sure the indexes align when using `axis=1`: rows are matched by index label, not by position.
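
To illustrate the alignment caveat, here is a small sketch in which the second frame has a shifted index. The frames `ids` and `scores` are hypothetical, and resetting the index is only one possible fix; it assumes the rows really do correspond by position.

```python
import pandas as pd

ids = pd.DataFrame({"ID": [1, 2]})                        # index labels 0, 1
scores = pd.DataFrame({"Score": [90, 80]}, index=[1, 2])  # index labels 1, 2

# concat matches rows by index label, so labels present in only one
# frame produce NaN in the other frame's columns
misaligned = pd.concat([ids, scores], axis=1)
print(misaligned)

# Resetting the index first makes the rows pair up by position
aligned = pd.concat([ids, scores.reset_index(drop=True)], axis=1)
print(aligned)
```

If the frames share a meaningful key column rather than a positional order, `pd.merge()` on that key is usually the safer choice.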