# 🔗 Merging & Joining Data in Pandas 📊

Data is often split across multiple tables or files. Pandas lets you combine them with SQL-style joins (`pd.merge()`) and also stack them by rows or columns (`pd.concat()`). 😎

## 🚀 Sample DataFrames

### 👥 Employees DataFrame
```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30]
})
```

| **EmpID** | **Name** | **DeptID** |
| --------- | -------- | ---------- |
| 1         | Alice    | 10         |
| 2         | Bob      | 20         |
| 3         | Charlie  | 30         |

### 🏢 Departments DataFrame

```python
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"]
})
```

| **DeptID** | **DeptName** |
| ---------- | ------------ |
| 10         | HR           |
| 20         | Engineering  |
| 40         | Marketing    |

---

## 💡 Merge Like SQL with `pd.merge()`

### 🔍 Inner Join (default)

```python
pd.merge(employees, departments, on="DeptID")
```

Returns only rows whose `DeptID` appears in **both** DataFrames, so Charlie (`DeptID` 30) and Marketing (`DeptID` 40) are dropped:

| **EmpID** | **Name** | **DeptID** | **DeptName** |
| --------- | -------- | ---------- | ------------ |
| 1         | Alice    | 10         | HR           |
| 2         | Bob      | 20         | Engineering  |

### ⬅️ Left Join

```python
pd.merge(employees, departments, on="DeptID", how="left")
```

Keeps **all employees** and fills `NaN` where there’s no match:

| **EmpID** | **Name** | **DeptID** | **DeptName** |
| --------- | -------- | ---------- | ------------ |
| 1         | Alice    | 10         | HR           |
| 2         | Bob      | 20         | Engineering  |
| 3         | Charlie  | 30         | NaN          |

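A common follow-up to a left join is to list the rows that found no match. The sketch below assumes the same two sample frames and that `DeptName` has no missing values of its own, so a `NaN` there can only come from a failed match. The names `left` and `unmatched` are illustrative.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

left = pd.merge(employees, departments, on="DeptID", how="left")

# Rows where DeptName is missing had no matching DeptID in `departments`.
unmatched = left[left["DeptName"].isna()]
print(unmatched["Name"].tolist())
```

With the sample data, only Charlie ends up in `unmatched`.
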
### ➡️ Right Join

```python
pd.merge(employees, departments, on="DeptID", how="right")
```

Keeps **all departments**, even those with no employee. `EmpID` now contains a missing value, so pandas upcasts the column from integer to float:

| **EmpID** | **Name** | **DeptID** | **DeptName** |
| --------- | -------- | ---------- | ------------ |
| 1.0       | Alice    | 10         | HR           |
| 2.0       | Bob      | 20         | Engineering  |
| NaN       | NaN      | 40         | Marketing    |

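In the right join, `EmpID` picks up a missing value, and the default `int64` dtype cannot store `NaN`, so the column comes back as float. One possible workaround, sketched here under the assumption that you want the IDs to stay integers, is to cast the column to the nullable `Int64` dtype before merging. The names `plain` and `nullable` are illustrative.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# Default behaviour: the int64 column is upcast to float64 to hold NaN.
plain = pd.merge(employees, departments, on="DeptID", how="right")

# Nullable integers: missing IDs are stored as <NA> and the dtype stays Int64.
nullable = pd.merge(
    employees.astype({"EmpID": "Int64"}),
    departments,
    on="DeptID",
    how="right",
)

print(plain["EmpID"].dtype)
print(nullable["EmpID"].dtype)
```

The same cast should help with outer joins, which introduce missing values in the same way.
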
### 🌍 Outer Join

```python
pd.merge(employees, departments, on="DeptID", how="outer")
```

Includes **all rows from both DataFrames**, filling missing values with `NaN`. As with the right join, `EmpID` is upcast to float because it now contains a missing value:

| **EmpID** | **Name** | **DeptID** | **DeptName** |
| --------- | -------- | ---------- | ------------ |
| 1.0       | Alice    | 10         | HR           |
| 2.0       | Bob      | 20         | Engineering  |
| 3.0       | Charlie  | 30         | NaN          |
| NaN       | NaN      | 40         | Marketing    |

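After an outer join it is not always obvious which side each row came from. As a small sketch on the same sample frames, the `indicator=True` option of `pd.merge()` adds a `_merge` column with the values `both`, `left_only`, or `right_only`. The name `outer` is illustrative.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# indicator=True records the origin of every row in a `_merge` column.
outer = pd.merge(employees, departments, on="DeptID", how="outer", indicator=True)
print(outer[["DeptID", "_merge"]])
```
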
---

## ➕ Concatenating DataFrames (`pd.concat()`)

You can use `pd.concat()` to stack datasets either **vertically** (rows) or **horizontally** (columns).

### 📏 Vertical (rows)

```python
df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

pd.concat([df1, df2])
```

| **Name** |
| -------- |
| Alice    |
| Bob      |
| Charlie  |
| David    |

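The table above hides the index. By default, a vertical `pd.concat()` keeps each frame's original labels, so the result contains duplicates. A minimal sketch of the difference, using the same two frames (`stacked` and `renumbered` are illustrative names):

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

# Default: original index labels are kept, so 0 and 1 each appear twice.
stacked = pd.concat([df1, df2])
print(stacked.index.tolist())      # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3]
```

Renumbering is usually the safer choice when the old labels carry no meaning, since duplicate labels make `.loc` lookups return several rows.
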
### ➡️ Horizontal (columns)

```python
df1 = pd.DataFrame({"ID": [1, 2]})
df2 = pd.DataFrame({"Score": [90, 80]})

pd.concat([df1, df2], axis=1)
```

| **ID** | **Score** |
| ------ | --------- |
| 1      | 90        |
| 2      | 80        |

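> **Tip**: With `axis=1`, `pd.concat()` matches rows by index label, not by position. If the indexes differ, the result contains the union of all labels, with `NaN` wherever a frame has no row for a label. 🧩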

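To see why index alignment matters for `axis=1`, here is a small sketch with two hypothetical frames, `ids` and `scores`, whose index labels only partly overlap. Resetting the index is one way to force positional alignment, assuming the rows really are in the same order.

```python
import pandas as pd

ids = pd.DataFrame({"ID": [1, 2]})                         # index labels 0, 1
scores = pd.DataFrame({"Score": [90, 80]}, index=[1, 2])   # index labels 1, 2

# Labels 0 and 2 exist in only one frame, so each gets a NaN in the other column.
misaligned = pd.concat([ids, scores], axis=1)
print(misaligned)

# Dropping the old labels makes both frames use 0, 1, so rows pair up by position.
aligned = pd.concat([ids, scores.reset_index(drop=True)], axis=1)
print(aligned)
```
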
---

## 📌 When to Use What?

| **Use Case**                            | **Method**                             |
| --------------------------------------- | -------------------------------------- |
| SQL-style joins (merge on common keys)  | `pd.merge()` or `.join()`              |
| Stack datasets vertically (rows)        | `pd.concat([df1, df2])`                |
| Combine different features side-by-side | `pd.concat([df1, df2], axis=1)`        |
| Align on index                          | `.join()` or `merge(right_index=True)` |

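The last row of the table can be sketched on the sample data as follows. This assumes the departments are re-indexed by `DeptID` first. `dept_by_id`, `joined`, and `merged` are illustrative names.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# Move the key into the index of the right-hand frame.
dept_by_id = departments.set_index("DeptID")

# .join() is a left join by default; `on` names the column in `employees`
# that is matched against the index of `dept_by_id`.
joined = employees.join(dept_by_id, on="DeptID")

# The equivalent pd.merge() call spells out the same choices explicitly.
merged = pd.merge(employees, dept_by_id, left_on="DeptID", right_index=True, how="left")

print(joined)
```
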
---

## 📝 Summary

* Use **`merge()`** like SQL joins (inner, left, right, outer) 🔄.
* Use **`concat()`** to stack DataFrames (rows or columns) 📊.
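* Check for **mismatched keys and indexes**: unmatched rows are dropped by an inner join and filled with `NaN` by left, right, and outer joins ⚠️.
* Columns that gain `NaN` can change dtype (for example, integer `EmpID` becomes float) 🚀.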

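One way to guard against unexpected keys is the `validate` argument of `pd.merge()`, which raises `pandas.errors.MergeError` when the keys do not have the relationship you declare. The sketch below uses a hypothetical `dup_departments` frame in which `DeptID` 10 appears twice. The other names are illustrative too.

```python
import pandas as pd
from pandas.errors import MergeError

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
# Hypothetical lookup table with a duplicated key (DeptID 10).
dup_departments = pd.DataFrame({
    "DeptID": [10, 10, 20],
    "DeptName": ["HR", "People Ops", "Engineering"],
})

# Without validation the duplicate silently repeats Alice's row.
unchecked = pd.merge(employees, dup_departments, on="DeptID")
print(unchecked["Name"].tolist())

# "many_to_one" requires the keys in the right-hand frame to be unique.
raised = False
try:
    pd.merge(employees, dup_departments, on="DeptID", validate="many_to_one")
except MergeError as exc:
    raised = True
    print(f"Merge rejected: {exc}")
```
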
---

🔑 **Key takeaway**: Use `merge()` when rows must be matched on keys and `concat()` when frames only need to be stacked. In both cases, check the result for unexpected `NaN` values and dtype changes. 🌟

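In [2]:

```python
import pandas as pd
```

In [3]:

```python
employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30]
})
```

In [4]:

```python
employees
```

|     | **EmpID** | **Name** | **DeptID** |
| --- | --------- | -------- | ---------- |
| 0   | 1         | Alice    | 10         |
| 1   | 2         | Bob      | 20         |
| 2   | 3         | Charlie  | 30         |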


In [5]:

```python
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"]
})
```

In [6]:

```python
departments
```

|     | **DeptID** | **DeptName** |
| --- | ---------- | ------------ |
| 0   | 10         | HR           |
| 1   | 20         | Engineering  |
| 2   | 40         | Marketing    |


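In [7]:

```python
# Inner join: keep only the DeptIDs present in both frames
pd.merge(employees, departments, on="DeptID")
```

|     | **EmpID** | **Name** | **DeptID** | **DeptName** |
| --- | --------- | -------- | ---------- | ------------ |
| 0   | 1         | Alice    | 10         | HR           |
| 1   | 2         | Bob      | 20         | Engineering  |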


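In [8]:

```python
pd.merge(employees, departments, on="DeptID", how="left")  # Left join
```

|     | **EmpID** | **Name** | **DeptID** | **DeptName** |
| --- | --------- | -------- | ---------- | ------------ |
| 0   | 1         | Alice    | 10         | HR           |
| 1   | 2         | Bob      | 20         | Engineering  |
| 2   | 3         | Charlie  | 30         | NaN          |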


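In [9]:

```python
pd.merge(employees, departments, on="DeptID", how="right")  # Right join
```

|     | **EmpID** | **Name** | **DeptID** | **DeptName** |
| --- | --------- | -------- | ---------- | ------------ |
| 0   | 1.0       | Alice    | 10         | HR           |
| 1   | 2.0       | Bob      | 20         | Engineering  |
| 2   | NaN       | NaN      | 40         | Marketing    |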


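In [10]:

```python
pd.merge(employees, departments, on="DeptID", how="outer")  # Outer join
```

|     | **EmpID** | **Name** | **DeptID** | **DeptName** |
| --- | --------- | -------- | ---------- | ------------ |
| 0   | 1.0       | Alice    | 10         | HR           |
| 1   | 2.0       | Bob      | 20         | Engineering  |
| 2   | 3.0       | Charlie  | 30         | NaN          |
| 3   | NaN       | NaN      | 40         | Marketing    |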


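In [11]:

```python
df1 = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [423, 634]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"], "Age": [25, 30]})
```

In [12]:

```python
df1
```

|     | **Name** | **Score** |
| --- | -------- | --------- |
| 0   | Alice    | 423       |
| 1   | Bob      | 634       |

In [13]:

```python
df2
```

|     | **Name** | **Age** |
| --- | -------- | ------- |
| 0   | Charlie  | 25      |
| 1   | David    | 30      |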


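In [14]:

```python
pd.concat([df1, df2])
```

|     | **Name** | **Score** | **Age** |
| --- | -------- | --------- | ------- |
| 0   | Alice    | 423.0     | NaN     |
| 1   | Bob      | 634.0     | NaN     |
| 0   | Charlie  | NaN       | 25.0    |
| 1   | David    | NaN       | 30.0    |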


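In [15]:

```python
pd.concat([df1, df2], axis=1)
```

|     | **Name** | **Score** | **Name** | **Age** |
| --- | -------- | --------- | -------- | ------- |
| 0   | Alice    | 423       | Charlie  | 25      |
| 1   | Bob      | 634       | David    | 30      |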


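In [16]:

```python
# axis=0 is the default, so this gives the same result as pd.concat([df1, df2])
pd.concat([df1, df2], axis=0)
```

|     | **Name** | **Score** | **Age** |
| --- | -------- | --------- | ------- |
| 0   | Alice    | 423.0     | NaN     |
| 1   | Bob      | 634.0     | NaN     |
| 0   | Charlie  | NaN       | 25.0    |
| 1   | David    | NaN       | 30.0    |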
