# Merging, Joining, and Concatenation in Pandas

## **🔹 Merge: All Scenarios (Using Columns and Index)**

### **1️⃣ Merging on a Common Column (SQL-Style Join)**

In [None]:
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'Score': [85, 90, 78]})

df_merged = df1.merge(df2, on='ID', how='inner')  # Inner join
df_merged

|   | ID | Name | Score |
|---|----|------|-------|
| 0 | 2 | Bob | 85 |
| 1 | 3 | Charlie | 90 |
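
When the key column has a different name in each DataFrame, `on=` cannot be used. The sketch below assumes two hypothetical frames, `students` and `results`, whose key is called `ID` on one side and `StudentID` on the other, and matches them with `left_on` / `right_on`.

```python
import pandas as pd

# Hypothetical frames: the same key is named ID on the left and StudentID on the right
students = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
results = pd.DataFrame({'StudentID': [2, 3, 4], 'Score': [85, 90, 78]})

# Name the key column separately for each side
matched = students.merge(results, left_on='ID', right_on='StudentID', how='inner')

# Both key columns survive the merge, so drop the redundant one
matched = matched.drop(columns='StudentID')
print(matched)
```

Only IDs 2 and 3 appear in both frames, so the inner join keeps just those two rows.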


### **2️⃣ Different Join Types in `merge()`**


In [None]:
df_left = df1.merge(df2, on='ID', how='left')   # Left Join
df_right = df1.merge(df2, on='ID', how='right') # Right Join
df_outer = df1.merge(df2, on='ID', how='outer') # Outer Join
df_outer

|   | ID | Name | Score |
|---|----|------|-------|
| 0 | 1 | Alice | NaN |
| 1 | 2 | Bob | 85.0 |
| 2 | 3 | Charlie | 90.0 |
| 3 | 4 | NaN | 78.0 |
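
To see which side each row of an outer join came from, `merge()` accepts `indicator=True`. The sketch below rebuilds the same two frames under the illustrative names `names` and `scores` and then filters for the unmatched rows.

```python
import pandas as pd

names = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
scores = pd.DataFrame({'ID': [2, 3, 4], 'Score': [85, 90, 78]})

# indicator=True adds a `_merge` column with 'left_only', 'right_only', or 'both'
flagged = names.merge(scores, on='ID', how='outer', indicator=True)

# Rows that were found in only one of the two frames
unmatched = flagged[flagged['_merge'] != 'both']
print(unmatched[['ID', '_merge']])
```

This is a convenient way to audit a join before deciding whether `inner`, `left`, or `right` is the right choice.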


### **3️⃣ Merging on Index Instead of Column**


In [None]:
df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[1, 2, 3])
df2 = pd.DataFrame({'Score': [85, 90, 78]}, index=[2, 3, 4])
display(df1)
display(df2)
df_merged = df1.merge(df2, left_index=True, right_index=True, how='outer')
display(df_merged)

|   | Name | Score |
|---|------|-------|
| 1 | Alice | NaN |
| 2 | Bob | 85.0 |
| 3 | Charlie | 90.0 |
| 4 | NaN | 78.0 |



### **📌 Explanation**
- `left_index=True` → Uses `df1`'s **index** as the merge key.
- `right_index=True` → Uses `df2`'s **index** as the merge key.
- **`how='outer'`** → Keeps **every** index value from both DataFrames; cells with no match are filled with `NaN`.

✅ **This method is useful when both DataFrames already have meaningful indexes!**  


### **Final Thoughts**
- **Use `merge(on='column')` when joining on a column.**  
- **Use `merge(left_index=True, right_index=True)` when joining on an index.**  
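
The two approaches can also be mixed: one side can supply a column while the other supplies its index. The sketch below assumes hypothetical frames `people` (key stored as a column) and `scores` (key stored as the index).

```python
import pandas as pd

# Illustrative frames: `people` keeps ID as a column, `scores` keeps it in the index
people = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
scores = pd.DataFrame({'Score': [85, 90, 78]}, index=[2, 3, 4])

# Match the ID column on the left against the index on the right
mixed = people.merge(scores, left_on='ID', right_index=True, how='left')
print(mixed)
```

Because this is a left join, every row of `people` is kept and ID 1, which has no score, gets `NaN`.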


## **🔹 Join: All Scenarios (Using Columns and Index)**

### **4️⃣ Joining on Index (Default Behavior)**

In [None]:
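df1 = pd.DataFrame({'A': [1, 2]}, index=[0, 1])  # index labels 0 and 1
df2 = pd.DataFrame({'B': [3, 4]}, index=[1, 2])  # index labels 1 and 2
display(df1)
display(df2)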

|   | A |
|---|---|
| 0 | 1 |
| 1 | 2 |


|   | B |
|---|---|
| 1 | 3 |
| 2 | 4 |


In [None]:
df_joined = df1.join(df2, how='outer')
df_joined

|   | A | B |
|---|---|---|
| 0 | 1.0 | NaN |
| 1 | 2.0 | 3.0 |
| 2 | NaN | 4.0 |
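
Note that `how='outer'` was passed explicitly above. When `how` is omitted, `join()` performs a left join. The sketch below uses the illustrative names `left` and `right` for the same two frames to compare the default with an inner join.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
right = pd.DataFrame({'B': [3, 4]}, index=[1, 2])

# Default is how='left': every row of `left` is kept, label 2 from `right` is dropped
default_join = left.join(right)

# how='inner' keeps only the labels present in both frames
inner_join = left.join(right, how='inner')

print(default_join)
print(inner_join)
```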


### **5️⃣ Joining on a Column Instead of Index**


In [None]:
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Score': [85, 90, 78]})

df_joined = df1.set_index('ID').join(df2.set_index('ID'), how='left')
print(df_joined)

       Name  Score
ID                
1     Alice   85.0
2       Bob   90.0
3   Charlie    NaN
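
The example above moves `ID` into the index on both sides. As an alternative sketch, `join()` also takes an `on=` argument that matches a column of the calling DataFrame against the index of the other one, so only the right-hand frame needs `set_index`. The names `people` and `scores` are illustrative.

```python
import pandas as pd

people = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
scores = pd.DataFrame({'ID': [1, 2, 4], 'Score': [85, 90, 78]})

# on='ID' refers to a column of `people`; `scores` must carry the key in its index
joined = people.join(scores.set_index('ID'), on='ID', how='left')
print(joined)
```

With this form `ID` stays an ordinary column in the result instead of becoming the index.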


## **🔹 Concatenation: All Scenarios (Horizontal and Vertical)**

### **6️⃣ Vertical Concatenation (`axis=0`)**

In [None]:
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [3, 4], 'Name': ['Charlie', 'David']})

df_vertical = pd.concat([df1, df2], axis=0, ignore_index=True)
df_vertical

|   | ID | Name |
|---|----|------|
| 0 | 1 | Alice |
| 1 | 2 | Bob |
| 2 | 3 | Charlie |
| 3 | 4 | David |
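
`ignore_index=True` is what produces the clean 0 to 3 index above. The sketch below, using the illustrative names `batch_a` and `batch_b`, shows what happens without it and how `keys=` can label each source frame instead.

```python
import pandas as pd

batch_a = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
batch_b = pd.DataFrame({'ID': [3, 4], 'Name': ['Charlie', 'David']})

# Without ignore_index the original row labels are kept, so they repeat: 0, 1, 0, 1
stacked = pd.concat([batch_a, batch_b])

# keys= adds an outer index level that records which frame each row came from
labelled = pd.concat([batch_a, batch_b], keys=['a', 'b'])

print(stacked.index.tolist())
print(labelled.loc['b'])
```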


### **7️⃣ Horizontal Concatenation (`axis=1`)**


In [None]:
df1 = pd.DataFrame({'A': [1, 2, 3]})
df2 = pd.DataFrame({'B': [4, 5, 6]})

df_horizontal = pd.concat([df1, df2], axis=1)
df_horizontal

|   | A | B |
|---|---|---|
| 0 | 1 | 4 |
| 1 | 2 | 5 |
| 2 | 3 | 6 |


### **8️⃣ Handling Mismatched Indexes in Concatenation**


In [None]:
df1 = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
df2 = pd.DataFrame({'B': [3, 4]}, index=[1, 2])

df_mismatch = pd.concat([df1, df2], axis=1)
df_mismatch

|   | A | B |
|---|---|---|
| 0 | 1.0 | NaN |
| 1 | 2.0 | 3.0 |
| 2 | NaN | 4.0 |
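
The `NaN` values appear because `concat(axis=1)` aligns rows by index label and defaults to `join='outer'`. The sketch below shows two possible ways to avoid them, using the illustrative names `left` and `right` for the same frames.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
right = pd.DataFrame({'B': [3, 4]}, index=[1, 2])

# join='inner' keeps only the labels present in every frame (here: label 1)
shared = pd.concat([left, right], axis=1, join='inner')

# To pair rows by position instead of by label, reset both indexes first
by_position = pd.concat(
    [left.reset_index(drop=True), right.reset_index(drop=True)], axis=1
)

print(shared)
print(by_position)
```

Which option is appropriate depends on whether the index labels carry meaning. Resetting the index discards them.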



---

## **🔹 Comparison: `merge()` vs. `join()` vs. `concat()`**

| Feature | `merge()` | `join()` | `concat()` |
|---------|----------|----------|-----------|
| SQL-style joins on column | ✅ Yes | ⚠️ Via `on=` (caller's column against the other's index) | ❌ No |
| Join on index | ✅ Yes (`left_index` / `right_index`) | ✅ Yes (default) | ⚠️ Aligns on index only |
| Stack DataFrames vertically | ❌ No | ❌ No | ✅ Yes |
| Stack DataFrames horizontally | ❌ No | ❌ No | ✅ Yes |
| Inner/Outer/Left/Right joins | ✅ Yes | ✅ Yes | ⚠️ Inner/outer only (`join=`) |
| Works with MultiIndex | ✅ Yes | ✅ Yes | ✅ Yes |
| Handles duplicate column names | ✅ Yes (adds `_x` / `_y` suffixes) | ⚠️ Needs `lsuffix` / `rsuffix` | ⚠️ Keeps duplicate labels (with `axis=1`) |
| Best for structured, relational data | ✅ Yes | ❌ No | ❌ No |
| Best for quick index-based joins | ❌ No | ✅ Yes | ❌ No |
| Best for combining multiple DataFrames | ❌ No | ❌ No | ✅ Yes |
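
The "duplicate column names" row can be illustrated with a short sketch. The frames `q1` and `q2` are hypothetical and both carry a `Sales` column.

```python
import pandas as pd

q1 = pd.DataFrame({'ID': [1, 2], 'Sales': [10, 20]})
q2 = pd.DataFrame({'ID': [1, 2], 'Sales': [15, 25]})

# merge() disambiguates overlapping columns automatically with _x / _y
merged = q1.merge(q2, on='ID')

# The suffixes can be chosen explicitly
named = q1.merge(q2, on='ID', suffixes=('_q1', '_q2'))

# join() raises ValueError on overlapping columns unless suffixes are supplied
try:
    q1.set_index('ID').join(q2.set_index('ID'))
except ValueError as exc:
    print('join needs suffixes:', exc)

joined = q1.set_index('ID').join(q2.set_index('ID'), lsuffix='_q1', rsuffix='_q2')
print(merged.columns.tolist(), named.columns.tolist(), joined.columns.tolist())
```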

---

## **🔹 Final Recommendation**
- **Use `merge()`** when **you need to join on a specific column (SQL-style joins).**
- **Use `join()`** when **you want to join DataFrames based on the index.**
- **Use `concat()`** when **you need to stack DataFrames (vertically or horizontally).**
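
For index-aligned data the three functions overlap. As a rough sketch with two illustrative frames `a` and `b`, an outer `merge()` on the index, an outer `join()`, and a horizontal `concat()` all produce the same table, so the choice often comes down to readability.

```python
import pandas as pd

a = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
b = pd.DataFrame({'B': [3, 4]}, index=[1, 2])

via_merge = a.merge(b, left_index=True, right_index=True, how='outer')
via_join = a.join(b, how='outer')
via_concat = pd.concat([a, b], axis=1)

# Print the shape of each result to compare them side by side
for name, frame in [('merge', via_merge), ('join', via_join), ('concat', via_concat)]:
    print(name, frame.index.tolist(), frame.columns.tolist())
```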
