# Merging & Joining Data
Often, data is split across multiple tables or files. Pandas lets you combine them just like SQL, or even more flexibly!

In [1]:
import pandas as pd

In [2]:
emp = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30]
})

In [3]:
emp

,EmpID,Name,DeptID
0,1,Alice,10
1,2,Bob,20
2,3,Charlie,30


In [4]:
dept = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"]
})

In [5]:
dept

,DeptID,DeptName
0,10,HR
1,20,Engineering
2,40,Marketing


## Merge Like SQL: pd.merge()
Inner Join (default): keeps only the rows whose `DeptID` appears in both tables.

In [7]:
pd.merge(emp, dept, on="DeptID")

,EmpID,Name,DeptID,DeptName
0,1,Alice,10,HR
1,2,Bob,20,Engineering


## Left Join
Keeps all employees and fills `DeptName` with NaN where no department matches.

In [9]:
pd.merge(emp, dept, on="DeptID", how="left")

,EmpID,Name,DeptID,DeptName
0,1,Alice,10,HR
1,2,Bob,20,Engineering
2,3,Charlie,30,


## Right Join
Keeps all departments, even those with no employees; the employee columns are filled with NaN.

In [10]:
pd.merge(emp, dept, on="DeptID", how="right")

,EmpID,Name,DeptID,DeptName
0,1.0,Alice,10,HR
1,2.0,Bob,20,Engineering
2,,,40,Marketing


## Outer Join
Includes all rows from both tables and fills missing values with NaN.

In [11]:
pd.merge(emp, dept, on="DeptID", how="outer")

,EmpID,Name,DeptID,DeptName
0,1.0,Alice,10,HR
1,2.0,Bob,20,Engineering
2,3.0,Charlie,30,
3,,,40,Marketing
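To see which side each row of an outer join came from, `pd.merge()` accepts an `indicator` argument. The following is a minimal sketch that rebuilds the same `emp` and `dept` tables; the names `audit` and `unmatched` are illustrative only.

```python
import pandas as pd

emp = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
dept = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# indicator=True adds a "_merge" column holding "both", "left_only", or "right_only".
audit = pd.merge(emp, dept, on="DeptID", how="outer", indicator=True)

# Keep only the rows that found no partner in the other table.
unmatched = audit[audit["_merge"] != "both"]
print(unmatched[["DeptID", "_merge"]])
```

This can be a quick way to audit a join before deciding whether an inner, left, or right join is the right choice.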


# Concatenating DataFrames

Use `pd.concat()` to stack datasets either vertically or horizontally.

Vertical (rows)

In [12]:
df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

pd.concat([df1, df2])

,Name
0,Alice
1,Bob
0,Charlie
1,David
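Note that the stacked result keeps each frame's original index, so the labels 0 and 1 appear twice. If a fresh index is preferred, one option is `ignore_index=True`, sketched below (the name `stacked` is just for illustration).

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

# ignore_index=True discards the original labels and renumbers the rows from 0.
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)
```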


Horizontal (columns)<br>
Make sure the indexes align when using `axis=1`: rows are matched by index label, not by position.

In [13]:
df1 = pd.DataFrame({"ID": [1, 2]})
df2 = pd.DataFrame({"Score": [90, 80]})

pd.concat([df1, df2], axis=1)

,ID,Score
0,1,90
1,2,80
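The example above works because both frames share the default index 0, 1. As a rough sketch of what can happen otherwise, the hypothetical `scores` frame below is given a shifted index on purpose; resetting the index is one way to pair the rows by position again.

```python
import pandas as pd

ids = pd.DataFrame({"ID": [1, 2]})                        # index 0, 1
scores = pd.DataFrame({"Score": [90, 80]}, index=[1, 2])  # index 1, 2 (shifted on purpose)

# With axis=1 the index labels are unioned, so rows without a partner get NaN.
misaligned = pd.concat([ids, scores], axis=1)
print(misaligned)

# Dropping the old labels pairs the rows by position instead.
aligned = pd.concat([ids, scores.reset_index(drop=True)], axis=1)
print(aligned)
```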


# When to use what
| Use Case                            | Method                                      |
|-------------------------------------|---------------------------------------------|
| SQL-style joins (merge keys)        | `pd.merge()` or `.join()`                   |
| Stack datasets vertically           | `pd.concat([df1, df2])`                     |
| Combine different features side-by-side | `pd.concat([df1, df2], axis=1)`           |
| Align on index                      | `.join()` or `merge(..., right_index=True)` |
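The last row of the table can be sketched as follows, reusing the `emp` and `dept` tables from earlier. Here `dept_by_id` is a hypothetical helper frame with `DeptID` moved into the index; both spellings should give the same left join.

```python
import pandas as pd

emp = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
dept = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# Move the key into the index so the lookup table can be aligned on it.
dept_by_id = dept.set_index("DeptID")

# .join() is a left join by default; on="DeptID" matches emp's column
# against the index of dept_by_id.
joined = emp.join(dept_by_id, on="DeptID")

# The same operation written with merge().
merged = emp.merge(dept_by_id, left_on="DeptID", right_index=True, how="left")

print(joined)
```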

# Summary
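1. Use `pd.merge()` for SQL-style joins (inner, left, right, outer).
2. Use `pd.concat()` to stack DataFrames by rows or by columns.
3. Handle mismatched keys and indexes with care: unmatched rows are dropped or filled with NaN, depending on the join type.
4. Merging and joining are essential for real-world projects, where data rarely arrives in a single table.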
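One way to apply the "with care" advice is the `validate` argument of `pd.merge()`, which checks the key relationship before joining. The sketch below uses a hypothetical `dept_dupes` table in which `DeptID` 10 is listed twice by mistake; that data is invented purely for illustration.

```python
import pandas as pd

emp = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})

# Hypothetical lookup table with a duplicated key (DeptID 10 appears twice).
dept_dupes = pd.DataFrame({
    "DeptID": [10, 10, 20],
    "DeptName": ["HR", "People Ops", "Engineering"],
})

# Without a check, the duplicate key silently duplicates Alice's row.
unchecked = pd.merge(emp, dept_dupes, on="DeptID", how="left")
print(len(emp), len(unchecked))

# validate="many_to_one" requires the right-hand keys to be unique
# and raises MergeError when they are not.
try:
    pd.merge(emp, dept_dupes, on="DeptID", how="left", validate="many_to_one")
    caught = False
except pd.errors.MergeError:
    caught = True
print(caught)
```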