
**Data is often split across multiple tables or files. pandas can combine them with SQL-style joins, and it also supports stacking DataFrames and aligning them on their index.**

In [1]:
import pandas as pd


Sample DataFrames

In [2]:
employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30]
})

departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"]
})

## Merge Like SQL: `pd.merge()`


**Inner Join (default)**
Keeps only rows whose `DeptID` appears in both DataFrames, so Charlie (`DeptID` 30) and Marketing (`DeptID` 40) are dropped:

In [3]:
pd.merge(employees, departments, on="DeptID")

|   | EmpID | Name  | DeptID | DeptName    |
|---|-------|-------|--------|-------------|
| 0 | 1     | Alice | 10     | HR          |
| 1 | 2     | Bob   | 20     | Engineering |
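
To see which rows an inner join discards, one option is to compare the key columns directly before merging. The following is a minimal sketch that redefines the sample DataFrames so it runs on its own; the names `unmatched_employees` and `unmatched_departments` are illustrative and not part of the original notebook.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# A row survives an inner join only if its key exists on the other side.
# Selecting the rows where that is NOT true shows what will be dropped.
unmatched_employees = employees[~employees["DeptID"].isin(departments["DeptID"])]
unmatched_departments = departments[~departments["DeptID"].isin(employees["DeptID"])]

print(unmatched_employees)    # the Charlie row (DeptID 30)
print(unmatched_departments)  # the Marketing row (DeptID 40)
```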


**Left Join**
Keeps every row from `employees`; rows without a matching `DeptID` get `NaN` in `DeptName`:

In [4]:
pd.merge(employees, departments, on="DeptID", how="left")

|   | EmpID | Name    | DeptID | DeptName    |
|---|-------|---------|--------|-------------|
| 0 | 1     | Alice   | 10     | HR          |
| 1 | 2     | Bob     | 20     | Engineering |
| 2 | 3     | Charlie | 30     | NaN         |
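
After a left join, the missing values on the right-hand columns often need a placeholder before reporting. A small sketch of one way to do that with `fillna()`; the label `"Unassigned"` is an arbitrary choice made for this example, and the sample DataFrames are redefined so the block is self-contained.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

left = pd.merge(employees, departments, on="DeptID", how="left")

# Charlie's DeptID (30) has no match, so DeptName is NaN after the merge.
# Replace it with a placeholder label (hypothetical value for illustration).
left["DeptName"] = left["DeptName"].fillna("Unassigned")

print(left)
```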


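**Right Join**
Keeps every row from `departments`; departments without employees get `NaN` in the employee columns: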

In [5]:
pd.merge(employees, departments, on="DeptID", how="right")

|   | EmpID | Name  | DeptID | DeptName    |
|---|-------|-------|--------|-------------|
| 0 | 1.0   | Alice | 10     | HR          |
| 1 | 2.0   | Bob   | 20     | Engineering |
| 2 | NaN   | NaN   | 40     | Marketing   |
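
Note that `EmpID` is shown as `1.0` and `2.0` in this result: a plain `int64` column cannot hold `NaN`, so pandas converts it to `float64`. If integer IDs matter, one possible fix is to cast to the nullable `Int64` dtype, as in this sketch (the sample DataFrames are redefined so the block runs on its own):

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

right = pd.merge(employees, departments, on="DeptID", how="right")

# The unmatched Marketing row introduces NaN, which forces float64.
dtype_before = right["EmpID"].dtype
print(dtype_before)

# The nullable Int64 dtype (capital "I") stores integers alongside <NA>.
right["EmpID"] = right["EmpID"].astype("Int64")
print(right)
```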


**Outer Join**
Keeps every row from both DataFrames and fills the gaps on either side with `NaN`:

In [6]:
pd.merge(employees, departments, on="DeptID", how="outer")

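|   | EmpID | Name    | DeptID | DeptName    |
|---|-------|---------|--------|-------------|
| 0 | 1.0   | Alice   | 10     | HR          |
| 1 | 2.0   | Bob     | 20     | Engineering |
| 2 | 3.0   | Charlie | 30     | NaN         |
| 3 | NaN   | NaN     | 40     | Marketing   |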
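
With an outer join it can be hard to tell at a glance which side each row came from. Passing `indicator=True` to `pd.merge()` adds a `_merge` column with the values `both`, `left_only`, or `right_only`. A short sketch, with the sample DataFrames redefined so the block is self-contained:

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

outer = pd.merge(employees, departments, on="DeptID", how="outer", indicator=True)

# _merge records the origin of each row's key.
print(outer[["DeptID", "_merge"]])

# Keep only the rows that failed to match on one side or the other.
unmatched = outer[outer["_merge"] != "both"]
print(unmatched)
```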


## Concatenating DataFrames
Use `pd.concat()` to stack DataFrames vertically (`axis=0`, the default) or horizontally (`axis=1`).



**Vertical (rows)**

In [7]:
df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

pd.concat([df1, df2])

|   | Name    |
|---|---------|
| 0 | Alice   |
| 1 | Bob     |
| 0 | Charlie |
| 1 | David   |
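
The result keeps each input's original index, so the labels `0` and `1` appear twice. If a fresh, unique index is preferable, `ignore_index=True` renumbers the rows. A minimal sketch (the name `stacked` is illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Name": ["Charlie", "David"]})

# ignore_index=True discards the original labels and renumbers from 0.
stacked = pd.concat([df1, df2], ignore_index=True)

print(stacked)
```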


**Horizontal (columns)**

In [8]:
df1 = pd.DataFrame({"ID": [1, 2]})
df2 = pd.DataFrame({"Score": [90, 80]})

pd.concat([df1, df2], axis=1)

|   | ID | Score |
|---|----|-------|
| 0 | 1  | 90    |
| 1 | 2  | 80    |
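
Horizontal concatenation lines rows up by index label, not by position. The example above works because both DataFrames share the default index `0, 1`. The sketch below uses deliberately mismatched index labels (chosen for this illustration) to show the effect, and then one possible way to force positional alignment with `reset_index(drop=True)`.

```python
import pandas as pd

# Index labels overlap only at 1 (hypothetical data for illustration).
df1 = pd.DataFrame({"ID": [1, 2]}, index=[0, 1])
df2 = pd.DataFrame({"Score": [90, 80]}, index=[1, 2])

# Rows are matched on index labels; labels missing on one side become NaN.
misaligned = pd.concat([df1, df2], axis=1)
print(misaligned)

# Resetting both indexes makes the rows pair up by position instead.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)],
    axis=1,
)
print(aligned)
```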


Here's a quick guide on when to use each method for combining DataFrames:

| Use Case                      | Method                     |
|-------------------------------|----------------------------|
| SQL-style joins (merge keys)  | `pd.merge()` or `.join()`  |
| Stack datasets vertically     | `pd.concat([df1, df2])`    |
| Combine different features side-by-side | `pd.concat([df1, df2], axis=1)` |
| Align on index              | `.join()` or `pd.merge()` with `left_index=True, right_index=True` |
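
The index-based row of the table can be sketched as follows. Here `DeptID` is moved into the index of both sample DataFrames (a setup step assumed for this example), after which `.join()` and `pd.merge()` with both index flags are two ways of expressing the same left join.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})
departments = pd.DataFrame({
    "DeptID": [10, 20, 40],
    "DeptName": ["HR", "Engineering", "Marketing"],
})

# Move the key into the index so rows can be aligned on it.
emp = employees.set_index("DeptID")
dept = departments.set_index("DeptID")

# DataFrame.join aligns on the index and defaults to a left join.
joined = emp.join(dept, how="left")

# The same operation written with pd.merge and explicit index flags.
merged = pd.merge(emp, dept, left_index=True, right_index=True, how="left")

print(joined)
print(merged)
```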

## Summary

* Use `merge()` for SQL-style joins (inner, left, right, outer) based on common columns (keys).
* Use `concat()` to stack DataFrames either vertically (adding rows) or horizontally (adding columns).
* Watch for mismatched keys and indexes: unmatched keys are dropped by an inner join and filled with `NaN` by left, right, and outer joins, and `concat()` aligns on index labels rather than row position.
* Merging and joining DataFrames are fundamental operations for real-world data analysis projects involving multiple data sources.
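
Duplicate keys are another source of unexpected results: they silently multiply rows. One way to guard against this is the `validate` argument of `pd.merge()`, which raises `pandas.errors.MergeError` when the key relationship is not what you declared. The sketch below uses a hypothetical `departments_dup` table with a repeated `DeptID`, invented for this illustration.

```python
import pandas as pd

employees = pd.DataFrame({
    "EmpID": [1, 2, 3],
    "Name": ["Alice", "Bob", "Charlie"],
    "DeptID": [10, 20, 30],
})

# Hypothetical lookup table in which DeptID 10 appears twice.
departments_dup = pd.DataFrame({
    "DeptID": [10, 10, 20],
    "DeptName": ["HR", "People Ops", "Engineering"],
})

# Without validation, Alice is matched twice and appears in two rows.
unchecked = pd.merge(employees, departments_dup, on="DeptID")
print(len(unchecked))

# validate="many_to_one" requires the right-hand keys to be unique.
caught = False
try:
    pd.merge(employees, departments_dup, on="DeptID", validate="many_to_one")
except pd.errors.MergeError as exc:
    caught = True
    print("Merge rejected:", exc)
```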