---
# Merge, Join and Concatenate
---

---
## Understanding Merge, Join, and Concatenate in Pandas
---
In data analysis, combining datasets is a fundamental operation. Pandas provides three main tools for this: `DataFrame.merge`, `DataFrame.join`, and `pd.concat`.

### Merge

Merge combines DataFrames based on common columns (similar to SQL joins). It aligns rows by matching values in specified columns.

Key features:
- Can combine DataFrames on one or more key columns
- Supports different join types through the `how` parameter: inner (the default), left, right, and outer
- The `on` parameter specifies which column(s) to match on

In our examples:
- Inner merge (`how="inner"`) keeps only rows that match in both DataFrames
- Left merge (`how="left"`) keeps all rows from the left DataFrame; columns from the right are NaN where there is no match
- Right merge (`how="right"`) keeps all rows from the right DataFrame; columns from the left are NaN where there is no match
- Outer merge (`how="outer"`) keeps all rows from both DataFrames, with NaN wherever one side has no match
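
The "one or more keys" case can be sketched by passing a list to `on`. This is a minimal illustration that assumes the same two small frames built in the cells further down; with two keys, rows pair up only when both `player` and `title` agree.

```python
import pandas as pd

df1 = pd.DataFrame({"player": ["harshit", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "title": ["game1", "game2", "game3"]})
df2 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]})

# No (player, title) pair occurs in both frames, so the inner merge is empty.
inner_two_keys = df1.merge(df2, on=["player", "title"], how="inner")

# The outer merge keeps every row from both sides: 3 + 3 unmatched rows.
outer_two_keys = df1.merge(df2, on=["player", "title"], how="outer")

print(len(inner_two_keys), len(outer_two_keys))
```

Because both key columns are shared, no `_x`/`_y` suffixes are needed in the result.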

### Join

Join combines DataFrames based on their indexes rather than columns. It's especially useful when working with indexed data.

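Key differences from merge:
- Aligns on the DataFrame index by default; the `on` parameter can instead match a column of the calling DataFrame against the other DataFrame's index
- Defaults to a left join (`how="left"`), whereas merge defaults to inner
- More convenient when the DataFrames already have meaningful indexes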

In our examples:
- Inner join keeps only rows with matching indexes
- Left join keeps all rows from the left DataFrame
- Right join keeps all rows from the right DataFrame
- Outer join keeps all rows from both DataFrames
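
Two details of `join` are easy to miss: it performs a left join when `how` is omitted, and it refuses to combine frames with overlapping column names unless suffixes are given. The sketch below uses small illustrative frames, and the suffix strings `_left` and `_right` are arbitrary choices made for this example.

```python
import pandas as pd

df3 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"]},
                   index=["L1", "L2", "L3"])
df4 = pd.DataFrame({"Scores": [3, 4, 1]}, index=["L3", "L4", "L5"])

# No how= argument: every index label of df3 is kept (left join).
default_join = df3.join(df4)

# Joining frames that share column names fails without suffixes.
overlap_error = None
try:
    df3.join(df3)
except ValueError as err:
    overlap_error = err

# Supplying suffixes resolves the name clash.
suffixed = df3.join(df3, lsuffix="_left", rsuffix="_right")
print(list(suffixed.columns))
```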

### Concatenate

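Concatenate stacks DataFrames either vertically (`axis=0`, the default) or horizontally (`axis=1`). It does not match rows on key values; it only aligns labels on the other axis (column names when stacking vertically) and fills the gaps with NaN.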

Features:
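- Simpler than merge or join: no key values are compared
- With the default `axis=0`, works like appending one DataFrame below another
- Preserves the original index labels, so duplicates can appear; pass `ignore_index=True` to renumber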
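
The features above can be sketched with two tiny frames. The names `top` and `bottom` and their contents are made up for illustration; the point is how the index behaves under the default call, with `ignore_index=True`, and with `axis=1`.

```python
import pandas as pd

top = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
bottom = pd.DataFrame({"a": [3, 4]}, index=["y", "z"])

# Default: rows of bottom go below top, original labels kept ("y" appears twice).
stacked = pd.concat([top, bottom])

# ignore_index=True discards the labels and renumbers from 0.
renumbered = pd.concat([top, bottom], ignore_index=True)

# axis=1 places the frames side by side, aligned on the index labels.
side_by_side = pd.concat([top, bottom.rename(columns={"a": "b"})], axis=1)

print(list(stacked.index))
print(list(renumbered.index))
print(side_by_side)
```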

When to use each:
- Merge: When you need to combine based on column values
- Join: When you need to combine based on indexes
- Concatenate: When you simply need to stack DataFrames without matching values

---

In [2]:
import pandas as pd   # importing pandas

In [3]:
player = ['harshit', 'shivam', 'satyam']        # one list per column
scores = [3, 4, 1]
title = ["game1", "game2", "game3"]

df1 = pd.DataFrame({"player": player, "Scores": scores, "title": title})   # build the first DataFrame
df1

|   | player  | Scores | title |
|---|---------|--------|-------|
| 0 | harshit | 3      | game1 |
| 1 | shivam  | 4      | game2 |
| 2 | satyam  | 1      | game3 |


In [4]:
player = ['amit', 'harshit', 'satyam']          # one list per column
power = ['running', 'running', 'badminton']
title = ["game3", "game4", "game1"]

df2 = pd.DataFrame({"player": player, "power": power, "title": title})   # build the second DataFrame
df2

|   | player  | power     | title |
|---|---------|-----------|-------|
| 0 | amit    | running   | game3 |
| 1 | harshit | running   | game4 |
| 2 | satyam  | badminton | game1 |


---
## Merge in Pandas
---

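- Inner merge:

In [None]:
df1.merge(df2, on="title", how="inner")   # keep only titles present in both DataFrames

|   | player_x | Scores | title | player_y | power     |
|---|----------|--------|-------|----------|-----------|
| 0 | harshit  | 3      | game1 | satyam   | badminton |
| 1 | satyam   | 1      | game3 | amit     | running   |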
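
The `player_x` and `player_y` columns in the output above come from the default suffixes that merge appends when both DataFrames have a non-key column with the same name. A short sketch of overriding them with the `suffixes` parameter follows; the suffix strings `_scores` and `_power` are arbitrary names chosen for this example.

```python
import pandas as pd

df1 = pd.DataFrame({"player": ["harshit", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "title": ["game1", "game2", "game3"]})
df2 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]})

# Replace the default ("_x", "_y") with suffixes that say where each column came from.
named = df1.merge(df2, on="title", how="inner", suffixes=("_scores", "_power"))
print(list(named.columns))
```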


- Left merge:

In [None]:
df1.merge(df2, on="player", how="left")   # keep every player from df1

|   | player  | Scores | title_x | power     | title_y |
|---|---------|--------|---------|-----------|---------|
| 0 | harshit | 3      | game1   | running   | game4   |
| 1 | shivam  | 4      | game2   | NaN       | NaN     |
| 2 | satyam  | 1      | game3   | badminton | game1   |


- Right merge:

In [None]:
df1.merge(df2, on="player", how="right")   # keep every player from df2

|   | player  | Scores | title_x | power     | title_y |
|---|---------|--------|---------|-----------|---------|
| 0 | amit    | NaN    | NaN     | running   | game3   |
| 1 | harshit | 3.0    | game1   | running   | game4   |
| 2 | satyam  | 1.0    | game3   | badminton | game1   |
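
`Scores` is shown as `3.0` and `1.0` above because the unmatched row introduces NaN, which forces the integer column to float. One possible way to get whole numbers back, sketched here as an option rather than a required step, is to convert the column to pandas' nullable `Int64` dtype after the merge.

```python
import pandas as pd

df1 = pd.DataFrame({"player": ["harshit", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "title": ["game1", "game2", "game3"]})
df2 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]})

right = df1.merge(df2, on="player", how="right")
dtype_after_merge = right["Scores"].dtype   # float, because of the missing value

# Nullable integers keep 3 and 1 as integers and show the gap as <NA>.
right["Scores"] = right["Scores"].astype("Int64")
print(dtype_after_merge, right["Scores"].dtype)
```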


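- Outer merge:

In [None]:
df1.merge(df2, on="player", how="outer")   # keep every player from both DataFrames

|   | player  | Scores | title_x | power     | title_y |
|---|---------|--------|---------|-----------|---------|
| 0 | amit    | NaN    | NaN     | running   | game3   |
| 1 | harshit | 3.0    | game1   | running   | game4   |
| 2 | satyam  | 1.0    | game3   | badminton | game1   |
| 3 | shivam  | 4.0    | game2   | NaN       | NaN     |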
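
With an outer merge it is often useful to know which side each row came from. A minimal sketch using merge's `indicator=True` option, which adds a `_merge` column holding `left_only`, `right_only`, or `both`:

```python
import pandas as pd

df1 = pd.DataFrame({"player": ["harshit", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "title": ["game1", "game2", "game3"]})
df2 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]})

tracked = df1.merge(df2, on="player", how="outer", indicator=True)

# Map each player to the origin of its row.
origin = dict(zip(tracked["player"], tracked["_merge"].astype(str)))
print(origin)
```

Filtering on `_merge` (for example, keeping only `left_only` rows) gives a quick way to list the players that have no counterpart in the other DataFrame.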


In [9]:
player = ['amit', 'harshit', 'satyam']          # one list per column
power = ['running', 'running', 'badminton']
title = ["game3", "game4", "game1"]

df3 = pd.DataFrame({"player": player, "power": power, "title": title}, index=["L1", "L2", "L3"])   # DataFrame with a custom index
df3

|    | player  | power     | title |
|----|---------|-----------|-------|
| L1 | amit    | running   | game3 |
| L2 | harshit | running   | game4 |
| L3 | satyam  | badminton | game1 |


In [10]:
players = ['gurkirat', 'shivam', 'satyam']      # one list per column
scores = [3, 4, 1]
title = ["game6", "game2", "game5"]
df4 = pd.DataFrame({"players": players, "Scores": scores, "titles": title}, index=["L3", "L4", "L5"])   # DataFrame with a custom index
df4

|    | players  | Scores | titles |
|----|----------|--------|--------|
| L3 | gurkirat | 3      | game6  |
| L4 | shivam   | 4      | game2  |
| L5 | satyam   | 1      | game5  |


---
## Join in Pandas
---

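- Inner join:

In [None]:
df3.join(df4, how="inner")   # keep only index labels present in both DataFrames

|    | player | power     | title | players  | Scores | titles |
|----|--------|-----------|-------|----------|--------|--------|
| L3 | satyam | badminton | game1 | gurkirat | 3      | game6  |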


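- Left join:

In [None]:
df3.join(df4, how="left")   # keep every index label from df3

|    | player  | power     | title | players  | Scores | titles |
|----|---------|-----------|-------|----------|--------|--------|
| L1 | amit    | running   | game3 | NaN      | NaN    | NaN    |
| L2 | harshit | running   | game4 | NaN      | NaN    | NaN    |
| L3 | satyam  | badminton | game1 | gurkirat | 3.0    | game6  |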


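- Right join:

In [None]:
df3.join(df4, how="right")   # keep every index label from df4

|    | player | power     | title | players  | Scores | titles |
|----|--------|-----------|-------|----------|--------|--------|
| L3 | satyam | badminton | game1 | gurkirat | 3      | game6  |
| L4 | NaN    | NaN       | NaN   | shivam   | 4      | game2  |
| L5 | NaN    | NaN       | NaN   | satyam   | 1      | game5  |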


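- Outer join:

In [None]:
df3.join(df4, how="outer")   # keep every index label from both DataFrames

|    | player  | power     | title | players  | Scores | titles |
|----|---------|-----------|-------|----------|--------|--------|
| L1 | amit    | running   | game3 | NaN      | NaN    | NaN    |
| L2 | harshit | running   | game4 | NaN      | NaN    | NaN    |
| L3 | satyam  | badminton | game1 | gurkirat | 3.0    | game6  |
| L4 | NaN     | NaN       | NaN   | shivam   | 4.0    | game2  |
| L5 | NaN     | NaN       | NaN   | satyam   | 1.0    | game5  |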
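
An index-based join can also be written as a merge by telling merge to use the index on both sides. The sketch below, using the same `df3` and `df4` as above, checks that the two spellings give the same outer result; it illustrates the relationship between the two methods and is not a claim that one is preferable.

```python
import pandas as pd

df3 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]},
                   index=["L1", "L2", "L3"])
df4 = pd.DataFrame({"players": ["gurkirat", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "titles": ["game6", "game2", "game5"]},
                   index=["L3", "L4", "L5"])

joined = df3.join(df4, how="outer")

# left_index/right_index make merge match on index labels instead of columns.
merged = df3.merge(df4, left_index=True, right_index=True, how="outer")

same_result = joined.equals(merged)
print(same_result)
```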


---
## Concatenate in Pandas
---


In [None]:
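pd.concat([df3, df4])   # stack df4 below df3; columns are aligned by name

|    | player  | power     | title | players  | Scores | titles |
|----|---------|-----------|-------|----------|--------|--------|
| L1 | amit    | running   | game3 | NaN      | NaN    | NaN    |
| L2 | harshit | running   | game4 | NaN      | NaN    | NaN    |
| L3 | satyam  | badminton | game1 | NaN      | NaN    | NaN    |
| L3 | NaN     | NaN       | NaN   | gurkirat | 3.0    | game6  |
| L4 | NaN     | NaN       | NaN   | shivam   | 4.0    | game2  |
| L5 | NaN     | NaN       | NaN   | satyam   | 1.0    | game5  |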
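
The label `L3` appears twice in the result above because concat keeps the original indexes. Two options that help with this are sketched below: `keys` tags each row with its source, and `verify_integrity=True` raises an error instead of silently producing duplicates. The key labels `"df3"` and `"df4"` are names chosen for this example.

```python
import pandas as pd

df3 = pd.DataFrame({"player": ["amit", "harshit", "satyam"],
                    "power": ["running", "running", "badminton"],
                    "title": ["game3", "game4", "game1"]},
                   index=["L1", "L2", "L3"])
df4 = pd.DataFrame({"players": ["gurkirat", "shivam", "satyam"],
                    "Scores": [3, 4, 1],
                    "titles": ["game6", "game2", "game5"]},
                   index=["L3", "L4", "L5"])

# keys adds an outer index level naming the source of each row.
labelled = pd.concat([df3, df4], keys=["df3", "df4"])
rows_from_df4 = labelled.loc["df4"]

# verify_integrity refuses to build a result with duplicate index labels.
duplicate_error = None
try:
    pd.concat([df3, df4], verify_integrity=True)
except ValueError as err:
    duplicate_error = err

print(labelled.index.nlevels, list(rows_from_df4.index))
print(duplicate_error)
```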


---