# Merging DataFrames

In [30]:
import pandas as pd

#### Concatenate DataFrames
Concatenation is used to stack DataFrames either vertically (adding rows) or horizontally (adding columns).

In [33]:
# This concatenates df1 and df2 vertically (row-wise).
# Create sample DataFrames
df1 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]})
df2 = pd.DataFrame({"A": ["A3", "A4", "A5"], "B": ["B3", "B4", "B5"]})

# Concatenate vertically
# To reset the index after concatenating the DataFrames, you can use the reset_index method with the drop=True parameter. 
# This will create a new sequential index.
pd.concat([df1, df2], axis=0).reset_index(drop=True)

,A,B
0,A0,B0
1,A1,B1
2,A2,B2
3,A3,B3
4,A4,B4
5,A5,B5
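
The cell above only shows the vertical case. As a minimal sketch of the horizontal case mentioned in the description, the example below passes `axis=1`; the frame names `left` and `right` and their contents are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Two small frames that share the same default index (0, 1, 2)
left = pd.DataFrame({"A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"B": ["B0", "B1", "B2"]})

# axis=1 places the frames side by side, aligning rows on the index
wide = pd.concat([left, right], axis=1)
print(wide)
```

Because rows are aligned on the index, frames with different index labels would produce NaN wherever a label is missing from one side.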


#### Merge DataFrames

Merging is used to combine DataFrames based on a common key.

Inner:

In [35]:
# This merges df1 and df2 on the key column using an inner join, which keeps only the rows with matching keys.
# Create sample DataFrames
df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# Merge on 'key' column
pd.merge(df1, df2, on="key", how="inner")

,key,A,B
0,K0,A0,B0
1,K1,A1,B1


Outer:

In [37]:
# An outer join combines DataFrames based on a key column, including all rows from both DataFrames. 
# If a key is not present in one of the DataFrames, the resulting DataFrame will have NaN for the missing values.
# Create sample DataFrames
df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# Outer join on 'key' column
pd.merge(df1, df2, on="key", how="outer")

,key,A,B
0,K0,A0,B0
1,K1,A1,B1
2,K2,A2,NaN
3,K3,NaN,B3
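
When reading an outer join, it can help to see which side each row came from. The sketch below uses the `indicator=True` option of `pd.merge`, which adds a `_merge` column; it reuses the same sample data and is meant as an illustration rather than a required step.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# indicator=True adds a "_merge" column whose values are
# "both", "left_only", or "right_only"
merged = pd.merge(df1, df2, on="key", how="outer", indicator=True)
print(merged)

# Rows whose key was found in only one of the two DataFrames
unmatched = merged[merged["_merge"] != "both"]
print(unmatched)
```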


Left:

A left join keeps all rows from the left DataFrame. Columns that come from the right DataFrame are NaN wherever the key has no match.

In [38]:
# Create sample DataFrames
df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# Left join on 'key' column
pd.merge(df1, df2, on="key", how="left")

,key,A,B
0,K0,A0,B0
1,K1,A1,B1
2,K2,A2,NaN


Right:

A right join keeps all rows from the right DataFrame. Columns that come from the left DataFrame are NaN wherever the key has no match.

In [39]:
# Create sample DataFrames
df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# Right join on 'key' column
pd.merge(df1, df2, on="key", how="right")

,key,A,B
0,K0,A0,B0
1,K1,A1,B1
2,K3,NaN,B3
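
All of the merge examples above assume the key column has the same name in both DataFrames. As a hedged sketch of the case where the names differ, the example below uses the `left_on` and `right_on` parameters of `pd.merge`; the column name `id` and the frame names `left` and `right` are hypothetical and chosen only for illustration.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
# Hypothetical: the right frame stores the same keys under a column named "id"
right = pd.DataFrame({"id": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# left_on / right_on name the key column on each side
result = pd.merge(left, right, left_on="key", right_on="id", how="inner")
print(result)
```

Both key columns are kept in the result, so one of them can be dropped afterwards if it is redundant.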


#### Join DataFrames

Joining is used to combine DataFrames based on their index.

Both merge and join support inner, outer, left, and right joins. merge is the more flexible of the two: it can match on columns, on indexes, or on a mix of both, which makes it suitable for more complex operations. join is a convenience method for simpler, index-based joins, and it performs a left join by default.

In [36]:
# This joins df1 and df2 based on their index using an inner join, which keeps only the rows with matching indices.
# Create sample DataFrames
df1 = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=["K0", "K1", "K2"])
df2 = pd.DataFrame({"B": ["B0", "B1", "B2"]}, index=["K0", "K2", "K3"])

# Join DataFrames
df1.join(df2, how="inner")

,A,B
K0,A0,B0
K2,A2,B1
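
To make the relationship between join and merge concrete, the sketch below repeats the index-based inner join with `pd.merge`, using `left_index=True` and `right_index=True`, and also shows what `join` does when `how` is omitted. The variable names `joined`, `merged`, and `default_join` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=["K0", "K1", "K2"])
df2 = pd.DataFrame({"B": ["B0", "B1", "B2"]}, index=["K0", "K2", "K3"])

# Index-based inner join written two ways
joined = df1.join(df2, how="inner")
merged = pd.merge(df1, df2, left_index=True, right_index=True, how="inner")
print(joined)
print(merged)

# Without `how`, join performs a left join: every row of df1 is kept
default_join = df1.join(df2)
print(default_join)
```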
