## Pandas merge() - Merging Two DataFrame Objects
##### https://www.digitalocean.com/community/tutorials/pandas-merge-two-dataframe

### 1. Default merging - inner join

In [4]:
import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'Lisa'], 'Country': ['India', 'India', 'USA'], 'Role': ['CEO', 'CTO', 'CTO']}
df1 = pd.DataFrame(d1)                                       # left DataFrame (merge() is called on it)
print('DataFrame 1:\n', df1, '\n Length:', len(df1))
print()

d2 = {'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']}
df2 = pd.DataFrame(d2)                                       # right DataFrame (passed to merge())
print('DataFrame 2:\n', df2, '\n Length:', len(df2))
print()

df_merged = df1.merge(df2)
print('Result:\n', df_merged, '\n Length:', len(df_merged))
print()

DataFrame 1:
      Name Country Role
0  Pankaj   India  CEO
1  Meghna   India  CTO
2    Lisa     USA  CTO 
 Length: 3

DataFrame 2:
    ID    Name
0   1  Pankaj
1   2  Anupam
2   3    Amit 
 Length: 3

Result:
      Name Country Role  ID
0  Pankaj   India  CEO   1 
 Length: 1
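
When `on` is omitted, `merge()` performs an inner join on the column labels that both frames share, which is why only the `Pankaj` row survives above. The sketch below spells that default out explicitly; `common_cols`, `implicit`, and `explicit` are illustrative names, not part of the original notebook.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'Lisa'],
                    'Country': ['India', 'India', 'USA'],
                    'Role': ['CEO', 'CTO', 'CTO']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']})

# Column labels present in both frames; merge() uses these as join keys
# when neither `on` nor `left_on`/`right_on` is given.
common_cols = [c for c in df1.columns if c in df2.columns]

implicit = df1.merge(df2)                               # defaults: how='inner'
explicit = df1.merge(df2, how='inner', on=common_cols)  # same join, spelled out

print('Join keys:', common_cols)
print('Same result:', implicit.equals(explicit))
```

Passing `on` explicitly is usually the safer habit, since the implicit key set changes silently if a shared column is later added to either frame.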



### 2. Merging DataFrames with Left, Right, and Outer Join

In [5]:
print('Result Left Join:\n',  df1.merge(df2, how='left'))
print()
print('Result Right Join:\n', df1.merge(df2, how='right'))
print()
print('Result Outer Join:\n', df1.merge(df2, how='outer'))
print()

Result Left Join:
      Name Country Role   ID
0  Pankaj   India  CEO  1.0
1  Meghna   India  CTO  NaN
2    Lisa     USA  CTO  NaN

Result Right Join:
      Name Country Role  ID
0  Pankaj   India  CEO   1
1  Anupam     NaN  NaN   2
2    Amit     NaN  NaN   3

Result Outer Join:
      Name Country Role   ID
0  Pankaj   India  CEO  1.0
1  Meghna   India  CTO  NaN
2    Lisa     USA  CTO  NaN
3  Anupam     NaN  NaN  2.0
4    Amit     NaN  NaN  3.0
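
The `1.0`/`NaN` values in `ID` above appear because unmatched rows are filled with `NaN`, which forces the integer column to float. As a rough sketch, the `indicator=True` option of `merge()` adds a `_merge` column recording whether each row came from the left frame, the right frame, or both. The variable names here are illustrative, and the example counts origins rather than relying on row order, which may vary with the pandas version.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'Lisa'],
                    'Country': ['India', 'India', 'USA'],
                    'Role': ['CEO', 'CTO', 'CTO']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']})

# indicator=True appends a categorical `_merge` column with the values
# 'left_only', 'right_only', or 'both'.
outer = df1.merge(df2, how='outer', indicator=True)

# Count how many rows came from each side.
origin_counts = outer['_merge'].astype(str).value_counts().to_dict()

print(outer)
print(origin_counts)
print('ID dtype after outer join:', outer['ID'].dtype)
```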



### 3. Merging DataFrame on Specific Columns

In [3]:
d1 = {'Name': ['Pankaj', 'Meghna', 'Lisa'], 'ID': [1, 2, 3], 'Country': ['India', 'India', 'USA'],
      'Role': ['CEO', 'CTO', 'CTO']}
df1 = pd.DataFrame(d1) # the dataframe that contains the information we want to look up

d2 = {'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']}
df2 = pd.DataFrame(d2) # the right dataframe, passed to merge()

print('Dataframe 1:\n', df1)
print()
print('Dataframe 2:\n', df2)
print()

print('Merge on "ID":\n'  , df1.merge(df2, on='ID'))
print()
print('Merge on "Name":\n', df1.merge(df2, on='Name'))
print()

Dataframe 1:
      Name  ID Country Role
0  Pankaj   1   India  CEO
1  Meghna   2   India  CTO
2    Lisa   3     USA  CTO

Dataframe 2:
    ID    Name
0   1  Pankaj
1   2  Anupam
2   3    Amit

Merge on "ID":
    Name_x  ID Country Role  Name_y
0  Pankaj   1   India  CEO  Pankaj
1  Meghna   2   India  CTO  Anupam
2    Lisa   3     USA  CTO    Amit

Merge on "Name":
      Name  ID_x Country Role  ID_y
0  Pankaj     1   India  CEO     1
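
The `_x` and `_y` endings above are the default suffixes that `merge()` appends to non-key columns sharing a label in both frames. A minimal sketch of overriding them with the `suffixes` parameter follows; the suffix strings `_left` and `_right` are arbitrary choices made for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'Lisa'], 'ID': [1, 2, 3],
                    'Country': ['India', 'India', 'USA'],
                    'Role': ['CEO', 'CTO', 'CTO']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']})

# `Name` exists in both frames but is not the join key, so it appears twice.
# suffixes=(left, right) replaces the default ('_x', '_y').
merged = df1.merge(df2, on='ID', suffixes=('_left', '_right'))

print(merged)
print(list(merged.columns))
```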



### 4. Specify Left and Right Columns for Merging DataFrame Objects

In [4]:
d1  = {'Name': ['Pankaj', 'Meghna', 'Lisa'], 'ID1': [1, 2, 3], 'Country': ['India', 'India', 'USA'],
      'Role': ['CEO', 'CTO', 'CTO']}
df1 = pd.DataFrame(d1)
d2  = {'ID2': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']}
df2 = pd.DataFrame(d2)

print('Dataframe 1:\n', df1)
print()
print('Dataframe 2:\n', df2)
print()

print('Default merge on df2 (inner join):\n', df1.merge(df2))
print()
print('Merge left on "ID1" and merge right on "ID2":\n', df1.merge(df2, left_on='ID1', right_on='ID2'))
print()

Dataframe 1:
      Name  ID1 Country Role
0  Pankaj    1   India  CEO
1  Meghna    2   India  CTO
2    Lisa    3     USA  CTO

Dataframe 2:
    ID2    Name
0    1  Pankaj
1    2  Anupam
2    3    Amit

Default merge on df2 (inner join):
      Name  ID1 Country Role  ID2
0  Pankaj    1   India  CEO    1

Merge left on "ID1" and merge right on "ID2":
    Name_x  ID1 Country Role  ID2  Name_y
0  Pankaj    1   India  CEO    1  Pankaj
1  Meghna    2   India  CTO    2  Anupam
2    Lisa    3     USA  CTO    3    Amit
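
With `left_on`/`right_on`, both key columns (`ID1` and `ID2`) are kept in the result even though they hold the same values. One possible cleanup, sketched here as an assumption about what a reader might want rather than as part of the original notebook, is to drop the redundant right-hand key afterwards.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'Lisa'], 'ID1': [1, 2, 3],
                    'Country': ['India', 'India', 'USA'],
                    'Role': ['CEO', 'CTO', 'CTO']})
df2 = pd.DataFrame({'ID2': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']})

# Join differently named key columns, then remove the duplicate key.
merged = df1.merge(df2, left_on='ID1', right_on='ID2')
tidy = merged.drop(columns='ID2')

print(tidy)
```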



### 5. Using Index as the Join Keys for Merging DataFrames

In [5]:
d1  = {'Name': ['Pankaj', 'Meghna', 'Lisa'], 'Country': ['India', 'India', 'USA'], 'Role': ['CEO', 'CTO', 'CTO']}
df1 = pd.DataFrame(d1)

d2  = {'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']}
df2 = pd.DataFrame(d2)

print('Dataframe 1:\n', df1)
print()
print('Dataframe 2:\n', df2)
print()

df_merged = df1.merge(df2)
print('Default merge (inner join) called "df_merged":\n', df_merged)
print()
df_merged = df1.merge(df2, left_index=True, right_index=True)
print('\nIndex merge:\n', df_merged)

Dataframe 1:
      Name Country Role
0  Pankaj   India  CEO
1  Meghna   India  CTO
2    Lisa     USA  CTO

Dataframe 2:
    ID    Name
0   1  Pankaj
1   2  Anupam
2   3    Amit

Default merge (inner join) called "df_merged":
      Name Country Role  ID
0  Pankaj   India  CEO   1


Index merge:
    Name_x Country Role  ID  Name_y
0  Pankaj   India  CEO   1  Pankaj
1  Meghna   India  CTO   2  Anupam
2    Lisa     USA  CTO   3    Amit
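
Note that the index merge above pairs rows purely by index label (0, 1, 2), so `Meghna` is matched with `Anupam` even though the names differ. If the intent is to match on a meaningful key held in one frame's index, a column on one side can be joined against the index on the other. The sketch below assumes `Name` is that key; `df2_by_name` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'Lisa'],
                    'Country': ['India', 'India', 'USA'],
                    'Role': ['CEO', 'CTO', 'CTO']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Pankaj', 'Anupam', 'Amit']})

# Move `Name` into the index of the right frame.
df2_by_name = df2.set_index('Name')

# Match the left frame's `Name` column against the right frame's index.
merged = df1.merge(df2_by_name, left_on='Name', right_index=True)

print(merged)
```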
