## Join Operations in DataFrames
pandas provides several facilities for combining Series and DataFrame objects, with various kinds of set logic for the indexes and relational algebra functionality for join / merge-type operations. In addition, pandas provides utilities to compare two Series or DataFrame objects and summarize their differences.

https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
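
The comparison utilities mentioned above can be illustrated with a minimal sketch. It assumes pandas 1.1 or later, where ```DataFrame.compare()``` is available; the frames ```old``` and ```new``` are made-up data for illustration only.

```python
import pandas as pd

# Two hypothetical frames with identical shape and labels (required by compare()).
old = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
new = pd.DataFrame({"a": [1, 5, 3], "b": ["x", "y", "w"]})

# compare() keeps only the rows and columns that differ.
# "self" holds the value from `old`, "other" the value from `new`.
diff = old.compare(new)
print(diff)
```

Rows with no differences are dropped from the result, so only the changed cells need to be inspected.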

## merge function
Merging is the process of bringing two datasets together into one, aligning the rows from each based on common attributes or columns. The words merge and join are used relatively interchangeably in pandas. The ```merge()``` function joins two DataFrames, and its ```how``` argument selects the type of join:

1. **left**: use only keys from left frame, similar to a SQL left outer join; preserve key order.
2. **right**: use only keys from right frame, similar to a SQL right outer join; preserve key order.
3. **outer**: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
4. **inner**: use intersection of keys from both frames, similar to a SQL inner join; preserve the order of the left keys.
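
One way to see the four join types side by side is the ```indicator=True``` option of ```merge()```, which adds a ```_merge``` column recording where each row came from. This is a small sketch; the frames ```left``` and ```right``` are hypothetical and not part of the examples that follow.

```python
import pandas as pd

# Hypothetical frames: K0 and K1 are shared, K3 is left-only, K4 is right-only.
left = pd.DataFrame({"key": ["K0", "K1", "K3"], "A": ["A0", "A1", "A3"]})
right = pd.DataFrame({"key": ["K0", "K1", "K4"], "C": ["C0", "C1", "C4"]})

row_counts = {}
for how in ["left", "right", "outer", "inner"]:
    # indicator=True adds a "_merge" column: left_only, right_only, or both.
    merged = pd.merge(left, right, on="key", how=how, indicator=True)
    row_counts[how] = len(merged)
    print(how, merged["_merge"].tolist())
```

The inner join keeps only the shared keys, while the outer join keeps every key from both frames.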

![image-2.png](attachment:image-2.png)


In [13]:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({"key": ["K0", "K1", "K2", "K3"],"A": ["A0", "A1", "A2", "A3"],"B": ["B0", "B1", "B2", "B3"]})
df2 = pd.DataFrame({'key':['K0','K1','K2','K4'],'C':['C0','C1','C2','C3'],'D':['D0','D1','D2','D3']})
df3 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df4 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})

In [14]:
# inner join
pd.merge(df1,df2, on='key',how='inner')

Unnamed: 0,key,A,B,C,D
0,K0,A0,B0,C0,D0
1,K1,A1,B1,C1,D1
2,K2,A2,B2,C2,D2


In [15]:
# inner join using the DataFrame.merge() method form
df3.merge(df4, how='inner', on='a')

Unnamed: 0,a,b,c
0,foo,1,3


### left join

In [16]:
pd.merge(df1,df2, on='key',how='left')

Unnamed: 0,key,A,B,C,D
0,K0,A0,B0,C0,D0
1,K1,A1,B1,C1,D1
2,K2,A2,B2,C2,D2
3,K3,A3,B3,,


In [18]:
# right join
df3.merge(df4, how='right', on='a')

Unnamed: 0,a,b,c
0,foo,1.0,3
1,baz,,4


In [8]:
# creating 1st data frame
df1 = pd.DataFrame({'pincode':[249201,175131,263136,246171,171004], 
                   'city_name':['Rishikesh','Manali','Bhimtal', 'Rudraprayag','Shimla'],
                   'distance_from_delhi':[230,543,321, 370,343]})
# creating 2nd dataframe
df2 = pd.DataFrame({'pincode':[249201,175131,263001],
                   'district':['Dehradun','Kullu','Nainital'],
                   'State':['Uttarakhand','Himachal Pradesh','Uttarakhand']})

In [9]:
# left join
pd.merge(df1, df2, how= 'left', on='pincode')

Unnamed: 0,pincode,city_name,distance_from_delhi,district,State
0,249201,Rishikesh,230,Dehradun,Uttarakhand
1,175131,Manali,543,Kullu,Himachal Pradesh
2,263136,Bhimtal,321,,
3,246171,Rudraprayag,370,,
4,171004,Shimla,343,,


In [10]:
# right join
pd.merge(df1, df2, how = 'right', on = 'pincode')

Unnamed: 0,pincode,city_name,distance_from_delhi,district,State
0,249201,Rishikesh,230.0,Dehradun,Uttarakhand
1,175131,Manali,543.0,Kullu,Himachal Pradesh
2,263001,,,Nainital,Uttarakhand


In [11]:
# inner join
pd.merge(df1, df2, how = 'inner', on = 'pincode')

Unnamed: 0,pincode,city_name,distance_from_delhi,district,State
0,249201,Rishikesh,230,Dehradun,Uttarakhand
1,175131,Manali,543,Kullu,Himachal Pradesh


In [12]:
# outer join
pd.merge(df1, df2, how = 'outer', on = 'pincode')

Unnamed: 0,pincode,city_name,distance_from_delhi,district,State
0,249201,Rishikesh,230.0,Dehradun,Uttarakhand
1,175131,Manali,543.0,Kullu,Himachal Pradesh
2,263136,Bhimtal,321.0,,
3,246171,Rudraprayag,370.0,,
4,171004,Shimla,343.0,,
5,263001,,,Nainital,Uttarakhand
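
In the right and outer join outputs above, ```distance_from_delhi``` appears as ```230.0``` rather than ```230```: unmatched rows receive NaN, and an ordinary integer column cannot hold NaN, so pandas converts it to float. If integer values should be preserved, one option is the nullable ```Int64``` dtype. The sketch below uses a trimmed-down copy of the data, and the names ```cities``` and ```districts``` are illustrative.

```python
import pandas as pd

cities = pd.DataFrame({"pincode": [249201, 175131, 263136],
                       "distance_from_delhi": [230, 543, 321]})
districts = pd.DataFrame({"pincode": [249201, 175131, 263001],
                          "district": ["Dehradun", "Kullu", "Nainital"]})

# Default behaviour: the unmatched row gets NaN, so the column becomes float.
default_merge = pd.merge(cities, districts, how="right", on="pincode")

# Assumption: casting to the nullable Int64 dtype before merging keeps integers,
# with the missing entry stored as <NA>.
nullable_cities = cities.astype({"distance_from_delhi": "Int64"})
nullable_merge = pd.merge(nullable_cities, districts, how="right", on="pincode")

print(default_merge["distance_from_delhi"].dtype)
print(nullable_merge["distance_from_delhi"].dtype)
```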


## Binary Operations in Data Frames
A binary operation, such as addition or subtraction, takes two operands and returns a new value. ```pandas.DataFrame``` supports the usual arithmetic and comparison operators and also defines equivalent methods (```add()```, ```sub()```, ```mul()```, ```div()``` and so on) for combining two DataFrames. Each operation returns a new DataFrame and leaves the two operands unchanged.
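
One reason to prefer the method form over the operator is the ```fill_value``` argument, which matters when the two frames do not share the same row or column labels. The following is a minimal sketch with made-up frames ```a``` and ```b``` whose indexes only partly overlap.

```python
import pandas as pd

# Hypothetical frames: row label 1 is shared, 0 and 2 appear in only one frame.
a = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=[0, 1])
b = pd.DataFrame({"x": [10, 20], "y": [30, 40]}, index=[1, 2])

# The operator aligns on labels; cells missing from either frame become NaN.
plain = a + b

# The method form can substitute a value for cells missing from one frame.
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```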

In [4]:
# importing libraries
import pandas as pd
import numpy as np
df1 = pd.DataFrame([[2,3,5],[1,4,2],[2,7,5]])
df2 = pd.DataFrame([[3,2,5],[4,8,4],[1,6,7]])

In [5]:
# adding two dataframes
df1 + df2

Unnamed: 0,0,1,2
0,5,5,10
1,5,12,6
2,3,13,12


In [6]:
# subtracting two dataframes
df1 - df2

Unnamed: 0,0,1,2
0,-1,1,0
1,-3,-4,-2
2,1,1,-2


In [7]:
# multiplying two dataframes elementwise
df1 * df2

Unnamed: 0,0,1,2
0,6,6,25
1,4,32,8
2,2,42,35


In [8]:
# matrix multiplication
df1.dot(df2)

Unnamed: 0,0,1,2
0,23,58,57
1,21,46,35
2,39,90,73


In [9]:
# elementwise division
df1.div(df2)

Unnamed: 0,0,1,2
0,0.666667,1.5,1.0
1,0.25,0.5,0.5
2,2.0,1.166667,0.714286


In [10]:
# integer or floor division
# returns the quotient
df1.floordiv(df2)

Unnamed: 0,0,1,2
0,0,1,1
1,0,0,0
2,2,1,0


In [11]:
# modulo operation
# returns the remainder
df1.mod(df2)

Unnamed: 0,0,1,2
0,2,1,0
1,1,4,2
2,0,1,5


In [12]:
# elementwise exponentiation with pow()
df1.pow(df2)

Unnamed: 0,0,1,2
0,8,9,3125
1,1,65536,16
2,2,117649,78125


In [13]:
# elementwise less-than comparison with lt()
df1.lt(df2)

Unnamed: 0,0,1,2
0,True,False,False
1,True,True,True
2,False,False,True


## Binary Operations on series
We can perform binary operations on Series, such as addition, subtraction and multiplication. Each operation can be written either with an operator (```+```, ```-```, ```*```) or with the equivalent method (```.add()```, ```.sub()```, ```.mul()```); both forms give the same result.
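
Binary operations on Series match elements by index label, not by position. The sketch below uses two hypothetical Series, ```s_a``` and ```s_b```, that hold the same labels in a different order, and contrasts label alignment with a purely positional sum obtained by converting one operand to a NumPy array.

```python
import pandas as pd

# Same labels, opposite order.
s_a = pd.Series([1, 2, 3], index=["a", "b", "c"])
s_b = pd.Series([10, 20, 30], index=["c", "b", "a"])

# Label-aligned: "a" pairs 1 with 30, "c" pairs 3 with 10.
by_label = s_a + s_b

# Positional: to_numpy() discards the labels of s_b, so values pair by position.
by_position = s_a + s_b.to_numpy()

print(by_label)
print(by_position)
```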

In [14]:
# importing pandas
import pandas as pd
# creating two pandas series
s1 = pd.Series([2,4,6,1,6,0])
s2 = pd.Series([5,1,7,4,2,3])

In [15]:
s1 + s2

0     7
1     5
2    13
3     5
4     8
5     3
dtype: int64

In [16]:
# adding two series using the add() function
s1.add(s2)

0     7
1     5
2    13
3     5
4     8
5     3
dtype: int64

In [17]:
s1 - s2

0   -3
1    3
2   -1
3   -3
4    4
5   -3
dtype: int64

In [18]:
# subtracting two series using the sub() function
s1.sub(s2)

0   -3
1    3
2   -1
3   -3
4    4
5   -3
dtype: int64

In [19]:
# multiplying two series using the mul() function
s1.mul(s2)

0    10
1     4
2    42
3     4
4    12
5     0
dtype: int64

In [20]:
s1 * s2

0    10
1     4
2    42
3     4
4    12
5     0
dtype: int64

## Concatenation
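The pandas ```concat()``` function joins a sequence of pandas objects along an axis: ```axis=0``` (the default) stacks them as rows, and ```axis=1``` places them side by side as columns. The original row labels are kept unless the index is reset.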
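
A few commonly used options of ```concat()``` are sketched below: ```ignore_index``` to renumber the rows, ```join``` to control how mismatched columns are handled, and ```keys``` to label which input each row came from. The frames ```first``` and ```second``` are made up for illustration.

```python
import pandas as pd

first = pd.DataFrame({"letter": ["a", "b"], "number": [1, 2]})
second = pd.DataFrame({"letter": ["c", "d"], "number": [3, 4],
                       "extra": [True, False]})

# Default: union of columns (join="outer"); original row labels are kept.
stacked = pd.concat([first, second])

# ignore_index=True renumbers the rows 0..n-1.
renumbered = pd.concat([first, second], ignore_index=True)

# join="inner" keeps only the columns present in every input.
shared = pd.concat([first, second], join="inner", ignore_index=True)

# keys adds an outer index level naming the source of each row.
labelled = pd.concat([first, second], keys=["first", "second"])

print(stacked)
print(labelled)
```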


In [19]:
# concatenating two dataframes
df1 = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
df2 = pd.DataFrame([['c', 3], ['d', 4]], columns=['letter', 'number'])
pd.concat([df1, df2])

Unnamed: 0,letter,number
0,a,1
1,b,2
0,c,3
1,d,4


In [20]:
# dataframes having the same columns
df1 = pd.DataFrame({'Name': ['Arijit', 'Neeraj', 'Sakshi', 'Muskan'],
                    'Age': [34, 30, 22, 33],
                    'Gender': ['M', 'M', 'F', 'M']})

df2 = pd.DataFrame({'Name': ['Kartik', 'Veer', 'Preeti'],
                    'Age': [31, 22, 19],
                    'Gender': ['M', 'M', 'F']})

In [21]:
# concatenating two dataframes
df = pd.concat([df1, df2], axis = 0, ignore_index = True)
df

Unnamed: 0,Name,Age,Gender
0,Arijit,34,M
1,Neeraj,30,M
2,Sakshi,22,F
3,Muskan,33,M
4,Kartik,31,M
5,Veer,22,M
6,Preeti,19,F


In [25]:
# creating 1st data frame
df1 = pd.DataFrame({'pincode':[249201,175131], 
                   'city_name':['Rishikesh','Manali'],
                   'distance_from_delhi':[230,543]})

# creating 2nd data frame
df2 = pd.DataFrame({'pincode':[263136,246171,171004], 
                   'city_name':['Bhimtal', 'Rudraprayag','Shimla'],
                   'distance_from_delhi':[321, 370,343]})

In [26]:
# concatenating two dataframes having same columns
df = pd.concat([df1, df2], axis = 0)
df

Unnamed: 0,pincode,city_name,distance_from_delhi
0,249201,Rishikesh,230
1,175131,Manali,543
0,263136,Bhimtal,321
1,246171,Rudraprayag,370
2,171004,Shimla,343


In [27]:
# creating dataframe
df3 = pd.DataFrame({'district':['Dehradun','Kullu'],
                   'State':['Uttarakhand','Himachal Pradesh']})
df3

Unnamed: 0,district,State
0,Dehradun,Uttarakhand
1,Kullu,Himachal Pradesh


In [28]:
# concatenating two dataframes with different columns side by side (axis = 1)
pd.concat([df1, df3], axis = 1)

Unnamed: 0,pincode,city_name,distance_from_delhi,district,State
0,249201,Rishikesh,230,Dehradun,Uttarakhand
1,175131,Manali,543,Kullu,Himachal Pradesh
