<img src="pandas.jpg"
     align="right"
     width="30%"
     alt="Python logo">
     
# Python | Pandas Merging, Joining, and Concatenating

- A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labelled axes (rows and columns).
- In other words, data in a DataFrame is aligned in a tabular fashion in rows and columns.
- DataFrames can be combined in several ways: `pd.concat()`, `df.merge()`, and `df.join()` concatenate, merge, and join DataFrames. Note that `concat()` is a top-level pandas function, not a DataFrame method.
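As a rough orientation, the three operations can be contrasted on two tiny frames. This is only a sketch: the frames `left` and `right` and their column names are hypothetical and are not part of the employee data used in the rest of this article.

```python
import pandas as pd

# Hypothetical toy data, used only for this comparison
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# concat: stack the frames along an axis (rows by default);
# columns missing from one input are filled with NaN
stacked = pd.concat([left, right])

# merge: database-style join on a column; how="inner" is the default
merged = left.merge(right, on="key")

# join: combine on the index; how="left" is the default
joined = left.set_index("key").join(right.set_index("key"))

print(stacked)
print(merged)
print(joined)
```

Here `stacked` has six rows, `merged` keeps only the keys present in both frames, and `joined` keeps every key of `left`.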


### Concatenating DataFrame
To concatenate DataFrames, we use the `pd.concat()` function. DataFrames can be concatenated in several ways:

- Concatenating DataFrame using .concat()
- Concatenating DataFrame by setting logic on axes
- Concatenating DataFrame using .append()
- Concatenating DataFrame by ignoring indexes
- Concatenating DataFrame with group keys
- Concatenating with mixed ndims

#### Concatenating DataFrame using .concat()
The `pd.concat()` function takes a list of DataFrames, concatenates them, and returns a new DataFrame. The inputs are left unchanged.

```python
# importing pandas module
import pandas as pd

# Define a dictionary containing employee data
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Age': [27, 24, 22, 32],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Define a second dictionary containing employee data
data2 = {'Name': ['Abhi', 'Ayushi', 'Dhiraj', 'Hitesh'],
         'Age': [17, 14, 12, 52],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Btech', 'B.A', 'Bcom', 'B.hons']}

# Convert the first dictionary into a DataFrame
df = pd.DataFrame(data1, index=[0, 1, 2, 3])

# Convert the second dictionary into a DataFrame
df1 = pd.DataFrame(data2, index=[4, 5, 6, 7])

print(df, "\n\n", df1)

# using pd.concat() to stack the two frames row-wise
frames = [df, df1]

res1 = pd.concat(frames)
res1
```

```text
     Name  Age    Address Qualification
0     Jai   27     Nagpur           Msc
1  Princi   24     Kanpur            MA
2  Gaurav   22  Allahabad           MCA
3    Anuj   32    Kannuaj           Phd 

      Name  Age    Address Qualification
4    Abhi   17     Nagpur         Btech
5  Ayushi   14     Kanpur           B.A
6  Dhiraj   12  Allahabad          Bcom
7  Hitesh   52    Kannuaj        B.hons
```

|   | Name | Age | Address | Qualification |
|---|------|-----|---------|---------------|
| 0 | Jai | 27 | Nagpur | Msc |
| 1 | Princi | 24 | Kanpur | MA |
| 2 | Gaurav | 22 | Allahabad | MCA |
| 3 | Anuj | 32 | Kannuaj | Phd |
| 4 | Abhi | 17 | Nagpur | Btech |
| 5 | Ayushi | 14 | Kanpur | B.A |
| 6 | Dhiraj | 12 | Allahabad | Bcom |
| 7 | Hitesh | 52 | Kannuaj | B.hons |


#### Concatenating DataFrame by setting logic on axes
When DataFrames are concatenated along one axis, pandas has to decide what to do with labels on the other axis that do not appear in every input. There are three options:

- Take the union of all labels, `join='outer'`. This is the default option, as it results in zero information loss.
- Take the intersection, `join='inner'`.
- Use a specific index. Older pandas versions accepted a `join_axes` argument for this; it was removed in pandas 1.0, and the equivalent is to call `.reindex()` on the concatenated result.
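The third option can be sketched as follows. Because `join_axes` is not available in pandas 1.0 and later, this sketch reindexes the outer result instead. The frames `left` and `right` are hypothetical and only mirror the row labels used in the example below.

```python
import pandas as pd

# Hypothetical frames with partly overlapping row labels
left = pd.DataFrame({"Mobile No": [97, 91, 58, 76]}, index=[0, 1, 2, 3])
right = pd.DataFrame({"Salary": [1000, 2000, 3000, 4000]}, index=[2, 3, 6, 7])

# The outer concat keeps labels 0, 1, 2, 3, 6, 7;
# reindexing then keeps only the rows of `left`
res = pd.concat([left, right], axis=1).reindex(left.index)
print(res)
```

Rows 0 and 1 have no match in `right`, so their `Salary` is `NaN`, while rows 6 and 7 are dropped.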

```python
# importing pandas module
import pandas as pd

# Define a dictionary containing employee data
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Age': [27, 24, 22, 32],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Msc', 'MA', 'MCA', 'Phd'],
         'Mobile No': [97, 91, 58, 76]}

# Define a second dictionary containing employee data
data2 = {'Name': ['Gaurav', 'Anuj', 'Dhiraj', 'Hitesh'],
         'Age': [22, 32, 12, 52],
         'Address': ['Allahabad', 'Kannuaj', 'Allahabad', 'Kannuaj'],
         'Qualification': ['MCA', 'Phd', 'Bcom', 'B.hons'],
         'Salary': [1000, 2000, 3000, 4000]}

# Convert the first dictionary into a DataFrame
df = pd.DataFrame(data1, index=[0, 1, 2, 3])

# Convert the second dictionary into a DataFrame
df1 = pd.DataFrame(data2, index=[2, 3, 6, 7])

print(df, "\n\n", df1)

# concatenate side by side (axis=1) and keep only the
# row labels present in both frames (join='inner')
res2 = pd.concat([df, df1], axis=1, join='inner')

res2
```

```text
     Name  Age    Address Qualification  Mobile No
0     Jai   27     Nagpur           Msc         97
1  Princi   24     Kanpur            MA         91
2  Gaurav   22  Allahabad           MCA         58
3    Anuj   32    Kannuaj           Phd         76 

      Name  Age    Address Qualification  Salary
2  Gaurav   22  Allahabad           MCA    1000
3    Anuj   32    Kannuaj           Phd    2000
6  Dhiraj   12  Allahabad          Bcom    3000
7  Hitesh   52    Kannuaj        B.hons    4000
```

|   | Name | Age | Address | Qualification | Mobile No | Name | Age | Address | Qualification | Salary |
|---|------|-----|---------|---------------|-----------|------|-----|---------|---------------|--------|
| 2 | Gaurav | 22 | Allahabad | MCA | 58 | Gaurav | 22 | Allahabad | MCA | 1000 |
| 3 | Anuj | 32 | Kannuaj | Phd | 76 | Anuj | 32 | Kannuaj | Phd | 2000 |


```python
# Now take the union of the row labels instead.
# join='outer' is the default, so it does not need to be passed.
res2 = pd.concat([df, df1], axis=1, sort=False)

res2
```

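|   | Name | Age | Address | Qualification | Mobile No | Name | Age | Address | Qualification | Salary |
|---|------|-----|---------|---------------|-----------|------|-----|---------|---------------|--------|
| 0 | Jai | 27.0 | Nagpur | Msc | 97.0 | NaN | NaN | NaN | NaN | NaN |
| 1 | Princi | 24.0 | Kanpur | MA | 91.0 | NaN | NaN | NaN | NaN | NaN |
| 2 | Gaurav | 22.0 | Allahabad | MCA | 58.0 | Gaurav | 22.0 | Allahabad | MCA | 1000.0 |
| 3 | Anuj | 32.0 | Kannuaj | Phd | 76.0 | Anuj | 32.0 | Kannuaj | Phd | 2000.0 |
| 6 | NaN | NaN | NaN | NaN | NaN | Dhiraj | 12.0 | Allahabad | Bcom | 3000.0 |
| 7 | NaN | NaN | NaN | NaN | NaN | Hitesh | 52.0 | Kannuaj | B.hons | 4000.0 |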


#### Concatenating DataFrame using .append()
The `DataFrame.append()` method concatenates along `axis=0`, namely the index. It predates `pd.concat()`. It was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same result is obtained with `pd.concat()`.

```python
# importing pandas module
import pandas as pd

# Define a dictionary containing employee data
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Age': [27, 24, 22, 32],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Define a second dictionary containing employee data
data2 = {'Name': ['Abhi', 'Ayushi', 'Dhiraj', 'Hitesh'],
         'Age': [17, 14, 12, 52],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Btech', 'B.A', 'Bcom', 'B.hons']}

# Convert the first dictionary into a DataFrame
df = pd.DataFrame(data1, index=[0, 1, 2, 3])

# Convert the second dictionary into a DataFrame
df1 = pd.DataFrame(data2, index=[4, 5, 6, 7])

print(df, "\n\n", df1)

# using the append function (works only on pandas < 2.0):
# res = df.append(df1)

# DataFrame.append() was removed in pandas 2.0;
# pd.concat() gives the same result on all versions
res = pd.concat([df, df1])
res
```

```text
     Name  Age    Address Qualification
0     Jai   27     Nagpur           Msc
1  Princi   24     Kanpur            MA
2  Gaurav   22  Allahabad           MCA
3    Anuj   32    Kannuaj           Phd 

      Name  Age    Address Qualification
4    Abhi   17     Nagpur         Btech
5  Ayushi   14     Kanpur           B.A
6  Dhiraj   12  Allahabad          Bcom
7  Hitesh   52    Kannuaj        B.hons
```

|   | Name | Age | Address | Qualification |
|---|------|-----|---------|---------------|
| 0 | Jai | 27 | Nagpur | Msc |
| 1 | Princi | 24 | Kanpur | MA |
| 2 | Gaurav | 22 | Allahabad | MCA |
| 3 | Anuj | 32 | Kannuaj | Phd |
| 4 | Abhi | 17 | Nagpur | Btech |
| 5 | Ayushi | 14 | Kanpur | B.A |
| 6 | Dhiraj | 12 | Allahabad | Bcom |
| 7 | Hitesh | 52 | Kannuaj | B.hons |
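The list of approaches near the top also names concatenating by ignoring indexes. A minimal sketch of that case, assuming two hypothetical frames `a` and `b` whose row labels overlap, uses the `ignore_index` parameter of `pd.concat()`:

```python
import pandas as pd

# Hypothetical frames that both use the row labels 0 and 1
a = pd.DataFrame({"Name": ["Jai", "Princi"]}, index=[0, 1])
b = pd.DataFrame({"Name": ["Abhi", "Ayushi"]}, index=[0, 1])

# By default the original labels are kept, so they repeat
kept = pd.concat([a, b])

# ignore_index=True discards the labels and renumbers the rows from 0
renumbered = pd.concat([a, b], ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```

Renumbering is useful when the row labels carry no meaning, since duplicate labels make label-based selection ambiguous.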


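#### Concatenating DataFrame with group keys
The `keys` argument of `pd.concat()` labels each input with a key. When DataFrames are concatenated row-wise, the keys become the outer level of a hierarchical index, so every row records which frame it came from. Another common use of `keys` is to override the column names when building a new DataFrame from existing Series along `axis=1`.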

```python
# importing pandas module
import pandas as pd

# Define a dictionary containing employee data
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Age': [27, 24, 22, 32],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Define a second dictionary containing employee data
data2 = {'Name': ['Abhi', 'Ayushi', 'Dhiraj', 'Hitesh'],
         'Age': [17, 14, 12, 52],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Btech', 'B.A', 'Bcom', 'B.hons']}

# Convert the first dictionary into a DataFrame
df = pd.DataFrame(data1, index=[0, 1, 2, 3])

# Convert the second dictionary into a DataFrame
df1 = pd.DataFrame(data2, index=[4, 5, 6, 7])

print(df, "\n\n", df1)

frames = [df, df1]

# keys=['x', 'y'] labels the rows of df with 'x' and the rows of df1 with 'y'
res = pd.concat(frames, keys=['x', 'y'])
res
```

```text
     Name  Age    Address Qualification
0     Jai   27     Nagpur           Msc
1  Princi   24     Kanpur            MA
2  Gaurav   22  Allahabad           MCA
3    Anuj   32    Kannuaj           Phd 

      Name  Age    Address Qualification
4    Abhi   17     Nagpur         Btech
5  Ayushi   14     Kanpur           B.A
6  Dhiraj   12  Allahabad          Bcom
7  Hitesh   52    Kannuaj        B.hons
```

|   |   | Name | Age | Address | Qualification |
|---|---|------|-----|---------|---------------|
| x | 0 | Jai | 27 | Nagpur | Msc |
| x | 1 | Princi | 24 | Kanpur | MA |
| x | 2 | Gaurav | 22 | Allahabad | MCA |
| x | 3 | Anuj | 32 | Kannuaj | Phd |
| y | 4 | Abhi | 17 | Nagpur | Btech |
| y | 5 | Ayushi | 14 | Kanpur | B.A |
| y | 6 | Dhiraj | 12 | Allahabad | Bcom |
| y | 7 | Hitesh | 52 | Kannuaj | B.hons |
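Two follow-up uses of `keys` can be sketched briefly: selecting one group back out of the keyed result, and naming columns when Series are concatenated along `axis=1`. The small frames and the labels `team_x` and `team_y` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frames, shortened versions of the ones above
df_x = pd.DataFrame({"Name": ["Jai", "Princi"]}, index=[0, 1])
df_y = pd.DataFrame({"Name": ["Abhi", "Ayushi"]}, index=[4, 5])

# Row-wise: the keys become the outer level of a two-level index
res = pd.concat([df_x, df_y], keys=["x", "y"])

# Selecting by the outer key returns the rows that came from df_y
only_y = res.loc["y"]
print(only_y)

# Column-wise with Series: the keys replace the Series names as column names
s1 = pd.Series([27, 24], name="Age")
s2 = pd.Series([17, 14], name="Age")
by_column = pd.concat([s1, s2], axis=1, keys=["team_x", "team_y"])
print(by_column)
```

Without `keys`, both columns of `by_column` would be called `Age`, which makes them awkward to select by name.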


#### Concatenating with mixed ndims
A mix of Series and DataFrame objects can be concatenated. Each Series is converted to a single-column DataFrame whose column name is the name of the Series.

```python
# importing pandas module
import pandas as pd

# Define a dictionary containing employee data
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Age': [27, 24, 22, 32],
         'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
         'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Convert the dictionary into a DataFrame
df = pd.DataFrame(data1, index=[0, 1, 2, 3])

# creating a named Series; its name becomes the new column name
s1 = pd.Series([10000, 20000, 30000, 40000], name='Salary')

print(df, "\n\n", s1)

# combining the Series and the DataFrame column-wise
res = pd.concat([df, s1], axis=1)

res
```

```text
     Name  Age    Address Qualification
0     Jai   27     Nagpur           Msc
1  Princi   24     Kanpur            MA
2  Gaurav   22  Allahabad           MCA
3    Anuj   32    Kannuaj           Phd 

 0    10000
1    20000
2    30000
3    40000
Name: Salary, dtype: int64
```

|   | Name | Age | Address | Qualification | Salary |
|---|------|-----|---------|---------------|--------|
| 0 | Jai | 27 | Nagpur | Msc | 10000 |
| 1 | Princi | 24 | Kanpur | MA | 20000 |
| 2 | Gaurav | 22 | Allahabad | MCA | 30000 |
| 3 | Anuj | 32 | Kannuaj | Phd | 40000 |
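Two details of mixing Series and DataFrames are worth a short sketch: what happens when the Series has no name, and what happens when its index does not line up with the DataFrame. The Series `unnamed` and `shifted` are hypothetical and introduced only for this illustration.

```python
import pandas as pd

# Hypothetical shortened frame
df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav"]}, index=[0, 1, 2])

# A Series without a name gets a numbered column label, starting at 0
unnamed = pd.Series([10000, 20000, 30000])
res_unnamed = pd.concat([df, unnamed], axis=1)
print(res_unnamed.columns.tolist())

# Rows are aligned on the index, not on position: labels missing
# from one input are filled with NaN (outer join by default)
shifted = pd.Series([10000, 20000], index=[2, 3], name="Salary")
res_shifted = pd.concat([df, shifted], axis=1)
print(res_shifted)
```

Naming the Series, and checking that its index matches the DataFrame, avoids both surprises.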
