# Data Concatenation
Concatenation stacks datasets on top of or beside one another along a chosen axis: rows (axis=0) or columns (axis=1). In data preprocessing, it is typically used when data is split across multiple files or sources and needs to be combined vertically or horizontally. For example, if data collected over different time periods is stored in separate files, you can concatenate the files vertically to build a single timeline.
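As a minimal sketch of the time-period case, assume two hypothetical monthly tables (`jan` and `feb`, with made-up values) that share the same columns. Stacking them vertically gives one timeline:

```python
import pandas as pd

# Hypothetical data for two time periods (illustrative values only)
jan = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "sales": [10, 12]})
feb = pd.DataFrame({"date": ["2024-02-01", "2024-02-02"], "sales": [9, 15]})

# Stack the rows of both tables; ignore_index=True renumbers the rows from 0
timeline = pd.concat([jan, feb], axis=0, ignore_index=True)
print(timeline.shape)  # (4, 2)
```

In practice each table would usually come from a file reader such as `pd.read_csv`, but in-memory frames keep the sketch self-contained.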

In [1]:
import pandas as pd

### Combining rows vertically; axis=0

In [2]:
df1 = pd.DataFrame({
    "Name":["Ali","Haider","Bilal","Fatima"],
    "Age":[23,21,29,24],
    "Gender":["M","M","M","F"]
})
df1

Unnamed: 0,Name,Age,Gender
0,Ali,23,M
1,Haider,21,M
2,Bilal,29,M
3,Fatima,24,F


In [8]:
df2 = pd.DataFrame({
    "Name":["Abdullah","Haider","Sana"],
    "Gender":["M","M","F"],
    "Age":[12,13,14]
})
df2

Unnamed: 0,Name,Gender,Age
0,Abdullah,M,12
1,Haider,M,13
2,Sana,F,14


In [15]:
# Stack df2's rows below df1's; columns are matched by name and row labels are kept
df3 = pd.concat([df1,df2], axis=0)
df3

Unnamed: 0,Name,Age,Gender
0,Ali,23,M
1,Haider,21,M
2,Bilal,29,M
3,Fatima,24,F
0,Abdullah,12,M
1,Haider,13,M
2,Sana,14,F
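Two details of the output above are easy to miss: the columns are matched by name rather than by position, and the original row labels are repeated. A small sketch, using shortened copies of the two tables, illustrates both:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Ali", "Haider"], "Age": [23, 21], "Gender": ["M", "M"]})
# Same columns as df1, but declared in a different order
df2 = pd.DataFrame({"Name": ["Abdullah"], "Gender": ["M"], "Age": [12]})

stacked = pd.concat([df1, df2], axis=0)

# Columns are matched by name, so Age and Gender values stay under the right header
print(stacked.columns.tolist())  # ['Name', 'Age', 'Gender']

# The original row labels are kept, so label 0 now appears twice
print(stacked.loc[0, "Name"].tolist())  # ['Ali', 'Abdullah']
```

Duplicate labels make label-based lookups return several rows, which is usually the reason to renumber the index after stacking.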


In [16]:
# ignore_index=True discards the original row labels and renumbers from 0
df3 = pd.concat([df1,df2], axis=0, ignore_index=True)
df3

Unnamed: 0,Name,Age,Gender
0,Ali,23,M
1,Haider,21,M
2,Bilal,29,M
3,Fatima,24,F
4,Abdullah,12,M
5,Haider,13,M
6,Sana,14,F


In [18]:
# keys adds an outer index level recording which DataFrame each row came from
df3 = pd.concat([df1,df2], axis=0, keys=["1st dataset","2nd dataset"])
df3

Unnamed: 0,Unnamed: 1,Name,Age,Gender
1st dataset,0,Ali,23,M
1st dataset,1,Haider,21,M
1st dataset,2,Bilal,29,M
1st dataset,3,Fatima,24,F
2nd dataset,0,Abdullah,12,M
2nd dataset,1,Haider,13,M
2nd dataset,2,Sana,14,F
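The `keys` argument produces a two-level row index, so rows can later be pulled back out by their source. The following sketch uses shortened copies of the tables to show one way to do that:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Ali", "Haider"], "Age": [23, 21]})
df2 = pd.DataFrame({"Name": ["Abdullah", "Sana"], "Age": [12, 14]})

labelled = pd.concat([df1, df2], axis=0, keys=["1st dataset", "2nd dataset"])

# Select every row that came from the second DataFrame
second = labelled.loc["2nd dataset"]
print(second["Name"].tolist())  # ['Abdullah', 'Sana']

# Select one row by (key, original row label)
print(labelled.loc[("1st dataset", 1), "Name"])  # Haider
```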


### Combining columns horizontally; axis=1

In [23]:
df1 = pd.DataFrame({
    "Id":[1,2,3,4],
    "Name":["Ali","Haroon","Bilal","Sana"]
})
df1

Unnamed: 0,Id,Name
0,1,Ali
1,2,Haroon
2,3,Bilal
3,4,Sana


In [28]:
df2 = pd.DataFrame({
    "Gender":["M","M","M","F"]
})
df2

Unnamed: 0,Gender
0,M
1,M
2,M
3,F


In [29]:
# Place df2's columns beside df1's; rows are matched by index label
df3 = pd.concat([df1,df2], axis=1)
df3

Unnamed: 0,Id,Name,Gender
0,1,Ali,M
1,2,Haroon,M
2,3,Bilal,M
3,4,Sana,F
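The example above lines up cleanly because both tables share the row labels 0 to 3. When the labels differ, `pd.concat` with axis=1 aligns on them instead of on position. A minimal sketch with two hypothetical tables (`left` and `right`) shows the default outer join and the `join="inner"` alternative:

```python
import pandas as pd

left = pd.DataFrame({"Id": [1, 2, 3]})                      # row labels 0, 1, 2
right = pd.DataFrame({"Gender": ["M", "F"]}, index=[1, 2])  # row labels 1, 2

# Default join="outer": keep every row label and fill the gaps with NaN
outer = pd.concat([left, right], axis=1)
print(outer["Gender"].isna().tolist())  # [True, False, False]

# join="inner": keep only the row labels present in both DataFrames
inner = pd.concat([left, right], axis=1, join="inner")
print(inner.index.tolist())  # [1, 2]
```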


In [30]:
# With axis=1, keys adds an outer column level naming each source DataFrame
df3 = pd.concat([df1,df2], axis=1, keys=["First dataset","Second dataset"])
df3

Unnamed: 0_level_0,First dataset,First dataset,Second dataset
Unnamed: 0_level_1,Id,Name,Gender
0,1,Ali,M
1,2,Haroon,M
2,3,Bilal,M
3,4,Sana,F
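With axis=1 the keys become the outer level of a two-level column header. The sketch below, on shortened copies of the tables, shows how that header can be used to select columns by source:

```python
import pandas as pd

df1 = pd.DataFrame({"Id": [1, 2], "Name": ["Ali", "Haroon"]})
df2 = pd.DataFrame({"Gender": ["M", "M"]})

wide = pd.concat([df1, df2], axis=1, keys=["First dataset", "Second dataset"])

# The outer level selects all columns that came from one DataFrame
print(wide["First dataset"].columns.tolist())  # ['Id', 'Name']

# A (key, column) tuple selects a single column
print(wide[("Second dataset", "Gender")].tolist())  # ['M', 'M']
```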
