Concatenation is used to join two or more DataFrames into one.

## 1 - Concatenate

In [1]:
import pandas as pd

india_weather = pd.DataFrame({
    'city' : ['mumbai', 'delhi', 'banglore'],
    'temperature' : [32, 45, 30],
    'humidity' : [80, 60, 78]
})
india_weather

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78


In [2]:
usa_weather = pd.DataFrame({
    'city' : ['new york', 'chicago', 'orlando'],
    'temperature' : [23, 14, 35],
    'humidity' : [68, 65, 75]
})
usa_weather

Unnamed: 0,city,temperature,humidity
0,new york,23,68
1,chicago,14,65
2,orlando,35,75


In [4]:
# Build a single DataFrame that holds the data for both India and the USA.
df = pd.concat([india_weather, usa_weather])
df

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
0,new york,23,68
1,chicago,14,65
2,orlando,35,75


In [5]:
# Pass ignore_index=True to discard the original labels and get a continuous index
df = pd.concat([india_weather, usa_weather], ignore_index = True)
df

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
3,new york,23,68
4,chicago,14,65
5,orlando,35,75
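
Why the continuous index matters can be seen with a label lookup. The sketch below rebuilds the two DataFrames from above and compares `loc[0]` on both results; the variable names `stacked`, `renumbered`, `rows_at_zero` and `row_at_zero` are only illustrative.

```python
import pandas as pd

india_weather = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'temperature': [32, 45, 30],
    'humidity': [80, 60, 78],
})
usa_weather = pd.DataFrame({
    'city': ['new york', 'chicago', 'orlando'],
    'temperature': [23, 14, 35],
    'humidity': [68, 65, 75],
})

# Default behaviour: each input keeps its own labels, so 0, 1, 2 appear twice.
stacked = pd.concat([india_weather, usa_weather])
rows_at_zero = stacked.loc[0]        # a DataFrame with two rows

# ignore_index=True relabels the rows 0..5, so every label is unique.
renumbered = pd.concat([india_weather, usa_weather], ignore_index=True)
row_at_zero = renumbered.loc[0]      # a Series holding a single row

print(list(rows_at_zero['city']))    # ['mumbai', 'new york']
print(row_at_zero['city'])           # mumbai
```

With duplicate labels, `loc[0]` silently returns more than one row, which is easy to miss in later processing.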


In [6]:
# You can associate a key with each DataFrame; it acts like an additional index level
df = pd.concat([india_weather, usa_weather], keys = ['india', 'us'])
df

Unnamed: 0,Unnamed: 1,city,temperature,humidity
india,0,mumbai,32,80
india,1,delhi,45,60
india,2,banglore,30,78
us,0,new york,23,68
us,1,chicago,14,65
us,2,orlando,35,75


In [7]:
# With the keys in the index, loc can select one group (loc only works on the index)
df.loc['india']

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
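
The keyed index can be used for more than selecting a whole group. The following is a small sketch, assuming the same two DataFrames as above; the level names `'country'` and `'row'` are hypothetical labels chosen for this example.

```python
import pandas as pd

india_weather = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'temperature': [32, 45, 30],
    'humidity': [80, 60, 78],
})
usa_weather = pd.DataFrame({
    'city': ['new york', 'chicago', 'orlando'],
    'temperature': [23, 14, 35],
    'humidity': [68, 65, 75],
})

df = pd.concat([india_weather, usa_weather], keys=['india', 'us'])

# A (key, row label) tuple selects a single row from the two-level index.
delhi = df.loc[('india', 1)]
print(delhi['city'])                 # delhi

# names= gives the index levels readable names (hypothetical names here).
named = pd.concat(
    [india_weather, usa_weather],
    keys=['india', 'us'],
    names=['country', 'row'],
)

# reset_index(level=...) turns the key level into an ordinary column.
flat = named.reset_index(level='country')
print(list(flat.columns))            # ['country', 'city', 'temperature', 'humidity']
```

Turning the key into a column can be handy when the result is later grouped or filtered by country.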


## 2 - Concatenate dataframe changing its axis 

In [18]:
temperature_df = pd.DataFrame({
    'city' : ['mumbai', 'delhi', 'banglore'],
    'temperature' : [32, 45, 30],
})
temperature_df 

Unnamed: 0,city,temperature
0,mumbai,32
1,delhi,45
2,banglore,30


In [10]:
windspeed_df = pd.DataFrame({
    'city' : ['mumbai', 'delhi', 'banglore'],
    'windspeed' : [7, 12, 9],
})
windspeed_df

Unnamed: 0,city,windspeed
0,mumbai,7
1,delhi,12
2,banglore,9


In [11]:
df = pd.concat([temperature_df, windspeed_df])
df

FutureWarning: [...] of pandas will change to not sort by default. To accept the future behavior, pass 'sort=False'.


Unnamed: 0,city,temperature,windspeed
0,mumbai,32.0,
1,delhi,45.0,
2,banglore,30.0,
0,mumbai,,7.0
1,delhi,,12.0
2,banglore,,9.0
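
The warning in the output above comes from the pandas version used to run this notebook; it suggests passing `sort` explicitly. A minimal sketch of that, assuming the same `temperature_df` and `windspeed_df` (on recent pandas versions not sorting is already the default, so the argument only makes the choice explicit):

```python
import pandas as pd

temperature_df = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'temperature': [32, 45, 30],
})
windspeed_df = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'windspeed': [7, 12, 9],
})

# sort=False keeps the columns in order of first appearance instead of
# sorting them alphabetically.
df = pd.concat([temperature_df, windspeed_df], sort=False)

print(list(df.columns))              # ['city', 'temperature', 'windspeed']
print(df.shape)                      # (6, 3)
```

The result still has six rows with missing values: `sort` only controls column ordering, not how the rows are combined.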


This is not the DataFrame we want: each city appears twice and half of the values are missing. We want just three rows (one per city) with all the information.

In [12]:
df = pd.concat([temperature_df, windspeed_df], axis = 1)
df

Unnamed: 0,city,temperature,city.1,windspeed
0,mumbai,32,mumbai,7
1,delhi,45,delhi,12
2,banglore,30,banglore,9


With axis = 0 (the default) the DataFrames are stacked as rows; with axis = 1 they are placed side by side as columns. What happens if the cities appear in a different order in each DataFrame?

In [13]:
windspeed_df = pd.DataFrame({
    'city' : [ 'delhi', 'mumbai'],
    'windspeed' : [7, 12],
})
windspeed_df

Unnamed: 0,city,windspeed
0,delhi,7
1,mumbai,12


In [14]:
df = pd.concat([temperature_df, windspeed_df], axis = 1)
df

Unnamed: 0,city,temperature,city.1,windspeed
0,mumbai,32,delhi,7.0
1,delhi,45,mumbai,12.0
2,banglore,30,,


The rows no longer line up: Mumbai's temperature sits next to Delhi's wind speed, because concat aligns rows by index label, not by city name. Let's set the index explicitly so that each city has the same label in both DataFrames.


## 3 - Concatenate dataframe changing its index

In [15]:
temperature_df = pd.DataFrame({
    'city' : ['mumbai', 'delhi', 'banglore'],
    'temperature' : [32, 45, 30],
}, index = [0, 1, 2])
temperature_df 

Unnamed: 0,city,temperature
0,mumbai,32
1,delhi,45
2,banglore,30


In [16]:
windspeed_df = pd.DataFrame({
    'city' : [ 'delhi', 'mumbai'],
    'windspeed' : [7, 12],
}, index = [1, 0])
windspeed_df

Unnamed: 0,city,windspeed
1,delhi,7
0,mumbai,12


In [17]:
df = pd.concat([temperature_df, windspeed_df], axis = 1)
df

Unnamed: 0,city,temperature,city.1,windspeed
0,mumbai,32,mumbai,12.0
1,delhi,45,delhi,7.0
2,banglore,30,,


Now that each city has the same index label in both DataFrames, the rows line up as we wanted.
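
Assigning matching integer labels by hand gets tedious for larger tables. One possible alternative, sketched here under the assumption that city names are unique within each DataFrame, is to use the `city` column itself as the index before concatenating; `by_city` is an illustrative name.

```python
import pandas as pd

temperature_df = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'temperature': [32, 45, 30],
})
windspeed_df = pd.DataFrame({
    'city': ['delhi', 'mumbai'],
    'windspeed': [7, 12],
})

# With 'city' as the index, concat aligns the rows by city name,
# regardless of the order in which the cities are listed.
by_city = pd.concat(
    [temperature_df.set_index('city'), windspeed_df.set_index('city')],
    axis=1,
)

print(by_city.loc['mumbai', 'windspeed'])    # 12.0
print(by_city.loc['delhi', 'temperature'])   # 45
```

This also avoids the duplicated `city` column, and a city missing from one DataFrame (here banglore) simply gets a missing value.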

## 4 - Join our dataframe with a series

In [19]:
temperature_df

Unnamed: 0,city,temperature
0,mumbai,32
1,delhi,45
2,banglore,30


In [21]:
s = pd.Series(['Humid', 'Dry', 'Rain'], name = 'event')
s

0    Humid
1      Dry
2     Rain
Name: event, dtype: object

In [22]:
df = pd.concat([temperature_df, s], axis = 1)
df

Unnamed: 0,city,temperature,event
0,mumbai,32,Humid
1,delhi,45,Dry
2,banglore,30,Rain
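
Two details of this step are worth checking: the Series name becomes the column name, and the Series is aligned by index just like a DataFrame. The sketch below assumes the same `temperature_df`; `s_unnamed` and `s_shuffled` are hypothetical variations introduced only for illustration.

```python
import pandas as pd

temperature_df = pd.DataFrame({
    'city': ['mumbai', 'delhi', 'banglore'],
    'temperature': [32, 45, 30],
})

# Named Series: its name is used as the new column label.
s = pd.Series(['Humid', 'Dry', 'Rain'], name='event')
with_name = pd.concat([temperature_df, s], axis=1)

# Unnamed Series: pandas has no name to use, so the new column
# gets an automatically assigned label instead of 'event'.
s_unnamed = pd.Series(['Humid', 'Dry', 'Rain'])
without_name = pd.concat([temperature_df, s_unnamed], axis=1)

# Same values listed in reverse order, but with matching index labels:
# alignment is by label, so each event still lands on the right city.
s_shuffled = pd.Series(['Rain', 'Dry', 'Humid'], index=[2, 1, 0], name='event')
aligned = pd.concat([temperature_df, s_shuffled], axis=1)

print(list(with_name.columns))       # ['city', 'temperature', 'event']
print(aligned.loc[0, 'event'])       # Humid
```

Giving the Series a name up front keeps the resulting column easy to refer to later.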
