# <font color="purple"><h3 align="center">Pandas Concatenate Tutorial</h3></font>

## <font color='blue'>Basic Concatenation</font>

In [13]:
import pandas as pd

india_weather = pd.DataFrame({
    "city": ["mumbai","delhi","banglore"],
    "temperature": [32,45,30],   # average temperature throughout the year
    "humidity": [80, 60, 78]   # humidity values
})
india_weather

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78


In [14]:
us_weather = pd.DataFrame({
    "city": ["new york","chicago","orlando"],
    "temperature": [21,14,35],
    "humidity": [68, 65, 75]
})
us_weather

Unnamed: 0,city,temperature,humidity
0,new york,21,68
1,chicago,14,65
2,orlando,35,75


In [15]:
df = pd.concat([india_weather, us_weather])  # ignore_index defaults to False, so the original indexes are kept: 0,1,2 and 0,1,2
df

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
0,new york,21,68
1,chicago,14,65
2,orlando,35,75
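
Because the original row labels are kept, every label now appears twice. The following is a minimal sketch of what that means for label-based lookups; it rebuilds the two DataFrames from above so that it runs on its own.

```python
import pandas as pd

india_weather = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
    "humidity": [80, 60, 78],
})
us_weather = pd.DataFrame({
    "city": ["new york", "chicago", "orlando"],
    "temperature": [21, 14, 35],
    "humidity": [68, 65, 75],
})

df = pd.concat([india_weather, us_weather])

# The row labels of both inputs are preserved, so they repeat.
index_labels = df.index.tolist()
print(index_labels)  # [0, 1, 2, 0, 1, 2]

# A lookup by label 0 returns two rows, one from each input.
rows_at_zero = df.loc[0]
print(rows_at_zero["city"].tolist())  # ['mumbai', 'new york']
```

If unique row labels matter for later lookups, this is worth checking with `df.index.is_unique`.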


## <font color='blue'>Ignore Index</font>

In [4]:
df = pd.concat([india_weather, us_weather], ignore_index=True)  # ignore_index=True discards the original indexes and assigns new ones: 0 to 5
df

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
3,new york,21,68
4,chicago,14,65
5,orlando,35,75
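
As a small cross-check, the sketch below compares `ignore_index=True` with concatenating first and then calling `reset_index(drop=True)`. For this data the two approaches should give the same result; the variable names are only illustrative.

```python
import pandas as pd

india_weather = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
    "humidity": [80, 60, 78],
})
us_weather = pd.DataFrame({
    "city": ["new york", "chicago", "orlando"],
    "temperature": [21, 14, 35],
    "humidity": [68, 65, 75],
})

# Option 1: let concat assign a fresh 0..n-1 index.
renumbered = pd.concat([india_weather, us_weather], ignore_index=True)

# Option 2: concatenate, then drop the old index and renumber.
reset_later = pd.concat([india_weather, us_weather]).reset_index(drop=True)

print(renumbered.index.tolist())       # [0, 1, 2, 3, 4, 5]
print(renumbered.equals(reset_later))  # True
```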


## <font color='blue'>Concatenation And Keys</font>

In [5]:
df = pd.concat([india_weather, us_weather], keys=["india", "us"])  # keys label each input DataFrame, so we can retrieve its rows later by key
df

Unnamed: 0,Unnamed: 1,city,temperature,humidity
india,0,mumbai,32,80
india,1,delhi,45,60
india,2,banglore,30,78
us,0,new york,21,68
us,1,chicago,14,65
us,2,orlando,35,75


In [6]:
df.loc["us"]  # retrieve only the US rows using the key

Unnamed: 0,city,temperature,humidity
0,new york,21,68
1,chicago,14,65
2,orlando,35,75


In [7]:
df.loc["india"]

Unnamed: 0,city,temperature,humidity
0,mumbai,32,80
1,delhi,45,60
2,banglore,30,78
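
Passing `keys` gives the result a two-level index (a `MultiIndex`). The sketch below also passes `names` to label the two levels; the level names `"country"` and `"row"` are assumptions chosen for this illustration and are not part of the original notebook.

```python
import pandas as pd

india_weather = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
    "humidity": [80, 60, 78],
})
us_weather = pd.DataFrame({
    "city": ["new york", "chicago", "orlando"],
    "temperature": [21, 14, 35],
    "humidity": [68, 65, 75],
})

# "country" and "row" are hypothetical level names for this example.
df = pd.concat(
    [india_weather, us_weather],
    keys=["india", "us"],
    names=["country", "row"],
)

# The outer level holds the keys, the inner level the original row labels.
print(df.index.nlevels)  # 2

# A full (key, row label) pair selects a single cell.
chicago = df.loc[("us", 1), "city"]
print(chicago)  # chicago

# xs picks the same inner label across all keys.
first_rows = df.xs(0, level="row")
print(first_rows["city"].tolist())  # ['mumbai', 'new york']
```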


## <font color='blue'>Concatenation Using Index</font>

Appending data as columns

In [17]:
temperature_df = pd.DataFrame({
    "city": ["mumbai","delhi","banglore"],
    "temperature": [32,45,30],
})
temperature_df

Unnamed: 0,city,temperature
0,mumbai,32
1,delhi,45
2,banglore,30


In [16]:
windspeed_df = pd.DataFrame({
    "city": ["delhi","mumbai"],
    "windspeed": [7,12],
})
windspeed_df

Unnamed: 0,city,windspeed
0,delhi,7
1,mumbai,12


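In the next cell, the `windspeed` column is added, but the rows of the second DataFrame are also appended below the first, leaving empty (`NaN`) cells. To add the data as columns instead, we have to change the `axis` parameter.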

In [21]:
df = pd.concat([temperature_df,windspeed_df])  # the default axis=0 stacks the second DataFrame below the first
df

Unnamed: 0,city,temperature,windspeed
0,mumbai,32.0,
1,delhi,45.0,
2,banglore,30.0,
0,delhi,,7.0
1,mumbai,,12.0


In [18]:
df = pd.concat([temperature_df,windspeed_df],axis=1)   # the default axis=0 appends the second DataFrame as rows; axis=1 appends it as columns
df

Unnamed: 0,city,temperature,city.1,windspeed
0,mumbai,32,delhi,7.0
1,delhi,45,mumbai,12.0
2,banglore,30,,


However, the rows are aligned by index, not by city, so the cities in each row do not match. To fix this, we can set the index explicitly while creating the DataFrames.

In [23]:
temperature_df = pd.DataFrame({
    "city": ["mumbai","delhi","banglore"],
    "temperature": [32,45,30],
},index=[0,1,2])
temperature_df

Unnamed: 0,city,temperature
0,mumbai,32
1,delhi,45
2,banglore,30


In [24]:
windspeed_df = pd.DataFrame({
    "city": ["delhi","mumbai"],
    "windspeed": [7,12],
},index=[1,0])

# set the index of the second DataFrame so that each city gets the same index label it has in the first DataFrame
windspeed_df


Unnamed: 0,city,windspeed
1,delhi,7
0,mumbai,12


In [25]:
df = pd.concat([temperature_df,windspeed_df],axis=1)   # with matching index labels, axis=1 now lines up the rows by city
df

Unnamed: 0,city,temperature,city.1,windspeed
0,mumbai,32,mumbai,12.0
1,delhi,45,delhi,7.0
2,banglore,30,,
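
Choosing index labels by hand works for a few rows but is easy to get wrong. One alternative, sketched here under the assumption that city names are unique within each DataFrame, is to use the `city` column itself as the index so that `pd.concat` aligns on it directly:

```python
import pandas as pd

# Use the city name as the row label in both DataFrames.
temperature_df = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
}).set_index("city")

windspeed_df = pd.DataFrame({
    "city": ["delhi", "mumbai"],
    "windspeed": [7, 12],
}).set_index("city")

# axis=1 aligns on the index, which is now the city name.
by_city = pd.concat([temperature_df, windspeed_df], axis=1)
print(by_city)

# banglore has no windspeed reading, so that cell is NaN.
print(by_city.loc["mumbai", "windspeed"])
```

This also avoids the duplicated `city` column, because the city is stored once in the index.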


## <font color='blue'>Concatenate DataFrame With Series</font>

**What is a pandas Series?**

A pandas Series is a one-dimensional labeled array. It can hold any data type, and each value is retrieved through its index label.

In [11]:
s = pd.Series(["Humid","Dry","Rain"], name="event")  # create a Series of weather events; its name becomes the column name
s

0    Humid
1      Dry
2     Rain
Name: event, dtype: object

In [12]:
df = pd.concat([temperature_df,s],axis=1)  # concatenate the DataFrame with the Series as a new column
df

Unnamed: 0,city,temperature,event
0,mumbai,32,Humid
1,delhi,45,Dry
2,banglore,30,Rain
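
As with DataFrames, a Series is aligned by index label rather than by position when `axis=1` is used. The sketch below illustrates this with a Series whose index is deliberately shuffled; the shuffled order is an assumption made only for this example.

```python
import pandas as pd

temperature_df = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
})

# Same events as above, but listed in a different order with explicit labels.
shuffled_events = pd.Series(["Rain", "Humid", "Dry"], index=[2, 0, 1], name="event")

with_events = pd.concat([temperature_df, shuffled_events], axis=1)

# Each event lands on the row with the matching index label.
print(with_events.loc[0, "event"])  # Humid
print(with_events.loc[2, "event"])  # Rain
```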


A better way to combine data like this is the `merge` function, which matches rows on a shared column, so there is no need to set the indexes explicitly.
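
As a rough sketch of the `merge` approach mentioned above, assuming `city` is the shared column and that every row of the temperature data should be kept (a left join):

```python
import pandas as pd

temperature_df = pd.DataFrame({
    "city": ["mumbai", "delhi", "banglore"],
    "temperature": [32, 45, 30],
})
windspeed_df = pd.DataFrame({
    "city": ["delhi", "mumbai"],
    "windspeed": [7, 12],
})

# Match rows on the city column; how="left" keeps all rows of temperature_df.
merged = pd.merge(temperature_df, windspeed_df, on="city", how="left")
print(merged)
```

Here the order of the rows in `windspeed_df` does not matter, and cities without a windspeed reading get `NaN` in that column.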