# Understand measures such as mean, median, mode, variance, and standard deviation.

#### Dataset: India_Inflation_Rate_Historical_Data.csv

### Dataset preparation

In [1]:
#Library Import
import pandas as pd
import numpy as np

In [2]:
#Dataset import
data = pd.read_csv("pra_datasets/India_Inflation_Rate_Historical_Data.csv")
data.head()

Unnamed: 0.1,Unnamed: 0,year,Inflation_Rate,Annual_percent_geowth
0,0,2022,6.70%,1.57%
1,1,2021,5.13%,-1.49%
2,2,2020,6.62%,2.89%
3,3,2019,3.73%,-0.21%
4,4,2018,3.94%,0.61%


### Data preprocessing

In [3]:
# Rename columns and convert the percentage strings to floats
data.rename(columns={'Unnamed: 0': 'Sr. No.',
                     'Inflation_Rate': 'Inflation_rate(%)',
                     'Annual_percent_geowth': 'Annual_growth(%)'},  # 'geowth' is the spelling in the CSV header
            inplace=True)
data['Inflation_rate(%)'] = data['Inflation_rate(%)'].str.replace('%', '', regex=False).astype(float)
data['Annual_growth(%)'] = data['Annual_growth(%)'].str.replace('%', '', regex=False).astype(float)
data.head()

Unnamed: 0,Sr. No.,year,Inflation_rate(%),Annual_growth(%)
0,0,2022,6.7,1.57
1,1,2021,5.13,-1.49
2,2,2020,6.62,2.89
3,3,2019,3.73,-0.21
4,4,2018,3.94,0.61
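
If the raw file contained blanks or malformed entries, `astype(float)` would raise an error. A minimal sketch of a more tolerant conversion follows, assuming a small hypothetical frame `raw` that mirrors the string format of the preview (it is not the real dataset):

```python
import pandas as pd

# Hypothetical sample mirroring the raw string format in the preview
raw = pd.DataFrame({
    "year": [2022, 2021, 2020],
    "Inflation_Rate": ["6.70%", "5.13%", "6.62%"],
})

# regex=False treats "%" as a literal character;
# errors="coerce" turns unparseable entries into NaN instead of raising
cleaned = pd.to_numeric(
    raw["Inflation_Rate"].str.replace("%", "", regex=False).str.strip(),
    errors="coerce",
)
print(cleaned.tolist())
```

Whether coercing bad values to NaN is appropriate depends on the analysis; failing loudly with `astype(float)` can be the safer default.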


### Mean

In [4]:
# Calculating mean
mean_inflation_rate = round(data['Inflation_rate(%)'].mean(),3)
mean_annual_growth = round(data['Annual_growth(%)'].mean(),3)
print("Mean Inflation Rate:", mean_inflation_rate)
print("Mean Annual Percent Growth:", mean_annual_growth)

Mean Inflation Rate: 7.366
Mean Annual Percent Growth: 0.078


- Mean Inflation Rate
    - On average, prices rose by about 7.37% per year across the years in the dataset.
- Mean Annual Percent Growth
    - In this dataset the column is the year-over-year change in the inflation rate, in percentage points (for example, 6.70 - 5.13 = 1.57 for 2022).
    - A positive mean indicates that inflation rose on average, while a negative mean would indicate that it fell. Here the mean of 0.078 is close to zero, so increases and decreases roughly cancel out over the period.
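
The mean is simply the sum of the values divided by their count. A short sketch, using only the five preview rows shown earlier (so the result differs from the full-dataset mean of 7.366):

```python
import pandas as pd

# Five most recent inflation rates from the preview (not the full dataset)
sample = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])

# count() ignores NaN values, just as mean() does
manual_mean = sample.sum() / sample.count()
print(round(manual_mean, 3), round(sample.mean(), 3))
```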

### Median

In [5]:
# Calculating median
median_inflation_rate = round(data['Inflation_rate(%)'].median(),3)
median_annual_growth = round(data['Annual_growth(%)'].median(),3)
print("\nMedian Inflation Rate:", median_inflation_rate)
print("Median Annual Percent Growth:", median_annual_growth)


Median Inflation Rate: 6.67
Median Annual Percent Growth: 0.07


- Median Inflation Rate
    - The median is the middle value of the sorted data, so it is less affected by outliers than the mean.
    - Here the median (6.67) is lower than the mean (7.366), which suggests that a few unusually high inflation years pull the mean upwards.
- Median Annual Percent Growth
    - This is the middle value of the annual growth data. At 0.07 it is almost equal to the mean (0.078), which suggests little skew in this column.
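
To illustrate why the median is more robust than the mean, the sketch below appends one extreme value to the five preview rows. The sample is illustrative only; 28.60 is the dataset maximum reported by `describe()` later in the notebook.

```python
import pandas as pd

sample = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])  # preview rows
with_outlier = pd.concat([sample, pd.Series([28.60])], ignore_index=True)

# The mean moves a lot, the median only a little
print("mean:  ", round(sample.mean(), 3), "->", round(with_outlier.mean(), 3))
print("median:", round(sample.median(), 3), "->", round(with_outlier.median(), 3))
```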

### Mode

In [6]:
# Calculating mode
mode_inflation_rate = round(data['Inflation_rate(%)'].mode().iloc[0],3)
mode_annual_growth = round(data['Annual_growth(%)'].mode().iloc[0],3)
print("\nMode Inflation Rate:", mode_inflation_rate)
print("Mode Annual Percent Growth:", mode_annual_growth)


Mode Inflation Rate: -7.63
Mode Annual Percent Growth: -0.08


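- Mode Inflation Rate
    - mode() returns every most-frequent value in ascending order, and .iloc[0] keeps only the smallest of them.
    - The reported mode of -7.63 is also the minimum of the column, which suggests that no inflation rate repeats and every value is tied as a "mode". In that case the figure is not a typical value; it only shows that the dataset contains a year of deflation.
- Mode Annual Percent Growth
    - The reported -0.08 is the smallest of the most frequent values. For continuous data like this, repeated values are largely coincidental, so the mode says little about periods of contraction or decline.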
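
One way to check whether a reported mode reflects a genuinely repeated value is to look at how many values `mode()` returns. A minimal sketch, using the five preview rows and a hypothetical series `repeated` in which one value occurs twice:

```python
import pandas as pd

unique_values = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])  # no repeats
repeated = pd.Series([6.70, 5.13, 5.13, 3.73, 3.94])       # hypothetical: 5.13 occurs twice

# With no repeats, every value is returned (sorted), so iloc[0] is just the minimum
modes = unique_values.mode()
print(len(modes), modes.iloc[0])

# With a real repeat, only the repeated value is returned
print(repeated.mode().tolist())

# value_counts().max() > 1 means at least one value actually repeats
has_real_mode = unique_values.value_counts().max() > 1
print(has_real_mode)
```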

### Variance

In [7]:
# Calculating variance
variance_inflation_rate = round(data['Inflation_rate(%)'].var(),3)
variance_annual_growth = round(data['Annual_growth(%)'].var(),3)
print("\nVariance Inflation Rate:", variance_inflation_rate)
print("Variance Annual Percent Growth:", variance_annual_growth)


Variance Inflation Rate: 23.708
Variance Annual Percent Growth: 30.608


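- Variance Inflation Rate
    - Variance measures the spread of data points around the mean as the average squared deviation. pandas var() returns the sample variance (it divides by n - 1).
    - A higher variance indicates greater variability in inflation rates across the dataset.
    - Because deviations are squared, the value 23.708 is in squared percentage points and is hard to read on its own; the standard deviation is easier to interpret.
- Variance Annual Percent Growth
    - The variance of 30.608 is higher than that of the inflation rate, so the year-over-year changes are spread more widely around their mean than the inflation rates are around theirs.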
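
The sketch below computes the variance by hand and compares the sample (`ddof=1`) and population (`ddof=0`) versions. It uses the five preview rows only, so the numbers differ from the full-dataset results.

```python
import numpy as np
import pandas as pd

sample = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])  # preview rows

sample_var = sample.var()            # pandas default: ddof=1 (divide by n - 1)
population_var = sample.var(ddof=0)  # divide by n

# Manual sample variance: sum of squared deviations over n - 1
manual_var = ((sample - sample.mean()) ** 2).sum() / (len(sample) - 1)

# NumPy's default is ddof=0, unlike pandas
numpy_var = np.var(sample.to_numpy())

print(round(sample_var, 3), round(population_var, 3))
```

The differing defaults in pandas and NumPy are a common source of small mismatches when results are compared across the two libraries.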

### Standard Deviation

In [8]:
# Calculating standard deviation
std_dev_inflation_rate = round(data['Inflation_rate(%)'].std(),3)
std_dev_annual_growth = round(data['Annual_growth(%)'].std(),3)
print("\nStandard Deviation Inflation Rate:", std_dev_inflation_rate)
print("Standard Deviation Annual Percent Growth:", std_dev_annual_growth)


Standard Deviation Inflation Rate: 4.869
Standard Deviation Annual Percent Growth: 5.532


- SD Inflation Rate
    - SD is another measure of the spread of data points around the mean.
    - It is the square root of the variance and is expressed in the same units as the original data.
    - An SD of 4.869 means that yearly inflation typically differs from the mean of 7.366 by roughly 4.9 percentage points.
- SD Annual Percent Growth
    - An SD of 5.532 around a mean near zero means that inflation typically moves by roughly 5.5 percentage points from one year to the next.
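
As a quick consistency check, the standard deviations reported above can be recovered from the reported variances. The sketch uses the printed, rounded variances as inputs, plus the five preview rows to confirm the same relationship in pandas:

```python
import math
import pandas as pd

# Variances as printed in the notebook output (already rounded)
var_inflation = 23.708
var_growth = 30.608

sd_inflation = round(math.sqrt(var_inflation), 3)
sd_growth = round(math.sqrt(var_growth), 3)
print(sd_inflation, sd_growth)

# The same relationship holds for any Series: std() is the square root of var()
sample = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])  # preview rows
same = math.isclose(sample.std(), math.sqrt(sample.var()))
print(same)
```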

### Correlation

In [9]:
# Calculating correlation
correlation = round(data['Inflation_rate(%)'].corr(data['Annual_growth(%)']),3)
print("\nCorrelation between Inflation Rate and Annual Percent Growth:", correlation)


Correlation between Inflation Rate and Annual Percent Growth: 0.559


- Correlation
    - Correlation (here the Pearson coefficient, the default of corr()) measures the strength and direction of the linear relationship between two variables, on a scale from -1 to 1.
    - A correlation of 0.559 indicates a moderate positive linear relationship between inflation rate and annual percent growth.
    - This means that years with a higher inflation rate tend to be years in which the annual percent growth is also higher, and vice versa. Correlation alone does not show that one causes the other.
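
The Pearson coefficient can be computed by hand as the sum of the products of the deviations, divided by the square root of the product of the summed squared deviations. A minimal sketch on the five preview rows follows; the column names `inflation` and `change` are illustrative, and the result is not the full-dataset value of 0.559.

```python
import pandas as pd

# Preview rows only; column names are illustrative
df = pd.DataFrame({
    "inflation": [6.70, 5.13, 6.62, 3.73, 3.94],
    "change": [1.57, -1.49, 2.89, -0.21, 0.61],
})

# Deviations from each column's mean
x = df["inflation"] - df["inflation"].mean()
y = df["change"] - df["change"].mean()

# Pearson r: sum of cross-deviations over the root of the product of squared deviations
manual_r = (x * y).sum() / ((x ** 2).sum() * (y ** 2).sum()) ** 0.5
pandas_r = df["inflation"].corr(df["change"])  # Pearson by default

print(round(manual_r, 3), round(pandas_r, 3))
```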

#### Built-in function for descriptive statistics

In [10]:
summary = round(data[['Inflation_rate(%)','Annual_growth(%)']].describe(),3)
print(summary)

       Inflation_rate(%)  Annual_growth(%)
count             63.000            63.000
mean               7.366             0.078
std                4.869             5.532
min               -7.630           -22.850
25%                4.130            -1.910
50%                6.670             0.070
75%                9.750             2.120
max               28.600            15.940


- data.describe()
  - It generates descriptive statistics that summarize the central tendency, dispersion, and shape of the distribution of each numeric column.
  - Here's what each statistic means:
    - count: Number of non-null observations in the column.
    - mean: Average value of the column.
    - std: Standard deviation, which measures the dispersion of values around the mean.
    - min: Minimum value in the column.
    - 25% (1st quartile): Value below which 25% of the data falls.
    - 50% (2nd quartile or median): Value below which 50% of the data falls.
    - 75% (3rd quartile): Value below which 75% of the data falls.
    - max: Maximum value in the column.
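
Each row of the `describe()` output can also be obtained from an individual method, which helps when only one statistic is needed. A short sketch on the five preview rows (not the full dataset):

```python
import pandas as pd

sample = pd.Series([6.70, 5.13, 6.62, 3.73, 3.94])  # preview rows

summary = sample.describe()

# Quartiles via quantile(); pandas interpolates linearly by default
quartiles = sample.quantile([0.25, 0.50, 0.75])

print(summary["count"], summary["min"], summary["max"])
print(quartiles.tolist())

# describe() also accepts custom percentiles
custom = sample.describe(percentiles=[0.10, 0.90])
print(custom.index.tolist())
```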
