<h1> <u> STATISTICS - CHAPTER-1 - DEMO-1 </u> </h1>
<h2> <u> Measures of Central Tendency </u> </h2>

This notebook demonstrates the commands used to compute the measures of central tendency. It concentrates on the following:

1. Mean
2. Median
3. Mode

<h6> Importing Libraries </h6>

In [1]:
import pandas as pd
import statistics as st

<h6> Reading the Dataframe (CSV file) </h6>

In [2]:
df = pd.read_csv("cars.csv")
print(df.dtypes)

model     object
mpg      float64
cyl        int64
disp     float64
hp         int64
drat     float64
wt       float64
qsec     float64
vs         int64
am         int64
gear       int64
carb       int64
dtype: object
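
The dtypes above show one text column (_model_) next to the numeric ones. Numeric summaries are usually restricted to the numeric columns, and the sketch below shows two common ways of doing that. The frame `toy` and its values are hypothetical and only mimic the column types of `cars.csv`.

```python
import pandas as pd

# Hypothetical three-row frame that mimics the column types of cars.csv
toy = pd.DataFrame({
    "model": ["A", "B", "C"],      # text column
    "mpg": [21.0, 22.8, 18.7],     # float column
    "cyl": [6, 4, 8],              # integer column
})

# Option 1: keep only the numeric columns before summarising
numeric = toy.select_dtypes(include="number")
print(list(numeric.columns))

# Option 2: ask mean() to skip non-numeric columns itself
toy_means = toy.mean(numeric_only=True)
print(toy_means)
```

Either option leaves the text column out of the calculation, so the result contains only _mpg_ and _cyl_.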


<h3> 1. Mean </h3>
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$$

$$\mu = \frac{1}{N}\sum_{i=1}^{N}x_i$$


where:

$\mu = \mbox{Population Mean}$

$\bar{x} = \mbox{Sample Mean}$

$N = \mbox{Population Size}$

$n = \mbox{Sample Size}$


<br>

The mean can be calculated in Python in two ways:
1. Using the pandas library
    - Input: Multidimensional Data
    - Output: Pandas Series (from a dataframe) / Single Float Value (from a single column)
2. Using the statistics library
    - Input: One-dimensional Data only (cannot handle Multidimensional Data)
    - Output: Single Value (a float here)

In [3]:
df_mean = df.mean(numeric_only=True) # calculating mean of every numeric column using pandas library (skips the string column "model")
s_mean = st.mean(df["mpg"]) # calculating mean of a single column using statistics library

# printing values
print(df_mean)
print("\n")
print(s_mean)

# printing datatypes
print("\n")
print("\n")
print("Pandas Library returns an object of type:", type(df_mean))
print("\n")
print("Statistics Library returns an object of type:", type(s_mean))

mpg      20.090625
cyl       6.187500
disp    230.721875
hp      146.687500
drat      3.596563
wt        3.217250
qsec     17.848750
vs        0.437500
am        0.406250
gear      3.687500
carb      2.812500
dtype: float64


20.090625




Pandas Library returns an object of type: <class 'pandas.core.series.Series'>


Statistics Library returns an object of type: <class 'float'>
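
As a quick check of the definition, the mean can also be computed by hand as the sum of the values divided by their count. The sketch below does that and compares the result with `statistics.mean`. The list `sample` is hypothetical and is not taken from `cars.csv`.

```python
import statistics as st

# Hypothetical sample values, chosen only for illustration
sample = [21.0, 22.8, 18.7, 14.3]

# Sample mean from the definition: sum of the values divided by n
manual_mean = sum(sample) / len(sample)

# The statistics library should agree with the manual calculation
library_mean = st.mean(sample)

print(manual_mean, library_mean)
```

The two values should agree up to floating-point rounding.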


<h3> 2. Median </h3>
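$$\widetilde{x} = \left\{
				   		\begin{array}{ll}
				   			x_{((n+1)/2)}; & \mbox{ if } n \mbox{ is odd}\\
				   			\frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right); & \mbox{ if } n \mbox{ is even}
				   		\end{array}
                    \right.
$$

where $x_{(k)}$ is the $k$-th smallest value, so the data must be sorted before the formula is applied.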

The median can be calculated in Python in two ways:
1. Using the pandas library
    - Input: Multi-dimensional Data
    - Output: Pandas Series (from a dataframe) / Single Float Value (from a single column)
2. Using the statistics library
    - Input: One-dimensional Data only (cannot handle Multi-dimensional Data)
    - Output: Single Value (a float here)

In [4]:
df_median = df.median(numeric_only=True) # calculating median of every numeric column using pandas library (skips the string column "model")
s_median = st.median(df["mpg"]) # calculating median of a single column using statistics library

# printing values
print(df_median)
print("\n")
print(s_median)

# printing data types
print("\n")
print("\n")
print("Pandas Library returns an object of type:",type(df_median))
print("\n")
print("Statistics Library returns an object of type:", type(s_median))

mpg      19.200
cyl       6.000
disp    196.300
hp      123.000
drat      3.695
wt        3.325
qsec     17.710
vs        0.000
am        0.000
gear      4.000
carb      2.000
dtype: float64


19.2




Pandas Library returns an object of type: <class 'pandas.core.series.Series'>


Statistics Library returns an object of type: <class 'float'>
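
The usual definition of the median (the middle value for odd n, the average of the two middle values for even n) can be written out directly. The sketch below is one way to do it. The helper `median_from_definition` and both sample lists are hypothetical and exist only for illustration.

```python
import statistics as st

def median_from_definition(values):
    """Median of a non-empty sequence: sort, then pick or average the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the single middle value
        return ordered[mid]
    # even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [3, 1, 4, 1, 5]        # sorted: 1, 1, 3, 4, 5
even_sample = [3, 1, 4, 1, 5, 9]    # sorted: 1, 1, 3, 4, 5, 9

print(median_from_definition(odd_sample))   # 3
print(median_from_definition(even_sample))  # 3.5

# statistics.median gives the same answers
print(st.median(odd_sample), st.median(even_sample))
```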


<h3> 3. Mode </h3>

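The mode can be calculated in Python in two ways:
1. Using the pandas library
    - Input: Multidimensional Data
    - Output: Pandas DataFrame (from a dataframe) / Pandas Series (from a single column), because data can have more than one mode
2. Using the statistics library
    - Input: One-dimensional Data only (cannot handle Multidimensional Data)
    - Output: Single Value of the same type as the data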

The mode functionality of the pandas library can be applied to the entire dataframe, including the column named _model_ which contains _strings_ , but it then returns a dataframe with one row per mode, padded with `NaN` for columns that have fewer modes. That result is hard to read, so the cell below computes the mode of the single column _cyl_ with both libraries.

In [5]:
df_mode = df["cyl"].mode() # calculating mode using pandas library (returns every mode as a Series)
s_mode = st.mode(df["cyl"]) # calculating mode using statistics library

# Printing values
print(df_mode)
print("\n")
print(s_mode)

# Printing Data Types
print("\n")
print("\n")
print("Pandas Library returns an object of type:",type(df_mode))
print("\n")
print("Statistics Library returns an object of type:", type(s_mode))

0    8
Name: cyl, dtype: int64


8




Pandas Library returns an object of type: <class 'pandas.core.series.Series'>


Statistics Library returns an object of type: <class 'int'>
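
Data can have more than one mode. `statistics.mode` returns a single value, while `statistics.multimode` (available from Python 3.8) returns every value tied for most frequent. The sketch below illustrates the difference. Both lists are hypothetical and are not taken from `cars.csv`.

```python
import statistics as st

# Hypothetical data: one clear mode versus a two-way tie
unimodal = [4, 6, 8, 8, 8, 6, 4, 8]
bimodal = [4, 4, 6, 8, 8]

# mode() returns one value
print(st.mode(unimodal))        # 8

# multimode() returns a list of all modes, in order of first appearance
print(st.multimode(bimodal))    # [4, 8]
```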


<div class="alert alert-box alert-info">
    <b>Note</b> : The <b> <i> pandas </i> </b> library can take multidimensional data for the calculation of mean, median and mode, but the <b> <i> statistics </i> </b> library cannot take multidimensional data, as illustrated above.
</div>
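
One way to work within that limitation is to call the statistics function once per column. The sketch below assumes the data is held as a plain dictionary of lists. The dictionary `columns` and its values are hypothetical stand-ins for a few columns of a table.

```python
import statistics as st

# Hypothetical column-oriented data standing in for a few columns of a table
columns = {
    "mpg": [21.0, 22.8, 18.7, 14.3],
    "cyl": [6, 4, 8, 8],
}

# statistics.mean expects one sequence of numbers, so call it once per column
column_means = {name: st.mean(values) for name, values in columns.items()}
print(column_means)
```

The same pattern should work with `st.median` or `st.mode` in place of `st.mean`.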