# Measures of Central Tendency

An essential statistical concept is the “measure of central tendency”. Such a measure summarizes a dataset with a single representative value and gives a rough picture of where the data points are centered. The most commonly used measures of central tendency are:


#### Mode: the most frequent value.

#### Median: the middle number in an ordered dataset.

#### Mean: the sum of all values divided by the total number of values.
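As a quick illustration of the three definitions, here is a minimal sketch using Python's standard `statistics` module. The `values` list is a hypothetical sample chosen only for this example; it is not the dataset used in the rest of the notebook.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 3, 3, 5, 7]

mode_value = statistics.mode(values)      # most frequent value
median_value = statistics.median(values)  # middle value of the sorted data
mean_value = statistics.mean(values)      # sum of values / number of values

print("Mode:", mode_value)
print("Median:", median_value)
print("Mean:", mean_value)
```

Here the sum is 20 over 5 values, so the mean is 4, while both the mode and the median are 3.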

<div class="alert alert-block alert-info" style="margin-top: 40px">
<h1> Mode </h1><br>
<b>The mode is the most frequently occurring value in the dataset.<br> 
It’s possible to have no mode, one mode, or more than one mode.</b>
</div>
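The "no mode, one mode, or more than one mode" cases can be sketched with `pandas.Series.mode`, which returns every value tied for the highest count. The three series below are hypothetical samples, not data from this notebook.

```python
import pandas as pd

# Hypothetical samples illustrating the three cases.
one_mode = pd.Series([1, 2, 2, 3])
two_modes = pd.Series([1, 1, 2, 2, 3])
all_unique = pd.Series([1, 2, 3])

# Exactly one value is most frequent.
one_mode_result = one_mode.mode().tolist()

# Two values tie for most frequent, so both are returned.
two_modes_result = two_modes.mode().tolist()

# Every value occurs once, so pandas returns all of them;
# this tie is usually interpreted as "no mode".
all_unique_result = all_unique.mode().tolist()

print(one_mode_result, two_modes_result, all_unique_result)
```

Note that pandas does not signal "no mode" on its own; deciding that a full tie means no mode is an interpretation left to the reader.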

In [11]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

1] Mode for Ungrouped data

Suppose we have the following dataset of people's ages:

In [12]:
df = pd.DataFrame([5,12,13,5,18,20,19,18,20,55,60,18,12,13,19])

In [13]:
df = df[0].value_counts().sort_index().reset_index()
df.columns = ['Age','frequency']
df

| | Age | frequency |
|---|---|---|
| 0 | 5 | 2 |
| 1 | 12 | 2 |
| 2 | 13 | 2 |
| 3 | 18 | 3 |
| 4 | 19 | 2 |
| 5 | 20 | 2 |
| 6 | 55 | 1 |
| 7 | 60 | 1 |


To Find Mode

In [14]:
# Map each age to its frequency
mode = df.set_index('Age')['frequency'].to_dict()

In [15]:
# Highest frequency, and the first age that reaches it
max_freq = max(mode.values())
max_freq_key = [k for k, v in mode.items() if v == max_freq][0]


In [16]:
print(f'{max_freq_key} : {max_freq}')

18 : 3
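A shorter route to the mode, sketched here on the same ages, is to let `value_counts` tally the raw values and then read off the most frequent one with `idxmax`. The variable names `ages`, `counts`, `top_age` and `top_count` are introduced only for this example.

```python
import pandas as pd

# The same ages used to build the frequency table above.
ages = pd.Series([5, 12, 13, 5, 18, 20, 19, 18, 20, 55, 60, 18, 12, 13, 19])

# value_counts maps each age to the number of times it occurs.
counts = ages.value_counts()

top_age = int(counts.idxmax())   # age with the highest count
top_count = int(counts.max())    # that highest count

print(f'{top_age} : {top_count}')
```

`idxmax` returns only one label, so for data with several modes `ages.mode()` is the safer choice.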


<div class="alert alert-block alert-info" style="margin-top: 40px">
<h1> Median </h1><br>
<b>The Median is the middle value of a set of data.<br> To determine the median value in a sequence of numbers, the numbers must first be arranged in ascending order.</b>
</div>
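The definition above can be sketched in plain Python: sort the values, then take the middle one, or the average of the two middle ones when the count is even. The helper name `median_of` and the sample lists are hypothetical, used only for illustration.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # the values must be in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_of([7, 1, 3])        # sorted: [1, 3, 7]
even_case = median_of([4, 1, 3, 2])    # sorted: [1, 2, 3, 4]

print(odd_case, even_case)
```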

In [18]:
df

| | Age | frequency |
|---|---|---|
| 0 | 5 | 2 |
| 1 | 12 | 2 |
| 2 | 13 | 2 |
| 3 | 18 | 3 |
| 4 | 19 | 2 |
| 5 | 20 | 2 |
| 6 | 55 | 1 |
| 7 | 60 | 1 |


In [30]:
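cumulative_frequency = []
running_total = 0  # named to avoid shadowing the built-in sum()

# Accumulate the frequencies row by row
for i in df.frequency:
    running_total += i
    cumulative_frequency.append(running_total)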

In [31]:
cumulative_frequency

[2, 4, 6, 9, 11, 13, 14, 15]

In [32]:
df['cumulative_frequency'] = cumulative_frequency

In [33]:
df

| | Age | frequency | cumulative_frequency |
|---|---|---|---|
| 0 | 5 | 2 | 2 |
| 1 | 12 | 2 | 4 |
| 2 | 13 | 2 | 6 |
| 3 | 18 | 3 | 9 |
| 4 | 19 | 2 | 11 |
| 5 | 20 | 2 | 13 |
| 6 | 55 | 1 | 14 |
| 7 | 60 | 1 | 15 |
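The cumulative frequency column can also be built without an explicit loop. The sketch below rebuilds the frequency table by hand (so that it runs on its own) and uses `Series.cumsum`, which returns the running total of a column.

```python
import pandas as pd

# Frequency table from above, rebuilt here so the example is self-contained.
df = pd.DataFrame({
    'Age': [5, 12, 13, 18, 19, 20, 55, 60],
    'frequency': [2, 2, 2, 3, 2, 2, 1, 1],
})

# cumsum() adds each frequency to the total of all rows before it.
df['cumulative_frequency'] = df['frequency'].cumsum()

print(df)
```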


In [58]:
total_count = df.frequency.sum()

In [59]:
middle_index = total_count/2


In [68]:

if total_count % 2 == 0:
    # Even count: average the values at positions n/2 and n/2 + 1
    left_index = df[df['cumulative_frequency'] >= middle_index].index[0]
    right_index = df[df['cumulative_frequency'] >= middle_index + 1].index[0]
    median = (df.loc[left_index, 'Age'] + df.loc[right_index, 'Age']) / 2

else:
    # Odd count: the first row whose cumulative frequency reaches n/2
    # contains the middle value
    index = df[df['cumulative_frequency'] >= middle_index].index[0]
    median = df.loc[index, 'Age']



In [69]:
print("Median:", median)

Median: 18
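One way to cross-check this result is to expand the frequency table back into the individual ages and let NumPy compute the median. This is a sketch under the assumption that the table is small enough to expand in memory; the names `ages`, `frequency`, `expanded` and `check` are introduced only for this example.

```python
import numpy as np

# Ages and their frequencies, copied from the table above.
ages = np.array([5, 12, 13, 18, 19, 20, 55, 60])
frequency = np.array([2, 2, 2, 3, 2, 2, 1, 1])

# np.repeat repeats each age as many times as its frequency,
# giving the 15 raw values in ascending order.
expanded = np.repeat(ages, frequency)

# np.median returns a float; with 15 values it is the 8th one.
check = np.median(expanded)

print("Median:", check)
```

For large frequency tables the cumulative-frequency approach avoids materializing every observation, which is the main reason to prefer it over this expansion.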
