# Measures of Center

## Introduction

In [9]:
# Imports
import numpy as np
from scipy import stats

Three measures of central tendency are commonly used as summary statistics:
- Mean
- Median
- Mode

## Mean

The mean is the sum of all values in a population or sample, divided by the total number of values within that dataset.

Expressed mathematically:

$$
\bar{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i
$$

```{note}
$\bf{\bar{x}}$ is the notation for the mean of $x$ and is pronounced "$x$ bar". This notation is used for the mean of a **sample** (i.e., a subset of a population). The mean of the whole **population** is written as $\bf{\mu}$.
```

In Python, the mean of a list of numbers can be calculated with NumPy as follows:

In [10]:
# For a random list of numbers, calculate the Mean
numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]
np.mean(numbers)

52.714285714285715
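
As a cross-check, the definition above (sum of all values divided by the number of values) can be written out by hand. This is only a minimal sketch; the variable name `manual_mean` is introduced here for illustration.

```python
import numpy as np

numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]

# Sum of all values divided by the number of values
manual_mean = sum(numbers) / len(numbers)

print(manual_mean)
# Compare with NumPy, allowing for floating-point rounding
print(np.isclose(manual_mean, np.mean(numbers)))
```

Both approaches should agree up to floating-point rounding; `np.mean` is usually the more convenient choice for arrays.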

## Median

The median is the center of a numeric dataset. It is found by ordering all values in the dataset from lowest to highest and then taking:
- the number at the center of the set (if there is an odd number of values in the set)
- the mean of the two numbers at the center of the set (if there is an even number of values in the set)

```{note}
The median is less affected by outliers than the mean.
```

Expressed mathematically, the median is:

$$
\text{median} = \frac{a_{\left\lfloor \frac{l+1}{2} \right\rfloor} + a_{\left\lceil \frac{l+1}{2} \right\rceil}}{2}
$$

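Where $a$ is a list of $l$ numbers sorted in ascending order and indexed from 1, and where $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ denote the floor and ceiling functions, respectively. When $l$ is odd, both indices coincide, so the formula returns the middle value itself.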
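
The formula can be transcribed almost literally into Python. The sketch below assumes the positions in the formula are 1-based, so each index is shifted by one for Python's 0-based lists; `median_from_formula` is a hypothetical helper name, not a library function.

```python
import math

def median_from_formula(values):
    """Median via the floor/ceiling formula (hypothetical helper)."""
    a = sorted(values)  # order the values from lowest to highest
    l = len(a)
    if l == 0:
        raise ValueError("the median of an empty list is undefined")
    # 1-based positions taken from the formula
    lower = math.floor((l + 1) / 2)
    upper = math.ceil((l + 1) / 2)
    # Subtract 1 because Python lists are 0-indexed
    return (a[lower - 1] + a[upper - 1]) / 2

numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]
print(median_from_formula(numbers))       # even number of values
print(median_from_formula([3, 1, 2]))     # odd number of values
```

For an odd number of values the two positions are identical, so averaging them simply returns the middle element.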

In Python, the median is calculated as follows:

In [11]:
# For a random list of numbers, calculate the Median
numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]
np.median(numbers)

5.0
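
To illustrate the note about outliers, the following sketch recomputes both statistics after dropping the single extreme value `643` from the same list. Treating `643` as the outlier is an assumption made here for demonstration, and the name `without_outlier` is illustrative.

```python
import numpy as np

numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]

# Same data with the extreme value removed
without_outlier = [x for x in numbers if x != 643]

print("with outlier:   ", np.mean(numbers), np.median(numbers))
print("without outlier:", np.mean(without_outlier), np.median(without_outlier))
```

The mean drops from roughly 52.7 to roughly 7.3, while the median stays at 5.0 in both cases.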

## Mode

The mode is the most common value in the dataset, i.e., the value that occurs most frequently.
- If no values occur more than once, there is no mode
- If multiple values occur most frequently, we have multiple modes

In Python, the mode can be calculated as follows *(using scipy.stats)*:

In [14]:
# For a random list of numbers, calculate the Mode
numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]
stats.mode(numbers)

ModeResult(mode=array([5]), count=array([4]))
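
Note that `stats.mode` reports only a single value even when several values are tied, and the exact form of the returned `ModeResult` can differ between SciPy versions. The sketch below shows one possible way to cover the two special cases listed above (no mode, multiple modes) using `collections.Counter` from the standard library; `all_modes` is a hypothetical helper name, and returning an empty list for "no mode" is a design choice made here.

```python
from collections import Counter

def all_modes(values):
    """Return every most-frequent value, or [] if no value repeats (hypothetical helper)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Following the definition above: if nothing occurs more than once, there is no mode
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)

numbers = [21, 5, 32, 1, 4, 2, 8, 5, 5, 1, 4, 2, 643, 5]
print(all_modes(numbers))           # single mode
print(all_modes([1, 1, 2, 2, 3]))   # two modes
print(all_modes([1, 2, 3]))         # no mode
```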