We talked about ways to represent the central tendency, or the _average of a dataset_. Here we are going to expand on that.

Let's say we have the set _S_:
```
S = -10, 0, 10, 20, 30
```

and the set _T_:
```
T = 8, 9, 10, 11, 12
```

Here we assume that _S_ and _T_ each make up an _entire population_. Let's calculate the _population mean_ of both sets.

For _S_:
```
 -10 + 0 + 10 + 20 + 30
-------------------------
           5

= 10
```

And for _T_:
```
  8 + 9 + 10 + 11 + 12
-------------------------
           5

= 10
```
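The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch; the helper name `population_mean` is an assumption made for illustration.

```python
S = [-10, 0, 10, 20, 30]
T = [8, 9, 10, 11, 12]

def population_mean(values):
    # Sum every value, then divide by how many values there are
    return sum(values) / len(values)

print(population_mean(S), population_mean(T))  # 10.0 10.0
```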

Here we can see that _S_ and _T_ have exactly the same _population mean_.
But when you look at the two datasets, there is a distinct difference between them. Set _S_ has values with a much larger range than _T_. Apart from `10` itself, the values in _S_ are all far from `10`, whereas in _T_ they are all tightly clumped around `10`. In other words, _S_ is more _dispersed_ than _T_.

So let's think about different ways we can measure _dispersion_, or _how far are we from the center, on average?_

### Range: max - min
Although simple and not used often, it's a way of understanding the spread of data.
If we apply this to our datasets:
```
range(S) = 30 - (-10) = 40
range(T) = 12 - 8     = 4
```
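As a quick check, the range can be computed directly in Python. The function name `value_range` is an assumption chosen to avoid shadowing Python's built-in `range`.

```python
S = [-10, 0, 10, 20, 30]
T = [8, 9, 10, 11, 12]

def value_range(values):
    # Range is the distance between the largest and the smallest value
    return max(values) - min(values)

print(value_range(S), value_range(T))  # 40 4
```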

As we can see here, range is a pretty good measure of dispersion in this case. But sometimes range isn't good enough. Two datasets can have the same range while their values are bunched up differently between the minimum and the maximum, so the range alone cannot tell their distributions apart.
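To make that limitation concrete, here is a small sketch using two made-up sets, `A` and `B`. Both the sets and the helper names are hypothetical and not part of the examples above. They share the same range and the same mean, yet the average distance from the mean differs.

```python
A = [0, 5, 5, 5, 10]   # values bunched up in the middle
B = [0, 0, 5, 10, 10]  # values pushed out towards the edges

def value_range(values):
    return max(values) - min(values)

def mean_distance_from_mean(values):
    # Informal check: how far is each value from the mean, on average?
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

print(value_range(A), value_range(B))                          # 10 10
print(mean_distance_from_mean(A), mean_distance_from_mean(B))  # 2.0 4.0
```

The range reports `10` for both sets, even though the values in `B` sit twice as far from the center on average.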

### Enter Variance (σ²)
Description: For each data point _x_ in a set _U_, find the difference between _x_ and mean(_U_) and square it. Add the squared differences together, then divide by length(_U_).

Here's an example of calculating σ² for set _S_:
```
(-10 - 10)² + (0 - 10)² + (10 - 10)² + (20 - 10)² + (30 - 10)²
---------------------------------------------------------------
                               5

= 200
```

or in Python:

```python
S = [-10, 0, 10, 20, 30]
T = [8, 9, 10, 11, 12]

def mean(*values):
    return sum(values) / len(values)

def sigma_squared(*values):
    # Population variance: the mean of the squared deviations from the mean
    m = mean(*values)
    return sum((x - m) ** 2 for x in values) / len(values)

sigma_squared(*S), sigma_squared(*T)
```

```
(200.0, 2.0)
```

Here we can see that the variance of _S_ is 200, whereas the variance of _T_ is only 2! This is a much better representation of how differently the two sets are spread out.

## Standard Deviation (σ)
The standard deviation is just the square root of the variance σ².

So the standard deviation of _S_ and _T_ would be:

```python
import math

def sigma(*values):
    # Standard deviation: the square root of the population variance
    return math.sqrt(sigma_squared(*values))

sigma(*S), sigma(*T)
```

```
(14.142135623730951, 1.4142135623730951)
```

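Intuitively, these numbers make sense. The standard deviation of _S_ is _ten times_ that of _T_, because every value in _S_ sits ten times as far from the mean as the matching value in _T_ (for example, -10 is 20 away from 10, while 8 is only 2 away).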
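These results can be cross-checked against Python's standard library. The sketch below assumes the built-in `statistics` module, whose `pvariance` and `pstdev` functions treat the data as an entire population (dividing by n rather than n - 1).

```python
import math
import statistics

S = [-10, 0, 10, 20, 30]
T = [8, 9, 10, 11, 12]

# Population variance of each set
print(statistics.pvariance(S), statistics.pvariance(T))

# Population standard deviation of each set (roughly 14.14 and 1.41)
print(statistics.pstdev(S), statistics.pstdev(T))

# The ratio of the two standard deviations is 10, up to floating-point rounding
ratio = statistics.pstdev(S) / statistics.pstdev(T)
print(math.isclose(ratio, 10.0))  # True
```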

## Variance of a population

Let's say I'm trying to judge how many years of experience the entire population of people in an organisation has.

```
Years of experience
--------------------
        1
        3
        5
        7
        14
```

What would be the population mean (mean years of experience) for my population?

Well, we could just take the mean:
```
         1 + 3 + 5 + 7 + 14
mean =   ------------------
                 5
```
Or in fancy speak:

$$\mu = \frac{1}{5}\sum_{i=1}^{5} x_i$$

which ultimately equals 6.

Let's calculate the variance:

$$ \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2 = 20 $$
or
```
       (1 - 6)² + (3 - 6)² + (5 - 6)² + (7 - 6)² + (14 - 6)²
σ² =   -----------------------------------------------------   = 20
                                 5
```
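The calculation above can be traced step by step in Python. This is a sketch only; the variable names (`years`, `mu`, `squared_deviations`) are assumptions chosen for readability.

```python
import math

years = [1, 3, 5, 7, 14]

# Population mean
mu = sum(years) / len(years)

# Squared distance of each value from the mean
squared_deviations = [(x - mu) ** 2 for x in years]

# Population variance: the mean of the squared deviations
variance = sum(squared_deviations) / len(years)

print(mu)                   # 6.0
print(squared_deviations)   # [25.0, 9.0, 1.0, 1.0, 64.0]
print(variance)             # 20.0
print(math.sqrt(variance))  # standard deviation, about 4.47
```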