# Standard Deviation
An important characteristic of datasets is how much variability exists among individual samples. Without a measure of variability, you can’t effectively compare two datasets. 

For example, if one dataset consists of the values <mark>[99, 100, 101]</mark>, and another dataset consists of the values <mark>[0, 100, 200]</mark>, they both have the same mean and median values of 100, yet they have very different amounts of variability. There is a large amount of variability in the second dataset compared to the first. 
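
To make the comparison concrete, here is a minimal sketch using Python's built-in `statistics` module. The helper name `summarize` is made up for this illustration; it returns the mean, the median, and each value's distance from the mean, so you can see that the two datasets share the same center while their deviations differ by a factor of 100.

```python
import statistics

first = [99, 100, 101]
second = [0, 100, 200]

def summarize(values):
    """Return (mean, median, deviations from the mean) for a list of numbers."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # How far each value sits from the mean (negative = below the mean)
    deviations = [x - mean for x in values]
    return mean, median, deviations

for name, values in (("first", first), ("second", second)):
    mean, median, deviations = summarize(values)
    print(f"{name}: mean={mean}, median={median}, deviations={deviations}")
```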

The variability is often a defining characteristic of a dataset. The **`standard deviation`** is perhaps the most informative and certainly the most widely used measure of a dataset's variability.  Let's look at how it's calculated.


The formula for the **`standard deviation`** of a dataset is:  


$$\Large s = \sqrt{\frac{\sum_{i=1}^{n} (x_{i}-\bar{x})^{2}}{n-1}}$$

_Let's walk through the above formula and summarize what we've discussed: The numerator is simply the sum of the squared distances of each value from the mean. We then divide that sum by n-1 rather than n (we'll assume that the data is a sample from a larger population) to get, approximately, the average squared deviation from the mean; this quantity is the sample variance. Finally, because we squared the deviations from the mean, we take the square root of that quantity to undo the squaring and put the result back into the original units of the data._
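
The steps in the formula can also be written out by hand. The function below is a rough sketch rather than a library routine (the name `sample_std` is an assumption made for this example), and it reuses the two small datasets from the start of this section so the arithmetic is easy to check.

```python
import math

def sample_std(values):
    """Sample standard deviation, following the formula step by step."""
    n = len(values)
    mean = sum(values) / n
    # Numerator: sum of squared distances from the mean
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Divide by n - 1 because the data is treated as a sample
    variance = sum(squared_deviations) / (n - 1)
    # Square root returns the result to the original units
    return math.sqrt(variance)

print(sample_std([99, 100, 101]))  # 1.0
print(sample_std([0, 100, 200]))   # 100.0
```

Both datasets have a mean of 100, but the second one has a standard deviation 100 times larger, which matches the intuition about their variability.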

In [1]:
import pandas as pd

In [2]:
data = [3, 5, 8, 2, 5, 9, 6, 3, 5, 7]
df = pd.DataFrame(data, columns=['nums'])

In [3]:
# Sample standard deviation of the 'nums' column (pandas divides by n - 1 by default)
df['nums'].std()

np.float64(2.2632326929023945)
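
One detail worth checking when you compare results across libraries: pandas' `.std()` uses `ddof=1` by default (the n-1 denominator from the formula), while NumPy's `np.std` uses `ddof=0` by default (dividing by n). The sketch below reuses the same `data` list to show the difference; the variable names are chosen only for this illustration.

```python
import numpy as np
import pandas as pd

data = [3, 5, 8, 2, 5, 9, 6, 3, 5, 7]
df = pd.DataFrame(data, columns=['nums'])

# pandas default: ddof=1, i.e. divide by n - 1 (sample standard deviation)
pandas_sample = df['nums'].std()

# Dividing by n instead (population standard deviation)
pandas_population = df['nums'].std(ddof=0)

# NumPy default: ddof=0, so it matches the population version
numpy_default = np.std(data)

# Passing ddof=1 makes NumPy agree with the pandas default
numpy_sample = np.std(data, ddof=1)

print(pandas_sample, numpy_sample)
print(pandas_population, numpy_default)
```

If the data really is a sample from a larger population, as assumed above, the `ddof=1` version is the one that matches the formula.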

<div class="alert alert-block alert-success">
    <b>Success:</b> The <b>mean</b> is interpreted as the expected value, and the <b>standard deviation</b> can be interpreted as the typical distance of individual values from the mean.
</div>