# Standard Deviation

In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values.

---
A low standard deviation indicates that the values tend to be close to the **mean** (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.

Standard deviation is abbreviated as SD. It is most commonly represented by the lowercase Greek letter σ (sigma) for the population standard deviation, or by the Latin letter *s* for the sample standard deviation.
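In code, the difference between σ and *s* comes down to the divisor. The following is a minimal sketch, assuming NumPy and a made-up `values` list chosen only for illustration: `np.std` divides by N by default (`ddof=0`, the population formula), while `ddof=1` divides by N − 1, the usual sample formula.

```python
import numpy as np

# Hypothetical data, chosen only for illustration
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: divide by N (NumPy's default, ddof=0)
sigma = np.std(values)

# Sample standard deviation: divide by N - 1 (ddof=1)
s = np.std(values, ddof=1)

print(f"Population SD (sigma): {sigma}")
print(f"Sample SD (s): {s}")
```

The sample value is always slightly larger than the population value for the same data, because the sum of squared deviations is divided by a smaller number.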

The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance.
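This relationship is easy to check numerically. A small sketch, assuming NumPy and reusing the ages from the Python section of this notebook:

```python
import numpy as np

values = [86, 87, 88, 86, 87, 85, 86]  # same ages as in the Python section

variance = np.var(values)  # population variance
sd = np.std(values)        # population standard deviation

# The standard deviation should equal the square root of the variance
print(np.isclose(sd, np.sqrt(variance)))
```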

Standard deviation is measured in the same units as the data.

To recap, standard deviation is a measure of dispersion: how spread out the values in a dataset are.

![image.png](attachment:d9efc1fc-bf57-4db4-b83f-90d64a91b5c9.png)

This is a plot of a normal distribution in which each band has a width of 1 standard deviation.
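The share of a normal distribution covered by these bands can be computed directly. The sketch below assumes an ideal normal distribution and uses a hypothetical helper, `fraction_within`, built on the standard library's `math.erf`:

```python
import math

def fraction_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} SD: {fraction_within(k):.4f}")
```

For k = 1, 2, 3 this gives roughly 68%, 95%, and 99.7% of the values.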

### Mathematical formula for standard deviation

![image.png](attachment:8ba5ad57-1bef-4c1a-ad57-6a865eae28de.png)

σ = population standard deviation

N = the size of the population

![image.png](attachment:ac94b976-7eea-42ee-b8ca-066cb8ec47f4.png)= each value from the population

![image.png](attachment:7fb1d3e7-f860-4d12-9abf-cce05462a162.png) = the population mean
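The formula can be followed step by step in plain Python. This is a sketch that assumes the usual population formula, $σ = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - μ)^2}$, and reuses the ages from the Python section of this notebook:

```python
import math

values = [86, 87, 88, 86, 87, 85, 86]  # the ages from the Python section

N = len(values)                                  # size of the population
mu = sum(values) / N                             # population mean
squared_devs = [(x - mu) ** 2 for x in values]   # squared deviation of each value
sigma = math.sqrt(sum(squared_devs) / N)         # square root of the mean squared deviation

print(sigma)
```

The result should agree with the value `np.std` reports for the same list in the Python section (about 0.9035).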

---

# Variance

In probability and statistics, variance is the expectation of the squared deviation of a random variable from its mean.

Informally, it measures how far a set of numbers is spread out from their average value.

It is represented by $σ^2$ or Var(X), where X is a random variable.

### Formula for variance

![image.png](attachment:d2a160e8-fe1d-48f3-baee-3fc8325e7081.png)

where $σ^2$ is the variance, N is the number of observations, X denotes the individual observations, and μ is the mean.
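For reference, the formula can also be typeset directly. This is a sketch assuming the image above shows the usual population variance; the symbol $x_i$ is introduced here to stand for the i-th observation:

$$
\mathrm{Var}(X) = σ^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - μ)^2
$$

Taking the square root of both sides gives the population standard deviation σ.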

![image](https://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/Comparison_standard_deviations.svg/600px-Comparison_standard_deviations.svg.png)

Example of samples from two populations with the same mean but different variances. The red population has mean 100 and variance 100 (SD=10) while the blue population has mean 100 and variance 2500 (SD=50).

The variance can also be thought of as the covariance of a random variable with itself:

![image.png](attachment:d03e04e1-6967-43ee-a319-d30700e224bb.png)
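The identity Var(X) = Cov(X, X) can be checked numerically. A minimal sketch, assuming NumPy and the ages from the Python section; `bias=True` is passed so that `np.cov` divides by N, matching the default of `np.var`:

```python
import numpy as np

x = np.array([86, 87, 88, 86, 87, 85, 86])

# np.cov(x, x) treats the two arguments as two variables and returns a 2x2 matrix;
# bias=True normalises by N instead of N - 1
cov_matrix = np.cov(x, x, bias=True)

cov_xx = cov_matrix[0, 1]  # Cov(X, X)
print(np.isclose(cov_xx, np.var(x)))
```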

---

### Standard deviation in Python

In [2]:
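import numpy as np

# Ages that cluster tightly around their mean
age = [86, 87, 88, 86, 87, 85, 86]
mean = np.mean(age)

# np.std computes the population standard deviation by default (ddof=0)
x = np.std(age)
print(f"Mean is {mean}:")
print(f"Standard Deviation is {x}: ")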

Mean is 86.42857142857143:
Standard Deviation is 0.9035079029052513: 


In the above example, the standard deviation is low, which indicates that the values lie close to the mean.

In [3]:
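# Same kind of data, but with one extreme outlier (8695)
age = [90, 98, 100, 89, 8695, 97]
mean = np.mean(age)
x = np.std(age)  # use the np alias; the bare name numpy was never imported

print(f"Mean is {mean}:")
print(f"Standard Deviation is {x}: ")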

Mean is 1528.1666666666667:
Standard Deviation is 3205.1078721662743: 


In the above example, a single outlier (8695) is introduced, which results in a much higher standard deviation. This indicates that the data is spread out over a wider range (not clustered closely around the mean).
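To see how much of that spread comes from the single outlier, the standard deviation can be recomputed without it. A small sketch, assuming NumPy and the same `age` list as above:

```python
import numpy as np

age = [90, 98, 100, 89, 8695, 97]

# Drop the one extreme value and compare the two standard deviations
without_outlier = [a for a in age if a != 8695]

sd_with = np.std(age)
sd_without = np.std(without_outlier)

print(f"SD with outlier: {sd_with}")
print(f"SD without outlier: {sd_without}")
```

With the outlier removed, the standard deviation falls from the thousands to single digits.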

---

### Variance in Python

Note: Variance is simply the square of the standard deviation.

In [5]:
age = [86, 87, 88, 86, 87, 85, 86]
mean = np.mean(age)

# np.var computes the population variance by default (ddof=0)
x = np.var(age)
print(f"Mean is {mean}:")
print(f"Variance is {x}: ")

Mean is 86.42857142857143:
Variance is 0.8163265306122449: 


In [6]:
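# The same list with the outlier (8695)
age = [90, 98, 100, 89, 8695, 97]
mean = np.mean(age)
x = np.var(age)  # use the np alias; the bare name numpy was never imported

print(f"Mean is {mean}:")
print(f"Variance is {x}: ")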

Mean is 1528.1666666666667:
Variance is 10272716.472222222: 
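The same population values can also be obtained without NumPy. A sketch using Python's standard `statistics` module, where `pvariance` and `pstdev` are the population versions (dividing by N):

```python
import statistics

age = [86, 87, 88, 86, 87, 85, 86]

pop_variance = statistics.pvariance(age)  # population variance
pop_sd = statistics.pstdev(age)           # population standard deviation

print(f"Variance is {pop_variance}")
print(f"Standard Deviation is {pop_sd}")
```

The module also provides `variance` and `stdev`, which use the sample formulas (dividing by N − 1).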


### Other references:
https://www.youtube.com/watch?v=1E7NU-uWalY - Krish Naik

https://www.youtube.com/watch?v=SzZ6GpcfoQY - Josh Starmer