# NumPy: Standard Deviation and Variance

## Standard Deviation

In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. 

A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.

Reference: [Standard deviation (Bulgarian Wikipedia)](https://bg.wikipedia.org/wiki/%D0%A1%D1%82%D0%B0%D0%BD%D0%B4%D0%B0%D1%80%D1%82%D0%BD%D0%BE_%D0%BE%D1%82%D0%BA%D0%BB%D0%BE%D0%BD%D0%B5%D0%BD%D0%B8%D0%B5)

Standard deviation tells you how spread out the data is: it summarizes how far the observed values typically lie from the mean. For normally distributed data, about 95% of values fall within 2 standard deviations of the mean.


![Standard-deviation.jpg](../images/Standard-deviation.jpg)

### Normal Distributions

Standard deviation is a useful measure of spread for normal distributions.

Normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean.

In graphical form, the normal distribution appears as a "bell curve".

The mean of the bell curve is the middle of the curve (the peak of the bell), with an equal amount of data on both sides, while the standard deviation quantifies the variability of the curve (in other words, how wide or narrow the curve is).
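As a rough illustration of how the standard deviation describes the width of the bell, the sketch below draws simulated values from a normal distribution and measures what share of them falls within 1 and 2 standard deviations of the mean. The mean of 0, the standard deviation of 1, the sample size, and the seed are arbitrary choices made for this example. For a normal distribution the shares should come out near 68% and 95%.

```python
import numpy as np

# Simulated data: the parameters and the seed are arbitrary, chosen for illustration.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=100_000)

mean = samples.mean()
sd = samples.std()

# Fraction of values whose distance to the mean is at most 1 or 2 SDs.
within_1sd = np.mean(np.abs(samples - mean) <= 1 * sd)
within_2sd = np.mean(np.abs(samples - mean) <= 2 * sd)

print(f"within 1 SD: {within_1sd:.3f}")
print(f"within 2 SD: {within_2sd:.3f}")
```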

#### Examples

A good example of a normally distributed variable is the height of men. The average height of an adult male in the UK is about 1.77 meters. Heights vary, but most men are fairly close to this average. Some men are very short and some are very tall, but both groups are minorities at the edges of the range. If we plot a histogram of the data, we get a "bell-shaped" curve, with most heights clustered around the average and fewer and fewer cases as we move away from it on either side. This is the normal distribution, and the next figure shows this curve for our height example.


![normal_curve.jpg](../images/normal_curve.jpg)


### Standard Deviation Formula


The formula for standard deviation (SD) depends on whether you are analyzing population data, in which case it is denoted by the lowercase Greek letter σ (sigma), or estimating the population standard deviation from sample data, in which case it is denoted by s.

**In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A population is the entire group that you want to draw conclusions about.**

**A sample is a subset of the population which is chosen to represent the population in a statistical analysis.**
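In NumPy, the choice between the population and the sample version is made with the `ddof` ("delta degrees of freedom") parameter of `numpy.std`: the divisor is `N - ddof`. The following is a minimal sketch on made-up values, not data from this article.

```python
import numpy as np

# Hypothetical measurements, used only for illustration.
data = np.array([4.0, 8.0, 6.0, 5.0, 3.0])

# ddof=0 (the default) divides by N: population standard deviation (sigma).
population_sd = np.std(data)

# ddof=1 divides by N - 1: sample standard deviation (s).
sample_sd = np.std(data, ddof=1)

print(f"population SD: {population_sd:.4f}")
print(f"sample SD:     {sample_sd:.4f}")
```

Because the sample version divides by a smaller number, it is always at least as large as the population version for the same data.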


Standard deviation is rarely calculated by hand. It can, however, be done using the formula below, where:

	x: represents a value in the data set
	μ: represents the mean of the data set
	N: represents the number of values in the data set

$$
\text{SD} = \sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}
$$

The steps in calculating the standard deviation are as follows:

1. For each value, find its distance to the mean
1. For each value, find the square of this distance
1. Find the sum of these squared values
1. Divide the sum by the number of values in the data set
1. Find the square root of this
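The steps above can be sketched in plain Python as follows. The function name `population_sd` and the input values are chosen for this illustration only; the code divides by the number of values, so it computes the population standard deviation.

```python
import math


def population_sd(values):
    """Population standard deviation, following the five steps one by one."""
    n = len(values)
    mean = sum(values) / n

    distances = [x - mean for x in values]   # step 1: distance to the mean
    squared = [d ** 2 for d in distances]    # step 2: square each distance
    total = sum(squared)                     # step 3: sum the squared values
    variance = total / n                     # step 4: divide by the number of values
    return math.sqrt(variance)               # step 5: take the square root


# Illustrative values: the mean is 5 and the squared distances sum to 32.
result = population_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(result)  # 2.0
```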


Reference: [Standard deviation: basic examples (Wikipedia)](https://en.wikipedia.org/wiki/Standard_deviation#Basic_examples)

## Variance

### What Is Variance?

The term variance refers to a statistical measurement of the spread between numbers in a data set. More specifically, variance measures how far each number in the set is from the mean (average), and thus from every other number in the set.

High variance signifies that the values in the data set are far from their mean. Low variance, on the other hand, signifies that the values are very close to their mean.

The variance is the mean of the squared differences between each data point and the centre of the distribution, as measured by the mean.


## Variance vs Standard Deviation

The basic difference between the two is that the standard deviation is expressed in the same units as the data (and its mean), while the variance is expressed in squared units. The standard deviation is the square root of the variance.
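One way to see the difference in units is to rescale the data. In the sketch below, a set of made-up heights is converted from meters to centimeters: the standard deviation grows by the same factor of 100 as the data, while the variance grows by 100 squared.

```python
import numpy as np

# Hypothetical heights in meters, used only for illustration.
heights_m = np.array([1.62, 1.70, 1.77, 1.81, 1.90])
heights_cm = heights_m * 100

sd_m, sd_cm = np.std(heights_m), np.std(heights_cm)
var_m, var_cm = np.var(heights_m), np.var(heights_cm)

print(f"SD:       {sd_m:.4f} m    -> {sd_cm:.2f} cm")
print(f"variance: {var_m:.6f} m^2 -> {var_cm:.2f} cm^2")
```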

![variance_vs_std_deviation_formulae.png](../images/variance_vs_std_deviation_formulae.png)

## Example: Calculating Variance and Standard Deviation

Consider the following data set: 2, 7, 3, 12, 9.

1. Calculate Variance:
   1. calculate the mean:  
   
      `33 / 5 = 6.6`
   2. take each value in the data set, subtract the mean, and square the difference
   
      `(2 - 6.6)**2  = 21.16`

      `(7 - 6.6)**2  = 0.16`

      `(3 - 6.6)**2  = 12.96`

      `(12 - 6.6)**2  = 29.16`
      
      `(9 - 6.6)**2  = 5.76`

   3. calculate the sum, and then divide it by the number of data points
   
      `21.16 + 0.16 + 12.96 + 29.16 + 5.76 = 69.20`
      
      `69.20 / 5 = 13.84`

   The variance is 13.84.

2. Calculate the standard deviation:

   Calculate the square root of the variance:

      `sqrt(13.84) ≈ 3.72`

   The standard deviation is approximately 3.72.
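The hand calculation above can be checked with NumPy. This is a short sketch that relies on the default behaviour of `numpy.var` and `numpy.std`, which divide by the number of data points just as the manual steps do.

```python
import numpy as np

data = np.array([2, 7, 3, 12, 9])

variance = np.var(data)  # mean of the squared differences from the mean
sd = np.std(data)        # square root of the variance

print(f"mean:     {data.mean():.2f}")  # 6.60
print(f"variance: {variance:.2f}")     # 13.84
print(f"SD:       {sd:.2f}")           # 3.72
```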

## Standard Deviation and Variance in NumPy

**numpy.var()**

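Compute the variance along the specified axis. By default (`ddof=0`) the sum of squared differences is divided by the number of elements, which gives the population variance.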

Reference: [numpy.var](https://numpy.org/doc/stable/reference/generated/numpy.var.html)
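A small sketch of how the `axis` argument changes the result for a 2-D array. The array is an arbitrary example.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

var_all = np.var(a)           # variance of all four values: 1.25
var_cols = np.var(a, axis=0)  # one variance per column: [1., 1.]
var_rows = np.var(a, axis=1)  # one variance per row: [0.25, 0.25]

print(var_all, var_cols, var_rows)
```

With no `axis`, the array is treated as one flat collection of values. `axis=0` computes down the rows, giving one result per column, and `axis=1` computes across the columns, giving one result per row.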


**numpy.std()**

Compute the standard deviation along the specified axis. The result is the square root of the variance computed with the same arguments.

Reference: [numpy.std](https://numpy.org/doc/stable/reference/generated/numpy.std.html)
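The same `axis` argument works for `numpy.std`. The sketch below uses an arbitrary 2-D array and also confirms that the standard deviation matches the square root of `numpy.var` along the same axis.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

std_all = np.std(a)           # all four values: sqrt(1.25), about 1.118
std_cols = np.std(a, axis=0)  # one value per column: [1., 1.]
std_rows = np.std(a, axis=1)  # one value per row: [0.5, 0.5]

# The standard deviation is the square root of the variance.
matches_var = np.allclose(std_cols, np.sqrt(np.var(a, axis=0)))

print(std_all, std_cols, std_rows, matches_var)
```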