# Variance

## Theory of Variance

Variance is a fundamental measure of dispersion in statistics. It quantifies how much the values in a dataset differ from the mean, providing insight into the spread or variability of the data.

### How to Calculate Variance
- Find the mean of the dataset.
- Subtract the mean from each value to get the deviation for each value.
- Square each deviation.
- Find the average of the squared deviations: divide their sum by $n$ for a population, or by $n-1$ for a sample.
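The steps above can be sketched in plain Python. This is a minimal illustration of the population form (dividing by $n$); the function name `population_variance` and the data values are assumptions chosen for the example, not part of any library.

```python
def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one value")
    mean = sum(values) / n                    # step 1: mean of the dataset
    deviations = [x - mean for x in values]   # step 2: deviation of each value
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    return sum(squared) / n                   # step 4: average the squares


# Illustrative values (hypothetical data, chosen so the mean is exactly 5)
data = [2, 4, 4, 4, 5, 5, 7, 9]
result = population_variance(data)
print(result)  # 4.0
```

Each intermediate list is kept separate so it maps one-to-one onto the steps; a production version would more likely call a library routine.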

### Formula
For a dataset $x_1, x_2, \ldots, x_n$ that makes up an entire population, with mean $\bar{x}$:

$$
\text{Population Variance} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

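For a sample (not the entire population), divide by $n-1$ instead of $n$. This adjustment, known as Bessel's correction, compensates for measuring deviations from the sample mean rather than the true population mean: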

$$
\text{Sample Variance} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$
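As a quick check of how the two formulas relate, the sketch below uses Python's standard-library `statistics` module, where `pvariance` divides by $n$ and `variance` divides by $n-1$. The data values are hypothetical and only serve the illustration.

```python
import math
import statistics

# Illustrative values (hypothetical data)
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

population_var = statistics.pvariance(data)  # divides by n
sample_var = statistics.variance(data)       # divides by n - 1

# Both share the same sum of squared deviations, so they differ
# only by the factor n / (n - 1).
same_up_to_factor = math.isclose(sample_var, population_var * n / (n - 1))

print(population_var, sample_var, same_up_to_factor)
```

Because the factor $n/(n-1)$ approaches 1 as $n$ grows, the choice between the two formulas matters most for small datasets.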

### Properties
- Variance is always non-negative.
- A higher variance indicates greater spread in the data.
- Variance is sensitive to outliers, as large deviations are squared.
- The square root of variance is the standard deviation.
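The properties listed above can be checked numerically. The following is a small sketch using the standard-library `statistics` module; the datasets `base` and `with_outlier` are made-up values for illustration.

```python
import math
import statistics

base = [10, 12, 11, 13, 12]    # hypothetical, tightly clustered values
with_outlier = base + [40]     # same values plus one extreme point

var_base = statistics.pvariance(base)
var_outlier = statistics.pvariance(with_outlier)
var_constant = statistics.pvariance([5, 5, 5])   # no spread at all
std_base = statistics.pstdev(base)

print(var_base >= 0)                               # non-negative
print(var_constant == 0)                           # zero when all values are equal
print(var_outlier > var_base)                      # one outlier inflates the variance
print(math.isclose(math.sqrt(var_base), std_base)) # sqrt(variance) = standard deviation
```

A single extreme value dominates the result because its deviation is squared before averaging.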

### Example
For the dataset [4, 7, 9], the mean is $20/3 \approx 6.67$.
- Deviations: $4-6.67 = -2.67$, $7-6.67 = 0.33$, $9-6.67 = 2.33$
- Squared deviations: $7.11$, $0.11$, $5.44$
- Population variance: $(7.11 + 0.11 + 5.44)/3 \approx 4.22$
- Sample variance: $(7.11 + 0.11 + 5.44)/2 \approx 6.33$
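The rounded figures in this example can be confirmed with exact arithmetic. The sketch below uses the standard-library `fractions.Fraction` type so that no rounding occurs until the final conversion to a decimal; the variable names are illustrative.

```python
from fractions import Fraction

data = [Fraction(4), Fraction(7), Fraction(9)]
n = len(data)

mean = sum(data) / n                        # 20/3
squared = [(x - mean) ** 2 for x in data]   # 64/9, 1/9, 49/9
population_var = sum(squared) / n           # 38/9, about 4.22
sample_var = sum(squared) / (n - 1)         # 19/3, about 6.33

print(mean, population_var, sample_var)
print(round(float(population_var), 2), round(float(sample_var), 2))
```

Working with exact fractions shows that small differences in the last decimal place of a hand calculation come from rounding the intermediate values, not from the method.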

Variance helps describe how data points are distributed around the mean and is widely used in statistical analysis.

# Variance using NumPy

In [1]:
import numpy as np
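
data = [4, 7, 9]

# ddof ("delta degrees of freedom") sets the divisor to n - ddof
sample_var = np.var(data, ddof=1)      # divide by n - 1
population_var = np.var(data, ddof=0)  # divide by n (NumPy's default)

print(f"Sample Variance:{sample_var}")
print(f"Population Variance:{population_var}")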

Sample Variance:6.333333333333333
Population Variance:4.222222222222222


# Variance using pandas

In [6]:
import pandas as pd

data = [4, 7, 9]
series = pd.Series(data)

# Unlike NumPy, pandas defaults to ddof=1 (sample variance)
sample_var = series.var()
population_var = series.var(ddof=0)

print(f"Sample Variance:{sample_var}")
print(f"Population Variance:{population_var}")


Sample Variance:6.333333333333333
Population Variance:4.222222222222222


# Variance using SciPy

In [7]:
from scipy import stats as st

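data = [4, 7, 9]

# tvar is SciPy's trimmed variance; with no limits given, nothing is
# trimmed and it reduces to the ordinary variance. Its default is ddof=1.
sample_var = st.tvar(data)
population_var = st.tvar(data, ddof=0)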

print(f"Sample Variance:{sample_var}")
print(f"Population Variance:{population_var}")


Sample Variance:6.333333333333334
Population Variance:4.222222222222222
