# Weighted Mean and Grouped Data
---

## Import Python Libraries

In [1]:
# import Python libraries
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from scipy import stats

## Left Align Cell Contents

In [2]:
%%html
<style>
table {float:left}
</style>

---

## Weighted Mean

The standard way of calculating the mean gives every data point equal weight in its contribution to the mean.  
However, depending on the data available, it is sometimes incorrect to assume an equal weight for every data point.  
In those cases the mean is calculated by assigning an appropriate weight to each data point.

The formula for the **weighted population mean** \($ \mu_w $\) is given by:

$ \mu_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i} $

where:
- \($ x_i $\) is the value of the \($ i $\)-th observation,
- \($ w_i $\) is the weight of the \($ i $\)-th observation,
- \($ N $\) is the total number of observations in the population.

The formula for the **weighted sample mean** \($ \bar{x}_w $\) is given by:

$ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} $

where:
- \($ x_i $\) is the value of the \($ i $\)-th observation,
- \($ w_i $\) is the weight of the \($ i $\)-th observation,
- \($ n $\) is the total number of observations.
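
As a minimal sketch of the weighted mean formula, the cell below computes it twice: once directly from the definition and once with NumPy's `np.average`, which accepts a `weights` argument. The `grades` and `credits` arrays are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical data (illustrative only): course grades and their credit-hour weights
grades = np.array([90.0, 80.0, 70.0])
credits = np.array([3.0, 4.0, 1.0])

# Weighted mean straight from the formula: sum(w_i * x_i) / sum(w_i)
weighted_mean_manual = np.sum(credits * grades) / np.sum(credits)

# The same quantity using NumPy's built-in weighted average
weighted_mean_numpy = np.average(grades, weights=credits)

# Unweighted mean, for comparison
unweighted_mean = grades.mean()

print(f"manual weighted mean: {weighted_mean_manual}")
print(f"np.average result:    {weighted_mean_numpy}")
print(f"unweighted mean:      {unweighted_mean}")
```

With these values the weighted mean (82.5) is higher than the unweighted mean (80.0) because the two higher grades carry most of the weight.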


---

## Grouped Data

When data is grouped, such as in a frequency table by buckets, the mean, variance and standard deviation are calculated in a manner similar to the weighted mean.  
In this case the midpoint of each bucket (or class), \($ M_i $\), is used as the value for all of the items in the bucket, and the frequency of each bucket, \($ f_i $\), is used as the weight.  
Because the midpoint stands in for the actual values in each bucket, the results are approximations of the statistics that would be obtained from the ungrouped data.

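In the formulas below, the sums run over the \($ k $\) classes, not over the individual observations.

The formula for the **grouped population mean** \($ \mu $\) is given by:

$ \mu = \frac{\sum_{i=1}^{k} f_i M_i}{N} $

The formula for the **grouped sample mean** \($ \bar{x} $\) is given by:

$ \bar{x} = \frac{\sum_{i=1}^{k} f_i M_i}{n} $

The formula for the **grouped population variance** \($ \sigma^2 $\) is given by:

$ \sigma^2 = \frac{\sum_{i=1}^{k} f_i (M_i - \mu)^2}{N} $

The formula for the **grouped sample variance** \($ s^2 $\) is given by:

$ s^2 = \frac{\sum_{i=1}^{k} f_i (M_i - \bar{x})^2}{n - 1} $

where:
- \($ M_i $\) is the midpoint of the \($ i $\)-th class,
- \($ f_i $\) is the frequency of the \($ i $\)-th class,
- \($ k $\) is the number of classes,
- \($ N = \sum_{i=1}^{k} f_i $\) is the total number of observations in the population,
- \($ n = \sum_{i=1}^{k} f_i $\) is the total number of observations in the sample.

The grouped standard deviation is the square root of the corresponding variance.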
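
The grouped formulas above can be sketched with NumPy as follows. The frequency table (`lower`, `upper`, `freq`) is hypothetical and exists only to illustrate the arithmetic; the variable names are assumptions, not part of any library.

```python
import numpy as np

# Hypothetical frequency table (illustrative only): class limits and class counts
lower = np.array([0.0, 10.0, 20.0, 30.0])
upper = np.array([10.0, 20.0, 30.0, 40.0])
freq = np.array([2, 5, 2, 1])

# Midpoint M_i of each class and total number of observations
midpoints = (lower + upper) / 2
n = freq.sum()

# Grouped mean: sum(f_i * M_i) / n
grouped_mean = np.sum(freq * midpoints) / n

# Frequency-weighted sum of squared deviations from the grouped mean
weighted_sq_dev = np.sum(freq * (midpoints - grouped_mean) ** 2)

# Population variance divides by n; sample variance divides by n - 1
grouped_pop_var = weighted_sq_dev / n
grouped_sample_var = weighted_sq_dev / (n - 1)

# Standard deviations are the square roots of the variances
grouped_pop_std = np.sqrt(grouped_pop_var)
grouped_sample_std = np.sqrt(grouped_sample_var)

print(f"grouped mean:            {grouped_mean}")
print(f"grouped population var:  {grouped_pop_var}")
print(f"grouped sample var:      {grouped_sample_var}")
```

One way to sanity-check the result is to expand the table with `np.repeat(midpoints, freq)` and pass the expanded array to `np.mean` and `np.var` (with `ddof=1` for the sample variance); the values should agree with the grouped formulas.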

---