# Covariance
---

## Import Python Libraries

In [1]:
# import Python libraries
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from scipy import stats

## Left Align Cell Contents

In [2]:
%%html
<style>
table {float:left}
</style>

---

## Covariance

Covariance measures how two variables vary together: whether they tend to move in the same direction or in opposite directions.

Like many statistical measures, covariance has one formula for a population and another for a sample.

The formula for **population covariance** $\sigma_{XY}$ between two variables $X$ and $Y$ is:

$$ \sigma_{XY} = \frac{\sum_{i=1}^{N} (X_i - \mu_X)(Y_i - \mu_Y)}{N} $$

where:
- $N$ is the number of data points
- $X_i$ and $Y_i$ are the individual data points of variables $X$ and $Y$
- $\mu_X$ and $\mu_Y$ are the means of variables $X$ and $Y$ respectively

The formula for **sample covariance** $s_{XY}$ between two variables $X$ and $Y$ is:

$$ s_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} $$

where:
- $n$ is the number of data points in the sample
- $X_i$ and $Y_i$ are the individual data points of variables $X$ and $Y$
- $\bar{X}$ and $\bar{Y}$ are the sample means of variables $X$ and $Y$ respectively
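As a minimal sketch of the two formulas, the cell below computes both versions by hand and compares them with `np.cov`. The arrays `x` and `y` and the helper `covariance` are hypothetical, made up for illustration only.

```python
import numpy as np

# Hypothetical data, chosen only to keep the arithmetic easy to follow
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])


def covariance(a, b, sample=True):
    """Covariance of two equal-length 1-D arrays (illustrative helper).

    sample=True divides by n - 1 (sample covariance);
    sample=False divides by N (population covariance).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("a and b must have the same shape")
    # Sum of products of deviations from each variable's mean
    cross = np.sum((a - a.mean()) * (b - b.mean()))
    divisor = a.size - 1 if sample else a.size
    return cross / divisor


pop_cov = covariance(x, y, sample=False)
sample_cov = covariance(x, y, sample=True)

# np.cov returns a 2x2 covariance matrix; the off-diagonal entry is cov(x, y).
# ddof=0 divides by N, ddof=1 divides by n - 1.
np_pop_cov = np.cov(x, y, ddof=0)[0, 1]
np_sample_cov = np.cov(x, y, ddof=1)[0, 1]

print(f"population covariance: {pop_cov:.2f} (np.cov: {np_pop_cov:.2f})")
print(f"sample covariance:     {sample_cov:.2f} (np.cov: {np_sample_cov:.2f})")
```

For this data the sum of deviation products is 16, so the population covariance is 16 / 5 = 3.2 and the sample covariance is 16 / 4 = 4.0.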



Covariance can be interpreted as follows:

- A value greater than 0 means that the variables tend to move in the same direction: when one is above its mean, the other tends to be above its mean as well.
- A value less than 0 means that the variables tend to move in opposite directions: as one increases, the other tends to decrease, and vice versa.
- A value of 0 (or close to it) means that the variables have no discernible linear relationship. They may still be related in a nonlinear way.
- Covariance shows the direction of the relationship, not its strength, because its magnitude depends on the scale of the variables.
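The sign rules above can be illustrated with a small sketch. All arrays here are hypothetical and constructed so the outcome is easy to verify. The last case shows that a covariance of zero rules out only a linear relationship: `y_quad` is fully determined by `x`, yet their covariance is zero.

```python
import numpy as np

# Hypothetical, deterministic data centered on zero
x = np.arange(-5, 6, dtype=float)   # -5, -4, ..., 5

y_same = 2 * x + 1       # moves in the same direction as x
y_opposite = -3 * x + 5  # moves in the opposite direction to x
y_quad = x ** 2          # depends on x, but not linearly

# Sample covariance (np.cov divides by n - 1 by default)
cov_same = np.cov(x, y_same)[0, 1]
cov_opposite = np.cov(x, y_opposite)[0, 1]
cov_quad = np.cov(x, y_quad)[0, 1]

print(f"same direction:     {cov_same:.1f}")      # positive
print(f"opposite direction: {cov_opposite:.1f}")  # negative
print(f"quadratic:          {cov_quad:.1f}")      # zero up to rounding
```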

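Covariance is not scale-invariant: its value depends on the units of measurement. For example, converting $X$ from meters to centimeters multiplies the covariance by 100, even though the relationship between the variables is unchanged.  

Therefore, in general, it is better to use the correlation coefficient, which divides the covariance by the product of the two standard deviations, to determine the strength of the linear relationship between two variables.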
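The unit dependence can be checked directly. The sketch below uses made-up height and weight values (the names `height_m` and `weight_kg` are assumptions for illustration) and shows that switching from meters to centimeters rescales the covariance but leaves the Pearson correlation coefficient unchanged.

```python
import numpy as np

# Hypothetical measurements, for illustration only
height_m = np.array([1.60, 1.65, 1.70, 1.75, 1.80])
weight_kg = np.array([55.0, 60.0, 63.0, 70.0, 72.0])

# The same heights expressed in centimeters
height_cm = height_m * 100

# Sample covariance changes by the conversion factor (100)
cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]

# Pearson correlation is unaffected by the change of units
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

# Correlation is covariance divided by the product of the standard deviations
corr_manual = cov_m / (height_m.std(ddof=1) * weight_kg.std(ddof=1))

print(f"covariance  (m, kg):  {cov_m:.4f}")
print(f"covariance  (cm, kg): {cov_cm:.4f}")
print(f"correlation (m, kg):  {corr_m:.4f}")
print(f"correlation (cm, kg): {corr_cm:.4f}")
```

Because correlation always lies between -1 and 1, it can be compared across pairs of variables measured in different units, which raw covariance cannot.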

---