<a href="https://colab.research.google.com/github/drdww/OPIM5641/blob/main/Module6/M6_3/2_Covariance_and_Correlation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Covariance vs. Correlation
**OPIM 5641: Business Decision Modeling - University of Connecticut**

-----------------------------------------------

Let's explore the difference between covariance and correlation. They are two different animals!

* Original content from: https://www.investopedia.com/articles/financial-theory/11/calculating-covariance.asp
* An example with some more math: https://mathcs.clarku.edu/~djoyce/ma217/covar.pdf

In [None]:
import numpy as np

Imagine your dataset looked like this... 5 days of stock trading data.

Day|Stock1 (x) |Stock2 (y)|
---|---|---|
1| 1.1%|3.0% |
2| 1.7%|4.2% |
3| 2.1%| 4.9% |
4| 1.4%| 4.1%|
5| 0.2%| 2.5%|

In [None]:
x = [1.1, 1.7, 2.1, 1.4, 0.2]
# y = [1.1, 1.7, 2.1, 1.4, 0.2]
y = [3.0, 4.2, 4.9, 4.1, 2.5]
print('Stock1 (x):', x)
print('Stock2 (y):', y) 

Stock1 (x): [1.1, 1.7, 2.1, 1.4, 0.2]
Stock2 (y): [3.0, 4.2, 4.9, 4.1, 2.5]


In [None]:
# compute the averages
print('mean(Stock1):', np.mean(x))
print('mean(Stock2):',np.mean(y))

mean(Stock1): 1.3000000000000003
mean(Stock2): 3.7400000000000007


# Covariance

## Covariance in Portfolio Management
Covariance applied to a portfolio can help determine what assets to include in the portfolio. It measures whether stocks move in the same direction (a positive covariance) or in opposite directions (a negative covariance). When constructing a portfolio, a portfolio manager will select stocks that work well together, which usually means these stocks' returns would not move in the same direction.
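To see why co-movement matters for a portfolio, here is a rough sketch using the standard two-asset variance identity, $\text{var}(w_x x + w_y y) = w_x^2\,\text{var}(x) + w_y^2\,\text{var}(y) + 2 w_x w_y\,\text{cov}_{x,y}$. The 50/50 weights are an assumption made only for illustration; the returns are the five days from the table above.

```python
import numpy as np

x = np.array([1.1, 1.7, 2.1, 1.4, 0.2])  # Stock1 returns (percent)
y = np.array([3.0, 4.2, 4.9, 4.1, 2.5])  # Stock2 returns (percent)
w_x, w_y = 0.5, 0.5  # assumed equal weights (illustrative only)

cov = np.cov(x, y)  # 2x2 sample covariance matrix

# portfolio variance from the identity above
var_formula = w_x**2 * cov[0, 0] + w_y**2 * cov[1, 1] + 2 * w_x * w_y * cov[0, 1]

# variance of the blended daily returns, computed directly
var_direct = np.var(w_x * x + w_y * y, ddof=1)

print(np.isclose(var_formula, var_direct))  # True
```

Because the covariance term enters with a plus sign, a smaller (or negative) covariance lowers the portfolio's variance for the same weights.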

## Formula

$\text{cov}_{x,y}=\frac{\sum_{i=1}^{N}(x_{i}-\bar{x})(y_{i}-\bar{y})}{N-1}$

For each day, we take the difference between x's return and x's average return and multiply it by the difference between y's return and y's average return. Then we sum these products over all days.
Finally, we divide the sum by the sample size minus one (because it's a sample).
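As a minimal sketch, the formula can be worked out by hand in plain Python using the five daily returns from the table above. The variable names (`x_bar`, `y_bar`, `cross_products`, `cov_xy`) are illustrative and not part of the original notebook.

```python
# Stock returns (in percent) from the table above
x = [1.1, 1.7, 2.1, 1.4, 0.2]
y = [3.0, 4.2, 4.9, 4.1, 2.5]

n = len(x)
x_bar = sum(x) / n  # average return of Stock1
y_bar = sum(y) / n  # average return of Stock2

# product of the paired deviations from each stock's own mean
cross_products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]

# sum the products and divide by N - 1 (sample covariance)
cov_xy = sum(cross_products) / (n - 1)
print(round(cov_xy, 3))  # 0.665
```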

In [None]:
# if you run the code like this, you get the covariance matrix
np.cov(x, y)

array([[0.515, 0.665],
       [0.665, 0.943]])

What are the elements of the covariance matrix? 



```
cov(x,x)  cov(x,y)

cov(y,x)  cov(y,y)
```
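One way to check this layout, as a small sketch: the diagonal entries are each stock's covariance with itself, which is its sample variance. `np.var` with `ddof=1` uses the same N-1 denominator that `np.cov` uses by default.

```python
import numpy as np

x = [1.1, 1.7, 2.1, 1.4, 0.2]
y = [3.0, 4.2, 4.9, 4.1, 2.5]

cov_matrix = np.cov(x, y)

# diagonal: cov(x,x) and cov(y,y) are the sample variances (N - 1 denominator)
print(np.isclose(cov_matrix[0, 0], np.var(x, ddof=1)))  # True
print(np.isclose(cov_matrix[1, 1], np.var(y, ddof=1)))  # True

# off-diagonal: cov(x,y) and cov(y,x) are the same number
print(np.isclose(cov_matrix[0, 1], cov_matrix[1, 0]))   # True
```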



**On your own:** Don't believe me? Try reworking the data and running `np.cov(x, x)` and see if you get a matrix with four identical entries!

If you want the covariance of (x,y), then try this syntax. It will pull the second element from the top row.

In [None]:
np.cov(x,y)[0][1] # tada!

0.6650000000000001

In [None]:
# just to be clear, cov(x,y) is the same as cov(y,x)
np.cov(y,x)[0][1] 

0.6650000000000001

**On your own:** Instead of calling `np.cov()`, you can work out the entire formula yourself!

## Interpreting Covariance
The covariance between the two stock returns is 0.665. Because this number is positive, the stocks tend to move in the same direction. In other words, when Stock 1 (x) had a higher-than-average return, Stock 2 (y) tended to have one as well.

## Properties of Covariance

In the example, the covariance is positive, so the two stocks tend to move together: when one stock's return is above its average, the other's tends to be above its average as well. If the result were negative, the two stocks would tend to move in opposite directions: when one was above its average, the other would tend to be below its average.

**On your own:** Replace the data for Stock2 with $-1 \times \text{Stock1}$ and calculate the covariance. If the stocks move perfectly in opposite directions, the covariance should be negative!

In [None]:
x = [1.1, 1.7, 2.1, 1.4, 0.2]
y = [-1.1, -1.7, -2.1, -1.4, -0.2]
print('Stock1 (x):', x)
print('Stock2 (y):', y) 

Stock1 (x): [1.1, 1.7, 2.1, 1.4, 0.2]
Stock2 (y): [-1.1, -1.7, -2.1, -1.4, -0.2]


In [None]:
np.cov(x,y)[0][1] # tada! this should be negative if they move perfectly OPPOSITE

-0.515

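When x and y are independent, their (population) covariance is equal to 0. In a finite sample, the computed covariance will usually be close to 0 rather than exactly 0.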
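The reverse does not hold: a covariance of 0 does not imply independence, because covariance only picks up linear co-movement. Here is a small sketch with made-up numbers (not stock data), where `v` is fully determined by `u` and yet their covariance is 0.

```python
import numpy as np

# hypothetical values chosen to be symmetric around 0
u = [-2, -1, 0, 1, 2]
v = [ui ** 2 for ui in u]  # v depends entirely on u, but not linearly

cov_uv = np.cov(u, v)[0][1]
print(abs(cov_uv) < 1e-12)  # True: the covariance is (numerically) zero
```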

# Correlation
... is just covariance(x,y) divided by the product of the standard deviations of x and y!

## Formula
$\text{cor}_{x,y} = \frac{\text{cov}_{x,y}}{\sigma_x \sigma_y}$


The correlation between two variables is the covariance between the variables divided by the product of their standard deviations. While both measures reveal whether two variables are positively or inversely related, the correlation provides additional information by measuring the degree to which the variables move together. The correlation always lies between -1 and 1, so it captures the strength of the linear relationship as well as its direction.
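As a sketch of the formula, the correlation can be assembled from the covariance and the two sample standard deviations, then compared with `np.corrcoef`. This uses the original returns from the table (not the negated Stock2 from the exercise above), and `ddof=1` so that the standard deviations use the same N-1 denominator as `np.cov`.

```python
import numpy as np

x = [1.1, 1.7, 2.1, 1.4, 0.2]
y = [3.0, 4.2, 4.9, 4.1, 2.5]

cov_xy = np.cov(x, y)[0][1]  # sample covariance (N - 1 denominator)
sd_x = np.std(x, ddof=1)     # sample standard deviation of Stock1
sd_y = np.std(y, ddof=1)     # sample standard deviation of Stock2

cor_xy = cov_xy / (sd_x * sd_y)
print(round(cor_xy, 4))                              # 0.9543
print(np.isclose(cor_xy, np.corrcoef(x, y)[0][1]))   # True
```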

In [None]:
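y = [3.0, 4.2, 4.9, 4.1, 2.5] # restore the original Stock2 returns (y was overwritten with -x above)
np.corrcoef(x,y) # again, this is the correlation MATRIX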

array([[1.        , 0.95425003],
       [0.95425003, 1.        ]])

In [None]:
# you are interested in the correlation between x and y, not x and x OR y and y
np.corrcoef(x,y)[0][1]

0.9542500347004005
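One practical consequence of dividing by the standard deviations, sketched here: covariance depends on the units of the data, while correlation does not. The example re-expresses the same returns as decimals instead of percentages; the names `x_dec` and `y_dec` are illustrative.

```python
import numpy as np

x = np.array([1.1, 1.7, 2.1, 1.4, 0.2])  # returns in percent
y = np.array([3.0, 4.2, 4.9, 4.1, 2.5])

x_dec, y_dec = x / 100, y / 100          # the same returns as decimals

cov_pct = np.cov(x, y)[0][1]
cov_dec = np.cov(x_dec, y_dec)[0][1]     # shrinks by a factor of 100 * 100
cor_pct = np.corrcoef(x, y)[0][1]
cor_dec = np.corrcoef(x_dec, y_dec)[0][1]  # unchanged by the rescaling

print(cov_pct, cov_dec)
print(cor_pct, cor_dec)
```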

# Summary
If the correlation is 1, the stocks move perfectly together, and if the correlation is -1, they move perfectly in opposite directions. If the correlation is 0, there is no linear relationship between the two stocks' returns. In short, covariance tells you the direction in which two variables move together, but its size depends on the units of the data; correlation rescales it to a value between -1 and 1, so it also tells you how strongly they move together.

Read these for more on the efficient frontier!

* https://www.investopedia.com/ask/answers/040315/how-does-covariance-impact-portfolio-risk-and-return.asp
* https://www.investopedia.com/terms/e/efficientfrontier.asp