## Calculating Covariance and Correlation

Consider a portfolio composed of *Walmart* and *Facebook*. Do you expect the returns of these two companies to show high or low covariance? Can you guess their correlation: will it be closer to 0 or closer to 1?

Begin by extracting data for Walmart and Facebook from the 1st of January 2014 until today.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os

In [2]:
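data_dir = '../dataset'  # folder holding one CSV file per ticker
adj_closes = pd.DataFrame()
tickers = ['FB', 'WMT']
for t in tickers:
    # Each CSV is indexed by date; keep only the adjusted closing price
    adj_closes[t] = pd.read_csv(os.path.join(data_dir, t + '.csv'), index_col=0)['Adj Close']
adj_closes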

Date,FB,WMT
2014-01-02,54.709999,78.910004
2014-01-03,54.560001,78.650002
2014-01-06,57.200001,78.209999
2014-01-07,57.919998,78.449997
2014-01-08,58.230000,77.830002
2014-01-09,57.220001,78.089996
2014-01-10,57.939999,78.040001
2014-01-13,55.910000,77.489998
2014-01-14,57.740002,77.959999
2014-01-15,57.599998,77.660004


In [3]:
log_returns = np.log(adj_closes / adj_closes.shift(1))
print(log_returns)

                  FB       WMT
Date                          
2014-01-02       NaN       NaN
2014-01-03 -0.002745 -0.003300
2014-01-06  0.047253 -0.005610
2014-01-07  0.012509  0.003064
2014-01-08  0.005338 -0.007934
2014-01-09 -0.017497  0.003335
2014-01-10  0.012504 -0.000640
2014-01-13 -0.035665 -0.007073
2014-01-14  0.032207  0.006047
2014-01-15 -0.002428 -0.003855
2014-01-16 -0.007143 -0.011657
2014-01-17 -0.015685 -0.007453
2014-01-21  0.038503 -0.004604
2014-01-22 -0.017239 -0.006482
2014-01-23 -0.015420 -0.005189
2014-01-24 -0.039256 -0.007230
2014-01-27 -0.016667 -0.003635
2014-01-28  0.029260  0.006988
2014-01-29 -0.029633 -0.007663
2014-01-30  0.131942  0.008734
2014-01-31  0.024101 -0.000937
2014-02-03 -0.017574 -0.027421
2014-02-04  0.020447  0.000963
2014-02-05 -0.008964  0.001923
2014-02-06 -0.000482 -0.000686
2014-02-07  0.034159  0.012690
2014-02-10 -0.012044  0.000136
2014-02-11  0.020250  0.014001
2014-02-12 -0.006187  0.002137
2014-02-13  0.043716  0.005322
...     
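
As a quick sanity check on the log-return formula used in the cell above, the sketch below recomputes the first two returns from the first three adjusted closes shown in the price table. Only those three rows are used, so this is a spot check rather than a reproduction of the full series; the names `prices`, `ratio_form` and `diff_form` are illustrative and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

# First three adjusted closes, copied from the price table above.
prices = pd.DataFrame(
    {"FB": [54.709999, 54.560001, 57.200001],
     "WMT": [78.910004, 78.650002, 78.209999]},
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

# Log return: ln(P_t / P_{t-1}). The first row has no previous price, so it is NaN.
ratio_form = np.log(prices / prices.shift(1))

# Equivalent form: the difference of consecutive log prices.
diff_form = np.log(prices).diff()

print(ratio_form.round(6))
```

Both forms should agree, and the non-NaN rows should match the first two rows of the printed `log_returns` output to the displayed precision.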

Repeat the process we went through in the lecture for these two stocks. How would you explain the difference between their means and their standard deviations?

In [4]:
log_returns.mean() * 250

FB     0.287455
WMT   -0.024093
dtype: float64

In [5]:
log_returns.std() * 250 ** 0.5

FB     0.287901
WMT    0.177903
dtype: float64
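
The two cells above annualize with different factors: the mean is multiplied by 250, while the standard deviation is multiplied by the square root of 250. This follows from the usual assumption that daily returns are independent, so variances (not standard deviations) add up across days. The sketch below simply reverses the scaling on the figures printed above to recover the implied daily values; the variable names are illustrative, and 250 trading days per year is the notebook's own convention.

```python
import math

TRADING_DAYS = 250  # same convention as the cells above

# Annualized figures copied from the outputs above.
annual_mean = {"FB": 0.287455, "WMT": -0.024093}
annual_std = {"FB": 0.287901, "WMT": 0.177903}

# The mean scales linearly with time; the standard deviation scales with its square root.
daily_mean = {t: m / TRADING_DAYS for t, m in annual_mean.items()}
daily_std = {t: s / math.sqrt(TRADING_DAYS) for t, s in annual_std.items()}

for t in annual_mean:
    print(f"{t}: daily mean {daily_mean[t]:.6f}, daily std {daily_std[t]:.6f}")
```

Seen at the daily level, the average return is small relative to the daily standard deviation for both stocks, which is worth keeping in mind when comparing their means.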

***

## Covariance and Correlation


\begin{eqnarray*}
\text{Covariance matrix:} \quad
\Sigma = \begin{bmatrix}
        \sigma_{1}^2 & \sigma_{12} & \dots & \sigma_{1I} \\
        \sigma_{21} & \sigma_{2}^2 & \dots & \sigma_{2I} \\
        \vdots & \vdots & \ddots & \vdots \\
        \sigma_{I1} & \sigma_{I2} & \dots & \sigma_{I}^2
    \end{bmatrix}
\end{eqnarray*}

Annualized covariance matrix (daily covariances multiplied by 250 trading days):

In [7]:
log_returns.cov()*250

,FB,WMT
FB,0.082887,0.008832
WMT,0.008832,0.031649


Correlation matrix (correlation is scale-free, so it is not multiplied by 250):

In [8]:
log_returns.corr()

,FB,WMT
FB,1.0,0.172437
WMT,0.172437,1.0
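
The correlation matrix can be recovered from the covariance matrix by dividing each covariance by the product of the two standard deviations. The sketch below does this with the annualized covariance figures printed above; because the annualization factor appears in both the numerator and the denominator, it cancels out. The names `cov`, `std` and `corr` are illustrative, and small differences in the last decimals are expected because the inputs are rounded to six places.

```python
import numpy as np

# Annualized covariance matrix copied from the output above (order: FB, WMT).
cov = np.array([[0.082887, 0.008832],
                [0.008832, 0.031649]])

# Standard deviations are the square roots of the diagonal (the variances).
std = np.sqrt(np.diag(cov))

# corr_ij = cov_ij / (std_i * std_j)
corr = cov / np.outer(std, std)

print(np.round(std, 6))
print(np.round(corr, 4))
```

The diagonal also ties the earlier results together: the square roots of the variances should reproduce the annualized standard deviations computed with `log_returns.std() * 250 ** 0.5`.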


Would you consider investing in such a portfolio?