## Hey, Python techies!!!

The first step in the identification of a Markowitz portfolio is the definition of the investment opportunity set. The investment opportunity set consists of the expected returns of all assets and the covariance matrix of these assets. We will now see how to calculate these.

## Let's directly jump into the code

As always, we first import standard packages.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Next, we read in some price data and calculate monthly log returns from the provided monthly price observations. In the last line of the cell, we drop the row with the NaN return entries that emerged from shifting the prices in the return calculation.

In [2]:
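# Load the monthly price observations; 'Date' is parsed as a datetime column.
prices = pd.read_csv("prices.csv", parse_dates=['Date'])
# Monthly log return: log(p_t) - log(p_{t-1}).
prices.loc[:, 'SP500_return'] = np.log(prices.loc[:, 'SP500']) - np.log(prices.loc[:, 'SP500'].shift())
prices.loc[:, 'VBTIX_return'] = np.log(prices.loc[:, 'VBTIX']) - np.log(prices.loc[:, 'VBTIX'].shift())
# Drop rows containing NaN, in particular the first row, which has no earlier price.
prices = prices.dropna()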
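To make the return calculation easier to follow, here is a minimal sketch on a made-up price series. The DataFrame `toy` and its prices are hypothetical and are not taken from `prices.csv`. It shows where the NaN row comes from and how log returns relate to simple returns.

```python
import numpy as np
import pandas as pd

# Hypothetical month-end prices, used only for illustration.
toy = pd.DataFrame({"price": [100.0, 105.0, 99.75, 110.0]})

# Log return: difference of consecutive log prices, as in the cell above.
toy["log_return"] = np.log(toy["price"]) - np.log(toy["price"].shift())

# Simple return for comparison: p_t / p_{t-1} - 1.
toy["simple_return"] = toy["price"].pct_change()

# The first row has no previous price, so its return is NaN.
n_missing = int(toy["log_return"].isna().sum())

# dropna() removes that first row.
toy = toy.dropna()

# Both return types are linked by log_return = log(1 + simple_return).
max_gap = float((toy["log_return"] - np.log1p(toy["simple_return"])).abs().max())

print(toy)
print("rows dropped:", n_missing, "| max gap:", max_gap)
```

For small monthly moves the two return types are close, but they are not identical.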

An unbiased estimator for the expected return is just the mean of observed returns. Hence, we calculate this mean of our return time series right away.

In [9]:
mu = prices[['SP500_return', 'VBTIX_return']].mean()

mu

SP500_return    0.010670
VBTIX_return    0.000075
dtype: float64

The estimates correspond to expected returns over one month. Usually, we want to work with a yearly time horizon. Because log returns add up over time, we can annualize the 'mu' estimates by multiplying them by 12 (there are 12 months in a year). This gives the expected return component of the investment opportunity set.

In [10]:
mu = 12 * mu
mu

SP500_return    0.128036
VBTIX_return    0.000905
dtype: float64
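The factor of 12 works because log returns are additive over time. The following sketch checks this on a made-up price path; the array `prices_toy` is hypothetical and only serves the illustration.

```python
import numpy as np

# Hypothetical month-end prices covering one year: 13 prices give 12 returns.
prices_toy = np.array(
    [100, 102, 101, 105, 107, 104, 108, 111, 109, 113, 116, 115, 120],
    dtype=float,
)

# Twelve monthly log returns.
monthly_log_returns = np.diff(np.log(prices_toy))

# Log return over the whole year, computed directly from first and last price.
yearly_log_return = np.log(prices_toy[-1] / prices_toy[0])

# The monthly log returns sum to the yearly log return ...
sum_of_months = monthly_log_returns.sum()

# ... so 12 times the monthly mean gives the same number.
annualized_mean = 12 * monthly_log_returns.mean()

print(yearly_log_return, sum_of_months, annualized_mean)
```

Simple returns do not add up this way, which is one reason to use log returns here.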

For the covariance matrix, we make use of numpy's 'cov' function. By default, that function treats each row of a given matrix or DataFrame as one variable and each column as one observation. Since our return series are stored in columns, we need to transpose the DataFrame by calling the 'T' attribute before passing it to the 'cov' function.

In [5]:
covariance = np.cov(prices[['SP500_return', 'VBTIX_return']].T)
covariance

array([[ 1.01135863e-03, -1.99263382e-05],
       [-1.99263382e-05,  6.44715109e-05]])
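As a side note, transposing is one of several ways to get the same matrix. The sketch below uses randomly generated returns (a stand-in, not the real price data) to compare the transposed call with numpy's `rowvar=False` argument and with the pandas `DataFrame.cov` method.

```python
import numpy as np
import pandas as pd

# Randomly generated stand-in for the two monthly return columns.
rng = np.random.default_rng(0)
returns = pd.DataFrame(
    rng.normal(0.0, 0.03, size=(60, 2)),
    columns=["SP500_return", "VBTIX_return"],
)

# Variant 1: transpose so that each row is one variable (as in the cell above).
cov_transposed = np.cov(returns.to_numpy().T)

# Variant 2: tell numpy that the variables are stored in columns.
cov_rowvar = np.cov(returns.to_numpy(), rowvar=False)

# Variant 3: let pandas compute it; the result keeps the column labels.
cov_pandas = returns.cov()

print(cov_pandas)
```

All three variants use the sample covariance with an `N - 1` denominator by default, so the results agree.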

The covariance, too, needs to be annualized now. Again, we do so by multiplying it by 12. This scaling assumes that monthly returns are serially uncorrelated. After that, we also have the covariance component of the investment opportunity set and we're ready to proceed with the identification of Markowitz portfolios.

In [6]:
covariance = covariance * 12
covariance

array([[ 0.0121363 , -0.00023912],
       [-0.00023912,  0.00077366]])
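The annualized covariance matrix is easier to read once it is split into volatilities and a correlation. The sketch below starts from the rounded values printed above; the variable names are chosen for this example only.

```python
import numpy as np

# Annualized covariance matrix, copied from the rounded output above.
covariance_annual = np.array(
    [[0.0121363, -0.00023912],
     [-0.00023912, 0.00077366]]
)

# Annualized volatility: square root of the variances on the diagonal.
volatility = np.sqrt(np.diag(covariance_annual))

# Correlation matrix: each covariance divided by the product of volatilities.
correlation = covariance_annual / np.outer(volatility, volatility)

# Scaling the covariance by a constant leaves the correlation unchanged,
# so the monthly and the annualized matrix share the same correlation.
covariance_monthly = covariance_annual / 12
vol_monthly = np.sqrt(np.diag(covariance_monthly))
correlation_monthly = covariance_monthly / np.outer(vol_monthly, vol_monthly)

print(volatility)
print(correlation)
```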

MU: to do:
1. Are monthly log returns Gaussian?
2. Are monthly simple returns more or less Gaussian than monthly log returns?
3. Is the covariance matrix for log returns different from the one for simple returns?
4. Plot the bond and SP500 monthly return time series and spot (by eyeballing) periods of extreme movements (jumps or volatility clustering).
5. State the annualized Sharpe ratio for a 100% stock investment, a 100% bond investment, and an equal-weight portfolio.
6. Is the log return of an equal-weight portfolio Gaussian?
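For item 5, a possible starting point is sketched below. It reuses the rounded annualized estimates printed above and rests on two assumptions that are not part of the original notebook: a risk-free rate of zero, and a portfolio expected return approximated by the weighted average of the assets' expected log returns (this is exact only for simple returns). The helper `sharpe_ratio` is a hypothetical name introduced for this example.

```python
import numpy as np

# Annualized estimates, copied from the rounded outputs above
# (order: SP500, VBTIX).
mu_annual = np.array([0.128036, 0.000905])
covariance_annual = np.array(
    [[0.0121363, -0.00023912],
     [-0.00023912, 0.00077366]]
)

# Assumption: risk-free rate of zero (not given in the notebook).
RISK_FREE_RATE = 0.0


def sharpe_ratio(weights, mu, cov, risk_free=RISK_FREE_RATE):
    """Hypothetical helper: excess return divided by volatility."""
    weights = np.asarray(weights, dtype=float)
    # Assumption: portfolio return approximated as weighted average of mu.
    expected_return = weights @ mu
    # Portfolio variance: w' * Sigma * w.
    volatility = np.sqrt(weights @ cov @ weights)
    return (expected_return - risk_free) / volatility


sharpe_stock = sharpe_ratio([1.0, 0.0], mu_annual, covariance_annual)
sharpe_bond = sharpe_ratio([0.0, 1.0], mu_annual, covariance_annual)
sharpe_equal = sharpe_ratio([0.5, 0.5], mu_annual, covariance_annual)

print(sharpe_stock, sharpe_bond, sharpe_equal)
```

With a non-zero risk-free rate, only the `risk_free` argument changes.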