# Lecture 11 5/10 Markowitz Portfolio Covariance Estimation

# Markowitz Portfolio Optimization

We continue to consider the portfolio optimization problem. In the previous notebook, we estimated the expected return and volatility by computing the sample mean and covariance matrix of log-returns.

## Backtesting

A portfolio selection strategy can be evaluated by backtesting. At any given point in time $t$, one treats the *historical* data at times $t-1, t-2, \dots$ as known. The investment strategy is developed using only this historical data, and is then tested on current and future data at times $t, t+1, t+2, \dots$.

Suppose we are currently at time $t$. We have recorded returns $r_k^{(i)}$ for times $k = t-1, t-2, \dots$ and stocks $i=1,2,\dots, s$. What we would like to compute are the portfolio allocations $w_{t}^{(i)}$ for all $i$.

Then, we simulate earning the returns on future data while holding our portfolio constant for a period of length $T$. Suppose we have computed $w_{t}=(w_{t}^{(1)}, w_{t}^{(2)}, \dots, w_{t}^{(s)})^\intercal$ based on historical data. The return we earn by investing with this strategy is
$$ r_{p,k} = r_k^\intercal w_t, $$
where $k=t, t+1,\dots, t+T$ and $r_k = (r_k^{(1)}, r_k^{(2)}, \dots, r_k^{(s)})^\intercal$.

Note that the earnings are proportional to the net worth (the dollar amount of the portfolio). So, let $W_{t}$ be the net worth of the portfolio at time $t$. Given $W_t$ and the allocation $w_t$, the portfolio is worth $W_{t+1}$ at the next time period. More generally, for any time $k\in [t, t+T]$,
$$ W_{k+1} = W_k \cdot (1 + r_k^\intercal w_t) $$
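
The recursion can be checked on a tiny example. The following is a minimal sketch, not part of the original notebook: the function name `simulate_wealth` and the return values are made up for illustration. It treats $r_k$ as simple returns; log-returns are approximately equal to simple returns when they are small.

```python
import numpy as np

def simulate_wealth(returns, weights, initial_wealth=1.0):
    """Apply W_{k+1} = W_k * (1 + r_k^T w_t) with the weights held fixed.

    returns : array of shape (T, s), one row of asset returns per period
    weights : array of shape (s,), the allocation w_t (assumed to sum to 1)
    Returns an array of length T + 1 that starts at initial_wealth.
    """
    returns = np.asarray(returns, dtype=float)
    weights = np.asarray(weights, dtype=float)
    portfolio_returns = returns @ weights          # r_{p,k} for each period k
    growth = np.cumprod(1.0 + portfolio_returns)   # running product of (1 + r_{p,k})
    return initial_wealth * np.concatenate(([1.0], growth))

# Made-up returns for s = 2 stocks over 3 periods (illustrative only).
r = np.array([[0.10, 0.00],
              [0.00, 0.10],
              [-0.10, -0.10]])
w = np.array([0.5, 0.5])
wealth = simulate_wealth(r, w, initial_wealth=100.0)
print(wealth)
```

The portfolio returns here are 5%, 5% and -10%, so the net worth moves from 100 to 105, then 110.25, then 99.225.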


## Dealing with timeseries data and non-stationarity

In our previous notebook, we implicitly assumed that the market is stationary, and we used *all* available data when computing $\mu$ and $\Sigma$. However, at any given point in time $t$, we only have the *past* data at our disposal, and we don't want to use data from too far back because of non-stationarity: the market has probably changed since then.

Using all available data to estimate $\mu$ and $\Sigma$ amounts to assuming that the returns come from a single Gaussian distribution. (Recall that we need to supply estimates of these parameters as input to the optimization problem.)

There are two opposing forces here:
* Non-stationarity of the market implies that the most recent data is the most relevant
* Statistical estimation is more stable when we use more historical data

Dealing with non-stationarity when using a method that assumes stationarity is challenging. However, over a sufficiently short period of time, we hope that the stationarity assumption is approximately true. So, we can pick a few candidate values of $N$, the number of historical data points used to estimate $\mu_t$ and $\Sigma_t$, and compare them.
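
To make the role of $N$ concrete, here is a minimal sketch on synthetic data. The helper `trailing_estimates` and the data frame `fake_returns` are hypothetical names introduced only for this illustration. The estimates at position `t` use the `N` rows strictly before `t`, so no current or future data is used.

```python
import numpy as np
import pandas as pd

def trailing_estimates(returns, t, N):
    """Estimate mu_t and Sigma_t from the N rows strictly before position t."""
    if t < N:
        raise ValueError("need at least N past observations")
    window = returns.iloc[t - N:t]      # rows t-N, ..., t-1 only
    return window.mean(), window.cov()

# Synthetic i.i.d. Gaussian "returns" for three hypothetical tickers.
rng = np.random.default_rng(0)
fake_returns = pd.DataFrame(rng.normal(0.0, 0.01, size=(250, 3)),
                            columns=["A", "B", "C"])

# Compare the estimates obtained from a few choices of N at the same time t.
for N in (20, 60, 120):
    mu_t, sigma_t = trailing_estimates(fake_returns, t=200, N=N)
    print(N, mu_t.round(5).to_dict())
```

Smaller values of `N` track recent data more closely but give noisier estimates, which is the trade-off described in the two bullet points.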



In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import pickle
import cvxpy as cvx

In [None]:
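with open("dowjones_data.pkl", "rb") as f:
    data = pickle.load(f).set_index('date')
data = data['2000-01-03':'2017-08-31']
# 'ticker' is a column (it is used in the pivot), so filter rows instead of dropping an index label
data = data[data['ticker'] != 'DWDP']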
datawide = data.reset_index().pivot(index='date',columns='ticker',values='adj_close')
datawide.head()

In [None]:
logret = np.log(datawide).diff()
logret.head()

In [None]:
mu = logret[1:].mean()
mu

In [None]:
sigma = logret.cov()
sigma

### Functions for portfolio allocation

In [None]:
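def compute_mvp(mu, sigma):
    """Minimum-variance portfolio: minimize w' Sigma w subject to sum(w) == 1.

    `mu` is unused; it is kept so that both functions share a signature.
    """
    s, _ = sigma.shape

    w = cvx.Variable(s)
    # DataFrame.as_matrix() was removed from pandas; use .to_numpy()
    risk = cvx.quad_form(w, sigma.to_numpy())
    # cvx.sum_entries was renamed to cvx.sum in cvxpy 1.0
    prob = cvx.Problem(cvx.Minimize(risk),
                       [cvx.sum(w) == 1])
    prob.solve()

    return w.value

def compute_pf(mu, sigma, mu_star=-1):
    """Minimum-variance portfolio with expected return of at least mu_star."""
    s, _ = sigma.shape

    w = cvx.Variable(s)
    risk = cvx.quad_form(w, sigma.to_numpy())
    prob = cvx.Problem(cvx.Minimize(risk),
                       [
                           cvx.sum(w) == 1,
                           mu.to_numpy() @ w >= mu_star
                       ])
    prob.solve()

    return w.value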

compute_mvp(mu, sigma).squeeze()- compute_pf(mu, sigma).squeeze()
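
Because `compute_mvp` imposes only the constraint that the weights sum to one, its solution has a standard closed form, $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\intercal \Sigma^{-1} \mathbf{1})$, provided $\Sigma$ is invertible. The sketch below is not part of the original notebook: `mvp_closed_form` and the toy covariance matrix are illustrative assumptions. It can serve as a sanity check on the solver output.

```python
import numpy as np

def mvp_closed_form(sigma):
    """Closed-form minimum-variance weights, assuming sigma is invertible."""
    sigma = np.asarray(sigma, dtype=float)
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)   # Sigma^{-1} 1 without forming the inverse
    return x / x.sum()                 # normalize so that the weights sum to 1

# Toy 2x2 covariance matrix (made-up numbers).
sigma_toy = np.array([[0.04, 0.01],
                      [0.01, 0.09]])
w_toy = mvp_closed_form(sigma_toy)
print(w_toy)
```

For this toy matrix the weights are $8/11$ and $3/11$. On the real data, the difference computed in the cell above should be close to zero whenever the return constraint with `mu_star=-1` is not binding, which is what one would expect for daily log-returns.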

### Rolling apply for periodic application of functions

To illustrate the usage of rolling-apply, let's create a data frame:

In [None]:
x = pd.DataFrame(np.reshape(np.arange(0, 100), (20,5)))
x.head()

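`rolling()` is a Series or DataFrame method, and `apply` takes the function and any additional arguments. Here we roll over the integer index itself, so each call receives the row positions in the current window and can use them to slice the full data frame passed in through `kwargs`.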

In [None]:
def myf(ind, **kwargs):
    # the window arrives as floats; convert to integer row positions
    ind = np.asarray(ind).astype('int')
    df = kwargs['df']
    print('ind:', ind)
    print('')
    print('df:', df.iloc[ind,:].to_numpy())
    print('')
    #print('here', df.loc[ind])
    #return(kwargs['df'][ind].shape[0])
    return(0)

# raw=True passes each window to myf as a NumPy array
asdf = x.index.to_series().rolling(3, center=True).apply(myf, raw=True, kwargs={'df': x})

### Using rolling apply to compute $\mu_t$ and $\Sigma_t$

Now we can apply it to our real dataset.

In [None]:
def compute_returns(dw):
    """Compute log returns
    """
    from numpy import log
    
    return(log(dw).diff())

def myf2(ind, **kwargs):
    
    # the window arrives as floats; convert to integer row positions
    ind = np.asarray(ind).astype('int')
    df = kwargs['df']
    
    # log returns of the prices inside the current window
    logret = compute_returns(df.iloc[ind])
   
    #####
    # use datetime to re-estimate every 20 days or so using the index 
    mu = logret.mean()
    sigma = logret.cov()
    
    # first date in the window, looked up by position
    print('ind:', df.index[ind[0]])
    print('AAPL mu:', mu['AAPL'])
    print('AAPL sigma^2:', sigma.loc['AAPL', 'AAPL'], '\n')
    
    # weights = compute_mvp(mu, sigma)
    #####
    
    return(0)

dw = datawide[1:100]

## additional parameters
kw = {
    'df': dw,
    'date': dw.index.to_series(),
     }

n, p = dw.shape

# raw=True passes each window of row positions to myf2 as a NumPy array
output = pd.Series(np.arange(0, n)).rolling(30, center=True).apply(myf2, raw=True, kwargs=kw)
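
Note that `center=True` places each window around its label, so the window includes observations that come after that point. For a backtest as described earlier, a trailing window that only looks backward is the natural choice. The following is a minimal sketch of such a loop on synthetic prices, re-estimating every 20 days as hinted in the comment inside `myf2`. The names `mvp_weights` and `backtest_mvp`, the closed-form weights used in place of the cvxpy solver, and all numbers are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def mvp_weights(sigma):
    """Closed-form minimum-variance weights (only the sum-to-one constraint)."""
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)
    return x / x.sum()

def backtest_mvp(prices, N=30, hold=20, initial_wealth=1.0):
    """Re-estimate Sigma_t from the N most recent past returns every `hold` days."""
    simple_ret = prices.pct_change().dropna()   # used to update the net worth
    log_ret = np.log(prices).diff().dropna()    # used for estimation
    wealth, dates, weights = [initial_wealth], [], None
    for t in range(N, len(simple_ret)):
        if (t - N) % hold == 0:
            # trailing window: rows t-N, ..., t-1 only, no future data
            sigma_t = log_ret.iloc[t - N:t].cov().to_numpy()
            weights = mvp_weights(sigma_t)
        period_return = simple_ret.iloc[t].to_numpy() @ weights
        wealth.append(wealth[-1] * (1.0 + period_return))
        dates.append(simple_ret.index[t])
    return pd.Series(wealth[1:], index=dates)

# Synthetic price paths for four hypothetical tickers.
rng = np.random.default_rng(1)
steps = rng.normal(0.0003, 0.01, size=(200, 4))
prices = pd.DataFrame(100.0 * np.exp(np.cumsum(steps, axis=0)),
                      index=pd.bdate_range("2015-01-01", periods=200),
                      columns=["A", "B", "C", "D"])
curve = backtest_mvp(prices, N=30, hold=20)
print(curve.tail())
```

With the real data, a window of 30 days is not comfortably larger than the number of stocks, so the sample covariance matrix is singular or ill-conditioned. In that case `np.linalg.solve` may fail or return unstable weights, which is one reason the choice of $N$ and the quality of the covariance estimate matter.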