#### CS164 Pre-Class Work for Session 10.1

### Quadratic Programming
_Yoav Rabinovich, Mar 2020_

----------

_In this exercise, we will look at the problem of portfolio optimization, which we briefly investigated in lesson 1.1. We will now use a more realistic model, where we consider risk levels and constraints on minimum rates of return. Suppose that we have a vector $\mathbf{x} \in \mathbb{R}^N$, where the $i$th component $x_i$ represents the fraction of our capital that we have invested in
asset $i$. We treat the rate of return of each asset as a random variable, where the mean rate of return is represented by a vector $\boldsymbol{\mu} \in \mathbb{R}^N$, and the covariance matrix for the rates of return over all assets is denoted $C \in \mathbb{R}^{N\times N}$. One way to allocate assets is to minimize
risk, subject to our portfolio making some minimum rate of return._

**(1)** _Explain why the quadratic form $\frac{1}{2}\mathbf{x}^T C\mathbf{x}$ provides a measure of overall portfolio risk. This is what we want to minimize._

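If the asset returns are random variables with covariance matrix $C$, then the return of the portfolio $\mathbf{x}$ has variance $\mathbf{x}^T C\mathbf{x}$, so the quadratic form measures how widely the portfolio's return spreads around its mean (the factor $\frac{1}{2}$ does not change the minimizer). Minimizing it penalizes both holding volatile assets (the diagonal terms $C_{ii}x_i^2$) and holding assets that tend to move together (the off-diagonal terms $C_{ij}x_ix_j$): if one asset drops in value, we want the other assets to be unlikely to drop with it. Diversification only reduces risk when the assets are not strongly correlated: splitting capital between beef and milk, for instance, is diversification in name only.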
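
The quadratic form $\mathbf{x}^T C\mathbf{x}$ is the variance of the portfolio's return, which a small numerical sketch can illustrate. The two assets, their standard deviations, and their correlations below are made-up illustration values (not taken from the dataset), and `portfolio_variance` is a hypothetical helper name.

```python
import numpy as np

def portfolio_variance(x, sigma, K):
    """Return x^T C x, where C_ij = K_ij * sigma_i * sigma_j."""
    C = K * np.outer(sigma, sigma)
    return float(x @ C @ x)

# Hypothetical values, for illustration only
sigma = np.array([0.1, 0.1])   # both assets equally volatile
x = np.array([0.5, 0.5])       # even split between the two assets

K_uncorrelated = np.array([[1.0, 0.0], [0.0, 1.0]])
K_correlated = np.array([[1.0, 0.9], [0.9, 1.0]])

var_uncorrelated = portfolio_variance(x, sigma, K_uncorrelated)
var_correlated = portfolio_variance(x, sigma, K_correlated)
print("uncorrelated:", var_uncorrelated)
print("correlated:  ", var_correlated)
```

With the same allocation and the same individual volatilities, the correlated pair gives the larger portfolio variance, which is the effect the objective penalizes.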



**(2)** _Let $r$ denote the minimum rate of return for the portfolio. Explain why this translates into a constraint $\boldsymbol{\mu}^T \mathbf{x} \geq r$_.

Since the mean rate of return of each asset is given by the vector $\boldsymbol{\mu}$, and $x_i$ is the fraction of capital invested in asset $i$, the inner product $\boldsymbol{\mu}^T \mathbf{x} = \sum_i \mu_i x_i$ is the expected rate of return of the whole portfolio. Requiring this quantity to be at least $r$ gives the constraint.
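
As a small worked example of this constraint, the sketch below evaluates $\boldsymbol{\mu}^T \mathbf{x}$ for a three-asset portfolio and compares it with a target $r$. All numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical mean returns, allocation, and target (illustration only)
mu = np.array([0.001, 0.003, 0.002])
x = np.array([0.5, 0.25, 0.25])
r = 0.002

# Expected portfolio return is the weighted average of the asset means
expected_return = float(mu @ x)
meets_target = expected_return >= r
print("expected return:", expected_return)
print("meets target:", meets_target)
```

Here half of the capital sits in the lowest-return asset, so the portfolio falls short of the target and would be infeasible for the optimization problem.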

**(3)** _Explain why we additionally need the (element-wise) constraint $\mathbf{x} \geq 0$ and the equality constraint $\sum_i x_i = 1$._

The equality constraint reflects our fractional representation of the portfolio: rather than assigning dollar values to investments, we represent the portion of the portfolio to invest in each asset, regardless of the total amount, and these fractions must add up to one whole. The nonnegativity constraint says that we cannot invest a negative fraction of our capital in an asset, just as we cannot invest negative dollars in it. Without it, the optimizer could split the portfolio into arbitrarily large positive and negative amounts that still add up to one.
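
The three constraints can be checked directly for any candidate allocation. The following is a minimal sketch; `is_feasible`, the tolerance, and the two-asset numbers are assumptions made for illustration.

```python
import numpy as np

def is_feasible(x, mu, r, tol=1e-9):
    """Check sum(x) == 1, x >= 0 and mu^T x >= r, up to a small tolerance."""
    x = np.asarray(x, dtype=float)
    budget_ok = abs(x.sum() - 1.0) <= tol
    nonnegative_ok = bool(np.all(x >= -tol))
    return_ok = float(mu @ x) >= r - tol
    return bool(budget_ok and nonnegative_ok and return_ok)

# Hypothetical two-asset example
mu = np.array([0.001, 0.003])
r = 0.002

balanced = is_feasible([0.5, 0.5], mu, r)      # satisfies all three constraints
leveraged = is_feasible([-0.5, 1.5], mu, r)    # sums to one, but has a negative fraction
print(balanced, leveraged)
```

The second allocation adds up to one and even exceeds the return target, yet it is rejected because of the negative fraction, which is exactly the case the nonnegativity constraint rules out.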


**(4)** _A dataset of $225$ different assets can be found [here](http://people.brunel.ac.uk/~mastjjb/jeb/orlib/files/port5.txt). The first line of the file tells us the number of assets ($225$). The next $225$ lines list the mean rate of return and standard deviation for each of the $225$ assets. The final $113 \times 225$ lines tell us the correlation between the rates of return of the different assets: the first and second column are two assets $i$ and $j$, and the third column is the correlation between asset $i$ and asset $j$. Note that only the upper triangle of this matrix is specified, since correlations must be symmetric._

(a) _Load the data into Python (pre-processing if necessary) and create a vector
$\mu$ for the mean rate of return, a vector $\sigma$ for the standard deviations, and a
matrix $K$ for the correlations._

(b) _Compute the covariance matrix $C$ by using the identity $C_{ij} = K_{ij}\sigma_i\sigma_j$._

(c) _Using the cost and constraints described above, create and solve a quadratic program using CVXPY to find the optimal asset allocation, assuming a minimum return rate of $0.2\%$. Are there some assets in which we practically would not invest in this case?_
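
Steps (a) and (b) can be sketched on a tiny example before touching the real file: mirror an upper-triangular correlation matrix into a symmetric one, then apply $C_{ij} = K_{ij}\sigma_i\sigma_j$ with an outer product. The three-asset numbers below are hypothetical.

```python
import numpy as np

# Hypothetical 3-asset data: standard deviations and upper-triangle correlations
sigma = np.array([0.02, 0.05, 0.03])
K_upper = np.array([[1.0, 0.3, -0.2],
                    [0.0, 1.0,  0.5],
                    [0.0, 0.0,  1.0]])

# Mirror the upper triangle; subtract the diagonal once so it is not doubled
K = K_upper + K_upper.T - np.diag(np.diag(K_upper))

# C_ij = K_ij * sigma_i * sigma_j for all i, j at once
C = K * np.outer(sigma, sigma)
print(C)
```

The diagonal of `C` holds the variances $\sigma_i^2$, since each asset has correlation one with itself.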


In [0]:
# Imports
import numpy as np
import cvxpy as cp
from tabulate import tabulate

In [0]:
# Data
# Source: http://people.brunel.ac.uk/~mastjjb/jeb/orlib/files/port5.txt
n = int(np.loadtxt("port5.txt", max_rows=1))             # number of assets
data = np.loadtxt("port5.txt", skiprows=1, max_rows=n)   # mean return and std dev per asset
data_cor = np.loadtxt("port5.txt", skiprows=n+1)         # rows of (i, j, correlation)

In [0]:
# Preprocessing
mu = data[:,0]      # mean rates of return
sigma = data[:,1]   # standard deviations
K = np.zeros(shape=(n,n))
for row in data_cor:
    # The file uses 1-based asset indices and lists only the upper triangle
    i, j = int(row[0])-1, int(row[1])-1
    K[i,j] = K[j,i] = row[2]
# Covariance from correlation: C_ij = K_ij * sigma_i * sigma_j
C = K * np.outer(sigma, sigma)

r = 0.002  # minimum rate of return (0.2%)

In [0]:
# Optimization
x = cp.Variable(n)
obj = cp.Minimize((1/2)*cp.quad_form(x, C))
const = [mu.T @ x >= r,     # minimum expected rate of return
         cp.sum(x) == 1,    # fractions add up to one
         x >= 0]            # no negative investments
prob = cp.Problem(obj,const)
prob.solve()
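
`cp.quad_form(x, C)` is a convex objective only when `C` is positive semidefinite. A covariance matrix is positive semidefinite in theory, but one assembled from rounded correlations and standard deviations may end up with slightly negative eigenvalues, so it can be worth checking before solving. A minimal sketch of such a check follows; `min_eigenvalue` is a hypothetical helper and both matrices are made up for illustration.

```python
import numpy as np

def min_eigenvalue(C):
    """Smallest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return float(np.linalg.eigvalsh(C)[0])

# Hypothetical matrices: a valid covariance and an invalid one
C_valid = np.array([[0.04, 0.01],
                    [0.01, 0.09]])
C_invalid = np.array([[1.0, 2.0],
                      [2.0, 1.0]])   # implies a "correlation" of 2

print("valid:  ", min_eigenvalue(C_valid))
print("invalid:", min_eigenvalue(C_invalid))
```

A clearly negative smallest eigenvalue signals a data problem, while a tiny negative value on the order of rounding error is usually just numerical noise.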

In [145]:
# Result
m = 10
print("Minimum risk:", prob.value)
print("Top",m,"investments:")
indexed = np.concatenate((np.arange(1,n+1).reshape((1,n)),
                        x.value.reshape((1,n))),axis=0)
top = indexed[:,np.argsort(indexed[1])[::-1]]
table = tabulate(top[:,:m], tablefmt="fancy_grid", floatfmt = ".3f")
print(table)

Minimum risk: 0.00019491212566550894
Top 10 investments:
╒════════╤════════╤═════════╤════════╤════════╤═══════╤═════════╤═════════╤════════╤═════════╕
│ 62.000 │ 60.000 │ 196.000 │ 40.000 │ 43.000 │ 9.000 │ 129.000 │ 215.000 │ 97.000 │ 171.000 │
├────────┼────────┼─────────┼────────┼────────┼───────┼─────────┼─────────┼────────┼─────────┤
│  0.257 │  0.120 │   0.098 │  0.087 │  0.081 │ 0.080 │   0.074 │   0.069 │  0.059 │   0.057 │
╘════════╧════════╧═════════╧════════╧════════╧═══════╧═════════╧═════════╧════════╧═════════╛
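
The ten largest weights in the table sum to roughly $0.98$, so the remaining $215$ assets share only about $2\%$ of the capital, and many of them are likely to have weights that are zero up to solver precision. One way to count the assets we would practically not invest in is to threshold the weights. The sketch below assumes a cutoff of $10^{-4}$, which is an arbitrary choice, and uses a made-up weight vector in place of `x.value`; `negligible_assets` is a hypothetical helper.

```python
import numpy as np

def negligible_assets(weights, tol=1e-4):
    """Return 1-based indices of assets whose weight is below tol."""
    weights = np.asarray(weights, dtype=float)
    return np.flatnonzero(weights < tol) + 1

# Hypothetical solver output: numerical solvers often return tiny
# positive or negative values instead of exact zeros
weights = np.array([0.6, 3e-12, 0.4, -1e-15])
skipped = negligible_assets(weights)
print(len(skipped), "of", len(weights), "assets below threshold:", skipped)
```

Applied to the solved problem, `negligible_assets(x.value)` would list the assets that the minimum-risk portfolio effectively excludes.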
