# Advanced Portfolio Optimization using cvxpy

## Install cvxpy and other libraries

In [2]:
import sys
!{sys.executable} -m pip install -r requirements.txt



## Imports

In [3]:
import cvxpy as cvx
import numpy as np
import quiz_tests_advanced

## What's our objective?
http://www.cvxpy.org/

Let's see how we can use optimization to meet a more advanced objective.  We want to minimize the portfolio variance while also closely tracking a market-cap-weighted index.  In other words, alongside the variance, we're trying to minimize the distance between the weights of our portfolio and the weights of the index.

$Minimize \left [ \sigma^2_p + \lambda \sqrt{\sum_{i=1}^{m}(weight_i - indexWeight_i)^2} \right  ]$ where $m$ is the number of stocks in the portfolio, and $\lambda$ is a scaling factor that you can choose.


## Hints

### x vector
To create a vector of $m$ variables $\mathbf{x} = \begin{bmatrix}
x_1 &...& x_m
\end{bmatrix}
$
we can use `cvx.Variable(m)`

### covariance matrix
If we have $m$ stock series, the covariance matrix is an $m \times m$ matrix containing the covariance between each pair of stocks.  We can use [numpy.cov](https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html) to get the covariance.  We give it a 2D array in which each row is a stock series, and each column is an observation at the same period of time.

The covariance matrix $\mathbf{P} = 
\begin{bmatrix}
\sigma^2_{1,1} & ... & \sigma_{1,m} \\ 
... & ... & ...\\
\sigma_{m,1} & ... & \sigma^2_{m,m}  \\
\end{bmatrix}$
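
As a small illustration of the layout `numpy.cov` expects, here is a sketch using made-up return values (the numbers and the variable names `returns` and `cov` are hypothetical, chosen only for this example). Each row is one stock and each column is one period.

```python
import numpy as np

# Hypothetical toy data: 3 stocks (rows) observed over 5 periods (columns).
returns = np.array([
    [0.01, 0.02, -0.01, 0.03, 0.00],
    [0.02, 0.01, 0.00, 0.02, -0.01],
    [-0.01, 0.00, 0.01, -0.02, 0.02],
])

# By default np.cov treats each row as one variable (rowvar=True),
# so 3 rows give a 3 x 3 covariance matrix.
cov = np.cov(returns)

# The diagonal holds each stock's sample variance (np.cov divides by N - 1).
row_variances = returns.var(axis=1, ddof=1)

print(cov.shape)
print(np.allclose(np.diag(cov), row_variances))
```

If your data has stocks in columns instead, you would transpose it (or pass `rowvar=False`) before computing the covariance.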

### portfolio variance
We can write the portfolio variance $\sigma^2_p = \mathbf{x^T} \mathbf{P} \mathbf{x}$

Recall that $\mathbf{x^T} \mathbf{P} \mathbf{x}$ is called a quadratic form.
We can use the cvxpy function `quad_form(x,P)` to get the quadratic form.
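
To see why the quadratic form gives the portfolio variance, the following sketch checks it numerically with plain NumPy on simulated data (the seed, the weights, and all variable names are assumptions made for illustration). With fixed weights, $\mathbf{x^T} \mathbf{P} \mathbf{x}$ should match the sample variance of the weighted return series. In the optimization itself, `x` is a cvxpy variable rather than a fixed array, which is where `quad_form` comes in.

```python
import numpy as np

# Simulated returns for 3 stocks over 250 periods (hypothetical data).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0, scale=0.01, size=(3, 250))

# Covariance matrix P and a fixed, hypothetical weight vector x.
P = np.cov(returns)
x = np.array([0.5, 0.3, 0.2])

# Portfolio variance via the quadratic form x^T P x.
variance_quadratic_form = x @ P @ x

# Portfolio variance computed directly from the weighted return series.
portfolio_returns = x @ returns
variance_direct = portfolio_returns.var(ddof=1)

print(np.isclose(variance_quadratic_form, variance_direct))
```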

### Distance from index weights
We want portfolio weights that track the index closely.  So we want to minimize the distance between them.
Recall from the Pythagorean theorem that you can get the distance between two points in an x,y plane by adding the squares of the x and y distances and taking the square root.  Extending this to any number of dimensions gives the L2 norm.  So $\sqrt{\sum_{i=1}^{m}(weight_i - indexWeight_i)^2}$ can also be written as $\left \| \mathbf{x} - \mathbf{index} \right \|_2$.  There's a cvxpy function called [norm()](https://www.cvxpy.org/api_reference/cvxpy.atoms.other_atoms.html#norm)
`norm(x, p=2, axis=None)`.  The default is already set to find an L2 norm, so you would pass in one argument, which is the difference between your portfolio weights and the index weights.
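
The L2 norm can be checked by hand with NumPy before using it inside cvxpy. This sketch uses hypothetical fixed weights (`weights` and `index_weights` are illustrative values, not taken from the quiz) and compares the explicit square-root-of-squares formula with `np.linalg.norm`, which computes the same quantity for a 1D array.

```python
import numpy as np

# The classic 3-4-5 right triangle: the distance is the hypotenuse.
hypotenuse = np.linalg.norm(np.array([3.0, 4.0]))

# Hypothetical portfolio and index weights for 3 stocks.
weights = np.array([0.5, 0.3, 0.2])
index_weights = np.array([0.8, 0.15, 0.05])
diff = weights - index_weights

# Explicit formula: square each difference, sum, take the square root.
distance_manual = np.sqrt(np.sum(diff ** 2))

# Same quantity via the L2 norm (the default for 1D input).
distance_norm = np.linalg.norm(diff)

print(hypotenuse)
print(np.isclose(distance_manual, distance_norm))
```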

### objective function
We want to minimize both the portfolio variance and the distance of the portfolio weights from the index weights.
We also want to choose a `scale` constant, which is $\lambda$ in the expression. This lets us choose how much priority we give to minimizing the difference from the index, relative to minimizing the variance of the portfolio.  If you choose a higher value for `scale` ($\lambda$), do you think this gives more priority to minimizing the difference, or minimizing the variance?

We can define the objective function using cvxpy `objective = cvx.Minimize()`.  Can you guess what to pass into this function?
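
To build some intuition for the role of `scale`, here is a sketch that evaluates the objective $\sigma^2_p + \lambda \left \| \mathbf{x} - \mathbf{index} \right \|_2$ with plain NumPy for two fixed candidate portfolios. Everything here is hypothetical: a 2-stock diagonal covariance matrix, an equal-weighted index, and a helper named `objective_value` introduced only for this example. No optimization is run; the sketch only compares which candidate scores lower under a small and a large $\lambda$.

```python
import numpy as np

# Hypothetical 2-stock setup with uncorrelated returns.
P = np.diag([0.04, 0.09])
index_weights = np.array([0.5, 0.5])


def objective_value(x, scale):
    """Portfolio variance plus scale times the L2 distance to the index."""
    variance = x @ P @ x
    distance = np.linalg.norm(x - index_weights)
    return variance + scale * distance


# Candidate A copies the index; candidate B tilts toward the low-variance stock.
candidate_a = index_weights
candidate_b = np.array([0.7, 0.3])

for scale in (0.001, 1.0):
    value_a = objective_value(candidate_a, scale)
    value_b = objective_value(candidate_b, scale)
    preferred = "A (index copy)" if value_a < value_b else "B (lower variance)"
    print(f"scale={scale}: A={value_a:.4f}, B={value_b:.4f}, preferred: {preferred}")
```

Comparing the two printed lines shows how the preferred candidate changes as the distance term is weighted more heavily.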

### constraints
We can also define our constraints in a list.  For example, you'd want the weights to sum to one, so $\sum_{i=1}^{m}x_i = 1$.  You may also need to go long only, which means no shorting, so no negative weights: $x_i \geq 0$ for all $i$.  You could save the constraints in a variable as `[x >= 0, sum(x) == 1]`, where x was created using `cvx.Variable()`.

### optimization
So now that we have our objective function and constraints, we can solve for the values of $\mathbf{x}$.
cvxpy has the constructor `Problem(objective, constraints)`, which returns a `Problem` object.

The `Problem` object has a function `solve()`, which returns the minimum value of the objective.  In this case, that is the minimized combination of portfolio variance and scaled distance from the index weights.

It also updates the vector $\mathbf{x}$.

We can check out the values of $\mathbf{x}$ that gave the minimum by using `x.value`

## Quiz

In [5]:
import cvxpy as cvx
import numpy as np

def optimize_portfolio(returns, index_weights, scale=.00001):
    """
    Create a function that takes the return series of a set of stocks, the index weights,
    and scaling factor. The function will minimize a combination of the portfolio variance
    and the distance of its weights from the index weights.  
    The optimization will be constrained to be long only, and the weights should sum to one.
    
    Parameters
    ----------
    returns : numpy.ndarray
        2D array containing stock return series in each row.
        
    index_weights : numpy.ndarray
        1D numpy array containing weights of the index.
        
    scale : float
        The scaling factor applied to the distance between portfolio and index weights
        
    Returns
    -------
    x : np.ndarray
        A numpy ndarray containing the weights of the stocks in the optimized portfolio
    """
    # TODO: Use cvxpy to determine the weights on the assets
    # that minimizes the combination of portfolio variance and distance from index weights
    
    # number of stocks m is number of rows of returns, and also number of index weights
    #m = 
    print(returns)
    #covariance matrix of returns
    #cov = 
    
    # x variables (to be found with optimization)
    #x = 
    
    #portfolio variance, in quadratic form
    #portfolio_variance = 
    
    # euclidean distance (L2 norm) between portfolio and index weights
    #distance_to_index = 
    
    #objective function
    #objective = 
    
    #constraints
    #constraints = 

    #use cvxpy to solve the objective
    
    #retrieve the weights of the optimized portfolio
    #x_values
    
    return x_values

quiz_tests_advanced.test_optimize_portfolio(optimize_portfolio)


NameError: name 'x_values' is not defined

In [6]:
# Test with 3 simulated stock return series
days_per_year = 252
years = 3
total_days = days_per_year * years

# Each stock is the shared market return plus a tiny independent noise term
return_market = np.random.normal(loc=0.05, scale=0.3, size=total_days)
return_1 = np.random.uniform(low=-0.000001, high=.000001, size=total_days) + return_market
return_2 = np.random.uniform(low=-0.000001, high=.000001, size=total_days) + return_market
return_3 = np.random.uniform(low=-0.000001, high=.000001, size=total_days) + return_market
returns = np.array([return_1, return_2, return_3])

# Simulate index weights
index_weights = np.array([0.9,0.15,0.05])

# Try out your optimization function
x_values = optimize_portfolio(returns, index_weights, scale=.00001)

print(f"The optimized weights are {x_values}, which sum to {sum(x_values):.2f}")


NameError: name 'x_values' is not defined

If you're feeling stuck, you can check out the solution [here](m3l4_cvxpy_advanced_solution.ipynb)