***
# Portfolio Optimization Techniques


This notebook follows the Markowitz technique for portfolio optimization, based on the example at this [site](https://cvxr.rbind.io/cvxr_examples/cvxr_portfolio-optimization/).

First we generate random data to stand in for asset returns and plot it. In this notebook, risk (aka volatility) is quantified as the standard deviation of returns.

Once we have the returns, the next step is to assign a weight to each asset. The goal is to maximize returns while minimizing risk. We assume that our entire capital is spread across the assets, so the weights add up to 1.
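As a minimal sketch of the weight constraint described above, random weights can be drawn and rescaled so that they sum to 1. The seed and the variable names here are illustrative assumptions, not part of the notebook's own cells.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
num_assets = 5

# Draw non-negative random numbers, then rescale so they sum to 1.
raw = rng.random(num_assets)
weights = raw / raw.sum()

print(weights, weights.sum())
```

Each entry of `weights` is the fraction of capital placed in one asset.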

To evaluate how well our portfolios would perform, we need to calculate two things:
1. The mean returns of the portfolio.
2. The volatility (standard deviation).

The expected return, $R$, is calculated as follows, where $p$ is the vector of mean returns of the individual stocks (its length equals the number of stocks in the portfolio) and $W$ is the vector of weights:

### <center> $R = p^TW$

The standard deviation, $\sigma$, is calculated from the covariance matrix, $C$, of the returns together with the weights:

### <center> $\sigma = \sqrt{W^TCW}$

The diagonal elements of the covariance matrix are the variances of the individual stocks, while the off-diagonal elements are the covariances between pairs of stocks. Applying the regular standard deviation formula to each stock separately would capture only the diagonal elements and ignore how the stocks move together.
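The two formulas can be checked numerically. The sketch below assumes random returns and equal weights purely for illustration; it also computes the volatility from the diagonal of $C$ alone to show what is lost when the off-diagonal terms are dropped.

```python
import numpy as np

rng = np.random.default_rng(1)            # seed chosen only for reproducibility
returns = rng.standard_normal((40, 5))    # 40 observations of 5 assets (made-up data)

p = returns.mean(axis=0)                  # mean return of each asset
C = np.cov(returns, rowvar=False)         # 5 x 5 covariance matrix
W = np.full(5, 1 / 5)                     # equal weights, summing to 1

R = p @ W                                 # expected return, p^T W
sigma = np.sqrt(W @ C @ W)                # volatility, sqrt(W^T C W)

# Keeping only the diagonal of C ignores the covariances between assets.
sigma_diag_only = np.sqrt(W @ np.diag(np.diag(C)) @ W)

print(R, sigma, sigma_diag_only)
```

A useful sanity check is that `sigma ** 2` matches the sample variance of the weighted return series `returns @ W` (with `ddof=1`, the normalization `np.cov` uses by default).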

## Markowitz optimization and the Efficient Frontier


Plotting returns against volatility for many randomly weighted portfolios gives a characteristic bullet-shaped region called the Markowitz bullet. The boundary of this region holds the portfolios with the lowest variance for a given expected return; its upper part is called the **efficient frontier**.

In an optimization framework, the frontier is found by maximizing the return adjusted for risk:
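The cloud of points behind such a plot can be sketched as follows. The data is random, and the Dirichlet draw is just one assumed way of producing non-negative weights that sum to 1; neither comes from the notebook's own cells.

```python
import numpy as np

rng = np.random.default_rng(2)            # seed chosen only for reproducibility
returns = rng.standard_normal((40, 5))    # made-up returns for 5 assets
p = returns.mean(axis=0)
C = np.cov(returns, rowvar=False)

num_portfolios = 2000
# Each row is one portfolio: non-negative weights that sum to 1.
W = rng.dirichlet(np.ones(5), size=num_portfolios)

port_returns = W @ p
# Row-wise quadratic form W_i^T C W_i, followed by a square root.
port_vols = np.sqrt(np.einsum("ij,jk,ik->i", W, C, W))
```

Passing `port_vols` and `port_returns` to `plt.scatter` would draw the cloud, with volatility on the horizontal axis and return on the vertical axis.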

### <center> maximize $(R - \sigma^2)$

subject to:

### <center> $\sum_{i} w_{i} = 1$

### <center> $w_{i} \geq 0$

We will be using the convex optimization library, CVXPY, to solve the optimization problem: it is a Python-embedded modeling language for convex optimization problems. It allows you to express your problem in a natural way that follows the mathematical model, rather than in the restrictive standard form required by solvers.



In [1]:
## Import libraries.
import sys
## Certain imports are failing.
sys.path.append('/home/sharatpc/.local/lib/python3.8/site-packages')

import cvxpy as cp
import matplotlib.pyplot as plt         # Graphs
import numpy as np                      # Arrays
import os                               # Operating system
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

from matplotlib import cm               # Colours
from mpl_toolkits.mplot3d import Axes3D # 3D graphs

In [None]:
## Trying with random data.
num_observations = 40
num_assets = 5
asset_obs_matrix = np.random.randn(num_observations,num_assets)

In [None]:
fig = plt.figure(figsize=(10,10))
plt.plot(asset_obs_matrix)
plt.xlabel("Time")
plt.ylabel("Returns")
plt.legend(['First Asset', 'Second Asset','Third Asset','Fourth Asset','Fifth Asset'])
plt.title("Returns over Time")

In [None]:
# Mean return of each asset and the covariance matrix of the returns.
means = np.mean(asset_obs_matrix, axis=0)
covar = np.cov(asset_obs_matrix, rowvar=False)
weights = cp.Variable(num_assets)
r = means @ weights                      # expected portfolio return, p^T W
risk = cp.quad_form(weights, covar)      # portfolio variance, W^T C W
# Construct the problem.
objective = cp.Maximize(r - risk)
constraints = [cp.sum(weights) == 1, weights >= 0]
prob = cp.Problem(objective, constraints)

In [None]:
try:
    prob.solve()
#     print ("Optimal portfolio")
#     print ("----------------------")
#     for s in range(num_assets):
#        print (" Investment in asset {} : {}% of the portfolio".format(s,round(100*weights.value[s],2)))
#     print ("----------------------")
#     print ("Exp ret = {}%".format(round(100*r.value,2)))
#     print ("Expected risk    = {}%".format(round(100*risk.value**0.5,2)))
except cp.error.SolverError as err:
    # Catch solver failures only, so that other errors still surface.
    print ("Error:", err)

In [None]:
prob.status

In [None]:
weights.value

## With Historical Stock Data
Now I will repeat the same optimization on Nifty50 data from 2020-05-04 to 2021-04-30. In the interest of time, I have considered only 6 stocks from the Nifty50, but the idea extends to all of them.

In [8]:
nifty_data = pd.read_csv(os.path.join(os.getcwd(), "nifty_50_close.csv"))
num_observations = nifty_data.shape[0]   # rows
num_assets = nifty_data.shape[1]         # columns
nifty_data = nifty_data.to_numpy()
dict_of_companies = dict()
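The file name suggests that the CSV holds closing prices rather than returns. If that is the case (an assumption, since the file contents are not shown here), the means and covariances used for the optimization would normally be computed on returns derived from those prices. A minimal sketch of that conversion, using a small made-up price table:

```python
import numpy as np

# Hypothetical closing prices: rows are trading days, columns are stocks.
prices = np.array([
    [100.0, 50.0],
    [110.0, 45.0],
    [121.0, 54.0],
])

# Simple daily return: P_t / P_{t-1} - 1, one row fewer than the price table.
daily_returns = prices[1:] / prices[:-1] - 1.0

means = daily_returns.mean(axis=0)
covar = np.cov(daily_returns, rowvar=False)
```

With a pandas DataFrame of prices, `DataFrame.pct_change()` followed by `dropna()` gives the same simple returns.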

In [9]:
# Per-column mean and covariance matrix of the loaded data.
means = np.mean(nifty_data, axis=0)
covar = np.cov(nifty_data, rowvar=False)
weights = cp.Variable(num_assets)
r = means @ weights                      # p^T W
risk = cp.quad_form(weights, covar)      # W^T C W
# Construct the problem.
objective = cp.Maximize(r - risk)
constraints = [cp.sum(weights) == 1, weights >= 0]
prob = cp.Problem(objective, constraints)

In [10]:
try:
    prob.solve()
#     print ("Optimal portfolio")
#     print ("----------------------")
#     for s in range(num_assets):
#        print (" Investment in column {} : {}% of the portfolio".format(s,round(100*weights.value[s],2)))
#     print ("----------------------")
#     print ("Exp ret = {}%".format(round(100*r.value,2)))
#     print ("Expected risk    = {}%".format(round(100*risk.value**0.5,2)))
except cp.error.SolverError as err:
    # Catch solver failures only, so that other errors still surface.
    print ("Error:", err)

In [11]:
prob.status

'optimal'

In [12]:
weights.value

array([ 3.03058341e-23,  2.42487625e-02,  3.01312606e-02, -7.94336664e-24,
       -3.45749848e-24, -2.77679836e-24,  9.45619977e-01])
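Several of the weights above are tiny values on the order of 1e-23, some of them negative. These are solver round-off rather than real positions, since the constraints require every weight to be non-negative. One way to tidy the result is sketched below; the tolerance is an assumed threshold, not something set in the notebook.

```python
import numpy as np

# Solver output copied from the cell above.
w = np.array([ 3.03058341e-23,  2.42487625e-02,  3.01312606e-02, -7.94336664e-24,
              -3.45749848e-24, -2.77679836e-24,  9.45619977e-01])

tol = 1e-8                                  # assumed cut-off for "numerically zero"
w_clean = np.where(np.abs(w) < tol, 0.0, w)
w_clean = w_clean / w_clean.sum()           # renormalize so the weights sum to 1

for i, share in enumerate(w_clean):
    print("Column {}: {:.2f}% of the portfolio".format(i, 100 * share))
```

After cleaning, only three columns carry a non-zero weight, and the last column dominates the portfolio.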