# Module 5 - Modern Portfolio Theory

In this module, we'll look at investment portfolio optimization with Python, the
fundamental concept of diversification, and the construction of an efficient frontier that investors can use to choose
a specific mix of assets based on their investment goals; that is, the trade-off between their desired level of portfolio
return and their desired level of portfolio risk.

[Modern Portfolio Theory](https://www.investopedia.com/terms/m/modernportfoliotheory.asp) suggests that it is possible to 
construct an "efficient frontier" of optimal portfolios,
offering the maximum possible expected return for a given level of risk. It suggests that it is not enough to look at
the expected risk and return of one particular stock. By investing in more than one stock, an investor can reap the
benefits of diversification, particularly a reduction in the riskiness of the portfolio. MPT quantifies the benefits of
diversification, also known as not putting all of your eggs in one basket.
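
As a minimal sketch of the diversification effect described above, consider two assets held in equal proportion. The volatilities and the correlation below are hypothetical values chosen for illustration, not figures from this module's data.

```python
import numpy as np

# Hypothetical annualized volatilities and correlation (illustrative values only)
vol_a, vol_b = 0.30, 0.40
correlation  = 0.2
weights      = np.array([0.5, 0.5])

# Build the 2x2 covariance matrix from the volatilities and the correlation
covariance = np.array([
    [vol_a**2,                    correlation * vol_a * vol_b],
    [correlation * vol_a * vol_b, vol_b**2                   ],
])

# Portfolio volatility is sqrt(w^T * Cov * w)
portfolio_vol    = np.sqrt(weights @ covariance @ weights)
weighted_avg_vol = weights @ np.array([vol_a, vol_b])

print("Portfolio volatility:        ", round(portfolio_vol, 4))
print("Weighted average volatility: ", round(weighted_avg_vol, 4))
```

With these inputs the portfolio volatility is about 0.27, below the 0.35 weighted average of the two individual volatilities. The gap closes only when the correlation reaches 1.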

In [1]:
import pandas as pd
import numpy as np
from pandas import Series, DataFrame
import seaborn as sns
import matplotlib.pyplot as plt
import os
import re
import glob
import random
import skopt
# from skopt import gp_minimize



## Problem Statements

## 5.1 - Annualized Volatility and Returns
For your chosen stock, calculate the mean daily return and the daily standard deviation of returns, then annualise them to get the expected annual return and volatility of that single stock (annual mean = daily mean * 252, annual stdev = daily stdev * sqrt(252)).
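
The annualisation rule can be sketched on a tiny hypothetical price series (the prices below are made up for illustration). The mean scales with the number of trading days, while the standard deviation scales with its square root, which assumes daily returns are independent so that variances add up over time.

```python
import numpy as np

TRADING_DAYS = 252  # trading days per year, as used in this module

# Hypothetical closing prices (illustrative values, not real market data)
close = np.array([100.0, 101.0, 99.5, 100.5, 102.0, 101.0])

# Daily simple returns: P_t / P_{t-1} - 1, the same quantity pct_change() computes
daily_returns = close[1:] / close[:-1] - 1

daily_mean = daily_returns.mean()
daily_std  = daily_returns.std(ddof=1)  # sample standard deviation, matching the pandas default

annual_mean = daily_mean * TRADING_DAYS
annual_std  = daily_std  * np.sqrt(TRADING_DAYS)

print("Annualized mean return:", round(annual_mean, 4))
print("Annualized volatility: ", round(annual_std, 4))
```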

In [2]:
def read_csv( filename ):
    if isinstance(filename, pd.DataFrame): return filename  # OPTIMIZATION: allow passthrough of existing dataframe
    
    dataframe = pd.read_csv( filename, parse_dates=['Date'] )
    dataframe.set_index( dataframe.Date, inplace=True )
    return dataframe

def meanDailyReturn( filename ):
    return read_csv( filename ).Close_Price.pct_change().dropna().mean()

def meanDailySTD( filename ):
    return read_csv( filename ).Close_Price.pct_change().dropna().std()

def meanAnnualReturn( filename ):
    return meanDailyReturn(filename) * 252

def meanAnnualSTD( filename ):
    return meanDailySTD(filename) * np.sqrt(252)

def getName( filename ):
    return re.sub(r'^.+/|\.[^.]+$',     '',    filename)

def getCap( filename ):
    return re.sub(r'^.*/(\w+_Cap)/.*$', '\\1', filename)

def calcReturnVolatility( filename ):
    input  = read_csv( filename )
    output = DataFrame([{
        "Name":             getName( filename ),
        "Cap":              getCap(  filename ),
        "meanDailyReturn":  meanDailyReturn( input ),
        "meanDailySTD":     meanDailySTD( input ),
        "meanAnnualReturn": meanAnnualReturn( input ),
        "meanAnnualSTD":    meanAnnualSTD( input )
    }])
    output.set_index( output.Name, inplace=True, drop=False )
    return output

stock = '../../data_output/module_1/python3/stocks/Large_Cap/ADANIPORTS.csv'
calcReturnVolatility(stock)

| Name | Cap | Name | meanAnnualReturn | meanAnnualSTD | meanDailyReturn | meanDailySTD |
|---|---|---|---|---|---|---|
| ADANIPORTS | Large_Cap | ADANIPORTS | 0.060499 | 0.309751 | 0.00024 | 0.019512 |


## 5.2 - Covariance Matrix
Now we need to diversify the portfolio. Build your own portfolio by choosing any 5 stocks, preferably from different sectors and different caps. Assume that all 5 have the same weight, i.e. 20%. Now calculate the annual returns and volatility of the entire portfolio (Hint: don't forget to use the covariance).
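
The covariance hint can be written out explicitly. The notation below is an assumption introduced here, not taken from the original: `w` is the vector of portfolio weights, `\bar{r}` the vector of mean daily returns, `\Sigma` the covariance matrix of daily returns, and 252 the number of trading days per year from 5.1.

```latex
\mu_p = 252 \, w^{\top} \bar{r} = 252 \sum_{i} w_i \, \bar{r}_i
\qquad
\sigma_p = \sqrt{252 \; w^{\top} \Sigma \, w}
         = \sqrt{252 \sum_{i} \sum_{j} w_i \, w_j \, \Sigma_{ij}}
```

The off-diagonal terms of `\Sigma` are what make portfolio volatility differ from a simple weighted average of the individual volatilities.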

In [3]:
filenames_all = glob.glob('../../data_output/module_1/python3/stocks/**/*.csv')
filenames     = random.sample( filenames_all, 5 )

summary_all   = pd.concat([ calcReturnVolatility(stock) for stock in filenames_all ])
summary       = pd.concat([ calcReturnVolatility(stock) for stock in filenames     ])
summary

| Name | Cap | Name | meanAnnualReturn | meanAnnualSTD | meanDailyReturn | meanDailySTD |
|---|---|---|---|---|---|---|
| VOLTAS | Mid_Cap | VOLTAS | 0.19103 | 0.308346 | 0.000758 | 0.019424 |
| IOC | Large_Cap | IOC | -0.413701 | 0.500046 | -0.001642 | 0.0315 |
| LEMONTREE | Small_Cap | LEMONTREE | 0.056795 | 0.392313 | 0.000225 | 0.024713 |
| APOLLOTYRE | Mid_Cap | APOLLOTYRE | -0.064303 | 0.310445 | -0.000255 | 0.019556 |
| VENKEYS | Small_Cap | VENKEYS | 0.381711 | 0.627383 | 0.001515 | 0.039521 |


In [4]:
portfolio = DataFrame()
for filename in filenames:
    portfolio[ getName(filename) ] = read_csv(filename).Close_Price
portfolio.head()

| Date | VOLTAS | IOC | LEMONTREE | APOLLOTYRE | VENKEYS |
|---|---|---|---|---|---|
| 2017-05-15 | 431.85 | 442.1 | NaN | 231.9 | 1169.7 |
| 2017-05-16 | 432.45 | 446.6 | NaN | 234.4 | 1177.0 |
| 2017-05-17 | 430.2 | 444.25 | NaN | 237.35 | 1188.0 |
| 2017-05-18 | 414.1 | 439.9 | NaN | 232.65 | 1156.9 |
| 2017-05-19 | 415.75 | 435.4 | NaN | 234.65 | 1161.65 |


In [5]:
equal_weights = np.full( portfolio.shape[1], 1/portfolio.shape[1] )
equal_weights

array([0.2, 0.2, 0.2, 0.2, 0.2])

In [6]:
# Portfolio Mean Average Return can be calculated either from: the summary data or the portfolio table
def portfolio_annual_returns(portfolio, weights):
    return np.sum( portfolio.pct_change().mean() * weights ) * 252  

round( portfolio_annual_returns(portfolio, equal_weights), 2 )

0.03

In [7]:
# Portfolio covariance matrix of daily returns
portfolio_covariance = portfolio.pct_change().cov()
portfolio_covariance

|  | VOLTAS | IOC | LEMONTREE | APOLLOTYRE | VENKEYS |
|---|---|---|---|---|---|
| VOLTAS | 0.000377 | 8.7e-05 | 7.1e-05 | 0.000127 | 0.000179 |
| IOC | 8.7e-05 | 0.000992 | 8.3e-05 | 9.3e-05 | 0.000154 |
| LEMONTREE | 7.1e-05 | 8.3e-05 | 0.000611 | 3.6e-05 | 0.000259 |
| APOLLOTYRE | 0.000127 | 9.3e-05 | 3.6e-05 | 0.000382 | 0.000159 |
| VENKEYS | 0.000179 | 0.000154 | 0.000259 | 0.000159 | 0.001562 |


In [8]:
def portfolio_annual_volatility( portfolio, weights ): 
    # Daily portfolio variance is w^T * Cov * w. Variance scales linearly with time,
    # so multiply by 252 (not sqrt(252)) before taking the square root.
    daily_variance = np.dot( 
        weights.T, 
        np.dot( portfolio.pct_change().cov(), weights )
    )
    return np.sqrt( daily_variance * 252 )

round( portfolio_annual_volatility(portfolio, equal_weights), 2 )

0.25

In [9]:
def portfolio_sharpe( portfolio, weights ): 
    return portfolio_annual_returns( portfolio, weights ) / portfolio_annual_volatility( portfolio, weights )

round( portfolio_sharpe(portfolio, equal_weights), 2 )

0.12

In [10]:
print("Portfolio Annualized Mean Return: ", round( portfolio_annual_returns(    portfolio, equal_weights ), 2) )
print("Portfolio Annualized Volatility:  ", round( portfolio_annual_volatility( portfolio, equal_weights ), 2) )

Portfolio Annualized Mean Return:  0.03
Portfolio Annualized Volatility:   0.25


## 5.3 + 5.4 - Monte Carlo Simulation

Prepare a scatter plot for differing weights of the individual stocks in the portfolio, with returns and volatility as the axes. Colour the data points by the Sharpe ratio (returns / volatility) of each portfolio.

Mark the 2 portfolios where:
- Portfolio 1 - The Sharpe ratio is the highest 
- Portfolio 2 - The volatility is the lowest.
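
One way to draw the random weights for such a simulation is sketched below. It uses a Dirichlet distribution, which is an alternative named here for illustration and not the method used in this notebook. Normalising independent uniform draws by their sum also gives valid weights, but tends to sample fewer portfolios near the corners of the weight space, whereas a Dirichlet distribution with all parameters equal to 1 is uniform over all non-negative weight vectors that sum to 1.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seed chosen arbitrarily for reproducibility

n_assets     = 5     # matches the five-stock portfolio in this module
n_portfolios = 2500  # matches the number of simulated portfolios

# Each row is one weight vector; Dirichlet(1, ..., 1) is uniform over the simplex
weight_samples = rng.dirichlet(np.ones(n_assets), size=n_portfolios)

print(weight_samples[:3].round(3))
print("Row sums:", weight_samples[:3].sum(axis=1))
```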

In [11]:
def normalize_weights( weights ):
    for i in range(0,3):
        weights  = np.round( weights, 3 ) 
        weights /= weights.sum()
    return np.asarray(weights)
    
def random_weights():
    weights  = np.random.rand(portfolio.shape[1])
    return normalize_weights( weights )

random_weights()

array([0.13913914, 0.36736737, 0.14014014, 0.29229229, 0.06106106])

In [None]:
# Collect one row per simulated portfolio, then build the DataFrame in one step
# (DataFrame.append was removed in pandas 2.0 and was slow inside a loop)
rows = []
for i in range(0, 2500):
    weights    = random_weights()
    returns    = portfolio_annual_returns(    portfolio, weights )
    volatility = portfolio_annual_volatility( portfolio, weights )
    sharpe     = returns / volatility
    rows.append({
        "weights":    weights,
        "returns":    returns,
        "volatility": volatility,
        "sharpe":     sharpe
    })

scatter_data = DataFrame( rows )
scatter_data.head()


In [None]:
point_max_sharpe     = scatter_data.loc[ scatter_data['sharpe'].idxmax()     ]
point_max_sharpe

In [None]:
point_min_volatility = scatter_data.loc[ scatter_data['volatility'].idxmin() ]
point_min_volatility

In [None]:
fig, ax = plt.subplots(figsize=(20, 10), nrows=1, ncols=1)
plt.scatter( 
    scatter_data.volatility,     
    scatter_data.returns, 
    c = scatter_data.sharpe
)
plt.title('Portfolio Weightings - Monte Carlo Simulation')
plt.ylabel('Annualized Return')
plt.xlabel('Annualized Volatility')
plt.colorbar()

# Mark the 2 portfolios: max Sharpe ratio (blue star) and min volatility (red star)
plt.scatter( point_max_sharpe.volatility,     point_max_sharpe.returns,     marker=(5,1,0), c='b', s=200 )
plt.scatter( point_min_volatility.volatility, point_min_volatility.returns, marker=(5,1,0), c='r', s=200 )

---
# Scikit-Optimize

Use `skopt.gp_minimize` to search for the same 2 portfolios and compare them with the Monte Carlo results:
- Portfolio 1 - The Sharpe ratio is the highest
- Portfolio 2 - The volatility is the lowest.

In [None]:
# Portfolio 1 - The Sharpe ratio is the highest 
def max_sharpe( weights ):
    weights = normalize_weights(weights)
    sharpe  = portfolio_annual_returns( portfolio, weights ) / portfolio_annual_volatility( portfolio, weights )
    return  -sharpe  # convert maximization for minimization

# DOCS: https://scikit-optimize.github.io/#skopt.gp_minimize
skopt_max_sharpe = skopt.gp_minimize(
    max_sharpe, 
    [(0., 1.),(0., 1.),(0., 1.),(0., 1.),(0., 1.)],  # must be floating point array
    verbose=False,
    n_calls=30,         # the number of evaluations of f 
    n_random_starts=5,  # the number of random initialization points
    random_state=123    # the random seed    
)
print( "Max Sharpe Ratio Value   - skopt:       ", -skopt_max_sharpe.fun )
print( "Max Sharpe Ratio Value   - Monte Carlo: ",  point_max_sharpe.sharpe  )
print( "Max Sharpe Ratio Improvement:           ",  round( abs(-skopt_max_sharpe.fun / point_max_sharpe.sharpe), 2), 'x'  )
print( "Max Sharpe Ratio Weights - skopt:       ",  normalize_weights(skopt_max_sharpe.x) )  # normalized, as scored by max_sharpe()
print( "Max Sharpe Ratio Weights - Monte Carlo: ",  point_max_sharpe.weights  )

In [None]:
# Portfolio 2 - The volatility is the lowest.
def min_volatility( weights ):
    weights    = normalize_weights(weights)
    volatility = portfolio_annual_volatility( portfolio, weights )
    return volatility
    
# DOCS: https://scikit-optimize.github.io/#skopt.gp_minimize
skopt_min_volatility = skopt.gp_minimize(
    min_volatility, 
    [(0., 1.),(0., 1.),(0., 1.),(0., 1.),(0., 1.)],  # must be floating point array
    verbose=False,
    n_calls=30,         # the number of evaluations of f 
    n_random_starts=5,  # the number of random initialization points
    random_state=123    # the random seed        
)
print( "Min Volatility Value   - skopt:       ",  skopt_min_volatility.fun )
print( "Min Volatility Value   - Monte Carlo: ",  point_min_volatility.volatility  )
print( "Min Volatility Weights - skopt:       ",  normalize_weights(skopt_min_volatility.x) )  # normalized, as scored by min_volatility()
print( "Min Volatility Weights - Monte Carlo: ",  point_min_volatility.weights  )

In [None]:
fig, ax = plt.subplots(figsize=(20, 10), nrows=1, ncols=1)
plt.scatter( 
    scatter_data.volatility,     
    scatter_data.returns, 
    c = scatter_data.sharpe
)
plt.title('Portfolio Weightings - Monte Carlo Simulation vs skopt')
plt.ylabel('Annualized Return')
plt.xlabel('Annualized Volatility')
plt.colorbar()

# Mark the 2 portfolios: stars are the Monte Carlo points, +'s are the skopt points (blue = max Sharpe, red = min volatility)
plt.scatter( point_max_sharpe.volatility,     point_max_sharpe.returns,     marker=(5,1,0), c='b', s=200 )
plt.scatter( point_min_volatility.volatility, point_min_volatility.returns, marker=(5,1,0), c='r', s=200 )
plt.scatter( 
    portfolio_annual_volatility( portfolio, normalize_weights(skopt_max_sharpe.x) ),     
    portfolio_annual_returns(    portfolio, normalize_weights(skopt_max_sharpe.x) ), 
    marker=(4,2,0), c='b', s=200 
)
plt.scatter( 
    portfolio_annual_volatility( portfolio, normalize_weights(skopt_min_volatility.x) ),     
    portfolio_annual_returns(    portfolio, normalize_weights(skopt_min_volatility.x) ), 
    marker=(4,2,0), c='r', s=200 
)

The 4-point +'s mark the skopt optimization points, whereas the 5-point stars mark the max-Sharpe and min-volatility points of the Monte Carlo simulation.
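
As a cross-check on the minimum-volatility markers, the sketch below computes the closed-form minimum-variance weights, which are proportional to the inverse covariance matrix times a vector of ones. This is a different technique from both the Monte Carlo search and `skopt.gp_minimize`, and it assumes only that the weights sum to 1, so unlike the `(0., 1.)` bounds used above it may return negative weights (short positions). The covariance values are copied from the covariance output in 5.2.

```python
import numpy as np

# Daily covariance matrix, copied from the covariance output earlier in this module
# Order: VOLTAS, IOC, LEMONTREE, APOLLOTYRE, VENKEYS
cov = np.array([
    [0.000377, 0.000087, 0.000071, 0.000127, 0.000179],
    [0.000087, 0.000992, 0.000083, 0.000093, 0.000154],
    [0.000071, 0.000083, 0.000611, 0.000036, 0.000259],
    [0.000127, 0.000093, 0.000036, 0.000382, 0.000159],
    [0.000179, 0.000154, 0.000259, 0.000159, 0.001562],
])

# Solve cov @ x = 1 instead of inverting explicitly, then rescale so the weights sum to 1
x = np.linalg.solve(cov, np.ones(len(cov)))
min_var_weights = x / x.sum()

def annual_volatility(weights):
    # sqrt(252 * w^T * Cov * w), assuming 252 trading days per year
    return np.sqrt(weights @ cov @ weights * 252)

print("Minimum-variance weights:   ", np.round(min_var_weights, 3))
print("Minimum-variance volatility:", round(annual_volatility(min_var_weights), 4))
```

If every weight comes out non-negative, the result is also the minimum-volatility portfolio under the long-only bounds, so the sampled and optimized points should not fall below it.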

---
# Analysis of All Stocks

Prepare a scatter plot of every stock in the dataset, with annualized returns and volatility as the axes. Colour the data points by the Sharpe ratio (returns / volatility) of each stock.

In [None]:
summary_all['Sharpe'] = summary_all.meanAnnualReturn / summary_all.meanAnnualSTD

# Two copies of the same plot: the first unlabelled, the second annotated with stock names
fig, axes = plt.subplots(figsize=(20, 30), nrows=2, ncols=1)
for n in range(0,2):
    axes[n].scatter( 
        summary_all.meanAnnualReturn, 
        summary_all.meanAnnualSTD, 
        c = summary_all['Sharpe']
    )
    axes[n].set_title('Returns vs Volatility')
    axes[n].set_xlabel('Annualized Return')
    axes[n].set_ylabel('Annualized Volatility')

# Annotate the second plot only; use .iloc because the index holds stock names, not positions
for i in range(0, summary_all.shape[0]):
    axes[1].annotate(summary_all.Name.iloc[i], (summary_all.meanAnnualReturn.iloc[i]+0.02, summary_all.meanAnnualSTD.iloc[i] - 0.005))

Build the 2 portfolios where:
- Portfolio 1 - The 5 stocks with the highest Sharpe ratio
- Portfolio 2 - The 5 stocks with the lowest volatility.

In [None]:
portfolio_1 = summary_all.sort_values('Sharpe',ascending=False).head(5)
portfolio_1

In [None]:
portfolio_2 = summary_all.sort_values('meanAnnualSTD',ascending=True).head(5)
portfolio_2