# Using Monte Carlo Simulation to Determine the Optimal Portfolio of Stocks

If you have tried investing in the stock market, you have most likely faced several investment decisions, such as "which stock to choose", "which industry to focus on", and "how much to allocate to each stock".
Fortunately, Harry Markowitz provided an answer to the last question, which is also considered one of the most difficult problems in investing: portfolio selection.

His Modern Portfolio Theory (MPT) won him a Nobel Prize and introduced the ideas of portfolio investing and of how securities' risks and correlations affect the portfolio as a whole.
You might think that there are many ways to answer the question "How much should I invest in a stock?", but optimization theory and Markowitz's work tell you that, for a given set of inputs and a given objective, there is only one optimal answer.
There are two ways to solve this problem:

1.  **Analytically** -  using linear algebra and matrix operations to arrive at the optimal portfolio.

2.  **Computationally** -  using computers to crunch through many possible weight combinations for the portfolio. The optimal portfolio is the one with the highest return per unit of risk.

Note that in portfolio optimization, what we optimize are the weights, that is, the allocation across a given list of possible investments.
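As a rough illustration of the analytical route, the sketch below computes the maximum-Sharpe (tangency) weights in closed form, proportional to the inverse covariance matrix times the excess returns. It assumes short selling is allowed and that the only constraint is that the weights sum to 1. The three-asset `mu`, `cov`, and `rf` values are made-up inputs for illustration, not data from this article.

```python
import numpy as np

# Hypothetical annualized inputs for three assets (illustrative only)
mu = np.array([0.10, 0.14, 0.08])          # expected returns
cov = np.array([[0.040, 0.006, 0.004],
                [0.006, 0.090, 0.010],
                [0.004, 0.010, 0.025]])    # covariance matrix
rf = 0.02                                  # risk-free rate

# Closed-form tangency portfolio: solve cov @ z = (mu - rf), then normalize
z = np.linalg.solve(cov, mu - rf)
w_analytic = z / z.sum()

def sharpe_ratio(w):
    """Sharpe ratio of a portfolio with weights w."""
    return (mu @ w - rf) / np.sqrt(w @ cov @ w)

print(w_analytic, sharpe_ratio(w_analytic))
```

The rest of this walkthrough follows the computational route instead, which needs no matrix inversion and makes it easy to restrict the search to long-only weights.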

To get our stock data, we will employ the investpy package.

## PRELIMINARIES

In [13]:
import numpy as np
import pandas as pd
import hvplot.pandas  # noqa

#For Monte Carlo
import random

#Visualization
import holoviews as hv

from tqdm import tqdm

#Historical Data
import investpy

## STEP 1: GET HISTORICAL DATA FOR THE STOCKS OF YOUR CHOICE

We get historical stock data to measure how volatile each stock has been. Ideally, we want a time period that covers a full business cycle: expansion, peak, recession, and trough.

To begin, specify the stocks and the past trading dates you are looking at. The past trading data will let us estimate the riskiness of the stocks through the standard deviation of their returns.

In [2]:
stocks = ['MEG','CEB', 'BDO','ALI', 'MER', 'AC', 'JGS', 'URC', 
          'JFC', 'SEVN', 'BMM', 'CAT', 'FGEN', 'MAXS', 'PGOLD']

## STEP 2: CALCULATE THE RETURNS FOR THESE STOCKS

To calculate the return, specify the time period in question. This choice is critical, because the period may coincide with a stretch of unusually high growth or of depressed growth in the equity market.

For now, let us assume that the correct time period that will represent the expected return for the coming year is given by the following:

In [4]:
begin_date = "01/01/2014"
end_date = "29/12/2019"

In [6]:
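def generate_stock_returns(stocks_list, begin_date, end_date):
    # Despite its name, this function returns the daily closing prices;
    # the returns themselves are computed from these prices afterwards.
    prices = pd.DataFrame()
    
    for stock in stocks_list:
        # Download the closing prices of one stock (dates are dd/mm/yyyy)
        df_ = investpy.get_stock_historical_data(stock=stock,
                                                country='Philippines',
                                                from_date=begin_date,
                                                to_date=end_date).Close
        df_.rename(stock, inplace=True)  # name the Series after its ticker
        prices = pd.concat([prices, df_],axis=1)
    prices.index.name = "Date"
    return prices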

In [7]:
prices = generate_stock_returns(stocks, begin_date, end_date)

In [8]:
prices

Date,MEG,CEB,BDO,ALI,MER,AC,JGS,URC,JFC,SEVN,BMM,CAT,FGEN,MAXS,PGOLD
2014-01-02,3.31,48.712,71.00,25.50,256.00,525.5,39.000,115.88,175.9,,,,14.02,7.55,38.899
2014-01-03,3.31,49.109,70.45,25.25,254.20,520.5,38.850,116.86,172.0,98.0,,,14.14,7.65,39.098
2014-01-06,3.41,49.209,71.80,25.50,260.00,525.5,39.500,117.26,175.0,97.0,,,14.28,7.60,38.899
2014-01-07,3.40,48.960,71.85,25.60,257.00,524.0,38.200,113.43,172.0,95.0,,,14.36,7.55,38.999
2014-01-08,3.38,50.650,72.55,26.10,263.00,530.5,39.200,116.57,169.5,95.0,,,14.86,7.32,39.098
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019-12-19,4.10,90.700,154.80,45.40,304.63,755.0,73.333,141.04,216.0,148.0,,17.54,23.50,12.10,40.200
2019-12-20,4.00,90.400,155.40,46.00,279.97,779.5,76.286,142.03,213.0,,,17.52,22.90,12.10,39.950
2019-12-23,4.03,88.900,157.00,47.25,293.28,779.0,77.810,149.82,220.0,145.5,,17.52,23.75,12.12,39.950
2019-12-26,4.09,89.850,157.40,46.40,298.36,800.0,75.238,145.98,216.2,136.2,88.1,17.48,24.30,11.98,40.400


To get the return, we simply use the following code:

### Returns Calculation

In [10]:
returns = prices.pct_change()
returns.head()

Date,MEG,CEB,BDO,ALI,MER,AC,JGS,URC,JFC,SEVN,BMM,CAT,FGEN,MAXS,PGOLD
2014-01-02,,,,,,,,,,,,,,,
2014-01-03,0.0,0.00815,-0.007746,-0.009804,-0.007031,-0.009515,-0.003846,0.008457,-0.022172,,,,0.008559,0.013245,0.005116
2014-01-06,0.030211,0.002036,0.019163,0.009901,0.022817,0.009606,0.016731,0.003423,0.017442,-0.010204,,,0.009901,-0.006536,-0.00509
2014-01-07,-0.002933,-0.00506,0.000696,0.003922,-0.011538,-0.002854,-0.032911,-0.032662,-0.017143,-0.020619,,,0.005602,-0.006579,0.002571
2014-01-08,-0.005882,0.034518,0.009743,0.019531,0.023346,0.012405,0.026178,0.027682,-0.014535,0.0,,,0.034819,-0.030464,0.002539
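To make the return calculation above concrete, here is a minimal sketch on a toy price series. The prices and the name `TOY` are invented for illustration. It shows that `pct_change` is simply today's price divided by the previous price, minus 1, which is why the first row is always empty.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for one stock over three trading days
toy_prices = pd.Series([100.0, 110.0, 99.0], name="TOY")

# pct_change computes (P_t / P_{t-1}) - 1 for each day
toy_returns = toy_prices.pct_change()

# The same calculation written out by hand
manual_returns = toy_prices / toy_prices.shift(1) - 1

print(toy_returns)
```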


### Covariance Matrix

In [11]:
cov = returns.cov()
cov.head()

,MEG,CEB,BDO,ALI,MER,AC,JGS,URC,JFC,SEVN,BMM,CAT,FGEN,MAXS,PGOLD
MEG,0.000435,8.6e-05,0.000106,0.000141,5.8e-05,0.000119,0.000138,9.1e-05,8.1e-05,-1.1e-05,9e-06,3.1e-05,7.7e-05,8.6e-05,9e-05
CEB,8.6e-05,0.000555,6.2e-05,5.9e-05,2.2e-05,4.8e-05,7e-05,5.5e-05,3.2e-05,2e-06,1.8e-05,2.3e-05,6.3e-05,7.8e-05,4.4e-05
BDO,0.000106,6.2e-05,0.000232,9.6e-05,5e-05,7.8e-05,0.000107,9.3e-05,6.3e-05,5e-06,1.5e-05,5e-06,4.6e-05,4.2e-05,5.5e-05
ALI,0.000141,5.9e-05,9.6e-05,0.000303,5.2e-05,0.000124,0.000129,0.000109,8.7e-05,1.6e-05,2.7e-05,5e-06,7e-05,7.7e-05,6.1e-05
MER,5.8e-05,2.2e-05,5e-05,5.2e-05,0.000193,4.4e-05,7.1e-05,5.5e-05,4.3e-05,2e-06,3e-05,2.3e-05,3.1e-05,1.7e-05,2.7e-05
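For readers less familiar with the covariance matrix, the sketch below uses synthetic returns for three hypothetical tickers (`A`, `B`, `C`) to check two properties that the table above also exhibits: the diagonal holds each stock's variance, and the matrix is symmetric. Note that pandas computes each entry from the rows where both columns have data, which matters here because some stocks have missing prices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic daily returns for three hypothetical tickers
toy = pd.DataFrame(rng.normal(0.0, 0.01, size=(250, 3)), columns=["A", "B", "C"])

toy_cov = toy.cov()

# Diagonal entries are the variances of each stock's returns
diag_matches_var = np.allclose(np.diag(toy_cov), toy.var())
# Covariance matrices are symmetric: cov(A, B) == cov(B, A)
is_symmetric = np.allclose(toy_cov, toy_cov.T)
print(diag_matches_var, is_symmetric)
```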


## STEP 3: INITIALIZE THE WEIGHTS AND FUNCTION FOR THE CALCULATION OF METRICS

As we are trying to optimize the portfolio by altering the weight allocation, let's initialize our weights and calculate the initial metrics using our customized function below.

It is important that the weights chosen come from a uniform distribution from 0 to 1. Luckily, NumPy's np.random.random draws from a continuous uniform distribution over the interval [0, 1).

In [33]:
np.random.seed(10) #for replicability
weights = np.random.random(len(stocks))
weights /= np.sum(weights)
weights

array([0.10892225, 0.00293049, 0.08948081, 0.10574254, 0.0703968 ,
       0.03174472, 0.0279695 , 0.10739855, 0.02388103, 0.01247493,
       0.09678327, 0.1346337 , 0.00055756, 0.07232937, 0.11475449])

Note that we normalize the weights so that they sum to 1, that is, 100% allocation.
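One caveat worth sketching: rescaled uniform draws tend to cluster around equal weights, so concentrated portfolios are rarely sampled. A common alternative, which is not used in this article, is to draw the weights from a Dirichlet distribution with all parameters equal to 1, which is uniform over all valid long-only weight vectors. The comparison below is only illustrative, and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)
n_assets, n_draws = 15, 10_000

# Approach used in the text: uniform draws rescaled to sum to 1
u = rng.random((n_draws, n_assets))
w_uniform = u / u.sum(axis=1, keepdims=True)

# Alternative: Dirichlet(1, ..., 1) is uniform over all valid weight vectors
w_dirichlet = rng.dirichlet(np.ones(n_assets), size=n_draws)

# Average size of the largest single-stock weight under each scheme
print(w_uniform.max(axis=1).mean(), w_dirichlet.max(axis=1).mean())
```

Either scheme yields valid weights; the Dirichlet draws simply explore more of the concentrated allocations.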

### PORTFOLIO RETURNS

Because we have calculated daily returns for the stocks, we annualize them by multiplying the average daily returns by the number of trading days in a year, which is typically taken to be 252.

In [22]:
rp = (returns.mean()*252)@weights 
rp

0.2115677224809396
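The same arithmetic on a two-stock toy example may make the annualization step easier to follow. The daily means and weights below are invented for illustration.

```python
import numpy as np

# Hypothetical average daily returns for two stocks and their weights
mean_daily = np.array([0.0010, 0.0005])
toy_weights = np.array([0.6, 0.4])

TRADING_DAYS = 252
annual_returns = mean_daily * TRADING_DAYS      # per-stock annualized returns
toy_rp = annual_returns @ toy_weights           # weighted portfolio return
print(toy_rp)
```

Here the annualized returns are 25.2% and 12.6%, and the 60/40 mix gives a portfolio return of about 20.16%.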

### PORTFOLIO VARIANCE

Since we have the daily covariances as well, we annualize them by multiplying by 252. With weight vector $w$ and daily covariance matrix $\Sigma$, the annualized portfolio variance is given as $\sigma_p^2 = w^\top (252\,\Sigma)\, w$.

In [23]:
#Portfolio Variance
port_var = weights@(cov*252)@weights 
port_var

0.04448360793364385
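To see what the matrix product is doing, the sketch below expands it for two stocks and checks that it matches the familiar two-asset formula. The covariance values and weights are made up for illustration.

```python
import numpy as np

# Hypothetical daily covariance matrix for two stocks and their weights
toy_cov = np.array([[0.0004, 0.0001],
                    [0.0001, 0.0002]])
w = np.array([0.6, 0.4])

# Matrix form, annualized with 252 trading days
var_matrix = w @ (toy_cov * 252) @ w

# Expanded two-asset form: w1^2*s1^2 + w2^2*s2^2 + 2*w1*w2*s12
var_expanded = 252 * (w[0]**2 * toy_cov[0, 0]
                      + w[1]**2 * toy_cov[1, 1]
                      + 2 * w[0] * w[1] * toy_cov[0, 1])
print(var_matrix, var_expanded)
```

The cross term shows why correlations matter: the lower the covariance between the two stocks, the lower the portfolio variance for the same weights.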

### SHARPE RATIO

In [24]:
#Sharpe Ratio
rf = 0.02 #risk-free rate
sharpe = (rp-rf)/np.sqrt(port_var)
sharpe

0.9082854011482231
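As a quick check of the arithmetic, the Sharpe ratio is the excess return over the risk-free rate divided by the portfolio's standard deviation. The sketch below redoes the calculation by hand using the return and variance reported above; the variable names are chosen for this example only.

```python
import math

# Values reported earlier in this walkthrough
rp_reported = 0.2115677224809396       # annualized portfolio return
var_reported = 0.04448360793364385     # annualized portfolio variance
rf_assumed = 0.02                      # risk-free rate used above

excess_return = rp_reported - rf_assumed
volatility = math.sqrt(var_reported)   # standard deviation, about 0.2109
sharpe_check = excess_return / volatility
print(round(sharpe_check, 4))
```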

In [25]:
len(weights)

15

## FUNCTION FOR USE

In [30]:
def portfolio_metrics(weights, index='Trial'):
    
    '''
    This function computes the annualized performance metrics that will be reported and will be used
    to find the optimal weights. It relies on the globals returns, cov, and rf defined above.
    
    Parameters:
    weights: initialized weights or optimal weights for performance reporting
    index: label for the single row of the returned DataFrame
    
    '''   
    
    rp = (returns.mean()*252)@weights  # annualized expected return
    port_var = weights@(cov*252)@weights  # annualized portfolio variance
    sharpe = (rp-rf)/np.sqrt(port_var)  # excess return per unit of risk
    df = pd.DataFrame({"Expected Return": rp,
                       "Portfolio Variance":port_var,
                       'Portfolio Std': np.sqrt(port_var),
                       'Sharpe Ratio': sharpe}, index=[index])
    return df

portfolio_metrics(weights)

,Expected Return,Portfolio Variance,Portfolio Std,Sharpe Ratio
Trial,0.211568,0.044484,0.210911,0.908285


## STEP 4:  MONTE CARLO SIMULATION

In [None]:
np.random.seed(42)
metric_cols = ["Expected Return", "Portfolio Variance", "Portfolio Std", "Sharpe Ratio"]
rows = []

for i in tqdm(range(10000)):
    # Draw random weights and normalize them so that they sum to 1
    weights = np.random.random(len(stocks))
    weights /= np.sum(weights)
    metrics = portfolio_metrics(weights, i)
    # Store the weights followed by the portfolio metrics for this trial
    rows.append([*weights, *metrics.loc[i, metric_cols]])

# Build the DataFrame once; this is much faster than growing it row by row
portfolios = pd.DataFrame(rows, columns=[*stocks, *metric_cols])
portfolios
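The simulation above can also be written without a Python loop, and the best trial picked out directly. The sketch below is a self-contained illustration on invented inputs: the four-asset `mu`, `vols`, the 0.3 pairwise correlation, and `rf_rate` are all assumptions, not values from this article. It draws every trial at once, computes the metrics in vectorized form, and selects the trial with the highest Sharpe ratio.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical annualized inputs for four assets (illustrative only)
mu = np.array([0.08, 0.12, 0.10, 0.15])
vols = np.array([0.15, 0.25, 0.20, 0.30])
corr = np.full((4, 4), 0.3) + 0.7 * np.eye(4)   # 0.3 pairwise correlation
cov_annual = np.outer(vols, vols) * corr
rf_rate = 0.02

# Draw all trial portfolios at once and normalize each row to sum to 1
n_trials = 10_000
W = rng.random((n_trials, len(mu)))
W /= W.sum(axis=1, keepdims=True)

# Metrics for every trial, computed without a Python loop
exp_ret = W @ mu
port_var = np.einsum("ij,jk,ik->i", W, cov_annual, W)
sharpes = (exp_ret - rf_rate) / np.sqrt(port_var)

# The best portfolio among the trials is the one with the highest Sharpe ratio
best = int(np.argmax(sharpes))
print(W[best].round(3), sharpes[best].round(3))
```

With the `portfolios` DataFrame built above, the equivalent selection would likely be the row at `portfolios["Sharpe Ratio"].astype(float).idxmax()`.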