# Portfolio Optimization with Monte Carlo Simulation and Modern Portfolio Theory

## Abstract
This project uses Monte Carlo simulation and Modern Portfolio Theory (MPT) to determine the optimal weights of stocks in a portfolio. The goal is to construct an efficient frontier from historical stock data, identifying the portfolios that maximize expected return for a given level of risk. A Monte Carlo simulation tests many randomly generated weightings of a fixed set of stocks in order to find the optimal allocation.

## Introduction

Modern Portfolio Theory (MPT) is a theory of investment that aims to maximize expected return while minimizing risk by carefully choosing the proportion of various assets in a portfolio. At its core, MPT provides a quantitative approach to the concept of diversification that aims to help investors achieve their financial goals by constructing portfolios that balance risk and reward.

**Advantages of Modern Portfolio Theory:**

* By diversifying investments across multiple asset classes, MPT aims to optimize the risk-return tradeoff of a portfolio, potentially leading to better risk-adjusted returns.
* MPT encourages investors to assess their risk tolerance, goals, and investment horizon, which can lead to a more structured investment plan.
* The theory provides a framework for understanding portfolio construction and risk management.

**Disadvantages of Modern Portfolio Theory:**

* MPT relies on statistical estimates drawn from historical data, which can be unreliable and rest on assumptions that do not match reality.
* The theory assumes that asset returns are normally distributed, which can lead to errors when it is applied to assets whose returns are not.
* MPT measures risk by variance, which treats gains and losses symmetrically. It therefore does not specifically address downside risk, which may not be suitable for all investors.


The Efficient Frontier is a key concept in MPT: it is the set of optimal portfolios that provide the highest expected return for a given level of risk or, equivalently, the lowest level of risk for a given expected return. It is found by plotting the expected return of many candidate portfolios against their risk and keeping only those that no other portfolio beats on both dimensions.
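
As a sketch of the same idea in symbols, assume the usual notation (not defined in the text above) of a weight vector $w$, a vector of expected asset returns $\mu$, a covariance matrix $\Sigma$, and a target return $R^{*}$. Each point on a long-only frontier can then be written as the solution of:

$$\min_{w} \; w^\top \Sigma w \quad \text{subject to} \quad w^\top \mu = R^{*}, \quad \sum_i w_i = 1, \quad 0 \le w_i \le 1$$

Sweeping $R^{*}$ over a range of achievable returns and solving the problem each time traces out the frontier.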

Monte Carlo simulation is a mathematical technique for estimating the probability of a range of outcomes when random variables are involved. It uses computer programs to run many random experiments and analyzes the results to gain insight into the likelihood of each outcome. By simulating a large number of candidate portfolios and evaluating each one under MPT, the two methods can be combined to identify the portfolio weights that yield the highest return at the lowest level of risk.
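
The combination described above can be sketched on toy data as follows. The expected returns, the covariance matrix, the zero risk-free rate, and all variable names are assumptions made for illustration. They are not taken from the project's data.

```python
import numpy as np

# Hypothetical annualized expected returns and covariance for three assets.
mu = np.array([0.08, 0.12, 0.15])
cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])

rng = np.random.default_rng(0)
n_portfolios = 5_000

# Draw random positive weights and normalize each row so it sums to 1.
weights = rng.random((n_portfolios, len(mu)))
weights /= weights.sum(axis=1, keepdims=True)

# Portfolio return w.mu and volatility sqrt(w' cov w) for every sampled portfolio.
returns = weights @ mu
vols = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights))

# Sharpe ratio with a risk-free rate of 0 (an assumption of this sketch).
sharpe = returns / vols
best = sharpe.argmax()
print(weights[best].round(3), returns[best].round(4), vols[best].round(4))
```

Random sampling only approximates the best portfolio. More samples narrow the gap, while a numerical optimizer can refine the result.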

## Implementation

### Extract Data
Extract data from the Financial Modeling Prep API.

In [None]:
%pip install python-dotenv
%pip install scipy

In [None]:
import os
from dotenv import load_dotenv
import ssl
from urllib.request import urlopen
import numpy as np
import datetime as dt
import pandas as pd
import matplotlib.pyplot as plt
import json

load_dotenv()
API_KEY = os.getenv('API_KEY')
base_url = "https://financialmodelingprep.com/api/v3"

In [None]:
def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict
    """
    context = ssl.create_default_context()
    response = urlopen(url, context=context)
    data = response.read().decode("utf-8")
    return json.loads(data)

def get_historical_price_full(stickers, file_path):
    """
    Extract historical 1yr daily prices for the given stock tickers and save them to a JSON file

    Parameters:
      stickers (list): list of stock ticker symbols
      file_path (str): path of the JSON file to write

    Returns:
      Parsed JSON response with the historical prices of all stocks in the list
    """
    stickers_str = ','.join(stickers)

    url = f"{base_url}/historical-price-full/{stickers_str}?apikey={API_KEY}"

    data = get_jsonparsed_data(url)

    with open(file_path, "w") as f:
        if len(stickers) == 1:
            # Single-ticker responses are saved as returned
            json.dump(data, f)
        else:
            # Multi-ticker responses nest the per-stock list under this key
            json.dump(data["historicalStockList"], f)

    return data

def get_quote(stickers, file_path):
    """
    Extract the current quote for the given stock tickers and save it to a JSON file

    Parameters:
      stickers (list): list of stock ticker symbols
      file_path (str): path of the JSON file to write

    Returns:
      Parsed JSON response with the current quote of each stock in the list
    """
    stickers_str = ','.join(stickers)

    url = f"{base_url}/quote/{stickers_str}?apikey={API_KEY}"

    data = get_jsonparsed_data(url)

    with open(file_path, "w") as f:
        json.dump(data, f)
    return data
# data = get_quote(stickers, "quote.json")

### Process Data

In [41]:
def get_json_data(file_path):
  # open the JSON file
  with open(file_path, 'r') as f:
      # load the JSON object into a Python object
      json_obj = json.load(f)

  columns = list(json_obj[0]['historical'][0].keys())
  df = pd.DataFrame(columns=['symbol'] + columns)
  for stock in json_obj:
    symbol = stock['symbol']
    historical = stock['historical']
    data = pd.DataFrame(historical, columns=columns)
    data.insert(0, 'symbol', symbol)
    df = pd.concat([df, data])
  return df

def get_data(file_path):
  """
  Return a cleaned pandas dataframe of historical full stock price information from json file
  """
  df = pd.read_json(file_path)
  # use explode to split a list to multiple rows
  explode_df = df.explode('historical')
  # use apply and pd.Series to split dictionary column into multiple columns
  normalize_df = explode_df['historical'].apply(pd.Series)
  # concatenate 'symbol' column to the normalized_df by column (axis=1)
  df_final = pd.concat([explode_df['symbol'], normalize_df], axis=1)
  return df_final
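
As a possible alternative to the explode-and-apply steps above, `pd.json_normalize` can flatten the nested records in one call. The sketch below uses a made-up payload that mirrors only the fields the code above relies on (`symbol`, `historical`, `date`, `adjClose`). Real responses contain more fields.

```python
import pandas as pd

# Toy payload: one dict per symbol with a nested list of daily records.
# All values are made up for illustration.
payload = [
    {"symbol": "AAA", "historical": [
        {"date": "2023-01-03", "adjClose": 10.0},
        {"date": "2023-01-04", "adjClose": 10.5},
    ]},
    {"symbol": "BBB", "historical": [
        {"date": "2023-01-03", "adjClose": 20.0},
        {"date": "2023-01-04", "adjClose": 19.0},
    ]},
]

# record_path flattens the nested list into rows; meta carries the symbol along.
flat = pd.json_normalize(payload, record_path="historical", meta=["symbol"])
flat = flat[["symbol", "date", "adjClose"]]
print(flat)
```

The result has one row per symbol and date, which is the shape that `df.pivot(index='date', columns='symbol', ...)` expects.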


### Calculate Portfolio Return with Modern Portfolio Theory

In [42]:
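def cal_return(df):
  """
  Pivot adjusted close prices into one column per stock symbol and compute daily returns

  Returns:
    pivot_df (pandas dataframe): adjusted close price per date (rows) and symbol (columns)
    mean_returns (pandas series): mean daily return of each stock
    cov_matrix (pandas dataframe): covariance matrix of the daily returns
  """
  pivot_df = df.pivot(index='date', columns='symbol', values='adjClose')
  returns = pivot_df.pct_change()
  mean_returns = returns.mean()
  cov_matrix = returns.cov()
  return pivot_df, mean_returns, cov_matrix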
  
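def cal_portfolio_performance(weights, mean_returns, cov_matrix):
  """
  Given portfolio weights, calculate the portfolio return and standard deviation based on modern portfolio theory

  Parameters:
    weights (numpy array): array of weights for each stock, in the same order as mean_returns
    mean_returns (pandas series): mean daily return of each stock
    cov_matrix (pandas dataframe): covariance matrix of the daily returns

  Returns:
    portfolio_returns (float): sum(mean_returns * weights) * trading_days (annualized)
    portfolio_std (float): sqrt(weights_transposed * cov_matrix * weights) (daily, not annualized)
  """
  trading_days = 252
  # Keep full precision: the SLSQP optimizers below estimate gradients from
  # tiny changes in these values.
  portfolio_returns = np.sum(mean_returns * weights) * trading_days
  # Daily volatility; multiply by np.sqrt(trading_days) for an annualized figure.
  portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
  return portfolio_returns, portfolio_std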
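
To make the two formulas concrete, here is a small worked example on two assets. The daily means, covariances, and weights are hypothetical. The square-root-of-time scaling of volatility is a common convention that assumes independent daily returns.

```python
import numpy as np

# Hypothetical daily statistics for two assets.
mean_daily = np.array([0.0004, 0.0008])
cov_daily = np.array([
    [0.0001, 0.00002],
    [0.00002, 0.0004],
])
weights = np.array([0.6, 0.4])
trading_days = 252

# Annualized expected return: sum(w_i * mean_i) * trading days.
annual_return = float(weights @ mean_daily) * trading_days

# Daily volatility sqrt(w' cov w), then scaled by sqrt(trading days) to annualize.
daily_std = float(np.sqrt(weights @ cov_daily @ weights))
annual_std = daily_std * np.sqrt(trading_days)

print(round(annual_return, 4), round(daily_std, 4), round(annual_std, 4))
```

Comparing an annualized return with a daily volatility mixes time scales, so a ratio of the two is easier to interpret when both are on the same horizon.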

### Optimize the portfolio

#### Option 1: Maximize Sharpe Ratio

The Sharpe ratio is a financial metric that measures the risk-adjusted return of an investment or portfolio. It takes into account both the investment's returns and the risk involved in achieving those returns, and compares them to a risk-free investment, such as a Treasury bill. The higher the Sharpe ratio, the better the investment has performed in terms of returns per unit of risk. It is commonly used in finance to evaluate the performance of investments and to compare different investment opportunities.

$$S(R_p, R_f, \sigma_p) = \frac{R_p - R_f}{\sigma_p}$$

Where:
* S is the Sharpe ratio
* $R_p$ is the expected portfolio return
* $R_f$ is the risk-free rate
* $\sigma_p$ is the portfolio's standard deviation (i.e., a measure of its risk).
* The higher the value of S, the better the portfolio's risk-adjusted performance.
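
As a quick numeric illustration of the formula, the sketch below uses hypothetical annual figures. The function name and the inputs are made up for this example.

```python
def sharpe_ratio(portfolio_return, risk_free_rate, portfolio_std):
    """Excess return per unit of risk; all inputs must share one time scale."""
    return (portfolio_return - risk_free_rate) / portfolio_std

# Hypothetical annual figures: 10% return, 2% risk-free rate, 16% volatility.
example = sharpe_ratio(0.10, 0.02, 0.16)
print(example)
```

With these inputs the ratio is roughly 0.5: the portfolio earns about half a percentage point of excess return for each percentage point of volatility.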

In [64]:
from scipy.optimize import minimize


def negative_sharpe_ratio(weights, mean_returns, cov_matrix, risk_free_rate = 0):
    portfolio_returns, portfolio_std = cal_portfolio_performance(weights, mean_returns, cov_matrix)
    sharpe_ratio = (portfolio_returns - risk_free_rate)/portfolio_std
    return -sharpe_ratio

def maximize_sharpe_ratio(mean_returns, cov_matrix, risk_free_rate = 0, constrain_set = (0,1)):
    """
    Minimize the negative sharpe ratio by altering the weights of the portfolio
    """
    num_assets = len(mean_returns)
    args = ( mean_returns, cov_matrix, risk_free_rate)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = constrain_set
    bounds = tuple(bound for asset in range(num_assets) )
    max_sr_result = minimize(negative_sharpe_ratio, num_assets*[1.0/num_assets], args=args,
                        method = 'SLSQP', bounds=bounds, constraints=constraints)
    
    # Evaluate performance with the raw weights (which sum to 1); percentages are for display only
    max_sr_returns, max_sr_std = cal_portfolio_performance(max_sr_result['x'], mean_returns, cov_matrix)
    max_sr_weights = np.around(max_sr_result['x']*100, decimals=3)
    max_sr_allocation = pd.DataFrame(max_sr_weights, index=mean_returns.index, columns=['allocation'])
    return max_sr_result, max_sr_returns, max_sr_std, max_sr_allocation


#### Option 2: Minimize variance

Find the portfolio weights that give the minimum volatility.

In [44]:
def cal_portfolio_variance(weights, mean_returns, cov_matrix):
    """ Returns only the standard deviation of the portfolio """
    return cal_portfolio_performance(weights, mean_returns, cov_matrix)[1]

def minimize_variance(mean_returns, cov_matrix, constraint_set=(0,1)):
    """Minimize the portfolio variance by altering the 
     weights/allocation of assets in the portfolio"""
    num_assets = len(mean_returns)
    args = (mean_returns, cov_matrix)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = constraint_set
    bounds = tuple(bound for asset in range(num_assets))
    min_var_result = minimize(cal_portfolio_variance, num_assets*[1./num_assets], args=args,
                        method='SLSQP', bounds=bounds, constraints=constraints)
    
    # Evaluate performance with the raw weights (which sum to 1); percentages are for display only
    min_var_returns, min_var_std = cal_portfolio_performance(min_var_result['x'], mean_returns, cov_matrix)
    min_var_weights = np.around(min_var_result['x']*100, decimals=3)
    min_var_allocation = pd.DataFrame(min_var_weights, index=mean_returns.index, columns=['allocation'])
    return min_var_result, min_var_returns, min_var_std, min_var_allocation


### Efficient Frontier

In [45]:
def cal_portfolio_return(weights, mean_returns, cov_matrix):
    """ Returns only the expected return of the portfolio """
    return cal_portfolio_performance(weights, mean_returns, cov_matrix)[0]

def cal_efficient_opt(mean_returns, cov_matrix, return_target, constraint_set=(0,1)):
    """For a given return_target, find the portfolio weights with the minimum volatility"""
    num_assets = len(mean_returns)
    args = (mean_returns, cov_matrix)
    constraints = ({'type':'eq', 'fun': lambda x: cal_portfolio_return(x, mean_returns, cov_matrix) - return_target},
                    {'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = constraint_set
    bounds = tuple(bound for asset in range(num_assets))
    eff_opt = minimize(cal_portfolio_variance, num_assets*[1./num_assets], args=args, method = 'SLSQP', bounds=bounds, constraints=constraints)
    return eff_opt
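
One way to use such a per-target optimization is to sweep a grid of target returns and collect the minimum volatility at each, which traces the frontier. The sketch below is self-contained and runs on toy data. The inputs, the helper name `min_vol_for_target`, and the target grid are assumptions for illustration, not part of the notebook.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical annualized inputs for three assets.
mu = np.array([0.08, 0.12, 0.15])
cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])

def min_vol_for_target(target):
    """Long-only weights with the lowest volatility whose return equals target."""
    n = len(mu)
    constraints = (
        {"type": "eq", "fun": lambda w: w @ mu - target},  # hit the target return
        {"type": "eq", "fun": lambda w: np.sum(w) - 1},    # fully invested
    )
    return minimize(
        lambda w: np.sqrt(w @ cov @ w),
        np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=[(0, 1)] * n,
        constraints=constraints,
    )

# Sweep targets that lie between the lowest and highest single-asset return.
targets = np.linspace(0.09, 0.14, 6)
frontier = [(t, min_vol_for_target(t)) for t in targets]
for t, res in frontier:
    print(round(t, 3), round(res.fun, 4), res.x.round(3))
```

Plotting `res.fun` against each target would give the frontier curve. Targets outside the range of achievable returns make the problem infeasible, so it is worth checking `res.success` before using a result.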

## Verify

In [46]:
file_name = "historical.json"
file_path = f"{os.getcwd()}/data/{file_name}"

In [47]:
# stickers = ['AAPL', 'MSFT', 'TSLA', 'LCID', 'PFE', 'ABBV', 'RIVN', 'NVDA', 'AMD']
# historical = get_historical_price_full(stickers, file_path)

In [55]:
df = get_data(file_path)
stocks = df['symbol'].unique()

In [49]:
pivot_df, mean_returns, cov_matrix = cal_return(df)

In [80]:
def cal_portfolio_metrics(weights, mean_returns, cov_matrix, risk_free_rate = 0.0, index=0):
    '''
    Generate the performance metrics that are reported and used to find the optimal weights.

    Parameters
    ---
    weights (numpy array): initialized weights or optimal weights for performance reporting
    mean_returns (pandas series): historical mean daily return of each stock
    cov_matrix (pd dataframe): covariance matrix of stock returns
    risk_free_rate (float): risk free rate such as t-bill, default is 0.0
    index (int): row label of the returned dataframe, default is 0

    Returns
    ---
    pandas dataframe with a single row of portfolio performance metrics
    '''
    portfolio_returns, portfolio_std = cal_portfolio_performance(weights, mean_returns, cov_matrix)
    # Round to 4 decimals for reporting
    portfolio_returns, portfolio_std = round(portfolio_returns, 4), round(portfolio_std, 4)
    sharpe = (portfolio_returns - risk_free_rate)/portfolio_std
    df = pd.DataFrame({"Expected Return": portfolio_returns,
                       "Portfolio Variance": portfolio_std**2,
                       'Portfolio Std': portfolio_std,
                       'Sharpe Ratio': sharpe}, index=[index])
    return df

In [79]:
def simulate_portfolios(mean_returns, cov_matrix, risk_free_rate=0.0, n=100):
    """
    Given the historical mean_returns and cov_matrix of the portfolio, as well as the risk_free_rate,
    simulate n portfolios with different random weights and report their performance

    Parameters
    ---
    mean_returns (pandas series): historical mean daily return of each stock
    cov_matrix (pd dataframe): covariance matrix of stock returns
    risk_free_rate (float): risk free rate such as t-bill, default is 0.0
    n (int): number of simulations, default is 100

    Returns
    ---
    portfolios (pandas dataframe): pandas dataframe of n portfolios and their expected performances
    """
    metric_columns = ["Expected Return", "Portfolio Variance", "Portfolio Std", "Sharpe Ratio"]
    # Fixed seed so the simulation is reproducible
    np.random.seed(42)
    # NOTE: `stocks` is the global array of symbols defined above. Weights are applied to
    # mean_returns and cov_matrix by position, so the column labels only match the weights
    # if `stocks` is in the same order as mean_returns.index (pivoting sorts the symbols).
    portfolios = pd.DataFrame(columns=[*stocks, *metric_columns])
    for i in range(n):
        # Random weights, normalized to sum to 1
        weights = np.random.random(len(stocks))
        weights /= np.sum(weights)
        portfolios.loc[i, stocks] = weights
        metrics = cal_portfolio_metrics(weights, mean_returns, cov_matrix, risk_free_rate = risk_free_rate, index=i)
        portfolios.loc[i, metric_columns] = metrics.loc[i, metric_columns]

    return portfolios


In [81]:
portfolios = simulate_portfolios(mean_returns, cov_matrix, risk_free_rate=0.0, n=10000)
portfolios[portfolios["Sharpe Ratio"]==portfolios["Sharpe Ratio"].max()]

| | AAPL | MSFT | TSLA | LCID | PFE | Expected Return | Portfolio Variance | Portfolio Std | Sharpe Ratio |
|---|---|---|---|---|---|---|---|---|---|
| 4222 | 0.552807 | 0.009441 | 0.389236 | 0.044609 | 0.003906 | -0.0105 | 0.000412 | 0.0203 | -0.517241 |
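
A frame that is filled row by row from an empty `pd.DataFrame(columns=...)` may end up with `object` columns, so casting the metric columns to `float` before ranking can be safer. The sketch below shows one way to pick the maximum-Sharpe and minimum-volatility rows. The toy values are made up and are not simulation results.

```python
import pandas as pd

# Toy stand-in for the simulated results, built with object dtype on purpose.
portfolios = pd.DataFrame(
    {
        "Portfolio Std": [0.020, 0.015, 0.030],
        "Sharpe Ratio": [0.40, 0.35, 0.55],
    },
    dtype=object,
)

# Cast to float so idxmax/idxmin operate on numeric columns.
metrics = portfolios[["Portfolio Std", "Sharpe Ratio"]].astype(float)

max_sharpe = portfolios.loc[metrics["Sharpe Ratio"].idxmax()]
min_vol = portfolios.loc[metrics["Portfolio Std"].idxmin()]
print(max_sharpe, min_vol, sep="\n\n")
```

`idxmax` and `idxmin` return the row label of the first extreme value, which avoids comparing floats for equality.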
