In [28]:
import requests 
from dotenv import load_dotenv
import json      
import pandas as pd  
import os
load_dotenv()

True

# Quick Introduction

This notebook is a supplementary resource for the workshop “Stock Portfolio Optimization with Python” using the Sectors API. It covers these topics: stock investment background and understanding the Sectors API, stock selection overview and two stock portfolio optimization models, followed by further recommendations and insights.

Using the Sectors API, investors gain access to valuable data from the “**Companies by Index**” and “**Daily Transaction Data**” endpoints, including stock indexes, dates, closing prices, volumes, and market capitalizations. Next, the “**Company Report**” API provides additional insights such as EPS, dividends, growth metrics, and other relevant financial information.

In **Section 3**, we will explore how to effectively query this data from the API to build a portfolio optimization model. However, before diving into optimization, it’s crucial to **understand the available stock indexes** and their **respective investment purposes**.

# Section 1 - Sectors API & Stock Investment Overview

When considering stock investments, individual investors typically focus on key questions such as: Which stocks to buy? How many shares? What’s the price, growth potential, and risk?

In [29]:
import pandas as pd

index_data = {

              #list of stock indexes from Sectors API

    'Index': ['FTSE', 'IDX30', 'IDXBUMN20', 'IDXESGL', 'IDXG30', 'IDXHIDIV20',
              'IDXQ30', 'IDXV30', 'JII70', 'KOMPAS100', 'LQ45', 'SMInfra18',
              'SRIKEHA18', 'SRIKEHATI'],

    'Focus': ['Globally recognized index of large-cap companies',
              'Top 30 stocks by market cap and liquidity',
              'Top 20 government-owned enterprises (BUMN)',
              'Stocks meeting ESG (environmental, social, governance) standards',
              'Large-cap, high liquidity growth stocks',
              '20 stocks with high dividend yields',
              'Focus on quality stocks based on financial metrics',
              'Focus on value stocks trading below intrinsic value',
              '70 stocks complying with Shariah (Islamic law)',
              '100 most liquid, actively traded stocks',
              'Top 45 most liquid stocks with large market caps',
              '18 infrastructure-related stocks',
              'Tracks sustainability and social responsibility',
              'Sustainability and ethical investing'],

    'Investment Use': ['For investors seeking to track global or broad market performance',
                       'Suitable for blue-chip stock investors looking for stable and liquid companies',
                       'For exposure to state-owned enterprises (SOEs) benefiting from government policies',
                       'Ideal for socially responsible investors focused on sustainability and ethical investing',
                       'For growth-oriented investors looking for long-term capital appreciation',
                       'Attractive to income-seeking investors focused on dividend income',
                       'Suitable for long-term investors seeking companies with strong fundamentals',
                       'Ideal for value investors looking for undervalued stocks',
                       'Suitable for Shariah-compliant investors following Islamic investment principles',
                       'For investors seeking diversified exposure to Indonesia’s liquid stocks',
                       'Blue-chip focused, for investors looking for stability and long-term growth potential',
                       'For investors bullish on infrastructure growth and development projects in Indonesia',
                       'Ideal for ESG investors prioritizing sustainable business practices',
                       'Same as SRIKEHA18, for socially responsible investors']
}

# Load into a pandas DataFrame
df_index_info = pd.DataFrame(index_data)

# Display the DataFrame
df_index_info

Unnamed: 0,Index,Focus,Investment Use
0,FTSE,Globally recognized index of large-cap companies,For investors seeking to track global or broad...
1,IDX30,Top 30 stocks by market cap and liquidity,Suitable for blue-chip stock investors looking...
2,IDXBUMN20,Top 20 government-owned enterprises (BUMN),For exposure to state-owned enterprises (SOEs)...
3,IDXESGL,"Stocks meeting ESG (environmental, social, gov...",Ideal for socially responsible investors focus...
4,IDXG30,"Large-cap, high liquidity growth stocks",For growth-oriented investors looking for long...
5,IDXHIDIV20,20 stocks with high dividend yields,Attractive to income-seeking investors focused...
6,IDXQ30,Focus on quality stocks based on financial met...,Suitable for long-term investors seeking compa...
7,IDXV30,Focus on value stocks trading below intrinsic ...,Ideal for value investors looking for underval...
8,JII70,70 stocks complying with Shariah (Islamic law),Suitable for Shariah-compliant investors follo...
9,KOMPAS100,"100 most liquid, actively traded stocks",For investors seeking diversified exposure to ...


Which index caught your attention? What do you want to choose?

# Section 2 - Where to start?

Recommendation: **Match your investment goals with the properties of each stock index**

In [30]:
import pandas as pd

# Dictionary representing the stock index information with persona examples
index_criteria = {
    'Criteria': [
        'Liquidity & Stability',
        'Government-Owned Enterprises',
        'Dividend Focus',
        'Growth-Oriented',
        'Value Stocks',
        'High-Quality Financials',
        'Shariah-Compliant Investments',
        'Socially Responsible Investments',
        'Infrastructure Focus',
        'Broad Market Exposure'
    ],
    'Description': [
        'Investors looking for highly liquid and stable stocks that are less volatile.',
        'For investors interested in companies benefiting from government backing and policies.',
        'Ideal for those seeking regular income from dividends.',
        'Suitable for long-term investors focusing on capital appreciation.',
        'Investors seeking undervalued stocks trading below intrinsic value.',
        'Focus on stocks with strong fundamentals and good financial health.',
        'For investors following Islamic principles.',
        'Investors interested in ESG (environmental, social, governance) and ethical business practices.',
        'For investors bullish on Indonesia\'s infrastructure growth.',
        'For investors wanting diversified exposure across large segments of the market.'
    ],
    'Persona': [
        'Maria, a risk-averse retiree seeking low volatility and stable returns.',
        'Adi, a public sector enthusiast who trusts government-driven initiatives.',
        'Siti, a conservative investor who prefers stable income from dividends.',
        'Kevin, a young professional aiming for long-term wealth through capital growth.',
        'Tom, a value investor who looks for bargain stocks below intrinsic value.',
        'Dewi, a financial analyst who invests in companies with strong fundamentals.',
        'Ahmad, a devout Muslim who prioritizes Shariah-compliant investments.',
        'Sarah, an environmentally conscious investor focusing on ethical companies.',
        'Indra, an infrastructure expert optimistic about Indonesia\'s construction growth.',
        'Emily, a diversified investor looking for broad market exposure and lower risk.'
    ],
    'Recommended Index': [
        'IDX30, LQ45, KOMPAS100',
        'IDXBUMN20',
        'IDXHIDIV20',
        'IDXG30',
        'IDXV30',
        'IDXQ30',
        'JII70',
        'IDXESGL, SRIKEHA18, SRIKEHATI',
        'SMInfra18',
        'FTSE, KOMPAS100'
    ]
}

df_index_criteria = pd.DataFrame(index_criteria)

df_index_criteria

Unnamed: 0,Criteria,Description,Persona,Recommended Index
0,Liquidity & Stability,Investors looking for highly liquid and stable...,"Maria, a risk-averse retiree seeking low volat...","IDX30, LQ45, KOMPAS100"
1,Government-Owned Enterprises,For investors interested in companies benefiti...,"Adi, a public sector enthusiast who trusts gov...",IDXBUMN20
2,Dividend Focus,Ideal for those seeking regular income from di...,"Siti, a conservative investor who prefers stab...",IDXHIDIV20
3,Growth-Oriented,Suitable for long-term investors focusing on c...,"Kevin, a young professional aiming for long-te...",IDXG30
4,Value Stocks,Investors seeking undervalued stocks trading b...,"Tom, a value investor who looks for bargain st...",IDXV30
5,High-Quality Financials,Focus on stocks with strong fundamentals and g...,"Dewi, a financial analyst who invests in compa...",IDXQ30
6,Shariah-Compliant Investments,For investors following Islamic principles.,"Ahmad, a devout Muslim who prioritizes Shariah...",JII70
7,Socially Responsible Investments,"Investors interested in ESG (environmental, so...","Sarah, an environmentally conscious investor f...","IDXESGL, SRIKEHA18, SRIKEHATI"
8,Infrastructure Focus,For investors bullish on Indonesia's infrastru...,"Indra, an infrastructure expert optimistic abo...",SMInfra18
9,Broad Market Exposure,For investors wanting diversified exposure acr...,"Emily, a diversified investor looking for broa...","FTSE, KOMPAS100"


While it’s crucial to evaluate a stock’s internal characteristics—such as its nature, properties, and the type of investor it appeals to—external macroeconomic factors play an equally important role in guiding stock index selection. **These broader economic conditions, including inflation rates, government policies, and global market trends, are key determinants in the performance of various sectors**. Therefore, understanding both your investment persona and the prevailing macroeconomic landscape is essential for making informed decisions.

Now let’s assume our client is Indra, an infrastructure expert optimistic about Indonesia’s construction growth. Before committing solely to SMInfra18, we should check the following macroeconomic factors:

- Government infrastructure spending: Announcements of new airports and highways as public projects.
- Interest rate policies: Indonesia’s central bank lowers interest rates.
- Commodity prices: Steel and cement prices drop due to global supply chain restructuring.
- Foreign direct investment: China increases investment in Indonesia’s high-speed rail projects.

The rest of this notebook is built for our client, Indra.

Section 4 discusses how to choose specific stocks from an index.

# Section 3 - Data Collection from Sectors API

## Section 3.1 - Stock Price Information

In [31]:
# Retrieve the index constituents from the "Companies by Index" API

import os
import time
import requests

# Read the API key from the environment (populated by load_dotenv() in the first cell)
api_key = os.getenv('SECTORS_API_KEY')

# Define the API URL
url = "https://api.sectors.app/v1/index/sminfra18/"

# Pass the API key in the header
headers = {"Authorization": api_key}

# Make the API request
response_company_index = requests.get(url, headers=headers)

print(response_company_index.text)

In [None]:
# Retrieve date and price from "Daily Transaction Data" API

from datetime import datetime, timedelta

# Function to calculate the date 90 days ago from today
def calculate_start_date(days_ago=90):
    return (datetime.now() - timedelta(days=days_ago)).strftime('%Y-%m-%d')

# Calculate the start date 90 days ago
start_date = calculate_start_date()

# Request the daily transaction data for each company in the index
history_sminfra18 = []

for company in response_company_index.json():

  # Define the URL for the API endpoint
  url = "https://api.sectors.app/v1/daily/" + company['symbol'] + "/"

  # Define the query string with the calculated start date
  querystring = {"start": start_date}

  # Reuse the API key retrieved in the previous cell
  headers = {"Authorization": api_key}

  response_daily_transaction_data = requests.get(url, headers=headers, params=querystring)

  # Append the result to the target list
  history_sminfra18.append(response_daily_transaction_data.json())

  # Pause briefly between requests
  time.sleep(1)

In [None]:
history_sminfra18
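
The request loops in this section repeat the same URL building, header, and JSON decoding steps. As a minimal sketch, those steps could be wrapped in one helper that also raises on HTTP errors instead of appending an error payload to the result list. The name `fetch_json` and its `session` and `timeout` parameters are hypothetical and not part of the Sectors API; only the base URL and the `Authorization` header come from the cells above.

```python
BASE_URL = "https://api.sectors.app/v1"  # base URL used by the cells above


def fetch_json(path, api_key, params=None, session=None, timeout=10):
    """GET a Sectors API path and return the decoded JSON body (hypothetical helper)."""
    if session is None:
        # Imported lazily so that a stub session can be passed in for testing
        import requests
        session = requests
    url = f"{BASE_URL}/{path.strip('/')}/"
    response = session.get(
        url,
        headers={"Authorization": api_key},
        params=params,
        timeout=timeout,
    )
    # Raise an exception on 4xx/5xx responses rather than returning an error body
    response.raise_for_status()
    return response.json()
```

With this helper, the daily data request for one symbol would read `fetch_json("daily/" + symbol, api_key, params={"start": start_date})`.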

## Section 3.2 - Company Report Information

In [None]:
company_report_sminfra18 = []

for company in response_company_index.json():

  # Define the URL for the API endpoint
  url = "https://api.sectors.app/v1/company/report/" + company['symbol'] + "/"

  # Reuse the API key retrieved in Section 3.1
  headers = {"Authorization": api_key}

  # Make the API request
  response_company_report = requests.get(url, headers=headers)

  # Append the result to the target list
  company_report_sminfra18.append(response_company_report.json())

  # Pause briefly between requests
  time.sleep(1)

In [None]:
# Create a subset of each company report with the fields needed for analysis
prepared_data = []

# Loop through the companies in the index
for company in company_report_sminfra18:
  current_company = {}

  # Basic company information
  current_company['symbol'] = company['symbol']
  current_company['company_name'] = company['company_name']
  current_company['industry'] = company['overview']['industry']
  current_company['sub_industry'] = company['overview']['sub_industry']
  current_company['sector'] = company['overview']['sector']

  # The forecast lists ("company_value_forecasts" and "company_growth_forecasts")
  # contain entries for both 2024 and 2025, and the years appear in the same order in both lists.
  # We therefore look up the position of the 2025 entry once and reuse it for both lists.
  # The position is reset for every company so that a missing forecast never reuses
  # the position found for the previous company.
  position = None
  value_forecasts = company['future']['company_value_forecasts']
  if value_forecasts is not None:
    for i, forecast in enumerate(value_forecasts):
      if forecast['estimate_year'] == 2025:
        position = i

  # Fall back to 0 when a value is missing (no 2025 entry, empty list, or absent key)
  try:
    current_company['eps_estimate'] = company['future']['company_value_forecasts'][position]['eps_estimate']
  except (KeyError, IndexError, TypeError):
    current_company['eps_estimate'] = 0

  try:
    current_company['revenue_estimate'] = company['future']['company_value_forecasts'][position]['revenue_estimate']
  except (KeyError, IndexError, TypeError):
    current_company['revenue_estimate'] = 0

  try:
    current_company['eps_growth'] = company['future']['company_growth_forecasts'][position]['eps_growth']
  except (KeyError, IndexError, TypeError):
    current_company['eps_growth'] = 0

  try:
    current_company['revenue_growth'] = company['future']['company_growth_forecasts'][position]['revenue_growth']
  except (KeyError, IndexError, TypeError):
    current_company['revenue_growth'] = 0

  try:
    current_company['total_dividends'] = company['dividend']['annual_yield'][0]['total']
  except (KeyError, IndexError, TypeError):
    current_company['total_dividends'] = 0

  try:
    current_company['avg_yield_dividends'] = company['dividend']['dividend_yield_avg']['avg_yield']
  except (KeyError, IndexError, TypeError):
    current_company['avg_yield_dividends'] = 0

  prepared_data.append(current_company)

prepared_data

# Section 4 - Which Stock to choose?

## Section 4.1 - Data Preprocessing

**How to select stocks?**

For our client Indra, who is focused on stocks from the SMInfra18 index and doesn't want to invest in the entire index, it’s time to carefully select the right stocks.

Several key economic factors should guide stock selection from an index:

- Diversification by Industry and Sector: Ensuring exposure across different sectors to reduce risk.
- Diversification by Correlation and Returns: Selecting stocks that offer varied returns and are not too closely correlated.
- Company Growth and Dividend Information: Choosing companies with strong growth potential and reliable dividend payouts.

Other factors, like recent projects, mergers, acquisitions, and broader macroeconomic or geopolitical policies, are also worth considering, though we won’t delve into them here.

In this section, we will employ a range of data analysis techniques, along with our expertise in stock selection, to identify two suitable stocks from a pool of 18 for our client, Indra, to invest in.

Step 1: Create a DataFrame for historical stock performance

In [None]:
# Import all useful libraries
import numpy as np
import pandas as pd

In [None]:
# Flatten the list of lists into a single list of dictionaries
flattened_data = [item for sublist in history_sminfra18 for item in sublist]

# Convert to a pandas DataFrame
df_history_sminfra18 = pd.DataFrame(flattened_data)

# Ensure 'date' is in datetime format
df_history_sminfra18['date'] = pd.to_datetime(df_history_sminfra18['date'])

# Enforce in case any column needs conversion (e.g., 'close', 'volume', 'market_cap')
df_history_sminfra18['close'] = pd.to_numeric(df_history_sminfra18['close'], errors='coerce')
df_history_sminfra18['volume'] = pd.to_numeric(df_history_sminfra18['volume'], errors='coerce')
df_history_sminfra18['market_cap'] = pd.to_numeric(df_history_sminfra18['market_cap'], errors='coerce')

# Check the first few rows to ensure everything is correct
print(df_history_sminfra18.head())

In [None]:
# We aim to have all the stock price evolution in the past 90 days in one dataframe

#Pivot the DataFrame so that the 'symbol' becomes the column names and 'date' becomes the index
df_pivot_history_sminfra18 = df_history_sminfra18.pivot_table(index='date', columns='symbol', values='close')
df_pivot_history_sminfra18.head()
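
To make the reshaping step concrete, here is a small sketch on made-up data. The symbols and prices are hypothetical placeholders, not values returned by the API. It shows how a long table with one row per (date, symbol) pair becomes a wide table with one column per symbol.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, symbol) pair
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-03"]),
    "symbol": ["AAA.JK", "BBB.JK", "AAA.JK", "BBB.JK"],
    "close": [100.0, 50.0, 102.0, 49.0],
})

# Wide format: dates as the index, one column of closing prices per symbol
wide_df = long_df.pivot_table(index="date", columns="symbol", values="close")
print(wide_df)
```

Note that `pivot_table` averages the values by default when the same (date, symbol) pair appears more than once, so duplicated rows would be merged silently rather than raising an error.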

In [None]:
#Save the DataFrame to a CSV file -- we need to use this for MVO analysis
#df_pivot_history_sminfra18.to_csv('stock_prices_sminfra18.csv', index=False)

Step 2: Create a DataFrame for company report

In [None]:
df_sminfra18_company_report = pd.DataFrame(prepared_data)
df_sminfra18_company_report

## Section 4.2 - Exploratory Data Analysis (EDA)

Our goal is to select two or three stocks from a list of 18 for Indra to invest in.

At the beginning of this section, we outlined three strategies:

- Diversification by Industry and Sector: Conduct an overview analysis to understand the information better.
- Diversification by Correlation: Examine the correlation matrix based on stock price evolution.
- Company Growth and Dividend Information: Extract all relevant information beyond sector and industry, normalize the data, and assign a score to each stock.
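
The "normalize and score" idea in the third strategy can be illustrated on a toy table. This is a sketch under stated assumptions: the symbols and metric values are hypothetical, and the scaling is written out by hand as `(x - min) / (max - min)`, which is the formula that min-max scaling to the range 0 to 1 applies to each column.

```python
import pandas as pd

# Hypothetical metrics for three placeholder stocks
metrics = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC"],
    "eps_growth": [0.10, 0.30, 0.20],
    "avg_yield_dividends": [0.04, 0.02, 0.06],
})
metric_cols = ["eps_growth", "avg_yield_dividends"]

# Min-max scaling: each column is mapped onto the range [0, 1]
col_min = metrics[metric_cols].min()
col_max = metrics[metric_cols].max()
scaled = (metrics[metric_cols] - col_min) / (col_max - col_min)

# Equal-weight score: the sum of the scaled metrics
metrics["score"] = scaled.sum(axis=1)
ranked = metrics.sort_values("score", ascending=False).reset_index(drop=True)
print(ranked)
```

Because every metric is rescaled to the same range, each one contributes equally to the score. A different weighting would be needed if, for example, dividends mattered more to the client than growth.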

Overview of Sectors and Industries

In [None]:
# Categorize stocks by industries
df_sminfra18_company_report.industry.value_counts()

In [None]:
# Categorize stocks by sectors
df_sminfra18_company_report.sector.value_counts()

Correlation Matrix on stock price evolution

In [None]:
# Calculate the correlation of the 18 stock prices
correlation_matrix = df_pivot_history_sminfra18.corr()
correlation_matrix
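
The cell above correlates price levels. Price series that share a trend can appear correlated even when their day-to-day movements are unrelated, so a common alternative is to correlate daily returns instead. The sketch below compares the two on synthetic data; the two random-walk series are hypothetical and generated independently of each other.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=60, freq="B")

# Two hypothetical price series: independent random walks with the same upward drift
prices_demo = pd.DataFrame({
    "AAA": 100 + np.cumsum(rng.normal(0.5, 1.0, 60)),
    "BBB": 200 + np.cumsum(rng.normal(0.5, 1.0, 60)),
}, index=dates)

# Correlation of price levels versus correlation of daily percentage returns
price_corr = prices_demo.corr()
return_corr = prices_demo.pct_change().dropna().corr()

print("Price-level correlation:\n", price_corr)
print("\nDaily-return correlation:\n", return_corr)
```

On the notebook's data, the equivalent call would be `df_pivot_history_sminfra18.pct_change().dropna().corr()`. Whether this changes the stock selection below depends on the data and is not verified here.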

## Section 4.3 - Normalization & Scoring

In [None]:
# Import library for normalization
from sklearn.preprocessing import MinMaxScaler

# Extract useful information for normalization analysis
columns_of_interest = ['symbol', 'eps_growth', 'avg_yield_dividends', 'revenue_growth', 'revenue_estimate']
df_sminfra18_company_report_norm = df_sminfra18_company_report[columns_of_interest].copy()  # copy so later edits do not touch the original DataFrame
df_sminfra18_company_report_norm

In [None]:
# Check missing data
df_sminfra18_company_report_norm.isnull().sum()

In [None]:
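# Impute missing data with each column's own mean - this is a simple imputation; the cause of the missing data and its correlation with other variables could also be investigated
metric_columns = ['eps_growth', 'avg_yield_dividends', 'revenue_growth', 'revenue_estimate']
df_sminfra18_company_report_norm[metric_columns] = df_sminfra18_company_report_norm[metric_columns].fillna(df_sminfra18_company_report_norm[metric_columns].mean())
df_sminfra18_company_report_norm.isnull().sum()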

In [None]:
# Normalize the data for the metrics using MinMaxScaler
scaler = MinMaxScaler()
df_sminfra18_company_report_norm[['eps_growth', 'avg_yield_dividends', 'revenue_growth', 'revenue_estimate']] = scaler.fit_transform(df_sminfra18_company_report_norm[['eps_growth', 'avg_yield_dividends', 'revenue_growth', 'revenue_estimate']])

In [None]:
# Add a ranking system by summing the normalized scores for all metrics
df_sminfra18_company_report_norm['score'] = df_sminfra18_company_report_norm.iloc[:, 1:5].sum(axis=1)
df_sminfra18_company_report_norm

In [None]:
# Sort the companies by total score
df_sminfra18_company_report_norm = df_sminfra18_company_report_norm.sort_values(by='score', ascending=False).reset_index(drop=True)
df_sminfra18_company_report_norm

In [None]:
# Pick the top 5 companies
top_5_companies = df_sminfra18_company_report_norm.head(5)
top_5_companies

## Section 4.4 - Stock selection choice

Based on the results from Section 4.3, we have identified the top five stock tickers that we should consider for selection, as they have received the highest scores.

Next, we will implement our diversification strategy, focusing on industry and sector, as well as correlation analysis, to select two stocks from this group of five high-scoring candidates.

In [None]:
# Store the top 5 company symbol into a list
top_5_symbols = top_5_companies['symbol'].tolist()
top_5_symbols

Three of these stocks are in the banking industry (financial sector): BBNI.JK, BBRI.JK, and BMRI.JK. Among them, we keep the ticker with the highest score, BBRI.JK.

Now we have our choice: BBRI.JK, TLKM.JK, UNTR.JK.
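
The "keep the best-scoring stock per industry" rule can also be applied programmatically. The sketch below uses hypothetical symbols, industries, and scores, and it assumes the score table has been merged with the `industry` column from the company report (the scored DataFrame above does not carry that column).

```python
import pandas as pd

# Hypothetical candidates: placeholder symbols, industries, and scores
candidates = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "industry": ["Banks", "Banks", "Banks", "Telecom", "Machinery"],
    "score": [2.1, 2.6, 1.9, 2.3, 2.0],
})

# For each industry, find the row label of the highest score, then select those rows
best_rows = candidates.groupby("industry")["score"].idxmax()
best_per_industry = (
    candidates.loc[best_rows]
    .sort_values("score", ascending=False)
    .reset_index(drop=True)
)
print(best_per_industry)
```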

We need to confirm that their price movements are not strongly positively correlated.

In [None]:
# Query the price history info for selected 3 tickers and check correlation
selected_3_symbols =  ['BBRI.JK', 'TLKM.JK', 'UNTR.JK']
selected_3_prices = df_pivot_history_sminfra18[selected_3_symbols]
correlation_matrix_selected = selected_3_prices.corr()
correlation_matrix_selected

The prices of TLKM.JK and BBRI.JK are strongly positively correlated. We drop TLKM.JK and keep BBRI.JK, since it has the higher score in the growth potential analysis.

To sum up, our final choices are two stocks: BBRI.JK & UNTR.JK.

In [None]:
Indra_stocks = ['BBRI.JK', 'UNTR.JK']

In [None]:
# Query the price info for Indra_stocks
price_Indra = df_pivot_history_sminfra18[Indra_stocks]
price_Indra.head()

In [None]:
# Visualize the movement of Indra's two stocks for the past 90 days
import matplotlib.pyplot as plt

# Plotting the prices for both BBRI.JK and UNTR.JK
plt.figure(figsize=(10, 6))

# Plot BBRI.JK stock prices
plt.plot(price_Indra.index, price_Indra['BBRI.JK'], label='BBRI.JK', marker='o')

# Plot UNTR.JK stock prices
plt.plot(price_Indra.index, price_Indra['UNTR.JK'], label='UNTR.JK', marker='o')

# Adding title and labels
plt.title('Price Performance of BBRI.JK and UNTR.JK', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Stock Price (IDR)', fontsize=12)

# Displaying a legend to distinguish between the two stocks
plt.legend()

# Adding grid for better readability
plt.grid(True, linestyle=':', linewidth=1)

# Display the plot
plt.xticks(rotation=45)  # Rotate the x-axis labels for better readability
plt.tight_layout()  # Ensure everything fits well in the plot
plt.show()

# Section 5 - MVO with Monte Carlo Simulation

## Section 5.1 - Theoretical Context

Once we know which stocks to invest in, two key questions remain:

- How much of each stock should we hold?
- How do we balance risk and reward?

This is where Mean-Variance Optimization (MVO) comes in. It is a method for finding the optimal weights, that is, the optimal way to allocate our capital among the selected stocks.

Step 1: What information do we need?

1. Expected Returns: This is the average return you expect from each stock based on historical performance.

2. Risk (Volatility): Risk is measured as the variance (or more commonly, standard deviation) of the returns. It shows how much the stock’s returns fluctuate.

3. Covariance Between Stocks: Covariance tells us how the returns of two stocks move together. Some stocks may go up and down at the same time (positive covariance), while others may move in opposite directions (negative covariance). Covariance helps us understand how diversification can reduce risk.
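
The three inputs above combine into a portfolio's expected return and volatility. As a minimal sketch with two hypothetical assets (all numbers below are illustrative assumptions, not market data), the portfolio return is the weighted average of the expected returns, and the portfolio variance is `w @ cov @ w`:

```python
import numpy as np

# Hypothetical annual figures for two assets
mu = np.array([0.10, 0.06])      # expected returns
sigma = np.array([0.20, 0.10])   # volatilities (standard deviations)
rho = -0.2                       # correlation between the two assets

# Covariance matrix built from the volatilities and the correlation
cov = np.array([
    [sigma[0] ** 2, rho * sigma[0] * sigma[1]],
    [rho * sigma[0] * sigma[1], sigma[1] ** 2],
])

w = np.array([0.5, 0.5])         # equal weights

port_return = w @ mu                 # weighted average of expected returns
port_vol = np.sqrt(w @ cov @ w)      # square root of the portfolio variance
weighted_avg_vol = w @ sigma         # what the risk would be without diversification

print(f"Portfolio return:            {port_return:.4f}")
print(f"Portfolio volatility:        {port_vol:.4f}")
print(f"Weighted-average volatility: {weighted_avg_vol:.4f}")
```

The portfolio volatility comes out below the weighted average of the individual volatilities, which is the diversification effect that covariance captures.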

Step 2: Build the Model

Once we have this data, the next step is to use it to construct a portfolio of stocks. The goal of the model is twofold:

- Maximize Return: We want to achieve the highest possible return for a given amount of risk.
- Minimize Risk: Alternatively, we might want to minimize risk while achieving a certain return level.

Consult the mathematical model here: [Modern portfolio theory (Wikipedia)](https://en.wikipedia.org/wiki/Modern_portfolio_theory)

Step 3: Optimize the Portfolio Using the Sharpe Ratio

Now that the model is built, we can use it to optimize the portfolio. This means:

- Maximizing Return for a Given Level of Risk: This is suitable for investors who are willing to take on some risk but want to get the highest return for that risk level.
- Minimizing Risk for a Given Return: This is helpful for more conservative investors who want to reduce risk as much as possible while still achieving a certain return.

The optimization results trace out the Efficient Frontier, a curve that shows the best possible portfolios in terms of risk and return. Every point on the frontier represents a portfolio that either minimizes risk for a given return or maximizes return for a given risk.

The point with the highest Sharpe ratio, which lies on the efficient frontier, gives the optimal portfolio weights.

Bibliography:

Useful math for Portfolio Optimization. [link](https://www.fields.utoronto.ca/programs/scientific/09-10/finance/courses/pliska3.pdf)

Understand basic concepts of mean (return), variance (risk) and how they work in investing. [link](https://smartasset.com/financial-advisor/mean-variance-optimization)

Understand Mean Variance Optimization Model. [link](https://analystprep.com/study-notes/cfa-level-iii/mean-variance-optimization-an-overview/)

## Section 5.2 - Mean & Variance

In this section, to better illustrate the model, we will randomly select 4 stocks from the SMInfra18 index to compose the portfolio to optimize.

In [None]:
# Import all useful libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Randomly select 4 stocks from sminfra18
sminfra18_tickers = df_pivot_history_sminfra18.columns.tolist()

selected_stocks = np.random.choice(sminfra18_tickers, size=4, replace=False)
selected_stocks

In [None]:
# Query the price info for the selected stocks
prices = df_pivot_history_sminfra18[selected_stocks]
prices.head()

In [None]:
# Calculate Returns
returns = pd.DataFrame()

for stock in prices:
  returns[stock + ' Returns'] = prices[stock].pct_change().dropna()

returns.head()

In [None]:
# Calculate expected returns (mean) and covariance matrix
# Assume 252 trading days in a year = 365 - 104 weekend days - 9 holidays
# We annualize return and risk because MVO models are long-term oriented, and "long term" in economics generally means 12 months or more

expected_returns = returns.mean() * 252 # Annualize by multiplying by trading days
covariance_matrix = returns.cov() * 252 # Annualize covariance by multiplying by trading days

# Print the expected returns and covariance matrix
print("Expected Returns:")
print(expected_returns)
print("\nCovariance Matrix:")
print(covariance_matrix)

In [None]:
# Here we can see the distribution of each stock returns
returns.hist(figsize=(8,6), bins = 50)
plt.show()

## Section 5.3 Monte Carlo Simulation

Monte Carlo simulation is a powerful technique used in portfolio optimization to assess the potential outcomes of different investment strategies or different allocations under varying conditions. It involves generating multiple scenarios based on statistical models and random sampling.

Implementing Monte Carlo simulation in Python involves combining statistical analysis, simulation, and optimization techniques to gain insights into portfolio performance under different allocations. For our analysis, we will run a simulation on different allocations of the same stocks to find the optimum allocation. A single run of the simulation is shown in the code below.

In [None]:
# Define Risk Free rate
# Assume a risk-free rate (RF) of 6.64%, the yield of the 10-year Indonesian government bond
# Source link https://tradingeconomics.com/indonesia/government-bond-yield
rf = 0.0664

# Monte Carlo simulation
#Single Run
np.random.seed(101)
print(selected_stocks)

# Generates an array of random numbers representing initial weights for each asset in the portfolio
weights = np.array(np.random.random(4))
print('\nRandom Weights')
print(weights)

# Normalizing the randomly generated weights to ensure they sum up to 1, representing a fully invested portfolio.
print('\nRebalanced Weights')
weights = weights / np.sum(weights)
print(weights)

# Calculating the portfolio’s expected return using the weighted average of individual asset returns.
# It multiplies the annual mean returns of each asset by its respective weight and aggregates them.
print('\nPortfolio Return')
portfolio_return = np.sum(returns.mean()*252*weights)
print(portfolio_return)

# Expected volatility (standard deviation): the square root of w^T * Cov * w, where w is the weight vector and Cov is the covariance matrix of asset returns (multiplied by 252 for annualization).
# The square root is taken because w^T * Cov * w is the portfolio variance, and the square root of variance is the standard deviation (volatility).
print('\nPortfolio Volatility')
portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(returns.cov()*252, weights)))
print(portfolio_volatility)

# Sharpe Ratio measures the risk-adjusted return by subtracting the risk-free rate from the portfolio return and dividing by its volatility.
print('\nPortfolio Sharpe Ratio')
sharpe_ratio = (portfolio_return - rf) / portfolio_volatility
print(sharpe_ratio)

Having seen a single run of the simulation above, let us now perform 5000 simulations of randomly generated allocations to find the optimum allocation for the four stocks chosen. The portfolio performance will be analyzed based on the Sharpe ratio, which gives the return delivered per unit of risk taken. The code for the simulation is shown below.

In [None]:
# Specify the number of simulated portfolios to generate (in our case, 5000)
num_ports = 5000
num_assets = len(selected_stocks)

# A 2D array to store the randomly generated weights for each asset in each portfolio.
all_weights = np.zeros((num_ports, num_assets))

# Create arrays to store portfolio returns, volatilities, and Sharpe ratios for each simulated portfolio.
ret_arr = np.zeros(num_ports)
vol_arr = np.zeros(num_ports)
sharp_arr = np.zeros(num_ports)

# Annualized mean returns and covariance matrix, computed once outside the loop
annual_mean_returns = returns.mean() * 252
annual_covariance = returns.cov() * 252

# Loop over each portfolio
for i in range(num_ports):

  # Randomly assign a weight to each asset and rescale the weights to sum to 1
  weights = np.random.random(num_assets)
  weights = weights / np.sum(weights)
  all_weights[i, :] = weights

  # Expected return
  ret_arr[i] = np.sum(annual_mean_returns * weights)

  # Expected volatility
  vol_arr[i] = np.sqrt(np.dot(weights.T, np.dot(annual_covariance, weights)))

  # Sharpe ratio
  sharp_arr[i] = (ret_arr[i] - rf) / vol_arr[i]
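
The same simulation can be written without an explicit Python loop. The following is a sketch on synthetic daily returns (the return series is randomly generated and purely illustrative); it draws all weight vectors at once and evaluates return, volatility, and Sharpe ratio for every portfolio with array operations. The risk-free rate is the 6.64% assumed earlier in this section.

```python
import numpy as np

rng = np.random.default_rng(42)
n_assets, n_days, n_ports = 4, 250, 5000

# Synthetic daily returns for four hypothetical assets
daily_returns = rng.normal(0.0005, 0.01, size=(n_days, n_assets))

# Annualized inputs (252 trading days)
mean_annual = daily_returns.mean(axis=0) * 252
cov_annual = np.cov(daily_returns, rowvar=False) * 252
rf_demo = 0.0664

# Draw all random weights at once and rescale each row to sum to 1
raw = rng.random((n_ports, n_assets))
w_all = raw / raw.sum(axis=1, keepdims=True)

# Portfolio return, volatility, and Sharpe ratio for every row of weights
ret_all = w_all @ mean_annual
vol_all = np.sqrt(np.einsum("ij,jk,ik->i", w_all, cov_annual, w_all))
sharpe_all = (ret_all - rf_demo) / vol_all

best = sharpe_all.argmax()
print("Best simulated weights:", w_all[best])
```

The `einsum` expression computes `w @ cov @ w` for each row of `w_all`, which is the same quantity the loop above evaluates one portfolio at a time.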

## Section 5.4 - Sharpe Ratio

The Sharpe ratio measures risk-adjusted return: it subtracts the risk-free rate from the portfolio return and divides the result by the portfolio volatility. Among the portfolio weights randomly generated by the Monte Carlo simulation, we want the one with the highest Sharpe ratio.

More info: [Sharpe ratio (Investopedia)](https://www.investopedia.com/terms/s/sharperatio.asp)
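
Written as a formula that matches the simulation code in Section 5.3, with weight vector $w$, annualized mean returns $\mu$, annualized covariance matrix $\Sigma$, and risk-free rate $R_f$:

$$
R_p = w^\top \mu, \qquad
\sigma_p = \sqrt{w^\top \Sigma\, w}, \qquad
S_p = \frac{R_p - R_f}{\sigma_p}
$$

As a purely illustrative example with assumed numbers, a portfolio with $R_p = 0.12$ and $\sigma_p = 0.20$, evaluated against $R_f = 0.0664$, has $S_p = (0.12 - 0.0664) / 0.20 = 0.268$.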

In [None]:
# Retrieve the maximum value from the sharp_arr array
max_sr = sharp_arr.max()

# Find the index of the portfolio with the maximum Sharpe ratio in the sharp_arr array.
max_sr_index = sharp_arr.argmax()

# Retrieve the asset weights of the portfolio at that index in the all_weights array.
opt_weights = all_weights[max_sr_index,:]

# Retrieve the optimal expected return and volatility
optimal_return = ret_arr[max_sr_index]
optimal_volatility = vol_arr[max_sr_index]

# Print Result
print('Max Sharpe Ratio: ', max_sr)
print('Optimal Return: ', optimal_return)
print('Optimal Volatility: ', optimal_volatility)
print('Optimal Weights: ', opt_weights)
print('Stock List: ', selected_stocks)

## Section 5.5 - Efficient Frontier

The efficient frontier is the set of portfolios that offer the highest expected return for each level of risk. It is drawn on a coordinate plane with risk on the x-axis and return on the y-axis. Annualized standard deviation is typically used to measure risk, while compound annual growth rate (CAGR) is commonly used for return; in this notebook, return is the annualized mean of daily returns.

More info: [Efficient Frontier (Investopedia)](https://www.investopedia.com/terms/e/efficientfrontier.asp)

The volatility, return, and Sharpe ratio values for the simulation are plotted below.

In [None]:
plt.figure(figsize = (10,5))
plt.scatter(vol_arr, ret_arr, c = sharp_arr, cmap = 'plasma')
plt.colorbar(label = 'Sharpe Ratio')
plt.xlabel('Volatility', fontweight = 'bold')
plt.ylabel('Return', fontweight = 'bold')

plt.scatter(optimal_volatility, optimal_return, c='red', s=200, edgecolors='black', marker='*')
plt.grid(True, ls=':', lw=1)

The plot above shows volatility, return, and Sharpe ratio for every simulated portfolio. The x-axis shows volatility, the y-axis shows return, and the colorbar on the right shows the Sharpe ratio, with darker colors marking the lowest values and lighter colors the highest. The red star marks the portfolio with the highest Sharpe ratio.
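
The upper-left edge of the scatter approximates the efficient frontier. A minimal sketch of picking out those points from simulated arrays is shown here; the helper name `frontier_mask` and the five sample points are hypothetical, and the result is only as fine-grained as the random sample it is given.

```python
import numpy as np

def frontier_mask(vols, rets):
    """Mark portfolios that no other portfolio beats on both
    lower-or-equal volatility and higher-or-equal return."""
    vols = np.asarray(vols, dtype=float)
    rets = np.asarray(rets, dtype=float)
    # Sort by volatility ascending; break ties by return descending
    order = np.lexsort((-rets, vols))
    mask = np.zeros(len(vols), dtype=bool)
    best_ret = -np.inf
    for idx in order:
        # Keep a point only if it improves on every lower-volatility point
        if rets[idx] > best_ret:
            mask[idx] = True
            best_ret = rets[idx]
    return mask

# Tiny hypothetical sample: (volatility, return) pairs
sample_vols = [0.10, 0.20, 0.20, 0.30, 0.15]
sample_rets = [0.05, 0.08, 0.07, 0.06, 0.04]
print(frontier_mask(sample_vols, sample_rets))
```

With the simulation arrays from Section 5, `frontier_mask(vol_arr, ret_arr)` would return a boolean mask that can be used to highlight the frontier points on the scatter plot.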

# Section 6 - MVO with SciPy Minimize Function

## Section 6.1 - Overview

In our next analysis method, we will optimize the same portfolio allocation mathematically, using the minimize function in SciPy (a Python library) with the Sharpe ratio as the objective.

Rather than sampling random allocations, this approach uses numerical optimization to find the asset allocation that maximizes the Sharpe ratio, a measure of risk-adjusted return.

The basic principle is to compute the Sharpe ratio for a given allocation, multiply it by -1, and minimize that negative value. The weights that minimize the negative Sharpe ratio are the weights that give the highest Sharpe ratio.
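
To illustrate the negation trick on its own, here is a minimal sketch that maximizes a simple one-variable function by minimizing its negative. The functions `toy_objective` and `toy_objective_neg` are hypothetical stand-ins for the Sharpe ratio and its negation.

```python
from scipy.optimize import minimize

def toy_objective(x):
    # Concave function whose maximum value is 3, reached at x = 2
    return -(x[0] - 2.0) ** 2 + 3.0

def toy_objective_neg(x):
    # Minimizing the negative is equivalent to maximizing the original
    return -toy_objective(x)

toy_result = minimize(toy_objective_neg, x0=[0.0], method="SLSQP")

# toy_result.x is the maximizer; -toy_result.fun recovers the maximum value
print(toy_result.x, -toy_result.fun)
```

The same sign flip is needed when reading the portfolio result: the `fun` value reported by the optimizer is the negative of the best Sharpe ratio.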

In [None]:
def ret_vol_sr(weights):
  weights = np.array(weights)

  # Calculate Annualized Expected Returns
  ret = np.sum(returns.mean() * weights * 252)

  # Calculate Annualized Portfolio Volatility
  vol = np.sqrt(np.dot(weights.T, np.dot(returns.cov() * 252, weights)))

  # Calculate Sharpe Ratio, Risk-Free Rate 6.4%
  sr = (ret - rf) / vol

  return np.array([ret, vol, sr])

## Section 6.2 - SciPy Minimize Function

In [None]:
import numpy as np  # needed by ret_vol_sr and sum_check
from scipy.optimize import minimize

# This function below is designed to negate the Sharpe ratio because in optimization, the objective function is minimized by default.
# We want to maximize the Sharpe ratio, index 2, so multiplying by -1 helps flip the optimization to achieve that.
def sharp_neg(weights):
  return ret_vol_sr(weights)[2] * -1

# The constraint function ensures the weights sum to 1.
# SciPy expects an equality constraint function to return 0 when the constraint is satisfied:
# if the weights sum to exactly 1, np.sum(weights) - 1 equals 0.
def sum_check(weights):
  return np.sum(weights) - 1

# Equality constraint ('eq'): sum_check must return 0, i.e. the weights must sum to 1.
cons = ({'type': 'eq', 'fun': sum_check})

# Ensure that each weight lies between 0 and 1, meaning no short-selling or leverage.
bounds = ((0, 1), (0, 1), (0, 1), (0, 1))

# Initial guess for the optimization (equal weighting of 25% for each stock).
init_guess = [0.25, 0.25, 0.25, 0.25]

# SLSQP stands for Sequential Least Squares Programming, an algorithm that can handle bounds as well as equality and inequality constraints, making it well-suited for portfolio optimization problems.
# We minimize the negative Sharpe ratio, which is equivalent to maximizing the Sharpe ratio.

opt_results = minimize(sharp_neg, init_guess, method = 'SLSQP', bounds = bounds, constraints = cons)

opt_results

# fun --> Objective value at the solution, i.e. the negative Sharpe ratio
# x --> Optimal weights, in the same order as the stock list
# nit --> Number of iterations the algorithm took to converge to the solution. A smaller number usually means faster convergence
# jac --> Jacobian (gradient) of the objective function at the optimal weights



In [None]:
# opt_results.x gives the optimized portfolio weights.
opt_results.x

In [None]:
# Return the expected return, volatility, and Sharpe ratio for the optimized portfolio.
ret_vol_sr(opt_results.x)

In [None]:
# Print Result
print('Return: ', ret_vol_sr(opt_results.x)[0])
print('Volatility: ', ret_vol_sr(opt_results.x)[1])
print('Sharpe Ratio: ', ret_vol_sr(opt_results.x)[2])
print('Stock List: ', selected_stocks)
print('Optimal Weights: ', opt_results.x)
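
Before relying on the optimized weights, it can help to confirm that the solver reports convergence. The sketch below runs the same kind of SLSQP optimization end to end on synthetic data and checks the `success` and `message` fields of the result. All `demo_*` names and the random returns are assumptions for illustration, not the notebook's data.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic daily returns for four hypothetical assets (illustration only)
rng = np.random.default_rng(0)
demo_daily = rng.normal(loc=[0.0006, 0.0008, 0.0010, 0.0007], scale=0.01, size=(750, 4))
demo_mu = demo_daily.mean(axis=0) * 252               # annualized mean returns
demo_cov = np.cov(demo_daily, rowvar=False) * 252     # annualized covariance
demo_rf = 0.064                                       # assumed annual risk-free rate

def demo_neg_sharpe(weights):
    # Negative Sharpe ratio, so that minimizing it maximizes the Sharpe ratio
    ret = weights @ demo_mu
    vol = np.sqrt(weights @ demo_cov @ weights)
    return -(ret - demo_rf) / vol

n_assets = demo_mu.size
demo_res = minimize(
    demo_neg_sharpe,
    x0=np.full(n_assets, 1.0 / n_assets),             # start from equal weights
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n_assets,                   # no short-selling or leverage
    constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
)

if demo_res.success:
    print("Converged:", demo_res.message)
    print("Weights:", np.round(demo_res.x, 4))
    print("Sharpe ratio:", -demo_res.fun)
else:
    print("Optimizer did not converge:", demo_res.message)
```

Building the bounds and the initial guess from `n_assets` avoids hardcoding four entries, so the sketch still works if the number of assets changes.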

## Section 6.3 - Result Comparison

In [None]:
# Print Result from Monte Carlo Simulation
print('\nResult from Monte Carlo Simulation')
print('Stock List: ', selected_stocks)
print('Optimal Weights: ', opt_weights)
print('Optimal Return: ', optimal_return)
print('Optimal Volatility: ', optimal_volatility)
print('Max Sharpe Ratio: ', max_sr)

# Print Result from SciPy Minimize Function
print('\nResult from SciPy Minimize Function')
print('Stock List: ', selected_stocks)
print('Optimal Weights: ', opt_results.x)
print('Return: ', ret_vol_sr(opt_results.x)[0])
print('Volatility: ', ret_vol_sr(opt_results.x)[1])
print('Sharpe Ratio: ', ret_vol_sr(opt_results.x)[2])


In [None]:
# Draw Efficient Frontier

plt.figure(figsize = (10,5))
plt.scatter(vol_arr, ret_arr, c = sharp_arr, cmap = 'plasma')
plt.colorbar(label = 'Sharpe Ratio')
plt.xlabel('Volatility', fontweight = 'bold')
plt.ylabel('Return', fontweight = 'bold')

# Optimal point from the Monte Carlo simulation is shown as a red star
plt.scatter(optimal_volatility, optimal_return, c='red', s=200, edgecolors='black', marker='*')
plt.grid(True, ls=':', lw=1)

# Optimal point from the SciPy optimization is shown as a green star
plt.scatter(ret_vol_sr(opt_results.x)[1], ret_vol_sr(opt_results.x)[0], c='green', s=200, edgecolors='black', marker='*')

Summary:

In this run, the Monte Carlo simulation produces a more diversified portfolio with a balanced approach to risk and return, while the SciPy optimization is more aggressive, concentrating investments in fewer stocks with higher expected returns but also slightly higher volatility.

Overall, the SciPy optimization yields better risk-adjusted returns (a higher Sharpe ratio) and higher expected returns, making it the more favorable strategy in this context. However, the Monte Carlo result reflects a more cautious, diversified allocation that may suit risk-averse investors.