<font color='#E27271'>

# *Unveiling Complex Interconnections Among Companies through Learned Embeddings*</font>

-----------------------
<font color='#E27271'>

Ethan Moody, Eugene Oon, and Sam Shinde</font>

<font color='#E27271'>

August 2023</font>

-----------------------
<font color='#00AED3'>

# **Portfolio Construction** </font>
-----------------------


## [1] Installs, Imports and Setup Steps

### [1.1] Complete Initial Installs

In [None]:
!pip install pybind11
!pip install cvxpy
!pip install riskfolio-lib
!pip install yfinance
!pip install mosek

### [1.2] Import Packages

In [None]:
import numpy as np
import pandas as pd
import yfinance as yf
import warnings
import json
import riskfolio as rp

warnings.filterwarnings("ignore")
pd.options.display.float_format = '{:.4%}'.format
pd.set_option('display.max_columns', None)

### [1.3] Mount Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## [2] Modeling Data Preparation

### [2.1] Load and Preprocess data for KMeans Clustering - Approach 1

In [None]:
# Path for "communities_dict" JSON file
kmeans1_dict_file_path = '/content/drive/My Drive/Colab Notebooks/project/communities_dict.json'

# Load the JSON file
with open(kmeans1_dict_file_path, 'r') as file:
  kmeans1_dict = json.load(file)

In [None]:
# Adjust ticker to meet Yahoo Finance requirement
for key, values in kmeans1_dict.items():
  # Loop through each string value in the list
  for i, value in enumerate(values):
    # Check if "." is present in the string value
    if "." in value:
      # Replace "." with "-"
      values[i] = value.replace(".", "-")
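
The same dot-to-hyphen substitution is repeated for each grouping later in this notebook. A minimal sketch of a reusable helper follows; the name `to_yahoo_tickers` and the toy input are hypothetical and not part of the original notebook.

```python
def to_yahoo_tickers(groups):
    """Return a copy of `groups` with '.' replaced by '-' in every ticker."""
    return {
        key: [ticker.replace(".", "-") for ticker in tickers]
        for key, tickers in groups.items()
    }

# Toy input for illustration only; not the project's data
example = {"0": ["BRK.B", "AAPL"], "1": ["BF.B"]}
converted = to_yahoo_tickers(example)
print(converted)
```

Unlike the in-place loop, this version returns a new dictionary and leaves the input unchanged.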

### [2.2] Load and Preprocess data for KMeans Clustering - Approach 2

In [None]:
# Path for "kmean2_clusters_norm" CSV file
kmean2_dict_file_path = '/content/drive/My Drive/Colab Notebooks/project/kmean2_clusters_norm.csv'

# Load the CSV file
df_kmeans2 = pd.read_csv(kmean2_dict_file_path, index_col=0)

In [None]:
# Create an empty dictionary to store the results
kmeans2_ticker_dict = {}

# Iterate through the DataFrame to build the dictionary
for cluster in df_kmeans2.columns:
    # Get the tickers with non-zero values in the cluster column
    tickers_with_non_zero = df_kmeans2.index[df_kmeans2[cluster] != 0].tolist()

    # Add the tickers to the dictionary with the cluster as the key
    kmeans2_ticker_dict[cluster] = tickers_with_non_zero

# Adjust ticker to meet Yahoo Finance requirement
for key, values in kmeans2_ticker_dict.items():
  # Loop through each string value in the list
  for i, value in enumerate(values):
    # Check if "." is present in the string value
    if "." in value:
      # Replace "." with "-"
      values[i] = value.replace(".", "-")

In [None]:
# Create an empty dictionary to store the results
kmeans2_weight_dict = {}

# Iterate through the DataFrame to build the dictionary
for cluster in df_kmeans2.columns:
    # Get the non-zero weights in the cluster column
    weights_with_non_zero = df_kmeans2[cluster][df_kmeans2[cluster] != 0].tolist()

    # Add the weights to the dictionary with the cluster as the key
    kmeans2_weight_dict[cluster] = weights_with_non_zero
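
The ticker and weight dictionaries are built by walking the same DataFrame twice. Assuming its layout is tickers as rows, clusters as columns, and a membership weight in each cell (inferred from the code above; the values below are toy data), both dictionaries can be built in one pass, which also keeps tickers and weights in the same order.

```python
import pandas as pd

# Toy membership matrix (rows: tickers, columns: clusters); values are illustrative
df_toy = pd.DataFrame(
    {"c0": [0.7, 0.0, 1.0], "c1": [0.3, 1.0, 0.0]},
    index=["AAA", "BBB", "CCC"],
)

ticker_dict, weight_dict = {}, {}
for cluster in df_toy.columns:
    members = df_toy[cluster][df_toy[cluster] != 0]  # keep non-zero rows only
    ticker_dict[cluster] = members.index.tolist()
    weight_dict[cluster] = members.tolist()

print(ticker_dict)
print(weight_dict)
```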

### [2.3] Load and Preprocess data for GICS Classification

In [None]:
# Path for CSV file
gics_file_path = '/content/drive/My Drive/Colab Notebooks/project/sp500_2022bus_final.csv'

# Load the CSV file
df_raw_gics = pd.read_csv(gics_file_path)

# Keep only the "ticker" and "sector" columns
df_gics = df_raw_gics[["ticker", "sector"]]

In [None]:
# Create an empty dictionary to store the results
gics_ticker_dict = {}

# Loop through the DataFrame to group tickers by sector
for index, row in df_gics.iterrows():
  sector = row['sector']
  ticker = row['ticker']

  # Check if the sector is already in the dictionary
  if sector in gics_ticker_dict:
    # Append the ticker to the existing list of tickers for this sector
    gics_ticker_dict[sector].append(ticker)
  else:
    # If sector is not in the dictionary, create a new entry with the ticker as a list
    gics_ticker_dict[sector] = [ticker]

# Adjust ticker to meet Yahoo Finance requirement
for key, values in gics_ticker_dict.items():
  # Loop through each string value in the list
  for i, value in enumerate(values):
    # Check if "." is present in the string value
    if "." in value:
      # Replace "." with "-"
      values[i] = value.replace(".", "-")
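
The sector grouping above can also be written with a pandas `groupby`. The sketch below uses a toy DataFrame with the same two column names; it is an alternative formulation, not the notebook's original code.

```python
import pandas as pd

# Toy data with the same column names as df_gics; values are illustrative
df_toy = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "sector": ["Energy", "Utilities", "Energy"],
})

# Collect the tickers of each sector into a list, keyed by sector name
sector_dict = df_toy.groupby("sector")["ticker"].apply(list).to_dict()
print(sector_dict)
```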

### [2.4] Load and Preprocess data for Multi-GICS Prediction Approach

In [None]:
# Path for CSV file
gicspred_file_path = '/content/drive/My Drive/Colab Notebooks/project/data/stocks/multi_gics_alternate_classification.csv'

# Load the CSV file
df_gicspred = pd.read_csv(gicspred_file_path, index_col=0)

In [None]:
# Create an empty dictionary to store the results
gicspred_ticker_dict = {}

# Iterate through the DataFrame to build the dictionary
for cluster in df_gicspred.columns:
    # Get the tickers with non-zero values in the cluster column
    tickers_with_non_zero = df_gicspred.index[df_gicspred[cluster] != 0].tolist()

    # Add the tickers to the dictionary with the cluster as the key
    gicspred_ticker_dict[cluster] = tickers_with_non_zero

In [None]:
# Create an empty dictionary to store the results
gicspred_weight_dict = {}

# Iterate through the DataFrame to build the dictionary
for cluster in df_gicspred.columns:
    # Get the non-zero weights in the cluster column
    weights_with_non_zero = df_gicspred[cluster][df_gicspred[cluster] != 0].tolist()

    # Add the weights to the dictionary with the cluster as the key
    gicspred_weight_dict[cluster] = weights_with_non_zero

## [3] Portfolio Construction and Backtesting

### Functions

In [None]:
def get_port_weights(start, end, assets):
  """
  Calculate risk parity portfolio weights, using variance as the risk measure.
  Each cluster is treated as a risk parity portfolio built with Riskfolio's optimization functions.
  Use this function when each stock carries a single cluster label.

  Parameters:
  start (str): Start date for downloading financial data (format: 'YYYY-MM-DD')
  end (str): End date for downloading financial data (format: 'YYYY-MM-DD')
  assets (list): A list of stock tickers representing the assets to be included in the portfolio

  Returns:
  pd.Series: Risk parity portfolio weights, indexed by ticker
  """

  # Downloading data
  data = yf.download(assets, start = start, end = end)
  data = data.loc[:,('Adj Close', slice(None))]
  data.columns = assets

  # Calculating daily returns (missing values are treated as zero returns)
  daily_ret = data[assets].pct_change().dropna(how='all')
  daily_ret.fillna(0, inplace=True)

  # Building the portfolio object
  port = rp.Portfolio(returns=daily_ret)

  method_mu='hist' # Method to estimate expected returns based on historical data
  method_cov='hist' # Method to estimate covariance matrix based on historical data
  port.assets_stats(method_mu=method_mu, method_cov=method_cov, d=0.94)

  model='Classic' # Could be Classic (historical), BL (Black Litterman) or FM (Factor Model)
  rm = 'MV' # Risk measure used, this time will be variance
  rf = 0 # Risk free rate
  b = None # Risk budget vector; None means equal risk contribution per asset
  hist = True # Use historical scenarios for risk measures that depend on scenarios

  # Calculate risk parity portfolio weights using variance
  w_rp = port.rp_optimization(model=model, rm=rm, rf=rf, b=b, hist=hist)

  return w_rp.weights
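
Riskfolio solves the risk parity problem internally; the function above only calls it. To illustrate what "risk parity" means under a variance risk measure, the sketch below finds weights whose risk contributions `w_i * (cov @ w)_i` are all equal, using a simple cyclical coordinate descent. This is not Riskfolio's solver, and the function name `risk_parity_weights` and the toy covariance matrix are assumptions made for illustration.

```python
import numpy as np

def risk_parity_weights(cov, n_iter=500, tol=1e-12):
    """Equal-risk-contribution weights via cyclical coordinate descent."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    budget = 1.0 / n                 # equal risk budget per asset
    w = np.ones(n) / n
    for _ in range(n_iter):
        w_prev = w.copy()
        for i in range(n):
            # Covariance of asset i with the rest of the portfolio
            c = cov[i] @ w - cov[i, i] * w[i]
            # Positive root of cov_ii * w_i^2 + c * w_i - budget = 0
            w[i] = (-c + np.sqrt(c * c + 4.0 * cov[i, i] * budget)) / (2.0 * cov[i, i])
        if np.max(np.abs(w - w_prev)) < tol:
            break
    return w / w.sum()               # rescale so the weights sum to 1

# Toy covariance matrix (illustrative values)
cov = np.array([[0.04, 0.006], [0.006, 0.01]])
w = risk_parity_weights(cov)
risk_contrib = w * (cov @ w)
print(w)
print(risk_contrib / risk_contrib.sum())
```

With two assets, the equal-risk weights are inversely proportional to each asset's volatility, so the less volatile asset receives the larger weight.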

In [None]:
def get_multilabel_port_weights(start, end, assets, weights):
  """
  Calculate risk parity portfolio weights, using variance as the risk measure,
  then scale them by each stock's cluster-membership weight and renormalize.
  Each cluster is treated as a risk parity portfolio built with Riskfolio's optimization functions.
  Use this function when stocks can carry multiple cluster labels.

  Parameters:
  start (str): Start date for downloading financial data (format: 'YYYY-MM-DD')
  end (str): End date for downloading financial data (format: 'YYYY-MM-DD')
  assets (list): A list of stock tickers representing the assets to be included in the portfolio
  weights (list): Cluster-membership weight of each stock, in the same order as assets

  Returns:
  pd.Series: Normalized portfolio weights (summing to 1), indexed by ticker
  """

  # Downloading data
  data = yf.download(assets, start = start, end = end)
  data = data.loc[:,('Adj Close', slice(None))]
  data.columns = assets

  # Calculating daily returns (missing values are treated as zero returns)
  daily_ret = data[assets].pct_change().dropna(how='all')
  daily_ret.fillna(0, inplace=True)

  # Building the portfolio object
  port = rp.Portfolio(returns=daily_ret)

  method_mu='hist' # Method to estimate expected returns based on historical data.
  method_cov='hist' # Method to estimate covariance matrix based on historical data.
  port.assets_stats(method_mu=method_mu, method_cov=method_cov, d=0.94)

  model='Classic' # Could be Classic (historical), BL (Black Litterman) or FM (Factor Model)
  rm = 'MV' # Risk measure used, this time will be variance
  rf = 0 # Risk free rate
  b = None # Risk budget vector; None means equal risk contribution per asset
  hist = True # Use historical scenarios for risk measures that depend on scenarios

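  # Calculate risk parity portfolio weights using variance
  w_rp_pre = port.rp_optimization(model=model, rm=rm, rf=rf, b=b, hist=hist)

  # Scale each weight by the stock's cluster-membership weight. The product is taken
  # by position (so `weights` must follow the order of `assets`) on plain arrays,
  # because the two objects do not share an index and pandas would otherwise try to align them.
  w_rp_post = w_rp_pre.copy()
  w_rp_post['weights'] = w_rp_pre['weights'].to_numpy() * np.asarray(weights, dtype=float)

  # Renormalize so the weights sum to 1
  normalized_sum = w_rp_post['weights'].sum()
  w_rp_post['weights_normalized'] = w_rp_post['weights'] / normalized_sum

  return w_rp_post.weights_normalized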
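
The scale-and-renormalize step can be illustrated on its own. The sketch below assumes both the risk parity weights and the membership weights are available as pandas Series indexed by ticker (an assumption; the notebook passes the membership weights as a plain list), so the product is matched by ticker rather than by position. All values are toy data.

```python
import pandas as pd

# Illustrative risk parity weights, indexed by ticker (toy values)
w_rp = pd.Series({"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}, name="weights")
# Illustrative cluster-membership weights, deliberately in a different order
membership = pd.Series({"CCC": 1.0, "AAA": 0.5, "BBB": 1.0})

scaled = w_rp * membership        # pandas aligns the two Series on the ticker index
w_final = scaled / scaled.sum()   # renormalize so the weights sum to 1
print(w_final)
```

Matching by ticker removes the need to keep two separate lists in the same order.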

In [None]:
def get_combined_port_return_vol(df_cluster_ret_daily):
  """
  Calculate investment return and volatility for the risk parity total portfolio
  of clusters or communities utilizing the optimization functions from Riskfolio.

  Parameters:
  df_cluster_ret_daily (DataFrame): DataFrame of daily returns for each cluster or community

  Returns:
  port_ret_ann: Portfolio annualized return
  port_vol_ann: Portfolio annualized volatility
  port_ret_daily: Array of daily portfolio returns
  """

  # Calculate the number of trading days
  num_trading_days = df_cluster_ret_daily.shape[0]

  # Building the portfolio object
  port = rp.Portfolio(returns=df_cluster_ret_daily)

  method_mu='hist' # Method to estimate expected returns based on historical data.
  method_cov='hist' # Method to estimate covariance matrix based on historical data.
  port.assets_stats(method_mu=method_mu, method_cov=method_cov, d=0.94)

  model='Classic' # Could be Classic (historical), BL (Black Litterman) or FM (Factor Model)
  rm = 'MV' # Risk measure used, this time will be variance
  rf = 0 # Risk free rate
  b = None # Risk budget vector; None means equal risk contribution per asset
  hist = True # Use historical scenarios for risk measures that depend on scenarios

  # Calculate risk parity portfolio weights using variance
  w_rp = port.rp_optimization(model=model, rm=rm, rf=rf, b=b, hist=hist)

  # Calculate portfolio daily and annualized returns
  port_ret_daily = df_cluster_ret_daily@w_rp.weights
  port_ret_ann = ((1 + port_ret_daily).prod())**(252/num_trading_days) - 1

  # Calculate portfolio variance and volatility (annualized)
  ann_covmat = df_cluster_ret_daily.cov()*252
  port_var_ann = np.transpose(w_rp.weights)@ann_covmat@w_rp.weights
  port_vol_ann = np.sqrt(port_var_ann)

  return port_ret_ann, port_vol_ann, port_ret_daily
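
The annualization arithmetic used above can be checked in isolation. The sketch below applies the same two formulas (compounded return raised to 252 divided by the number of trading days, and volatility from the annualized covariance matrix) to toy returns and fixed weights; the helper name `annualized_stats` and all values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def annualized_stats(df_ret_daily, weights, periods_per_year=252):
    """Annualized return and volatility of a fixed-weight portfolio."""
    n_days = df_ret_daily.shape[0]
    port_daily = df_ret_daily @ weights
    # Compound the daily returns, then scale the exponent to one year
    ret_ann = (1 + port_daily).prod() ** (periods_per_year / n_days) - 1
    # Portfolio variance w' * Cov * w, using the annualized covariance matrix
    cov_ann = df_ret_daily.cov() * periods_per_year
    vol_ann = np.sqrt(weights @ cov_ann @ weights)
    return ret_ann, vol_ann, port_daily

# Toy daily returns and weights (illustrative values)
df_toy = pd.DataFrame({
    "A": [0.01, -0.02, 0.015, 0.0],
    "B": [0.0, 0.01, -0.005, 0.02],
})
w_toy = pd.Series({"A": 0.4, "B": 0.6})

ret_ann, vol_ann, port_daily = annualized_stats(df_toy, w_toy)
print(ret_ann, vol_ann)
```

Because the sample covariance is linear in the weights, `vol_ann` equals the sample standard deviation of the portfolio's own daily return series multiplied by the square root of 252.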

### Constants

In [None]:
# Measurement periods

start_2022 = "2022-01-01"
end_2022 = "2022-12-31"

start_2023 = "2023-01-01"
end_2023 = "2023-07-29"

### [3.1] KMeans Approach 1

In [None]:
# Create function to calculate investment performance

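def calc_inv_perf_kmeans1(start_wgt, end_wgt, start_perf, end_perf, communities_dict):
  """
  Estimate risk parity weights on data from start_wgt to end_wgt,
  then measure performance on data from start_perf to end_perf.
  """

  # Empty dictionary to store results for each cluster or community
  results_dict = {}

  # Calculate daily portfolio returns for each cluster or community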
  for community_id in communities_dict.keys():
    assets = communities_dict[community_id]
    assets.sort()
    port_wgts = get_port_weights(start=start_wgt, end=end_wgt, assets=assets)

    data = yf.download(assets, start = start_perf, end = end_perf)
    data = data.loc[:,('Adj Close', slice(None))]
    data.columns = assets

    daily_ret = data[assets].pct_change().dropna(how='all')
    daily_ret.fillna(0, inplace=True)
    port_ret_daily = daily_ret@port_wgts
    results_dict[community_id] = port_ret_daily

  # Combine the per-cluster daily returns into one DataFrame (one column per cluster)
  df_cluster_ret_daily = pd.DataFrame(results_dict)

  totport_ret_ann, totport_vol_ann, totport_ret_daily = get_combined_port_return_vol(df_cluster_ret_daily)

  # Print results
  print(f"RP Portfolio Annualized Return: {totport_ret_ann:.4%}")
  print(f"RP Portfolio Annualized Volatility: {totport_vol_ann:.4%}")
  print(f"Sharpe Ratio: {totport_ret_ann/totport_vol_ann:.2f}")

#### Full Year 2022

In [None]:
calc_inv_perf_kmeans1(start_2022, end_2022, start_2022, end_2022, kmeans1_dict)


#### Year To Date 2023

In [None]:
calc_inv_perf_kmeans1(start_2022, end_2022, start_2023, end_2023, kmeans1_dict)


#### Start of 2022 to Year To Date 2023

In [None]:
calc_inv_perf_kmeans1(start_2022, end_2022, start_2022, end_2023, kmeans1_dict)


### [3.2] KMeans Approach 2

In [None]:
# Create function to calculate investment performance

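def calc_inv_perf_kmeans2(start_wgt, end_wgt, start_perf, end_perf, kmeans2_ticker_dict, kmeans2_weight_dict):
  """
  Estimate membership-scaled risk parity weights on data from start_wgt to end_wgt,
  then measure performance on data from start_perf to end_perf.
  """

  # Empty dictionary to store results for each cluster or community
  results_kmeans2_dict = {}

  # Calculate daily portfolio returns for each cluster or community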
  for cluster_id in kmeans2_ticker_dict.keys():
    assets = kmeans2_ticker_dict[cluster_id]
    weights = kmeans2_weight_dict[cluster_id]
    port_wgts = get_multilabel_port_weights(start=start_wgt, end=end_wgt, assets=assets, weights=weights)

    data = yf.download(assets, start = start_perf, end = end_perf)
    data = data.loc[:,('Adj Close', slice(None))]
    data.columns = assets

    daily_ret = data[assets].pct_change().dropna(how='all')
    daily_ret.fillna(0, inplace=True)
    port_ret_daily = daily_ret@port_wgts
    results_kmeans2_dict[cluster_id] = port_ret_daily

  # Combine the per-cluster daily returns into one DataFrame (one column per cluster)
  df_kmeans2_ret_daily = pd.DataFrame(results_kmeans2_dict)

  totport_ret_ann, totport_vol_ann, totport_ret_daily = get_combined_port_return_vol(df_kmeans2_ret_daily)

  # Print results
  print(f"RP Portfolio Annualized Return: {totport_ret_ann:.4%}")
  print(f"RP Portfolio Annualized Volatility: {totport_vol_ann:.4%}")
  print(f"Sharpe Ratio: {totport_ret_ann/totport_vol_ann:.2f}")

#### Full Year 2022

In [None]:
calc_inv_perf_kmeans2(start_2022, end_2022, start_2022, end_2022, kmeans2_ticker_dict, kmeans2_weight_dict)

You must convert self.cov to a positive definite matrix

#### Year To Date 2023

In [None]:
calc_inv_perf_kmeans2(start_2022, end_2022, start_2023, end_2023, kmeans2_ticker_dict, kmeans2_weight_dict)

You must convert self.cov to a positive definite matrix

ERROR:yfinance:
1 Failed download:
ERROR:yfinance:['DRI']: Exception('%ticker%: No price data found, symbol may be delisted (1d 2023-01-01 -> 2023-07-29)')

#### Start of 2022 to Year To Date 2023

In [None]:
calc_inv_perf_kmeans2(start_2022, end_2022, start_2022, end_2023, kmeans2_ticker_dict, kmeans2_weight_dict)

You must convert self.cov to a positive definite matrix

### [3.3] GICS Classification

In [None]:
# Create function to calculate investment performance

def calc_inv_perf_gics(start_wgt, end_wgt, start_perf, end_perf, gics_ticker_dict):
  """
  Estimate risk parity weights on data from start_wgt to end_wgt,
  then measure performance on data from start_perf to end_perf.
  """

  # Empty dictionary to store results for each sector
  results_gics_dict = {}

  # Calculate daily portfolio returns for each sector
  for sector_id in gics_ticker_dict.keys():
    assets = gics_ticker_dict[sector_id]
    port_wgts = get_port_weights(start=start_wgt, end=end_wgt, assets=assets)

    data = yf.download(assets, start = start_perf, end = end_perf)
    data = data.loc[:,('Adj Close', slice(None))]
    data.columns = assets

    daily_ret = data[assets].pct_change().dropna(how='all')
    daily_ret.fillna(0, inplace=True)
    port_ret_daily = daily_ret@port_wgts
    results_gics_dict[sector_id] = port_ret_daily

  # Combine the per-sector daily returns into one DataFrame (one column per sector)
  df_gics_ret_daily = pd.DataFrame(results_gics_dict)

  totport_ret_ann, totport_vol_ann, totport_ret_daily = get_combined_port_return_vol(df_gics_ret_daily)

  # Print results
  print(f"RP Portfolio Annualized Return: {totport_ret_ann:.4%}")
  print(f"RP Portfolio Annualized Volatility: {totport_vol_ann:.4%}")
  print(f"Sharpe Ratio: {totport_ret_ann/totport_vol_ann:.2f}")

#### Full Year 2022

In [None]:
calc_inv_perf_gics(start_2022, end_2022, start_2022, end_2022, gics_ticker_dict)


#### Year To Date 2023

In [None]:
calc_inv_perf_gics(start_2022, end_2022, start_2023, end_2023, gics_ticker_dict)


#### Start of 2022 to Year To Date 2023

In [None]:
calc_inv_perf_gics(start_2022, end_2022, start_2022, end_2023, gics_ticker_dict)


### [3.4] Multi-GICS Prediction Approach

In [None]:
# Create function to calculate investment performance

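def calc_inv_perf_gicspred(start_wgt, end_wgt, start_perf, end_perf, gicspred_ticker_dict, gicspred_weight_dict):
  """
  Estimate membership-scaled risk parity weights on data from start_wgt to end_wgt,
  then measure performance on data from start_perf to end_perf.
  """

  # Empty dictionary to store results for each cluster or community
  results_gicspred_dict = {}

  # Calculate daily portfolio returns for each cluster or community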
  for cluster_id in gicspred_ticker_dict.keys():
    assets = gicspred_ticker_dict[cluster_id]
    weights = gicspred_weight_dict[cluster_id]
    port_wgts = get_multilabel_port_weights(start=start_wgt, end=end_wgt, assets=assets, weights=weights)

    data = yf.download(assets, start = start_perf, end = end_perf)
    data = data.loc[:,('Adj Close', slice(None))]
    data.columns = assets

    daily_ret = data[assets].pct_change().dropna(how='all')
    daily_ret.fillna(0, inplace=True)
    port_ret_daily = daily_ret@port_wgts
    results_gicspred_dict[cluster_id] = port_ret_daily

  # Combine the per-cluster daily returns into one DataFrame (one column per cluster)
  df_gicspred_ret_daily = pd.DataFrame(results_gicspred_dict)

  totport_ret_ann, totport_vol_ann, totport_ret_daily = get_combined_port_return_vol(df_gicspred_ret_daily)

  # Print results
  print(f"RP Portfolio Annualized Return: {totport_ret_ann:.4%}")
  print(f"RP Portfolio Annualized Volatility: {totport_vol_ann:.4%}")
  print(f"Sharpe Ratio: {totport_ret_ann/totport_vol_ann:.2f}")

#### Full Year 2022

In [None]:
calc_inv_perf_gicspred(start_2022, end_2022, start_2022, end_2022, gicspred_ticker_dict, gicspred_weight_dict)


#### Year To Date 2023

In [None]:
calc_inv_perf_gicspred(start_2022, end_2022, start_2023, end_2023, gicspred_ticker_dict, gicspred_weight_dict)


#### Start of 2022 to Year To Date 2023

In [None]:
calc_inv_perf_gicspred(start_2022, end_2022, start_2022, end_2023, gicspred_ticker_dict, gicspred_weight_dict)


### [3.5] SPDR S&P 500 ETF Trust (SPY)

In [None]:
# Create function to calculate investment performance

def get_benchmark(start, end, assets):

  data = yf.download(assets, start=start, end=end)
  data = data['Adj Close'].pct_change().dropna()
  data.fillna(0, inplace=True)

  num_trading_days = len(data)
  port_ret_ann = ((1 + data).prod()) ** (252 / num_trading_days) - 1
  port_vol_ann = data.std() * (252 ** 0.5)

  print(f"RP Portfolio Annualized Return: {port_ret_ann:.4%}")
  print(f"RP Portfolio Annualized Volatility: {port_vol_ann:.4%}")
  print(f"Sharpe Ratio: {port_ret_ann/port_vol_ann:.2f}")

#### Full Year 2022

In [None]:
get_benchmark(start=start_2022, end=end_2022, assets=["SPY"])

[*********************100%***********************]  1 of 1 completed
RP Portfolio Annualized Return: -20.0875%
RP Portfolio Annualized Volatility: 24.3084%
Sharpe Ratio: -0.83


#### Year To Date 2023

In [None]:
get_benchmark(start=start_2023, end=end_2023, assets=["SPY"])

[*********************100%***********************]  1 of 1 completed
RP Portfolio Annualized Return: 38.1691%
RP Portfolio Annualized Volatility: 13.9291%
Sharpe Ratio: 2.74


#### Start of 2022 to Year To Date 2023

In [None]:
get_benchmark(start=start_2022, end=end_2023, assets=["SPY"])

[*********************100%***********************]  1 of 1 completed
RP Portfolio Annualized Return: -2.8128%
RP Portfolio Annualized Volatility: 21.1612%
Sharpe Ratio: -0.13
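
The Sharpe ratios printed for SPY are the annualized return divided by the annualized volatility, with a risk-free rate of zero. As a quick consistency check, the sketch below recomputes them from the return and volatility figures printed above; the dictionary name and period labels are chosen here for illustration.

```python
# (annualized return %, annualized volatility %) copied from the SPY outputs above
spy_results = {
    "2022": (-20.0875, 24.3084),
    "2023 YTD": (38.1691, 13.9291),
    "2022 to 2023 YTD": (-2.8128, 21.1612),
}

# Sharpe ratio with a zero risk-free rate, formatted like the notebook output
sharpe = {period: f"{ret / vol:.2f}" for period, (ret, vol) in spy_results.items()}
print(sharpe)
```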
