<div style="display: flex; align-items: center; width: 100%;">  
  <div style="display: flex; flex-direction: column; align-items: center; justify-content: center; width: 100px; margin-right: 0px;">    
    <a href="https://risklab.ai" style="border: 0; line-height: 0.5;">
      <img src="../../utils/risklab_ai.gif" width="60px" style="border: 0; margin-bottom:-10px; vertical-align: middle;"/>
    </a>
  </div>  
  <div style="flex-grow: 1;">
    <h1 style="margin: 0; margin-left:0; font-weight: bold; text-align: left; font-size: 38px;">
      Hierarchical Risk Parity 
    </h1>
  </div>  
</div>

This notebook demonstrates the Hierarchical Risk Parity (HRP) algorithm using the `RiskLabAI` library. 

HRP is a modern portfolio allocation method from Marcos López de Prado that addresses the main issues of traditional Mean-Variance Optimization (MVO), namely instability and concentration. 

In this tutorial, we will:
1.  Load a universe of diverse assets (stocks, bonds, commodities, and FX) from the FRED database.
2.  Calculate the covariance and correlation matrices for these assets.
3.  Visualize the asset hierarchy using a dendrogram, which is the first step of HRP.
4.  Compute the final HRP portfolio weights using `RiskLabAI.optimization.hrp.hrp_alloc`.
5.  Run a Monte Carlo simulation to compare the stability of HRP weights against traditional MVO weights.

## 0. Setup and Imports

In [5]:
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.cluster.hierarchy as sch
import warnings
import os

# Third-party for data
from fredapi import Fred
from dotenv import load_dotenv

# RiskLabAI Imports
from RiskLabAI.optimization.hrp import hrp_alloc
from RiskLabAI.optimization.nco import get_optimal_portfolio_weights
from RiskLabAI.data.synthetic_data.simulation import form_true_matrix, simulates_cov_mu
from RiskLabAI.cluster.clustering import covariance_to_correlation
import RiskLabAI.utils.publication_plots as pub_plots

# Setup configuration
warnings.filterwarnings('ignore')

ImportError: cannot import name 'hrp_alloc' from 'RiskLabAI.optimization.hrp' (c:\Users\Hamid\.conda\envs\risklab313\Lib\site-packages\RiskLabAI\optimization\hrp.py)

In [None]:
# Global Plotting Settings
SAVE_PLOTS = False
PLOT_THEME = 'dark' # Options: 'light', 'medium', 'dark', 'light-transparent'
PLOT_QUALITY = 300
SAVE_DIR = 'figs'

pub_plots.setup_publication_style(
    theme=PLOT_THEME,
    quality=PLOT_QUALITY,
    save_plots=SAVE_PLOTS,
    save_dir=SAVE_DIR
)

## 1. Load Data from FRED

We will download daily price data for a set of diverse global assets from the Federal Reserve Economic Data (FRED) database. To do this, you'll need a free API key from the [FRED website](https://fred.stlouisfed.org/docs/api/api_key.html).

Once you have your key, create a file named `.env` in the same directory as this notebook and add the line: `FRED_API_KEY='YOUR_KEY_HERE'`.

In [None]:
# This finds the .env file and loads FRED_API_KEY into os.environ
load_dotenv()

FRED_API_KEY = os.environ.get('FRED_API_KEY')
# FRED_API_KEY = 'YOUR_KEY_HERE' # Alternatively, uncomment and paste your key here

if not FRED_API_KEY:
    # Stop here with a clear message rather than failing inside Fred()
    raise RuntimeError(
        "FRED_API_KEY not found. Please create a .env file with your key "
        "or set the environment variable."
    )

print("FRED API Key loaded successfully.")

fred = Fred(api_key=FRED_API_KEY)

# We select a diverse set of assets: Stocks, Bonds, Gold, and Currencies
ticker_dict = {
    'S&P_500': 'SP500',
    'NASDAQ': 'NASDAQCOM',
    'US_CORP_BONDS': 'BAMLCC0A0CMTRIV', # US Corporate Bond Total Return Index
    'GOLD': 'GOLDPMGBD228NLBM',       # Gold Price
    'EUR_USD': 'DEXUSEU',               # FX: US dollars per euro
    'JPY_USD': 'DEXJPUS',               # FX: Japanese yen per US dollar
    'CNY_USD': 'DEXCHUS'                # FX: Chinese yuan per US dollar
}

start_date = '2014-01-01'
end_date = '2024-01-01'

data_list = []
for name, ticker in ticker_dict.items():
    series = fred.get_series(ticker, start_date, end_date)
    series.name = name
    data_list.append(series)

# Combine all series into a single DataFrame
data = pd.concat(data_list, axis=1)

# Forward-fill missing values (e.g., holidays) and drop any initial NaNs
data = data.ffill().dropna()

# Calculate log returns
returns = np.log(data).diff().dropna()

print(f"Loaded returns data from {returns.index.min().date()} to {returns.index.max().date()}")
returns.head()

## 2. Calculate Inputs

HRP requires two inputs:
1.  **Covariance Matrix:** Used for the final weight allocation.
2.  **Correlation Matrix:** Used for the tree clustering step.
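
The two inputs carry related information: the correlation matrix is the covariance matrix rescaled by the asset volatilities, $\rho_{ij} = \sigma_{ij} / (\sigma_i \sigma_j)$. The snippet below is a minimal sketch of that conversion on a made-up 2x2 covariance matrix. The helper name `cov_to_corr` and the numbers are illustrative assumptions; this is not the `covariance_to_correlation` function imported from `RiskLabAI`.

```python
import numpy as np


def cov_to_corr(cov: np.ndarray) -> np.ndarray:
    """Rescale a covariance matrix into a correlation matrix (hypothetical helper)."""
    std = np.sqrt(np.diag(cov))          # per-asset volatilities
    corr = cov / np.outer(std, std)      # divide each entry by sigma_i * sigma_j
    return np.clip(corr, -1.0, 1.0)      # guard against floating-point overshoot


# Made-up example: volatilities of 0.2 and 0.1, covariance of 0.006
cov = np.array([[0.04, 0.006],
                [0.006, 0.01]])
corr = cov_to_corr(cov)
print(corr)
```

The covariance matrix keeps the volatility scale that the weighting step needs, while the correlation matrix removes it so that clustering depends only on co-movement.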

In [None]:
cov_matrix = returns.cov()
corr_matrix = returns.corr()

print("Correlation Matrix:")
display(corr_matrix.style.background_gradient(cmap='coolwarm', vmin=-1, vmax=1))

## 3. Visualize the Cluster Hierarchy

Before we run the allocation, let's visualize the first step. We can use `scipy.cluster.hierarchy` to compute and plot the dendrogram. This shows us how HRP 'sees' the asset universe.
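
To make the clustering step concrete, here is a small sketch on a made-up 3x3 correlation matrix. It maps correlations to the distance $d = \sqrt{(1 - \rho)/2}$ and builds a single-linkage tree. One assumption to flag: this sketch passes the condensed form of the distance matrix to `sch.linkage`, whereas the notebook cell below passes the square matrix, which SciPy treats as observation vectors. The two calls can therefore produce different trees.

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import squareform

# The distance is 0 for rho = 1, about 0.707 for rho = 0, and 1 for rho = -1
print(np.sqrt((1.0 - np.array([1.0, 0.0, -1.0])) / 2.0))

# Made-up correlations: assets 0 and 1 move together, asset 2 is nearly independent
corr = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
dist = np.sqrt((1.0 - corr) / 2.0)

# squareform converts the square matrix into the condensed vector linkage expects
link = sch.linkage(squareform(dist, checks=False), method="single")

# Each row of link is (cluster_a, cluster_b, merge_distance, cluster_size)
print(link)
```

The first row of `link` merges the two highly correlated assets, and the second row attaches the remaining asset at the smaller of its two distances to that pair.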

In [None]:
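# 1. Calculate the correlation-based distance matrix: d = sqrt((1 - rho) / 2)
dist_matrix = np.sqrt((1 - corr_matrix.values) / 2.)

# 2. Calculate the linkage matrix using 'single' linkage.
# Note: SciPy treats a square 2-D input as observation vectors, so the tree is
# built on Euclidean distances between the rows of dist_matrix.
link = sch.linkage(dist_matrix, 'single')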

# 3. Plot the dendrogram
fig, ax = plt.subplots(figsize=(10, 6))

dendrogram = sch.dendrogram(
    link,
    labels=corr_matrix.columns,
    ax=ax,
    leaf_rotation=90
)

pub_plots.apply_plot_style(
    ax,
    title='Hierarchical Clustering of Assets (Dendrogram)',
    xlabel='Assets',
    ylabel='Distance'
)
ax.grid(axis='x')

pub_plots.finalize_plot(fig, 'hrp_dendrogram.png')

## 4. Run HRP Allocation

Now we pass our covariance and correlation matrices to the `hrp_alloc` function to get the final portfolio weights.
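
For intuition about what an HRP allocator does with these two matrices, the sketch below follows the three steps of López de Prado's published algorithm: tree clustering, quasi-diagonalization, and recursive bisection. It is an illustration under stated assumptions, not the `RiskLabAI` implementation: `hrp_sketch` and its helpers are hypothetical names, the linkage uses the condensed distance matrix, and the usage example runs on random data.

```python
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import squareform


def inverse_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Weights proportional to 1 / variance, normalised to sum to one."""
    inv_var = 1.0 / np.diag(cov)
    return inv_var / inv_var.sum()


def cluster_variance(cov: pd.DataFrame, items: list) -> float:
    """Variance of a cluster held at inverse-variance weights."""
    sub_cov = cov.loc[items, items].values
    w = inverse_variance_weights(sub_cov)
    return float(w @ sub_cov @ w)


def hrp_sketch(cov: pd.DataFrame, corr: pd.DataFrame) -> pd.Series:
    """Illustrative HRP allocation (hypothetical helper, not RiskLabAI code)."""
    # Step 1: tree clustering on the correlation-based distance
    dist = np.sqrt(np.clip((1.0 - corr.values) / 2.0, 0.0, 1.0))
    link = sch.linkage(squareform(dist, checks=False), method="single")

    # Step 2: quasi-diagonalization, i.e. order assets as the tree's leaves
    ordered = corr.index[sch.leaves_list(link)].tolist()

    # Step 3: recursive bisection, splitting weight by inverse cluster variance
    weights = pd.Series(1.0, index=ordered)
    clusters = [ordered]
    while clusters:
        # Halve every cluster that still holds more than one asset
        clusters = [
            c[start:stop]
            for c in clusters
            for start, stop in ((0, len(c) // 2), (len(c) // 2, len(c)))
            if len(c) > 1
        ]
        for k in range(0, len(clusters), 2):
            left, right = clusters[k], clusters[k + 1]
            var_left = cluster_variance(cov, left)
            var_right = cluster_variance(cov, right)
            alpha = 1.0 - var_left / (var_left + var_right)
            weights.loc[left] *= alpha          # lower-variance half gets more
            weights.loc[right] *= 1.0 - alpha
    return weights.reindex(corr.index)


# Usage on synthetic returns (random data, for illustration only)
rng = np.random.default_rng(42)
demo_returns = pd.DataFrame(rng.normal(size=(500, 4)), columns=list("ABCD"))
demo_weights = hrp_sketch(demo_returns.cov(), demo_returns.corr())
print(demo_weights.round(3))
```

Because no matrix is inverted at any step, the weights are always positive and sum to one. For a diagonal covariance matrix the procedure reduces to plain inverse-variance weighting.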

In [None]:
hrp_weights = hrp_alloc(cov_matrix, corr_matrix)

fig, ax = plt.subplots(figsize=(10, 6))
hrp_weights.sort_values(ascending=False).plot(kind='bar', ax=ax)

pub_plots.apply_plot_style(
    ax,
    title='HRP Portfolio Allocations',
    xlabel='Asset',
    ylabel='Weight'
)
ax.grid(axis='x')
pub_plots.finalize_plot(fig, 'hrp_weights_real_data.png')

**Observation:** The HRP algorithm successfully clustered the assets and produced a diversified set of portfolio weights, allocating capital across the different asset classes (stocks, bonds, gold, and currencies).

## 5. HRP vs. MVO Stability (Monte Carlo)

A key advantage of HRP is its **stability**. Traditional MVO (e.g., Markowitz) is notoriously unstable: small changes in the input covariance matrix (due to estimation errors) can lead to wildly different portfolio weights.

To test this, we will:
1.  Create a synthetic "true" covariance matrix with a known block structure.
2.  Run 100 simulations. In each simulation, we will draw a small, noisy sample (`T=100`) from this "true" world to create an empirical covariance matrix.
3.  Calculate both the MVO and HRP weights for each noisy matrix.
4.  Plot the distribution of weights for all 100 simulations.
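
The source of the instability can be sketched with NumPy and SciPy alone. The example below builds a block-correlated "true" covariance matrix, draws repeated samples of `T=100` observations, and computes minimum-variance weights $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$ from each sample covariance. It is a simplified stand-in, not a reproduction of `form_true_matrix` or `simulates_cov_mu`: unit variances, zero means, Gaussian draws, and the helper name `min_variance_weights` are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)

# "True" world: 5 blocks of 10 assets, 0.5 correlation inside a block, 0 across
n_blocks, b_size, b_corr = 5, 10, 0.5
block = np.full((b_size, b_size), b_corr)
np.fill_diagonal(block, 1.0)                  # unit variances (simplification)
true_cov = block_diag(*[block] * n_blocks)
n_assets = true_cov.shape[0]


def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum-variance weights: inv(cov) @ 1, rescaled to sum to one."""
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()


# Re-estimate the covariance from small samples and recompute the weights
n_obs, n_sims = 100, 100
weights = np.empty((n_sims, n_assets))
for s in range(n_sims):
    sample = rng.multivariate_normal(np.zeros(n_assets), true_cov, size=n_obs)
    weights[s] = min_variance_weights(np.cov(sample, rowvar=False))

true_w = min_variance_weights(true_cov)       # equal weights by symmetry
dispersion = weights.std(axis=0)              # per-asset spread across trials

print("True weight per asset:", true_w[0])
print("Mean std of estimated weights:", dispersion.mean())
print("Most negative estimated weight:", weights.min())
```

Comparing the spread of the estimated weights with the equal weights implied by the true matrix gives a rough sense of how much estimation error the matrix inversion passes through to the allocation.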

In [None]:
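def run_allocation_simulation(mu0, cov0, n_obs, n_sims, n_clusters):
    """
    Runs a Monte Carlo simulation comparing MVO and HRP weight stability.

    Draws `n_sims` empirical covariance matrices of `n_obs` observations each
    from (mu0, cov0) and returns two DataFrames (MVO, HRP) with one row of
    weights per simulation. `n_clusters` is accepted but not used here.
    """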
    mvo_weights = []
    hrp_weights = []
    
    for _ in range(n_sims):
        # 1. Simulate a noisy, empirical covariance matrix
        # We set shrink=False to get the raw, unstable matrix
        _, cov1 = simulates_cov_mu(mu0, cov0, n_obs, shrink=False)
        corr1 = covariance_to_correlation(cov1)
        
        # 2. Calculate MVO (Markowitz) weights (Global Minimum Variance)
        w_mvo = get_optimal_portfolio_weights(cov1, mu=None)
        mvo_weights.append(w_mvo.flatten())
        
        # 3. Calculate HRP weights
        w_hrp = hrp_alloc(cov1, corr1)
        hrp_weights.append(w_hrp)
    
    # Convert lists of weights to DataFrames
    asset_names = cov0.columns
    mvo_weights_df = pd.DataFrame(mvo_weights, columns=asset_names)
    hrp_weights_df = pd.DataFrame(hrp_weights, columns=asset_names)
    
    return mvo_weights_df, hrp_weights_df

def plot_weight_stability(mvo_df, hrp_df):
    """
    Plots the simulation results using boxplots to show weight distribution.
    """
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10), sharex=True)
    
    # 1. Plot MVO weight distribution
    sns.boxplot(data=mvo_df, ax=ax1, palette='coolwarm')
    pub_plots.apply_plot_style(
        ax1,
        title='Markowitz (MVO) Weight Stability (100 Simulations)',
        xlabel='',
        ylabel='Portfolio Weight'
    )
    ax1.grid(axis='x')
    
    # 2. Plot HRP weight distribution
    sns.boxplot(data=hrp_df, ax=ax2, palette='coolwarm')
    pub_plots.apply_plot_style(
        ax2,
        title='HRP Weight Stability (100 Simulations)',
        xlabel='Asset Index',
        ylabel='Portfolio Weight'
    )
    ax2.grid(axis='x')
    
    plt.tight_layout()
    pub_plots.finalize_plot(fig, 'hrp_mvo_stability_comparison.png')


In [None]:
# 1. Setup the synthetic "true" world
n_blocks = 5       # 5 clusters
b_size = 10        # 10 assets per cluster
b_corr = 0.5       # 50% correlation within a cluster
n_assets = n_blocks * b_size

mu0, cov0 = form_true_matrix(n_blocks, b_size, b_corr)
cov0 = pd.DataFrame(cov0, 
                    index=range(n_assets), 
                    columns=range(n_assets))

# 2. Simulation parameters
n_obs = 100        # Simulate only 100 observations for 50 assets (T = 2N)
n_sims = 100       # Run 100 trials

# 3. Run the simulation
print(f"Running {n_sims} simulations...")
mvo_weights, hrp_weights = run_allocation_simulation(
    mu0=mu0,
    cov0=cov0,
    n_obs=n_obs,
    n_sims=n_sims,
    n_clusters=n_blocks
)
print("Simulation complete. Plotting results...")

# 4. Plot the results
plot_weight_stability(mvo_weights, hrp_weights)