# ⭐ Tutorial: Nested Clustered Optimisation (NCO)

This notebook demonstrates the Nested Clustered Optimisation (NCO) algorithm, another advanced allocation method from Marcos López de Prado (introduced in 'A Robust Estimator of the Efficient Frontier', 2019, and covered in Chapter 7 of 'Machine Learning for Asset Managers', 2020).

NCO is a hybrid approach that combines clustering with traditional mean-variance optimisation (MVO). It is designed to reduce the instability that arises when MVO inverts a noisy, ill-conditioned covariance matrix.

The algorithm works in steps:
1.  **Denoise (and optionally detone):** Clean the empirical covariance matrix using Random Matrix Theory (RMT), so that eigenvalues consistent with pure noise no longer distort the estimate.
2.  **Cluster Assets:** Group the assets into a predefined number of clusters based on their correlation.
3.  **Intra-Cluster Allocation:** Run MVO (e.g., `optimal_portfolio`) *inside* each cluster to find the optimal sub-portfolios.
4.  **Inter-Cluster Allocation:** Treat each cluster's sub-portfolio as a single asset, run MVO across clusters on the resulting reduced covariance matrix, and multiply the intra-cluster weights by the cluster weights to obtain the final portfolio.
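
Steps 3 and 4 can be sketched in plain NumPy for the minimum-variance case. This is only an illustration under stated assumptions, not the `RiskLabAI` implementation: the helper names `min_var_weights` and `nco_sketch` are hypothetical, the cluster labels are supplied by hand rather than found by a clustering algorithm, and no denoising is applied.

```python
import numpy as np


def min_var_weights(cov):
    """Closed-form minimum-variance weights: inv(cov) @ 1, normalised to sum to 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def nco_sketch(cov, labels):
    """Illustrative NCO for the minimum-variance case, given cluster labels."""
    cov = np.asarray(cov, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)

    # Step 3: intra-cluster weights, stored as one column per cluster
    intra = np.zeros((cov.shape[0], clusters.size))
    for j, c in enumerate(clusters):
        members = np.flatnonzero(labels == c)
        intra[members, j] = min_var_weights(cov[np.ix_(members, members)])

    # Step 4: treat each cluster as a single asset and allocate across clusters
    cov_reduced = intra.T @ cov @ intra
    inter = min_var_weights(cov_reduced)

    # Final weight = intra-cluster weight x weight of the asset's cluster
    return intra @ inter


# Toy example: two uncorrelated blocks of two assets each
cov = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])
weights = nco_sketch(cov, labels=[0, 0, 1, 1])
print(weights)
```

In this toy case each cluster splits evenly between its two assets, and the lower-variance cluster receives two thirds of the total weight.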

This notebook is a **capstone tutorial** that combines multiple `RiskLabAI` modules:
* `RiskLabAI.data.synthetic_data` (to create a known data structure)
* `RiskLabAI.data.denoise` (to clean the matrix)
* `RiskLabAI.optimization` (to perform clustering and optimization)

## 0. Setup and Imports

In [None]:
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

# RiskLabAI Imports
from RiskLabAI.optimization.nco import cluster_kmeans_base, nco_alloc, optimal_portfolio
from RiskLabAI.data.denoise.denoising import denoise_cov
import RiskLabAI.data.denoise.denoising as dn  # alias used below for dn.cov_to_corr
from RiskLabAI.data.synthetic_data.simulation import form_true_matrix, simulates_cov_mu
import RiskLabAI.utils.publication_plots as pub_plots

# Setup plotting and configuration
pub_plots.setup_publication_style()
warnings.filterwarnings('ignore')

## 1. Create Synthetic, Structured Data

NCO works best when there is a clear cluster structure. We'll use our `form_true_matrix` function to create a 'true' covariance matrix (`cov0`) with 5 distinct blocks (clusters) of 10 assets each, 50 assets in total.
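
To make the idea of a block structure concrete, here is a minimal NumPy sketch of a block-diagonal correlation matrix. The helper `block_corr_sketch` is hypothetical and is not the library's `form_true_matrix`, which may additionally shuffle the assets and assign variances and expected returns.

```python
import numpy as np


def block_corr_sketch(n_blocks, b_size, b_corr):
    """Block-diagonal correlation matrix: b_corr within a block, 0 across blocks."""
    block = np.full((b_size, b_size), b_corr)
    np.fill_diagonal(block, 1.0)
    # The Kronecker product with the identity places one copy of the block per cluster
    return np.kron(np.eye(n_blocks), block)


corr = block_corr_sketch(n_blocks=5, b_size=10, b_corr=0.5)
print(corr.shape)  # (50, 50)
```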

In [None]:
n_blocks = 5
b_size = 10
b_corr = 0.5

# 1. Create the 'True' (ground truth) covariance matrix
mu0, cov0 = form_true_matrix(n_blocks, b_size, b_corr)

print("Generated a 'true' covariance matrix of shape:", cov0.shape)

# Plot the 'True' Correlation Matrix
corr0 = pd.DataFrame(dn.cov_to_corr(cov0)) # Use helper from denoising

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr0, ax=ax, cmap='coolwarm', cbar=False)
pub_plots.apply_plot_style(
    ax,
    title='True Correlation Matrix (Shuffled)',
    xlabel='Assets',
    ylabel='Assets'
)
plt.show()

## 2. Simulate Noisy, Empirical Data

Now we simulate a small number of observations (`n_obs = 100`) drawn from the 'true' distribution. With only 100 observations for 50 assets, the resulting empirical covariance matrix (`cov1`) is noisy and unstable, which is the situation we would face in practice.

In [None]:
n_obs = 100 # Low number of observations to create noise

# 1. Simulate to get the 'empirical' (noisy) matrix
mu1, cov1 = simulates_cov_mu(mu0, cov0, n_obs, shrink=False)

# 2. Plot the 'Empirical' Correlation Matrix
corr1 = pd.DataFrame(dn.cov_to_corr(cov1))

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr1, ax=ax, cmap='coolwarm', cbar=False)
pub_plots.apply_plot_style(
    ax,
    title='Empirical (Noisy) Correlation Matrix',
    xlabel='Assets',
    ylabel='Assets'
)
plt.show()

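**Observation:** Sampling noise blurs the true block structure, and spurious correlations appear between assets that belong to different blocks. Running MVO on this matrix would tend to produce unstable, suboptimal weights, because the optimiser amplifies estimation error.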
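
The effect of sample size can be checked with a small experiment. The sketch below is an illustration that assumes Gaussian returns and a hypothetical block-diagonal ground truth. It uses only NumPy rather than `simulates_cov_mu`, and measures the Frobenius distance between the sample and true correlation matrices for a short and a long sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: 5 blocks of 10 assets, 0.5 correlation within a block
block = np.full((10, 10), 0.5)
np.fill_diagonal(block, 1.0)
true_corr = np.kron(np.eye(5), block)


def corr_error(n_obs):
    """Frobenius distance between the sample and true correlation matrices."""
    x = rng.multivariate_normal(np.zeros(50), true_corr, size=n_obs)
    sample_corr = np.corrcoef(x, rowvar=False)
    return np.linalg.norm(sample_corr - true_corr)


err_small = corr_error(100)     # 100 observations for 50 assets
err_large = corr_error(10_000)  # 10,000 observations for 50 assets
print(err_small, err_large)
```

The estimation error shrinks as the number of observations grows relative to the number of assets, which is why short samples are the hard case.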

## 3. Denoise the Covariance Matrix

We use the `denoise_cov` function we developed in the previous module to clean the empirical matrix.
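
As a rough picture of what constant-residual-eigenvalue denoising does, the sketch below replaces every eigenvalue under the Marchenko-Pastur upper edge with their average. It is a simplification under stated assumptions: the hypothetical helper `denoise_corr_sketch` works on a correlation matrix and fixes the noise variance at `sigma2=1.0`, whereas a full implementation would typically fit that variance to the observed eigenvalue distribution.

```python
import numpy as np


def denoise_corr_sketch(corr, q, sigma2=1.0):
    """Constant-residual-eigenvalue denoising with a fixed Marchenko-Pastur cutoff."""
    e_max = sigma2 * (1.0 + np.sqrt(1.0 / q)) ** 2  # upper edge of the MP distribution
    e_val, e_vec = np.linalg.eigh(corr)             # eigenvalues in ascending order
    e_val = e_val.copy()
    noise = e_val <= e_max
    if noise.any():
        # Flatten the noise eigenvalues to their mean, which preserves the trace
        e_val[noise] = e_val[noise].mean()
    corr_d = e_vec @ np.diag(e_val) @ e_vec.T
    # Rescale so the diagonal is exactly 1 again
    d = np.sqrt(np.diag(corr_d))
    return corr_d / np.outer(d, d)


# Pure-noise example: 100 observations of 50 independent assets
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 50))
corr = np.corrcoef(x, rowvar=False)
corr_d = denoise_corr_sketch(corr, q=100 / 50)
```

On pure noise, most or all eigenvalues fall below the cutoff, so the denoised matrix moves much closer to the identity than the raw sample correlation.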

In [None]:
# q = T / N: ratio of observations to assets, which sets the Marchenko-Pastur noise bounds
q = n_obs / float(cov1.shape[1])

# 1. Denoise the covariance matrix
cov1_d = denoise_cov(cov1, q, bandwidth=.01, denoise_method='const_resid')

# 2. Plot the 'Denoised' Correlation Matrix
corr1_d = pd.DataFrame(dn.cov_to_corr(cov1_d))

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr1_d, ax=ax, cmap='coolwarm', cbar=False)
pub_plots.apply_plot_style(
    ax,
    title='Denoised Correlation Matrix',
    xlabel='Assets',
    ylabel='Assets'
)
plt.show()

**Observation:** The denoised matrix is a *much* cleaner representation of the original block structure. This is the stable matrix we will use for our NCO allocation.

## 4. Run NCO Allocation

Now we run the NCO algorithm. We pass it the denoised covariance matrix and tell it to find the 5 clusters we know are there.

In [None]:
# 1. Run the main NCO allocation function
# We pass mu=None to compute the Minimum-Variance portfolio
nco_weights = nco_alloc(
    cov1_d,
    mu=None, 
    n_clusters=n_blocks
)

# 2. For comparison, run standard MVO (Markowitz)
markowitz_weights = optimal_portfolio(cov1_d, mu=None)
# Flatten in case the optimizer returns a column vector of shape (n, 1)
markowitz_weights = pd.Series(np.asarray(markowitz_weights).flatten(), index=cov1_d.index)

# 3. Combine results for plotting
weights_df = pd.DataFrame({
    'NCO': nco_weights,
    'Markowitz (MVO)': markowitz_weights
}).sort_index()

print("NCO vs. Markowitz (MVO) Weights")
display(weights_df.head())

In [None]:
# Plot the two sets of weights
fig, ax = plt.subplots(figsize=(12, 7))

weights_df.plot(kind='bar', ax=ax, width=0.8)

pub_plots.apply_plot_style(
    ax,
    title='NCO vs. Markowitz (MVO) Allocations',
    xlabel='Asset Index',
    ylabel='Portfolio Weight'
)
ax.grid(axis='x')
plt.tight_layout()
plt.show()

**Conclusion:** In this comparison, the standard Markowitz (MVO) allocation tends to be the more concentrated of the two, placing most of its weight in a few assets.

The **NCO allocation**, by contrast, is typically more diversified. By optimising *within* each cluster before allocating *across* clusters, it limits how far estimation error in one block can propagate to the rest of the portfolio, which makes the result more robust and stable.
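
One way to put a number on concentration is the effective number of assets, the inverse of the Herfindahl index of the weights. The helper `effective_n` below is a hypothetical illustration, not part of `RiskLabAI`, and it is most meaningful for long-only weights.

```python
import numpy as np


def effective_n(weights):
    """Inverse Herfindahl index: 1 / sum(w_i^2) for weights normalised to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)


print(effective_n([0.25, 0.25, 0.25, 0.25]))  # 4.0
print(effective_n([0.97, 0.01, 0.01, 0.01]))  # about 1.06
```

Applied to this notebook, something like `weights_df.apply(effective_n)` would give one figure per method; a higher value indicates a more evenly spread portfolio.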