# Robust Spectral Portfolio Diversification
### Francisco A. Ibanez

1. RPCA on the sample
2. Singular Value Hard Thresholding (SVHT)
3. Truncated SVD
4. Maximize portfolio effective bets - regularization, s.t.:
    - Positivity constraint
    - Leverage 1x

The combination of (1), (2), and (3) should limit the possible permutations of the J vector when doing the spectral risk parity.

## Methodology
The goal of the overall methodology is to arrive at a portfolio weights vector that provides a well-balanced exposure to each of the spectral risk factors present in a given investable universe.

We start with the data set $X_{T \times N}$, which contains the historical excess returns of each asset in the portfolio's investable universe. Before performing the eigendecomposition on $X$, we need to clean it of noisy trading observations and outliers. To do so, we apply Robust Principal Component Analysis (RPCA) to $X$, which seeks to decompose $X$ into a structured low-rank matrix $R$ and a sparse matrix $C$ containing outliers and corrupt data:

\begin{aligned}
X = R + C
\end{aligned}

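The principal components of $R$ are robust to the outliers and corrupt data isolated in $C$. Mathematically, the goal is to find the $R$ and $C$ that solve the following problem, where $||R||_{*}$ is the nuclear norm of $R$ (the sum of its singular values), $||C||_{1}$ is the entrywise $\ell_1$ norm of $C$, and $\lambda$ controls the trade-off between the two: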

\begin{aligned}
\min_{R,C} \quad & ||R||_{*} + \lambda ||C||_{1} \\
\text{subject to} \quad & R + C = X
\end{aligned} 
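To make the optimization above concrete, the following is a minimal sketch of principal component pursuit solved with an augmented Lagrangian (ADMM-style) iteration. It is not the implementation inside the `rpca` package used in this notebook; the function names `shrink`, `svd_shrink`, and `pcp` are hypothetical, and the defaults for `lmb` and `mu` follow the common choices from the principal component pursuit literature rather than anything stated here (the notebook itself uses a larger `lmb`). The synthetic data at the end is only a sanity check.

```python
import numpy as np

def shrink(A, tau):
    """Entrywise soft-thresholding (proximal operator of tau * l1 norm)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def svd_shrink(A, tau):
    """Singular value soft-thresholding (proximal operator of tau * nuclear norm)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U * shrink(s, tau)) @ Vh

def pcp(X, lmb=None, tol=1e-7, max_iter=2000):
    """Split X into a low-rank part R and a sparse part C with R + C ~= X."""
    t, n = X.shape
    if lmb is None:
        lmb = 1.0 / np.sqrt(max(t, n))        # assumed default, not the notebook's value
    mu = t * n / (4.0 * np.abs(X).sum())      # assumed penalty parameter
    C = np.zeros_like(X)
    Y = np.zeros_like(X)                      # Lagrange multipliers for R + C = X
    norm_X = np.linalg.norm(X)
    for _ in range(max_iter):
        R = svd_shrink(X - C + Y / mu, 1.0 / mu)   # update low-rank part
        C = shrink(X - R + Y / mu, lmb / mu)       # update sparse part
        resid = X - R - C
        Y = Y + mu * resid                         # dual update
        if np.linalg.norm(resid) <= tol * norm_X:
            break
    return R, C

# Synthetic sanity check: rank-2 matrix plus 5% large sparse corruptions.
rng = np.random.default_rng(0)
t, n, rank = 100, 50, 2
R_true = rng.normal(size=(t, rank)) @ rng.normal(size=(rank, n))
C_true = np.where(rng.random((t, n)) < 0.05,
                  rng.choice([-10.0, 10.0], size=(t, n)), 0.0)
X_demo = R_true + C_true
R_hat, C_hat = pcp(X_demo)
rel_err = np.linalg.norm(R_hat - R_true) / np.linalg.norm(R_true)
print(f"relative error of recovered low-rank part: {rel_err:.2e}")
```

A larger `lmb` penalizes entries of $C$ more heavily, so fewer observations are flagged as outliers and more of the variation stays in $R$.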

In [2]:
import pandas as pd
import numpy as np
from rpca import RobustPCA
import matplotlib.pyplot as plt
from scipy.linalg import svd
from optht import optht

raw = pd.read_pickle('etf_er.pkl').dropna() # Working with even panel for now
sample = raw.copy()

# Outlier detection & cleaning
X = (sample - sample.mean()).div(sample.std()).values
t, n = X.shape
lmb = 4 / np.sqrt(max(t, n))
rob = RobustPCA(lmb=lmb, max_iter=int(1E6))
R, C = rob.fit(X)

# Low-rank representation through hard thresholding Truncated-SVD
U, S, Vh = svd(R, full_matrices=False, compute_uv=True, lapack_driver='gesdd')
S = np.diag(S)
k = optht(X, sv=np.diag(S), sigma=None)

V = Vh.T
Vt = V.copy()
Vt[:, k:] = 0


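cum_energy = np.cumsum(np.diag(S)) / np.sum(np.diag(S))
# NOTE: cum_energy[k] covers the first k + 1 singular values (0-based indexing);
# cum_energy[k - 1] is the share retained by the rank-k truncation.
print(f'SVHT: {k}, {round(cum_energy[k] * 100, 2)}% of energy explained')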

SVHT: 8, 58.43% of energy explained


\begin{aligned}
X &= R + C \\
R &= USV^{T}
\end{aligned}

Using the rank $k$ given by the Singular Value Hard Thresholding (SVHT) above, we can approximate $R$ with a truncated SVD that keeps only the first $k$ singular values and vectors:
\begin{aligned}
R &\approx \tilde{U}\tilde{S}\tilde{V}^{T}
\end{aligned}
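As an illustration of the thresholding step, here is a small sketch of the hard threshold for the unknown-noise case, using the polynomial approximation $\omega(\beta) \approx 0.56\beta^3 - 0.95\beta^2 + 1.82\beta + 1.43$ applied to the median singular value, where $\beta$ is the aspect ratio of the matrix. Whether `optht` uses exactly this approximation internally is an assumption; the helper names `svht_rank` and `truncated_svd` are hypothetical, and the data is synthetic.

```python
import numpy as np

def svht_rank(s, shape):
    """Number of singular values above the hard threshold (noise level unknown)."""
    beta = min(shape) / max(shape)                     # aspect ratio in (0, 1]
    omega = 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43
    tau = omega * np.median(s)                         # threshold scaled by the median
    return int(np.sum(s > tau))

def truncated_svd(A, k):
    """Leading k singular triplets of A."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vh[:k, :]

# Synthetic example: rank-3 signal buried in small Gaussian noise.
rng = np.random.default_rng(0)
t, n, true_rank = 500, 30, 3
signal = rng.normal(size=(t, true_rank)) @ rng.normal(size=(true_rank, n))
noisy = signal + 0.1 * rng.normal(size=(t, n))

s_all = np.linalg.svd(noisy, compute_uv=False)
k_demo = svht_rank(s_all, noisy.shape)
U_k, s_k, Vh_k = truncated_svd(noisy, k_demo)
approx = (U_k * s_k) @ Vh_k                            # rank-k approximation

# Share of total singular-value "energy" kept by the first k_demo components.
energy = s_all[:k_demo].sum() / s_all.sum()
rel_err_svht = np.linalg.norm(approx - signal) / np.linalg.norm(signal)
print(k_demo, round(energy * 100, 2), rel_err_svht)
```

Because the threshold is expressed relative to the median singular value, it adapts to the noise scale without requiring an explicit estimate of it.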

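Check the algebra so that everything adds up and the original matrix $X$ can be recovered from this point.

Because $X$ is standardized, the sample covariance matrix $\Sigma$ is recovered by rescaling with $D$, the diagonal matrix of asset standard deviations (here $n$ denotes the number of observations, i.e. the number of rows of $X$):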

\begin{aligned}
\Sigma &= \frac{1}{(n - 1)}DX^{T}XD \\
\Sigma &= \frac{1}{(n - 1)}D(R + C)^{T}(R + C)D
\end{aligned}
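The identities above can be checked numerically. The sketch below uses synthetic returns and a stand-in split (a rank-2 truncation for $R$ and the remainder for $C$) instead of an actual RPCA fit, which is an assumption made only to keep the example self-contained. It confirms that $X$ is recovered from $USV^{T} + C$, that the original sample is recovered after undoing the standardization, and that $DX^{T}XD / (n - 1)$ matches the usual sample covariance.

```python
import numpy as np

rng = np.random.default_rng(3)
sample_demo = rng.normal(scale=0.02, size=(200, 5))    # synthetic excess returns

# Standardize each column (ddof=1 to match pandas' default std).
mu = sample_demo.mean(axis=0)
sd = sample_demo.std(axis=0, ddof=1)
X_demo = (sample_demo - mu) / sd

# Stand-in decomposition: rank-2 part as R, remainder as C (not an RPCA fit).
U, s, Vh = np.linalg.svd(X_demo, full_matrices=False)
R_demo = (U[:, :2] * s[:2]) @ Vh[:2, :]
C_demo = X_demo - R_demo

# Rebuild X from the SVD of R plus C, then undo the standardization.
U_r, s_r, Vh_r = np.linalg.svd(R_demo, full_matrices=False)
X_rebuilt = (U_r * s_r) @ Vh_r + C_demo
sample_rebuilt = X_rebuilt * sd + mu

# Covariance from the standardized panel, rescaled by the volatilities.
n_obs = X_demo.shape[0]
D_demo = np.diag(sd)
Sigma_demo = D_demo @ X_demo.T @ X_demo @ D_demo / (n_obs - 1)

print(np.allclose(X_rebuilt, X_demo),
      np.allclose(sample_rebuilt, sample_demo),
      np.allclose(Sigma_demo, np.cov(sample_demo, rowvar=False)))
```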

The portfolio variance is then given by:
\begin{aligned}
w^{T}\Sigma w &= \frac{1}{(n - 1)}w^{T}D(R + C)^{T}(R + C)D w \\
w^{T}\Sigma w &= \frac{1}{(n - 1)}w^{T}D(R^{T}R + R^{T}C + C^{T}R + C^{T}C ) D w
\end{aligned}



\begin{aligned}
w^{T}\Sigma w &= \frac{1}{(n - 1)} \lbrack w^{T}D(R^{T}R)Dw + w^{T} D(R^{T}C + C^{T}R + C^{T}C ) D w \rbrack
\end{aligned}


Taking the singular value decomposition of $R$,

\begin{aligned}
R &= USV^{T}
\end{aligned}

we can express $R^{T}R$ in terms of the singular values and right singular vectors of $R$:

\begin{aligned}
w^{T}\Sigma w &= (n - 1)^{-1} \lbrack w^{T}D(VSU^{T}USV^{T})Dw + w^{T} D(R^{T}C + C^{T}R + C^{T}C) D w \rbrack \\
w^{T}\Sigma w &= (n - 1)^{-1} \lbrack w^{T}D(V S^{2} V^{T})Dw + w^{T} D(R^{T}C + C^{T}R + C^{T}C) D w \rbrack
\end{aligned}

where the diagonal entries of $S^{2}$ are the squared singular values of $R$, which are the eigenvalues of $R^{T}R$:

\begin{aligned}
w^{T}\Sigma w &= (n - 1)^{-1} \lbrack \underbrace{w^{T}DV S^{2} V^{T}Dw}_\text{Robust Component} 
+ \underbrace{w^{T} D(R^{T}C + C^{T}R + C^{T}C) D w}_\text{Noisy Component} \rbrack
\end{aligned}
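The split into a robust and a noisy component can be verified numerically. The sketch below uses synthetic stand-ins for $R$, $C$, and $D$ (all assumptions for illustration) and checks two things: that the squared singular values of $R$ equal the eigenvalues of $R^{T}R$, and that the two components add up to the total portfolio variance. The variable `t_obs` plays the role of the number of observations in the formulas above.

```python
import numpy as np

rng = np.random.default_rng(1)
t_obs, n_assets = 250, 6

# Synthetic stand-ins: low-rank R, sparse C, and diagonal volatility matrix D.
R_demo = rng.normal(size=(t_obs, 2)) @ rng.normal(size=(2, n_assets))
C_demo = np.where(rng.random((t_obs, n_assets)) < 0.05,
                  rng.normal(scale=5.0, size=(t_obs, n_assets)), 0.0)
X_demo = R_demo + C_demo
D_demo = np.diag(rng.uniform(0.01, 0.03, size=n_assets))
w_demo = np.full((n_assets, 1), 1.0 / n_assets)        # equal weights

U, s, Vh = np.linalg.svd(R_demo, full_matrices=False)
V = Vh.T

# Squared singular values of R versus eigenvalues of R^T R (descending order).
eig_RtR = np.sort(np.linalg.eigvalsh(R_demo.T @ R_demo))[::-1]

# Total variance and its two components.
Sigma_demo = D_demo @ X_demo.T @ X_demo @ D_demo / (t_obs - 1)
total = (w_demo.T @ Sigma_demo @ w_demo).item()
robust = (w_demo.T @ D_demo @ V @ np.diag(s**2) @ V.T @ D_demo @ w_demo).item() / (t_obs - 1)
cross = R_demo.T @ C_demo + C_demo.T @ R_demo + C_demo.T @ C_demo
noisy = (w_demo.T @ D_demo @ cross @ D_demo @ w_demo).item() / (t_obs - 1)

print(total, robust + noisy)
```

Note that the noisy component contains the cross terms $R^{T}C + C^{T}R$, so it is not guaranteed to be positive on its own.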

The portfolio risk contribution is then given by 

\begin{aligned}
\operatorname{diag}(w)\Sigma w &= (n - 1)^{-1} \lbrack \underbrace{\theta}_\text{Robust Component} 
+ \underbrace{\gamma}_\text{Noisy Component} \rbrack \\
\theta &= \operatorname{diag}(V^{T}Dw)S^{2} V^{T}Dw
\end{aligned}

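Normalizing the robust contributions so that they sum to one, $p_{n} = \theta_{n} / \sum_{m=1}^{N} \theta_{m}$, gives the share of robust risk carried by each spectral factor. The effective number of bets is the exponential of the entropy of this distribution:

\begin{align}
\eta (w) & \equiv \exp \left( -\sum^{N}_{n=1} p_{n} \ln{(p_{n})} \right)
\end{align}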

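Now we look for the weights that maximize the effective number of bets, subject to the positivity and 1x leverage constraints listed at the top: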
\begin{align}
\arg \max_{w} \eta(w)
\end{align}
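A minimal sketch of this maximization follows, assuming long-only weights that sum to one and no regularization term. It uses SciPy's SLSQP solver on synthetic inputs; the helper names `spectral_risk_shares` and `effective_bets` are hypothetical. The objective is not concave in $w$, so the solver returns at best a local maximum, and the sketch falls back to the starting weights if the solver fails or does not improve on them.

```python
import numpy as np
from scipy.optimize import minimize

def spectral_risk_shares(w, V, s, D):
    """Share of robust variance carried by each retained spectral factor."""
    f = V.T @ D @ w                  # exposure of the portfolio to each factor
    theta = s**2 * f**2              # diag(V^T D w) S^2 V^T D w, elementwise
    return theta / theta.sum()

def effective_bets(w, V, s, D, eps=1e-12):
    """Exponential of the entropy of the spectral risk shares."""
    p = spectral_risk_shares(w, V, s, D)
    return np.exp(-np.sum(p * np.log(p + eps)))

# Synthetic inputs: rank-3 "cleaned" returns on 5 assets.
rng = np.random.default_rng(2)
t_obs, n_assets, k = 300, 5, 3
R_demo = rng.normal(size=(t_obs, k)) @ rng.normal(size=(k, n_assets))
_, s_full, Vh = np.linalg.svd(R_demo, full_matrices=False)
V_k, s_k = Vh.T[:, :k], s_full[:k]   # keep the k leading factors
D_demo = np.diag(rng.uniform(0.01, 0.03, size=n_assets))

w0 = np.full(n_assets, 1.0 / n_assets)            # equal-weight starting point
eta_0 = effective_bets(w0, V_k, s_k, D_demo)

res = minimize(
    lambda w: -effective_bets(w, V_k, s_k, D_demo),
    w0,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n_assets,                                  # positivity
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],    # leverage 1x
)

# Keep the solver's answer only if it succeeded and improved on the start.
w_opt = res.x if (res.success and -res.fun >= eta_0) else w0
eta_opt = effective_bets(w_opt, V_k, s_k, D_demo)
print(eta_0, eta_opt)
```

With $k$ retained factors, $\eta(w)$ ranges from 1 (all robust risk in a single factor) to $k$ (risk spread evenly across factors), which gives a quick way to judge how close a candidate portfolio is to spectral risk parity.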


In [27]:
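D = np.diag(sample.std().values)  # asset volatilities, used to undo the standardization
t, n = X.shape
w = np.full((n, 1), 1 / n)  # equal-weighted portfolio as a test case
eigen_wts = V.T @ D @ w  # portfolio exposure to each spectral factor
# Share of the robust variance attributable to each spectral factor (sums to 1)
p = np.divide(np.diag(eigen_wts.flatten()) @ S.T @ S @ eigen_wts, w.T @ D @ R.T @ R @ D @ w)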

p



array([[8.98462382e-01],
       [9.20993398e-02],
       [5.07435042e-03],
       [2.00141701e-03],
       [7.94223671e-05],
       [1.14994212e-03],
       [1.96561287e-04],
       [2.79608670e-04],
       [4.84549760e-05],
       [1.27048706e-05],
       [9.82416733e-05],
       [3.82216180e-05],
       [7.63099946e-05],
       [5.44295378e-06],
       [1.31889757e-05],
       [7.99430721e-05],
       [5.27402390e-05],
       [5.54536562e-08],
       [2.63625473e-05],
       [7.26012029e-07],
       [7.91936636e-06],
       [8.60752289e-06],
       [1.19415094e-05],
       [7.27852381e-06],
       [2.37711182e-05],
       [5.95107462e-06],
       [1.37477562e-04],
       [5.76309964e-07],
       [6.27800265e-07],
       [1.53383883e-07],
       [2.79838789e-07]])

In [5]:
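D = np.diag(sample.std().values)
n = sample.shape[0]  # number of observations (rows); note this rebinds n
# Covariance rebuilt from the standardized panel, and from its R + C split
Sigma = 1 / (n - 1) * D @ X.T @ X @ D
Sigma_b = 1 / (n - 1) * D @ (R + C).T @ (R + C) @ D

# Cross term between the low-rank and sparse components
pd.DataFrame(R.T @ C)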

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,21,22,23,24,25,26,27,28,29,30
0,17.307323,26.319082,5.538994,100.470453,34.416213,39.850091,12.207528,4.832033,102.368867,9.855725,...,-27.505575,5.903934,-2.43298,-5.170104,14.875815,218.71377,135.966332,236.313966,134.028907,61.170399
1,5.045302,47.833714,5.03009,101.439677,25.089104,28.767951,16.179416,7.703886,81.371217,17.448708,...,-30.665084,3.389943,-12.724522,-4.620212,2.80231,287.405596,158.856692,235.178051,218.275188,103.044511
2,5.98322,30.926072,13.313046,85.232875,41.154272,48.168092,15.425056,8.29491,104.241961,17.424995,...,-16.955087,3.581394,-1.823237,-3.358751,16.529373,259.477703,127.773586,220.947121,204.256029,77.275388
3,12.10554,19.543928,6.363632,141.388209,24.648328,28.797726,12.511061,4.49837,64.142184,9.89312,...,-29.316162,0.09923,2.012347,-1.670605,33.106213,238.63072,114.06467,206.17471,153.058576,78.334467
4,6.128181,24.860002,5.757952,64.88267,82.594553,96.66904,12.741549,6.610807,149.959284,12.491189,...,-12.166952,6.209005,-3.940672,-5.499219,36.811463,228.421453,148.177575,189.924179,156.855853,71.64107
5,5.280841,21.194044,4.37446,65.300771,81.465504,99.829203,11.926269,6.238532,134.882063,12.236385,...,-12.122919,5.618669,-5.220215,-5.713942,36.268842,208.862126,148.365281,180.665437,146.592308,79.11112
6,5.561278,27.809052,3.096072,96.067085,23.989319,30.355991,33.478224,7.330237,66.412159,17.447132,...,-23.585362,-0.904829,-26.841566,-10.753681,5.153636,248.648824,150.682388,209.957956,205.822081,76.360234
7,9.392554,29.539081,5.992909,80.66645,44.680459,50.697829,14.789691,11.834817,112.674063,15.50122,...,-19.533242,0.839496,-9.766067,-6.111622,10.046013,255.575518,123.777635,208.223557,176.972451,68.69819
8,6.689694,29.690277,7.171997,44.707403,66.106918,74.27605,12.161525,4.84448,179.451394,12.011046,...,-7.947173,5.574434,-0.264237,-2.283612,25.552207,219.431826,98.832239,140.913696,147.69811,20.71539
9,6.326585,30.642136,7.496673,68.755077,33.26246,39.777368,14.748677,6.532153,100.603711,21.116483,...,-21.710288,2.893031,-18.54349,-8.646817,-9.055484,207.397453,125.791623,240.669166,164.667797,13.149811


In [7]:
pd.DataFrame(R.T @ C) + pd.DataFrame(C.T @ R)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,21,22,23,24,25,26,27,28,29,30
0,34.614646,31.364384,11.522214,112.575993,40.544394,45.130933,17.768806,14.224586,109.058561,16.182311,...,-31.083967,0.796075,-7.909535,-11.180898,16.131465,221.111767,145.192596,241.519959,131.926893,66.826539
1,31.364384,95.667428,35.956161,120.983606,49.949107,49.961994,43.988468,37.242967,111.061494,48.090844,...,-41.982598,-4.704379,-19.010391,-13.759098,-10.12756,290.025809,175.605255,241.144507,217.805275,122.238565
2,11.522214,35.956161,26.626093,91.596507,46.912223,52.542552,18.521128,14.287819,111.413958,24.921668,...,-9.689681,10.161858,6.680477,4.753439,24.152961,258.630848,127.721515,224.599844,206.087944,80.238194
3,112.575993,120.983606,91.596507,282.776418,89.530997,94.098496,108.578147,85.16482,108.849588,78.648198,...,-52.174675,-3.545748,-3.660093,-11.440042,71.237598,286.263998,218.447175,273.69141,177.258069,163.105514
4,40.544394,49.949107,46.912223,89.530997,165.189105,178.134545,36.730868,51.291266,216.066202,45.75365,...,-18.071166,-1.289984,-11.483284,-16.39169,34.637641,225.555995,177.544692,197.457922,152.664871,88.958362
5,45.130933,49.961994,52.542552,94.098496,178.134545,199.658407,42.28226,56.936362,209.158113,52.013754,...,-24.982411,-7.138041,-19.812264,-23.317474,32.940736,208.801402,186.146195,190.17168,140.217287,100.043437
6,17.768806,43.988468,18.521128,108.578147,36.730868,42.28226,66.956447,22.119928,78.573684,32.195809,...,-30.389332,-4.265703,-29.109403,-13.555827,6.502376,254.149905,168.15526,215.721057,199.061022,83.827268
7,14.224586,37.242967,14.287819,85.16482,51.291266,56.936362,22.119928,23.669635,117.518543,22.033373,...,-20.724595,-1.706072,-13.356857,-7.798075,8.988447,259.117707,130.539518,209.549832,178.960228,70.880417
8,109.058561,111.061494,111.413958,108.849588,216.066202,209.158113,78.573684,117.518543,358.902788,112.614758,...,-56.576465,-40.565774,-44.228467,-45.168275,-27.865153,203.007302,146.119619,132.670717,123.846027,23.587874
9,16.182311,48.090844,24.921668,78.648198,45.75365,52.013754,32.195809,22.033373,112.614758,42.232966,...,-26.159831,-2.218892,-25.310331,-15.359742,-5.554391,215.352656,138.53114,248.781721,167.813359,24.634331


In [8]:
pd.DataFrame(R.T @ R) + pd.DataFrame(C.T @ C)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,21,22,23,24,25,26,27,28,29,30
0,3550.385763,2676.465371,2772.555085,2785.366783,2680.327474,2579.776289,2451.882825,3140.337406,2340.912675,2669.540826,...,-1307.457046,-1385.253074,-1420.711614,-1446.766569,-445.418426,53.400236,2079.648076,240.706236,-407.206778,1080.377856
1,2676.465371,3489.335393,3009.07146,2240.50182,2649.123643,2498.495744,2813.427954,3012.814917,2420.754485,2746.162624,...,-1184.63851,-1224.108329,-1234.151449,-1266.202299,-502.181927,302.451228,2005.622936,357.067846,-292.867585,1214.462635
2,2772.555085,3009.07146,3558.374247,2285.129353,2904.100859,2790.589747,2838.773779,3213.770499,2580.162818,3057.579645,...,-1352.020204,-1393.473153,-1371.94032,-1384.285517,-609.825166,183.472362,2048.96394,259.838837,-447.478629,1154.736832
3,2785.366783,2240.50182,2285.129353,3302.228077,2239.155523,2182.563188,2075.431664,2608.244214,1858.780474,2146.144428,...,-1182.036871,-1251.602579,-1306.970965,-1332.488406,-323.421887,80.584664,1873.565721,221.008427,-343.6672,1010.035485
4,2680.327474,2649.123643,2904.100859,2239.155523,3419.814504,3355.713892,2458.975391,3034.231388,2847.9803,2557.600815,...,-1419.706915,-1445.514679,-1432.545868,-1432.832812,-685.80393,58.83442,1961.495859,156.937799,-566.914037,1003.301537
5,2579.776289,2498.495744,2790.589747,2182.563188,3355.713892,3385.344935,2343.953973,2934.526942,2605.742651,2481.34418,...,-1489.816501,-1523.210994,-1503.969963,-1508.281299,-759.89876,-8.603679,1891.873345,69.716417,-639.900499,942.366273
6,2451.882825,2813.427954,2838.773779,2075.431664,2458.975391,2343.953973,3518.045273,2809.175479,2140.488427,2694.296844,...,-1142.499408,-1204.386829,-1221.933275,-1261.664867,-523.316883,186.35692,1866.195543,287.399192,-367.621537,1063.005616
7,3140.337406,3012.814917,3213.770499,2608.244214,3034.231388,2934.526942,2809.175479,3561.330674,2625.356387,2999.41928,...,-1466.158269,-1541.057815,-1548.34004,-1558.385609,-642.268385,102.532934,2179.517834,234.874443,-512.177671,1168.552207
8,2340.912675,2420.754485,2580.162818,1858.780474,2847.9803,2605.742651,2140.488427,2625.356387,3226.103803,2227.892503,...,-862.435677,-850.02023,-861.123914,-845.321764,-323.987912,259.82341,1760.918196,404.585317,-157.843924,965.509231
9,2669.540826,2746.162624,3057.579645,2146.144428,2557.600815,2481.34418,2694.296844,2999.41928,2227.892503,3542.767808,...,-1275.283506,-1301.943057,-1275.399287,-1288.80636,-581.069374,139.343342,1953.906534,280.946091,-395.344779,1085.385188


In [64]:
sample.cov() * 1E6

Unnamed: 0,IYM,IYK,IYC,IYE,IYF,IYG,IYH,IYJ,IYR,IYW,...,IEI,IEF,TLH,TLT,TIP,MUB,HYG,LQD,MBB,EMB
IYM,306.111146,141.456444,173.098069,282.882042,249.722631,272.022402,137.251397,219.812636,234.057239,193.304374,...,-15.626681,-28.515276,-43.560561,-67.878333,-8.357242,4.749165,80.261585,13.275026,-2.930126,38.721795
IYK,141.456444,114.578274,115.82802,141.030268,151.556966,161.577038,97.155199,130.026869,147.980592,123.043207,...,-8.761066,-15.484429,-23.37733,-36.458388,-6.101814,6.271045,48.141708,10.075166,-0.488832,27.603235
IYC,173.098069,115.82802,162.300976,168.93348,197.215883,214.540227,115.626443,163.785685,187.235776,161.549152,...,-11.575528,-20.746223,-30.311871,-46.767279,-8.30216,5.569301,57.177566,9.71056,-1.870922,30.352421
IYE,282.882042,141.030268,168.93348,400.065502,244.335533,269.72114,138.759059,214.55651,214.897024,183.060944,...,-16.472163,-29.554204,-45.685998,-71.530529,-5.612568,7.255513,86.277819,15.568653,-2.024978,45.267889
IYF,249.722631,151.556966,197.215883,244.335533,353.670699,393.639181,149.085181,231.101163,314.641507,201.406552,...,-18.042089,-32.030808,-47.327357,-72.524521,-13.625962,5.288452,82.944544,10.486494,-4.739502,39.627812
IYG,272.022402,161.577038,214.540227,269.72114,393.639181,450.900675,160.951614,252.986857,326.380954,221.298134,...,-21.463024,-38.255105,-56.38974,-86.543624,-17.176137,4.203522,90.982804,8.682911,-6.455148,42.702495
IYH,137.251397,97.155199,115.626443,138.759059,149.085181,160.951614,129.676585,128.407361,137.981789,127.724996,...,-8.912167,-16.202833,-24.827889,-38.642559,-6.548497,4.960199,47.76677,9.014621,-1.167791,25.194488
IYJ,219.812636,130.026869,163.785685,214.55651,231.101163,252.986857,128.407361,203.857365,213.840856,177.467919,...,-14.165615,-25.931111,-38.859452,-59.505286,-10.060838,5.10586,68.007299,9.983996,-2.894429,34.139813
IYR,234.057239,147.980592,187.235776,214.897024,314.641507,326.380954,137.981789,213.840856,383.196938,188.47838,...,-12.004041,-20.523197,-30.886236,-46.386245,-7.663843,8.958842,76.973211,16.54784,-0.404905,37.352998
IYW,193.304374,123.043207,161.549152,183.060944,201.406552,221.298134,127.724996,177.467919,188.47838,217.500456,...,-12.807101,-22.642277,-33.430828,-51.181454,-9.626424,5.172526,63.628548,12.292191,-2.041477,31.581739
