Add new shrinkage estimators #20
That would be amazing! Actually, I've always wanted to write custom implementations for PyPortfolioOpt rather than just wrapping sklearn (because it's an unnecessary dependency IMO), but I had a lot of trouble implementing either of the estimators you mention. What I envisioned was something like the following. To help you along, here was my attempt – I think it semi-worked, but in the end I couldn't get acceptable covariance matrices and had other things to do, so I just wrapped sklearn instead. Keep me updated!

```python
import numpy as np


class LedoitWolfShrinkage:
    """
    Originally described in Ledoit and Wolf (2001).

    The core idea behind Ledoit-Wolf shrinkage is to find a compromise between the
    sample covariance matrix S (although it is the MLE, it has high estimation error)
    and a shrinkage target F. These are then combined as :math:`\delta F + (1-\delta)S`,
    where :math:`\delta` is the shrinkage constant, which has to be estimated. The
    optimal shrinkage constant depends on F and S, which is why this module has been
    designed such that different shrinkage targets are subclasses of LedoitWolfShrinkage.

    https://epublications.bond.edu.au/cgi/viewcontent.cgi?article=1099&context=ejsie

    The default is constant-correlation shrinkage, as per Ledoit and Wolf (2004).
    """

    # TODO: implement a default identity shrinkage

    def __init__(self, prices):
        self.X = prices.pct_change().dropna(how="all")
        self.S = self.X.cov()  # sample covariance is the MLE
        self.F = None  # shrinkage target
        self.delta = None  # shrinkage constant

    def _shrinkage_target(self):
        """
        Calculates the constant-correlation shrinkage target, as per Ledoit and
        Wolf (2004). All pairwise correlations are assumed to be equal to the mean
        of all correlations, and the diagonal is the sample variance of each
        asset's returns.

        Note that these functions particularly depend on X being a pandas
        DataFrame, because it provides methods that are robust to missing data.

        :return: constant-correlation shrinkage target and the mean correlation
        :rtype: (np.ndarray, float)
        """
        T, N = self.X.shape
        correlation_matrix = self.X.corr()
        # r_bar is the average pairwise correlation (excluding the diagonal)
        r_bar = (np.sum(correlation_matrix.values) - N) / (N * (N - 1))
        # Off-diagonal entries: f_ij = r_bar * s_i * s_j, where s_i is the
        # standard deviation of asset i
        std = self.X.std().values
        F = r_bar * np.outer(std, std)
        # Diagonal entries: f_ii = s_ii, the sample variance
        np.fill_diagonal(F, std ** 2)
        self.F = F
        return self.F, r_bar

    def _pi_matrix(self):
        """
        Estimates the asymptotic variances of the scaled entries of the sample
        covariance matrix. This is used to calculate pi and rho.

        :return: scaled asymptotic variances, in an NxN array where N is the
                 number of assets.
        :rtype: np.ndarray
        """
        T, N = self.X.shape
        Xc = np.array(self.X - self.X.mean())  # centre X
        Xc = np.nan_to_num(Xc)
        S = np.array(self.S)
        pi_matrix = np.zeros((N, N))
        for i in range(N):
            for j in range(i, N):
                pi_matrix[i, j] = pi_matrix[j, i] = np.sum(
                    (Xc[:, i] * Xc[:, j] - S[i, j]) ** 2
                )
        return pi_matrix / T

    def _pi(self):
        """
        Estimates the sum of asymptotic variances of the scaled entries of the
        sample covariance matrix. One of the three components of the optimal
        shrinkage constant.

        :return: pi(hat)
        :rtype: float
        """
        pi_matrix = self._pi_matrix()
        return np.sum(pi_matrix)

    def _rho(self):
        """
        Estimates rho, the sum of the asymptotic covariances between the entries
        of the sample covariance matrix and those of the shrinkage target.

        :return: rho(hat)
        :rtype: float
        """
        Xc = self.X - self.X.mean()
        T, N = self.X.shape
        pi_diag = np.trace(self._pi_matrix())
        _, r_bar = self._shrinkage_target()
        asycov = 0
        for i in range(N):
            for j in range(N):
                if j == i:
                    continue
                theta_ii = np.sum(
                    (Xc.iloc[:, i] ** 2 - self.S.iloc[i, i])
                    * (Xc.iloc[:, i] * Xc.iloc[:, j] - self.S.iloc[i, j])
                )
                theta_jj = np.sum(
                    (Xc.iloc[:, j] ** 2 - self.S.iloc[j, j])
                    * (Xc.iloc[:, i] * Xc.iloc[:, j] - self.S.iloc[i, j])
                )
                first_term = np.sqrt(self.S.iloc[j, j] / self.S.iloc[i, i]) * theta_ii
                second_term = np.sqrt(self.S.iloc[i, i] / self.S.iloc[j, j]) * theta_jj
                asycov += first_term + second_term
        return pi_diag + r_bar / 2 * asycov / T

    def _gamma(self):
        """
        Gamma estimates the misspecification of the shrinkage target (i.e. the
        sum of element-wise squared distances between the shrinkage target matrix
        and the sample covariance matrix).

        :return: gamma
        :rtype: float
        """
        if self.F is None:
            self._shrinkage_target()
        S = np.array(self.S)
        return np.sum((self.F - S) ** 2)

    def _optimal_shrinkage_constant(self):
        """
        Calculates the optimal value of the shrinkage constant, as per Ledoit and
        Wolf. This is truncated to [0, 1], in the rare case that kappa/T falls
        outside of this range.

        :return: optimal shrinkage constant
        :rtype: float between 0 and 1
        """
        T = self.X.shape[0]
        pi = self._pi()
        rho = self._rho()
        gamma = self._gamma()
        kappa = (pi - rho) / gamma
        self.delta = max(0, min(kappa / T, 1))
        return self.delta

    def shrink(self):
        """
        Shrink the sample covariance matrix towards the target.

        :return: shrunk covariance matrix, :math:`\delta F + (1-\delta)S`
        """
        if self.delta is None:
            self._optimal_shrinkage_constant()
        S_hat = self.delta * self.F + (1 - self.delta) * self.S
        return S_hat
```
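As a sanity check, the constant-correlation target can also be computed standalone on a plain array of returns. This is a minimal numpy sketch of the same target construction (the function name is my own, and the fixed `delta` is purely illustrative – in the full method it is estimated from the data):

```python
import numpy as np


def constant_correlation_target(returns):
    """Constant-correlation shrinkage target (Ledoit & Wolf 2004).

    returns: T x N array of asset returns.
    """
    T, N = returns.shape
    std = returns.std(axis=0, ddof=1)
    corr = np.corrcoef(returns, rowvar=False)
    # Average off-diagonal correlation
    r_bar = (corr.sum() - N) / (N * (N - 1))
    # f_ij = r_bar * s_i * s_j off the diagonal; f_ii = s_i^2
    F = r_bar * np.outer(std, std)
    np.fill_diagonal(F, std ** 2)
    return F


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
F = constant_correlation_target(X)
S = np.cov(X, rowvar=False)

delta = 0.3  # illustrative fixed shrinkage constant
S_shrunk = delta * F + (1 - delta) * S

assert np.allclose(F, F.T)                  # target is symmetric
assert np.allclose(np.diag(F), np.diag(S))  # variances preserved on the diagonal
```

Because the target keeps the sample variances on the diagonal, shrinkage only moves the off-diagonal entries towards the constant-correlation structure.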
IMO sklearn's shrinkage estimators are very well optimized, so we don't need to re-implement them. I'll look at your attempt and go ahead.
The problem is that they only implement identity shrinkage, whereas ideally I want to offer constant-correlation and single-index shrinkage as well.
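For contrast with the targets discussed here, identity-style shrinkage (the kind sklearn's `LedoitWolf` provides) shrinks S towards the scaled identity `mu * I`, where `mu` is the mean of the diagonal of S. A minimal numpy sketch, with an illustrative fixed `delta` (in practice it is estimated from the data):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
S = np.cov(X, rowvar=False)

# Scaled-identity target: mu * I, with mu = trace(S) / N
N = S.shape[0]
mu = np.trace(S) / N
F_identity = mu * np.eye(N)

delta = 0.2  # illustrative; the Ledoit-Wolf estimator chooses this from the data
S_shrunk = delta * F_identity + (1 - delta) * S

# Shrinking towards mu * I pulls the eigenvalues towards their mean,
# improving the conditioning of the estimate while preserving the trace.
assert np.linalg.cond(S_shrunk) < np.linalg.cond(S)
assert np.isclose(np.trace(S_shrunk), np.trace(S))
```

This target ignores all cross-asset structure, which is why the constant-correlation and single-index targets are attractive alternatives for portfolios of correlated assets.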
Just keep them in the wrapper; it's okay. I found that this person (https://github.com/jasonstrimpel/covshrink/blob/master/core/portfolio.py#L417) converts the authors' code from Matlab to Python. I will verify it and follow your codebase to add the new shrinkage models.
@88d52bdba0366127fffca9dfa93895 I've just merged the PR – the code looks excellent! Also, could you share links to all the sources you used? This will be very important for me when I get round to writing the documentation. For instance, where is this Matlab code?
The authors' code is here: https://www.econ.uzh.ch/en/people/faculty/wolf/publications.html

For the constant-correlation model: https://www.econ.uzh.ch/dam/jcr:ffffffff-935a-b0d6-ffff-ffffde5e2d4e/covCor.m.zip
Brilliant. I'll revisit this in summer and add docs and examples. Thanks for all your contributions!
Hi @robertmartin8,

I'm trying to implement two Ledoit-Wolf estimators in Python: Sharpe's single-factor (or single-index) model and the constant-correlation model. I think they would be suitable contributions to your repo – is that okay, or do you have any suggestions for me?
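The single-index target mentioned here (Ledoit & Wolf 2003) is built from the market model: off-diagonal entries are `var(m) * b_i * b_j`, where `b_i` is the slope of asset i's regression on the market return, and the diagonal keeps the sample variances. A hedged numpy sketch, proxying the market factor by the equal-weighted cross-sectional mean of the asset returns (an assumption for this sketch; the function name is my own):

```python
import numpy as np


def single_index_target(returns):
    """Single-index (Sharpe market-model) shrinkage target, Ledoit & Wolf (2003).

    returns: T x N array of asset returns; the market factor is proxied by
    the equal-weighted cross-sectional mean (an assumption for this sketch).
    """
    T, N = returns.shape
    market = returns.mean(axis=1)
    var_m = market.var(ddof=1)
    # OLS slope of each asset on the market: b_i = cov(x_i, m) / var(m)
    Xc = returns - returns.mean(axis=0)
    mc = market - market.mean()
    betas = Xc.T @ mc / ((T - 1) * var_m)
    # Off-diagonal: f_ij = var(m) * b_i * b_j; diagonal: sample variances
    F = var_m * np.outer(betas, betas)
    np.fill_diagonal(F, returns.var(axis=0, ddof=1))
    return F


rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
F = single_index_target(X)
assert np.allclose(F, F.T)  # target is symmetric
```

The optimal shrinkage intensity for this target is derived in the 2003 paper and differs from the constant-correlation case, so the two would naturally be separate subclasses in the design sketched above.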