
Add new shrinkage estimators #20

Closed
88d52bdba0366127fffca9dfa93895 opened this issue Mar 4, 2019 · 7 comments
Labels
enhancement New feature or request

Comments

@88d52bdba0366127fffca9dfa93895
Collaborator

Hi @robertmartin8,

I'm trying to implement two of the Ledoit-Wolf estimators in Python: Sharpe's single-factor (or single-index) model and the constant-correlation model. I think they would be a good fit for your repo — is that okay, or do you have any suggestions for me?

@robertmartin8
Owner

That would be amazing! Actually, I've always wanted to write custom implementations for PyPortfolioOpt rather than just wrapping sklearn (it's an unnecessary dependency IMO), but I had a lot of trouble implementing either of the estimators you mention.

What I envisioned was a LedoitWolfShrinkage parent class which can be subclassed into SingleFactorShrinkage or ConstantCorrelationShrinkage. Please do have a go at it — I would happily accept a PR and mark you as a contributor on the documentation site as well.

To help you along, here was my attempt. I think it semi-worked, but in the end I couldn't get acceptable covariance matrices and had other things to do, so I just wrapped sklearn instead. Keep me updated!

import numpy as np


class LedoitWolfShrinkage:
    """
    Originally described in Ledoit and Wolf (2001).
    The core idea behind Ledoit Wolf Shrinkage is to find a compromise between the sample covariance
    matrix S (although it is the MLE, it has high estimation error) and a shrinkage target F.
    These are then combined with :math:`\delta F + (1-\delta)S`, where :math:`\delta` is
    the shrinkage constant which has to be estimated. The optimal shrinkage constant depends on
    F and S, which is why this module has been designed such that different shrinkage targets are
    subclasses of LedoitWolfShrinkage.

    https://epublications.bond.edu.au/cgi/viewcontent.cgi?article=1099&context=ejsie

    The default is constant correlation shrinkage, as per Ledoit and Wolf 2004.
    """

    # TODO implement a default identity shrinkage

    def __init__(self, prices):
        self.X = prices.pct_change().dropna(how="all")
        self.S = self.X.cov()  # sample cov is the MLE
        self.F = None  # default shrinkage target
        self.delta = None  # shrinkage constant

    def _shrinkage_target(self):
        """
        Calculates the constant correlation shrinkage target, as per Ledoit and Wolf 2004.
        All pairwise correlations are assumed to be equal to the mean of all correlations,
        and the diagonal is just the std of each asset's returns.
        Note that these functions particularly depend on X being a pandas DataFrame,
        because it provides methods that are robust to missing data.
        :return: constant correlation shrinkage target
        :rtype: np.array
        """
        T, N = self.X.shape
        correlation_matrix = self.X.corr()
        # r_bar is the average pairwise correlation (excluding diagonal)
        r_bar = (np.sum(correlation_matrix.values) - N) / (N * (N - 1))
        # Use numpy's broadcasting to calculate s_i * s_j for each pair of assets,
        # so that f_ij = r_bar * s_i * s_j = r_bar * sqrt(s_ii * s_jj)
        std = self.X.std().values
        F = r_bar * np.outer(std, std)
        # f_ii = s_ii (the sample variances, not the standard deviations)
        np.fill_diagonal(F, std ** 2)
        self.F = F
        return self.F, r_bar

    def _pi_matrix(self):
        """
        Estimates the asymptotic variances of the scaled entries of the sample covariance matrix.
        This is used to calculate pi and rho.
        :return: scaled asymptotic variances, in an NxN array where N is the number of assets.
        :rtype: np.array
        """

        T, N = self.X.shape
        Xc = np.array(self.X - self.X.mean())  # centre X
        Xc = np.nan_to_num(Xc)
        S = np.array(self.S)
        pi_matrix = np.zeros((N, N))
        for i in range(N):
            for j in range(i, N):
                pi_matrix[i, j] = pi_matrix[j, i] = np.sum(
                    (Xc[:, i] * Xc[:, j] - S[i, j]) ** 2
                )
        return pi_matrix / T

    def _pi(self):
        """
        Estimates the sum of asymptotic variances of the scaled entries of the
        sample covariance matrix. One of the three components of the optimal shrinkage
        constant
        :return: pi(hat)
        :rtype: float
        """
        pi_matrix = self._pi_matrix()
        return np.sum(pi_matrix)

    def _rho(self):
        """
        Estimates rho, the sum of the asymptotic covariances between the entries of
        the shrinkage target and the entries of the sample covariance matrix
        (constant-correlation case, as per Ledoit and Wolf 2004). One of the three
        components of the optimal shrinkage constant.
        :return: rho(hat)
        :rtype: float
        """
        Xc = self.X - self.X.mean()
        T, N = self.X.shape

        pi_diag = np.trace(self._pi_matrix())
        _, r_bar = self._shrinkage_target()

        asycov = 0

        for i in range(N):
            for j in range(N):
                if j == i:
                    continue

                theta_ii = np.sum(
                    (Xc.iloc[:, i] ** 2 - self.S.iloc[i, i])
                    * (Xc.iloc[:, i] * Xc.iloc[:, j] - self.S.iloc[i, j])
                )
                theta_jj = np.sum(
                    (Xc.iloc[:, j] ** 2 - self.S.iloc[j, j])
                    * (Xc.iloc[:, i] * Xc.iloc[:, j] - self.S.iloc[i, j])
                )

                first_term = np.sqrt(
                    self.S.iloc[j, j] / self.S.iloc[i, i]) * theta_ii
                second_term = np.sqrt(
                    self.S.iloc[i, i] / self.S.iloc[j, j]) * theta_jj
                asycov += first_term + second_term

        return pi_diag + r_bar / 2 * asycov / T

    def _gamma(self):
        """
        Gamma measures the misspecification of the shrinkage target: the sum of
        element-wise squared distances between the shrinkage target matrix and the
        sample covariance matrix (i.e. the squared Frobenius norm of F - S).
        :return: gamma
        :rtype: float
        """
        if self.F is None:
            self._shrinkage_target()
        S = np.array(self.S)
        return np.sum((self.F - S) ** 2)

    def _optimal_shrinkage_constant(self):
        """
        Calculates the optimal value of the shrinkage constant, as per Ledoit and Wolf. This
        is truncated to [0, 1], in the rare case that k/T is outside of this range.
        :return: optimal shrinkage constant
        :rtype: float between 0 and 1.
        """
        T = self.X.shape[0]
        pi = self._pi()
        rho = self._rho()
        gamma = self._gamma()
        kappa = (pi - rho) / gamma
        self.delta = max(0, min(kappa / T, 1))
        return self.delta

    def shrink(self):
        """
        Computes the shrunk covariance matrix, delta * F + (1 - delta) * S.
        :return: shrunk covariance matrix
        :rtype: pd.DataFrame
        """
        if self.delta is None:
            self._optimal_shrinkage_constant()
        # Return shrunk matrix
        S_hat = self.delta * self.F + (1 - self.delta) * self.S
        return S_hat
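
For clarity, the constant-correlation target that `_shrinkage_target` is aiming for can be sketched as a stand-alone function — a minimal numpy/pandas sketch of the Ledoit-Wolf 2004 formula, not the PyPortfolioOpt API:

```python
import numpy as np
import pandas as pd


def constant_correlation_target(returns: pd.DataFrame) -> np.ndarray:
    """Constant-correlation shrinkage target F (Ledoit & Wolf 2004):
    f_ii = s_ii (the sample variances), f_ij = r_bar * sqrt(s_ii * s_jj),
    where r_bar is the average pairwise correlation."""
    N = returns.shape[1]
    corr = returns.corr().values
    # average pairwise correlation, excluding the diagonal of ones
    r_bar = (corr.sum() - N) / (N * (N - 1))
    std = returns.std().values
    # off-diagonals: r_bar * s_i * s_j via an outer product
    F = r_bar * np.outer(std, std)
    # diagonal: the sample variances
    np.fill_diagonal(F, std ** 2)
    return F


rng = np.random.default_rng(0)
rets = pd.DataFrame(rng.normal(0, 0.01, size=(250, 4)))
F = constant_correlation_target(rets)
# F is symmetric, with the sample variances on the diagonal
```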

@robertmartin8 robertmartin8 added the enhancement (New feature or request) label Mar 4, 2019
@88d52bdba0366127fffca9dfa93895
Collaborator Author

IMO sklearn's shrinkage estimators are very well optimized, so we don't need to re-implement them. I will look at your attempt and go from there.

@robertmartin8
Owner

The problem is that they only implement identity shrinkage, whereas ideally I want to offer constant-correlation and single-index shrinkage as well.
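
For context, sklearn's `LedoitWolf` shrinks toward a scaled identity target mu * I, where mu is the mean of the sample eigenvalues. Conceptually it reduces to the sketch below (numpy only; note this uses a fixed shrinkage constant for illustration, whereas sklearn estimates the optimal one from the data):

```python
import numpy as np


def shrink_to_identity(X: np.ndarray, delta: float) -> np.ndarray:
    """Shrink the sample covariance of X (T x N observations) toward mu * I,
    where mu = trace(S) / N is the mean of the sample eigenvalues.
    delta is a given shrinkage constant in [0, 1]; estimating the optimal
    delta (which sklearn does) is deliberately skipped here."""
    T, N = X.shape
    S = np.cov(X, rowvar=False)  # sample covariance, N x N
    mu = np.trace(S) / N
    return delta * mu * np.eye(N) + (1 - delta) * S


rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
S_shrunk = shrink_to_identity(X, delta=0.2)
```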

@88d52bdba0366127fffca9dfa93895
Collaborator Author

Just keep them in the wrapper, that's okay. I found that this repo (https://github.com/jasonstrimpel/covshrink/blob/master/core/portfolio.py#L417) ports the authors' Matlab code to Python; I will verify it and follow your codebase to add the new shrinkage models.
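
For reference, the single-index target from Ledoit and Wolf (2003) fits the market model x_it = a_i + b_i * x_mt + e_it, then sets f_ij = b_i * b_j * var(x_m) off the diagonal and f_ii = s_ii on it. A minimal sketch, with the assumption (hypothetical, for self-containment) that the market return is the equal-weighted average of the asset returns — the paper uses an actual market index:

```python
import numpy as np
import pandas as pd


def single_index_target(returns: pd.DataFrame) -> np.ndarray:
    """Single-index shrinkage target (Ledoit & Wolf 2003):
    off-diagonals from the market model, diagonal = sample variances.
    The market proxy here is the equal-weighted mean of the assets."""
    market = returns.mean(axis=1)
    var_m = market.var()
    # OLS slope of each asset on the market: cov(x_i, x_m) / var(x_m)
    betas = returns.apply(lambda col: col.cov(market)) / var_m
    # f_ij = b_i * b_j * var(x_m) for i != j
    F = var_m * np.outer(betas, betas)
    # f_ii = s_ii (the sample variances)
    np.fill_diagonal(F, returns.var().values)
    return F


rng = np.random.default_rng(1)
rets = pd.DataFrame(rng.normal(0, 0.02, size=(300, 3)))
F = single_index_target(rets)
```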

@robertmartin8
Owner

@88d52bdba0366127fffca9dfa93895

I've just merged the PR – the code looks excellent!

Also, could you share links to all the sources you used? This will be very important when I get round to writing the documentation. For instance, where is the Matlab code from?

@robertmartin8
Owner

Brilliant. I'll revisit this in summer and add docs and examples. Thanks for all your contributions!
