
How to project new data using existing VSP loadings? #68

Closed
ravwojdyla opened this issue Jul 6, 2022 · 13 comments

@ravwojdyla

Thank you so much for a great paper and this library!

I'm limited to Python and have reimplemented VSP in Python on top of sklearn's PCA by adding the varimax rotation. Following your exact implementation, I get the same results in Python. But I have a follow-up use case: what should I do when I want to project new data using existing VSP loadings? This question is related to https://stats.stackexchange.com/questions/59213/how-to-compute-varimax-rotated-principal-components-in-r but in the context of VSP.

Would appreciate your help or hints.

@ravwojdyla
Author

To follow up: I have tried a couple of approaches without success. I'm testing this by "fitting" VSP on X to obtain the loadings/rotation matrices, and then, using those (and X), trying to reproduce Z without s$u.
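Concretely (my reading of the vsp code, see the implementation I post below): the scores are Z = sqrt(n) * s$u %*% R_U and the loadings are Y = sqrt(d) * s$v %*% R_V, so what I'm after is a way to recover Z for new rows without access to s$u.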

@alexpghayes
Collaborator

Can you share a code example? My guess is that you aren't matching the pre-processing and post-processing steps, and I can provide some pointers on how to do this.

@ravwojdyla
Author

Hi @alexpghayes, thanks for the prompt response, I appreciate that. Here's our current code as a sklearn transformer:

from typing import Any, Literal, Optional, Union

import numpy as np
import numpy.typing as npt
from sklearn.decomposition import PCA


class VSP(PCA):  # type:ignore[misc]
    """
    R VSP implementation is at: https://github.com/RoheLab/vsp
    """

    def __init__(
        self,
        n_components: Union[None, int, float, str] = None,
        *,
        copy: bool = True,
        svd_solver: str = "auto",
        tol: float = 0.0,
        iterated_power: str = "auto",
        random_state: Optional[float] = None,
        rotation: Literal["varimax", "quartimax"] = "varimax"
    ):
        """
        NOTE:
          * R tol/eps is higher by default
          * VSP will always use svds
            https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L126
          * VSP is whitening by default?
            https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L141-L142
        """
        super().__init__(
            n_components=n_components,
            copy=copy,
            whiten=True,
            svd_solver=svd_solver,
            tol=tol,
            iterated_power=iterated_power,
            random_state=random_state,
        )
        self.rotation = rotation

    @classmethod
    def _ortho_matrix(
        cls,
        components: Any,
        method: str = "varimax",
        tol: float = 1e-6,
        max_iter: int = 100,
    ) -> npt.ArrayLike:
        """
        Returns rotation matrix.

        `components` is either components or scores. See:
        https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L141

        This is copied from sklearn.decomposition._factor_analysis._ortho_rotation
        but doesn't perform final dot product, which we will do inside fit/transform.
        """
        nrow, ncol = components.shape
        rotation_matrix = np.eye(ncol)
        var = 0

        for _ in range(max_iter):
            comp_rot = np.dot(components, rotation_matrix)
            if method == "varimax":
                tmp = comp_rot * np.transpose((comp_rot**2).sum(axis=0) / nrow)
            elif method == "quartimax":
                tmp = 0
            u, s, v = np.linalg.svd(np.dot(components.T, comp_rot**3 - tmp))
            rotation_matrix = np.dot(u, v)
            var_new = np.sum(s)
            if var != 0 and var_new < var * (1 + tol):
                break
            var = var_new

        return rotation_matrix

    def _fit(
        self, X: npt.ArrayLike
    ) -> tuple[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike]:
        U, S, _ = super()._fit(X)
        # NOTE: R_v is NOT normalized, as opposed to default R's varimax(rawLoadings)$rot
        R_v = self._ortho_matrix(self.components_.T, method=self.rotation, tol=self.tol)
        U_c = U[:, : self.n_components_]
        R_u = self._ortho_matrix(U_c, method=self.rotation, tol=self.tol)
        # NOTE: same as varimax(rawLoadings, normalize = F)$loadings, but also
        #       includes the multiplication by sqrt(n)
        Y = np.sqrt(self.n_features_) * np.dot(self.components_.T, R_v)
        self.rot_components_ = Y.T
        self.R_v_ = R_v
        self.R_u_ = R_u
        return U, S, Y

    def fit_transform(
        self, X: npt.ArrayLike, y: Optional[npt.ArrayLike] = None
    ) -> npt.ArrayLike:
        U, S, Vt = self._fit(X)
        U = U[:, : self.n_components_]  # type:ignore
        # TODO: make_skew_positive(fa)
        return np.sqrt(self.n_samples_) * np.dot(U, self.R_u_)

    def transform(self, X: npt.ArrayLike) -> npt.ArrayLike:
        # TODO: ???
        raise NotImplementedError()

    def inverse_transform(self, X: npt.ArrayLike) -> npt.ArrayLike:
        raise NotImplementedError()

To be clear, fit_transform returns exactly the same results as your R code, but I couldn't come up with a formula for transform that lets me fit and then project a different (or the same) X.

Here's usage on Iris:

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

iris = load_iris()["data"]

x = StandardScaler().fit_transform(iris)
# NOTE: use 1e-5 tol to match R
x_scores = VSP(2, tol=1e-5).fit_transform(x)

# would like to:
VSP(2, tol=1e-5).fit(x).transform(x)
# or `new_x`, like `.transform(new_x)`

@ravwojdyla
Author

ravwojdyla commented Jul 6, 2022

And x_scores above will match this R code:

library(datasets)
library(vsp)

data(iris)

x = scale(data.matrix(iris[1:4]), scale = TRUE)
x_scores = vsp(x, rank=2, degree_normalize = FALSE)$Z

@alexpghayes
Collaborator

Roughly you'll want something like

    def transform(self, X: npt.ArrayLike) -> npt.ArrayLike:
        return np.dot(super().transform(X), self.R_u_)

It looks like there is a little bit of a difference between VSP(2, tol=1e-5).fit_transform(x) and VSP(2, tol=1e-5).fit(x).transform(x) under this implementation, and I'm not sure if it comes down to numerical stuff or to not quite understanding how super().transform() works in full glory. The TL;DR is: project onto the principal components like you usually would, and then rotate. The trick is to be careful about what is scaled by S or S^{1/2} in the process. I think this implementation should be fine since you passed whiten=True to PCA(), but I would test it a little before committing to it.
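In pseudo-ish form (a sketch of my mental model, not tested against your class; R_u here is a placeholder for the fitted rotation):

import numpy as np
from sklearn.decomposition import PCA

# Standalone sketch: with whiten=True, PCA.transform() returns the left singular
# vectors up to a constant scaling, so projecting new data is "whiten, then rotate".
rng = np.random.default_rng(0)
X_fit, X_new = rng.normal(size=(100, 6)), rng.normal(size=(20, 6))

pca = PCA(n_components=2, whiten=True).fit(X_fit)
R_u = np.eye(2)  # placeholder: in your class this would be the fitted self.R_u_

Z_new = pca.transform(X_new) @ R_u  # project onto the PCs, then varimax-rotate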

@ravwojdyla
Author

@alexpghayes this is great, thank you so much! The difference is too big to be numerical error 🤔

@alexpghayes
Collaborator

It might come from the transformation of singular values into variance explained and back in the whitening logic. Personally, I found the sklearn implementation rather frustrating to read this morning, and I would just roll a Python carbon copy of vsp() if it were me. That may or may not be a good idea for you, depending on whether you're relying on any other PCA() methods.
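To unpack that a bit, here is a toy check of my reading of sklearn's whitening (a sketch, not taken from either codebase):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

n = X.shape[0]
ev = S**2 / (n - 1)                   # sklearn's explained_variance_
whitened = (Xc @ Vt.T) / np.sqrt(ev)  # what PCA(whiten=True).transform() does
print(np.allclose(whitened, np.sqrt(n - 1) * U))  # True: sqrt(n-1) * U, not sqrt(n) * U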

@ravwojdyla
Author

@alexpghayes I think you are right: the problem was explained_variance_. The sklearn implementation is:

explained_variance_ = (S**2) / (n_samples - 1)

from https://github.com/scikit-learn/scikit-learn/blob/dfaef0c6c3aef0d00c72573728c90c1d542e2957/sklearn/decomposition/_pca.py#L532

if we adjust the denominator to n_samples_ instead of (n_samples - 1):

explained_variance_ = ((S ** 2) / self.n_samples_)[:self.n_components_]

the results are the same.
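With that change, a quick consistency check (a sketch, reusing the Iris setup from above and the updated class posted below):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

x = StandardScaler().fit_transform(load_iris()["data"])

vsp = VSP(2, tol=1e-5)
z_fit = vsp.fit_transform(x)
z_proj = vsp.transform(x)          # same data projected through fit + transform
print(np.allclose(z_fit, z_proj))  # should now be True (up to numerical error)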

@ravwojdyla
Author

Thanks again @alexpghayes, will close this now.

@kennyjoseph

kennyjoseph commented Apr 11, 2023

@ravwojdyla I've been looking at this thread and am wondering if you'd be willing/able to share any updates to your Python code here (for academic research). The only return would be that we are trying to get this to work with sparse input (afaik sklearn PCA does not handle sparse input matrices? EDIT: see here), and obviously we can share back!

@ravwojdyla
Author

ravwojdyla commented Apr 11, 2023

@kennyjoseph sure, here it is:

from typing import Any, Literal

import numpy as np
import numpy.typing as npt
from sklearn.decomposition import PCA


class VSP(PCA):  # type:ignore[misc]
    """
    R VSP implementation is at: https://github.com/RoheLab/vsp
    """

    def __init__(
        self,
        n_components: None | int | float | str = None,
        *,
        copy: bool = True,
        svd_solver: str = "auto",
        tol: float = 0.0,
        iterated_power: str = "auto",
        random_state: float | None = None,
        rotation: Literal["varimax", "quartimax"] = "varimax"
    ):
        """
        NOTE:
          * R tol/eps is higher by default
          * VSP will always use svds
            https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L126
          * VSP is whitening by default?
            https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L141-L142
        """
        super().__init__(
            n_components=n_components,
            copy=copy,
            whiten=True,
            svd_solver=svd_solver,
            tol=tol,
            iterated_power=iterated_power,
            random_state=random_state,
        )
        self.rotation = rotation

    @classmethod
    def _ortho_matrix(
        cls,
        components: Any,
        method: str = "varimax",
        tol: float = 1e-6,
        max_iter: int = 100,
    ) -> npt.ArrayLike:
        """
        Returns rotation matrix.

        `components` is either components or scores. See:
        https://github.com/RoheLab/vsp/blob/d57d6e1858444035e1ab78fed371eb68ebdee0eb/R/vsp.R#L141

        This is copied from sklearn.decomposition._factor_analysis._ortho_rotation
        but doesn't perform final dot product, which we will do inside fit/transform.
        """
        nrow, ncol = components.shape
        rotation_matrix = np.eye(ncol)
        var = 0

        for _ in range(max_iter):
            comp_rot = np.dot(components, rotation_matrix)
            if method == "varimax":
                tmp = comp_rot * np.transpose((comp_rot**2).sum(axis=0) / nrow)
            elif method == "quartimax":
                tmp = 0
            u, s, v = np.linalg.svd(np.dot(components.T, comp_rot**3 - tmp))
            rotation_matrix = np.dot(u, v)
            var_new = np.sum(s)
            if var != 0 and var_new < var * (1 + tol):
                break
            var = var_new

        return rotation_matrix

    def _fit(
        self, X: npt.ArrayLike
    ) -> tuple[npt.ArrayLike, npt.ArrayLike, npt.ArrayLike]:
        U, S, _ = super()._fit(X)

        explained_variance = (S**2) / self.n_samples_
        total_var = explained_variance.sum()
        explained_variance_ratio = explained_variance / total_var

        # NOTE: R_v is NOT normalized, as opposed to default R's varimax(rawLoadings)$rot
        R_v = self._ortho_matrix(self.components_.T, method=self.rotation, tol=self.tol)
        U_c = U[:, : self.n_components_]
        R_u = self._ortho_matrix(U_c, method=self.rotation, tol=self.tol)
        # NOTE: same as varimax(rawLoadings, normalize = F)$loadings, but also includes
        #       the multiplication by sqrt(n)
        Y = np.sqrt(self.n_features_) * np.dot(self.components_.T, R_v)
        self.rot_components_ = Y.T
        self.R_v_ = R_v
        self.R_u_ = R_u
        self.explained_variance_ = explained_variance[: self.n_components_]
        self.total_var_ = total_var
        self.explained_variance_ratio_ = explained_variance_ratio[: self.n_components_]
        return U, S, Y

    def fit_transform(
        self, X: npt.ArrayLike, y: npt.ArrayLike | None = None
    ) -> npt.ArrayLike:
        U, S, Vt = self._fit(X)
        U = U[:, : self.n_components_]  # type:ignore
        # TODO: make_skew_positive(fa)
        return np.sqrt(self.n_samples_) * np.dot(U, self.R_u_)  # type:ignore[no-any-return]

    def transform(self, X: npt.ArrayLike) -> npt.ArrayLike:
        return np.dot(super().transform(X), self.R_u_)  # type:ignore[no-any-return]

    def inverse_transform(self, X: npt.ArrayLike) -> npt.ArrayLike:
        raise NotImplementedError()

@kennyjoseph

Ah awesome, thanks so much!

@ravwojdyla
Copy link
Author

@kennyjoseph np! If you have any updates or notice any issues, please let us know; I would also love to hear updates about the sparse input.
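(For what it's worth, R's vsp computes the truncated SVD with RSpectra::svds; a possible Python analogue for sparse input is scipy.sparse.linalg.svds. A rough, untested sketch:)

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

# Hypothetical starting point: truncated SVD without densifying the matrix.
# Note vsp(x, degree_normalize = FALSE) does no centering, so none is done here.
A = sp.random(1000, 500, density=0.01, format="csr", random_state=0)
U, S, Vt = svds(A, k=2)

# svds returns singular values in ascending order; flip for the usual convention
order = np.argsort(S)[::-1]
U, S, Vt = U[:, order], S[order], Vt[order, :]
# U and Vt.T could then be fed to the varimax step (_ortho_matrix) above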
