Suggestion: Linear Subspace Classifier #14139

Open
agamemnonc opened this issue Jun 21, 2019 · 5 comments
Labels: help wanted, module:linear_model, Needs Benchmarks, New Feature

Comments

@agamemnonc
Contributor

agamemnonc commented Jun 21, 2019

Do folks believe that the following classifier is worth implementing?

Naseem, I., Togneri, R. and Bennamoun, M., 2010. Linear regression for face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(11), pp. 2106-2112.

Link to paper (~1K citations).

This is a simple classifier that looks at the distance of a test point to the linear subspace spanned by the training examples of each class, and predicts the class whose subspace is closest. From my understanding, it only works when the number of features is larger than the number of samples in every class; otherwise the per-class linear kernel (Gram) matrix is singular and therefore not invertible. Posterior probability estimates are not natively supported.
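
For concreteness, a rough numpy sketch of the idea (my own illustration, not code from the paper):

import numpy as np

# Distance of a test point x to the subspace spanned by the rows of a
# class training matrix Xc (random toy data: 10 samples, 50 features).
rng = np.random.RandomState(0)
Xc = rng.randn(10, 50)
x = rng.randn(50)
# Hat matrix H = Xc^T (Xc Xc^T)^{-1} Xc projects onto that subspace.
H = Xc.T @ np.linalg.inv(Xc @ Xc.T) @ Xc
dist = np.linalg.norm(x - H @ x)  # residual distance; predict the argmin over classes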

In the case of high input dimensionality with few samples, however, it might be useful. From a quick stab at implementing the classifier and playing around with toy datasets where n_features > n_samples, it seems to do a much better job than e.g. LogisticRegression or LinearDiscriminantAnalysis, which suffer from variable collinearity.

If the core team thinks this is a feature worth implementing, I am happy to submit a PR.

@agramfort
Member

agramfort commented Jun 22, 2019 via email

@agamemnonc
Contributor Author

Sure.

I am also copying in the implementation so that people can run the benchmark themselves.

If we decide to include the feature, I will of course properly submit the code via a PR.

import warnings
warnings.filterwarnings("ignore")

import numpy as np

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils import check_X_y
from sklearn.utils.multiclass import check_classification_targets

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import (LogisticRegression, SGDClassifier,
                                  RidgeClassifier)
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier

class LinearSubspaceClassifier(BaseEstimator, ClassifierMixin):
    """Linear Subspace Classifier

    The classifier works by comparing the distance of a test point to the
    subspace spanned by the collection of training points in each class. It
    computes the distance for each class and picks the class with minimum
    distance. Prediction probabilities are not natively supported.

    Attributes
    ----------
    hat_ : list of array-like, shape = [n_features, n_features]
        Hat matrix for each class.

    References
    ----------
    .. [1] I. Naseem, R. Togneri, M. Bennamoun.
           "Linear regression for face recognition." IEEE Transactions on
           Pattern Analysis and Machine Intelligence, 32(11), 2106-2112, 2010.
    """

    def __init__(self):
        super(LinearSubspaceClassifier, self).__init__()

    def fit(self, X, y):
        """Fit the model according to the given training data and parameters.

        Parameters
        ----------
        X : array-like, shape = [n_samples, n_features]
            Training vector, where n_samples is the number of samples and
            n_features is the number of features.

        y : array, shape = [n_samples]
            Target values.
        """
        X, y = check_X_y(X, y)
        check_classification_targets(y)
        self.classes_, y = np.unique(y, return_inverse=True)
        n_samples, n_features = X.shape
        n_classes = len(self.classes_)

        _, counts = np.unique(y, return_counts=True)
        if np.any(counts > n_features):
            warnings.warn("Found classes with more samples than input "
                          "features; their Gram matrices are singular. "
                          "Results may be unstable.")

        self.hat_ = []

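        # One "hat" (projection) matrix per class:
        # H_c = Xc^T (Xc Xc^T)^{-1} Xc projects onto the subspace spanned
        # by the class-c training samples.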
        for ind in range(n_classes):
            Xg = X[y == ind, :]
            Gg = np.dot(Xg, Xg.T)
            self.hat_.append(np.dot(np.dot(Xg.T, np.linalg.inv(Gg)), Xg))

        return self

    def decision_function(self, X):
        """Apply decision function to an array of samples.

        Parameters
        ----------
        X : array-like, shape = [n_samples, n_features]
            Array of samples (test vectors).

        Returns
        -------
        C : array, shape = [n_samples, n_classes]
            Distance of each sample to the subspace of each class. Note
            that, unlike the common scikit-learn convention, smaller
            values indicate a better match.
        """
        n_samples, n_features = X.shape
        n_classes = len(self.classes_)
        dec_func = np.zeros((n_samples, n_classes))
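        # Norm of the residual X (I - H_c): how far each sample lies from
        # the class-c subspace.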
        for ind in range(n_classes):
            dec_func[:, ind] = np.linalg.norm(
                np.dot(X, np.eye(n_features) - self.hat_[ind]),
                axis=1)

        return dec_func

    def predict(self, X):
        """Perform classification on an array of test vectors X.

        The predicted class C for each sample in X is returned.

        Parameters
        ----------
        X : array-like, shape = [n_samples, n_features]
            Array of samples (test vectors).

        Returns
        -------
        C : array, shape = [n_samples,]
        """
        D = self.decision_function(X)
        # Smallest subspace distance wins; map indices back to class labels.
        return self.classes_[np.argmin(D, axis=1)]

    
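# Benchmark on a toy dataset with n_features (350) > n_samples (150), the
# regime discussed above where the per-class Gram matrices stay invertible.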
X, y = make_classification(n_samples=150, n_features=350, n_classes=5,
                           n_redundant=0, n_informative=350,
                           n_clusters_per_class=1, random_state=10)
estimators = [
    LinearDiscriminantAnalysis(),
    LinearSubspaceClassifier(),
    SGDClassifier(),
    LogisticRegression(multi_class='auto', solver='liblinear'),
    RandomForestClassifier(),
    RidgeClassifier(),
    GaussianProcessClassifier()
]

print("Cross-validated (3-fold) accuracy score")
print("---------------------------------------")
for estimator in estimators:
    print(estimator.__class__.__name__, "{:.2f}".format(
        np.mean(np.asarray(cross_val_score(
            X=X, y=y, estimator=estimator, cv=3)))
    ))

Output

Cross-validated (3-fold) accuracy score
---------------------------------------
LinearDiscriminantAnalysis 0.22
LinearSubspaceClassifier 0.83
SGDClassifier 0.33
LogisticRegression 0.39
RandomForestClassifier 0.26
RidgeClassifier 0.41
GaussianProcessClassifier 0.19

@agramfort
Member

I played with it quickly and it does indeed seem surprisingly efficient in prediction. I see some scalability issues in the implementation that could maybe be improved.
To me, if it outperforms out-of-the-box LDA and RidgeClassifierCV on a comprehensive list of datasets, that would be even more convincing.

@agamemnonc
Contributor Author

I was kind of surprised myself too!

Regarding scalability, as I mentioned above, I think the algorithm only works when n_features > n_samples; otherwise the kernel matrix is rank-deficient and therefore non-invertible. In that regime, though, it seems it can outperform other classifiers, although I have only tested on artificial datasets. Would the fact that it only works in this special case be a blocker for including it in sklearn? Or could it perhaps be included with an appropriate warning in the documentation?

Apologies, I didn't quite understand your comment on LDA and RidgeClassifierCV, could you please clarify? And do you have any specific datasets in mind? Again, if we decide to include it, I think we should benchmark on datasets where n_features > n_samples.
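
If the singularity in the n_samples > n_features case turns out to be the main blocker, a pseudoinverse-based variant might be worth benchmarking. A minimal sketch (my own suggestion, untested beyond toy data; class_hat_matrix is a hypothetical helper, not part of the implementation above):

import numpy as np

def class_hat_matrix(Xg):
    # Moore-Penrose pseudoinverse: identical to the inverse-based hat
    # matrix when the Gram matrix is full rank, and still defined when a
    # class has more samples than features (where np.linalg.inv would
    # raise or be numerically unstable).
    Gg = np.dot(Xg, Xg.T)
    return np.dot(np.dot(Xg.T, np.linalg.pinv(Gg)), Xg)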

@agramfort
Member

agramfort commented Jun 30, 2019 via email

@cmarmo added the module:linear_model, Needs Benchmarks, and help wanted labels on Mar 10, 2022