Linear Discriminant Analysis eigen solver questionable implementation #11727
There seems to be a bug in the eigen solver part of LDA.
Steps/Code to Reproduce
When you use LDA with the eigen solver, the decision function is implemented as

```python
scores = safe_sparse_dot(X, self.coef_.T, dense_output=True) + self.intercept_
```

where `coef_` and `intercept_` are computed in the eigen solver as

```python
evals, evecs = linalg.eigh(Sb, Sw)
self.coef_ = np.dot(self.means_, evecs).dot(evecs.T)
self.intercept_ = (-0.5 * np.diag(np.dot(self.means_, self.coef_.T))
                   + np.log(self.priors_))
```
where self.means_ is the mean for each class.
This means the decision function becomes
```python
scores = X @ means_.T - 0.5 * np.diag(means_ @ means_.T) + np.log(priors_)
```
while the true decision function should be
```python
scores = X @ linalg.inv(Sw) @ means_.T - 0.5 * np.diag(means_ @ linalg.inv(Sw) @ means_.T) + np.log(priors_)
```
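To make the difference concrete, here is a small numpy sketch of the two score formulas side by side. All names (`means`, `priors`, `Sw`, `X`) are illustrative stand-ins, not sklearn internals; the point is only that the two expressions disagree whenever `Sw` is not the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 classes, 4 features (illustrative, not sklearn's attributes).
means = rng.normal(size=(3, 4))    # class means, shape (n_classes, n_features)
priors = np.full(3, 1.0 / 3.0)     # uniform class priors
A = rng.normal(size=(4, 4))
Sw = A @ A.T + 4 * np.eye(4)       # a non-identity within-class covariance
X = rng.normal(size=(5, 4))        # a few sample points

# Scores as (allegedly) computed by the eigen solver, i.e. without inv(Sw):
scores_wrong = X @ means.T - 0.5 * np.diag(means @ means.T) + np.log(priors)

# True LDA scores, with the inv(Sw) factor; solve instead of an explicit inverse.
coef = np.linalg.solve(Sw, means.T).T   # = means @ inv(Sw) since Sw is symmetric
scores_true = X @ coef.T - 0.5 * np.diag(means @ coef.T) + np.log(priors)

print(np.allclose(scores_wrong, scores_true))  # False here; True only if Sw = I
```

With `Sw` set to `np.eye(4)` the two sets of scores coincide, which matches the claim that the discrepancy only shows up for a non-identity covariance.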
This could all be caused by an incorrect line in the eigen solver:

```python
self.coef_ = np.dot(self.means_, evecs).dot(evecs.T)
```
whereas in the lsqr solver it is:

```python
self.coef_ = linalg.lstsq(self.covariance_, self.means_.T)[0].T
```
Here `Sw` denotes the within-class covariance, the same quantity as `self.covariance_`.
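For reference, the lsqr-style `lstsq` call computes exactly the `inv(Sw)`-weighted means. A quick numpy sketch (variable names illustrative; `np.linalg.lstsq` used in place of `scipy.linalg.lstsq`, with the same solution in position `[0]`):

```python
import numpy as np

rng = np.random.default_rng(1)
means = rng.normal(size=(3, 4))   # hypothetical class means
A = rng.normal(size=(4, 4))
Sw = A @ A.T + np.eye(4)          # symmetric positive-definite within-class covariance

# lsqr-style coefficients: solve Sw @ coef.T = means.T for coef
coef_lstsq = np.linalg.lstsq(Sw, means.T, rcond=None)[0].T

# Explicit-inverse form of the same coefficients (Sw symmetric, so the
# transpose of inv(Sw) @ means.T equals means @ inv(Sw)):
coef_inv = means @ np.linalg.inv(Sw)

print(np.allclose(coef_lstsq, coef_inv))  # True
```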
I'm also wondering about something else. I played a bit with the iris dataset:
```python
print(__doc__)

import numpy as np
import matplotlib.pyplot as plt

from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

iris = datasets.load_iris()
X = iris.data
X = X @ np.diag([10, 5, 1, 0.1])  # rescale features so the covariance is far from identity
# X = np.concatenate((X, X[:, 0:2].mean(1, keepdims=True), np.zeros((150, 2))), 1)
y = iris.target
target_names = iris.target_names

ldal = LinearDiscriminantAnalysis(solver='lsqr', shrinkage=0.1)
ldal.fit(X, y)
print(ldal.score(X, y))
plt.imshow(ldal.covariance_)

lda = LinearDiscriminantAnalysis(solver='eigen', shrinkage=0.1)
lda.fit(X, y)
print(lda.score(X, y))
plt.figure()
plt.imshow(lda.covariance_)
```
Given the implementation difference described above, the two solvers should give different results whenever the covariance matrix is not the identity matrix, but in practice they only differ in 2 scenarios:

There seems to be something specific to shrinkage... I don't get it.
The reason the test passes is that it only checks that the predictions are the same and not that the posterior probabilities are equal. The test introduced in #11796 (
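A toy illustration of why a predictions-only test can pass here: two score matrices can share the same row-wise argmax (and hence identical predictions) while differing numerically, so the posterior probabilities derived from them would differ. The numbers below are made up for illustration:

```python
import numpy as np

scores_a = np.array([[2.0, 1.0, 0.0],
                     [0.0, 3.0, 1.0]])
# A per-element monotone transform preserves each row's argmax
# but changes the values (and thus any softmax posteriors).
scores_b = 2.0 * scores_a + 1.0

pred_a = scores_a.argmax(axis=1)
pred_b = scores_b.argmax(axis=1)

print(np.array_equal(pred_a, pred_b))   # True: identical predictions
print(np.allclose(scores_a, scores_b))  # False: different scores
```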