
Locality Preserving Projection module for manifold learning #1628

Open
wants to merge 27 commits

6 participants

@lucidfrontier45

This pull request adds a new module, Locality Preserving Projection (LPP), to the manifold learning package. LPP can be seen as a linear approximation to Laplacian Eigenmaps. Unlike most other manifold learning algorithms, LPP is a linear transformation and can be used like PCA.

Details of LPP can be found in the following paper:
"X. He and P. Niyogi. Locality preserving projections. Advances in Neural Information Processing Systems 16 (NIPS 2003), 2003, Vancouver, Canada."

I have finished the main LPP module and added examples. The remaining tasks are to write tests and documentation. Can anyone point me to a guideline for writing unit tests? I am not familiar with the conventions.

Thank you.

@amueller
Owner

I had never heard of LPP, but 1500 citations say it is pretty well-known.
Still, I'd rather call the estimator LocalityPreservingProjection.

@amueller
Owner

When running the tests, I get this warning:

    LPP(eigen_solver=None, kernel_func=None, kernel_param=None, max_iter=None,
        mode=None, n_components=None, n_neighbors=None, pca_preprocess=None,
        random_state=0, tol=None, use_ext=None)
    The eigenvalue range specified is not valid.
    Valid range is [0,0]
@amueller
Owner

Several of the common tests are failing; please run the test suite. LPP doesn't check the input for NaN or inf (use the utils.validation functions), and it changes some of its parameters in __init__.
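
For illustration, a minimal sketch of the kind of input checking being asked for. It uses check_array from sklearn.utils.validation (today's name for this helper; at the time of this PR the equivalents were check_arrays and assert_all_finite), and the Example class is purely illustrative:

    import numpy as np
    from sklearn.utils.validation import check_array

    class Example:
        def fit(self, X, y=None):
            # check_array rejects NaN and inf by default and raises an
            # informative ValueError instead of failing deep in the solver
            X = check_array(X)
            return self

    Example().fit(np.array([[0., np.nan]]))  # raises ValueError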

@amueller
Owner

The current examples don't really convince me. Maybe I have to read the paper. As it is a linear method, it can't do well on the sphere. For the S curve we get a projection that basically flattens the S. That might make some sense.
I tried the digits, too, and didn't see anything there...

@lucidfrontier45

Hi @amueller

I renamed LPP to LocalityPreservingProjection.

Can you tell me how I can reproduce the error you reported?

    LPP(eigen_solver=None, kernel_func=None, kernel_param=None, max_iter=None,
        mode=None, n_components=None, n_neighbors=None, pca_preprocess=None,
        random_state=0, tol=None, use_ext=None)
    The eigenvalue range specified is not valid.
    Valid range is [0,0]
@lucidfrontier45 lucidfrontier45 - change LPP to LocalityPreservingProjection
- modified PCA preprocess of LPP for removing singularity
5f0b73c
@amueller
Owner

Thanks for the rename. I got the error when running the tests but can't reproduce it now.
Before going into the details, I think it would be great if some other devs who might be familiar with the method could say whether it is appropriate for sklearn.
Also: do you have any specific real-world application where it works particularly well, or some example that demonstrates its benefits?

sklearn/manifold/lpp_util.pyx
((20 lines not shown))
+#        xx = nzx[i]
+#        yy = nzy[i]
+#        if G[yy, xx] == 0:
+#            G[yy, xx] = G[xx, yy]
+#    return G
+
+@cython.boundscheck(False)
+def toSymmetrixMat(G, np.ndarray[int, ndim=1] nzx,
+                   np.ndarray[int, ndim=1] nzy):
+    cdef int i, xx, yy, n_nz
+    n_nz = len(nzx)
+    for i in xrange(n_nz):
+        xx = nzx[i]
+        yy = nzy[i]
+        if G[yy, xx] == 0:
+            G[yy, xx] = G[xx, yy]
@agramfort Owner

Is this really necessary? Since G is not typed, I guess it's not really a performance improvement.

@lucidfrontier45 Owner

Sorry for the late response. Actually, it made it 10 times faster.

sklearn/manifold/lpp.py
((42 lines not shown))
+
+    Returns
+    -------
+    adjacency matrix : LIL sparse matrix, shape: (n_samples, n_samples)
+        (i, j) component of it is defined in section 2.2 of [1]
+    """
+    X = np.asanyarray(X)
+    G = kneighbors_graph(X, n_neighbors, mode)
+    G = G.tolil()
+    nzx, nzy = G.nonzero()
+    if use_ext:
+        G = toSymmetrixMat(G, nzx, nzy)
+    else:
+        for xx, yy in zip(nzx, nzy):
+            if G[yy, xx] == 0:
+                G[yy, xx] = G[xx, yy]
@agramfort Owner

How about:

G = (G + G.T) / 2.

?

@lucidfrontier45 Owner

I think it's definitely needed, since kneighbors_graph does not give a symmetric matrix.
I know that in the spectral_embedding module (G + G.T) / 2. is used, but I don't think that is correct.

Suppose the j-th node is among the k nearest neighbors of the i-th node but not vice versa, i.e. G[i, j] = dist(i, j) and G[j, i] = 0.
Also suppose G[m, l] = G[l, m] = dist(m, l).
Now if we use (G + G.T) / 2, we get G[i, j] = G[j, i] = dist(i, j) / 2 but G[m, l] = G[l, m] = dist(m, l).
That is obviously not what we want.
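
A small self-contained illustration of this point (the maximum-based symmetrization at the end is one standard alternative to the loop above, not necessarily what the PR should use):

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    X = np.random.RandomState(0).rand(20, 3)
    G = kneighbors_graph(X, n_neighbors=2, mode='distance')

    # the kNN relation is not symmetric, so neither is G
    print((G != G.T).nnz)      # > 0

    # averaging halves every one-sided distance ...
    G_avg = (G + G.T) / 2.
    # ... while the elementwise maximum keeps dist(i, j) intact and only
    # fills in the missing mirror entries
    G_max = G.maximum(G.T)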

@lucidfrontier45 Owner

OK, I will try to merge LPP into spectral_embedding somehow.

@agramfort agramfort commented on the diff
sklearn/manifold/lpp.py
((93 lines not shown))
+    ----------
+    afn_mat : CSR sparse matrix, shape: (n_samples, n_samples)
+        the affinity matrix
+
+    Returns
+    -------
+    Laplacian matrix : CSR sparse matrix, shape: (n_samples, n_samples)
+        (i, j) component L(i, j) is defined in [1], section 2.2, step 3
+
+    col_sum : ndarray, shape: (n_samples,)
+        column sums of the affinity matrix (the diagonal of the degree matrix)
+    """
+    col_sum = np.asarray(afn_mat.sum(0)).flatten()
+    lap_mat = (-afn_mat).tolil()
+    lap_mat.setdiag(col_sum)
+    return lap_mat.tocsr(), col_sum
@agramfort Owner

There is a need for the 3 functions above also in spectral embedding / Laplacian eigenmaps, so I guess the code should already exist and should be factorized.
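
For reference, the Laplacian part of this factorization is also available as scipy.sparse.csgraph.laplacian, which computes the same L = D - W (and, with return_diag=True, the degree vector) that laplacian_matrix above returns; a sketch:

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import laplacian

    W = csr_matrix(np.array([[0., 1., 0.],
                             [1., 0., 2.],
                             [0., 2., 0.]]))

    # L = D - W plus the degree vector, analogous to (lap_mat, col_sum)
    lap, degree = laplacian(W, return_diag=True)
    print(lap.toarray())
    print(degree)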

@agramfort agramfort commented on the diff
sklearn/manifold/lpp.py
((97 lines not shown))
+    Returns
+    -------
+    Laplacian matrix : CSR sparse matrix, shape: (n_samples, n_samples)
+        (i, j) component L(i, j) is defined in [1], section 2.2, step 3
+
+    col_sum : ndarray, shape: (n_samples,)
+        column sums of the affinity matrix (the diagonal of the degree matrix)
+    """
+    col_sum = np.asarray(afn_mat.sum(0)).flatten()
+    lap_mat = (-afn_mat).tolil()
+    lap_mat.setdiag(col_sum)
+    return lap_mat.tocsr(), col_sum
+
+
+def auto_dsygv(M, N, k, k_skip=0, eigen_solver='auto', tol=1E-6,
+               max_iter=100, random_state=None):
@agramfort Owner

If such a function does not exist elsewhere, it should be in utils, as it's a common problem. Can you check that a similar solver is not already implemented for Laplacian eigenmaps, for example?

@lucidfrontier45 Owner

In the spectral_embedding module, the eigenvalue problem is handled inside the main spectral_embedding function. I want to separate the eigensolver part from the main function.

@agramfort agramfort commented on the diff
sklearn/manifold/lpp.py
((232 lines not shown))
+        or matrix type. This method should be avoided for
+        large problems.
+
+    tol : float, optional
+        Tolerance for the 'arpack' method.
+        Not used if eigen_solver == 'dense'.
+
+    max_iter : int, optional
+        Maximum number of iterations for the 'arpack' method.
+        Not used if eigen_solver == 'dense'.
+
+    random_state : numpy.RandomState or int, optional
+        The generator or seed used to determine the starting vector for
+        arpack iterations. Defaults to numpy.random.
+
+    use_ext : bool
+        whether to use the Cython extension or not
@agramfort Owner

Such an option should not exist. Either we go for Cython or we don't.

@lucidfrontier45 Owner

I will remove use_ext and always use the Cython module.

sklearn/manifold/lpp.py
((324 lines not shown))
+        self._n_components = n_components
+        self._mode = mode
+        self._kernel_func = kernel_func
+        self._kernel_param = kernel_param
+        self._eigen_solver = eigen_solver
+        self._tol = tol
+        self._max_iter = max_iter
+        self._random_state = random_state
+        self._use_ext = use_ext
+        self._components = None
+        self._pca_preprocess = pca_preprocess
+
+    def fit(self, X, y=None):
+        if self._pca_preprocess:
+            # PCA preprocessing to remove singularity
+            target_dim = int(min(X.shape) * 0.9)
@agramfort Owner

I don't much like the hard-coded 0.9.

@agramfort
Owner

From what I understand, one benefit is that the embedding is defined everywhere, so the transform for new samples is just a matrix multiplication. The question for me now is: does it really work better than what we already have? Maybe the examples, which I did not take the time to run, are convincing...
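
To make that concrete, a toy sketch of such an out-of-sample transform; the components matrix here stands in for the (n_components, n_features) projection a fitted estimator would store:

    import numpy as np

    def transform(X_new, components):
        # a fitted linear embedding maps new samples with one matrix product
        return np.dot(np.asarray(X_new), components.T)

    rng = np.random.RandomState(0)
    components = rng.rand(2, 10)                           # toy projection matrix
    print(transform(rng.rand(5, 10), components).shape)    # (5, 2)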

@lucidfrontier45

@agramfort

In the paper "Face Recognition Using Laplacianfaces", LPP performed much better than PCA.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.4055&rep=rep1&type=pdf

When I was doing some research, I found that the above result could be reproduced on the olivetti_faces dataset.
However, I can't reproduce it now due to some changes or some bug... I'll first show you a benchmark and then proceed to refactoring.

sklearn/manifold/lpp_util.pyx
((12 lines not shown))
+#def adjacency_matrix(X, n_neighbors, mode='distance'):
+#    cdef int i, xx, yy
+#    X = np.asanyarray(X)
+#    G = kneighbors_graph(X, n_neighbors, mode)
+#    G = G.tolil()
+#    nzx, nzy = G.nonzero()
+##    for xx, yy in zip(nzx, nzy):
+#    for i in xrange(nzx):
+#        xx = nzx[i]
+#        yy = nzy[i]
+#        if G[yy, xx] == 0:
+#            G[yy, xx] = G[xx, yy]
+#    return G
+
+@cython.boundscheck(False)
+def toSymmetrixMat(G, np.ndarray[int, ndim=1] nzx,
@mblondel Owner

-> to_symmetrix_matrix

sklearn/manifold/lpp.py
((47 lines not shown))
+ """
+ X = np.asanyarray(X)
+ G = kneighbors_graph(X, n_neighbors, mode)
+ G = G.tolil()
+ nzx, nzy = G.nonzero()
+ if use_ext:
+ G = toSymmetrixMat(G, nzx, nzy)
+ else:
+ for xx, yy in zip(nzx, nzy):
+ if G[yy, xx] == 0:
+ G[yy, xx] = G[xx, yy]
+
+ return G
+
+
+def affinity_matrix(adj_mat, kernel_func="heat", kernel_param=1.0):
@mblondel Owner

This function arises in many estimators too, so it could be factorized.

@mblondel
Owner

It would indeed be nice to have an example on which LPP works better than PCA.

Shiqiao Du - LPP code refactored
- added a comparison code of LPP and PCA (face recognition)
7c3208b
@lucidfrontier45

Hi, I added benchmark code for PCA vs. LPP.
It is simple face recognition by PCA/LPP and a one-nearest-neighbor classifier.

For olivetti_faces, the average precision (score method) was
EigenFace: 88.7%, LaplacianFace: 89.1%

For lfw_people:
EigenFace: 50.2%, LaplacianFace: 63.4%

While LPP and PCA were about even on olivetti_faces, LPP outperformed PCA on the larger lfw_people dataset.

LPP has two additional parameters (the number of neighbors and the heat kernel parameter).
I used totally different kernel parameters for olivetti_faces (1e2) and lfw_people (1e7).
I will look into more sophisticated automatic parameter selection from now on.
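
To make the setup concrete, a minimal sketch of the PCA half of such a benchmark (n_components and the train/test split here are illustrative, not the settings behind the numbers above; the LPP half is analogous with the estimator from this PR):

    from sklearn.datasets import fetch_olivetti_faces
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    faces = fetch_olivetti_faces()
    X_train, X_test, y_train, y_test = train_test_split(
        faces.data, faces.target, test_size=0.25, random_state=0)

    # "EigenFace": PCA projection followed by one nearest neighbor
    eigenface = make_pipeline(PCA(n_components=50),
                              KNeighborsClassifier(n_neighbors=1))
    eigenface.fit(X_train, y_train)
    print(eigenface.score(X_test, y_test))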

@GaelVaroquaux

OK, that's very interesting. Given that, LPP does sound interesting. Do you have a hand-waving explanation of why LPP outperforms PCA?

@lucidfrontier45

According to the paper I listed above, LPP is a linear approximation to Laplacian Eigenmaps (spectral embedding). It tries to find a low-dimensional representation that preserves the local structure as much as possible, while PCA tries to preserve the global structure. This can be seen from the objective functions they minimize or maximize.

The authors say that in this way LPP can extract the intrinsic low-dimensional manifold from high-dimensional complex data such as face images. Of course, it is merely a hypothesis that facial data lie on such a manifold.
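
For reference, the objective from He & Niyogi that makes "preserve the local structure" precise (W is the heat-kernel affinity matrix, D the diagonal degree matrix with D_{ii} = \sum_j W_{ij}, and L = D - W the graph Laplacian):

    \min_{\mathbf{a}} \sum_{ij} \left( \mathbf{a}^T \mathbf{x}_i - \mathbf{a}^T \mathbf{x}_j \right)^2 W_{ij}
    \quad \Longrightarrow \quad
    X L X^T \mathbf{a} = \lambda \, X D X^T \mathbf{a}

The eigenvectors with the smallest eigenvalues give the projection directions; PCA, by contrast, maximizes the global variance \mathbf{a}^T \mathrm{Cov}(X) \mathbf{a}.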

@lucidfrontier45

I don't know whether it is convincing, but in the paper I listed the authors show the so-called Laplacianfaces, the LPP counterpart of Eigenfaces. At first sight they are not very intuitive, but the result was very similar to that of the supervised Fisherfaces, which suggests LPP is better suited than PCA to classification tasks. I can prepare the same kind of plot for the face dataset in sklearn.

@lucidfrontier45 lucidfrontier45 - added `components_` to the LDA so that it is more compatible with
    other `transformer` classes
- added LPP and LDA to plot_faces_decomposition.py
ba342c8
@lucidfrontier45

I added LDA and LPP to examples/decomposition/plot_faces_decomposition.py. Unlike the paper's result, LPP doesn't look much like LDA. Nevertheless, we can see that LPP and PCA return completely different projections.

@amueller
Owner

Could you post the plot for the lazy reviewers please ;)

@lucidfrontier45

Here it is.

[images: laplacianface, eigenface, fisherface]

@amueller
Owner

Thanks :)

lucidfrontier45 and others added some commits
@lucidfrontier45

Digit plots for PCA, Spectral Embedding, and LPP.
It looks like PCA already does well enough on this example.
LPP's result seems equivalent to PCA's; just their 2nd component vectors have opposite signs.

[images: pca, lem, lpp]

@GaelVaroquaux

Yeah, LPP is not terribly different from PCA on digits. The good thing is that it does quite well.

What do the 'LaplacianDigits' look like?

@lucidfrontier45

Do we already have an EigenDigits or FisherDigits plot example?

@amueller
Owner

Actually I think the LPP on the digits looks quite nice. Not terribly different from PCA, but if you look at the details, it is indeed 'better'.

@lucidfrontier45

Actually I think the LPP on the digits looks quite nice. Not terribly different from PCA, but if you look at the details, it is indeed 'better'.

One drawback of LPP is that it has two additional parameters. While PCA only needs n_components, LPP also needs n_neighbors and kernel_param. Only when they are well tuned can LPP be better than PCA.

@GaelVaroquaux

Is the estimate_bandwidth trick in mean-shift that I suggested earlier of any use?

@lucidfrontier45

I haven't looked at it yet. Since I've never heard of it, it might take some time.

@lucidfrontier45

I hacked plot_faces_decomposition and made a digits version.
However, the plots of the digit components are beyond my intuitive understanding.

[images: center_digits, eigen_digits, laplacian_digits]

@GaelVaroquaux

Thanks for the effort. I am indeed not sure that this is useful. Sorry for suggesting a dead end.

@lucidfrontier45

@GaelVaroquaux I looked into the bandwidth estimation algorithm in the mean-shift module, but I don't understand why it works.
Can you give me a reference for that algorithm?

@vene
Owner

Plots on sklearn.datasets.digits are usually hard to interpret; maybe out of curiosity we should look at a similar comparison on MNIST? Just on this thread, not in the docs, of course.

@lucidfrontier45

I found a good estimator for kernel_param. I just used a simple analogy to the variance parameter of a Gaussian.
It worked beautifully for the previous face recognition example (examples/manifold/compare_lpp_pca.py).

if kernel_func == "heat":
    W.data **= 2

    # Estimate kernel_param by the heuristic that it should correspond
    # to the variance of a Gaussian distribution.
    if kernel_param == "auto":
        kernel_param = np.mean(W.data)

    np.exp(-W.data / kernel_param, W.data)
@GaelVaroquaux

Usually, people take a percentile rather than the mean, to be robust to skewed distributions. You could replace the mean by the median, for instance.

@lucidfrontier45

I plotted PCA, spectral embedding, and LPP of the MNIST digit data.
This time LPP showed a clearly different shape from PCA.
In fact, LPP showed a very similar shape to spectral embedding, which is consistent with the fact that LPP is a linear approximation of Laplacian Eigenmaps (spectral embedding).

[images: pca, LEM, LPP]

@lucidfrontier45

I tried to merge the overlapping functions of spectral_embedding and LPP, but it was hard.

First of all, I want to replace the laplacian_matrix function with utils.graph.graph_laplacian. However, the two functions gave different results for sparse matrices; for dense matrices the results were the same.
I think this code in _graph_laplacian_sparse does something bad.

diag_mask = (lap.row == lap.col)
if not diag_mask.sum() == n_nodes:
    # The sparsity pattern of the matrix has holes on the diagonal,
    # we need to fix that
    diag_idx = lap.row[diag_mask]

    lap = lap.tolil()

    diagonal_holes = list(set(range(n_nodes)).difference(diag_idx))
    lap[diagonal_holes, diagonal_holes] = 1
    lap = lap.tocoo()
    diag_mask = (lap.row == lap.col)
lap.data[diag_mask] = 0
w = -np.asarray(lap.sum(axis=1)).squeeze()

What are the 'holes' on the diagonal that need to be fixed?

Second, LPP and spectral_embedding construct the affinity matrix in different ways. For the nearest-neighbor affinity matrix, LPP also has to apply its heat kernel function. The two methods also solve slightly different eigenvalue problems: spectral_embedding solves an ordinary eigenvalue problem, while LPP solves a generalized one.

Considering this, I think the code would be less readable if we merged the two modules. I'd like to keep lpp and spectral embedding separate.
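
The eigenproblem difference in scipy terms (the matrices here are illustrative): spectral_embedding solves an ordinary symmetric eigenvalue problem, while LPP solves a generalized one:

    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.RandomState(0)
    A = rng.rand(5, 5)
    A = A + A.T                          # symmetric, like the Laplacian term
    B = np.dot(A, A) + 5 * np.eye(5)     # symmetric positive definite

    # ordinary problem:     A v = w v       (spectral_embedding)
    w_ord, v_ord = eigh(A)

    # generalized problem:  A v = w B v     (LPP: X L X^T a = w X D X^T a)
    w_gen, v_gen = eigh(A, B)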

@GaelVaroquaux

I'll have a look at the commonalities between LPP and spectral embedding, to see if the code can be gracefully merged.

In the meantime, the branch does not pass the common tests, as can be seen from the travis log (look on the pull request page; it tells you that the tests are failing):
https://travis-ci.org/scikit-learn/scikit-learn/builds/4596058
The reason is that you don't use the standard input preprocessors to deal with X and y, and as a result the estimator does not give good error messages on things like sparse input values or NaNs.

@GaelVaroquaux

Another remark: you should add a section on LPP in the manifold part of the documentation. Follow the sections for the other methods: a paragraph with a hand-waving description of the algorithm, its strengths and weaknesses, a picture (of the embedded digits), and links to references and the example.

Thanks!

@GaelVaroquaux

With regard to the difference between your version of graph_laplacian and the one in utils, it would help me understand the problem if you could give me a small example of a matrix on which they disagree.

The 'holes' on the diagonal are diagonal elements that are zero, which shouldn't be the case for a Laplacian matrix.
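
A concrete instance of such a hole, mirroring the diag_mask check in the quoted code (the matrix is illustrative): the Laplacian of a graph with an isolated node has a structurally missing diagonal entry:

    import numpy as np
    from scipy import sparse

    # Laplacian of a 3-node graph where node 2 is isolated, built from
    # only the nonzero entries: (2, 2) is absent from the COO pattern
    data = np.array([1., 1., -1., -1.])
    row = np.array([0, 1, 0, 1])
    col = np.array([0, 1, 1, 0])
    lap = sparse.coo_matrix((data, (row, col)), shape=(3, 3))

    diag_mask = (lap.row == lap.col)
    print(diag_mask.sum())   # 2 stored diagonal entries, but n_nodes == 3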

@GaelVaroquaux

Also, could you please rename lpp_util.pyx to _lpp.pyx? Thanks!

@GaelVaroquaux

Hey @lucidfrontier45: this PR looks like it has moved forward a lot, and I'd like to merge it before I leave on vacation. I am ready to put in some work to get it merged, but I will need some help. What's your schedule like in the next week? Will you be able to invest some time, and if so, when? Sorry for asking, but I must do some planning for my own time management.

@lucidfrontier45

@GaelVaroquaux Hi. I will try to make some time, but my office work is at an important stage. Could you give me a list of tasks that I should do to merge this PR?

The reason is that you don't use standard input preprocessors to deal with X and y, and as a result the estimator does not give good error messages on things like sparse input values, or NaNs.

Can you give me an example of "standard input preprocessors to deal with X and y"?

About the documentation, I will check it today (maybe first copy from spectral embedding and modify it).

@lucidfrontier45

I took a look at the test error messages.
I fixed some, but I cannot understand this one.
Can anyone tell me what exactly this test checks and how to fix it?

    FAIL: sklearn.tests.test_common.test_estimators_nan_inf

    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/home/du/workspace/scikit-learn/sklearn/tests/test_common.py", line 378, in test_estimators_nan_inf
        raise AssertionError(error_string_transform, Est)
    AssertionError: ("Estimator doesn't check for NaN and inf in transform.", )


@lucidfrontier45

I fixed it to pass the nosetests.

On the rst document: when I run make doc and compile the html documents, the LaTeX code is not rendered. In addition to Sphinx and a LaTeX distribution, what do I need to embed math in html?
I installed Sphinx and LaTeX from the 64-bit Linux Mint 12 apt repository.

@GaelVaroquaux
Try this code to reproduce the issue: my lpp.laplacian_matrix and utils.graph.graph_laplacian give different results for sparse matrices but the same for dense ones.
https://gist.github.com/lucidfrontier45/4987187

@mblondel
Owner

@lucidfrontier45 Since this PR has stalled, I think it would be a good idea to extract the source into a GitHub repo and add it to https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets

Commits on Jan 25, 2013
  1. @lucidfrontier45

    lpp modules

    lucidfrontier45 authored
  2. @lucidfrontier45

    - bugfix
    - default eigen_solver -> auto
    - default kernel_param -> 10.0
    - use_ext member

    lucidfrontier45 authored
  3. @lucidfrontier45

    PCA preprocess

    lucidfrontier45 authored
Commits on Jan 26, 2013
  1. @lucidfrontier45
  2. @lucidfrontier45

    Comments for LPP

    lucidfrontier45 authored
  3. @lucidfrontier45

    added example of LPP

    lucidfrontier45 authored
  4. @lucidfrontier45
Commits on Jan 27, 2013
  1. @lucidfrontier45

    - change LPP to LocalityPreservingProjection
    - modified PCA preprocess of LPP for removing singularity

    lucidfrontier45 authored
Commits on Jan 31, 2013
  1. - LPP code refactored
     - added a comparison code of LPP and PCA (face recognition)

     Shiqiao Du authored
  2. @lucidfrontier45

    - added `components_` to the LDA so that it is more compatible with
      other `transformer` classes
    - added LPP and LDA to plot_faces_decomposition.py

    lucidfrontier45 authored
Commits on Feb 1, 2013
  1. @lucidfrontier45

    lpp modules

    lucidfrontier45 authored
  2. @lucidfrontier45

    - bugfix
    - default eigen_solver -> auto
    - default kernel_param -> 10.0
    - use_ext member

    lucidfrontier45 authored
  3. @lucidfrontier45

    PCA preprocess

    lucidfrontier45 authored
  4. @lucidfrontier45
  5. @lucidfrontier45

    Comments for LPP

    lucidfrontier45 authored
  6. @lucidfrontier45

    added example of LPP

    lucidfrontier45 authored
  7. @lucidfrontier45
  8. @lucidfrontier45

    - change LPP to LocalityPreservingProjection
    - modified PCA preprocess of LPP for removing singularity

    lucidfrontier45 authored
  9. @lucidfrontier45

    - LPP code refactored
    - added a comparison code of LPP and PCA (face recognition)

    Shiqiao Du authored, lucidfrontier45 committed
  10. @lucidfrontier45

    - added `components_` to the LDA so that it is more compatible with
      other `transformer` classes
    - added LPP and LDA to plot_faces_decomposition.py

    lucidfrontier45 authored
  11. @lucidfrontier45

    Merge branch 'lpp' of github.com:lucidfrontier45/scikit-learn into lpp

    lucidfrontier45 authored
    Conflicts:
    	examples/manifold/plot_compare_methods.py
Commits on Feb 4, 2013
  1. @lucidfrontier45

    - automatic estimation of LPP kernel param by analogy with Gaussian
      distribution
    - added LPP to plot_lle_digits.py

    lucidfrontier45 authored
Commits on Feb 5, 2013
  1. @lucidfrontier45
Commits on Feb 18, 2013
  1. @lucidfrontier45
  2. @lucidfrontier45

    rst document of lpp

    lucidfrontier45 authored
Commits on Feb 19, 2013
  1. @lucidfrontier45

    - modified lpp to pass nosetest
    - renamed lpp_util cython module to _lpp

    lucidfrontier45 authored
  2. @lucidfrontier45