# A math mistake in spectral_embedding #8129

Closed · opened this issue Dec 28, 2016 · 19 comments

### GenTang commented Dec 28, 2016 • edited

spectral_embedding_bug.pdf

#### Description

Spectral embedding is an algorithm proposed by the following paper:

“Laplacian Eigenmaps for Dimensionality Reduction and Data Representation” M. Belkin, P. Niyogi, Neural Computation, June 2003; 15 (6):1373-1396

On page 6 of this paper, it defines

$$Lf = \lambda D f$$

where $D$ is the diagonal weight matrix whose entries are the column sums of $W$, $D_{ii} = \sum_{j} W_{ji}$, and $L = D - W$.

Let $f_0, f_1, \ldots, f_{k-1}$ be the solutions of the above equation, ordered so that $0 = \lambda_0 \le \lambda_1 \le \ldots \le \lambda_{k-1}$. Note that $f_0$ is a constant vector.

Therefore, $(f_1, \ldots, f_m)$ is the Laplacian eigenmap.

Let $g = D^{\frac{1}{2}}f$ and $L^{sym} = D^{-\frac{1}{2}}(D-W)D^{-\frac{1}{2}}$; then

$$D^{-\frac{1}{2}}(D-W)D^{-\frac{1}{2}} \, D^{\frac{1}{2}}f = \lambda D^{\frac{1}{2}}f$$
$$L^{sym} g = \lambda g$$
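
This substitution is easy to verify numerically; here is a minimal sketch (the 3-node weighted graph is hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical 3-node weighted graph, for illustration only.
W = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])
d = W.sum(axis=0)                      # degrees, D_ii = sum_j W_ji
D = np.diag(d)
L = D - W                              # unnormalized Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt.dot(L).dot(D_inv_sqrt)

# Eigenvectors g of L_sym, mapped back via f = D^{-1/2} g,
# solve the generalized problem L f = lambda D f.
lambdas, g = np.linalg.eigh(L_sym)
f = D_inv_sqrt.dot(g)
print(np.allclose(L.dot(f), D.dot(f) * lambdas))  # True
```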

In the code here:

```python
laplacian, dd = graph_laplacian(adjacency,
                                normed=norm_laplacian, return_diag=True)
```


laplacian is $L^{sym}$ (with norm_laplacian=True) and dd is the diagonal of $D^{\frac{1}{2}}$.
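
The diagonal that comes back can be checked against the equivalent SciPy routine; a small sketch, assuming scipy.sparse.csgraph.laplacian behaves like the internal graph_laplacian (which mirrors it):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian  # stand-in for the internal graph_laplacian

W = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
lap, dd = laplacian(W, normed=True, return_diag=True)

# With normed=True, dd holds the square roots of the degrees,
# i.e. the diagonal of D^{1/2}.
print(np.allclose(dd, np.sqrt(W.sum(axis=0))))  # True
```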

In the code here:

```python
lambdas, diffusion_map = eigsh(laplacian, k=n_components,
                               sigma=1.0, which='LM',
                               tol=eigen_tol, v0=v0)
embedding = diffusion_map.T[n_components::-1] * dd
```


diffusion_map is $g$, and embedding is $f$.

However, $f = D^{-\frac{1}{2}}g$, which means

```python
embedding = diffusion_map.T[n_components::-1] / dd
```


The same mistake appears at line 289 and line 313.
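
A dense end-to-end check of the proposed fix on the 4-node graph from the reproduction script below (a sketch using np.linalg.eigh instead of the shift-inverted eigsh call, to keep it short):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian

# Same 4-node graph as in the reproduction script below.
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 1
W[0, 2] = W[2, 0] = 1
W[1, 2] = W[2, 1] = 1
W[2, 3] = W[3, 2] = 4

lap, dd = laplacian(W, normed=True, return_diag=True)
lambdas, g = np.linalg.eigh(lap)      # ascending eigenvalues; column 0 <-> lambda_0 = 0

f = g[:, 0] / dd                      # corrected back-transformation f = D^{-1/2} g
print(np.abs(f) / np.linalg.norm(f))  # constant, ~[0.5 0.5 0.5 0.5] up to sign
```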

#### Steps/Code to Reproduce

```python
from sklearn.manifold.spectral_embedding_ import spectral_embedding
import numpy as np

matrix = np.zeros([4, 4])
matrix[0, 1] = 1
matrix[1, 0] = 1
matrix[0, 2] = 1
matrix[2, 0] = 1
matrix[1, 2] = 1
matrix[2, 1] = 1
matrix[2, 3] = 4
matrix[3, 2] = 4

f = spectral_embedding(matrix, n_components=1, drop_first=False)
```


#### Expected Results

A constant vector, here $1/\sqrt{14} \approx 0.26726124$, since the total degree of the graph is $14$:

```
array([[ 0.26726124],
       [ 0.26726124],
       [ 0.26726124],
       [ 0.26726124]])
```


#### Actual Results

```
array([[ 0.53452248],
       [ 0.53452248],
       [ 1.60356745],
       [ 1.06904497]])
```


#### Versions

```python
>>> import platform; print(platform.platform())
Darwin-16.1.0-x86_64-i386-64bit
>>> import sys; print("Python", sys.version)
('Python', '2.7.9 (v2.7.9:648dcafa7e5f, Dec 10 2014, 10:10:46) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
>>> import numpy; print("NumPy", numpy.__version__)
('NumPy', '1.11.1')
>>> import scipy; print("SciPy", scipy.__version__)
('SciPy', '0.17.0')
>>> import sklearn; print("Scikit-Learn", sklearn.__version__)
('Scikit-Learn', '0.17.1')
```


### jnothman commented Dec 29, 2016

 Thanks. Would you be willing to submit a patch and some more robust unit tests than what we presently have?

### GenTang commented Dec 29, 2016 • edited

OK. I have a question about spectral embedding for a graph that is not fully connected: when the graph is not fully connected and norm_laplacian=False, dd can contain zeros (the matrix $D$ is singular, and the code doesn't set those values to 1 as it does for norm_laplacian=True). That will crash the program. Right now I don't know how to handle this situation; any idea? More fundamentally, I don't understand why the parameter norm_laplacian exists: if norm_laplacian=False, I don't think we can recover $f$ from diffusion_map, which then holds the eigenvectors of $D-W$.
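
To make the failure mode concrete, a small sketch (again using scipy.sparse.csgraph.laplacian as a stand-in for the internal graph_laplacian):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian

# Disconnected graph: one edge plus two isolated nodes.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 0.]])

lap, dd = laplacian(W, normed=False, return_diag=True)
print(dd)  # [1. 1. 0. 0.] -- zeros for the isolated nodes, so dividing by dd blows up
```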

### jnothman commented Dec 29, 2016

I'm not the best person to help you there, and I hope @GaelVaroquaux has a moment to chime in.

### devanshdalal commented Jan 1, 2017

Hi, I think I can dig into the issue, if @GenTang is not working on this one?

### GenTang commented Jan 2, 2017

@devanshdalal, it's great that you want to help. But how do you plan to solve the problem with the non-normalized Laplacian matrix?

### devanshdalal commented Jan 2, 2017 • edited

@GenTang, if you go to page 6 of “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation”, you can see that the same algorithm is applicable to non-connected graphs as well (step 3 has to be applied to each component separately); see the sketch below. I guess we might not need to change anything, or only make very minor changes. I will have to dive deep into the code to be sure.
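
A rough sketch of what the per-component idea could look like, using scipy.sparse.csgraph.connected_components (only an illustration, not what the current code does):

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

# Disconnected adjacency matrix: two components of two nodes each.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 2.],
              [0., 0., 2., 0.]])

n_comp, labels = connected_components(W, directed=False)
for c in range(n_comp):
    idx = np.where(labels == c)[0]
    W_c = W[np.ix_(idx, idx)]  # adjacency restricted to one component
    # ... run the embedding (step 3 of the paper) on W_c here ...
    print(c, idx.tolist(), W_c.shape)
```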

### agramfort commented Jan 2, 2017

 @jakevdp may have an opinion also. He wrote a lot of the manifold learning code.

### GaelVaroquaux commented Jan 2, 2017

I don't think that @jakevdp wrote this part; it was adapted later from the spectral clustering code that I wrote. I had a look at the equations and the code, and I believe that the bug report is correct (and hence the code is wrong).

### devanshdalal commented Jan 2, 2017

@GaelVaroquaux, I can fix the issue, if it's not too much. Let me fix it.

### GaelVaroquaux commented Jan 2, 2017

> @GaelVaroquaux, I can fix the issue, if it's not too much. Let me fix it.

The fix itself is trivial and is pointed out in the issue (it's a question of dividing by dd instead of multiplying by dd in the last assignment). Slightly harder is to implement a good set of tests; they are also suggested in the issue.

### devanshdalal commented Jan 2, 2017

 @GaelVaroquaux, will do both and submit a pull request soon.

### GenTang commented Jan 3, 2017

@devanshdalal, I saw that part of the paper. But consider the following example, which causes a divide-by-zero error:

```python
>>> matrix
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
>>> matrix, dd = graph_laplacian(matrix, normed=False, return_diag=True)
>>> lambdas, diffusion_map = eigsh(matrix, k=1)
>>> diffusion_map.T[1::-1] / dd
__main__:1: RuntimeWarning: divide by zero encountered in divide
array([[ 0.70710678, -0.70710678,         inf,        -inf]])
```

@GaelVaroquaux, I'm sorry to bother you, but I don't understand what this algorithm is supposed to compute when norm_laplacian=False. The corresponding code is here. Thanks

### devanshdalal commented Jan 3, 2017 • edited

@GenTang, I got your point. Let me look into the problem. I will discuss the possible solution with you before starting to implement.

### GaelVaroquaux commented Jan 3, 2017

> ```python
> >>> matrix
> array([[ 0.,  1.,  0.,  0.],
>        [ 1.,  0.,  0.,  0.],
>        [ 0.,  0.,  0.,  0.],
>        [ 0.,  0.,  0.,  0.]])
> >>> matrix, dd = graph_laplacian(matrix, normed=False, return_diag=True)
> >>> lambdas, diffusion_map = eigsh(matrix, k=1)
> ```

OK. What happens if you replace the values where dd is zero by ones? What is the value of lambdas there? Spectral diffusion is not defined for such a graph.

### GenTang commented Jan 3, 2017 • edited

@GaelVaroquaux Thanks a lot for your reply. To answer your question, lambdas is array([ 2.]). Of course we can replace the zeros in dd by ones, but that doesn't change anything. My question is not about dd having zeros; it is about why spectral_embedding introduces the parameter norm_laplacian at all.

When we compute a spectral embedding, we try to solve the following equation:

$$(D - W) f = \lambda D f$$

To solve this, we should compute the normalized Laplacian matrix $D^{-\frac{1}{2}}(D - W)D^{-\frac{1}{2}}$, then compute its eigenvectors, and finally recover $f$. However, with norm_laplacian=False, diffusion_map holds the eigenvectors of $D - W$, which means

$$(D - W) \, \text{diffusion\_map} = \lambda \, \text{diffusion\_map}$$

But what would $(D - W) \, D^{-\frac{1}{2}} \, \text{diffusion\_map}$ or $(D - W) \, D^{\frac{1}{2}} \, \text{diffusion\_map}$ be? It seems that neither diffusion_map.T[:n_components] / dd nor diffusion_map.T[:n_components] * dd has any mathematical meaning. (Please correct me if I am wrong.) So I don't understand why scikit-learn introduces the parameter norm_laplacian in spectral_embedding, since spectral_embedding should always work with the normalized Laplacian matrix.
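
To check that numerically, a sketch on an illustrative 3-node graph (not from the thread):

```python
import numpy as np

# With norm_laplacian=False, diffusion_map holds eigenvectors v of L = D - W,
# i.e. L v = lambda v (a standard, not generalized, eigenproblem).
W = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])
d = W.sum(axis=0)
D = np.diag(d)
L = D - W

lambdas, v = np.linalg.eigh(L)
f = v[:, 1] / np.sqrt(d)       # the rescaling the code would apply after the fix

# If f solved (D - W) f = lambda D f, this ratio would be constant; it is not.
print(L.dot(f) / D.dot(f))
```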

### devanshdalal commented Jan 4, 2017 • edited

@GenTang, I believe you are correct regarding norm_laplacian. For a theoretically correct implementation, it should not be an argument of spectral_embedding(), and the call should look like this in the code:

```python
graph_laplacian(adjacency, normed=True, return_diag=True)
```

Correct me if I am wrong. Removing norm_laplacian can be one solution, or we have to provide an implementation where the case norm_laplacian=False is handled properly (I think it can be done as well; any ideas in that direction?).

EDIT: No update in this thread lately. What are your views, guys? @GenTang, @GaelVaroquaux.

### GenTang commented Jan 10, 2017

For me, the second solution would be better, @devanshdalal.

### devanshdalal commented Jan 10, 2017

OK, let me provide a quick patch soon.

### amueller commented Mar 3, 2017

I didn't look into this, but is this by any chance related to #6489?
Added this to the 0.19 milestone Jun 14, 2017