
Robust PCA #88

Closed

jcreinhold opened this issue Dec 9, 2018 · 5 comments
@jcreinhold

Thank you very much for the package; it is great work.

I am trying to use the robust_pca function on a large dataset and the computation is very slow. This issue seems to be referenced in #18, #36, and #23. I initially tried to use the pytorch backend, thinking that the GPU would speed up the computation. However, I see that in all backends, the partial_svd function converts the data into a numpy array and uses the numpy backend (scipy.linalg.svd or scipy.sparse.linalg.eigsh) to calculate the SVD.

My question is two-fold. If I am following the code correctly, it appears that robust_pca will never use the scipy.sparse.linalg.eigsh function. Following the chain of calls starting in robust_pca, we reach a call to svd_thresholding. The minimum of the input matrix's shape is passed as the n_eigenvecs keyword argument (see here). However, partial_svd then checks n_eigenvecs >= min_dim (see here) to decide whether to compute the standard SVD or the partial SVD. Since n_eigenvecs == min_dim (because that is what was passed), the check always passes and the partial SVD is never used.

Is this intentional, or should the >= be a >? If it is intentional, i.e., we want to compute the standard SVD every time, wouldn't it be more computationally efficient to let backends with that capability use GPU-enabled SVD calculations? I'm specifically thinking of pytorch, which does have an SVD implementation.
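For reference, here is a minimal sketch of the dispatch logic described above (a paraphrase, not the exact tensorly code; variable names are illustrative):

```python
import scipy.linalg

def partial_svd_sketch(matrix, n_eigenvecs=None):
    # Simplified paraphrase of the dispatch described above.
    min_dim = min(matrix.shape)
    if n_eigenvecs is None or n_eigenvecs >= min_dim:
        # svd_thresholding passes n_eigenvecs = min(matrix.shape), so with ">="
        # this branch is always taken and the full SVD is computed.
        U, S, V = scipy.linalg.svd(matrix)
        return U[:, :n_eigenvecs], S[:n_eigenvecs], V[:n_eigenvecs, :]
    # With ">" instead, a truncated path (e.g. scipy.sparse.linalg.eigsh)
    # would be reachable from robust_pca; it is omitted in this sketch.
    raise NotImplementedError("truncated path omitted in this sketch")
```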

Thank you very much for the help

@scopatz
Contributor

scopatz commented Dec 10, 2018

I have a fix for some of the issues (specifically the n_eigenvecs one) over at https://github.com/quansight/tensorly/tree/sparse-robust-pca (the sparse-robust-pca branch of the Quansight fork).

It is still quite slow and I have yet to find a sparse data set that is a full working example.

@JeanKossaifi
Member

Generally, RPCA with ADMM is quite slow. Regarding the number of eigenvalues, we do not know it in advance: we'd need to be able to specify a threshold rather than a number of eigenvalues. @scopatz which fix do you mean?
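For illustration, a hypothetical helper along those lines, keeping singular values above a threshold rather than fixing their number in advance (svd_by_threshold is not an existing tensorly function):

```python
import numpy as np
import scipy.linalg

def svd_by_threshold(matrix, threshold):
    # Hypothetical: truncate the SVD by a singular-value threshold instead of
    # a fixed n_eigenvecs.
    U, S, V = scipy.linalg.svd(matrix, full_matrices=False)
    k = int(np.sum(S > threshold))
    return U[:, :k], S[:k], V[:k, :]
```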

Regarding the NumPy vs. PyTorch SVD, I need to find a good way to specify which SVD function to use (@scopatz perhaps we could manage backend-related variables in a BackendManager, in addition to the import mechanism?). @jcreinhold You could try to replace numpy_svd with the pytorch implementation, but I suspect you'd run out of memory.
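Something like the following could serve as a drop-in replacement to experiment with (a sketch only, using the torch.svd API available at the time; not part of tensorly):

```python
import torch

def pytorch_partial_svd(matrix, n_eigenvecs=None):
    # Sketch of a PyTorch-backed SVD with the same (U, S, V) convention as
    # the numpy backend; move `t` to the GPU with .cuda() if desired.
    t = torch.as_tensor(matrix)
    U, S, V = torch.svd(t, some=True)   # torch.svd returns V, not V^T
    if n_eigenvecs is not None:
        U, S, V = U[:, :n_eigenvecs], S[:n_eigenvecs], V[:, :n_eigenvecs]
    return U, S, V.t()                  # transpose to match the numpy convention
```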

By the way, @scopatz I have some examples with RPCA here if you're interested, though I wrote this with an old version of TensorLy.

@jcreinhold
Author

jcreinhold commented Dec 12, 2018

Thank you for the response. I misunderstood how the partial SVD aspect of the code works and now see why calling the normal SVD function is appropriate.

I am running into a memory issue with the robust_pca implementation in tensorly. I am receiving the following error:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-25-045b9432754b> in <module>
      1 start_time = time.time()
----> 2 D, E = robust_pca(tl.tensor(Is,dtype=np.float32), tol=1e-5, reg_J=2, n_iter_max=1000, verbose=True)
      3 print("--- %s seconds ---" % (time.time() - start_time))

~/miniconda3/envs/neurotensor/lib/python3.6/site-packages/tensorly/decomposition/robust_decomposition.py in robust_pca(X, mask, tol, reg_E, reg_J, mu_init, mu_max, learning_rate, n_iter_max, random_state, verbose)
     91 
     92         for i in range(T.ndim(X)):
---> 93             J[i] = fold(svd_thresholding(unfold(D, i) + unfold(L[i], i)/mu, reg_J/mu), i, X.shape)
     94 
     95         D = L_x/mu + X - E

~/miniconda3/envs/neurotensor/lib/python3.6/site-packages/tensorly/tenalg/proximal.py in svd_thresholding(matrix, threshold)
     68     procrustes : procrustes operator
     69     """
---> 70     U, s, V = T.partial_svd(matrix, n_eigenvecs=min(matrix.shape))
     71     return T.dot(U, T.reshape(soft_thresholding(s, threshold), (-1, 1))*V)
     72 

~/miniconda3/envs/neurotensor/lib/python3.6/site-packages/tensorly/backend/numpy_backend.py in partial_svd(matrix, n_eigenvecs)
    210     if n_eigenvecs is None or n_eigenvecs >= min_dim:
    211         # Default on standard SVD
--> 212         U, S, V = scipy.linalg.svd(matrix)
    213         U, S, V = U[:, :n_eigenvecs], S[:n_eigenvecs], V[:n_eigenvecs, :]
    214         return U, S, V

~/miniconda3/envs/neurotensor/lib/python3.6/site-packages/scipy/linalg/decomp_svd.py in svd(a, full_matrices, compute_uv, overwrite_a, check_finite, lapack_driver)
    127     # perform decomposition
    128     u, s, v, info = gesXd(a1, compute_uv=compute_uv, lwork=lwork,
--> 129                           full_matrices=full_matrices, overwrite_a=overwrite_a)
    130 
    131     if info > 0:

MemoryError: 

Note that my matrix is not that large (its shape is [3047562, 10]) and the server I am running this analysis on has 125.80 GB of memory. Furthermore, I was able to use a slightly customized version of this RPCA implementation (see here), which, although slightly different, seems to implement essentially the same algorithm (for 2D) as tensorly. Interestingly, this did not result in a memory error, even after I replaced np.linalg.svd with scipy.linalg.svd.

My versions are numpy 1.15.4, scipy 1.1.0, tensorly 0.4.2, and Python 3.6.7, all installed in a fresh conda environment built for the purpose of testing tensorly. The OS is Linux. I can provide more setup details as necessary.

FWIW, I also implemented the RPCA version in pytorch and got a 2-3x speedup in terms of wall-clock time. I should note that when I used the GPU implementation on pytorch v0.4.1, the output was partially corrupted, but with pytorch v1.0.0 the output matched the CPU version. For reference, my CPU implementation is here and the GPU implementation is here.
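For context, the core singular-value soft-thresholding step of RPCA looks roughly like this in PyTorch (an illustrative sketch, not the linked implementations):

```python
import torch

def svd_soft_threshold(matrix, tau):
    # Illustrative PyTorch version of the SVD soft-thresholding step used in RPCA.
    U, S, V = torch.svd(matrix, some=True)
    S = torch.clamp(S - tau, min=0.0)   # soft-threshold the singular values
    return U @ torch.diag(S) @ V.t()
```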

@JeanKossaifi
Member

I think the issue comes from the fact that the tensor RPCA unfolds the tensor along each mode (determining the rank of a tensor is NP-hard, so, unlike in the matrix case, we cannot ignore one of the dimensions). I guess what you want is to skip one of the modes (and have the Lagrangian only for some of the modes). One way to achieve this is to add a skip_mode argument so that we do not unfold along the specified mode.
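A minimal sketch of that idea, based on the per-mode loop visible in the traceback above (skip_mode and the surrounding variables are assumptions, not an existing tensorly API):

```python
# Hypothetical modification of the per-mode loop inside robust_pca:
for i in range(T.ndim(X)):
    if i == skip_mode:
        continue  # do not unfold / apply the Lagrangian along this mode
    J[i] = fold(svd_thresholding(unfold(D, i) + unfold(L[i], i) / mu, reg_J / mu),
                i, X.shape)
```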

However, note that, in your case, since you are decomposing a matrix, you will not gain anything by using the tensor RPCA as opposed to the traditional matrix robust PCA.

@jcreinhold
Author

Ah, I see. Thank you for the response and great work!
