[WIP] LOBPCG Solver #2772
Conversation
venkywonka commented Aug 28, 2020 (edited)
- This is a WIP initial implementation of the locally optimal block preconditioned conjugate gradient (LOBPCG) solver in CuPy
- Used to solve for the largest/smallest k eigenpairs of a generalized eigenvalue problem (more info)
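Since this PR mirrors SciPy's interface, the intended usage can be sketched against `scipy.sparse.linalg.lobpcg` (the CuPy version in this PR is assumed to have the same shape of API, with `cupy` arrays in place of `numpy` ones):

```python
# Reference usage of SciPy's lobpcg, which this PR ports to CuPy:
# find the largest k eigenpairs of the standard problem A x = lambda x.
import numpy as np
from scipy.sparse.linalg import lobpcg

rng = np.random.default_rng(0)
n, k = 100, 3
A = rng.random((n, n))
A = A + A.T + n * np.eye(n)    # symmetrize and shift so A is SPD
X = rng.random((n, k))         # initial guess for the block of eigenvectors
eigvals, eigvecs = lobpcg(A, X, largest=True, maxiter=100, tol=1e-8)
print(eigvals.shape, eigvecs.shape)  # (3,) (100, 3)
```

For the generalized problem A x = lambda B x, a second SPD matrix is passed as `B`.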
Can one of the admins verify this patch?
add to whitelist
Please update the changelog in order to start CI tests. View the gpuCI docs here.
@venkywonka, I’m leaving a note to let you know that I have been reviewing this. Can you provide some tests for your implementation as well? Pytests against scipy would be great. Are we planning to also build a prim in RAFT for this? Do you guys think there’s a lot of performance to be gained by going closer to CUDA?
Sure, thanks @cjnolet, will update the tests soon... From what I know, CuPy is not supported in RAFT, so @teju85 asked me to implement it first in cuML. Not too sure by how much there will be a performance improvement moving to CUDA C/C++, but there should definitely be some...
Hi @cjnolet,
@teju85 I think a call is a good idea, but I think it will be a better discussion after we have prototyped a TSVD/PCA implementation with it in Python so we can better measure end-to-end performance. What do you think?
* tests for blopex lobpcg (the algorithm used in scipy's implementation) comparing with scipy's implementation
* updated CHANGELOG.md
* updated init.py in `cuml/python/cuml/`
rerun tests
rerun tests
Tagging @vinaydes to help review this PR, once our 0.16 deliverables are done.
@venkywonka @teju85, @anaruse has been building implementations of scipy.sparse eigsh and svds for cupy. Since lobpcg is also an option in svds, what do you say we contribute this implementation directly to cupy? Ref: cupy/cupy#4155
My apologies, I did not mean to close this PR.
rerun tests
@cjnolet overall I think it is a good idea to contribute this to the cupy repo instead. The only unknown here is what would happen if we later want to move this inside raft as a C++ prim. Will we then have a cupy-based lobpcg and a raft-based lobpcg solver duplicated?
@teju85, ideally, if we ended up with an LOBPCG prim inside RAFT then both cuml and cupy could use that instead. I've been hoping that at some point (when RAFT is mature enough) we can have Cupy just use RAFT for cases like this. There are some graph-based prims that could be used in cupy as well. This would allow us to share the same prims across multiple projects in the ecosystem, which is the goal of RAFT. What do you think?
@venkywonka, I'm continuing the discussion from cupy/cupy#2359 (comment) here. One of the requirements we have for sparse PCA (from the links provided in the referenced issue) is that we need the ability to mean center very sparse matrices but cannot just subtract the mean directly because doing so would convert the sparse matrix into a dense matrix. The "implicit mean centering trick" performs the mean centering in the optimizer using closures (the
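The trick described above can be sketched with SciPy (since CuPy lacked `LinearOperator` at the time); this is an illustrative closure, not code from the PR. The centered Gram matrix (X − 1μᵀ)ᵀ(X − 1μᵀ) is applied lazily inside `matvec`, so the sparse X is never densified:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator

rng = np.random.default_rng(0)
X = sp.random(200, 50, density=0.05, random_state=0, format="csr")
n = X.shape[0]
mu = np.asarray(X.mean(axis=0)).ravel()    # column means: small, dense vector

def centered_gram_matvec(v):
    # (X - 1 mu^T)^T (X - 1 mu^T) v  simplifies to  X^T (X v) - n mu (mu . v),
    # because X^T 1 = n mu; X itself stays sparse throughout.
    return X.T @ (X @ v) - n * mu * (mu @ v)

op = LinearOperator((50, 50), matvec=centered_gram_matvec, dtype=np.float64)

# Check against explicit (densifying) centering, for illustration only.
v = rng.random(50)
dense = X.toarray() - mu
print(np.allclose(op.matvec(v), dense.T @ (dense @ v)))  # True
```

An operator like `op` could then be handed to an eigensolver such as `lobpcg` in place of a materialized centered matrix.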
I'm really excited to contribute this implementation to CuPy and have it available for cuML! As mentioned in the review, my primary concern is that we're able to provide a lazy custom linear operator so that we can implement PCA's implicit mean centering.
I sincerely apologize that it took a little while to get to this review, but I've done a first pass for cuML's needs (and initial prep for the CuPy contribution).
""" | ||
|
||
|
||
def is_sparse(A): |
Would cupyx.scipy.sparse.isspmatrix work for this?
Yes, it should, @cjnolet. This is a helper script exclusively for robust_lobpcg (the algorithm used in PyTorch); it should be pruned along with robust_lobpcg for now, while our focus is on supporting SciPy's implementation.
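For reference, the suggested library call behaves as follows (sketched here with `scipy.sparse.isspmatrix`, the CPU analogue of `cupyx.scipy.sparse.isspmatrix`):

```python
import numpy as np
import scipy.sparse as sp

# isspmatrix distinguishes sparse matrices from dense arrays, which is
# exactly what the hand-rolled is_sparse helper does.
print(sp.isspmatrix(sp.csr_matrix((3, 3))))  # True
print(sp.isspmatrix(np.eye(3)))              # False
```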
arr_cols = cp.array([])
for col in row:  # col is a cupy ndarray
    if col.ndim == 0:
        col = cp.array([col])
I think we might be able to achieve this with cp.newaxis.
will update accordingly 👍
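The reviewer's suggestion, sketched with NumPy (`cp.newaxis` is the same object as `np.newaxis`): indexing a 0-d array with `newaxis` promotes it to 1-d without re-wrapping it in a new array.

```python
import numpy as np

col = np.array(3.14)        # a 0-d array, like `col` in the loop above
as_1d = col[np.newaxis]     # shape (1,) view, replacing cp.array([col])
print(as_1d.shape)          # (1,)
```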
cp_error_flag = False
scipy_error_flag = False

A = np.random.rand(n, n)
This solver is of particular interest to us in cuML because of 1) its ability to find the eigenvalues when given a sparse input, and 2) its ability to accept a custom linear operator that we can use to perform mean centering at the vector level. It's important that we test sparse inputs here as well.
The thing is, in SciPy, LinearOperator abstracts sparsity/density away from the operations performed. I will add tests for sparse inputs as soon as I add support for LinearOperator (and, subsequently, sparsity).
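The abstraction being referred to can be shown with SciPy's `aslinearoperator`: dense and sparse inputs get the same `matvec` interface, so the solver body never branches on sparsity.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import aslinearoperator

rng = np.random.default_rng(1)
dense = rng.random((4, 4))
sparse = sp.csr_matrix(dense)
v = rng.random(4)

# Both operators expose the same matvec, regardless of the backing storage.
op_dense = aslinearoperator(dense)
op_sparse = aslinearoperator(sparse)
print(np.allclose(op_dense.matvec(v), op_sparse.matvec(v)))  # True
```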
# TODO: debug and run tests for methods "ortho" and "basic"
# currently they aren't that accurate
@pytest.mark.parametrize("method", [
    # "ortho",
What do you think about pulling robust_lobpcg out of this initial implementation to start, then adding it back in when it is able to be tested?
Yes, I was thinking the same. Since it preceded blopex_lobpcg, I had it dangling around. Will clean it up! 👍
])
@pytest.mark.parametrize("maxiter", [
    20,
    # 200,  # not practically required
Are these values just taking too long to run? It would be helpful to try a couple of different values here, just to verify correctness. I could understand that something less than 20 might lower the chances of convergence, but perhaps add a couple of values >20, though less than an order of magnitude larger?
Yes, in fact they were... SciPy sets its cap on maxiter (if unspecified) to 20 (see this line). That was the basis on which I decided that higher orders of magnitude were unnecessary.
But yes, you're definitely right; will add a few more values >20 👍
print('order-updated lambda:', _lambda)

# # Normalize eigenvectors!
# aux = cp.sum( eigBlockVector.conj() * eigBlockVector, 0 )
Are these no longer needed?
That's an artifact from the original SciPy code 😅 They aren't needed, I suppose; for some reason SciPy keeps them in its code base. Will prune them! 👍
def _applyConstraints(blockVectorV, YBY, blockVectorBY, blockVectorY):
    """Changes blockVectorV in place."""
    YBV = cp.dot(blockVectorBY.T.conj(), blockVectorV)
    tmp = solve(YBY, YBV)
@daxiongshu is in the process of contributing cho_solve to cupy, if it helps here: cupy/cupy#4172
Thanks @cjnolet! That should definitely help 👍
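The suggested swap can be sketched with SciPy's `cho_factor`/`cho_solve` (the CuPy version was still in progress at this point): since `YBY` is a symmetric positive definite Gram matrix, a Cholesky-based solve can replace the general-purpose `solve(YBY, YBV)`. The matrices below are made up for illustration.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve, solve

rng = np.random.default_rng(2)
Y = rng.random((20, 4))
YBY = Y.T @ Y + 1e-3 * np.eye(4)   # SPD, like YBY in _applyConstraints
YBV = rng.random((4, 3))

c_and_low = cho_factor(YBY)        # Cholesky factorization, done once
tmp_chol = cho_solve(c_and_low, YBV)
tmp_lu = solve(YBY, YBV)           # the current general-purpose solve
print(np.allclose(tmp_chol, tmp_lu))  # True
```

The factorization can also be reused across multiple right-hand sides, which is where the speedup over repeated `solve` calls comes from.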
aux.shape = (ar.shape[0], 1)
return aux
Are we able to include and use _makeoperator in here as well? It looks like this version follows the SciPy version pretty closely.
Yes Corey, we will basically have to implement this in CuPy. (This will also take us a step closer to completing this comparison table!)
At the time of writing it, sparsity was not as much of a priority as the feature itself, so I was prioritizing functionality for dense input. I think it's time I prioritize sparsity ✌️
definite (SPD) generalized eigenproblems.

Parameters
----------
A : {cupy.ndarray}
We are able to support cupyx.scipy.sparse arrays here too, right?
Technically yes, but LinearOperator abstracts this out and enables matmuls in SciPy. The same cannot be done in CuPy (as of now), so one has to densify and then do a matmul. Once LinearOperator support is added (which is next on the agenda), cupyx.scipy.sparse arrays will be supported automatically.
    # 2000
])
@pytest.mark.parametrize("isB", [True, False])
def test_lobpcg(dtype, k, n, method, largest, maxiter, isB):
Since the blopex implementation follows the SciPy implementation so closely, it would be helpful to also include many of their tests directly: https://github.com/scipy/scipy/blob/v1.5.3/scipy/sparse/linalg/eigen/lobpcg/tests/test_lobpcg.py. This will also help verify not only that the implementation provides the proper eigenvalues, but that all the potential inputs we support are handled properly.
Thank you @cjnolet , those are some very elaborate tests that are definitely missing here! will add them 👍
I like this idea, @cjnolet. I also agree that this primitive is better off in cupy-land for now. Let's continue with our cupy-based implementation, and in case we need more perf and/or we need this operation in cuml-C++, then think about moving it to raft and updating cupy to use raft.
Initially, when Venkat started, we did have plans to support
Closing this PR because this feature is being implemented directly in cupy.