-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
initial svds support based on eigs #9425
Conversation
My sense is that truncated SVD should be an option to the |
In fact, I'm not even sure a separate |
Sure, I've no preference on the structuring/naming. Just was currently missing the ability to run Arnoldi (ARPACK) methods to extract first singular vectors of large matrices that have sparse or special structure. |
We do need a different name though since svd on sparse matrices will do the full svd when we have it. We could have a named argument, but the interface adds too many options for my liking to combine it all in svd. |
@jaak-s could you add the documentation as well? |
Cc @Jutho |
Also remove the trailing whitespace from the comment in the tests |
Should the trailing whitespace be discussed in CONTRIBUTING.md along with configuration options for emacs/vim and in git? For example, I now have this in my .emacs thanks to some help in one of the issues:
|
We should have this conversation in a separate issue, but yes I think we should move it to several places, and not do the check in Travis. Code style nitpicking should not prevent running the build and tests. I think we should pick an arbitrary throwaway CI competitor to Travis, like Drone or CircleCI or Shippable, and do the whitespace check there as a separate status indicator. Or put together something custom on Heroku or whatever, but using an existing service seems easier. |
BTW, the bikeshedding should not block this PR. Let's get the functionality in along with tests and docs. The naming issue applies to |
Looks good to me. The tests are only ok for input matrices with real numbers (which is the case for the input matrix |
## svds | ||
|
||
type SvdX <: AbstractArray{Float64, 2} | ||
X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd like to parametrize this type on the input element and matrix type (and use capital letters), i.e. something like
type SVDOperator{T,S}
data::S
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addition, we'll have to support more than Float64
and the type should reflect that. The type can probably not be subtype of AbstractMatrix
either, as the input might not be a subtype of AbstractMatrix
.
@andreasnoack Allocation-free version of eigs is discussed in #7907. |
Octave takes the approach of using |
@ViralBShah I'm not sure about how to handle the cleanup in general. The |
Pushed few commits: manual for The comments from @andreasnoack are spot on. Regarding type information I don't have a good solution yet because the SVDOperator is only an internal type and not visible to users. It seems some additional tricks are needed. Currently, I have not checked how to do post-processing for |
I don't think it is a limitation to require that the operator type has a parameter for the element type. The linear operators in Base (the matrices) have and I think it would feel natural to have an element type for user defined linear operators as well. If we assume that then we already have the element type in the operator type. We'd also like to have that in order to have fast dispatch for
Not sure, but maybe it is discussed somewhere in the ARPACK manual, but that is not the most joyful read. Let's browse the literature and see what we can find. |
I don't think the ARPACK manual sheds light on that, and even if it does, it will be impossible to find it. IIRC, the only place where I see discussion of svd is in the examples, and they do |
Here is what octave does. I suggest that @jaak-s not look at the code, since it is GPL, and I am looking at it and describing what it does. For others look for Octave calls
The norm that they use is the largest singular value (if the sigma input variable is a character) or Other things they do are scaling the input by the largest value (1-norm) to make things a bit more stable. If sigma is specified, that also needs scaling. If sigma is 0, then they compute twice as many eigenvalues. Quote:
|
@alanedelman What are your thoughts here on doing |
I will soon add type information to @ViralBShah Thank you for information. It is now quite clear how to extract largest singular values from |
Bump @jaak-s |
Thanks for the bump, got busy during the holidays. I'll try to put patches together in the next couple of days. I tried out the |
Now Main changes:
|
end | ||
|
||
function svds{S}(X::S; nsv::Int = 6, ritzvec::Bool = true, tol::Float64 = 0.0, maxiter::Int = 1000) | ||
otype = getOperatorType(X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just simplify this to use just eltype(X)
and avoid getOperatorType()
altogether. The checks for whether eltype() is BlasFloat should ideally be in eigs
. It should be possible to make SVDOperator
free of types then.
For good performance, we should be using |
ex = eigs(SVDOperator{otype,S}(X), I; ritzvec = ritzvec, nev = 2*nsv, tol = tol, maxiter = maxiter) | ||
ind = [1:2:nsv*2] | ||
sval = abs(ex[1][ind]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
travis doesn't like the trailing whitespace here or on line 151
We also need to filter out the singular values that are zero or close to zero. |
The reason for introducing If you wish I can remove White-space should be now fixed. |
Regarding removing close to zero singular values, I would prefer the current behavior. I think users like to get N values when they set |
I agree the integer matrix support is good to have. I suggest that instead of |
Sounds good, I will replace |
About the filtering out of singular values that are 0 - octave does do it. But, we can do it once we get this basic piece of code checked in - once you remove getOperatorType. |
I wonder what singular values does Octave remove because they return 0-valued singular values (version 3.8.1): octave:1> svds( zeros(4,3), 2 )
ans =
0
0 |
This was in the octave source that I quoted above. Of course there are legitimate zero singular values. Reading this carefully, they are doing a filter on negative values, and perhaps something in the case where sigma is specified. I will look deeper.
|
Travis failure on linux appears unrelated. |
initial svds support based on eigs
I think this is a great start with |
Added to NEWS in 47061a5 |
Thank you. I forgot to update the docs earlier but now pushed a commit that reflects the latest version of |
It looks like your last commit here is still 8 days ago. Could you create a new PR? Also, we should have squashed these commits, but it escaped my mind before merging. |
Created PR #9723 for docs. |
Added support for sparse truncated SVD based on
eigs
. Keyword inputs are the same as foreigs
. Returns left singular vectors, singular values, right singular vectors (and also outputs from eigs). Added a test case.