Rationale behind some of the bits and pieces in the C code #60
I've been writing a C++ port of **irlba**'s C code
(https://github.com/LTLA/CppIrlba) for use in some other applications of
mine. It's been... tough going, but I think I've gotten it right, especially
after cross-referencing against the Baglama and Reichel paper. Not that I
really understand what's going on, but at least I know where the various
steps come from.

That said, there are a few steps that I can't find in the paper - perhaps
you can shed some light on why they exist:

https://github.com/bwlewis/irlba/blob/2ad759c142b1ef43cfdd21065e9ff4db22046256/src/irlb.c#L412-L415

Is this necessary? I understand that Gram-Schmidt orthogonalizes the
(j+1)-th column against the j-th column, but it's followed by a full
reorthogonalization via `orthog()` that does the same thing, so what's the
rationale for this extra step?

https://github.com/bwlewis/irlba/blob/2ad759c142b1ef43cfdd21065e9ff4db22046256/src/irlb.c#L363

I'm guessing that the various `RNORM()` calls are intended to just fill `W`
and `V` with some kind of content when the norm is zero, otherwise we'd end
up with some nonsense (indefinite values or all-zero values). I'd further
guess that the arbitrariness of the values doesn't really matter here
because the associated singular values will be zero after the SVD in the
main IRLBA loop.

I probably have a few other questions that will come to me as I stare at
the code and paper for a while longer. Of course, any of your thoughts on
the current state of the port are also appreciated.

Comments
Hi Aaron,
This is great! I'll try to help review and test your code; I'm not
great with C++ but I can at least help with testing, etc.
On 6/29/21, Aaron Lun ***@***.***> wrote:
> I've been writing a C++ port of **irlba**'s C code
> (https://github.com/LTLA/CppIrlba) for use in some other applications of
> mine. It's been... tough going, but I think I've gotten it right, especially
> after cross-referencing against the Baglama and Reichel paper. Not that I
> really understand what's going on, but at least I know where the various
> steps come from.
>
> That said, there are a few steps that I can't find in the paper - perhaps
> you can shed some light on why they exist:
>
> https://github.com/bwlewis/irlba/blob/2ad759c142b1ef43cfdd21065e9ff4db22046256/src/irlb.c#L412-L415
>
> Is this necessary? I understand that Gram-Schmidt orthogonalizes the
> (j+1)-th column against the j-th column, but it's followed by a full
> reorthogonalization via `orthog()` that does the same thing, so what's the
> rationale for this extra step?
It helps to also compare with the reference R code, where this
reorthogonalization step is optional; see:

https://github.com/bwlewis/irlba/blob/master/R/irlba.R#L601

This is also an optional parameter in Jim and Lothar's original algorithm.
In practice, due to floating point arithmetic, the Krylov basis vectors
lose orthogonality, and a single Gram-Schmidt vector orthogonalization is
not enough. This happens to varying degrees whenever the input matrix is
ill-conditioned.

In the R code it's a user-controlled option. Personally, I think it's one
too many minor parameters for people to deal with, so in the faster C code
I made it mandatory, favoring good results over speed. I recommend just
always re-orthogonalizing the basis in practice (users will thank you).
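
For concreteness, here is a minimal sketch of the two steps under
discussion, written in Eigen-flavored C++ to match the CppIrlba port; the
function and its signature are illustrative assumptions, not code from
either repository:

```cpp
#include <Eigen/Dense>

// Illustrative only: W holds the basis vectors accepted so far in its
// leading j columns, and w is the candidate for column j.
void reorthogonalize(const Eigen::MatrixXd& W, Eigen::VectorXd& w, int j) {
    if (j > 0) {
        // One classical Gram-Schmidt step against the previous column:
        // sufficient in exact arithmetic, but round-off lets the Krylov
        // basis drift from orthogonality on ill-conditioned inputs.
        w -= W.col(j - 1).dot(w) * W.col(j - 1);

        // Full reorthogonalization against all j accepted columns,
        // w <- w - W_j (W_j^T w), which is what orthog() does in irlb.c
        // and which subsumes the single step above.
        w -= W.leftCols(j) * (W.leftCols(j).transpose() * w);
    }
    w.normalize(); // the unit-length result becomes the next basis vector
}
```

In exact arithmetic the single step alone would suffice; once the full pass
is mandatory, the single step has no real job left to do.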
> https://github.com/bwlewis/irlba/blob/2ad759c142b1ef43cfdd21065e9ff4db22046256/src/irlb.c#L363
>
> I'm guessing that the various `RNORM()` calls are intended to just fill `W`
> and `V` with some kind of content when the norm is zero, otherwise we'd end
> up with some nonsense (indefinite values or all-zero values). I'd further
> guess that the arbitrariness of the values doesn't really matter here
> because the associated singular values will be zero after the SVD in the
> main IRLBA loop.
Almost any vector that is not in the null space of the original matrix
will work. Obviously, the best vector(s) to start with are the
solution singular vectors! This is how the current restart mechanism
works (more on that below). But at its heart, IRLBA is a randomized
method, similar to but distinct from the more widely known one by Halko,
Martinsson, Tropp, Rokhlin, etc. The difference is that the Rokhlin
approach iterates one polynomial over a large amount of data, while irlba
iterates over a small amount of data but with many polynomials. See these
old paper review notes for more if you're interested:

http://illposed.net/rsvd.html

The nice thing about a random vector is that it's unlikely to be near
the null space of the input matrix, which is one of the possible error
conditions.
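
To illustrate, here is a hedged sketch of the refill idea in the same
Eigen-flavored C++; the helper name and the zero-norm tolerance are
assumptions for illustration and do not appear in irlb.c:

```cpp
#include <Eigen/Dense>
#include <random>

// Illustrative only: when the next Lanczos vector collapses to
// (numerically) zero norm, refill it with standard-normal draws so the
// recursion can continue.
void refill_if_zero(Eigen::VectorXd& v, std::mt19937& gen,
                    double tol = 1e-12) {
    if (v.norm() > tol) {
        return; // vector is usable as-is
    }
    std::normal_distribution<double> rnorm(0.0, 1.0);
    for (Eigen::Index i = 0; i < v.size(); ++i) {
        v[i] = rnorm(gen); // arbitrary content is fine: a Gaussian vector
                           // is almost surely far from the null space
    }
}
```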
> I probably have a few other questions that will come to me as I stare at
> the code and paper for a while longer. Of course, any of your thoughts on
> the current state of the port are also appreciated.
I will look at the code carefully.

I still have not refactored the package code, sorry. My plan was to
use your approach to replace all the low-level C code. Separately, I
will put up a completely separate C library version that I have
somewhere (similar to your C++ library). I have been completely
side-tracked for like a year by some health problems though.
Best,
Bryan
p.s. there are some differences here and there that improve on Jim's
original work based on years of examples and experience. One thing
that is *not* an improvement is the restart scheme in the current
package, which Jim has complained to me is not as good as it should
be. That restart stuff remains to be re-implemented someday, so be
aware of that.
p.s. would you also consider being a co-maintainer of the original
irlba package, since I am so negligent with updates sometimes? You're
into this code quite a bit and have made many valuable contributions
already...
answers back to front...

> p.s. would you also consider being a co-maintainer of the original irlba
> package, since I am so negligent with updates sometimes? You're into this
> code quite a bit and have made many valuable contributions already...

I'm flattered, but I don't think I have enough understanding of the theory
to (co-)maintain it. I already knew my linear algebra was weak, but gee,
attempting to read the paper really drove it home. I suspect that there is a
decade of study between where I am now and the point at which I could
confidently make improvements/fixes to the code. (For example, there were
many lines in `irlb.c` where I thought, "this can't possibly be right, I'll
complain to Bryan about this"... but it was, so I guess that's me told.)

> That restart stuff remains to be re-implemented someday, so be aware of
> that.

Sure. I recently threw in some GHA tests to compare my implementation
against the R package, which should allow me to keep track of changes on
your end. (Assuming my test scenarios are comprehensive enough to trigger
any changes in the restarts.)

> I have been completely side-tracked for like a year by some health
> problems though.

Ah. Hope all is well.

> Separately, I will put up a completely separate C library version that I
> have somewhere (similar to your C++ library).

I wonder if there is a way for us to combine our implementations so we can
share the maintenance burden for the core logic. The code should compile in
both C and C++; the problem is more about the dependencies during build. I'm
eventually compiling to a system without BLAS or LAPACK, hence the use of
**Eigen** (which makes compilation v-e-r-y slow)... I'd have to think about
that for a bit.

> See these old paper review notes for more if you're interested:

I'm afraid that's several levels too deep for me, though I do appreciate
the description of the distinction from `rsvd()`, having worked on C++
ports of such algorithms. On a related note, if you were to ever write an
"IRLBA for dummies", I would definitely read it.

> I recommend just always re-orthogonalizing the basis in practice

Will do. So if I'm always doing the full reorthogonalization... is there a
need for the Gram-Schmidt step at all? I did some empirical testing where
its removal didn't change the result IIRC, but I wasn't sure whether there
was a deeper purpose to that step.
Sure, no problem about the package; there are some subtle ideas in there.

I like the idea of a write-up of a basic intro to projected/partial
SVDs (IRLBA for dummies). I'll think about that. Might be something
Jim Baglama or Benjamin Erichson would be interested in collaborating
on.
Speaking of rsvd, I looked at the Python randomized_svd method again
(https://scikit-learn.org/stable/modules/generated/sklearn.utils.extmath.randomized_svd.html)
and it still has no sense of convergence: it just runs a fixed number of
iterations and takes what it gets. I find that approach pretty dumb. I
think once a reasonable block Lanczos method is implemented, there would
be little reason to use the Rokhlin method in practice. I think that's
basically what Matlab is doing now.
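
To make the contrast concrete, here is a hedged sketch of a restart-level
stopping rule of the kind being advocated; the helper and its tolerance are
illustrative assumptions, and irlba's actual criterion is based on residual
norms rather than this simple change-in-estimates test:

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

// Illustrative only: compare the singular-value estimates from successive
// restarts and stop once the largest relative change falls below `tol`,
// instead of running a fixed number of iterations regardless of progress.
bool estimates_settled(const Eigen::VectorXd& prev,
                       const Eigen::VectorXd& curr,
                       double tol = 1e-8) {
    if (prev.size() != curr.size()) {
        return false; // no baseline yet, e.g. on the first restart
    }
    double worst = 0.0;
    for (Eigen::Index i = 0; i < curr.size(); ++i) {
        double denom = std::max(std::abs(curr[i]), 1e-300);
        worst = std::max(worst, std::abs(curr[i] - prev[i]) / denom);
    }
    return worst < tol;
}
```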
You're right about the extra single Gram-Schmidt step; it is
superfluous. It came from my reference R implementation, where it is
needed because reorthogonalization is optional. When I decided to
make reorthogonalization mandatory in the C code, I forgot to remove
it!
I have an old C library version I wrote for another project that I'm
cleaning up and will put somewhere for posterity. Maybe it (and I) can
help with your new implementation, but first I'll just get the code
cleaned up a bit and then we'll see.
Best,
Bryan