
Lanczos algorithm for symmetric/Hermitian matrices #370

Merged: 6 commits into master on May 30, 2018
Conversation

MSeeker1340
Contributor

For symmetric/Hermitian A, we know that Arnoldi iteration will produce an H that is also real symmetric. The Lanczos algorithm improves upon Arnoldi by not calculating the zero elements of the upper part of H, thus saving time.
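The saving comes straight from the three-term recurrence: for Hermitian A, each new basis vector only needs to be orthogonalized against the previous two, rather than against all earlier ones as in Arnoldi. A minimal illustrative sketch in Python/NumPy (not the package code; the `lanczos` function and its signature are made up here):

```python
import numpy as np

def lanczos(A, b, m):
    """Lanczos iteration for symmetric A: builds an orthonormal basis V
    and the diagonal (alpha) / off-diagonal (beta) of a symmetric
    tridiagonal T with V.T @ A @ V = T. Unlike Arnoldi, no entries
    above the first superdiagonal are ever computed."""
    n = len(b)
    V = np.zeros((n, m))
    alpha = np.zeros(m)        # diagonal of T
    beta = np.zeros(m - 1)     # off-diagonal of T
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]          # orthogonalize against v_j
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]  # ... and against v_{j-1} only
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    return V, alpha, beta
```

(A production version would add a breakdown check on `beta[j]` and possibly reorthogonalization; this sketch keeps only the core recurrence.)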

using OrdinaryDiffEq: KrylovSubspace, arnoldi!
n = 1000
m = 50
A = full(Hermitian(randn(n, n)))
Aperm = A + 1e-10 * randn(n, n) # no longer Hermitian
@show ishermitian(A)
@show ishermitian(Aperm);
ishermitian(A) = true
ishermitian(Aperm) = false
b = randn(n)
cache = similar(b)
Ks = KrylovSubspace{Float64}(n, m);
@time arnoldi!(Ks, A, b; m=m, cache=cache); # uses lanczos!
  0.043768 seconds (108 allocations: 13.016 KiB)
@time arnoldi!(Ks, Aperm, b; m=m, cache=cache);
  0.046262 seconds (2.61 k allocations: 130.094 KiB)

Note that the difference in time is not big: Arnoldi/Lanczos iteration is, after all, dominated by the A_mul_B! matrix-vector products. What Lanczos saves is just a handful of dot products and axpy! updates per step.

Nevertheless, it's still valuable to have lanczos! around because it guarantees that the resulting Ks[:H] is real symmetric, which can speed up later calculations. For example, expm is able to use diagonalization for Hermitian matrices:

@time expm(A);
  0.428920 seconds (36 allocations: 53.774 MiB, 2.35% gc time)
@time expm(Aperm);
  0.805784 seconds (103 allocations: 343.342 MiB, 14.75% gc time)
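The Hermitian shortcut is just diagonalization: A = Q diag(λ) Qᴴ gives exp(A) = Q diag(exp(λ)) Qᴴ, whereas a general matrix needs something like scaling-and-squaring with Padé approximants. An illustrative Python/NumPy sketch of the idea (not Base Julia's implementation; the function name is made up):

```python
import numpy as np

def expm_hermitian(A):
    """exp(A) via eigendecomposition, valid only for Hermitian A:
    A = Q diag(w) Q^H  =>  exp(A) = Q diag(exp(w)) Q^H."""
    w, Q = np.linalg.eigh(A)            # real eigenvalues, unitary Q
    return (Q * np.exp(w)) @ Q.conj().T # scale columns, then recombine
```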

@MSeeker1340
Contributor Author

One aspect of H for Hermitian A that I haven't touched upon is the fact that it's tridiagonal. I can modify KrylovSubspace to have Ks[:H] be a Tridiagonal in this case. However I don't see how this would be useful in expRK algorithms.

@ChrisRackauckas
Member

Let's please fix up the interface in this PR. Ks[:H] is an anti-pattern which was discarded in Julia v0.7, and on v0.6 it gives type-stability issues in many cases (here it might be fine?). Instead, just do Ks.H and remove that getindex function. On v0.7 you can overload getproperty even when the field doesn't literally exist, so it's fine as an interface.

@ChrisRackauckas
Member

I can modify KrylovSubspace to have Ks[:H] be a Tridiagonal in this case. However I don't see how this would be useful in expRK algorithms.

That would change one matrix multiply to O(n) instead of O(n^3) via the dense multiply and also save some memory (which is arguably more important for PDEs). I would do this if it isn't difficult.
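To see the structure-exploiting saving concretely: a symmetric tridiagonal multiply only touches the diagonal and the off-diagonal, so a matvec with the m x m matrix costs O(m) instead of O(m^2) dense (and stores 2m-1 numbers instead of m^2). An illustrative Python/NumPy sketch (function name is hypothetical):

```python
import numpy as np

def tridiag_matvec(alpha, beta, x):
    """y = T x for symmetric tridiagonal T given by its diagonal alpha
    (length m) and off-diagonal beta (length m-1): O(m) work, versus
    O(m^2) for a dense m x m multiply."""
    y = alpha * x            # diagonal contribution
    y[:-1] += beta * x[1:]   # superdiagonal contribution
    y[1:] += beta * x[:-1]   # subdiagonal contribution
    return y
```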

@coveralls

coveralls commented May 29, 2018


Coverage increased (+5.3%) to 83.678% when pulling 0e3eec7 on MSeeker1340:lanczos into 5896b10 on JuliaDiffEq:master.

@MSeeker1340
Contributor Author

Ks[:H] is an anti-pattern which was discarded in Julia v0.7 and on v0.6 it gives type-stability issues in many cases (here might be fine?).

Oh, I wasn't aware of this. I was reading the docs on matrix factorizations (e.g. eigfact and qrfact) and noticed this pattern, so I just mimicked the behavior here. It seems they are deprecated in v0.7:

https://docs.julialang.org/en/latest/NEWS.html#Deprecated-or-removed-1

The lesson for me then is I should start reading the v0.7 docs and pay attention to the changes :P

Speaking of which, I need to remind myself that expm(!) is renamed to exp(!) in v0.7. I should probably also drop the m in my functions to reflect this style change.

@ChrisRackauckas
Member

Probably should also drop the m in my functions to reflect this style change.

Yes.

@codecov

codecov bot commented May 29, 2018

Codecov Report

Merging #370 into master will increase coverage by 0.03%.
The diff coverage is 90%.


@@            Coverage Diff             @@
##           master     #370      +/-   ##
==========================================
+ Coverage   83.63%   83.67%   +0.03%     
==========================================
  Files          75       75              
  Lines       21589    21615      +26     
==========================================
+ Hits        18057    18087      +30     
+ Misses       3532     3528       -4
Impacted Files Coverage Δ
src/perform_step/exponential_rk_perform_step.jl 100% <100%> (ø) ⬆️
src/caches/linear_nonlinear_caches.jl 87.83% <100%> (ø) ⬆️
src/exponential_utils.jl 86.84% <87.93%> (+5.88%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@MSeeker1340
Contributor Author

MSeeker1340 commented May 29, 2018

Initially I just changed all Ks[:H] calls to @view(Ks.H[1:Ks.m, 1:Ks.m]), but then I realized an access method would still be necessary.

The reason is that in order to optimize lanczos!, I need to modify KrylovSubspace to specifically handle the case of Hermitian inputs. The simplest solution, for example, is to just let Ks.H be a SymTridiagonal in this case. However here @view(Ks.H[1:m, 1:m]) is no longer SymTridiagonal. So we indeed need elaborate access methods.

And of course getV and getH can be replaced by getproperty in v0.7 for a cleaner interface.
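One way to picture the structure-preserving accessor idea (a purely hypothetical Python sketch, not the Julia interface): when the operator is Hermitian, store H as its diagonal/off-diagonal pair, and have getH slice both vectors so the leading block keeps its symmetric tridiagonal character instead of decaying to a dense view:

```python
import numpy as np

class KrylovSubspaceSketch:
    """Hypothetical sketch: H stored as (alpha, beta) in the Hermitian
    case; getH returns the leading j x j principal submatrix in the
    same compact symmetric-tridiagonal representation."""
    def __init__(self, alpha, beta):
        self.alpha = alpha  # diagonal of H
        self.beta = beta    # off-diagonal of H
    def getH(self, j):
        # slicing the two vectors preserves the tridiagonal structure,
        # which a generic dense view of H[1:j, 1:j] would lose
        return self.alpha[:j], self.beta[:j - 1]
```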

@ChrisRackauckas
Member

So we indeed need elaborate access methods. And of course getV and getH can be replaced by getproperty in v0.7 for a cleaner interface.

Yup sounds good.

@MSeeker1340
Contributor Author

Performance comparison:

using OrdinaryDiffEq: KrylovSubspace, getH, arnoldi!, expv!
n = 1000
m = 50
srand(0)
A = full(Hermitian(randn(n, n)))
Aperm = A + 1e-10 * randn(n, n) # No longer Hermitian
b = randn(n)
t = 1e-3
Ks = KrylovSubspace{Float64}(n, m)
Ksperm = KrylovSubspace{Float64}(n, m)
w = similar(b)
wperm = similar(b)
arnoldi_cache = similar(b)
expv_cache = zeros(m, m);

Arnoldi/Lanczos iteration step:

@time arnoldi!(Ks, A, b; m=m, cache=arnoldi_cache);
  0.060559 seconds (110 allocations: 13.141 KiB)
@time arnoldi!(Ksperm, Aperm, b; m=m, cache=arnoldi_cache);
  0.067272 seconds (2.61 k allocations: 130.219 KiB)
ishermitian(getH(Ks))
true
ishermitian(getH(Ksperm))
false

expv! step

@time expv!(w, t, Ks, cache=expv_cache);
  0.000516 seconds (78 allocations: 38.313 KiB)
@time expv!(wperm, t, Ksperm, cache=expv_cache);
  0.012546 seconds (55 allocations: 355.344 KiB)
w ≈ wperm
true

Again, in the Arnoldi/Lanczos step the difference is small because Lanczos only saves some O(n) operations while the bottleneck lies in the A_mul_B! of the linear operator. (Come to think of it, the difference should be significant for sparse linear operators whose A_mul_B! is also O(n), right?) For expv!, because the SymTridiagonal matrix produced by Lanczos has a very efficient eigendecomposition, we see very substantial gains here.
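The expv! gain comes from the final small-matrix step: with T the m x m symmetric tridiagonal matrix and beta0 = ||b||, w = beta0 * V * exp(t*T) * e1, and exp(t*T) can go through the cheap symmetric eigendecomposition of T rather than a general dense expm. A hedged Python/NumPy sketch of just that step (names and signature are made up, not the exponential_utils.jl API):

```python
import numpy as np

def expv_from_lanczos(t, V, alpha, beta, beta0):
    """Approximate exp(t*A) @ b from a Lanczos decomposition:
    w = beta0 * V @ expm(t*T) @ e1, where T is the small symmetric
    tridiagonal matrix with diagonal alpha and off-diagonal beta,
    and beta0 = ||b||. exp(t*T) is formed via eigh, which is cheap
    because T is small and symmetric."""
    m = len(alpha)
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    w, Q = np.linalg.eigh(T)
    e1 = np.zeros(m)
    e1[0] = 1.0
    # exp(t*T) @ e1 = Q @ (exp(t*w) * (Q.T @ e1)), never forming exp(t*T)
    return beta0 * V @ (Q @ (np.exp(t * w) * (Q.T @ e1)))
```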

@ChrisRackauckas
Member

Come to think of it, the difference will be significant for sparse linear operators whose A_mul_B! is also of order O(n) right?

Most likely, yes. Also large BandedMatrix, TridiagonalMatrix, etc. types.

@ChrisRackauckas ChrisRackauckas merged commit 4d46d4b into SciML:master May 30, 2018
@MSeeker1340 MSeeker1340 deleted the lanczos branch May 31, 2018 20:57