# Symmetric Eigenvalue Decomposition - Lanczos Method


If the matrix $A$ is large and sparse and/or if only some
eigenvalues and their eigenvectors are desired, iterative methods are
the methods of choice. For example, the power method can be useful to compute the
eigenvalue with the largest modulus. The basic
operation in the power method is matrix-vector multiplication, and this can be
performed very fast if $A$ is sparse. Moreover, $A$ need not be stored in the
computer -- the input for the algorithm can be just a function which,
given some vector $x$, returns the product $Ax$.
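A minimal sketch of the matrix-free idea follows. The function name `powermethod`, the stopping test, and the test operator $A=\mathop{\mathrm{diag}}(d)+ee^T$ are illustrative assumptions, not part of the original notes; the operator is applied through a function and the matrix is never formed.

```julia
using LinearAlgebra

# Power method for a symmetric operator given only as a function mul(x) = A*x.
# Returns the Rayleigh quotient, the normalized vector and the iteration count.
function powermethod(mul, x0; tol=1e-10, maxiter=1000)
    x = x0/norm(x0)
    λ = zero(eltype(x))
    for iter = 1:maxiter
        y = mul(x)
        λ = x⋅y                          # Rayleigh quotient (x has unit norm)
        if norm(y - λ*x) <= tol*abs(λ)   # relative residual test
            return λ, x, iter
        end
        x = y/norm(y)
    end
    return λ, x, maxiter
end

# Hypothetical operator: A = Diagonal(d) + ones(n,n), applied without forming A
n = 100
d = collect(1:n)/n
mul(x) = d.*x .+ sum(x)

λ1, x1, iters = powermethod(mul, ones(n))
```

The dominant eigenvalue of this operator is well separated from the rest of the spectrum, so only a few iterations should be needed; when the two eigenvalues of largest modulus are close, the power method converges slowly.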

An _improved_ version of the power method, which efficiently computes
some eigenvalues (either largest in modulus or near some target value $\mu$)
and the corresponding eigenvectors, is the Lanczos method.

For more details, see 
[I. Slapničar, Symmetric Matrix Eigenvalue Techniques][Hog14] and the references therein.

[Hog14]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 55.1-55.25, CRC Press, Boca Raton, 2014."


## Prerequisites

The reader should be familiar with concepts of eigenvalues and eigenvectors, related perturbation theory, and algorithms. 

 
## Competences 

The reader should be able to recognise matrices which warrant use of the Lanczos method, to apply 
the method and to assess the accuracy of the solution.

## Lanczos method

$A$ is a real symmetric matrix of order $n$.

### Definitions

Given a nonzero vector $x$ and an index $k<n$, the __Krylov matrix__ is defined as

$$
K_k=\begin{bmatrix} x & Ax & A^2 x &\cdots & A^{k-1}x \end{bmatrix}.
$$

The __Krylov subspace__ is the subspace spanned by the columns of $K_k$.
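The definition can be tried out on a small example. The sketch below builds $K_k$ explicitly, orthonormalizes its columns by a QR factorization, and projects $A$ onto the resulting basis; the projected matrix turns out to be tridiagonal up to rounding errors, which is the observation behind the Lanczos method. The helper `krylov`, the matrix `A` and the vector `x` are illustrative assumptions. Forming $K_k$ explicitly is only reasonable for small $k$, since its columns quickly become nearly parallel.

```julia
using LinearAlgebra

# Build the Krylov matrix K_k = [x Ax A^2x ... A^(k-1)x] column by column.
function krylov(A, x, k)
    K = zeros(length(x), k)
    K[:,1] = x
    for j = 2:k
        K[:,j] = A*K[:,j-1]
    end
    K
end

n, k = 8, 4
A = [cos(i*j) for i=1:n, j=1:n]   # illustrative symmetric matrix
x = ones(n)
K = krylov(A, x, k)
X = Matrix(qr(K).Q)               # thin Q: orthonormal basis of the Krylov subspace
T = X'*A*X                        # projection of A onto the Krylov subspace
# Size of the entries of T outside the three central diagonals
offband = norm(triu(T,2)) + norm(tril(T,-2))
```

Here `offband` should be at the level of rounding errors, while the three central diagonals of `T` carry the information about $A$.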

### Facts

1. The Lanczos method is based on the following observation. If $K_k=XR$ is the
  $QR$ factorization of the matrix $K_k$, 
  then the $k\times k$ matrix $T=X^T A X$ is tridiagonal. The matrices $X$ and
  $T$ can be computed by using only matrix-vector products in $O(kn)$
  operations.

2. Let $T=Q\Lambda Q^T$ be the EVD of $T$, where $\Lambda=\mathop{\mathrm{diag}}(\lambda_1,\ldots,\lambda_k)$. 
  Then the values $\lambda_i$ approximate well some of the largest 
  and smallest eigenvalues of $A$, and the columns of the matrix $U=XQ$ approximate the corresponding
  eigenvectors.

3. As $k$ increases, the largest (smallest) eigenvalues of the matrix
  $T_{1:k,1:k}$ converge towards some of the largest (smallest) eigenvalues of $A$ (due to
  the Cauchy interlace property). The algorithm can be redesigned to compute
  only largest or smallest eigenvalues. Also, by using shift and invert
  strategy, the method can be used to compute eigenvalues near some specified
  value. In order to obtain better approximations, $k$ should be greater than
  the number of required eigenvalues. On the other hand, in order to obtain
  better accuracy and efficiency, $k$ should be as small as possible.

4. The last computed element, $\mu=T_{k+1,k}$, provides
  information about accuracy:
  \begin{align*}
  \|AU-U\Lambda\|_2&=\mu, \\
  \|AU_{:,i}-\lambda_i U_{:,i}\|_2&=\mu |Q_{ki}|, \quad  i=1,\ldots,k.
  \end{align*}
  Further, there are $k$
  eigenvalues $\tilde\lambda_1,\ldots,\tilde\lambda_k$ of $A$ such that
  $|\lambda_i-\tilde\lambda_i|\leq \mu$, and for the corresponding eigenvectors, we
  have $$\sin2\Theta(U_{:,i},\tilde U_{:,i}) \leq \frac{2\mu}{\min_{j\neq i} 
|\lambda_i-\tilde \lambda_j|}.$$ 

5. In practical implementations, $\mu$ is usually used to determine the index $k$. 

6. The Lanczos method has inherent
  numerical instability in floating-point arithmetic: since the Krylov vectors are, in fact,
  generated by the power method, they converge towards an eigenvector of $A$. 
  Thus, as $k$ increases, the Krylov vectors become more and more parallel, the recursion in the 
  function `myLanczos()` becomes numerically unstable, and the computed columns of $X$
  cease to be sufficiently orthogonal. This affects both the convergence and
  the accuracy of the algorithm. For example, several eigenvalues of $T$ may
  converge towards a simple eigenvalue of $A$ (the so-called
  _ghost eigenvalues_).

7. The loss of orthogonality is dealt with by using the __full
  reorthogonalization__ procedure: in each step, the new $z$ is orthogonalized against all
  previous columns of $X$, that is, in the function `myLanczos()`, the formula 
  ```
  z=z-dv[i]*X[:,i]-ev[i-1]*X[:,i-1]
  ```
  is replaced by
  ```
  z=z-sum([(z⋅X[:,j])*X[:,j] for j=1:i])
  ```
  To obtain better orthogonality, the latter formula is usually executed twice. 
  The full reorthogonalization raises the operation count to $O(k^2n)$. 

8. The __selective reorthogonalization__ is the procedure in which the current $z$
  is orthogonalized against some selected columns of $X$, in order to
  attain sufficient numerical stability and not increase the operation count
  too much. The details are very subtle and can be found in the references.
  
9. The Lanczos method is usually used for sparse matrices. A sparse matrix $A$
  is stored in a sparse format in which only the values and indices of nonzero elements
  are stored. The number of operations required to multiply a vector by $A$ is
  also proportional to the number of nonzero elements.
  
10. The function `eigs()` from the package Arpack.jl implements the Lanczos method for real symmetric matrices and the more general Arnoldi method 
for general matrices.
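The effect described in Facts 6 and 7, and the residual formula from Fact 4, can be observed on a small experiment. The following is a sketch under stated assumptions: the function `lanczos` and its keyword `reorth` are hypothetical names introduced here (this is not the `myLanczos()` function used in the examples), breakdown ($\mu=0$) is not handled, and the test matrix is a diagonal matrix with one dominant eigenvalue, chosen so that one Ritz value converges after only a few steps.

```julia
using LinearAlgebra

# k steps of the Lanczos method. With reorth=true, z is orthogonalized twice
# against all previous columns of X; otherwise the three-term recursion is used.
# Breakdown (μ == 0) is not handled in this sketch.
function lanczos(A, x, k; reorth=true)
    n = length(x)
    X = zeros(n, k)
    dv = zeros(k)
    ev = zeros(k-1)
    μ = 0.0
    X[:,1] = x/norm(x)
    for i = 1:k
        z = A*X[:,i]
        dv[i] = X[:,i]⋅z
        if reorth
            for pass = 1:2
                z -= X[:,1:i]*(X[:,1:i]'*z)
            end
        else
            z -= dv[i]*X[:,i]
            i > 1 && (z -= ev[i-1]*X[:,i-1])
        end
        μ = norm(z)
        i == k && break
        ev[i] = μ
        X[:,i+1] = z/μ
    end
    SymTridiagonal(dv, ev), X, μ
end

n, k = 100, 40
A = Diagonal([range(0, stop=1, length=n-1); 10.0])  # one dominant eigenvalue
x = ones(n)

Tp, Xp, μp = lanczos(A, x, k; reorth=false)
Tf, Xf, μ  = lanczos(A, x, k; reorth=true)

# Loss of orthogonality of the computed Lanczos vectors
lossplain = opnorm(Xp'*Xp - I)
lossfull  = opnorm(Xf'*Xf - I)
# Number of Ritz values near the dominant eigenvalue 10
copiesplain = count(t -> t > 5, eigvals(Tp))
copiesfull  = count(t -> t > 5, eigvals(Tf))

# Fact 4: the residual of the i-th Ritz pair equals μ*|Q[k,i]|
λ, Q = eigen(Tf)
U = Xf*Q
res = [norm(A*U[:,i] - λ[i]*U[:,i]) for i = 1:k]
est = μ*abs.(Q[k,:])
```

On this example `lossplain` should be many orders of magnitude larger than `lossfull`, and `copiesplain` may exceed 1 (ghost eigenvalues), whereas `copiesfull` equals 1 by the interlacing property. The vectors `res` and `est` should agree up to rounding errors.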

### Examples

In [1]:
using LinearAlgebra

In [3]:
function myLanczos(A::Array{T}, x::Vector{T}, k::Int) where T
    n=size(A,1)
    X=Array{T}(undef,n,k)
    dv=Array{T}(undef,k)
    ev=Array{T}(undef,k-1)
    X[:,1]=x/norm(x)
    for i=1:k-1
        z=A*X[:,i]
        dv[i]=X[:,i]⋅z
        # Three-term recursion
        if i==1
            z=z-dv[i]*X[:,i]
        else
            # z=z-dv[i]*X[:,i]-ev[i-1]*X[:,i-1]
            # Full reorthogonalization - once or even twice
            z=z-sum([(z⋅X[:,j])*X[:,j] for j=1:i])
            z=z-sum([(z⋅X[:,j])*X[:,j] for j=1:i])
        end
        μ=norm(z)
        if μ==0
            # Breakdown: the columns X[:,1:i] span an invariant subspace of A
            Tr=SymTridiagonal(dv[1:i],ev[1:i-1])
            return eigvals(Tr), X[:,1:i]*eigvecs(Tr), X[:,1:i], μ
        else
            ev[i]=μ
            X[:,i+1]=z/μ
        end
    end
    # Last step
    z=A*X[:,end]
    dv[end]=X[:,end]⋅z
    z=z-dv[end]*X[:,end]-ev[end]*X[:,end-1]
    μ=norm(z)
    Tr=SymTridiagonal(dv,ev)
    eigvals(Tr), X*eigvecs(Tr), X, μ
end

myLanczos (generic function with 1 method)

In [5]:
using Random
Random.seed!(421)
n=100
A=Matrix(Symmetric(rand(n,n)))
# Or: A = rand(n,n) |> t -> t + t'
x=rand(n)
k=10

10

In [6]:
λ,U,X,μ=myLanczos(A,x,k)

([-5.61221, -4.67728, -3.31249, -2.09514, -0.571815, 1.36414, 3.01041, 4.45828, 5.26923, 50.0803], [0.0853943 0.265133 … -0.00235606 0.100402; -0.0687232 0.0439774 … 0.104552 0.101049; … ; -0.0396431 0.0547128 … -0.123865 0.111174; 0.0090337 -0.0950335 … 0.156378 0.104672], [0.03093 0.15679 … 0.0138633 -0.13277; 0.00610384 0.193053 … 0.0472577 0.0592124; … ; 0.14683 -0.0251192 … 0.0433138 -0.116158; 0.054315 0.111463 … 0.021722 0.12728], 2.600241666735842)

In [8]:
# Orthogonality
opnorm(X'*X-I)

5.029542274681478e-16

In [9]:
X'*A*X

10×10 Array{Float64,2}:
 36.5073       22.5598       -2.11194e-15  …   9.1199e-16    1.75439e-15
 22.5598       12.4312        2.7439          -6.42738e-16   3.17146e-16
 -2.22283e-16   2.7439        0.520964         4.21708e-16  -1.86541e-16
  8.88733e-16   2.89499e-16   2.76685          5.04414e-17   1.15035e-16
 -1.27035e-15  -4.85625e-16  -7.15073e-17      2.13874e-17  -1.9323e-18 
  1.81964e-15   2.98785e-16  -9.48446e-17  …  -5.15474e-16  -1.62284e-16
 -5.29521e-16  -4.91519e-16   7.25008e-17     -3.25047e-16   7.2262e-17 
  1.1164e-15    6.93339e-16   5.37353e-17      2.81876      -4.97993e-17
  9.5736e-16    1.73954e-16   7.63515e-17      0.154228      2.78294    
  6.27431e-16   5.55216e-16   3.18014e-17      2.78294      -0.592886   

In [12]:
# Residual
opnorm(A*U-U*diagm(0=>λ)), μ

(2.600241666735842, 2.600241666735842)

In [13]:
U'*A*U

10×10 Array{Float64,2}:
 -5.61221       3.14203e-16  -1.41735e-15  …  -3.62859e-15   1.79593e-17
  4.76245e-16  -4.67728      -1.02479e-15      1.93296e-15  -6.75152e-15
 -1.89447e-15  -9.93286e-16  -3.31249          1.24701e-15  -1.2461e-14 
 -3.28197e-17   3.22584e-15  -2.20446e-15      4.32856e-15   9.22623e-15
  8.91218e-16   4.86947e-16  -3.10259e-15     -1.88478e-15   6.13184e-15
  1.94277e-15  -1.62288e-15   1.49008e-15  …  -5.04866e-15   5.75508e-15
  3.18588e-15  -2.79383e-15   3.91819e-17     -2.81065e-15  -2.23716e-15
  8.92221e-17  -1.48958e-15   1.75307e-15      9.73103e-16  -2.96329e-15
 -3.75692e-15   1.97316e-15   1.04084e-15      5.26923       1.40362e-15
  5.24298e-16  -8.8514e-15   -1.03986e-14      4.59891e-15  50.0803     

In [15]:
# Orthogonality
opnorm(U'*U-I)

1.6835216457586782e-15

In [21]:
# Full eigenvalue decomposition
λeigen,Ueigen=eigen(A);

In [18]:
using Arpack

In [19]:
?eigs

search: eigs eigvecs eigvals eigvals! leading_ones leading_zeros eigen eigmin



```
eigs(A; nev=6, ncv=max(20,2*nev+1), which=:LM, tol=0.0, maxiter=300, sigma=nothing, ritzvec=true, v0=zeros((0,))) -> (d,[v,],nconv,niter,nmult,resid)
```

Computes eigenvalues `d` of `A` using implicitly restarted Lanczos or Arnoldi iterations for real symmetric or general nonsymmetric matrices respectively. See [the manual](@ref lib-itereigen) for more information.

`eigs` returns the `nev` requested eigenvalues in `d`, the corresponding Ritz vectors `v` (only if `ritzvec=true`), the number of converged eigenvalues `nconv`, the number of iterations `niter` and the number of matrix vector multiplications `nmult`, as well as the final residual vector `resid`.

# Examples

```jldoctest
julia> using Arpack

julia> A = Diagonal(1:4);

julia> λ, ϕ = eigs(A, nev = 2);

julia> λ
2-element Array{Float64,1}:
 4.0
 3.0
```

---

```
eigs(A, B; nev=6, ncv=max(20,2*nev+1), which=:LM, tol=0.0, maxiter=300, sigma=nothing, ritzvec=true, v0=zeros((0,))) -> (d,[v,],nconv,niter,nmult,resid)
```

Computes generalized eigenvalues `d` of `A` and `B` using implicitly restarted Lanczos or Arnoldi iterations for real symmetric or general nonsymmetric matrices respectively. See [the manual](@ref lib-itereigen) for more information.


In [20]:
# Implicitly restarted Lanczos method (ARPACK, called through Arpack.jl)
λeigs,Ueigs=eigs(A; nev=k, which=:LM, ritzvec=true, v0=x)

([50.0803, -5.70867, -5.52979, 5.36771, 5.33537, -5.17477, 5.04423, -4.91332, 4.88666, 4.6994], [0.100402 0.00946309 … -0.176407 -0.0447553; 0.101049 -0.0742464 … 0.0770636 -0.0645418; … ; 0.111174 0.00475186 … 0.00958278 0.0831399; 0.104672 0.0326061 … -0.101123 -0.0894723], 10, 16, 136, [0.166414, 0.306999, -0.0927989, 0.0677047, 0.77893, 0.181925, 0.185167, 0.0837393, -0.0639227, 0.177081  …  0.312912, -0.0484846, 0.10256, -0.00887697, 0.0953349, 0.055649, -0.232884, -0.0873332, -0.373624, 0.302927])

In [22]:
[λ λeigs λeigen[sortperm(abs.(λeigen),rev=true)[1:k]] ]

10×3 Array{Float64,2}:
 -5.61221   50.0803   50.0803 
 -4.67728   -5.70867  -5.70867
 -3.31249   -5.52979  -5.52979
 -2.09514    5.36771   5.36771
 -0.571815   5.33537   5.33537
  1.36414   -5.17477  -5.17477
  3.01041    5.04423   5.04423
  4.45828   -4.91332  -4.91332
  5.26923    4.88666   4.88666
 50.0803     4.6994    4.6994 

We see that `eigs()` computes `k` eigenvalues with largest modulus. What eigenvalues did `myLanczos()` compute?

In [24]:
for i=1:k
    println(minimum(abs,λeigen.-λ[i]))
end

0.08242434872420912
0.05337415718834837
0.010057491766682247
0.0035411656264541236
0.008872819127862641
0.01209055507593737
0.10762601320997556
0.04919537417826447
0.06614007464435634
2.842170943040401e-14


The conclusion is that the naive implementation of the Lanczos method is not enough. However, it is fine when all eigenvalues are computed. Why?

In [25]:
λall,Uall,Xall,μall=myLanczos(A,x,100)

([-5.70867, -5.52979, -5.17477, -4.91332, -4.6239, -4.45306, -4.32179, -4.23908, -4.03773, -3.96322  …  4.17311, 4.33832, 4.40909, 4.57007, 4.6994, 4.88666, 5.04423, 5.33537, 5.36771, 50.0803], [0.00946309 0.111976 … 0.0333816 0.100402; -0.0742464 0.112528 … -0.011281 0.101049; … ; 0.00475186 -0.144192 … -0.00446455 0.111174; 0.0326061 0.161813 … -0.0136039 0.104672], [0.03093 0.15679 … -0.130142 0.0895322; 0.00610384 0.193053 … 0.05595 0.0473923; … ; 0.14683 -0.0251192 … 0.131389 -0.0892948; 0.054315 0.111463 … 0.106959 0.135582], 1.4341613852970035e-15)

In [28]:
# Residual and relative errors 
norm(A*Uall-Uall*Diagonal(λall)), norm((λeigen-λall)./λeigen)

(1.152575626749133e-13, 1.4112079368023389e-13)

In [29]:
methods(eigs);

### Operator version

We can use the Lanczos method with an operator which, given a vector `x`, returns the product `A*x`. We use the function `LinearMap()` from the package
[LinearMaps.jl](https://github.com/Jutho/LinearMaps.jl).

In [30]:
# Requires: using Pkg; Pkg.add("LinearMaps")
using LinearMaps

In [31]:
methods(LinearMap)

In [32]:
# Operator from the matrix
C=LinearMap(A)

LinearMaps.WrappedMap{Float64,Array{Float64,2}}([0.345443 0.229319 … 0.90542 0.642131; 0.229319 0.264824 … 0.514178 0.321832; … ; 0.90542 0.514178 … 0.703601 0.472187; 0.642131 0.321832 … 0.472187 0.472606], true, true, false)

In [33]:
λC,UC=eigs(C; nev=k, which=:LM, ritzvec=true, v0=x)
λeigs-λC

10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

Here is an example of `LinearMap()` with a function. 

In [34]:
f(x)=A*x

f (generic function with 1 method)

In [35]:
D=LinearMap(f,n,issymmetric=true)

LinearMaps.FunctionMap{Float64}(f, 100, 100; ismutating=false, issymmetric=true, ishermitian=true, isposdef=false)

In [36]:
λD,UD=eigs(D, nev=k, which=:LM, ritzvec=true, v0=x)
λeigs-λD

10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

### Sparse matrices

In [38]:
using SparseArrays

In [39]:
?sprand

search: sprand sprandn StepRange StepRangeLen



```
sprand([rng],[type],m,[n],p::AbstractFloat,[rfn])
```

Create a random length `m` sparse vector or `m` by `n` sparse matrix, in which the probability of any element being nonzero is independently given by `p` (and hence the mean density of nonzeros is also exactly `p`). Nonzero values are sampled from the distribution specified by `rfn` and have the type `type`. The uniform distribution is used in case `rfn` is not specified. The optional `rng` argument specifies a random number generator, see [Random Numbers](@ref).

# Examples

```jldoctest; setup = :(using Random; Random.seed!(1234))
julia> sprand(Bool, 2, 2, 0.5)
2×2 SparseMatrixCSC{Bool,Int64} with 2 stored entries:
  [1, 1]  =  true
  [2, 1]  =  true

julia> sprand(Float64, 3, 0.75)
3-element SparseVector{Float64,Int64} with 1 stored entry:
  [3]  =  0.298614
```


In [40]:
# Generate a sparse symmetric matrix
C=sprand(n,n,0.05) |> t -> t+t'

100×100 SparseMatrixCSC{Float64,Int64} with 978 stored entries:
  [5  ,   1]  =  0.0892932
  [11 ,   1]  =  0.0856666
  [12 ,   1]  =  0.212039
  [38 ,   1]  =  0.905735
  [40 ,   1]  =  0.228686
  [50 ,   1]  =  0.90935
  [57 ,   1]  =  0.845193
  [58 ,   1]  =  0.116939
  [60 ,   1]  =  0.904932
  [76 ,   1]  =  0.0421612
  [89 ,   1]  =  0.916119
  [13 ,   2]  =  0.183209
  ⋮
  [78 ,  99]  =  0.177168
  [15 , 100]  =  0.124823
  [32 , 100]  =  0.226463
  [57 , 100]  =  0.285602
  [60 , 100]  =  0.285211
  [61 , 100]  =  1.44788
  [69 , 100]  =  0.475571
  [71 , 100]  =  0.228978
  [80 , 100]  =  0.0499066
  [85 , 100]  =  1.55049
  [88 , 100]  =  0.627886
  [90 , 100]  =  0.745141

In [41]:
issymmetric(C)

true

In [42]:
λ,U=eigs(C; nev=k, which=:LM, ritzvec=true, v0=x)
λ

10-element Array{Float64,1}:
  5.557900306048012 
 -3.7498542832632378
  3.6465624231981804
 -3.5880174392041035
  3.439966537151299 
 -3.349985181261683 
  3.2143463782441843
 -3.0824101076581405
  3.07328328474578  
  3.0507503470093575
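The shift and invert strategy mentioned in the Facts can be sketched for a sparse matrix using only the standard library. This is an illustration under assumptions: the function `lanczosop`, the test matrix (a one-dimensional discrete Laplacian), the target `σ` and the number of steps are choices made here, and breakdown is not handled. The Lanczos method is applied to the operator $x\mapsto (A-\sigma I)^{-1}x$, realized by one sparse LU factorization; the Ritz values $\theta$ of largest modulus are then mapped back by $\lambda=\sigma+1/\theta$. The `sigma` keyword listed in the docstring of `eigs()` serves the same purpose.

```julia
using LinearAlgebra, SparseArrays

# Lanczos with full reorthogonalization for an operator given as a function.
# Breakdown (zero norm of z) is not handled in this sketch.
function lanczosop(mul, x, k)
    n = length(x)
    X = zeros(n, k)
    dv = zeros(k)
    ev = zeros(k-1)
    X[:,1] = x/norm(x)
    for i = 1:k
        z = mul(X[:,i])
        dv[i] = X[:,i]⋅z
        for pass = 1:2
            z -= X[:,1:i]*(X[:,1:i]'*z)
        end
        i == k && break
        ev[i] = norm(z)
        X[:,i+1] = z/ev[i]
    end
    SymTridiagonal(dv, ev), X
end

n = 100
A = spdiagm(-1 => -ones(n-1), 0 => 2ones(n), 1 => -ones(n-1))  # 1D Laplacian
σ = 1.0                                # target value
F = lu(A - σ*I)                        # sparse LU of the shifted matrix, computed once
T, X = lanczosop(z -> F\z, collect(1.0:n), 40)

θ = eigvals(T)                         # Ritz values of inv(A - σ*I)
p = sortperm(abs.(θ), rev=true)[1:3]   # largest |θ| correspond to λ closest to σ
λnear = σ .+ 1 ./ θ[p]
```

The eigenvalues of this particular matrix are known in closed form, $\lambda_j=2-2\cos\big(j\pi/(n+1)\big)$, so `λnear` can be compared with the three exact eigenvalues closest to `σ`.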