# Eigenvalue Decomposition - Perturbation Theory

---

## Prerequisites

The reader should be familiar with basic linear algebra concepts and facts about eigenvalue decomposition. 

## Competences 

The reader should be able to understand and check the facts about perturbations of eigenvalues and eigenvectors.

---

## Norms

In order to measure changes, we need to define norms. For more details and the proofs of the Facts below, see 
[R. Byers and B. N. Datta, Vector and Matrix Norms, Error Analysis, Efficiency, and Stability][Hog14] and the references therein.

[Hog14]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 50.1-50.24, CRC Press, Boca Raton, 2014."

### Definitions

A __norm__ on a vector space $X$ is a real-valued function $\| \phantom{x} \| : X\to \mathbb{R}$ with the following properties, valid for all $x,y\in X$ and all scalars $\lambda$:

1. $\| x\|\geq 0$ and $\|x\|=0$ if and only if $x$ is the zero vector _(Positive definiteness)_
2. $\| \lambda x\|=|\lambda| \|x\|$ _(Homogeneity)_
3. $\| x+y\| \leq \|x\|+\|y\|$ _(Triangle inequality)_

Commonly encountered vector norms for $x\in\mathbb{C}^n$ are:

* __H&ouml;lder norm__ or $p$-__norm__: for $p\geq 1$, $\|x\|_p=\big(|x_1|^p+|x_2|^p+\cdots+|x_n|^p\big)^{1/p}$,
* __Sum norm__ or $1$-__norm__: $\|x\|_1=|x_1|+|x_2|+\cdots+|x_n|$,
* __Euclidean norm__ or $2$-__norm__: $\|x\|_2=\sqrt{|x_1|^2+|x_2|^2+\cdots+|x_n|^2}$,
* __Sup-norm__ or $\infty$-__norm__: $\|x\|_\infty = \max\limits_{i=1,\ldots,n} |x_i|$.

A vector norm is __absolute__ if $\||x|\|=\|x\|$, where $|x|$ denotes the vector of absolute values of the entries of $x$.

A vector norm is __monotone__ if $|x|\leq |y|$ (componentwise) implies $\|x\|\leq \|y\|$.

From every vector norm we can derive a corresponding __induced__ matrix norm (also, __operator norm__ or __natural norm__):

$$\|A\| = \max\limits_{x\neq 0} \frac{\|Ax\|}{\|x\|}=\max\limits_{\|x\|=1} \|Ax\|.$$

For a matrix $A\in\mathbb{C}^{m\times n}$ we define:

* __Maximum absolute column sum norm__: $\|A\|_1=\max\limits_{1\leq j \leq n} \sum_{i=1}^m |a_{ij}|$,
* __Spectral norm__: $\|A\|_2=\sqrt{\rho(A^*A)}=\sigma_{\max}(A)$  (largest singular value of $A$),
* __Maximum absolute row sum norm__: $\|A\|_{\infty}=\max\limits_{1\leq i \leq m} \sum_{j=1}^n |a_{ij}|$,
* __Euclidean norm__ or __Frobenius norm__: 
$\|A\|_F =\sqrt{\sum_{i,j} |a_{ij}|^2}=\sqrt{\mathop{\mathrm{tr}}(A^*A)}$.

A matrix norm is __consistent__ if $\|A\cdot B\|\leq \|A\| \cdot \| B\|$, where $A$ and $B$ are compatible for matrix multiplication.

A matrix norm is __absolute__ if $\||A|\|=\|A\|$, where $|A|$ denotes the matrix of absolute values of the entries of $A$.
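
As a small worked example of the definitions above (the vector and the matrix are chosen here purely for illustration), take $x=(3,-4)^T$ and $A=\begin{bmatrix} 1 & -2 \\ 3 & 4 \end{bmatrix}$. Then

$$
\|x\|_1=3+4=7,\quad \|x\|_2=\sqrt{9+16}=5,\quad \|x\|_\infty=4,
$$

$$
\|A\|_1=\max\{4,6\}=6,\quad \|A\|_\infty=\max\{3,7\}=7,\quad \|A\|_F=\sqrt{1+4+9+16}=\sqrt{30}\approx 5.477.
$$

Since $A^*A=\begin{bmatrix} 10 & 10 \\ 10 & 20 \end{bmatrix}$ has eigenvalues $15\pm 5\sqrt{5}$, the spectral norm is

$$
\|A\|_2=\sqrt{15+5\sqrt{5}}\approx 5.117.
$$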

### Examples

In [1]:
x=rand(-4:4,5)

5-element Array{Int64,1}:
 -1
  0
  0
  3
  1

In [2]:
norm(x,1), norm(x), norm(x,Inf)

(5.0,3.3166247903554,3.0)

In [3]:
A=rand(-4:4,7,5)

7x5 Array{Int64,2}:
  3   2   1   4   3
  0   0   4   1  -2
 -3  -1   0  -1   4
  3  -2   0   1  -3
 -1   4  -1   1  -3
  1  -4   1   4   3
  4   4   0  -1   3

In [4]:
norm(A,1), norm(A), norm(A,2), norm(A,Inf), vecnorm(A)

(21.0,8.630367474712905,8.630367474712905,13.0,14.933184523068078)

### Facts


1. $\|x\|_1$, $\|x\|_2$, $\|x\|_\infty$ and $\|x\|_p$ are absolute and monotone vector norms.
2. A vector norm is absolute iff it is monotone.
3. _Convergence_: $x_k\to x_*$ iff $\|x_k-x_*\|\to 0$ for any vector norm.
4. Any two vector norms are equivalent in the sense that, for all $x$ and some $\alpha,\beta>0$
$$
\alpha \|x\|_\mu \leq \|x\|_\nu \leq \beta \|x\|_\mu.
$$
In particular:
   * $\|x\|_2 \leq \|x\|_1\leq \sqrt{n}\|x\|_2$,
   * $\|x\|_\infty \leq \|x\|_2\leq \sqrt{n}\|x\|_\infty$,
   * $\|x\|_\infty \leq \|x\|_1\leq n\|x\|_\infty$.
5. _Cauchy-Schwarz inequality_: $|x^*y|\leq \|x\|_2\|y\|_2$.
6. _H&ouml;lder inequality_: if $p,q\geq 1$ and $\frac{1}{p}+\frac{1}{q}=1$, then $|x^*y|\leq \|x\|_p\|y\|_q$.
7. $\|A\|_1$, $\|A\|_2$ and $\|A\|_\infty$ are induced by the corresponding vector norms.
8. $\|A\|_F$ is not an induced norm.
9. $\|A\|_1$, $\|A\|_2$, $\|A\|_\infty$ and $\|A\|_F$ are consistent.
10. $\|A\|_1$, $\|A\|_\infty$ and $\|A\|_F$ are absolute. However, in general $\||A|\|_2\neq \|A\|_2$.
11. Any two matrix norms are equivalent in the sense that, for all $A$ and some $\alpha,\beta>0$
$$
\alpha \|A\|_\mu \leq \|A\|_\nu \leq \beta \|A\|_\mu.
$$
In particular:
   * $\frac{1}{\sqrt{n}}\|A\|_\infty \leq \|A\|_2\leq \sqrt{m}\|A\|_\infty$,
   * $\|A\|_2 \leq \|A\|_F\leq \sqrt{n}\|A\|_2$,
   * $\frac{1}{\sqrt{m}}\|A\|_1 \leq \|A\|_2\leq \sqrt{n}\|A\|_1$.
12. $\|A\|_2\leq \sqrt{\|A\|_1 \|A\|_\infty}$.
13. $\|AB\|_F\leq \|A\|_F\|B\|_2$ and $\|AB\|_F\leq \|A\|_2\|B\|_F$.
14. If $A=xy^*$, then $\|A\|_2=\|A\|_F=\|x\|_2\|y\|_2$.
15. $\|A^*\|_2=\|A\|_2$ and $\|A^*\|_F=\|A\|_F$.
16. For a unitary matrix $U$ of compatible dimension,
$$\|AU\|_2=\|A\|_2,\quad \|AU\|_F=\|A\|_F,\quad
\|UA\|_2=\|A\|_2,\quad  \|UA\|_F=\|A\|_F.
$$
17. For $A$ square, $\rho(A)\leq\|A\|$ for any consistent matrix norm.
18. For $A$ square, $A^k\to 0$ iff $\rho(A)<1$.
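
To see by hand why the spectral norm is not absolute, consider the following matrix (an illustrative choice, not taken from the reference):

$$
A=\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},\qquad
|A|=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.
$$

Here $A^*A=2I$, so $\|A\|_2=\sqrt{2}$, whereas $|A|$ has eigenvalues $0$ and $2$, so $\||A|\|_2=2$. The other three norms do not distinguish the two matrices:

$$
\|A\|_1=\||A|\|_1=2,\quad \|A\|_\infty=\||A|\|_\infty=2,\quad \|A\|_F=\||A|\|_F=2.
$$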

In [5]:
# Absolute norms
norm(A,1), norm(abs(A),1), norm(A,Inf), norm(abs(A),Inf), vecnorm(A), vecnorm(abs(A)), norm(A),norm(abs(A))

(21.0,21.0,13.0,13.0,14.933184523068078,14.933184523068078,8.630367474712905,13.343019797443741)

In [6]:
# Equivalence of norms
m,n=size(A)
norm(A,Inf)/sqrt(n),norm(A), sqrt(m)*norm(A,Inf)

(5.81377674149945,8.630367474712905,34.39476704383968)

In [7]:
norm(A), vecnorm(A), sqrt(n)*norm(A)

(8.630367474712905,14.933184523068078,19.298088344261252)

In [8]:
norm(A,1)/sqrt(m),norm(A), sqrt(n)*norm(A,1)

(7.93725393319377,8.630367474712905,46.95742752749558)

In [9]:
# Fact 12
norm(A), sqrt(norm(A,1)*norm(A,Inf))

(8.630367474712905,16.522711641858304)

In [10]:
# Fact 13
@show B=rand(n,rand(1:9))
vecnorm(A*B), vecnorm(A)*norm(B), norm(A)*vecnorm(B)

B = rand(n,rand(1:9)) = [0.14495313989355973 0.862559195282494 0.01605065932080363 0.840162518827775 0.148615798648561
 0.9453842917188175 0.6920182886342445 0.15450380085344428 0.5418628642600443 0.721535160867441
 0.9908564135480644 0.16589717593352749 0.6633695737378855 0.9327887269547015 0.6269165105169254
 0.9131840076398012 0.19744750171034164 0.08025578819472812 0.06845316582414762 0.2142159831935535
 0.21266906462893442 0.12458959053072438 0.6112940786953798 0.7620920738383359 0.7018932360587871]


(20.551011657783857,39.51029023753674,25.680600310151107)

In [11]:
# Fact 14
x=rand(10)+im*rand(10)
y=rand(10)+im*rand(10)
A=x*y'
norm(A), vecnorm(A), norm(x)*norm(y)

(6.534720162699072,6.534720162699072,6.534720162699073)

In [12]:
# Fact 15
A=rand(-4:4,7,5)+im*rand(-4:4,7,5)
norm(A), norm(A'), vecnorm(A), vecnorm(A')

(14.818932189561746,14.818932189561748,19.974984355438178,19.974984355438178)

In [13]:
# Unitary invariance - generate random unitary matrix U
U,R=qr(rand(size(A))+im*rand(size(A)),thin=false)

(
7x7 Array{Complex{Float64},2}:
  -0.242441-0.348882im       0.157628-0.195961im   …  -0.233571-0.310607im  
 -0.0957171-0.0514461im   -0.00987517-0.337264im       0.192149+0.148996im  
 -0.0539785-0.216606im      -0.311047-0.0264378im     -0.304321+0.322752im  
    -0.2287-0.389349im     3.83584e-5+0.238172im      -0.123098-0.108093im  
  -0.265665-0.278883im      -0.397433-0.233611im      -0.140684+0.289424im  
  -0.300869-0.396274im      -0.229445+0.243669im   …   0.283998-0.402601im  
  -0.397524-0.00569077im     0.312576-0.49647im          0.4689+0.00781712im,

5x5 Array{Complex{Float64},2}:
 -2.48169+0.0im   -1.99446-0.260571im  …   -1.35313-0.0937606im
      0.0+0.0im  -0.972279+0.0im          -0.915097+0.09813im  
      0.0+0.0im        0.0+0.0im          -0.367113-0.598992im 
      0.0+0.0im        0.0+0.0im          -0.647959-0.430968im 
      0.0+0.0im        0.0+0.0im           0.762635+0.0im      )

In [14]:
norm(A), norm(U*A), vecnorm(A), vecnorm(U*A)

(14.818932189561746,14.81893218956175,19.974984355438178,19.97498435543818)

In [15]:
# Spectral radius
A=rand(7,7)+im*rand(7,7)
maxabs(eigvals(A)), norm(A,Inf), norm(A,1), norm(A), vecnorm(A)

(5.03153499805523,6.185218050341691,6.557337870559409,5.183551400879956,5.690508669574304)

In [16]:
# Fact 18
B=A/(maxabs(eigvals(A))+2)
@show maxabs(eigvals(B))
B^20

maxabs(eigvals(B)) = 0.7155670844910607


7x7 Array{Complex{Float64},2}:
          -1.84667e-5+0.000101599im  …  -5.99373e-5+0.000152871im
          -1.82526e-5+0.000119428im      -6.5065e-5+0.000180757im
           -1.0326e-5+0.000135897im     -5.77417e-5+0.000208882im
          -2.05565e-5+0.000110855im     -6.60335e-5+0.000166673im
 -3.21678e-5+7.97822e-5im               -7.46326e-5+0.000114633im
           1.25282e-5+0.000136569im  …  -2.22871e-5+0.000216931im
           2.88948e-5+0.000120276im       8.2413e-6+0.000196522im

## Errors and condition numbers

We want to answer the question:

__How much does the value of a function change when its argument changes?__

### Definitions

For function $f(x)$ and argument $x$, the __absolute error__ with respect to the __perturbation__ of the argument 
$\delta x$ is 

$$
\| f(x+\delta x)-f(x)\| \leq \frac{\| f(x+\delta x)-f(x)\|}{\| \delta x \|} \|\delta x\| \equiv \kappa \|\delta x\|.
$$

The  __condition__ or  __condition number__ $\kappa$ tells how much the perturbation of the argument is magnified in the value of the function. (Its form resembles a derivative.)

Similarly, the __relative error__ with respect to the relative perturbation of the argument is

$$
\frac{\| f(x+\delta x)-f(x)\|}{\| f(x) \|}\leq \frac{\| f(x+\delta x)-f(x)\|\cdot  \|x\| }{\|\delta x\| \cdot\| f(x)\|}
\cdot \frac{\|\delta x\|}{\|x\|} \equiv \kappa_{rel} \frac{\|\delta x\|}{\|x\|}.
$$
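
The quotient defining $\kappa_{rel}$ can be evaluated numerically for one specific perturbation. The following is only a sketch, written for Julia 1.x with the standard `LinearAlgebra` library (the cells in this notebook were run with an older Julia version in which some function names differ). The helper `relcond` and the test function `f` are hypothetical names introduced here for illustration. The example evaluates the quotient for the subtraction of two numbers, once for well separated operands and once for nearly equal operands.

```julia
using LinearAlgebra

# Hypothetical helper (not from the text): the quotient
#   ‖f(x+δx)-f(x)‖·‖x‖ / (‖δx‖·‖f(x)‖)
# evaluated for one particular perturbation δx.
relcond(f, x, δx) = norm(f(x + δx) - f(x)) * norm(x) / (norm(δx) * norm(f(x)))

# Subtraction of two numbers, returned as a 1-element vector so that norm applies
f(x) = [x[1] - x[2]]

δx = [1e-8, -1e-8]

κ_far  = relcond(f, [1.0, -1.0], δx)       # operands far apart
κ_near = relcond(f, [1.0, 0.999999], δx)   # nearly equal operands

println((κ_far, κ_near))
```

For the first argument the quotient is close to $1$, while for the second it is of order $10^6$: a tiny relative change of nearly equal operands produces a large relative change of their difference.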

## Perturbation bounds

### Definitions

Let $A\in\mathbb{C}^{n\times n}$.

Pair $(\lambda,x)\in\mathbb{C}\times\mathbb{C}^{n}$ is an __eigenpair__ of $A$ if $x\neq 0$ and $Ax=\lambda x$.

Triplet $(y,\lambda,x)\in\mathbb{C}^{n}\times\mathbb{C}\times\mathbb{C}^{n}$ is an __eigentriplet__ of $A$ if $x,y\neq 0$ and $Ax=\lambda x$ and $y^*A=\lambda y^*$.

__Eigenvalue matrix__ is a diagonal matrix $\Lambda=\mathop{\mathrm{diag}}(\lambda_1,\lambda_2,\ldots,\lambda_n)$.

If all eigenvalues are real, they can be increasingly ordered. $\Lambda^\uparrow$ is the eigenvalue matrix of increasingly ordered eigenvalues.

$\tau$ is a __permutation__ of $\{1,2,\ldots,n\}$.

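$\tilde A=A+\Delta A$ is a __perturbed matrix__, where $\Delta A$ is a __perturbation__. $(\tilde \lambda,\tilde x)$ are the eigenpairs of $\tilde A$, $\tilde\Lambda$ is its eigenvalue matrix, and $\tilde\Lambda_\tau=\mathop{\mathrm{diag}}(\tilde\lambda_{\tau(1)},\tilde\lambda_{\tau(2)},\ldots,\tilde\lambda_{\tau(n)})$.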

__Condition number__ of a nonsingular matrix $X$ is $\kappa(X)=\|X\| \|X^{-1}\|$.

Let $X,Y\in\mathbb{C}^{n\times k}$ with $\mathop{\mathrm{rank}}(X)=\mathop{\mathrm{rank}}(Y)=k$. The __canonical angles__ between their column spaces are $\theta_i=\cos^{-1}\sigma_i$, where $\sigma_i$ are the singular values of 
$(Y^*Y)^{-1/2}Y^*X(X^*X)^{-1/2}$. The __canonical angle matrix__ between $X$ and $Y$ is 
$$\Theta(X,Y)=\mathop{\mathrm{diag}}(\theta_1,\theta_2,\ldots,\theta_k).
$$
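
The definition of canonical angles translates almost literally into code. The following is a minimal sketch for Julia 1.x with the standard `LinearAlgebra` library; the function name `canonical_angles` and the two test matrices are assumptions made here for illustration. The column space of `X` is spanned by $e_1,e_2$ and that of `Y` by $e_1,e_2+e_3$, so the canonical angles should be $0$ and $\pi/4$.

```julia
using LinearAlgebra

# Hypothetical helper following the definition above: canonical angles between
# the column spaces of X and Y (both assumed to have full column rank).
function canonical_angles(X, Y)
    SX = inv(sqrt(Hermitian(X' * X)))    # (X*X)^(-1/2)
    SY = inv(sqrt(Hermitian(Y' * Y)))    # (Y*Y)^(-1/2)
    σ = svdvals(SY * Y' * X * SX)
    # Clamping guards against singular values marginally above 1 due to rounding
    return acos.(clamp.(σ, 0.0, 1.0))
end

X = [1.0 0.0; 0.0 1.0; 0.0 0.0]   # column space span{e1, e2}
Y = [1.0 0.0; 0.0 1.0; 0.0 1.0]   # column space span{e1, e2 + e3}

θ = canonical_angles(X, Y)
println(θ)
```

When both matrices already have orthonormal columns, the two scaling factors are identity matrices and the cosines are simply the singular values of $Y^*X$.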
    

### Facts

Bounds become sharper as matrices have more structure. 
Many bounds have versions in both the spectral norm and the Frobenius norm.
For more details and the proofs of the Facts below, see 
[R.-C. Li, Matrix Perturbation Theory][Hog14b], and the references therein.

[Hog14b]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 21.1-21.20, CRC Press, Boca Raton, 2014."

1. There exists $\tau$ such that
$$\|\Lambda- \tilde\Lambda_\tau\|_2\leq 4(\|A\|_2+\|\tilde A\|_2)^{1-1/n}\|\Delta A\|_2^{1/n}.$$

2. _(First-order perturbation bounds)_ Let $(y,\lambda,x)$ be an eigentriplet of a simple eigenvalue $\lambda$. $\Delta A$ changes $\lambda$ to 
$\tilde\lambda=\lambda+ \delta\lambda$, where
$$
\delta\lambda=\frac{y^*(\Delta A)x}{y^*x}+O(\|\Delta A\|_2^2).
$$

3. Let $\lambda$ be a semisimple eigenvalue of $A$ with multiplicity $k$, and let $X,Y\in \mathbb{C}^{n\times k}$ be the matrices of the corresponding right and left eigenvectors, that is, $AX=\lambda X$ and $Y^*A=\lambda Y^*$, such that $Y^*X=I_k$. $\Delta A$ changes the $k$ copies of $\lambda$ to $\tilde \lambda_i=\lambda+\delta\lambda_i$, where $\delta\lambda_i$ are the eigenvalues of $Y^*(\Delta A) X$ up to $O(\|\Delta A\|_2^2)$.

4. _(Bauer-Fike Theorem)_ If $A$ is diagonalizable and $A=X\Lambda X^{-1}$ is its eigenvalue decomposition, then
$$
\max_i\min_j |\tilde \lambda_i -
\lambda_j|\leq \|X^{-1}(\Delta A)X\|_p\leq \kappa_p(X)\|\Delta A\|_p.
$$

5. If $A$ and $\tilde A$ are diagonalizable, with eigenvector matrices $X$ and $\tilde X$, respectively, there exists $\tau$ such that
$$\|\Lambda-\tilde\Lambda_\tau\|_F\leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_F.
$$
If $\Lambda$ and  $\tilde\Lambda$ are real, then
$$
\|\Lambda^\uparrow-\tilde\Lambda^\uparrow\|_{2,F} \leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_{2,F}.
$$

6. If $A$ is normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\sqrt{n}\|\Delta A\|_F$.

7. _(Hoffman-Wielandt Theorem)_ If $A$ and $\tilde A$ are normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\|\Delta A\|_F$.

8. If $A$ and $\tilde A$ are Hermitian, then for any unitarily invariant norm $\|\Lambda^\uparrow-\tilde\Lambda^\uparrow\| \leq \|\Delta A\|$. In particular,
\begin{align*}
\max_i|\lambda^\uparrow_i-\tilde\lambda^\uparrow_i|&\leq \|\Delta A\|_2,\\ 
\sqrt{\sum_i(\lambda^\uparrow_i-\tilde\lambda^\uparrow_i)^2}&\leq \|\Delta A\|_F.
\end{align*}

9. _(Residual bounds)_ Let $A$ be Hermitian. For some $\tilde\lambda\in\mathbb{R}$ and $\tilde x\in\mathbb{C}^n$ with $\|\tilde x\|_2=1$, define __residual__ $r=A\tilde x-\tilde\lambda\tilde x$. Then
$|\tilde\lambda-\lambda|\leq \|r\|_2$ for some $\lambda\in\sigma(A)$.

10. Let, in addition,  $\tilde\lambda=\tilde x^* A\tilde x$, let $\lambda$ be the eigenvalue of $A$ closest to $\tilde\lambda$ and $x$ be its unit eigenvector, and let 
$$\eta=\mathop{\mathrm{gap}}(\tilde\lambda)= \min_{\lambda\neq\mu\in\sigma(A)}|\tilde\lambda-\mu|.$$
If $\eta>0$, then
$$ |\tilde\lambda-\lambda|\leq \frac{\|r\|_2^2}{\eta},\quad \sin\theta(x,\tilde x)\leq \frac{\|r\|_2}{\eta}.
$$

11. Let $A$ be Hermitian, let $X\in\mathbb{C}^{n\times k}$ have full column rank, and let $M\in\mathcal{H}_k$ (the set of $k\times k$ Hermitian matrices) have eigenvalues 
$\mu_1\leq\mu_2\leq\cdots\leq\mu_k$. Set $R=AX-XM$. Then
there exist $\lambda_{i_1}\leq\lambda_{i_2}\leq\cdots\leq\lambda_{i_k}\in\sigma(A)$ such that
\begin{align*}    
\max_{1\leq j\leq k} |\mu_j-\lambda_{i_j}|& \leq \frac{\|R\|_2}{\sigma_{\min}(X)},\\
\sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}&\leq \frac{\|R\|_F}{\sigma_{\min}(X)}.
\end{align*}
(The indices $i_j$ need not be the same in the above formulae.)

12. If, additionally, $X^*X=I$ and $M=X^*AX$, and if all but $k$ of $A$'s eigenvalues differ from every one of $M$'s eigenvalues by at least $\eta>0$, then
$$\sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}\leq \frac{\|R\|_F^2}{\eta\sqrt{1-\|R\|_F^2/\eta^2}}.$$

13. Let $A=\begin{bmatrix} M & E^* \\ E & H \end{bmatrix}$ and $\tilde A=\begin{bmatrix} M & 0 \\ 0 & H \end{bmatrix}$ be Hermitian, and set $\eta=\min |\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(H)$. Then
$$ 
\max |\lambda_j^\uparrow -\tilde\lambda_j^\uparrow| \leq \frac{2\|E\|_2^2}{\eta+\sqrt{\eta^2+4\|E\|_2^2}}.
$$

14. Let 
$$
\begin{bmatrix} X_1^*\\ X_2^* \end{bmatrix} A \begin{bmatrix} X_1 & X_2 \end{bmatrix}=
\begin{bmatrix} A_1 &  \\ & A_2 \end{bmatrix}, \quad \begin{bmatrix} X_1 & X_2 \end{bmatrix} \textrm{unitary},
\quad X_1\in\mathbb{C}^{n\times k}.
$$
Let $Q\in\mathbb{C}^{n\times k}$ have orthonormal columns and for a Hermitian $k\times k$ matrix $M$ set
$R=AQ-QM$. Let $\eta=\min|\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(A_2)$. If $\eta > 0$, then
$$
\|\sin\Theta(X_1,Q)\|_F\leq \frac{\|R\|_F}{\eta}.
$$
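
The Bauer-Fike Theorem can be checked on a small matrix whose eigenvalues are known by hand. The following is a sketch for Julia 1.x with the standard `LinearAlgebra` library, where `opnorm` is the induced (spectral) norm; the matrices `A0` and `ΔA0` and all variable names are illustrative choices made here. The matrix `A0` is upper triangular with eigenvalues $1$ and $3$, and the perturbed matrix has eigenvalues $2\pm\sqrt{1.002001}=2\pm 1.001$.

```julia
using LinearAlgebra

A0  = [1.0 2.0; 0.0 3.0]            # diagonalizable, eigenvalues 1 and 3
ΔA0 = 1e-3 * [0.0 1.0; 1.0 0.0]     # small perturbation (illustrative choice)

λ, X = eigen(A0)                    # A0 = X*Diagonal(λ)*inv(X)
λ_pert = eigvals(A0 + ΔA0)          # eigenvalues of the perturbed matrix

# Left-hand side: max over perturbed eigenvalues of the distance to the nearest λ_j
lhs = maximum(minimum(abs.(μ .- λ)) for μ in λ_pert)
# Middle term and right-hand side of the bound in the spectral norm (p = 2)
mid = opnorm(inv(X) * ΔA0 * X)
rhs = cond(X) * opnorm(ΔA0)

println((lhs, mid, rhs))
```

The three printed numbers should be nondecreasing. The closer $\kappa_2(X)$ is to $1$, the closer the right-hand side is to $\|\Delta A\|_2$ itself.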

### Example - Nondiagonalizable matrix

In [17]:
A=[-3 7 -1; 6 8 -2; 72 -28 19]

3x3 Array{Int64,2}:
 -3    7  -1
  6    8  -2
 72  -28  19

In [18]:
# (Right) eigenvectors
λ,X=eig(A)

([-6.000000000000005,15.000000241477958,14.999999758522048],
3x3 Array{Float64,2}:
  0.235702   0.218218  -0.218218
 -0.235702   0.436436  -0.436436
 -0.942809  -0.872872   0.872872)

In [19]:
cond(X)

9.091581949434164e7

In [20]:
# Left eigenvectors
λ1,Y=eig(A')

(Complex{Float64}[-5.999999999999998 + 0.0im,14.999999999999993 + 2.0088262607214127e-7im,14.999999999999993 - 2.0088262607214127e-7im],
3x3 Array{Complex{Float64},2}:
     0.894427+0.0im               0.970143+0.0im               0.970143-0.0im
    -0.447214+0.0im  -7.58506e-16-1.62404e-8im    -7.58506e-16+1.62404e-8im  
 -6.07504e-17+0.0im       0.242536+4.0601e-9im         0.242536-4.0601e-9im  )

In [21]:
# Try k=2,3
k=1
Y[:,k]'*A-λ[k]*Y[:,k]'

1x3 Array{Complex{Float64},2}:
 8.88178e-16+0.0im  8.88178e-16+0.0im  -1.7408e-15+0.0im

In [22]:
ΔA=rand(3,3)/20
B=A+ΔA

3x3 Array{Float64,2}:
 -2.95663    7.03661  -0.995273
  6.00513    8.04044  -1.96629 
 72.0063   -27.9913   19.0024  

In [23]:
μ,Z=eig(B)

(Complex{Float64}[-5.951157189019204 + 0.0im,15.018700806403988 + 0.6318503088010021im,15.018700806403988 - 0.6318503088010021im],
3x3 Array{Complex{Float64},2}:
 -0.235969+0.0im  -0.213566+0.0340179im  -0.213566-0.0340179im
  0.233829+0.0im  -0.424639+0.0677232im  -0.424639-0.0677232im
  0.943209+0.0im   0.876543+0.0im         0.876543-0.0im      )

In [24]:
# Fact 2
δλ=μ[k]-λ[k]

0.04884281098080123 + 0.0im

In [25]:
Y[:,k]'*ΔA*X[:,k]/(Y[:,k]⋅X[:,k])

1-element Array{Complex{Float64},1}:
 0.048619+0.0im

### Example - Jordan form

In [26]:
n=70
c=0.5
J=Bidiagonal(c*ones(n),ones(n-1),true)

70x70 Bidiagonal{Float64}:
 0.5  1.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.5  1.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.5  1.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.5  1.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.5  1.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.5  1.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.5  1.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.5     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮

In [27]:
# Accurately defined eigenvalues
λ=eigvals(J)

70-element Array{Float64,1}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 ⋮  
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

In [28]:
# Only one eigenvector
X=eigvecs(J)

70x70 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  …  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮

In [29]:
x=eigvecs(J)[:,1]
y=eigvecs(J')[:,1]

70-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮  
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 1.0

In [30]:
y'*full(J)-0.5*y'

1x70 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [31]:
# Just one perturbed element in the lower left corner
ΔJ=sqrt(eps())*[zeros(n-1);1]*eye(1,n)

70x70 Array{Float64,2}:
 0.0         0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 

In [32]:
B=J+ΔJ
μ=eigvals(B)

70-element Array{Complex{Float64},1}:
 -0.273017+0.0im      
 -0.269905+0.0692927im
 -0.269905-0.0692927im
 -0.260594+0.138027im 
 -0.260594-0.138027im 
 -0.245159+0.205651im 
 -0.245159-0.205651im 
 -0.223725+0.271619im 
 -0.223725-0.271619im 
 -0.196464+0.335399im 
 -0.196464-0.335399im 
 -0.163595+0.39648im  
 -0.163595-0.39648im  
          ⋮           
   1.26059+0.138027im 
   1.26059-0.138027im 
   1.24516+0.205651im 
   1.24516-0.205651im 
   1.22373+0.271619im 
   1.22373-0.271619im 
   1.19646+0.335399im 
   1.19646-0.335399im 
    1.1636+0.39648im  
    1.1636-0.39648im  
   1.12538+0.454368im 
   1.12538-0.454368im 

In [33]:
# Fact 2
maximum(abs(λ-μ))

0.7730166686363998

In [34]:
y'*ΔJ*x/(y⋅x)

1-element Array{Float64,1}:
 Inf

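Here the first-order estimate breaks down and returns `Inf`: the eigenvalue $0.5$ of $J$ is not simple, and its right and left eigenvectors $x=e_1$ and $y=e_n$ are orthogonal, so $y^*x=0$. However, since $B$ is diagonalizable, we can apply the Bauer-Fike theorem to it: 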

In [35]:
μ,Y=eig(B)

(Complex{Float64}[-0.273017+0.0im,-0.269905+0.0692927im,-0.269905-0.0692927im,-0.260594+0.138027im,-0.260594-0.138027im,-0.245159+0.205651im,-0.245159-0.205651im,-0.223725+0.271619im,-0.223725-0.271619im,-0.196464+0.335399im  …  1.24516+0.205651im,1.24516-0.205651im,1.22373+0.271619im,1.22373-0.271619im,1.19646+0.335399im,1.19646-0.335399im,1.1636+0.39648im,1.1636-0.39648im,1.12538+0.454368im,1.12538-0.454368im],
70x70 Array{Complex{Float64},2}:
   -0.634386+0.0im  …              0.634386-0.0im      
    0.490391+0.0im                 0.396734-0.288244im 
    -0.37908+0.0im                 0.117142-0.360527im 
    0.293035+0.0im               -0.0905529-0.278693im 
   -0.226521+0.0im                -0.183259-0.133146im 
    0.175105+0.0im  …    -0.175105-8.85556e-13im       
   -0.135359+0.0im                -0.109508+0.0795619im
    0.104635+0.0im               -0.0323339+0.0995134im
  -0.0808843+0.0im                0.0249946+0.0769255im
 ⋮

In [36]:
cond(Y)

5.1876270485046625e7

In [37]:
norm(inv(Y)*ΔJ*Y), cond(Y)*norm(ΔJ)

(0.7730166686343102,0.7730166686333213)
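
The size of this eigenvalue shift can also be derived by hand. Write $B=cI+P$, where $P$ has ones on the superdiagonal and $\varepsilon=\sqrt{\mathrm{eps}}=2^{-26}$ in the lower left corner. Then $Pe_1=\varepsilon e_n$ and $Pe_j=e_{j-1}$ for $j\geq 2$, so $P^n=\varepsilon I$ and

$$
\det(\mu I-P)=\mu^n-\varepsilon .
$$

The eigenvalues of $B$ are therefore equally spaced on a circle centered at $c$,

$$
\tilde\lambda_k=c+\varepsilon^{1/n}e^{2\pi i k/n},\quad k=0,1,\ldots,n-1,\qquad
|\tilde\lambda_k-c|=\varepsilon^{1/n}=2^{-26/70}\approx 0.7730,
$$

which agrees with the computed value above. A perturbation of size $\varepsilon$ thus moves the eigenvalues of a Jordan block of order $n$ by $\varepsilon^{1/n}$, in line with the exponent $1/n$ in Fact 1.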

### Example - Normal matrix

In [38]:
using SpecialMatrices

In [39]:
n=6
C=Circulant(rand(n))

6x6 SpecialMatrices.Circulant{Float64}:
 0.280027   0.456382   0.89517    0.147186   0.314493   0.0329194
 0.0329194  0.280027   0.456382   0.89517    0.147186   0.314493 
 0.314493   0.0329194  0.280027   0.456382   0.89517    0.147186 
 0.147186   0.314493   0.0329194  0.280027   0.456382   0.89517  
 0.89517    0.147186   0.314493   0.0329194  0.280027   0.456382 
 0.456382   0.89517    0.147186   0.314493   0.0329194  0.280027 

In [40]:
λ=eigvals(full(C))

6-element Array{Complex{Float64},1}:
   2.12618+0.0im     
  0.853202+0.0im     
 -0.227341+0.86961im 
 -0.227341-0.86961im 
 -0.422269+0.136152im
 -0.422269-0.136152im

In [41]:
ΔC=rand(n,n)*0.0001

6x6 Array{Float64,2}:
 2.70169e-5  4.43039e-5  4.50529e-5  7.88329e-5  4.25021e-5  7.55726e-5
 6.86612e-5  5.53859e-5  8.13803e-5  2.71019e-5  6.88746e-5  6.87128e-5
 1.1089e-5   3.69796e-5  3.66697e-5  4.31724e-5  6.25283e-5  3.94477e-5
 1.01807e-5  6.73907e-5  1.49378e-6  9.07601e-5  7.39689e-5  6.037e-5  
 3.62829e-5  7.80455e-5  3.79155e-5  2.09202e-5  1.5202e-5   1.46986e-5
 9.25536e-5  2.45492e-5  9.49463e-5  2.76453e-5  3.174e-5    5.73445e-5

In [42]:
@show norm(ΔC)
μ=eigvals(C+ΔC)

norm(ΔC) = 0.00029795097944865007


6-element Array{Complex{Float64},1}:
   2.12647+0.0im     
  0.853175+0.0im     
 -0.227349+0.869626im
 -0.227349-0.869626im
 -0.422252+0.136135im
 -0.422252-0.136135im
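
The cells above display the eigenvalues but do not compare their change with a bound. The following self-contained sketch does so for the bound $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\sqrt{n}\|\Delta A\|_F$ for normal $A$. It assumes Julia 1.x with the standard `LinearAlgebra` library, where `norm` of a matrix is the Frobenius norm. The circulant matrix is built by hand instead of using `SpecialMatrices`, and the entries of `c` and `ΔC0` are illustrative choices. The permutation $\tau$ is found by pairing each perturbed eigenvalue with the nearest unperturbed one, which is adequate here only because the eigenvalues are well separated relative to the perturbation.

```julia
using LinearAlgebra

c = [1.0, 2.0, 3.0]
n = length(c)
# Circulant matrix with first column c, built by hand
C0 = [c[mod(i - j, n) + 1] for i in 1:n, j in 1:n]
ΔC0 = 1e-4 * [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 10.0]

# A circulant matrix is normal: C0*C0' - C0'*C0 vanishes up to rounding
normality_defect = norm(C0 * C0' - C0' * C0)

λ0 = eigvals(C0)
μ0 = eigvals(C0 + ΔC0)

# Pair each perturbed eigenvalue with the nearest unperturbed one
τ = [argmin(abs.(λ0 .- μi)) for μi in μ0]

lhs = norm(λ0[τ] - μ0)          # ‖Λ - Λ̃_τ‖_F for this pairing
rhs = sqrt(n) * norm(ΔC0)       # √n ‖ΔA‖_F

println((isperm(τ), lhs, rhs))
```

If `isperm(τ)` is `true`, the pairing is a genuine permutation and `lhs` should not exceed `rhs`.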

### Example - Hermitian matrix

In [43]:
m=10
n=6
A=rand(m,n)
# Some scaling
D=diagm((rand(n)-0.5)*exp(20))
A=A*D

10x6 Array{Float64,2}:
 -1.40598e7  -1.53282e7  4.81769e6  5.6818e7   -2.22987e7  -1.35568e8
 -3.30994e6  -4.32384e6  1.78038e6  1.0496e8   -1.8647e7   -1.38361e8
 -3.30849e7  -2.04635e7  6.17044e5  6.31594e7  -7.19669e7  -9.56312e6
 -1.0634e8   -3.14615e7  1.68204e6  5.7415e7   -8.16104e7  -1.27169e7
 -4.09174e7  -1.07478e7  8.57308e5  1.03076e8  -4.40167e7  -1.33705e8
 -6.53127e7  -2.94391e6  2.0912e6   1.0963e8   -8.96735e7  -9.95375e7
 -1.22917e8  -3.90819e6  3.76049e6  1.49196e8  -6.12102e6  -4.72304e7
 -1.36662e8  -5.28349e5  1.71862e6  8.64603e7  -9.99323e7  -1.10229e8
 -3.41082e6  -2.45257e7  4.90168e6  1.25894e8  -1.19024e8  -1.67825e8
 -1.22193e8  -2.80482e7  1.20269e6  2.79709e7  -7.63784e7  -1.2892e8 

In [44]:
A=cor(A)

6x6 Array{Float64,2}:
  1.0        -0.0505982    0.28378      0.148572  0.133623   -0.397334 
 -0.0505982   1.0          0.00824342   0.589239  0.353699   -0.178353 
  0.28378     0.00824342   1.0          0.384523  0.10674    -0.366137 
  0.148572    0.589239     0.384523     1.0       0.17392    -0.137651 
  0.133623    0.353699     0.10674      0.17392   1.0         0.0695847
 -0.397334   -0.178353    -0.366137    -0.137651  0.0695847   1.0      

In [45]:
ΔA=cor(rand(m,n)*D)*1e-5

6x6 Array{Float64,2}:
  1.0e-5      -1.75051e-6   7.27812e-7  -2.84144e-6   2.61253e-6  -1.4968e-6 
 -1.75051e-6   1.0e-5      -5.44976e-6   6.24717e-6  -1.08082e-6   4.28063e-6
  7.27812e-7  -5.44976e-6   1.0e-5      -3.11945e-6  -4.48206e-6   9.84006e-7
 -2.84144e-6   6.24717e-6  -3.11945e-6   1.0e-5      -3.09465e-6   4.83014e-6
  2.61253e-6  -1.08082e-6  -4.48206e-6  -3.09465e-6   1.0e-5      -7.72464e-7
 -1.4968e-6    4.28063e-6   9.84006e-7   4.83014e-6  -7.72464e-7   1.0e-5    

In [46]:
B=A+ΔA

6x6 Array{Float64,2}:
  1.00001   -0.0506       0.283781     0.148569  0.133626   -0.397336 
 -0.0506     1.00001      0.00823797   0.589246  0.353698   -0.178349 
  0.283781   0.00823797   1.00001      0.38452   0.106736   -0.366136 
  0.148569   0.589246     0.38452      1.00001   0.173917   -0.137646 
  0.133626   0.353698     0.106736     0.173917  1.00001     0.0695839
 -0.397336  -0.178349    -0.366136    -0.137646  0.0695839   1.00001  

In [47]:
λ,U=eig(A) 
μ=eigvals(B)
[λ μ]

6x2 Array{Float64,2}:
 0.198374  0.19838 
 0.580301  0.580309
 0.768204  0.768214
 0.918354  0.918368
 1.44494   1.44495 
 2.08983   2.08984 

In [48]:
# Residual bounds - how close is μ, y to λ[k],U[:,k]
k=3
μ=round(λ[k],2)
y=round(U[:,k],2)
y=y/norm(y)

6-element Array{Float64,1}:
  0.229279 
  0.378809 
 -0.687837 
 -0.239248 
 -0.0697805
 -0.51837  

In [49]:
μ

0.77

In [50]:
# Fact 9
r=A*y-μ*y

6-element Array{Float64,1}:
 -0.000531079
 -0.00334765 
  0.000334566
 -0.00302493 
 -0.00252885 
  0.00203196 

In [51]:
minimum(abs(μ-λ)), norm(r)

(0.0017962451153789027,0.005592394255532198)

In [52]:
# Fact 10 - μ is Rayleigh quotient
μ=(y'*A*y)[] # Vector, unfortunately
r=A*y-μ*y

6-element Array{Float64,1}:
 -0.000124534
 -0.00267597 
 -0.000885071
 -0.00344915 
 -0.00265258 
  0.00111282 

In [53]:
η=min(abs(μ-λ[k-1]),abs(μ-λ[k+1]))

0.15012666998626978

In [54]:
μ-λ[k], norm(r)^2/η

(2.309646400422416e-5,0.0001873805458551144)

In [55]:
# Eigenvector bound
# cos(θ)
cosθ=dot(y,U[:,k])
# sin(θ)
sinθ=sqrt(1-cosθ^2)
sinθ,norm(r)/η

(0.005544671937376312,0.035329161020332664)

In [56]:
# Residual bounds - Fact 12
U=eigvecs(A)
Q=round(U[:,1:3],2)
# Orthogonalize
X,R=qr(Q)
M=X'*A*X
R=A*X-X*M
μ=eigvals(M)
λ, μ, R

([0.19837443442964567,0.5803008298044705,0.7682037548846211,0.9183535213348951,1.4449353271002765,2.0898321324460927],[0.19843938978219455,0.5803164493996646,0.7682289660100081],
6x3 Array{Float64,2}:
  0.0018205    0.00128987   -0.000503279
  0.00580417  -0.00253157   -0.00327183 
  0.00408486   0.000309214  -0.00117756 
  0.00657818  -0.00195991   -0.00303458 
  0.00191     -0.00100452   -0.00231384 
 -0.00366677  -0.000649976   0.000661031)

In [57]:
# The entries of μ are not ordered - which algorithm was called?
issym(M)

false

In [58]:
M=Hermitian(M)
R=A*X-X*M
μ=eigvals(M)

3-element Array{Float64,1}:
 0.198439
 0.580316
 0.768229

In [59]:
η=λ[4]-λ[3]

0.150149766450274

In [60]:
norm(λ[1:3]-μ), vecnorm(R)^2/η

(7.140567499891654e-5,0.0010312354761840806)

In [61]:
# Fact 13
M=A[1:3,1:3]
H=A[4:6,4:6]
E=A[4:6,1:3]
# Block-diagonal matrix
B=cat([1,2],M,H)

6x6 Array{Float64,2}:
  1.0        -0.0505982   0.28378      0.0       0.0         0.0      
 -0.0505982   1.0         0.00824342   0.0       0.0         0.0      
  0.28378     0.00824342  1.0          0.0       0.0         0.0      
  0.0         0.0         0.0          1.0       0.17392    -0.137651 
  0.0         0.0         0.0          0.17392   1.0         0.0695847
  0.0         0.0         0.0         -0.137651  0.0695847   1.0      

In [62]:
# η is the minimum over all pairs of eigenvalues of M and H
η=minimum(abs(eigvals(M).-eigvals(H)'))
μ=eigvals(B)
[λ μ], 2*norm(E)^2/(η+sqrt(η^2+4*norm(E)^2))

(
6x2 Array{Float64,2}:
 0.198374  0.710213
 0.580301  0.741316
 0.768204  1.00285 
 0.918354  1.0673  
 1.44494   1.19139 
 2.08983   1.28694 ,

0.9191335682117601)

In [63]:
# Eigenspace bounds - Fact 14
B=A+ΔA
μ,V=eig(B)

([0.19837978573850565,0.5803093118649132,0.7682141929345702,0.9183679137342919,1.4449536001424932,2.0898351955852275],
6x6 Array{Float64,2}:
  0.218703  -0.606799   -0.230288    0.450177   0.459302   0.342511
  0.59387    0.0550786  -0.383002   -0.201056  -0.523811   0.427543
  0.362037   0.308178    0.687362   -0.072474   0.324656   0.436847
 -0.536699  -0.439542    0.237067   -0.336018  -0.269445   0.526353
 -0.259439   0.330311    0.0676698   0.773572  -0.375119   0.282631
  0.336249  -0.481129    0.516751    0.200856  -0.442659  -0.390038)

In [64]:
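# sin(Θ(U[:,1:3],V[:,1:3]))
X=U[:,1:3]
Q=V[:,1:3]
# X and Q have orthonormal columns, so the scaling factors (Q'*Q)^(-1/2) and
# (X'*X)^(-1/2) from the definition are identity matrices up to rounding
cosθ=svdvals(sqrtm(Q'*Q)*Q'*X*sqrtm(X'*X))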
sinθ=sqrt(1-cosθ.^2)

3-element Array{Float64,1}:
 3.78149e-7
 7.37222e-6
 2.08993e-5

In [65]:
# Bound
M=Q'*A*Q

3x3 Array{Float64,2}:
  0.198374   2.0401e-6   -2.8022e-6 
  2.0401e-6  0.580301     4.77924e-7
 -2.8022e-6  4.77924e-7   0.768204  

In [66]:
R=A*Q-Q*M

6x3 Array{Float64,2}:
  8.10106e-9  -2.37492e-6   9.01564e-7
  4.25945e-8   4.66762e-6   1.19302e-6
 -4.76213e-7  -2.10329e-7  -2.17054e-7
 -3.30246e-7   3.88803e-6   4.05916e-7
  1.01383e-6   1.28973e-6   3.54605e-6
  6.87361e-7   7.28328e-7   9.24148e-7

In [67]:
eigvals(M), λ

([0.1983744344317503,0.768203754952121,0.5803008298595141],[0.19837443442964567,0.5803008298044705,0.7682037548846211,0.9183535213348951,1.4449353271002765,2.0898321324460927])

In [68]:
# η is the minimum over all pairs of eigenvalues of M and the remaining eigenvalues λ[4:6]
η=minimum(abs(eigvals(M).-λ[4:6]'))
vecnorm(sinθ), vecnorm(R)/η

(2.2164694638019082e-5,5.26505e-5)