# Eigenvalue Decomposition - Perturbation Theory


## Prerequisites

The reader should be familiar with basic linear algebra concepts and facts about eigenvalue decomposition. 

## Competences 

The reader should be able to understand and check the facts about perturbations of eigenvalues and eigenvectors.


## Norms

In order to measure changes, we need to define norms. For more details and the proofs of the Facts below, see 
[R. Byers and B. N. Datta, Vector and Matrix Norms, Error Analysis, Efficiency, and Stability][Hog14] and the references therein.

[Hog14]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 50.1-50.24, CRC Press, Boca Raton, 2014."

### Definitions

__Norm__ on a vector space $X$ is a real-valued function $\| \phantom{x} \| : X\to \mathbb{R}$ with the following properties:

1. __Positive definiteness.__ $\| x\|\geq 0$ and $\|x\|=0$ if and only if $x$ is the zero vector.
2. __Homogeneity.__ $\| \lambda x\|=|\lambda| \|x\|$ 
3. __Triangle inequality.__ $\| x+y\| \leq \|x\|+\|y\|$

Commonly encountered vector norms for $x\in\mathbb{C}^n$ are:

* __H&ouml;lder norm__ or $p$-__norm__: for $p\geq 1$, $\|x\|_p=\big(|x_1|^p+|x_2|^p+\cdots+|x_n|^p\big)^{1/p}$,
* __Sum norm__ or $1$-__norm__: $\|x\|_1=|x_1|+|x_2|+\cdots+|x_n|$,
* __Euclidean norm__ or $2$-__norm__: $\|x\|_2=\sqrt{|x_1|^2+|x_2|^2+\cdots+|x_n|^2}$,
* __Sup-norm__ or $\infty$-__norm__: $\|x\|_\infty = \max\limits_{i=1,\ldots,n} |x_i|$.

A vector norm is __absolute__ if $\||x|\|=\|x\|$, where $|x|$ denotes the vector of entrywise absolute values.

A vector norm is __monotone__ if $|x|\leq |y|$ (entrywise) implies $\|x\|\leq \|y\|$. 

From every vector norm we can derive a corresponding __induced__ matrix norm (also, __operator norm__ or __natural norm__):

$$\|A\| = \max\limits_{x\neq 0} \frac{\|Ax\|}{\|x\|}=\max\limits_{\|x\|=1} \|Ax\|.$$

For matrix $A\in\mathbb{C}^{m\times n}$ we define:

* __Maximum absolute column sum norm__: $\|A\|_1=\max\limits_{1\leq j \leq n} \sum_{i=1}^m |a_{ij}|$,
* __Spectral norm__: $\|A\|_2=\sqrt{\rho(A^*A)}=\sigma_{\max}(A)$  (largest singular value of $A$),
* __Maximum absolute row sum norm__: $\|A\|_{\infty}=\max\limits_{1\leq i \leq m} \sum_{j=1}^n |a_{ij}|$,
* __Euclidean norm__ or __Frobenius norm__: 
$\|A\|_F =\sqrt{\sum_{i,j} |a_{ij}|^2}=\sqrt{\mathop{\mathrm{tr}}(A^*A)}$.

Matrix norm is __consistent__ if $\|A\cdot B\|\leq \|A\| \cdot \| B\|$, where $A$ and $B$ are compatible for matrix multiplication.

Matrix norm is __absolute__ if $\||A|\|=\|A\|$.
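The closed-form expressions above can be compared with Julia's built-in functions. The following is a minimal sketch on a small hand-picked matrix; the matrix `A` and the variable names are illustrative assumptions. In Julia, `opnorm` computes the induced norms, while `norm` applied to a matrix returns the Frobenius norm.

```julia
using LinearAlgebra

# Small illustrative matrix (hypothetical data)
A = [1.0 -2.0; 3.0 4.0]

# Maximum absolute column sum and maximum absolute row sum
norm1 = maximum(sum(abs, A, dims=1))
norminf = maximum(sum(abs, A, dims=2))
# Frobenius norm from the entrywise definition
normF = sqrt(sum(abs2, A))
# Spectral norm as the square root of the largest eigenvalue of A'A
norm2 = sqrt(maximum(eigvals(Symmetric(A' * A))))

# Each pair should agree (the last two up to rounding)
println((norm1, opnorm(A, 1)))
println((norminf, opnorm(A, Inf)))
println((normF, norm(A)))
println((norm2, opnorm(A)))
```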

### Examples

In [2]:
import Random
Random.seed!(425)
x=rand(-4:4,5)

5-element Array{Int64,1}:
 -2
 -3
 -2
  4
  2

In [4]:
using LinearAlgebra
norm(x,1), norm(x), norm(x,Inf)

(13.0, 6.082762530298219, 4.0)

In [5]:
A=rand(-4:4,7,5)

7×5 Array{Int64,2}:
  0  -3   2   0   4
  1   1   4  -3  -3
  3   2   1  -3   0
 -4  -4  -2  -1  -3
  2   3  -4   2   0
 -1  -1   4  -3  -3
 -1  -3  -2   4   4

In [7]:
norm(A,1), norm(A), norm(A,2), norm(A,Inf), 
opnorm(A), opnorm(A,1), opnorm(A,Inf)

(81.0, 15.7797338380595, 15.7797338380595, 4.0, 11.053558535027305, 19.0, 14.0)

In [9]:
# Frobenius norm
norm(vec(A)) 

15.7797338380595

### Facts


1. $\|x\|_1$, $\|x\|_2$, $\|x\|_\infty$ and $\|x\|_p$ are absolute and monotone vector norms.
2. A vector norm is absolute iff it is monotone.
3. __Convergence.__ $x_k\to x_*$ iff $\|x_k-x_*\|\to 0$ for any vector norm.
4. Any two vector norms are equivalent in the sense that, for all $x$ and some $\alpha,\beta>0$
$$
\alpha \|x\|_\mu \leq \|x\|_\nu \leq \beta \|x\|_\mu.
$$
In particular:
   * $\|x\|_2 \leq \|x\|_1\leq \sqrt{n}\|x\|_2$,
   * $\|x\|_\infty \leq \|x\|_2\leq \sqrt{n}\|x\|_\infty$,
   * $\|x\|_\infty \leq \|x\|_1\leq n\|x\|_\infty$.
5. __Cauchy-Schwarz inequality.__ $|x^*y|\leq \|x\|_2\|y\|_2$.
6. __H&ouml;lder inequality.__ If $p,q\geq 1$ and $\frac{1}{p}+\frac{1}{q}=1$, then $|x^*y|\leq \|x\|_p\|y\|_q$.
7. $\|A\|_1$, $\|A\|_2$ and $\|A\|_\infty$ are induced by the corresponding vector norms.
8. $\|A\|_F$ is not an induced norm.
9. $\|A\|_1$, $\|A\|_2$, $\|A\|_\infty$ and $\|A\|_F$ are consistent.
10. $\|A\|_1$, $\|A\|_\infty$ and $\|A\|_F$ are absolute. However, in general, $\||A|\|_2\neq \|A\|_2$.
11. Any two matrix norms are equivalent in the sense that, for all $A$ and some $\alpha,\beta>0$
$$
\alpha \|A\|_\mu \leq \|A\|_\nu \leq \beta \|A\|_\mu.
$$
In particular:
   * $\frac{1}{\sqrt{n}}\|A\|_\infty \leq \|A\|_2\leq \sqrt{m}\|A\|_\infty$,
   * $\|A\|_2 \leq \|A\|_F\leq \sqrt{n}\|A\|_2$,
   * $\frac{1}{\sqrt{m}}\|A\|_1 \leq \|A\|_2\leq \sqrt{n}\|A\|_1$.
12. $\|A\|_2\leq \sqrt{\|A\|_1 \|A\|_\infty}$.
13. $\|AB\|_F\leq \|A\|_F\|B\|_2$ and $\|AB\|_F\leq \|A\|_2\|B\|_F$.
14. If $A=xy^*$, then $\|A\|_2=\|A\|_F=\|x\|_2\|y\|_2$.
15. $\|A^*\|_2=\|A\|_2$ and $\|A^*\|_F=\|A\|_F$.
16. For a unitary matrix $U$ of compatible dimension,
$$\|AU\|_2=\|A\|_2,\quad \|AU\|_F=\|A\|_F,\quad
\|UA\|_2=\|A\|_2,\quad  \|UA\|_F=\|A\|_F.
$$
17. For $A$ square and any consistent matrix norm, $\rho(A)\leq\|A\|$.
18. For $A$ square, $A^k\to 0$ iff $\rho(A)<1$.
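The vector-norm facts in this list (norm equivalence, the Cauchy-Schwarz and H&ouml;lder inequalities, and absoluteness) can be spot-checked numerically. The sketch below uses two hand-picked vectors and the conjugate pair $p=3$, $q=3/2$; these values are assumptions chosen only for illustration, and a numerical check of this kind is of course not a proof.

```julia
using LinearAlgebra

# Illustrative vectors (hypothetical data)
x = [3.0, -4.0, 0.0, 1.0]
y = [1.0, 2.0, -2.0, 0.5]
n = length(x)

# Equivalence of the 1-, 2- and ∞-norms
@show norm(x, 2) <= norm(x, 1) <= sqrt(n) * norm(x, 2)
@show norm(x, Inf) <= norm(x, 2) <= sqrt(n) * norm(x, Inf)
@show norm(x, Inf) <= norm(x, 1) <= n * norm(x, Inf)

# Cauchy-Schwarz, and Hölder with the conjugate pair p = 3, q = 3/2
p, q = 3.0, 1.5
@show abs(dot(x, y)) <= norm(x) * norm(y)
@show abs(dot(x, y)) <= norm(x, p) * norm(y, q)

# An absolute norm only sees the entrywise absolute values
@show norm(abs.(x), p) ≈ norm(x, p)
```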

In [10]:
# Absolute norms
opnorm(A,1), opnorm(abs.(A),1), opnorm(A,Inf), opnorm(abs.(A),Inf), norm(A),
norm(abs.(A)),  opnorm(A),opnorm(abs.(A))

(19.0, 19.0, 14.0, 14.0, 15.7797338380595, 15.7797338380595, 11.053558535027305, 14.024662857881664)

In [11]:
# Equivalence of norms
m,n=size(A)
opnorm(A,Inf)/sqrt(n),opnorm(A), sqrt(m)*opnorm(A,Inf)

(6.26099033699941, 11.053558535027305, 37.04051835490427)

In [12]:
opnorm(A), norm(A), sqrt(n)*opnorm(A)

(11.053558535027305, 15.7797338380595, 24.716508277594045)

In [13]:
opnorm(A,1)/sqrt(m),opnorm(A), sqrt(n)*opnorm(A,1)

(7.18132498717532, 11.053558535027305, 42.48529157249601)

In [14]:
# Fact 12
opnorm(A), sqrt(opnorm(A,1)*opnorm(A,Inf))

(11.053558535027305, 16.30950643030009)

In [16]:
# Fact 13
B=rand(n,rand(1:9))
norm(A*B), norm(A)*opnorm(B), opnorm(A)*norm(B)

(20.494754750655506, 38.06835891122696, 28.786978256459832)

In [17]:
# Fact 14
x=rand(10)+im*rand(10)
y=rand(10)+im*rand(10)
A=x*y'
opnorm(A), norm(A), norm(x)*norm(y)

(6.401718334354508, 6.40171833435451, 6.401718334354509)

In [18]:
# Fact 15
A=rand(-4:4,7,5)+im*rand(-4:4,7,5)
opnorm(A), opnorm(A'), norm(A), norm(A')

(16.92170312591604, 16.921703125916043, 23.08679276123039, 23.08679276123039)

In [24]:
# Unitary invariance - generate random unitary matrix U
U,R=qr(rand(ComplexF64,size(A)));

In [25]:
opnorm(A), opnorm(U*A), norm(A), norm(U*A)

(16.92170312591604, 16.92170312591604, 23.08679276123039, 23.08679276123039)

In [26]:
# Spectral radius
A=rand(7,7)+im*rand(7,7)
maximum(abs,eigvals(A)), opnorm(A,Inf), opnorm(A,1), opnorm(A), norm(A)

(4.8459991947110845, 6.2882314110144195, 6.1256533319850615, 5.1158067806404235, 5.74641257060135)

In [27]:
# Fact 18
B=A/(maximum(abs,eigvals(A))+2)
@show maximum(abs,eigvals(B))
norm(B^100)

maximum(abs, eigvals(B)) = 0.7078585691997867


1.0545463767656214e-15

## Errors and condition numbers

We want to answer the question:

__How much does the value of a function change with respect to the change of its argument?__

### Definitions

For a function $f(x)$ and an argument $x$, the __absolute error__ with respect to the __perturbation__ $\delta x$ of the argument is 

$$
\| f(x+\delta x)-f(x)\| = \frac{\| f(x+\delta x)-f(x)\|}{\| \delta x \|} \|\delta x\| \equiv \kappa \|\delta x\|.
$$

The __condition__ or __condition number__ $\kappa$ tells how much the perturbation of the argument is amplified in the value of the function. (Its form resembles a derivative.)

Similarly, the __relative error__ with respect to the relative perturbation of the argument is

$$
\frac{\| f(x+\delta x)-f(x)\|}{\| f(x) \|}= \frac{\| f(x+\delta x)-f(x)\|\cdot  \|x\| }{\|\delta x\| \cdot\| f(x)\|}
\cdot \frac{\|\delta x\|}{\|x\|} \equiv \kappa_{rel} \frac{\|\delta x\|}{\|x\|}.
$$
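As a small worked illustration of the two definitions above, the following sketch evaluates $\kappa$ and $\kappa_{rel}$ for a scalar function. The function $f(x)=x^2$, the point $x=3$ and the perturbation size are assumptions chosen for this example only; for a differentiable scalar function and a small $\delta x$, one expects $\kappa\approx|f'(x)|$ and $\kappa_{rel}\approx|x f'(x)/f(x)|$.

```julia
# Illustrative scalar function and perturbation (assumed values)
f(x) = x^2
x = 3.0
δx = 1e-6

abs_err = abs(f(x + δx) - f(x))
κ = abs_err / abs(δx)             # absolute condition, close to |f'(x)| = 6
κ_rel = κ * abs(x) / abs(f(x))    # relative condition, close to |x f'(x)/f(x)| = 2

println((κ, κ_rel))
```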

## Perturbation bounds

### Definitions

Let $A\in\mathbb{C}^{n\times n}$.

Pair $(\lambda,x)\in\mathbb{C}\times\mathbb{C}^{n}$ is an __eigenpair__ of $A$ if $x\neq 0$ and $Ax=\lambda x$.

Triplet $(y,\lambda,x)\in\mathbb{C}^{n}\times\mathbb{C}\times\mathbb{C}^{n}$ is an __eigentriplet__ of $A$ if $x,y\neq 0$ and $Ax=\lambda x$ and $y^*A=\lambda y^*$.

__Eigenvalue matrix__ is a diagonal matrix $\Lambda=\mathop{\mathrm{diag}}(\lambda_1,\lambda_2,\ldots,\lambda_n)$.

If all eigenvalues are real, they can be increasingly ordered. $\Lambda^\uparrow$ is the eigenvalue matrix of increasingly ordered eigenvalues.

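$\tau$ is a __permutation__ of $\{1,2,\ldots,n\}$, and $\tilde\Lambda_\tau=\mathop{\mathrm{diag}}(\tilde\lambda_{\tau(1)},\ldots,\tilde\lambda_{\tau(n)})$ denotes the correspondingly reordered eigenvalue matrix.

$\tilde A=A+\Delta A$ is a __perturbed matrix__, where $\Delta A$ is the __perturbation__. $(\tilde \lambda,\tilde x)$ are the eigenpairs of $\tilde A$, and $\tilde\Lambda$ is its eigenvalue matrix.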

__Condition number__ of a nonsingular matrix $X$ is $\kappa(X)=\|X\| \|X^{-1}\|$.

Let $X,Y\in\mathbb{C}^{n\times k}$ with $\mathop{\mathrm{rank}}(X)=\mathop{\mathrm{rank}}(Y)=k$. The __canonical angles__ between their column spaces, $\theta_i$, are defined by $\cos \theta_i=\sigma_i$, where $\sigma_i$ are the singular values of the matrix
$$(Y^*Y)^{-1/2}Y^*X(X^*X)^{-1/2}.$$ 
The __canonical angle matrix__ between $X$ and $Y$ is 
$$\Theta(X,Y)=\mathop{\mathrm{diag}}(\theta_1,\theta_2,\ldots,\theta_k).
$$
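The definition of canonical angles translates almost literally into code. The function below is a minimal sketch, assuming both arguments have full column rank; the name `canonical_angles` and the test matrices are hypothetical and not part of the original notebook. The singular values are clamped to $[0,1]$ before `acos` is applied, since rounding can push them slightly above one.

```julia
using LinearAlgebra

# Canonical angles between the column spaces of X and Y (both of full column rank)
function canonical_angles(X::AbstractMatrix, Y::AbstractMatrix)
    # (Y'Y)^(-1/2) * Y'X * (X'X)^(-1/2), as in the definition
    C = inv(sqrt(Hermitian(Y' * Y))) * (Y' * X) * inv(sqrt(Hermitian(X' * X)))
    # Clamp to [0, 1] to guard against rounding before taking acos
    σ = clamp.(svdvals(C), 0.0, 1.0)
    return acos.(σ)
end

# Two planes in R^3 sharing the direction e1: span{e1, e2} and span{e1, e2 + e3}
X = [1.0 0.0; 0.0 1.0; 0.0 0.0]
Y = [2.0 0.0; 0.0 1.0; 0.0 1.0]
θ = canonical_angles(X, Y)
println(θ)
```

For these two planes the angles should be close to $0$ and $\pi/4$, and the scaling of the columns of `Y` has no effect, which is the purpose of the inverse square roots.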
    

### Facts

Bounds become sharper as the matrices have more structure. 
Many bounds have versions in both the spectral norm and the Frobenius norm.
For more details and the proofs of the Facts below, see 
[R.-C. Li, Matrix Perturbation Theory][Hog14a], and the references therein.

[Hog14a]: #1 "L. Hogben, ed., 'Handbook of Linear Algebra', pp. 21.1-21.20, CRC Press, Boca Raton, 2014."

1. There exists $\tau$ such that
$$\|\Lambda- \tilde\Lambda_\tau\|_2\leq 4(\|A\|_2+\|\tilde A\|_2)^{1-1/n}\|\Delta A\|_2^{1/n}.$$

2. __First-order perturbation bounds.__ Let $(y,\lambda,x)$ be an eigentriplet of $A$ with a simple eigenvalue $\lambda$. $\Delta A$ changes $\lambda$ to 
$\tilde\lambda=\lambda+ \delta\lambda$, where
$$
\delta\lambda=\frac{y^*(\Delta A)x}{y^*x}+O(\|\Delta A\|_2^2).
$$

3. Let $\lambda$ be a semisimple eigenvalue of $A$ with multiplicity $k$, and let $X,Y\in \mathbb{C}^{n\times k}$ be the matrices of the corresponding right and left eigenvectors, that is, $AX=\lambda X$ and $Y^*A=\lambda Y^*$, such that $Y^*X=I_k$. $\Delta A$ changes the $k$ copies of $\lambda$ to $\tilde \lambda_i=\lambda+\delta\lambda_i$, where $\delta\lambda_i$ are the eigenvalues of $Y^*(\Delta A) X$ up to $O(\|\Delta A\|_2^2)$.

3. Perturbations of an inverse matrix are as follows: if $\|A\|_p<1$, then $I-A$ is nonsingular and
$$
(I-A)^{-1}=\sum\limits_{k=0}^\infty A^k
$$
with
$$
\|(I-A)^{-1}\|_p \leq \frac{1}{1-\|A\|_p},\qquad 
\|(I-A)^{-1}-I\|_p \leq \frac{\|A\|_p}{1-\|A\|_p}.
$$

4. __Geršgorin Circle Theorem.__ If $X^{-1} A X=D+F$, where $D=\mathop{\mathrm{diag}}(d_1,\ldots,d_n)$ 
and $F$ has zero diagonal entries, then
$$\sigma(A) \subseteq \bigcup\limits_{i=1}^n D_i,$$
where 
$$D_i=\big\{z\in\mathbb{C} : |z-d_i| \leq \sum\limits_{j=1}^n |f_{ij}| \big\}.
$$
Moreover, by continuity, if a connected component of $\bigcup_{i} D_i$ consists of $k$ discs, it contains exactly $k$ eigenvalues of $A$ (counting multiplicities).

4. __Bauer-Fike Theorem.__ If $A$ is diagonalizable and $A=X\Lambda X^{-1}$ is its eigenvalue decomposition, then
$$
\max_i\min_j |\tilde \lambda_i -
\lambda_j|\leq \|X^{-1}(\Delta A)X\|_p\leq \kappa_p(X)\|\Delta A\|_p.
$$

5. If $A$ and $\tilde A$ are diagonalizable, there exists $\tau$ such that
$$\|\Lambda-\tilde\Lambda_\tau\|_F\leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_F.
$$
If $\Lambda$ and $\tilde\Lambda$ are real, then
$$
\|\Lambda^\uparrow-\tilde\Lambda^\uparrow\|_{2,F} \leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_{2,F}.
$$

6. If $A$ is normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\sqrt{n}\|\Delta A\|_F$.

7. __Hoffman-Wielandt Theorem.__ If $A$ and $\tilde A$ are normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\|\Delta A\|_F$.

8. If $A$ and $\tilde A$ are Hermitian, for any unitarily invariant norm $\|\Lambda^\uparrow-\tilde\Lambda^\uparrow\| \leq \|\Delta A\|$. In particular,
\begin{align*}
\max_i|\lambda^\uparrow_i-\tilde\lambda^\uparrow_i|&\leq \|\Delta A\|_2,\\ 
\sqrt{\sum_i(\lambda^\uparrow_i-\tilde\lambda^\uparrow_i)^2}&\leq \|\Delta A\|_F.
\end{align*}

9. __Residual bounds.__ Let $A$ be Hermitian. For some $\tilde\lambda\in\mathbb{R}$ and $\tilde x\in\mathbb{C}^n$ with $\|\tilde x\|_2=1$, define __residual__ $r=A\tilde x-\tilde\lambda\tilde x$. Then
$|\tilde\lambda-\lambda|\leq \|r\|_2$ for some $\lambda\in\sigma(A)$.

10. Let, in addition, $\tilde\lambda=\tilde x^* A\tilde x$ (the Rayleigh quotient), let $\lambda$ be the eigenvalue closest to $\tilde\lambda$ and $x$ be its unit eigenvector, and let 
$$\eta=\mathop{\mathrm{gap}}(\tilde\lambda)= \min_{\lambda\neq\mu\in\sigma(A)}|\tilde\lambda-\mu|.$$
If $\eta>0$, then
$$ |\tilde\lambda-\lambda|\leq \frac{\|r\|_2^2}{\eta},\quad \sin\theta(x,\tilde x)\leq \frac{\|r\|_2}{\eta}.
$$

11. Let $A$ be Hermitian, let $X\in\mathbb{C}^{n\times k}$ have full column rank, and let $M\in\mathcal{H}_k$ (a $k\times k$ Hermitian matrix) have eigenvalues 
$\mu_1\leq\mu_2\leq\cdots\leq\mu_k$. Set $R=AX-XM$. Then
there exist $\lambda_{i_1}\leq\lambda_{i_2}\leq\cdots\leq\lambda_{i_k}\in\sigma(A)$ such that
\begin{align*}    
\max_{1\leq j\leq k} |\mu_j-\lambda_{i_j}|& \leq \frac{\|R\|_2}{\sigma_{\min}(X)},\\
\sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}&\leq \frac{\|R\|_F}{\sigma_{\min}(X)}.
\end{align*}
(The indices $i_j$ need not be the same in the above formulae.)

12. If, additionally, $X^*X=I$ and $M=X^*AX$, and if all but $k$ of $A$'s eigenvalues differ from every one of $M$'s eigenvalues by at least $\eta>0$, then
$$\sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}\leq \frac{\|R\|_F^2}{\eta\sqrt{1-\|R\|_F^2/\eta^2}}.$$

13. Let $A=\begin{bmatrix} M & E^* \\ E & H \end{bmatrix}$ and $\tilde A=\begin{bmatrix} M & 0 \\ 0 & H \end{bmatrix}$ be Hermitian, and set $\eta=\min |\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(H)$. Then
$$ 
\max |\lambda_j^\uparrow -\tilde\lambda_j^\uparrow| \leq \frac{2\|E\|_2^2}{\eta+\sqrt{\eta^2+4\|E\|_2^2}}.
$$

14. Let 
$$
\begin{bmatrix} X_1^*\\ X_2^* \end{bmatrix} A \begin{bmatrix} X_1 & X_2 \end{bmatrix}=
\begin{bmatrix} A_1 &  \\ & A_2 \end{bmatrix}, \quad \begin{bmatrix} X_1 & X_2 \end{bmatrix} \textrm{unitary},
\quad X_1\in\mathbb{C}^{n\times k}.
$$
Let $Q\in\mathbb{C}^{n\times k}$ have orthonormal columns and for a Hermitian $k\times k$ matrix $M$ set
$R=AQ-QM$. Let $\eta=\min|\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(A_2)$. If $\eta > 0$, then
$$
\|\sin\Theta(X_1,Q)\|_F\leq \frac{\|R\|_F}{\eta}.
$$
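The Geršgorin Circle Theorem from the list above is not exercised by the examples that follow, so here is a minimal sketch for the simplest choice $X=I$, where the disc centers are the diagonal entries of $A$ and the radii are the off-diagonal absolute row sums. The helper name `gershgorin_discs` and the test matrix are assumptions made for this illustration.

```julia
using LinearAlgebra

# Geršgorin discs for X = I: centers are the diagonal entries,
# radii are the off-diagonal absolute row sums
function gershgorin_discs(A::AbstractMatrix)
    centers = diag(A)
    radii = [sum(abs, A[i, :]) - abs(A[i, i]) for i in axes(A, 1)]
    return centers, radii
end

# Illustrative matrix with well separated diagonal entries (hypothetical data)
A = [4.0 0.5 0.1; 0.2 1.0 0.3; 0.1 0.1 -2.0]
centers, radii = gershgorin_discs(A)
λ = eigvals(A)

# Every eigenvalue must lie in at least one disc
covered = [any(abs.(z .- centers) .<= radii) for z in λ]
println(collect(zip(centers, radii)))
println(covered)
```

In this example the three discs are pairwise disjoint, so by the continuity argument each of them contains exactly one eigenvalue.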

### Example - Nondiagonalizable matrix

In [28]:
A=[-3 7 -1; 6 8 -2; 72 -28 19]

3×3 Array{Int64,2}:
 -3    7  -1
  6    8  -2
 72  -28  19

In [30]:
# (Right) eigenvectors
λ,X=eigen(A)
λ

3-element Array{Float64,1}:
 -6.000000000000005
 15.000000241477958
 14.999999758522048

In [31]:
X

3×3 Array{Float64,2}:
  0.235702   0.218218  -0.218218
 -0.235702   0.436436  -0.436436
 -0.942809  -0.872872   0.872872

In [32]:
cond(X)

9.091581997455394e7

In [34]:
# Left eigenvectors
λ1,Y=eigen(Matrix(A'))

Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
eigenvalues:
3-element Array{Complex{Float64},1}:
 -5.999999999999998 + 0.0im                  
 14.999999999999993 + 2.0088262607214127e-7im
 14.999999999999993 - 2.0088262607214127e-7im
eigenvectors:
3×3 Array{Complex{Float64},2}:
     0.894427+0.0im      0.970143+0.0im             0.970143-0.0im       
    -0.447214+0.0im  -7.58506e-16-1.62404e-8im  -7.58506e-16+1.62404e-8im
 -6.07504e-17+0.0im      0.242536+4.0601e-9im       0.242536-4.0601e-9im 

In [35]:
# Try k=2,3
k=3
Y[:,k]'*A-λ[k]*Y[:,k]'

1×3 Adjoint{Complex{Float64},Array{Complex{Float64},1}}:
 2.34268e-7+1.94885e-7im  9.60124e-15-3.9217e-15im  5.8567e-8+4.87212e-8im

In [36]:
ΔA=rand(3,3)/20
B=A+ΔA

3×3 Array{Float64,2}:
 -2.96907    7.02339  -0.996919
  6.03473    8.03619  -1.98852 
 72.0438   -27.985    19.0119  

In [37]:
norm(ΔA)

0.08030853365461708

In [38]:
μ,Z=eigen(B)

Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
eigenvalues:
3-element Array{Complex{Float64},1}:
 -5.9873733390319295 + 0.0im               
  15.033200215838555 + 0.5349652719330414im
  15.033200215838555 - 0.5349652719330414im
eigenvectors:
3×3 Array{Complex{Float64},2}:
 -0.235832+0.0im   -0.21523+0.0289587im   -0.21523-0.0289587im
  0.235187+0.0im  -0.429746+0.0578329im  -0.429746-0.0578329im
  0.942905+0.0im   0.874535+0.0im         0.874535-0.0im      

In [39]:
# Fact 2
δλ=μ[1]-λ[1]

0.012626660968075853 + 0.0im

In [40]:
k=1
Y[:,k]'*ΔA*X[:,k]/(Y[:,k]⋅X[:,k])

0.012608691870119986 + 0.0im
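The first-order formula used in the last two cells can also be checked on a case where everything is known in closed form. The sketch below uses a hand-picked $2\times 2$ matrix whose eigentriplet for $\lambda=1$ is written down explicitly; the matrix, the perturbation and the variable names are assumptions for illustration. The difference between the actual and the predicted change should be of order $\|\Delta A\|_2^2$.

```julia
using LinearAlgebra

A = [1.0 2.0; 0.0 3.0]
# Eigentriplet of the simple eigenvalue λ = 1: A*x = x and y'*A = y'
λ = 1.0
x = [1.0, 0.0]
y = [1.0, -1.0]

ε = 1e-4
ΔA = ε * [0.0 0.0; 1.0 0.0]

# First-order prediction y'ΔA x / (y'x)
δλ_pred = dot(y, ΔA * x) / dot(y, x)
# Actual change of the eigenvalue closest to λ
μ = eigvals(A + ΔA)
δλ = μ[argmin(abs.(μ .- λ))] - λ

println((δλ, δλ_pred, abs(δλ - δλ_pred)))
```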

### Example - Jordan form

In [43]:
n=10
c=0.5
J=Bidiagonal(c*ones(n),ones(n-1),'U')

10×10 Bidiagonal{Float64,Array{Float64,1}}:
 0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5

In [44]:
# Accurately defined eigenvalues
λ=eigvals(J)

10-element Array{Float64,1}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

In [45]:
# Only one eigenvector
X=eigvecs(J)

10×10 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [46]:
x=eigvecs(J)[:,1]
y=eigvecs(J')[:,1]

10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 1.0

In [48]:
y'*Matrix(J)-0.5*y'

1×10 Adjoint{Float64,Array{Float64,1}}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [52]:
# Just one perturbed element in the lower left corner
ΔJ=sqrt(eps())*[zeros(n-1);1]*Matrix(I,1,n)

10×10 Array{Float64,2}:
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 1.49012e-8  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [53]:
B=J+ΔJ
μ=eigvals(B)

10-element Array{Complex{Float64},1}:
 0.3350615111920276 + 0.0im                
 0.3665619595167544 + 0.09694841124280963im
 0.3665619595167544 - 0.09694841124280963im
 0.4490312038888047 + 0.15686582456677023im
 0.4490312038888047 - 0.15686582456677023im
 0.5509687960268455 + 0.15686582462828694im
 0.5509687960268455 - 0.15686582462828694im
 0.6334380405156275 + 0.09694841134170258im
 0.6334380405156275 - 0.09694841134170258im
 0.6649384889119079 + 0.0im                

In [54]:
# Fact 2
maximum(abs,λ-μ)

0.16493848891190788

In [55]:
# The first-order formula breaks down here: y⋅x = 0 for this defective eigenvalue
y'*ΔJ*x/(y⋅x)

Inf

However, since $B$ is diagonalizable, we can apply the Bauer-Fike theorem to it, treating $B$ as the unperturbed matrix and $J=B-\Delta J$ as its perturbation: 

In [56]:
μ,Y=eigen(B)

Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
eigenvalues:
10-element Array{Complex{Float64},1}:
 0.3350615111920276 + 0.0im                
 0.3665619595167544 + 0.09694841124280963im
 0.3665619595167544 - 0.09694841124280963im
 0.4490312038888047 + 0.15686582456677023im
 0.4490312038888047 - 0.15686582456677023im
 0.5509687960268455 + 0.15686582462828694im
 0.5509687960268455 - 0.15686582462828694im
 0.6334380405156275 + 0.09694841134170258im
 0.6334380405156275 - 0.09694841134170258im
 0.6649384889119079 + 0.0im                
eigenvectors:
10×10 Array{Complex{Float64},2}:
   -0.986304+0.0im     0.986304+0.0im          …     0.986304+0.0im
    0.162679+0.0im     -0.13161+0.0956206im          0.162679+0.0im
  -0.0268321+0.0im   0.00829158-0.0255188im         0.0268321+0.0im
  0.00442565+0.0im    0.0013676+0.00420904im       0.00442565+0.0im
 -0.00072996+0.0im  -0.00059055-0.000429059im      0.00072996+0.0im
 0.000120398+0.0im  0.0001203

In [57]:
cond(Y)

1.1068834616372574e7

In [58]:
opnorm(inv(Y)*ΔJ*Y), cond(Y)*opnorm(ΔJ)

(0.16493848884660864, 0.1649384888466086)
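The size of the change observed in this example can be explained directly. For a Jordan block with eigenvalue $c$ and a single entry $\varepsilon$ placed in the lower left corner, expanding the determinant gives the characteristic polynomial $(z-c)^n-\varepsilon$, so all perturbed eigenvalues lie on a circle of radius $\varepsilon^{1/n}$ around $c$. The following sketch checks this numerically; it rebuilds the matrix as a dense array so that it is self-contained, and the variable names are assumptions.

```julia
using LinearAlgebra

n = 10
c = 0.5
ε = sqrt(eps())

# Jordan block with a single perturbed entry in the lower left corner
B = Matrix(Bidiagonal(fill(c, n), ones(n - 1), :U))
B[n, 1] = ε

μ = eigvals(B)
# Distances of the perturbed eigenvalues from c
radii = abs.(μ .- c)

# All distances should be close to ε^(1/n)
println((extrema(radii), ε^(1 / n)))
```

A perturbation of size about $1.5\cdot 10^{-8}$ therefore moves the eigenvalues by about $0.165$, which is the value seen in the cells above.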

### Example - Normal matrix

In [59]:
using SpecialMatrices

In [60]:
n=6
C=Circulant(rand(-5:5,n))

6×6 Circulant{Int64}:
 -2  -4  -1   3   2  -5
 -5  -2  -4  -1   3   2
  2  -5  -2  -4  -1   3
  3   2  -5  -2  -4  -1
 -1   3   2  -5  -2  -4
 -4  -1   3   2  -5  -2

In [62]:
λ=eigvals(Matrix(C))

6-element Array{Complex{Float64},1}:
   5.000000000000003 + 3.464101615137756im 
   5.000000000000003 - 3.464101615137756im 
   5.000000000000001 + 0.0im               
 -10.000000000000004 + 1.7320508075688794im
 -10.000000000000004 - 1.7320508075688794im
  -6.999999999999999 + 0.0im               

In [63]:
ΔC=rand(n,n)*0.0001

6×6 Array{Float64,2}:
 9.40524e-5  9.15958e-5  8.61254e-5  7.7636e-5   2.10888e-5  1.38132e-5
 6.85129e-5  8.49861e-5  5.72666e-5  5.32965e-5  5.96122e-5  8.09176e-5
 5.25792e-5  9.93138e-5  9.56191e-5  2.97205e-5  1.59796e-5  7.18894e-5
 8.22256e-5  4.05577e-6  8.24703e-5  9.25963e-5  9.97811e-5  4.92671e-5
 8.26403e-5  7.55158e-5  7.55995e-5  5.59783e-5  4.68669e-5  6.32703e-5
 7.23123e-5  2.38365e-5  3.6031e-5   3.2742e-5   9.56895e-5  6.89493e-5

In [64]:
@show opnorm(ΔC)
μ=eigvals(Matrix(C)+ΔC)

opnorm(ΔC) = 0.000386083249991022


6-element Array{Complex{Float64},1}:
 5.0000303421706445 + 3.4640949763690414im
 5.0000303421706445 - 3.4640949763690414im
  4.999971427164099 + 0.0im               
 -9.999965673450005 + 1.7320446974309889im
 -9.999965673450005 - 1.7320446974309889im
 -6.999617694558269 + 0.0im               
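The cells above display the perturbed eigenvalues without comparing them to a bound. Since a normal matrix has a unitary eigenvector matrix, $\kappa_2(X)=1$ and the Bauer-Fike theorem states that every eigenvalue of the perturbed matrix lies within $\|\Delta C\|_2$ of some eigenvalue of $C$. The sketch below checks this; it builds a circulant matrix by hand to avoid depending on an external package, and the entries and the deterministic perturbation are assumptions for illustration.

```julia
using LinearAlgebra

# Circulant matrix built by hand (hypothetical data); circulant matrices are normal
c = [-2.0, -5.0, 2.0, 3.0, -1.0, -4.0]
n = length(c)
C = [c[mod(i - j, n) + 1] for i in 1:n, j in 1:n]

# Small deterministic perturbation
ΔC = [1e-4 * sin(i + 2j) for i in 1:n, j in 1:n]

λ = eigvals(C)
μ = eigvals(C + ΔC)

# Bauer-Fike with κ₂(X) = 1: each μ_i is within ‖ΔC‖₂ of some λ_j
dist = maximum(minimum(abs.(μi .- λ)) for μi in μ)
println((norm(C * C' - C' * C), dist, opnorm(ΔC)))
```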

### Example - Hermitian matrix

In [66]:
m=10
n=6
A=rand(m,n)
# Some scaling
D=diagm(0=>(rand(n).-0.5)*exp(20))
A=A*D

10×6 Array{Float64,2}:
 4.92007e6  4.93416e7  -4.34766e7  -2.58029e7  -4.98121e7  1.38599e8
 1.08382e6  8.2599e7   -4.86728e7  -1.73277e7  -4.31622e7  5.56003e7
 1.50689e6  8.33009e7  -1.08457e7  -2.95154e7  -5.37346e7  1.18649e8
 7.81521e5  4.2381e7   -1.01873e6  -1.48321e7  -4.31134e7  2.00849e8
 7.2617e6   2.22914e7  -4.38723e7  -2.5239e7   -3.20463e7  9.0358e6 
 3.26314e6  1.23449e8  -4.19335e7  -2.44673e7  -1.37485e7  2.28415e7
 1.37869e6  2.58726e7  -9.47336e6  -2.56283e7  -1.50527e7  1.16457e7
 1.18847e6  1.78058e8  -4.18966e7  -5.3128e6   -2.93737e7  1.79559e8
 3.01637e5  1.32422e8  -1.22643e7  -3.52463e7  -3.47206e7  1.50775e8
 4.59356e6  5.25761e7  -5.65879e7  -2.50992e7  -6.48676e6  6.18467e7

In [68]:
using Statistics
A=cor(A)

6×6 Array{Float64,2}:
  1.0       -0.471846   -0.605618  -0.18623      0.194104    -0.487121
 -0.471846   1.0        -0.108464   0.29301      0.0210629    0.397379
 -0.605618  -0.108464    1.0       -0.218926    -0.315518     0.375967
 -0.18623    0.29301    -0.218926   1.0         -0.00609117   0.327275
  0.194104   0.0210629  -0.315518  -0.00609117   1.0         -0.534082
 -0.487121   0.397379    0.375967   0.327275    -0.534082     1.0     

In [69]:
ΔA=cor(rand(m,n)*D)*1e-5

6×6 Array{Float64,2}:
  1.0e-5       3.51637e-6   3.0613e-6   -1.96274e-6  -4.18259e-7   9.62218e-7
  3.51637e-6   1.0e-5       1.92438e-6  -1.19746e-6  -4.0048e-6   -5.46649e-6
  3.0613e-6    1.92438e-6   1.0e-5       3.24917e-6  -6.02937e-6  -1.46595e-6
 -1.96274e-6  -1.19746e-6   3.24917e-6   1.0e-5      -5.44071e-6   4.07396e-7
 -4.18259e-7  -4.0048e-6   -6.02937e-6  -5.44071e-6   1.0e-5       1.27849e-6
  9.62218e-7  -5.46649e-6  -1.46595e-6   4.07396e-7   1.27849e-6   1.0e-5    

In [70]:
B=A+ΔA

6×6 Array{Float64,2}:
  1.00001   -0.471843   -0.605615  -0.186232     0.194103    -0.48712 
 -0.471843   1.00001    -0.108462   0.293009     0.0210589    0.397373
 -0.605615  -0.108462    1.00001   -0.218923    -0.315524     0.375966
 -0.186232   0.293009   -0.218923   1.00001     -0.00609662   0.327275
  0.194103   0.0210589  -0.315524  -0.00609662   1.00001     -0.53408 
 -0.48712    0.397373    0.375966   0.327275    -0.53408      1.00001 

In [72]:
λ,U=eigen(A) 
μ=eigvals(B)
[λ μ]

6×2 Array{Float64,2}:
 0.13228   0.132298
 0.314272  0.314284
 0.643124  0.643134
 0.978018  0.978025
 1.4894    1.4894  
 2.44291   2.44292 

In [73]:
norm(ΔA)

3.0272392572268303e-5
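The comparison of the ordered eigenvalues with the size of the perturbation can be made explicit. The following sketch checks the spectral-norm and Frobenius-norm bounds for Hermitian matrices on a small deterministic example; the matrices are hand-picked assumptions and not the data of the cells above. Note that `norm` of a matrix is the Frobenius norm, while `opnorm` is the spectral norm.

```julia
using LinearAlgebra

# Illustrative symmetric matrix and symmetric perturbation (hypothetical data)
A = [2.0 -1.0 0.0; -1.0 2.0 -1.0; 0.0 -1.0 2.0]
ΔA = 1e-3 * [1.0 2.0 3.0; 2.0 -1.0 0.5; 3.0 0.5 2.0]

# eigvals of a Symmetric matrix are returned in increasing order
λ = eigvals(Symmetric(A))
μ = eigvals(Symmetric(A + ΔA))

println((maximum(abs, λ - μ), opnorm(ΔA)))   # spectral-norm bound
println((norm(λ - μ), norm(ΔA)))             # Frobenius-norm bound
```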

In [91]:
# Residual bounds - how close are (μ, y) to (λ[k], U[:,k])
k=3
μ=round(λ[k],sigdigits=2)
y=round.(U[:,k],sigdigits=2)
y=y/norm(y)

6-element Array{Float64,1}:
  0.15083003916676666
  0.5832094847781644 
 -0.31171541427798444
 -0.6737075082782245 
 -0.2815494064446311 
  0.08245375474449912

In [77]:
μ

0.64

In [78]:
# Fact 9
r=A*y-μ*y

6-element Array{Float64,1}:
 -0.0014561274878872321
  0.0020289330057904897
  0.0005058008388426904
 -0.002795069187298993 
 -0.001378441215101972 
  0.0006539776637512901

In [80]:
minimum(abs,μ.-λ), norm(r)

(0.0031241344974130003, 0.0040783464322573714)

In [81]:
# Fact 10 - μ is Rayleigh quotient
μ=y⋅(A*y)
r=A*y-μ*y

6-element Array{Float64,1}:
 -0.0019283884983804789
  0.0002028570985499467
  0.001481806927195406 
 -0.0006856366737625352
 -0.0004968873288479225
  0.0003958083113483196

In [82]:
η=min(abs(μ-λ[k-1]),abs(μ-λ[k+1]))

0.3288590457130568

In [83]:
μ-λ[k], norm(r)^2/η

(6.94610895801695e-6, 2.076647714860639e-5)

In [84]:
# Eigenvector bound
# cos(θ)
cosθ=dot(y,U[:,k])
# sin(θ)
sinθ=sqrt(1-cosθ^2)
sinθ,norm(r)/η

(0.004173392282300726, 0.007946511535176204)
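The residual bounds checked in the preceding cells can be collected into one self-contained sketch. It uses a small symmetric matrix and an approximate eigenvector obtained by disturbing an exact one; the data, the size of the disturbance and the variable names are assumptions for illustration. The Rayleigh quotient `ρ` plays the role of $\tilde\lambda$ and `xt` the role of $\tilde x$.

```julia
using LinearAlgebra

A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
λ, U = eigen(Symmetric(A))

# Approximate eigenvector: an exact one plus a small disturbance (assumed data)
xt = normalize(U[:, 1] + 0.01 * [1.0, 1.0, 1.0])
ρ = dot(xt, A * xt)            # Rayleigh quotient
r = A * xt - ρ * xt            # residual

i = argmin(abs.(λ .- ρ))       # index of the eigenvalue closest to ρ
η = minimum(abs(ρ - λ[j]) for j in eachindex(λ) if j != i)   # gap
sinθ = sqrt(max(0.0, 1 - dot(U[:, i], xt)^2))

println((abs(ρ - λ[i]), norm(r)^2 / η))   # eigenvalue error and its bound
println((sinθ, norm(r) / η))              # eigenvector error and its bound
```

The eigenvalue error is bounded by a quantity that is quadratic in the residual, which is why the Rayleigh quotient is typically much more accurate than the approximate eigenvector it was computed from.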

In [106]:
# Residual bounds for X with orthonormal columns and M=X'*A*X
U=eigvecs(A)
Q=round.(U[:,1:3],sigdigits=2)
# Orthogonalize
F=qr(Q)
X=Matrix(F.Q)
M=X'*A*X
R=A*X-X*M
μ=eigvals(M)
R

6×3 Array{Float64,2}:
 -0.00575363  -0.00477589  -0.00156264 
  0.00468678   0.0030594    0.000332569
  0.00309373   0.00346027   0.00176162 
  0.00269254   0.00201591  -0.000669568
 -0.00181593  -0.0033898   -0.000311431
  0.00486964   0.00507475   0.000631649

In [107]:
λ

6-element Array{Float64,1}:
 0.13227975264395728
 0.31427203489331423
 0.643124134497413  
 0.9780177630257902 
 1.4893973044504754 
 2.442909010489049  

In [108]:
μ

3-element Array{Float64,1}:
 0.13232533764909526
 0.6431319203119484 
 0.31431246314009   

In [109]:
# The entries of μ are not ordered - which algorithm was called?
issymmetric(M)

false

In [110]:
M=Hermitian(M)
R=A*X-X*M
μ=eigvals(M)

3-element Array{Float64,1}:
 0.13232533764909518
 0.31431246314009   
 0.6431319203119483 

In [111]:
η=λ[4]-λ[3]

0.33489362852837723

In [112]:
norm(λ[1:3]-μ), norm(R)^2/η

In [113]:
# Fact 13
M=A[1:3,1:3]
H=A[4:6,4:6]
E=A[4:6,1:3]
# Block-diagonal matrix
B=cat(M,H,dims=(1,2))

In [68]:
η=minimum(abs,eigvals(M).-eigvals(H)')
μ=eigvals(B)
[λ μ]

6×2 Array{Float64,2}:
 0.156967  0.322199
 0.564736  0.698823
 0.697134  0.744951
 0.76263   0.854027
 1.25555   1.55623 
 2.56298   1.82377 

In [69]:
# Bound on maximum(abs,λ-μ); the fact uses the spectral norm of E
2*opnorm(E)^2/(η+sqrt(η^2+4*opnorm(E)^2))

In [70]:
# Eigenspace bounds - Fact 14
B=A+ΔA
μ,V=eigen(B)

([0.156978, 0.564741, 0.697143, 0.762641, 1.25556, 2.56299], [-0.0482105 -0.306212 … 0.540217 -0.286526; 0.475767 -0.521781 … 0.0265189 -0.494506; … ; 0.386804 0.00561496 … -0.674433 -0.256014; 0.764722 0.118785 … 0.279685 0.555999])

In [71]:
# sin(Θ(U[:,1:3],V[:,1:3]))
X=U[:,1:3]
Q=V[:,1:3]
cosθ=svdvals(inv(sqrt(Q'*Q))*Q'*X*inv(sqrt(X'*X)))
sinθ=sqrt.(1 .- cosθ.^2)

3-element Array{Float64,1}:
 7.60543e-7
 4.28882e-6
 7.03147e-5

In [72]:
# Bound
M=Q'*A*Q

3×3 Array{Float64,2}:
  0.156967    -2.16968e-6  -2.56127e-6
 -2.16968e-6   0.564736     1.12995e-6
 -2.56127e-6   1.12995e-6   0.697134  

In [73]:
R=A*Q-Q*M

6×3 Array{Float64,2}:
  1.7645e-6    3.2951e-7    1.0646e-6 
 -1.7178e-6    1.20688e-7   1.61113e-6
  5.73265e-6   2.1145e-6   -2.88202e-6
  2.51765e-6   3.3477e-7   -2.48137e-6
 -1.45408e-6   9.07386e-7  -1.91377e-6
  1.49139e-6  -7.85053e-7   9.79858e-9

In [74]:
eigvals(M), λ

([0.156967, 0.697134, 0.564736], [0.156967, 0.564736, 0.697134, 0.76263, 1.25555, 2.56298])

In [75]:
# η is the smallest gap between the eigenvalues of M and the remaining eigenvalues of A
η=minimum(abs,eigvals(M).-λ[4:6]')
norm(sinθ), norm(R)/η