In [1]:
using Plots,LaTeXStrings
default(markersize=3,linewidth=1.5)
using LightGraphs,GraphPlot
using Images,TestImages
using DataFrames,JLD
using LinearAlgebra
#include("FNC.jl");



# Example 7.1.4

Here is a "small-world" network on 200 nodes. Each node is first connected to its 4 nearest neighbors, and then some edges are randomly rewired to distant nodes.

In [None]:
g = watts_strogatz(200,4,0.06)
gplot(g)

In [None]:
g

In [None]:
g.fadjlist

In [None]:
g.ne

In [None]:
vertices(g)

In [None]:
edges(g)

In [None]:
collect(edges(g))

The adjacency matrix for this graph reveals the connections as mostly local (i.e., the nonzeros are near the diagonal).

In [None]:
A = adjacency_matrix(g,Float64)

In [None]:
Matrix(A)

In [None]:
spy(A,m=1,color=:black,title="Adjacency matrix",leg=:none,size=(400,400))
xlims!(-10,210); ylims!(-10,210)

In [None]:
sum(A,dims=2)
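The row sums above count the neighbors (the degree) of each node. As a minimal sketch that avoids the graph packages, the ring lattice that a small-world network starts from, before any random rewiring, can be assembled by hand. The names `ring` and `degrees` are made up for this illustration, and this is not a claim about how `watts_strogatz` builds its graph internally.

```julia
using SparseArrays, LinearAlgebra

n = 200
ring = spzeros(n, n)
for i in 1:n, d in 1:2            # join node i to the next two nodes around the ring
    j = mod1(i + d, n)            # wrap around past node n
    ring[i, j] = 1.0
    ring[j, i] = 1.0              # undirected graph: keep the matrix symmetric
end
degrees = vec(sum(ring, dims=2))  # row sums count each node's neighbors
@show issymmetric(ring)
@show extrema(degrees)
```

Without rewiring, every nonzero lies within two positions of the diagonal (apart from the wrap-around corners), which is the mostly local structure seen in the spy plot.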

# Example 7.1.5

We will use the `Images` package for working with images. We also load here the `TestImages` package for a large library of well-known standard images.

In [None]:
img = testimage("peppers")

The details vary by image type, but for the most part an image is an array of color values.

In [None]:
size(img),eltype(img)

In [None]:
2^8

The elements here have four values, for red, green, blue, and alpha (opacity). We can convert each of those "planes" into an ordinary matrix.

In [None]:
R = red.(img)

In [None]:
R[1:5,1:5]

The values above go from zero (no red) to one (full red). It may also be convenient to convert the image to grayscale, which has just one "layer" from zero (black) to one (white). 

In [None]:
G = Gray.(img)

In [None]:
A = @. gray(Gray(img))

In [None]:
A[1:5,1:5]

Finally, we can save an image locally for reloading later.

In [None]:
save("peppers.png",Gray.(img))

In [None]:
load("peppers.png")

# Example 7.2.1

The `eigvals` command will return just the eigenvalues, as a vector. 

In [None]:
A = pi*ones(2,2)

In [None]:
lambda = eigvals(A)

If we also want the eigenvectors (returned as the matrix $V$), we use `eigen`.

In [None]:
lambda,V = eigen(A)

In [None]:
lambda

In [None]:
V

We can check the fact that this is an EVD.

In [None]:
D = diagm(0=>lambda)

In [None]:
D = diagm(lambda)

In [None]:
D = Diagonal(lambda)

In [None]:
opnorm( A - V*D/V )      # "/V" is like "*inv(V)"

Even if the matrix is not diagonalizable, `eigen` will run successfully, but the computed matrix $V$ will be singular to working precision and therefore not usable as an invertible eigenvector matrix.

In [None]:
lambda,V = eigen([1 1;0 1])

In [None]:
rank(V)
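One way to see the same breakdown, sketched here under the assumption that only `LinearAlgebra` is loaded, is to look at the condition number of the computed eigenvector matrix alongside its rank. The variable name `J` is chosen for this illustration.

```julia
using LinearAlgebra

J = [1.0 1.0; 0.0 1.0]     # a Jordan block: only one independent eigenvector
lambda, V = eigen(J)
@show lambda               # the repeated eigenvalue 1
@show rank(V)              # numerical rank of the computed eigenvector matrix
@show cond(V)              # enormous (or Inf): V is singular to working precision
```

A huge `cond(V)` means that `V*D/V` cannot be trusted to reproduce the original matrix, even though `eigen` itself raised no error.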

# Example 7.2.2

We will confirm the Bauer-Fike theorem on a triangular matrix. These tend to be far from normal. 

In [4]:
n = 15
lambda = 1:n
A = triu( ones(n)*lambda' )

15×15 Array{Float64,2}:
 1.0  2.0  3.0  4.0  5.0  6.0  7.0  …  10.0  11.0  12.0  13.0  14.0  15.0
 0.0  2.0  3.0  4.0  5.0  6.0  7.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  3.0  4.0  5.0  6.0  7.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  4.0  5.0  6.0  7.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  5.0  6.0  7.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  6.0  7.0  …  10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  7.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0     10.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  …   0.0  11.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0   0.0  12.0  13.0  14.0  15.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0   0.0   0.0  13.0  14.0  15.0
 0.0  0.0  0.0

The Bauer-Fike theorem provides an upper bound on the condition number of these eigenvalues.

In [5]:
lambda,V = eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
15-element Array{Float64,1}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0
 11.0
 12.0
 13.0
 14.0
 15.0
vectors:
15×15 Array{Float64,2}:
 1.0  0.894427  0.818182  0.764291   0.723814   …  0.55868      0.548957
 0.0  0.447214  0.545455  0.573219   0.579051      0.518774     0.51236
 0.0  0.0       0.181818  0.286609   0.347431      0.444663     0.444045
 0.0  0.0       0.0       0.0716523  0.138972      0.349378     0.355236
 0.0  0.0       0.0       0.0        0.0277944     0.249556     0.260506
 0.0  0.0       0.0       0.0        0.0        …  0.160429     0.173671
 0.0  0.0       0.0       0.0        0.0           0.0916736    0.104203
 0.0  0.0       0.0       0.0        0.0           0.0458368    0.0555747
 0.0  0.0       0.0       0.0        0.0           0.0196443    0.0259349
 0.0  0.0       0.0       0.0        0.0           0.00701584   0.0103739
 0.0  0.0       0.0       0.0        0.0        …  

In [6]:
cond(V)

7.197767264538044e7

The theorem suggests that eigenvalue changes may be up to 7 orders of magnitude larger than a perturbation to the matrix. A few random experiments show that effects of nearly that size are not hard to observe.

In [7]:
for k = 1:3
    E = randn(n,n);  E = 1e-7*E/opnorm(E)
    mu = eigvals(A+E)
    @show max_change = norm( sort(mu)-lambda, Inf )
end

max_change = norm(sort(mu) - lambda, Inf) = 0.18143431860797676
max_change = norm(sort(mu) - lambda, Inf) = 0.07242703210584445
max_change = norm(sort(mu) - lambda, Inf) = 0.22013074346389772
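The experiment can also be compared directly against the Bauer-Fike bound $\kappa(V)\,\|E\|_2$. The following is a sketch, not part of the original example: it measures the distance from each perturbed eigenvalue to the nearest original one, which is the quantity the theorem bounds and which also works if `eigvals` returns complex values. The seed and the name `dist` are arbitrary choices.

```julia
using LinearAlgebra, Random

Random.seed!(0)                       # arbitrary seed, only for repeatability
n = 15
lambda = collect(1.0:n)
A = triu(ones(n) * lambda')
V = eigen(A).vectors
E = randn(n, n);  E = 1e-7 * E / opnorm(E)
mu = eigvals(A + E)                   # may come back complex for a nonsymmetric matrix
# distance from each perturbed eigenvalue to the nearest original one
dist = [minimum(abs.(m .- lambda)) for m in mu]
bound = cond(V) * opnorm(E)
@show maximum(dist)
@show bound
```

The observed change should fall below the bound, though it need not come close to it for any particular random perturbation.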


In [9]:
E = randn(n,n);  E = 1e-7*E/opnorm(E)
mu = eigvals(A+E)

15-element Array{Float64,1}:
  0.9999999477785514
  2.00000009704959
  2.9999991383777016
  4.000004744854077
  4.999958818573337
  6.000436758810212
  6.997136390045077
  8.01251775242734
  8.964171489808939
 10.085424790249233
 10.876617644386236
 12.135354549245463
 12.898638853903114
 14.03674057002064
 14.992998399159818

In [10]:
R = A*V - V*Diagonal(lambda)

15×15 Array{Float64,2}:
 0.0  0.0  0.0  4.44089e-16   0.0          …  -1.77636e-15   1.77636e-15
 0.0  0.0  0.0  0.0           0.0             -1.77636e-15   0.0
 0.0  0.0  0.0  0.0          -4.44089e-16     -8.88178e-16   8.88178e-16
 0.0  0.0  0.0  0.0           0.0              0.0           8.88178e-16
 0.0  0.0  0.0  0.0           0.0              4.44089e-16   0.0
 0.0  0.0  0.0  0.0           0.0          …   4.44089e-16  -4.44089e-16
 0.0  0.0  0.0  0.0           0.0              2.22045e-16  -2.22045e-16
 0.0  0.0  0.0  0.0           0.0              1.11022e-16   0.0
 0.0  0.0  0.0  0.0           0.0              5.55112e-17   0.0
 0.0  0.0  0.0  0.0           0.0              0.0           0.0
 0.0  0.0  0.0  0.0           0.0          …   0.0           6.93889e-18
 0.0  0.0  0.0  0.0           0.0             -8.67362e-19  -1.73472e-18
 0.0  0.0  0.0  0.0           0.0              1.0842e-19   -4.33681e-19
 0.0  0.0  0.0  0.0           0.0              0.0           0.0
 0

In [16]:
[cond(V)*norm(R[:,i])/norm(V[:,i]) for i=1:n]

15-element Array{Float64,1}:
 0.0
 0.0
 0.0
 3.1964507771933465e-8
 3.1964507771933465e-8
 7.147481224536357e-8
 4.7946761657900204e-8
 3.221326255685248e-8
 7.194788410504182e-8
 9.181105868081176e-8
 1.0266853801296425e-7
 9.754956948861239e-8
 1.0117938930778933e-7
 1.978913585908047e-7
 1.6062049177275957e-7

In [18]:
lambda - collect(1:n)

15-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

# Example 7.2.3

Let's start with a known set of eigenvalues and an orthogonal eigenvector basis.

In [52]:
D = diagm([-6,-1,2,4,5])

5×5 Array{Int64,2}:
 -6   0  0  0  0
  0  -1  0  0  0
  0   0  2  0  0
  0   0  0  4  0
  0   0  0  0  5

In [53]:
V,_ = qr(randn(5,5))  # compute a random orthogonal matrix V
A = Symmetric(V*D*V')    # note that V' = inv(V)

5×5 Symmetric{Float64,Array{Float64,2}}:
  0.574502   1.34308    0.304273  -0.302506   1.31849
  1.34308    2.87859   -1.26239    0.167233  -1.82041
  0.304273  -1.26239    2.73478   -2.45381   -3.24429
 -0.302506   0.167233  -2.45381   -1.70459   -2.48482
  1.31849   -1.82041   -3.24429   -2.48482   -0.483273

In [54]:
issymmetric(A)

true

In [55]:
eigvals(A)

5-element Array{Float64,1}:
 -6.0000000000000036
 -0.9999999999999972
  1.9999999999999987
  4.000000000000002
  5.0

Now we will take the QR factorization and just reverse the factors.

In [56]:
Q,R = qr(A)
A = Symmetric(R*Q)

5×5 Symmetric{Float64,Array{Float64,2}}:
  0.280212   2.57334   0.475992  -1.95816    1.7479
  2.57334    2.02614  -1.51132    2.07886   -1.70359
  0.475992  -1.51132   3.06945    2.90886   -1.01507
 -1.95816    2.07886   2.90886   -1.64607   -0.260941
  1.7479    -1.70359  -1.01507   -0.260941   0.270265

It turns out that this is a similarity transformation, so the eigenvalues are unchanged.

In [57]:
eigvals(A)

5-element Array{Float64,1}:
 -5.999999999999998
 -0.9999999999999993
  1.9999999999999996
  3.999999999999999
  4.999999999999997

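What's remarkable is that if we repeat the transformation many times, the process converges to a diagonal matrix whose entries are the eigenvalues in $D$ (here they appear in order of decreasing magnitude rather than in their original order).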

In [58]:
for k = 1:40
    Q,R = qr(A)
    A = Symmetric(R*Q)
end
A

5×5 Symmetric{Float64,Array{Float64,2}}:
 -6.0          -0.00486335    6.28528e-7   -1.0123e-18   -3.96294e-31
 -0.00486335    5.0          -0.00023094   -1.26511e-16  -2.71143e-29
  6.28528e-7   -0.00023094    4.0          -1.9083e-12    8.02741e-26
 -1.0123e-18   -1.26511e-16  -1.9083e-12    2.0           2.1603e-12
 -3.96294e-31  -2.71143e-29   8.02741e-26   2.1603e-12   -1.0
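The repeated step can be packaged as a small function. This is a minimal sketch of unshifted QR iteration for a symmetric input, not a practical eigenvalue solver; the function name `qr_iterate`, the iteration count, and the explicit re-symmetrization are choices made for this illustration.

```julia
using LinearAlgebra

# Unshifted QR iteration; `qr_iterate` is a name chosen for this sketch.
function qr_iterate(A, iters)
    T = Matrix{Float64}(A)
    for _ in 1:iters
        Q, R = qr(T)
        T = R * Matrix(Q)        # R*Q = Q'*(Q*R)*Q = Q'*T*Q, a similarity transformation
        T = (T + T') / 2         # remove roundoff asymmetry (input assumed symmetric)
    end
    return T
end

D = diagm(0 => [-6.0, -1, 2, 4, 5])
V = Matrix(qr(randn(5, 5)).Q)    # random orthogonal eigenvector matrix
A = V * D * V'
T = qr_iterate(A, 200)
@show sort(diag(T))
@show norm(T - Diagonal(T))      # size of the remaining off-diagonal part
```

The off-diagonal entries shrink at rates governed by ratios of neighboring eigenvalue magnitudes, which is why the entry coupling $-6$ and $5$ is the slowest to vanish in the output above.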

# Example 7.3.2

We verify some of the fundamental SVD properties using the built-in Julia command `svd`.

In [None]:
A = [i^j for i=1:5, j=0:3]

In [None]:
U,sigma,V = svd(A);

Note that while the "full" SVD has a square $U$, the "thin" form is the default. Here the columns are orthonormal even though $U$ is not square.

In [None]:
@show size(U),opnorm(U'*U - I);

In [None]:
@show size(V),opnorm(V'*V - I);

In [None]:
sigma

In [None]:
@show opnorm(A),sigma[1];

In [None]:
@show cond(A), sigma[1]/sigma[end];
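For comparison, the full form can be requested with the `full=true` keyword of `svd`. The sketch below uses a floating-point copy of the same matrix; the names `F_thin` and `F_full` are chosen for this illustration.

```julia
using LinearAlgebra

A = [Float64(i)^j for i=1:5, j=0:3]
F_thin = svd(A)                 # default thin form: U is 5×4
F_full = svd(A, full=true)      # full form: U is 5×5
@show size(F_thin.U), size(F_full.U)
@show opnorm(F_full.U' * F_full.U - I)        # full U is orthogonal
@show opnorm(A) ≈ F_thin.S[1]                 # 2-norm equals the largest singular value
@show cond(A) ≈ F_thin.S[1] / F_thin.S[end]   # 2-norm condition number
```

Both forms return the same singular values; the full form only appends extra orthonormal columns to $U$.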

# Example 7.4.1

The following matrix is not Hermitian.

In [None]:
A = [0 2; -2 0]

It has an eigenvalue decomposition with a unitary matrix of eigenvectors, though, so it is normal. 

In [None]:
lambda,V = eigen(A)
opnorm( V'*V - I )

The eigenvalues are purely imaginary.

In [None]:
lambda

The singular values are the complex magnitudes of the eigenvalues.

In [None]:
svdvals(A)
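The agreement between singular values and eigenvalue magnitudes depends on normality. As a sketch, the check below repeats the comparison and contrasts it with a matrix `B` that is not normal; `B` is an example introduced here, not taken from the text above.

```julia
using LinearAlgebra

A = [0 2; -2 0]
@show A * A' == A' * A                 # normality: A commutes with its adjoint
@show sort(abs.(eigvals(A)))           # magnitudes of the eigenvalues ±2i
@show sort(svdvals(A))                 # singular values

B = [1 1; 0 1]                         # not normal: both eigenvalues equal 1
@show B * B' == B' * B
@show abs.(eigvals(B))
@show svdvals(B)                       # approximately 1.618 and 0.618
```

For `B`, the singular values differ from the eigenvalue magnitudes, so the property shown for `A` should not be expected of general matrices.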

# Example 7.4.2

We construct a real symmetric matrix with known eigenvalues by using the QR factorization to produce a random orthogonal set of eigenvectors. 

In [None]:
n = 30;
lambda = 1:n 

D = diagm(0=>lambda)
V,R = qr(randn(n,n))   # get a random orthogonal V
A = V*D*V';

Because $V$ is orthogonal, the condition number of these eigenvalues is one. By the Bauer-Fike theorem, the change in each eigenvalue is therefore bounded by the norm of the perturbation to $A$.

In [None]:
for k = 1:3
    E = randn(n,n); E = 1e-4*E/opnorm(E);
    mu = sort(eigvals(A+E))
    @show max_change = norm(mu-lambda,Inf)
end
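A variant of this experiment, sketched here under the extra assumption that the perturbation is itself symmetric, lets the eigenvalues be compared one by one in sorted order. Wrapping the sum in `Symmetric` guarantees real eigenvalues returned in ascending order.

```julia
using LinearAlgebra

n = 30
lambda = collect(1.0:n)
V = Matrix(qr(randn(n, n)).Q)          # random orthogonal eigenvectors
A = V * Diagonal(lambda) * V'
E = randn(n, n);  E = (E + E') / 2     # symmetric perturbation (an extra assumption here)
E = 1e-4 * E / opnorm(E)
mu = eigvals(Symmetric(A + E))         # real and sorted in ascending order
@show maximum(abs.(mu - lambda))       # should not exceed opnorm(E) = 1e-4, up to roundoff
```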

# Example 7.4.3

We construct a symmetric matrix with a known EVD.

In [None]:
n = 20;
lambda = 1:n 

D = diagm(0=>lambda)
V,R = qr(randn(n,n))   # get a random orthogonal V
A = V*D*V';

The Rayleigh quotient of an eigenvector is its eigenvalue.

In [None]:
R = x -> (x'*A*x)/(x'*x);
R(V[:,7])

The Rayleigh quotient's value is much closer to an eigenvalue than its input is to an eigenvector. In this experiment, each additional digit of accuracy in the eigenvector estimate gives two more digits to the eigenvalue estimate.

In [None]:
delta = @. 1 ./10^(1:4)
quotient = zeros(size(delta))
for (k,d) = enumerate(delta)          # d is the size of the k-th perturbation
    e = randn(n);  e = d*e/norm(e);
    x = V[:,7] + e
    quotient[k] = R(x)
end
DataFrame(perturbation=delta,RQminus7=quotient.-7)
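The two-digits-for-one behavior corresponds to an error proportional to the square of the perturbation size. The sketch below checks this without `DataFrames` by printing the ratio of the error to $\delta^2$; the function name `rq` and the explicit symmetrization are choices made for this illustration.

```julia
using LinearAlgebra

n = 20
lambda = collect(1.0:n)
V = Matrix(qr(randn(n, n)).Q)
A = V * Diagonal(lambda) * V'
A = (A + A') / 2                        # enforce exact symmetry
rq(x) = dot(x, A * x) / dot(x, x)       # Rayleigh quotient; `rq` is a name chosen here

for delta in 10.0 .^ (-1:-1:-4)
    e = randn(n);  e = delta * e / norm(e)
    err = abs(rq(V[:, 7] + e) - 7)
    @show delta, err, err / delta^2     # the last value stays bounded as delta shrinks
end
```

For this matrix the ratio cannot exceed roughly $\|A - 7I\|_2/(1-\delta)^2$, which is about 16 when $\delta = 0.1$.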

# Example 7.5.1

We make an image from some text, then reload it as a matrix.

In [None]:
plot([],[],leg=:none,annotations=(0.5,0.5,text("Hello world",44,:center,:middle)),
    grid=:none,frame=:none)

In [None]:
savefig("hello.png")
img = load("hello.png")
A = @. Float64(Gray(img));
@show m,n = size(A);

Next we show that the singular values decrease exponentially, until they reach zero (more precisely, are about $\sigma_1 \varepsilon_\text{mach}$). For all numerical purposes, this determines the rank of the matrix.

In [None]:
U,sigma,V = svd(A)
scatter(sigma,
    title="Singular values",xaxis=(L"i"), yaxis=(:log10,L"\sigma_i"),leg=:none )

In [None]:
r = findlast(@.sigma/sigma[1] > 10*eps())

The rapid decrease suggests that we can get fairly good low-rank approximations. 

In [None]:
Ak = [ U[:,1:k]*diagm(0=>sigma[1:k])*V[:,1:k]' for k=2*(1:4) ]
reshape( [ @.Gray(Ak[i]) for i=1:4 ],2,2)

Consider how little data is needed to reconstruct these images. For rank 8, for instance, we have 8 left and right singular vectors plus 8 singular values, for a compression ratio of better than 25:1.  

In [None]:
compression = 8*(m+n+1) / (m*n)
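The quality of a truncated SVD can be stated precisely: in the 2-norm, the error of the rank-$k$ truncation equals the first discarded singular value. The sketch below checks this on a random matrix standing in for image data (an assumption made so the block runs without any image file) and repeats the storage count used above.

```julia
using LinearAlgebra

m, n, k = 60, 40, 8
A = randn(m, n)                          # stand-in for image data (an assumption)
U, sigma, V = svd(A)
Ak = U[:, 1:k] * Diagonal(sigma[1:k]) * V[:, 1:k]'
@show opnorm(A - Ak), sigma[k+1]         # equal up to roundoff
stored = k * (m + n + 1)                 # k columns each of U and V, plus k singular values
@show stored / (m * n)                   # fraction of the original storage
```

A random matrix has slowly decaying singular values, so unlike the text image it compresses poorly; only the identity between the error and `sigma[k+1]` carries over.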

# Example 7.5.2

This matrix describes the votes on bills in the 111th session of the United States Senate. (The data set was obtained from voteview.com.) Each row is one senator and each column is a vote item.

In [None]:
vars = load("voting.jld")
A = vars["A"]
m,n = size(A)

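If we visualize the votes (with the `:viridis` colormap used here, the largest values appear yellow and the smallest dark purple, so yellow is "yea," dark purple is "nay," and intermediate colors are anything else), we can see great similarity between many rows, reflecting party unity.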

In [None]:
heatmap(A,color=:viridis,
    title="Votes in 111th U.S. Senate",xlabel="bill",ylabel="senator")

We use singular value "energy" to quantify the decay rate of the values. 

In [None]:
U,sigma,V = svd(A)
tau = cumsum(sigma.^2) / sum(sigma.^2)
scatter(tau[1:16],label="",
    xaxis=("k"), yaxis=(L"\tau_k"), title="Fraction of singular value energy")
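The energy fractions computed above rest on the identity $\sum_i \sigma_i^2 = \|A\|_F^2$, so $\tau_k$ is the share of the squared Frobenius norm captured by the first $k$ singular triples. A minimal sketch on synthetic data (not the voting matrix) illustrates the computation:

```julia
using LinearAlgebra

A = randn(8, 5) * Diagonal([10, 3, 1, 0.1, 0.01])   # synthetic data, not the voting matrix
sigma = svdvals(A)
tau = cumsum(sigma.^2) / sum(sigma.^2)
@show tau
@show sum(sigma.^2) ≈ norm(A)^2      # norm(A) is the Frobenius norm for a matrix
```

By construction `tau` is nondecreasing and ends at 1, so how quickly it approaches 1 indicates how many singular triples matter.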

The first and second singular triples contain about 58% and 17% respectively of the energy of the matrix. All others have far less effect, suggesting that the information is primarily two-dimensional. The first left and right singular vectors also contain interesting structure.

In [None]:
scatter( U[:,1],label="",layout=(1,2),
    xlabel="senator" ,title="left singular vector")
scatter!( V[:,1],label="",subplot=2,
    xlabel="bill",title="right singular vector")

Both vectors have values strongly clustered near $\pm C$ for a constant $C$. These can be roughly interpreted as how partisan a particular senator or bill was, and for which political party. Projecting the senators' vectors into the first two $V$-coordinates gives a particularly nice way to reduce them to two dimensions. Political scientists label these dimensions "partisanship" and "bipartisanship." Here we color them by actual party affiliation (also given in the data file): red for Republican, blue for Democrat, and yellow for independent.

In [None]:
x1 = A*V[:,1];   x2 = A*V[:,2];

Rep = vec(vars["Rep"]); Dem = vec(vars["Dem"]);  Ind = vec(vars["Ind"]);
scatter(x1[Dem],x2[Dem],color=:blue,label="D",
    xaxis=("partisanship"),yaxis=("bipartisanship"),title="111th US Senate in 2D" )
scatter!(x1[Rep],x2[Rep],color=:red,label="R")
scatter!(x1[Ind],x2[Ind],color=:yellow,label="I")