# Rayleigh quotient

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg as LA

## Main idea

Suppose a symmetric matrix $A$ has eigenvalues  
$$\lambda_0\leq \cdots \leq\lambda_{n-1}.$$
Then  
$$\max_{{\bf x}\neq {\bf 0}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \max_{\|{\bf x}\| = 1} {\bf x}^\top A{\bf x} = \lambda_{n-1}$$
and 
$$\min_{{\bf x}\neq {\bf 0}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \min_{\|{\bf x}\| = 1} {\bf x}^\top A{\bf x} = \lambda_{0}.$$
The vector ${\bf x}$ that achieves the maximum or the minimum is an eigenvector.
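
A quick numerical sanity check of this statement is sketched below. The random symmetric test matrix, the seed, and the helper `rayleigh` are assumptions made for illustration and are not part of the exercises. The Rayleigh quotient of every sampled vector should fall between the smallest and the largest eigenvalue, and the corresponding eigenvectors should attain the two bounds.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# a random symmetric matrix (hypothetical test data)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2

vals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order

def rayleigh(A, x):
    """Rayleigh quotient x^T A x / x^T x of a nonzero vector x."""
    return x.dot(A.dot(x)) / x.dot(x)

# Rayleigh quotients of 1000 random vectors
xs = rng.standard_normal((1000, 4))
quotients = np.array([rayleigh(A, x) for x in xs])

print(vals[0], "<=", quotients.min(), "<=", quotients.max(), "<=", vals[-1])
print(rayleigh(A, vecs[:, 0]), rayleigh(A, vecs[:, -1]))  # attains both bounds
```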

Moreover, if ${\bf u}_0,\ldots,{\bf u}_{n-1}$ form an orthonormal eigenbasis with respect to the eigenvalues $\lambda_0,\ldots,\lambda_{n-1}$, then  
$$\min_{\substack{{\bf x}\neq {\bf 0}\\{\bf x}\perp L_k}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \min_{\substack{\|{\bf x}\| = 1\\{\bf x}\perp L_k}} {\bf x}^\top A{\bf x} = \lambda_{k+1}$$
and 
$$\max_{\substack{{\bf x}\neq {\bf 0}\\{\bf x}\perp U_k}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \max_{\substack{\|{\bf x}\| = 1\\{\bf x}\perp U_k}} {\bf x}^\top A{\bf x} = \lambda_{k-1},$$
where $L_k = \operatorname{span}(\{{\bf u}_0,\ldots,{\bf u}_k\})$ for $0\leq k\leq n-2$ and $U_k = \operatorname{span}(\{{\bf u}_k,\ldots,{\bf u}_{n-1}\})$ for $1\leq k\leq n-1$.
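
The constrained version can be checked numerically as well. A vector perpendicular to $L_k$ is a combination of ${\bf u}_{k+1},\ldots,{\bf u}_{n-1}$ only, so its Rayleigh quotient should lie between $\lambda_{k+1}$ and $\lambda_{n-1}$. The sketch below assumes a random symmetric test matrix and the choice $k = 1$; both are illustrative and not taken from the exercises.

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen only for reproducibility

M = rng.standard_normal((5, 5))
A = (M + M.T) / 2               # hypothetical symmetric test matrix
vals, vecs = np.linalg.eigh(A)  # columns of vecs are orthonormal eigenvectors

k = 1
rest = vecs[:, k + 1:]          # orthonormal basis of the complement of L_k

# random unit vectors perpendicular to u_0, ..., u_k
coeffs = rng.standard_normal((rest.shape[1], 2000))
xs = rest.dot(coeffs)
xs = xs / np.linalg.norm(xs, axis=0)

quotients = np.sum(xs * A.dot(xs), axis=0)
print(quotients.min(), ">=", vals[k + 1])
print(quotients.max(), "<=", vals[-1])
```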

## Side stories

- Covariance matrix
- Laplacian matrix and its Rayleigh quotient

## Experiments

###### Exercise 1
Let  
```python
A = np.ones((3,3))
vs = np.random.randn(3,100)
vs = vs / np.linalg.norm(vs, axis=0)
```

###### 1(a)
Generate an array of ${\bf x}^\top A{\bf x}$, where ${\bf x}$ runs through the columns of `vs` .  
Find the minimum and the maximum.  
Compare them to the smallest and the largest eigenvalues of $A$.

In [None]:
### your answer here
A = np.ones((3,3))
vs = np.random.randn(3,1000)
vs = vs / np.linalg.norm(vs, axis=0)

a=np.sum(vs*(A.dot(vs)),axis=0)
print(min(a),max(a))
print(np.linalg.eigh(A)[0])

###### 1(b)
It is known that 
```python
u2 = np.array([1,1,1])
```
is an eigenvector for the largest eigenvalue $\lambda_2 = 3$.  
Generate 10000 random points of length 1 in $\mathbb{R}^3$.  
Select those that are (almost) perpendicular to `u2`.  
Calculate the maximum of ${\bf x}^\top A{\bf x}$ over these points ${\bf x}$.

In [None]:
### your answer here
u2 = np.array([1,1,1])
vs = np.random.randn(3,10000)
vs1 = vs / np.linalg.norm(vs, axis=0)
Avs = u2.dot(vs1) ### 1 x 3 times 3 x 10000
mask = (np.abs(Avs) < 0.01) #T or F

new_vs = vs1[:, mask]
%matplotlib notebook
ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)
ax.scatter(new_vs[0], new_vs[1], new_vs[2])
a=np.sum(new_vs*(A.dot(new_vs)),axis=0)
max(a)
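
An alternative to filtering by a tolerance is to project each random vector onto the orthogonal complement of `u2`, so that no sample is discarded. This is only a sketch of that idea, not the intended answer; the seed and the normalized `u2` are assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(4)  # seed chosen only for reproducibility
A = np.ones((3, 3))
u2 = np.array([1, 1, 1]) / np.sqrt(3)   # unit eigenvector for eigenvalue 3

vs = rng.standard_normal((3, 10000))
# subtract the component along u2, then rescale to length 1
vs = vs - np.outer(u2, u2.dot(vs))
vs = vs / np.linalg.norm(vs, axis=0)

Rs = np.sum(vs * A.dot(vs), axis=0)
print(np.abs(u2.dot(vs)).max())  # essentially zero: every column is kept
print(Rs.max())                  # close to 0, the second-largest eigenvalue
```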

## Exercises

###### Exercise 2
Let  
```python
mu = np.array([0,0])
cov = np.array([[1.1,1],
                [1,1.1]])
vs = np.random.multivariate_normal(mu, cov, 100)
```

###### 2(a)
Plot the points (rows) in `vs` .

In [None]:
### your answer here
mu = np.array([0,0])
cov = np.array([[1.1,1],
                [1,1.1]])
vs = np.random.multivariate_normal(mu, cov, 100)

plt.scatter(vs[:,0],vs[:,1])

##### Jephian:
You may consider switching back to `%matplotlib inline` .

###### 2(b)
Find the center of mass over the points in `vs` .  
Shift the points in `vs` so that the center is at the origin.  

In [None]:
### your answer here
center=np.mean(vs[:,0]),np.mean(vs[:,1])
plt.scatter(vs[:,0]-center[0],vs[:,1]-center[1])

###### 2(c)
Suppose $X$ is a $N\times d$ data matrix whose rows are samples and columns are features.  
If the rows are centered at the origin, then $\frac{1}{N}X^\top X$ is called the **covariance matrix** between the features.

Thinking of `vs` as a data matrix whose rows are centered at the origin, find the covariance matrix `C` .

In [None]:
### your answer here
mu = np.array([0,0]) ##(2,)
cov = np.array([[1.1, 1],
         [1 , 1.1]]) ##(2, 2)
vs = np.random.multivariate_normal(mu, cov, 100) ##(100, 2)
vs = vs - vs.mean(axis=0) ## center the rows at the origin

C = (1/100)*(vs.T.dot(vs)) ##(2, 2)
print(C)
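
As a cross-check, the formula $\frac{1}{N}X^\top X$ for centered data should agree with `np.cov` when the latter is told that columns are the features and that it should divide by $N$. The sketch below also illustrates why the Rayleigh quotient is relevant here: for a unit vector ${\bf x}$, the value ${\bf x}^\top C{\bf x}$ is the variance of the data projected onto ${\bf x}$. The names `X` and `N`, the seed, and the test direction are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # seed chosen only for reproducibility
mu = np.array([0, 0])
cov = np.array([[1.1, 1],
                [1, 1.1]])
X = rng.multivariate_normal(mu, cov, 100)  # (100, 2)
X = X - X.mean(axis=0)                     # center the rows at the origin

N = X.shape[0]
C = X.T.dot(X) / N

# np.cov treats rows as variables by default, hence rowvar=False;
# bias=True divides by N instead of N - 1
print(np.allclose(C, np.cov(X, rowvar=False, bias=True)))

# for a unit vector x, x^T C x is the variance of the projections X x
x = np.array([1, 1]) / np.sqrt(2)
print(x.dot(C.dot(x)), np.var(X.dot(x)))
```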

###### 2(d)
Generate 100 vectors of length 1 in $\mathbb{R}^2$.  
Find the smallest Rayleigh quotient and the vector ${\bf u}_0$ that achieves it.  
Find the largest Rayleigh quotient and the vector ${\bf u}_1$ that achieves it.   

In [None]:
### your answer here
mu = np.array([0,0])
cov = np.array([[1.1,1],
                [1,1.1]])
vs = np.random.multivariate_normal(mu, cov, 100)#100*2
A = np.ones((2,2))#2*2
vs = vs / np.linalg.norm(vs, axis=1, keepdims=True)#rows of length 1
print(np.linalg.eigh(A)[0])
u0=np.linalg.eigh(A)[1][:,0]#eigenvectors are the columns
u1=np.linalg.eigh(A)[1][:,1]
print(np.sum(u0.T*(A.dot(u0.T)),axis=0),np.sum(u1.T*(A.dot(u1.T)),axis=0))
print(np.isclose(np.linalg.eigh(A)[0][0],np.sum(u0.T*(A.dot(u0.T)),axis=0)))
print(np.isclose(np.linalg.eigh(A)[0][1],np.sum(u1.T*(A.dot(u1.T)),axis=0)))

##### Jephian:
The code calls `np.linalg.eigh(A)` many times.  
This can be costly if the command takes a long time.  
It would be nice to store its results for later use.
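
One possible way to organize 2(d) is sketched below, under the assumption that the Rayleigh quotient is meant to be taken with respect to the covariance matrix `C` from 2(c). `np.linalg.eigh` is called once and both outputs are kept. The names `X`, `xs`, and `Rs` and the seed are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)  # seed chosen only for reproducibility
mu = np.array([0, 0])
cov = np.array([[1.1, 1],
                [1, 1.1]])
X = rng.multivariate_normal(mu, cov, 100)
X = X - X.mean(axis=0)          # center the rows at the origin
C = X.T.dot(X) / X.shape[0]

# call eigh once and keep both outputs
vals, vecs = np.linalg.eigh(C)

# 100 random unit vectors in R^2, stored as columns
xs = rng.standard_normal((2, 100))
xs = xs / np.linalg.norm(xs, axis=0)
Rs = np.sum(xs * C.dot(xs), axis=0)

u0 = xs[:, Rs.argmin()]
u1 = xs[:, Rs.argmax()]
print(Rs.min(), "vs", vals[0])
print(Rs.max(), "vs", vals[1])
```

With only 100 samples the extremes found by random search are approximations, so they should be close to, but not below or above, the eigenvalues in `vals`.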

###### 2(e)
Plot the points in the shifted `vs` .  
Draw the vectors ${\bf u}_0$ and ${\bf u}_1$.

In [None]:
### your answer here
mu = np.array([0,0]) ##(2,)
cov = np.array([[1.1, 1],
         [1 , 1.1]]) ##(2, 2)
vs = np.random.multivariate_normal(mu, cov, 100) ##(100, 2)

center = np.mean(vs[:,0]),np.mean(vs[:,1])
plt.scatter(vs[:,0]-center[0],vs[:,1]-center[1])

plt.arrow(0,0,*u0,head_width=0.2,color="red")
plt.arrow(0,0,*u1,head_width=0.2,color="green")

##### Jephian:
If you do `plt.axis('equal')`  
you will realize the two vectors are orthogonal.

###### Exercise 3
Let  
```python
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals,vecs = LA.eigh(A)
```
Let $\lambda_0,\ldots,\lambda_4$ be the values in `vals` .  
Let $\beta = \{{\bf u}_0,\ldots, {\bf u}_4\}$ be the column vectors in `vecs` .

###### 3(a)
Pick a random vector ${\bf x}$ of length 1 in $\mathbb{R}^5$.  
Compute ${\bf c} = [{\bf x}]_\beta = (c_0,\ldots, c_4)^\top$.  

In [None]:
### your answer here
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals,vecs = LA.eigh(A)
x = np.random.randn(5,1)
x = x / np.linalg.norm(x, axis=0)
c=vecs.T.dot(x)
c

##### Jephian:
Depending on the purpose,  
sometimes the vector `x` is of the shape `(5,)`  
and sometimes it is of the shape `(5,1)` .
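
A small illustration of the difference between the two shapes, using the matrix from Exercise 3. The test vector and the name `x_col` are chosen here only for the example.

```python
import numpy as np

A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
x = np.arange(5.0)          # shape (5,)
x_col = x[:, np.newaxis]    # shape (5, 1), the same entries as a column

print(x.dot(A.dot(x)))               # a scalar
print(x_col.T.dot(A.dot(x_col)))     # a (1, 1) array holding the same value
print(x_col.reshape(-1).shape)       # back to (5,)
```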

###### 3(b)
Check that $\|{\bf x}\|^2 = c_0^2 + \cdots + c_4^2$.  
Therefore, the condition $\|{\bf x}\| = 1$ is equivalent to $c_0^2 + \cdots + c_4^2 = 1$.

In [None]:
### your answer here
np.power(c,2).sum(axis=0)

##### Jephian:
Alternatively, `np.sum(c**2, axis=0)` .  
If `c.shape` is `(5,)`, then `np.sum(c**2)` is good enough.

###### 3(c)
Check that  
$A{\bf x} = c_0\lambda_0{\bf u}_0 + \cdots + c_4\lambda_4{\bf u}_4$ and  
${\bf x}^\top A{\bf x} = c_0^2\lambda_0 + \cdots + c_4^2\lambda_4$.  
Therefore, under the condition that $c_0^2 + \cdots + c_4^2 = 1$, the extrema of ${\bf x}^\top A{\bf x}$ are $\lambda_0$ and $\lambda_4$.

In [None]:
cv=c.T*vals.T

In [None]:
# A x = c_0 λ_0 u_0 + ... + c_4 λ_4 u_4
np.isclose(A.dot(x),np.sum(np.matrix(cv*np.split(vecs, 5, axis=0)),1))

In [None]:
# x^T A x = c_0^2 λ_0 + ... + c_4^2 λ_4
np.isclose(np.sum(x*(A.dot(x)),axis=0),np.sum(np.power(c,2).T*vals.T,axis=1))

In [None]:
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vs = np.random.randn(5,100000)
vs = vs / np.linalg.norm(vs, axis=0)

a=np.sum(vs*(A.dot(vs)),axis=0)
print(min(a),max(a))
print(np.linalg.eigh(A)[0])

##### Jephian:
I would probably compare $[A{\bf x}]_\beta$ with $(c_0\lambda_0, \ldots, c_4\lambda_4)^\top$  
instead of comparing $A{\bf x}$ with $c_0\lambda_0{\bf u}_0 + \cdots + c_4\lambda_4{\bf u}_4$.  
```python
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals,vecs = LA.eigh(A)
x = np.random.randn(5)
c = vecs.T.dot(x)

print("x =", x)
print("[x]_beta =", c)
print("|x|^2, |[x]_beta|^2 =", np.sum(x**2), np.sum(c**2))
print("---")
print("[Ax]_beta =", vecs.T.dot(A.dot(x)))
print("ci lambda i =", c*vals)
print("---")
print("xAx =", x.dot(A.dot(x)))
print("sum of ci^2 lambda i =", np.sum(c**2 * vals))
```
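
The identities checked numerically in 3(c) can also be written out by hand. This is a short derivation, assuming that the vectors in $\beta$ are orthonormal (which is what `LA.eigh` returns for a symmetric matrix), so that ${\bf x} = \sum_i c_i{\bf u}_i$ and ${\bf u}_j^\top{\bf u}_i$ is $1$ when $i = j$ and $0$ otherwise.

$$
\begin{aligned}
A{\bf x} &= A\sum_{i=0}^4 c_i{\bf u}_i = \sum_{i=0}^4 c_i\lambda_i{\bf u}_i, \\
{\bf x}^\top A{\bf x} &= \Big(\sum_{j=0}^4 c_j{\bf u}_j\Big)^\top \Big(\sum_{i=0}^4 c_i\lambda_i{\bf u}_i\Big) = \sum_{i=0}^4 c_i^2\lambda_i.
\end{aligned}
$$

When $c_0^2 + \cdots + c_4^2 = 1$, this gives
$$\lambda_0 = \lambda_0\sum_{i=0}^4 c_i^2 \leq \sum_{i=0}^4 c_i^2\lambda_i \leq \lambda_4\sum_{i=0}^4 c_i^2 = \lambda_4,$$
with equality on the left for ${\bf x} = {\bf u}_0$ and on the right for ${\bf x} = {\bf u}_4$.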

###### Exercise 4
Let  
```python
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
```

###### 4(a)
Pick a random vector ${\bf x} = (x_0,x_1,x_2,x_3,x_4)^\top$.  
Check that 
$${\bf x}^\top A{\bf x} = \sum_{\substack{i<j \\ (A)_{ij} = -1}}(x_i - x_j)^2.$$  
For convenience, we call this value $R({\bf x})$.

In [None]:
### your answer here
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
vs = np.random.randn(5,1)
vs = vs / np.linalg.norm(vs, axis=0)
a=np.sum(vs*(A.dot(vs)),axis=0)
a
b=np.matrix(np.power(vs-vs.T,2))
b=np.diagonal(b,offset=1).sum(axis=0)

np.isclose(a,b)

##### Jephian:
- `np.matrix` is not recommended by the NumPy community since it is no longer well maintained.
- The code can be simpler, as follows.

```python
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])

x = np.random.randn(5)
a = x.dot(A.dot(x))

diff_square = (x[:,np.newaxis] - x)**2
mask = (A == -1)
b = diff_square[mask].sum() / 2

print(a, b)
```

###### 4(b)
Pick 1000000 random vectors ${\bf x}$ of length 1 in $\mathbb{R}^5$.  
Find the one ${\bf u}_0$ that achieves the minimum $R({\bf x})$.  
Can you guess the correct ${\bf u}_0$ by the identity in 4(a)?

In [None]:
### your answer here
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
vs = np.random.randn(5,1000000)
vs = vs / np.linalg.norm(vs, axis=0)
a=np.sum(vs*(A.dot(vs)),axis=0)
np.where(a == a.min()) 
minvs=vs[:,np.where(a == a.min())]
v=np.array(vs[:,np.where(a == a.min())])[:,0,:]
b=np.sum(v*(A.dot(v)),axis=0)
b

In [None]:
u0=minvs[:,:,0][:,0]
a=np.sum(u0*(A.dot(u0)),axis=0)
a

In [None]:
np.isclose(a,b)

##### Jephian:
Since we are only interested in one vector that achieves the minimum  
rather than all such vectors,  
we may use `np.argmin` instead of `np.where(a == a.min())`.  

Also, you did not say which vector realizes the minimum.  
According to the equation in 4(a),  
the right-hand side is minimized by ${\bf x} = (x_0,\ldots,x_4)^\top$  
with $x_0 = x_1 = \cdots = x_4$; for a vector of length 1 this means ${\bf u}_0 = \pm\frac{1}{\sqrt{5}}(1,1,1,1,1)^\top$, where $R({\bf u}_0) = 0$.

See sample below.  

```python
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
vs = np.random.randn(5,1000000)
vs = vs / np.linalg.norm(vs, axis=0)
Rs = np.sum(vs*(A.dot(vs)),axis=0)
ind = Rs.argmin()

print(Rs[ind])
print(vs[:,ind])
```
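
The guess suggested by the identity in 4(a), namely the constant vector of length 1, can be compared against the exact eigen-decomposition. This is a sketch for verification only. It relies on the fact that eigenvectors are determined only up to sign, which is why absolute values are compared.

```python
import numpy as np

A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])

u0 = np.ones(5) / np.sqrt(5)   # constant vector of length 1
print(u0.dot(A.dot(u0)))       # R(u0), zero up to rounding

vals, vecs = np.linalg.eigh(A)
print(vals[0])                 # smallest eigenvalue, zero up to rounding
# eigenvectors are determined up to sign, so compare absolute values
print(np.allclose(np.abs(vecs[:, 0]), u0))
```

The vector found by random search in the sample above should be close to this `u0` or to its negative, although one million samples in $\mathbb{R}^5$ still leave a visible gap.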