# Rayleigh quotient

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg as LA

## Main idea

Suppose a symmetric matrix $A$ has eigenvalues  
$$\lambda_0\leq \cdots \leq\lambda_{n-1}.$$
Then  
$$\max_{{\bf x}\neq {\bf 0}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \max_{\|{\bf x}\| = 1} {\bf x}^\top A{\bf x} = \lambda_{n-1}$$
and 
$$\min_{{\bf x}\neq {\bf 0}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \min_{\|{\bf x}\| = 1} {\bf x}^\top A{\bf x} = \lambda_{0}.$$
Any vector ${\bf x}$ that achieves the maximum or the minimum is an eigenvector for the corresponding eigenvalue.
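
As a quick numerical sanity check of the statement above, the following sketch samples random unit vectors and compares their Rayleigh quotients with the eigenvalues returned by `np.linalg.eigh`. The random symmetric matrix, the size `n = 4`, and the seed are arbitrary illustrative choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 4                           # arbitrary size, chosen for illustration
M = rng.standard_normal((n, n))
A = (M + M.T) / 2               # symmetrize to get a symmetric matrix

vals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order

# Rayleigh quotients of random unit vectors (columns of xs)
xs = rng.standard_normal((n, 10000))
xs = xs / np.linalg.norm(xs, axis=0)
rq = np.sum(xs * (A @ xs), axis=0)

# every sampled value lies between the smallest and the largest eigenvalue
print(vals[0], rq.min(), rq.max(), vals[-1])

# the extreme values are attained at the corresponding eigenvectors
u_min, u_max = vecs[:, 0], vecs[:, -1]
print(np.isclose(u_min @ A @ u_min, vals[0]),
      np.isclose(u_max @ A @ u_max, vals[-1]))
```

Random sampling only approaches the extreme values from the inside; the eigenvectors hit them exactly (up to rounding).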

Moreover, if ${\bf u}_0,\ldots,{\bf u}_{n-1}$ form an orthonormal eigenbasis with respect to the eigenvalues $\lambda_0,\ldots,\lambda_{n-1}$, then  
$$\max_{\substack{{\bf x}\neq {\bf 0}\\{\bf x}\perp U_k}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \max_{\substack{\|{\bf x}\| = 1\\{\bf x}\perp U_k}} {\bf x}^\top A{\bf x} = \lambda_{k-1}$$
and 
$$\min_{\substack{{\bf x}\neq {\bf 0}\\{\bf x}\perp L_k}}\frac{{\bf x}^\top A{\bf x}}{{\bf x}^\top {\bf x}} = \min_{\substack{\|{\bf x}\| = 1\\{\bf x}\perp L_k}} {\bf x}^\top A{\bf x} = \lambda_{k+1},$$
where $L_k = \operatorname{span}(\{{\bf u}_0,\ldots,{\bf u}_k\})$ and $U_k = \operatorname{span}(\{{\bf u}_k,\ldots,{\bf u}_{n-1}\})$.
The first identity requires $1\leq k\leq n-1$ and the second requires $0\leq k\leq n-2$, since ${\bf x}\perp U_k$ means ${\bf x}\in\operatorname{span}(\{{\bf u}_0,\ldots,{\bf u}_{k-1}\})$ and ${\bf x}\perp L_k$ means ${\bf x}\in\operatorname{span}(\{{\bf u}_{k+1},\ldots,{\bf u}_{n-1}\})$.
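
The constrained version can be checked in the same spirit. In the sketch below, random vectors are projected onto the orthogonal complement of the relevant eigenvectors before the Rayleigh quotient is evaluated. Vectors orthogonal to ${\bf u}_k,\ldots,{\bf u}_{n-1}$ lie in the span of ${\bf u}_0,\ldots,{\bf u}_{k-1}$, so their largest Rayleigh quotient is $\lambda_{k-1}$; vectors orthogonal to ${\bf u}_0,\ldots,{\bf u}_k$ lie in the span of ${\bf u}_{k+1},\ldots,{\bf u}_{n-1}$, so their smallest Rayleigh quotient is $\lambda_{k+1}$. The matrix, `n = 5`, `k = 2`, the seed, and the helper name `rayleigh_on_complement` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
n, k = 5, 2                      # illustrative choices
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
vals, vecs = np.linalg.eigh(A)   # columns of vecs are orthonormal eigenvectors

def rayleigh_on_complement(basis, num=20000):
    """Rayleigh quotients of random unit vectors orthogonal to the columns of `basis`."""
    xs = rng.standard_normal((n, num))
    xs = xs - basis @ (basis.T @ xs)      # remove the component inside span(basis)
    xs = xs / np.linalg.norm(xs, axis=0)
    return np.sum(xs * (A @ xs), axis=0)

upper = rayleigh_on_complement(vecs[:, k:])      # x perpendicular to u_k, ..., u_{n-1}
lower = rayleigh_on_complement(vecs[:, :k + 1])  # x perpendicular to u_0, ..., u_k

print(upper.max(), vals[k - 1])   # sampled maximum stays at or below lambda_{k-1}
print(lower.min(), vals[k + 1])   # sampled minimum stays at or above lambda_{k+1}
```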

## Side stories

- Covariance matrix
- Laplacian matrix and its Rayleigh quotient

## Experiments

###### Exercise 1
Let  
```python
A = np.ones((3,3))
vs = np.random.randn(3,100)
vs = vs / np.linalg.norm(vs, axis=0)
```

###### 1(a)
Generate an array of ${\bf x}^\top A{\bf x}$, where ${\bf x}$ runs through the columns of `vs` .  
Find the minimum and the maximum.  
Compare them to the smallest and the largest eigenvalues of $A$.

In [None]:
A = np.ones((3,3))

#initialize our x's
vs = np.random.randn(3,10000) #the more random numbers we have, the closer the result will be to the real eigenvalues

vs = vs / np.linalg.norm(vs, axis=0) #normalize the x's, as we only care about unit vectors

vals = (vs * A.dot(vs)).sum(axis = 0) #x.T Ax

#np.linalg.eig(A) returns 2 elements: eigenvalues and eigenvectors
eigenv_A, _ = np.linalg.eig(A)

#print min and max of x.T Ax
print('minimum and maximum of x.T Ax: {}; {}'.format(vals.min().round(2), vals.max().round(2)))

#print min and max of eigenvalues of A
print('minimum and maximum eigenvalues: {}; {}'.format(eigenv_A.min().round(2), eigenv_A.max().round(2)))

#check how close they are
print('The difference between them: {}; {}'.format( (abs(vals.min()-eigenv_A.min()).round(5) ) , (abs( vals.max()-eigenv_A.max()).round(5) ) ) )

##### Jephian

Nice and clean answer.

In [None]:
first = vs[:,1]

print(first.sum()) 

print((first.T.dot(A).dot(first)))

print(first.T.dot(A))

print((first * A.dot(first)).sum(axis = 0))

###### 1(b)
It is known that 
```python
u2 = np.array([1,1,1])
```
is the eigenvector for the largest eigenvalue $\lambda_2 = 3$.  
Generate 10000 random points of length 1 in $\mathbb{R}^3$.  
Select those that are (almost) perpendicular to `u2` .  
Calculate the maximum of ${\bf x}^\top A{\bf x}$ over these points ${\bf x}$.

In [None]:
A = np.ones((3,3))

u2 = np.array([1,1,1])

#generating 10k random points 
vs = np.random.randn(3,10000) 

#normalizing
vs = vs / np.linalg.norm(vs, axis=0) 
eigenv_A, _ = np.linalg.eig(A)
print(_)
xTAx = np.sum(vs*A.dot(vs), axis=0)
print(max(xTAx))
#select x's almost perpendicular to u2 
#multiply each column of vs (each x) by u2
multipl = vs * u2[:, np.newaxis]
#calculate the absolute value of the inner product of each x and u2
module = np.abs(multipl.sum(axis = 0))

#choose x's that are almost perpendicular
mask = module < 0.001
selected = vs[:,mask]
#print(selected.shape)
#print(selected)

vals = (selected * A.dot(selected)).sum(axis = 0) #x.T Ax
#print(vals)

print('maximum of x.T Ax: {}'.format(vals.max().round(5)))



##### Jephian

The answer is correct.  

Here are some comments.
It is hard to randomly generate vectors that are perpendicular to `u2` .  
You may increase the number of samples or increase the threshold `0.001` .  

Alternatively, you may replace each `v` by `v - proj(v)` and normalize the vector again, where `proj(v)` is the projection of `v` onto `u2` .
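
A minimal sketch of the projection approach, assuming the same `A` and `u2` as in 1(b); the seed and the intermediate name `proj` are illustrative choices. Every sample is kept, because each column is made exactly perpendicular to `u2` (up to rounding) instead of being filtered by a threshold.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = np.ones((3, 3))
u2 = np.array([1, 1, 1])

vs = rng.standard_normal((3, 10000))

# subtract from each column its projection onto u2, then normalize again
proj = np.outer(u2, u2 @ vs) / (u2 @ u2)
vs = vs - proj
vs = vs / np.linalg.norm(vs, axis=0)

print(np.abs(u2 @ vs).max())          # all columns are perpendicular to u2 up to rounding
vals = np.sum(vs * (A @ vs), axis=0)  # x.T A x for every column x
print(vals.max())                     # close to the second largest eigenvalue of A, which is 0
```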

## Exercises

###### Exercise 2
Let  
```python
mu = np.array([0,0])
cov = np.array([[1.1,1],
                [1,1.1]])
vs = np.random.multivariate_normal(mu, cov, 100)
```

###### 2(a)
Plot the points (rows) in `vs` .

In [None]:
mu = np.array([0,0])
cov = np.array([[1.1,1],
                [1,1.1]])
vs = np.random.multivariate_normal(mu, cov, 100)

#%matplotlib notebook
ax = plt.axes()#(projection='3d')
ax.scatter(vs[:,0],vs[:,1])

###### 2(b)
Find the center of mass over the points in `vs` .  
Shift the points in `vs` so that the center is at the origin.  

In [None]:
center = np.array([sum(vs[:,0])/len(vs),sum(vs[:,1])/len(vs)])

shifted_vs = vs - center.T

ax = plt.axes()#(projection='3d')

ax.set_xlim(-0.5,0.5)
ax.set_ylim(-0.5,0.5)

ax.scatter(vs[:,0],vs[:,1], alpha = 0.3)

ax.scatter(*center, color = 'r')
ax.scatter(shifted_vs[:,0],shifted_vs[:,1], alpha = 0.3)

center_shifted = np.array([sum(shifted_vs[:,0])/len(shifted_vs),sum(shifted_vs[:,1])/len(shifted_vs)])
print(center.round(4),center_shifted.round(4))

##### Jephian

The first line can be replaced by `center = vs.mean(axis=0)` .  
Also, it does not seem necessary to set `xlim` and `ylim` .

###### 2(c)
Suppose $X$ is an $N\times d$ data matrix whose rows are samples and columns are features.  
If the rows are centered at the origin, then $\frac{1}{N}X^\top X$ is called the **covariance matrix** between the features.

Thinking of `vs` as a data matrix whose rows are centered at the origin, find the covariance matrix `C` .

In [None]:
#3 ways to do it:
#https://datascienceplus.com/understanding-the-covariance-matrix/#:~:text=where%20our%20data%20set%20is,XT%20X%20X%20T%20.

#1
'''# Covariance
def cov(x, y):
    xbar, ybar = x.mean(), y.mean()
    return np.sum((x - xbar)*(y - ybar))/(len(x) - 1)

# Covariance matrix
def cov_mat(X):
    return np.array([[cov(X[0], X[0]), cov(X[0], X[1])], \
            [cov(X[1], X[0]), cov(X[1], X[1])]])


# Calculate covariance matrix 
cov_mat(vs.T)''' 

#2
#np.cov(vs.T)

#3. as per instructions
covarience_mat = 1/len(vs) * shifted_vs.T.dot(shifted_vs)
print(covarience_mat)

##### Jephian

Nice that you have several answers.  
They lead to the same result up to normalization: the first two divide by $N-1$, while the third divides by $N$.  
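
A short check of how the approaches above relate, as a sketch under the assumption that the data is generated as in Exercise 2 (the seed is arbitrary). `np.cov` divides by $N-1$ by default and by $N$ when `bias=True`, so it matches $\frac{1}{N}X^\top X$ only in the latter case.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
mu = np.array([0, 0])
cov = np.array([[1.1, 1],
                [1, 1.1]])
vs = rng.multivariate_normal(mu, cov, 100)
N = len(vs)

shifted_vs = vs - vs.mean(axis=0)        # center the rows at the origin
C = shifted_vs.T @ shifted_vs / N        # the definition used in 2(c)

print(np.allclose(C, np.cov(vs.T, bias=True)))     # bias=True divides by N
print(np.allclose(C * N / (N - 1), np.cov(vs.T)))  # the default divides by N - 1
```

For $N = 100$ the two normalizations differ by a factor of $100/99$, which is easy to miss when comparing printed matrices by eye.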

###### 2(d)
Let $C$ be the covariance matrix found in 2(c).  
Generate 100 vectors of length 1 in $\mathbb{R}^2$.  
Find the smallest Rayleigh quotient of $C$ and the vector ${\bf u}_0$ that achieves it.  
Find the largest Rayleigh quotient of $C$ and the vector ${\bf u}_1$ that achieves it.   

In [None]:
### your answer here
vs = np.random.randn(2, 100) #Generate 100 vectors of length 1 in  ℝ2 
vs = vs / np.linalg.norm(vs, axis=0) #(2, 100)
u0, u1 = LA.eigh(covarience_mat)[1].T #eigenvectors of C are the columns, so transpose before unpacking: u0, u1
print('all eigenvalues: ',LA.eigh(covarience_mat)[0]) #eigenvalues
print('smallest Rayleigh quotient of  𝐶: ', np.sum(u0.T*(covarience_mat.dot(u0)))) #smallest Rayleigh quotient of  𝐶 = first eigenvalue
print('corresponding vector u0: ', u0)
print('largest Rayleigh quotient of  𝐶: ', np.sum(u1.T*(covarience_mat.dot(u1)))) #largest Rayleigh quotient of  𝐶  = second eigenvalue
print('corresponding vector u1: ', u1)
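
The exercise can also be approached by sampling, as its statement suggests. The sketch below evaluates the Rayleigh quotient on 100 random unit vectors and picks the best ones; the names `data`, `xs`, and `rq` and the seed are illustrative assumptions, and the data is regenerated here so that the block runs on its own. With only 100 samples, the selected vectors approximate the eigenvectors rather than match them exactly.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
data = rng.multivariate_normal([0, 0], [[1.1, 1], [1, 1.1]], 100)
data = data - data.mean(axis=0)
C = data.T @ data / len(data)    # covariance matrix as in 2(c)

# 100 random unit vectors in R^2, stored as columns
xs = rng.standard_normal((2, 100))
xs = xs / np.linalg.norm(xs, axis=0)
rq = np.sum(xs * (C @ xs), axis=0)   # Rayleigh quotient of each column

u0 = xs[:, rq.argmin()]   # sampled vector with the smallest Rayleigh quotient
u1 = xs[:, rq.argmax()]   # sampled vector with the largest Rayleigh quotient

vals, vecs = np.linalg.eigh(C)
print(rq.min(), vals[0])
print(rq.max(), vals[1])
print(abs(u0 @ vecs[:, 0]), abs(u1 @ vecs[:, 1]))  # near 1 when the directions agree
```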

###### 2(e)
Plot the points in the shifted `vs` .  
Draw the vectors ${\bf u}_0$ and ${\bf u}_1$.

In [None]:
### your answer here
#reuse shifted_vs from 2(b): the centered data from which C, u0, and u1 were computed
plt.axis('equal')
plt.scatter(shifted_vs[:,0], shifted_vs[:,1], c = 'blue')  # points whose center is shifted to (0,0)
plt.arrow(0, 0, u0[0], u0[1], head_width = 0.1, color = 'red') #vector u0
plt.arrow(0, 0, u1[0], u1[1], head_width = 0.1, color = 'black') #vector u1

###### Exercise 3
Let  
```python
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals,vecs = LA.eigh(A)
```
Let $\lambda_0,\ldots,\lambda_4$ be the values in `vals` .  
Let $\beta = \{{\bf u}_0,\ldots, {\bf u}_4\}$ be the column vectors in `vecs` .

###### 3(a)
Pick a random vector ${\bf x}$ of length 1 in $\mathbb{R}^5$.  
Compute ${\bf c} = [{\bf x}]_\beta = (c_0,\ldots, c_4)^\top$.  

In [None]:
### your answer here
A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals,vecs = LA.eigh(A)
x = np.random.randn(5,1) 
x = x / np.linalg.norm(x, axis=0) #(5, 1)
c = vecs.T.dot(x) #LA10
print(c)

##### Jephian

I would do `x = np.random.randn(5)` instead.  

###### 3(b)
Check that $\|{\bf x}\|^2 = c_0^2 + \cdots + c_4^2$.  
Therefore, the condition $\|{\bf x}\| = 1$ is equivalent to $c_0^2 + \cdots + c_4^2 = 1$.

In [None]:
### your answer here
print('‖𝐱‖^2=𝑐0^2+⋯+𝑐4^2: ', x.T.dot(x).round(8) == (c[0]**2 + c[1]**2 + c[2]**2 + c[3]**2 + c[4]**2).round(8)) #check : ‖𝐱‖^2=𝑐0^2+⋯+𝑐4^2
print('‖𝐱‖= ',x.T.dot(x)) #‖𝐱‖=1
print('𝑐0^2+⋯+𝑐4^2 = ', c[0]**2 + c[1]**2 + c[2]**2 + c[3]**2 + c[4]**2) #𝑐0^2+⋯+𝑐4^2 = 1

##### Jephian

The second line should be $\|{\bf x}\|^2$; you missed the square.  
The third line can be simplified as `np.sum(c ** 2)` .

###### 3(c)
Check that  
$A{\bf x} = c_0\lambda_0{\bf u}_0 + \cdots + c_4\lambda_4{\bf u}_4$ and  
${\bf x}^\top A{\bf x} = c_0^2\lambda_0 + \cdots + c_4^2\lambda_4$.  
Therefore, under the condition that $c_0^2 + \cdots + c_4^2 = 1$, the extrema of ${\bf x}^\top A{\bf x}$ are $\lambda_0$ and $\lambda_4$.
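
One way to see the last claim, assuming $\lambda_0\leq\cdots\leq\lambda_4$ (the ascending order in which `LA.eigh` returns the eigenvalues): since each $c_i^2\geq 0$ and $c_0^2 + \cdots + c_4^2 = 1$,
$$\lambda_0 = \lambda_0\sum_{i=0}^{4} c_i^2 \leq \sum_{i=0}^{4} c_i^2\lambda_i \leq \lambda_4\sum_{i=0}^{4} c_i^2 = \lambda_4.$$
The left equality holds for ${\bf c} = (1,0,0,0,0)^\top$, that is, ${\bf x} = {\bf u}_0$, and the right equality holds for ${\bf c} = (0,0,0,0,1)^\top$, that is, ${\bf x} = {\bf u}_4$.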

In [None]:
### your answer here

print ('Ax=c0λ0u0+⋯+c4λ4u4: ', np.allclose(A.dot(x).T,c[0]*vals[0]*vecs[:,0] + c[1]*vals[1]*vecs[:,1] + c[2]*vals[2]*vecs[:,2] + c[3]*vals[3]*vecs[:,3] + c[4]*vals[4]*vecs[:,4]))
print ('x^T A x = c0^2 λ0 + ⋯ + c4^2 λ4: ', np.isclose(x.T.dot(A).dot(x),c[0]**2*vals[0] + c[1]**2*vals[1] + c[2]**2*vals[2] + c[3]**2*vals[3] + c[4]**2*vals[4]))


##### Jephian

Again, the computation can be easier.  
Assuming `c` is a 1-D array of shape `(5,)` , for $c_0\lambda_0{\bf u}_0 + \cdots + c_4\lambda_4{\bf u}_4$, you may do `np.sum(c * vals * vecs, axis=1)` .  
For $c_0^2\lambda_0 + \cdots + c_4^2\lambda_4$, you may do `np.sum(c ** 2 * vals)` .
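
Putting the suggestions for Exercise 3 together, a compact sketch with a 1-D `x` might look as follows. It uses `np.linalg.eigh` in place of `LA.eigh` only to keep the block NumPy-only, and the seed is arbitrary. The broadcasting in `c * vals * vecs` relies on `c` and `vals` both having shape `(5,)` , so that column $j$ of `vecs` is scaled by $c_j\lambda_j$.

```python
import numpy as np

A = np.array([[0,1,0,0,0],
              [1,0,1,0,0],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]])
vals, vecs = np.linalg.eigh(A)   # swapped in for LA.eigh; same ascending order

rng = np.random.default_rng(0)   # arbitrary seed
x = rng.standard_normal(5)       # 1-D array of shape (5,)
x = x / np.linalg.norm(x)
c = vecs.T @ x                   # coordinates of x in the orthonormal eigenbasis

print(np.isclose(x @ x, np.sum(c ** 2)))                    # 3(b)
print(np.allclose(A @ x, np.sum(c * vals * vecs, axis=1)))  # 3(c), first identity
print(np.isclose(x @ A @ x, np.sum(c ** 2 * vals)))         # 3(c), second identity
```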

###### Exercise 4
Let  
```python
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
```

###### 4(a)
Pick a random vector ${\bf x} = (x_0,x_1,x_2,x_3,x_4)^\top$.  
Check that 
$${\bf x}^\top A{\bf x} = \sum_{\substack{i<j \\ (A)_{ij} = -1}}(x_i - x_j)^2.$$  
For convenience, we call this value $R({\bf x})$.

In [None]:
### your answer here
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]]) #(5,5)
x = np.random.randn(5) #(5,)
sum = 0
print(x.dot(A).dot(x))  # Check xTAx first
for i in range(5): # Check sum of square of the diff 
  for j in range(i,5):
    if A[i,j] == -1: #like a mask
      sum = sum + (x[i] - x[j])**2
    else:
      sum = sum
print(sum) #print them out, and we will know they are the same.

##### Jephian

Usually, the indent is four spaces.
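
The identity in 4(a) can also be checked without explicit loops. The sketch below is one possible vectorized version: it reads the index pairs $(i,j)$ with $i<j$ and $(A)_{ij} = -1$ from the strict upper triangle of `A` . The seed is arbitrary.

```python
import numpy as np

A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])

rng = np.random.default_rng(0)   # arbitrary seed
x = rng.standard_normal(5)

# index pairs (i, j) with i < j and A[i, j] == -1
i, j = np.nonzero(np.triu(A, k=1) == -1)
R = np.sum((x[i] - x[j]) ** 2)   # sum of squared differences over those pairs

print(x @ A @ x, R)              # the two values agree up to rounding
```

This also avoids reusing the name `sum` for a variable, which would otherwise shadow Python's built-in `sum` function.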

###### 4(b)
Pick 1000000 random vectors ${\bf x}$ of length 1 in $\mathbb{R}^5$.  
Find the one ${\bf u}_0$ that achieves the minimum $R({\bf x})$.  
Can you guess the correct ${\bf u}_0$ by the identity in 4(a)?

In [None]:
### your answer here
A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])
x = np.random.randn(5,1000000)
x = x / np.linalg.norm(x, axis=0)
Rs = np.sum(x*(A.dot(x)),axis=0) #(1000000,)
min_R = Rs.min() #the minimum value of R(x) over the samples
u0 = x[:, Rs.argmin()] #the vector that achieves it
print(min_R)
print(u0)

In [None]:
#Going from one random vector x in 4(a) to 1000000 random vectors x, we use the identity in 4(a) to compute the minimum of R(x) in another way
minsum = np.inf
for i in range(1000000): # loop over the 1000000 random vectors x
    total = 0 # reset the running sum for each column
    for j in range(5): # sum of squares of the differences
        for k in range(j,5): # j<k by the identity in 4(a); the diagonal entries are never -1
            if A[j,k] == -1: # like a mask
                total = total + (x[j,i] - x[k,i])**2
    if minsum > total: # keep the smallest sum over all columns, which should equal min_R above
        minsum = total
print(minsum)
print(np.isclose(min_R,minsum))

##### Jephian

What I wanted to say is, the minimum value of  
$$
    {\bf x}^\top A{\bf x} = \sum_{\substack{i<j \\ (A)_{ij} = -1}}(x_i - x_j)^2
$$  
happens when $x_i = x_j$ for all "related" $i$ and $j$.  
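
For the matrix in Exercise 4, every index is linked to every other through a chain of $-1$ entries, so the condition forces all entries of ${\bf x}$ to be equal. A unit vector of that form is $\pm\frac{1}{\sqrt{5}}(1,1,1,1,1)^\top$, and the minimum value is $0$. The following sketch checks this guess against `np.linalg.eigh` (used here as an illustrative choice).

```python
import numpy as np

A = np.array([[1,-1,0,0,0],
              [-1,2,-1,0,0],
              [0,-1,2,-1,0],
              [0,0,-1,2,-1],
              [0,0,0,-1,1]])

x = np.ones(5) / np.sqrt(5)      # unit vector with all entries equal
print(x @ A @ x)                 # 0 up to rounding, since every difference x_i - x_j vanishes

vals, vecs = np.linalg.eigh(A)
print(vals[0])                               # smallest eigenvalue, 0 up to rounding
print(np.allclose(np.abs(vecs[:, 0]), x))    # the computed eigenvector is x up to sign
```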