# Chapter 12 - The Eigenvector

### Questions I have:
- How are matrices related to Markov chains?
- How are matrices related to Google's PageRank algorithm?
- How do I compute eigenvectors and eigenvalues?
- What's the relationship between the eigenvalue/eigenvector decomposition and other known matrix decompositions (e.g. orthogonal, SVD)?
- What are other applications of eigenvalues and eigenvectors in data science?

## Notes
After diagonalizing A, any vector can be written as a linear combination of the eigenvectors of A; multiplying by A then just scales each coefficient by the corresponding eigenvalue.

## Eigenvalues and eigenvectors
### Definition 12.3.1
For a matrix A whose row-label set equals its column-label set, if λ is a scalar and v is a nonzero vector such that Av = λv, we say that λ is an eigenvalue of A, and v is a corresponding eigenvector.

- A must be a square matrix.
- The eigenvectors for a given eigenvalue λ, together with the zero vector, form a vector space called the eigenspace.

Interpretation
- In an eigenspace __of A__, multiplying by A is equivalent to scaling by λ.
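A minimal numerical check of the definition, assuming NumPy is available. The matrix, vector, and scalar below are illustrative choices, not taken from the book.

```python
import numpy as np

# Illustrative symmetric matrix (an assumption for this sketch)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 1.0])   # candidate eigenvector
lam = 3.0                  # candidate eigenvalue

# Definition: A v = lam v with v nonzero
is_eigenpair = np.allclose(A @ v, lam * v)

# Any nonzero multiple of v lies in the same eigenspace
w = -2.5 * v
same_eigenspace = np.allclose(A @ w, lam * w)
```

Both flags come out true here, which matches the interpretation that A acts as plain scaling by λ inside the eigenspace.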

### Lemma 12.3.6: finding eigenvectors and eigenvalues
Let A be a square matrix.
- The number λ is an eigenvalue of A if and only if A − λ 1 is not invertible (here 1 denotes the identity matrix).
- If λ is in fact an eigenvalue of A, then the corresponding eigenspace is the null space of A − λ 1.

### Corollary 12.3.8: A and Aᵀ have the same eigenvalues
If λ is an eigenvalue of A, then it is also an eigenvalue of Aᵀ.
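A quick numerical check of the corollary, assuming NumPy. The matrix is the same [[1, 2], [3, 4]] used in a later notebook cell; note that only the eigenvalues are compared, since the eigenvectors of A and Aᵀ generally differ.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Sort so the two lists can be compared entry by entry
eig_A = np.sort(np.linalg.eigvals(A))
eig_At = np.sort(np.linalg.eigvals(A.T))

same_eigenvalues = np.allclose(eig_A, eig_At)
```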


## Diagonalizable matrices
### Definition: Similarity
- Definition 12.3.9: We say two square matrices A and B are similar if there is an invertible matrix S such that S⁻¹AS = B.
- Proposition 12.3.10: Similar matrices have the same eigenvalues.
- Definition 12.3.12: If a square matrix A is similar to a diagonal matrix, i.e. if there is an invertible matrix S such that S⁻¹AS = Λ where Λ is a diagonal matrix, we say A is diagonalizable.

Interpretation:
- A change of basis (a similarity transform by an invertible S) doesn't change the eigenvalues; this is a strong property.
- In the basis of eigenvectors the matrix is diagonal, so multiplication just scales each coordinate independently; this makes products and powers trivial to compute.
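A small sketch of Proposition 12.3.10, assuming NumPy. The matrices A and S are arbitrary illustrative choices; S only needs to be invertible.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # invertible: det(S) = 1

B = np.linalg.inv(S) @ A @ S        # B is similar to A

eig_A = np.sort(np.linalg.eigvals(A))
eig_B = np.sort(np.linalg.eigvals(B))
similar_share_eigenvalues = np.allclose(eig_A, eig_B)
```

B looks quite different from A entry by entry, yet the two eigenvalue lists agree up to rounding.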

### Theorem 12.3.15
An n × n matrix is diagonalizable __iff__ it has n linearly independent eigenvectors.

- Lemma 12.3.13 If Λ = S⁻¹AS is a diagonal matrix then the diagonal elements of Λ are eigenvalues, and the columns of S are linearly independent eigenvectors
- Lemma 12.3.14 If an n × n matrix A has n linearly independent eigenvectors then A is diagonalizable.

Numerical property of repeated multiplication:
In Aᵗx = SΛᵗS⁻¹x, the terms belonging to the eigenvalues with the largest absolute value dominate as t grows; the other terms become relatively small.
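The dominance claim can be sketched numerically, assuming NumPy. The example reuses the Fibonacci matrix from the notebook cells; the split of Aᵗ into one rank-one term per eigenvalue is the standard expansion of SΛᵗS⁻¹, and the names `dominant` and `gap` are introduced here for illustration.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])               # Fibonacci matrix
lam, S = np.linalg.eig(A)                # columns of S are eigenvectors
S_inv = np.linalg.inv(S)
t = 10

via_diag = S @ np.diag(lam ** t) @ S_inv          # S Λ^t S⁻¹
direct = np.linalg.matrix_power(A, t)             # A multiplied out t times

# Keep only the term for the eigenvalue of largest absolute value
i = np.argmax(np.abs(lam))
dominant = lam[i] ** t * np.outer(S[:, i], S_inv[i, :])
gap = np.max(np.abs(direct - dominant))           # contribution of the other eigenvalue
```

For t = 10 the dropped term is already below 0.01 in every entry, because the smaller eigenvalue has absolute value of about 0.618.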

## 12.6 Existence of eigenvalues

### Definition 12.6.1: Positive-definite matrix
A symmetric matrix whose eigenvalues are all positive real numbers is called a positive-definite matrix.
Any positive-definite matrix can be written as AᵀA for some invertible matrix A.

Positive-semidefinite matrix?
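A numerical illustration of the AᵀA construction, assuming NumPy. The second half uses a singular matrix to produce a zero eigenvalue; reading "positive-semidefinite" as "symmetric with all eigenvalues ≥ 0" is the usual definition, but treating it as the answer to the open question above is an assumption.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # invertible (det = -2)
M = A.T @ A                       # symmetric by construction
eigs_M = np.linalg.eigvalsh(M)    # eigvalsh is intended for symmetric matrices

C = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # singular (det = 0)
N = C.T @ C
eigs_N = np.linalg.eigvalsh(N)    # smallest eigenvalue is 0 up to rounding
```

With an invertible A all eigenvalues of AᵀA are strictly positive; with a singular C the product is still symmetric but picks up a zero eigenvalue.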

### Lemma 12.6.2: distinct eigenvalues -> independent eigenvectors
For a matrix A, for any set T of distinct eigenvalues, the corresponding eigenvectors are linearly independent.

### Theorem 12.6.3: distinct eigenvalues -> diagonalizable
An n × n matrix with n distinct eigenvalues is diagonalizable.

### Theorem 12.6.4 (Diagonalization of symmetric matrices)
Let A be a symmetric matrix over R. Then there is an orthogonal matrix Q and a real-valued diagonal matrix Λ such that QᵀAQ = Λ.
All eigenvalues of a real symmetric matrix are guaranteed to be real numbers, and the columns of Q are orthonormal eigenvectors of A.
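Theorem 12.6.4 can be checked numerically with `np.linalg.eigh`, which is NumPy's routine for symmetric (Hermitian) matrices. The 3 × 3 matrix is an illustrative choice.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])     # symmetric

lam, Q = np.linalg.eigh(A)          # real eigenvalues, eigenvectors as columns of Q

q_is_orthogonal = np.allclose(Q.T @ Q, np.eye(3))
diagonalizes = np.allclose(Q.T @ A @ Q, np.diag(lam))
```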

Fortunately, matrices arising in practice are often diagonalizable. #question, exception: Markov chains.

### Upper-triangular matrix
Lemma 12.6.5: The diagonal elements of an upper-triangular matrix U are the eigenvalues of U.
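- To see this, look for a nonzero vector in the null space of U − λ1: one exists exactly when λ equals some diagonal element, because then the triangular matrix U − λ1 has a zero on its diagonal.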

Definition 12.6.7: The spectrum of an upper-triangular matrix U is __the multiset__ of diagonal elements.
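A sketch of Lemma 12.6.5 in terms of invertibility, assuming NumPy. The matrix U is illustrative; its diagonal entry 3 appears twice, which is why the spectrum is a multiset.

```python
import numpy as np

U = np.array([[3.0, 1.0, 4.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 3.0]])
n = U.shape[0]
spectrum = sorted(np.diag(U).tolist())     # multiset of diagonal entries

# For each diagonal entry d, U - d*I has a zero on its diagonal, so it is singular
singular_at_diag = [np.linalg.matrix_rank(U - d * np.eye(n)) < n for d in spectrum]

# For a value that is not on the diagonal, U - d*I stays invertible
invertible_elsewhere = np.linalg.matrix_rank(U - 2.0 * np.eye(n)) == n
```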

### General square matrices
Theorem 12.6.9: Every square matrix over C has an eigenvalue.
Theorem 12.6.10: For any n × n matrix A, there is a unitary matrix Q such that Q⁻¹AQ is an upper-triangular matrix.

Every square matrix is __similar__ to an upper-triangular matrix (over C), so its eigenvalues can be read off the diagonal of that matrix; in practice the triangular form is approximated iteratively using QR factorizations.
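One way to make the QR remark concrete is the basic unshifted QR iteration, sketched here with NumPy. This is a simplified textbook variant under the assumption that the eigenvalues are real with distinct absolute values; production libraries use more elaborate shifted versions. The function name `qr_iteration` and the test matrix are illustrative.

```python
import numpy as np

def qr_iteration(A, steps=200):
    """Unshifted QR iteration: A_{k+1} = R_k Q_k, which is similar to A_k."""
    A_k = np.array(A, dtype=float)
    for _ in range(steps):
        Q, R = np.linalg.qr(A_k)
        A_k = R @ Q                  # equals Q.T @ A_k @ Q, so eigenvalues are preserved
    return A_k

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])      # symmetric, eigenvalues 3 and 3 ± sqrt(3)

T = qr_iteration(A)
approx = np.sort(np.diag(T))                 # diagonal of the (nearly) triangular result
reference = np.sort(np.linalg.eigvalsh(A))   # library answer for comparison
```

Each step is a similarity transform by an orthogonal matrix, so the iterates keep the eigenvalues of A while the entries below the diagonal shrink toward zero.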

## Power method
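Because the largest eigenvalue dominates under repeated multiplication, Aᵗx (suitably normalized) approximates an eigenvector for the eigenvalue of largest absolute value, provided x has a nonzero component along that eigenvector.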
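A self-contained sketch of the power method using NumPy instead of the book's `Vec`/`Mat` classes. The function name `power_method`, the starting vector, and the example matrix are assumptions for illustration; the eigenvalue estimate uses the Rayleigh quotient x·(Ax) for a unit vector x.

```python
import numpy as np

def power_method(A, x0, steps=100):
    """Repeatedly apply A and renormalize; returns (eigenvalue estimate, unit vector)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = A @ x
        x = x / np.linalg.norm(x)    # keep the vector from overflowing or vanishing
    lam = x @ (A @ x)                # Rayleigh quotient, valid because ||x|| = 1
    return lam, x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])           # eigenvalues 3 and 1
lam, v = power_method(A, [1.0, 0.0])
```

The error shrinks roughly like (|λ₂|/|λ₁|)ᵗ, here (1/3)ᵗ, so a well-separated largest eigenvalue converges quickly.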

## 12.8 Markov chains
Each column represents a probability distribution.

Application:
- memory block
- word generation

Stationary distribution: a probability distribution x with Ax = x, i.e. an eigenvector of A with eigenvalue 1.

Theorem: If every entry of the stochastic matrix is positive, then there is a nonnegative eigenvector corresponding to the eigenvalue 1, and also (and we’ll see why this is important) every other eigenvalue is smaller in absolute value than 1.

Use the power method to find such a vector.
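A minimal sketch of finding a stationary distribution by repeated multiplication, assuming NumPy. The two-state chain is a made-up example with all entries positive, so the theorem above applies.

```python
import numpy as np

# Column-stochastic matrix: column j holds the transition probabilities out of state j
P = np.array([[0.9, 0.5],
              [0.1, 0.5]])

x = np.array([1.0, 0.0])     # start in state 0 with certainty
for _ in range(200):
    x = P @ x                # entries stay nonnegative and keep summing to 1

is_stationary = np.allclose(P @ x, x)
```

No renormalization is needed here because a stochastic matrix maps probability distributions to probability distributions. For this chain the limit is (5/6, 1/6), and the other eigenvalue (0.4) controls how fast the iteration settles.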

## 12.10 The determinant
Areas of parallelograms -> Volumes of parallelepipeds

Area of polygon in terms of parallelogram -> Signed area

The determinant is the signed area/volume of the parallelepiped spanned by the columns of the matrix.

Proposition 12.10.6: A square matrix A is invertible if and only if its determinant is nonzero.

Multilinearity: det A is a linear function of each column of A when the other columns are held fixed.
Multiplicativity: det(AB) = (det A)(det B)
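The properties above can be checked by hand for 2 × 2 matrices. The sketch below uses plain Python with integer entries so the arithmetic is exact; the helpers `det2` and `matmul2` and the matrices are illustrative.

```python
def det2(M):
    """Determinant of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    return a * d - b * c

def matmul2(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[3, 1],
     [0, 2]]          # columns (3, 0) and (1, 2) span a parallelogram of area 6
B = [[1, 2],
     [3, 4]]

signed_area = det2(A)                       # positive orientation
swapped = det2([[1, 3], [2, 0]])            # same columns in the other order
scaled = det2([[15, 1], [0, 2]])            # first column of A multiplied by 5
product = det2(matmul2(A, B))               # compare with det2(A) * det2(B)
singular = det2([[1, 2], [2, 4]])           # dependent columns, not invertible
```

Swapping the two columns flips the sign, scaling one column by 5 scales the determinant by 5, and the singular matrix has determinant 0, in line with Proposition 12.10.6.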

In [1]:
from book.mat import *
from book.matutil import *
from book.vec import *
from book.vecutil import *

import math
from math import sqrt

In [2]:
# Fibonacci matrix: each multiplication advances the pair (F_{n+1}, F_n) by one step
Fib_M = listlist2mat([[1,1],[1,0]])
F_0 = list2vec([1,0])

def fib_mul(v, n=0):
    if n == 0:
        return v
    else:
        return Fib_M*fib_mul(v, n-1)
fib_mul(F_0, 11)

Vec({0, 1},{0: 144, 1: 89})

In [3]:
import sympy
from sympy import Matrix, linsolve, simplify

def Vector(*args):
    return Matrix([[a] for a in args])

A = Matrix([[1,1],[1,0]])
l1 = ((1+sympy.sqrt(5))/2).simplify()
l2 = ((1-sympy.sqrt(5))/2).simplify()
S = Matrix([[l1, l2],[1,1]]) # columns are eigenvectors of A, so S^-1 * A * S is diagonal and its powers are easy to compute
Lambda = S.inv() * A * S
[
    A**10,
    S * Lambda**10 * S.inv()
]

[Matrix([
 [89, 55],
 [55, 34]]),
 Matrix([<long unsimplified expression in sqrt(5), truncated in the original output; applying simplify() should reduce it to the same entries as A**10>])]

In [4]:
# For x^t = A^t * x, each component has the form x^t_i = a_i * l1^t + b_i * l2^t
# The constants a_i and b_i are determined by the initial conditions,
# here x^1 = (1, 1) and x^2 = (2, 1)
L = Matrix([[l1, l2, 0, 0], [0, 0, l1, l2], [l1**2, l2**2, 0, 0], [0, 0, l1**2, l2**2]])
X = Matrix([[1, 1, 2, 1]])
solution = sympy.linsolve((L, X))
solution

{(sqrt(5)/10 + 1/2, 1/2 - sqrt(5)/10, sqrt(5)/5, -sqrt(5)/5)}

In [5]:
# a1, b1 are the coordinates of x^1 = (1, 1) in the eigenvector basis (v1, v2)
a1, b1, _, _ = solution.args[0]
v1 = Vector(l1, 1)
v2 = Vector(l2, 1)
sympy.simplify(v1.T @ v2)  # 0: v1 and v2 are orthogonal (A is symmetric)

Matrix([[0]])

In [6]:
# S converts coordinates in the eigenvector basis into the original basis
# v1, v2 (the columns of S) are eigenvectors of A, and l1, l2 are the eigenvalues of A
u = Matrix([[a1, b1]]).transpose()
[
    simplify(S @ u),
    simplify(S @ Lambda @ u),
    simplify(S @ Lambda @ Lambda @ u),
]

[Matrix([
 [1],
 [1]]),
 Matrix([
 [2],
 [1]]),
 Matrix([
 [3],
 [2]])]

In [7]:
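# Find the eigenspace of A for the eigenvalue λ1
A = Matrix([[1, 2],[3, 4]])
λ1 = (5+sympy.sqrt(33))/2
B = A - λ1 * sympy.eye(2)
B_eigenspace = linsolve((B, Matrix([[0,0]])))  # null space of B, parametrized by tau0
v1 = Vector(*B_eigenspace.args[0])
[display(x) for x in [
    v1,              # nonzero vector in the null space of B (for tau0 != 0)
    simplify(B@v1),  # zero vector, since v1 is in the null space of B
    simplify(A@v1),
    simplify(λ1*v1), # equals A@v1, so v1 is an eigenvector for λ1
]]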

Matrix([
[tau0*(-1/2 + sqrt(33)/6)],
[                    tau0]])

Matrix([
[0],
[0]])

Matrix([
[tau0*(sqrt(33) + 9)/6],
[tau0*(5 + sqrt(33))/2]])

Matrix([
[tau0*(sqrt(33) + 9)/6],
[tau0*(5 + sqrt(33))/2]])

[None, None, None, None]

In [8]:
# Problem 12.14.5
# x_t = A^t * x_0 = a1*λ1^t*v1 + (terms for the smaller eigenvalues)
m = [[1, 2, 5, 7], [2, 9, 3, 7], [1, 0, 2, 2], [7, 3, 9, 1]]
A = listlist2mat(m)
x_0 = list2vec([1, 1, 1, 1])
def power(M, x, n):
    assert n > 0
    for _ in range(n):
        x = M*x
        x /= sqrt(x*x)
    return x
N = 50
x_t = power(A, x_0, N)
l_1 = (A * x_t)[0] / (x_t)[0] # A * x_t = l_1 * x_t
x_t, l_1

(Vec({0, 1, 2, 3},{0: 0.3943640250307155, 1: 0.7921813352713714, 2: 0.10496830176478306, 3: 0.453770209945362}),
 14.402834217884141)

In [9]:
import numpy as np
A = np.array([[1, 2, 5, 7], [2, 9, 3, 7], [1, 0, 2, 2], [7, 3, 9, 1]])
λ, v = np.linalg.eig(A)
λ1 = λ[0]; v1 = v[:,0]
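λ1, v1 # v1 may be the negative of x_t (eigenvectors are only determined up to a scalar); small differences between λ1 and l_1 come from rounding and the finite number of power iterations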
λ

array([14.40283422, -6.52407415,  5.55070975, -0.42946982])

In [10]:
# Problem 12.14.8
# Thought #1:
# f(x) = Ax; f^-1(x) = A^-1x
# letting x range over the standard generators gives 16 equations that determine the whole matrix

# Thought #2:
# [A | I] -> [I | A^-1]
A = sympy.Matrix(m)
I = sympy.eye(4)
RREF, pivot = sympy.Matrix.hstack(A, I).rref()
A_inv = RREF[:, 4:]
A*A_inv

Matrix([
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])

In [11]:
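# The power method on A^-1 finds its dominant eigenvalue; the reciprocal of that
# is the eigenvalue of A with the smallest absolute value
A_inv = listlist2mat(RREF[:, 4:].tolist())
A_inv
N = 10
x_t = power(A_inv, x_0, N)
l_1 = (A_inv * x_t)[0] / (x_t)[0]
1.0/l_1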

-0.429469819938505

In [12]:
# Problem 12.14.10, TODO
# If (A - kI) * x = 0 for some nonzero x, then λ = k
def eye(D):
    return Mat((D, D), {(r, c): 1 for r in D for c in D if r == c})

def better_eigen_values(A, k):
    """
    input: a matrix A and a value k that is an estimate of an eigenvalue λi of A (and is closer to λi than to any other eigenvalue of A)
    output: an even better estimate of that eigenvalue.
    """
    assert A.D[0] == A.D[1]
    B = A - k*eye(A.D[0])
    N = 10
    x_0 = list2vec([1 for _ in range(len(A.D[0]))])
    # NOTE: the power method on B converges to the eigenvalue of B largest in absolute value,
    # i.e. the eigenvalue of A farthest from k; finding the closest one needs iteration with B^-1
    x_t = power(B, x_0, N)
    l_1 = (B * x_t)[0] / (x_t)[0]
    print(l_1)
    return k - l_1 # TODO: since B*x = (λ - k)*x, the eigenvalue of A would be k + l_1, not k - l_1

A = listlist2mat([[3, 0, 1], [4, 8, 1], [9, 0, 0]])
k = 4
better_eigen_values(A, k)

-5.854101765316718


9.854101765316718

In [13]:
A = np.array([[3, 0, 1], [4, 8, 1], [9, 0, 0]])
np.linalg.eig(A) # 8, 4.85, -1.85

(array([ 8.        ,  4.85410197, -1.85410197]),
 array([[ 0.        ,  0.35577221, -0.20174326],
        [ 1.        , -0.66204524, -0.01748605],
        [ 0.        ,  0.65963796,  0.97928234]]))
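For Problem 12.14.10, one possible approach is shifted inverse iteration: run the power method on (A − kI)⁻¹, whose dominant eigenvalue belongs to the eigenvalue of A closest to k. The sketch below uses NumPy rather than the book's classes, and it is an assumed approach, not the book's reference solution; the name `refine_eigenvalue` and the step count are illustrative.

```python
import numpy as np

def refine_eigenvalue(A, k, steps=20):
    """Shifted inverse iteration: improves an estimate k of an eigenvalue of A."""
    n = A.shape[0]
    B = A - k * np.eye(n)               # eigenvalues of B are (λ - k)
    x = np.ones(n)
    for _ in range(steps):
        x = np.linalg.solve(B, x)       # same effect as B⁻¹ @ x without forming B⁻¹
        x = x / np.linalg.norm(x)
    # For an eigenvector x of A, (x·Ax)/(x·x) recovers its eigenvalue
    return (x @ (A @ x)) / (x @ x)

A = np.array([[3.0, 0.0, 1.0],
              [4.0, 8.0, 1.0],
              [9.0, 0.0, 0.0]])
estimate = refine_eigenvalue(A, 4)      # k = 4 is closest to the eigenvalue near 4.854
```

Recovering the eigenvalue from the vector via x·Ax sidesteps the sign question in the TODO cell. The sketch assumes k is not exactly an eigenvalue, since A − kI would then be singular and `np.linalg.solve` would fail.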

In [22]:
λ1 = sympy.symbols('λ1')
λ2 = sympy.symbols('λ2')
P = Matrix([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
(P - sympy.eye(3) * λ1).det()

1 - λ1**3

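In [2]:
# Imports and the symbol are repeated so this cell also runs on a fresh kernel
import sympy
from sympy import Matrix
λ2 = sympy.symbols('λ2')
P2 = Matrix([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
D2 = (P2 - sympy.eye(3) * λ2).det()
sympy.solve(D2, λ2)  # roots of det = (1 - λ2)*(λ2**2 - 1): λ2 = 1 and λ2 = -1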