In [61]:
import numpy as np

# Problem 1: Power iteration method for eigenvalue calculation (3 pts)

### <div align="right"> &copy; Volodymyr Kuchynskyi & Rostyslav Hryniv, 2023 </div>

## Completed by:   
*   Markiian Mandzak 
*   Artur Shevtsov

---

#### In this part of the homework, you will implement the **power method**, a simple iterative algorithm for numerical calculation of the dominant eigenvalue and the corresponding eigenvector of a square matrix $A$, test its limitations, and verify necessary conditions on $A$. You will also use the **inverse power method**, a modification of the regular **power method**, that finds the remaining non-dominant eigenvalues and eigenvectors of $A$. For simplicity, we will be working with real matrices only  

---

## 1. Power iteration method (1 pt)

### 1.1 Explanation of the method
#### Assume that a $k\times k$ matrix $A$ is diagonalizable and that $\lambda_1, \lambda_2, \dots, \lambda_k$ are its eigenvalues listed according to multiplicities. We say that $\lambda_1$ is a <font color="red">dominant eigenvalue</font> of $A$ if the eigenvalues can be ordered so that $|\lambda_1|> |\lambda_2| \ge \dots \ge |\lambda_k|$. In particular, the dominant eigenvalue must be <font color = "blue"> simple</font>; we denote by $\mathbf{v}_1$ a corresponding normalized eigenvector. Analysis of the large-$n$ asymptotics of $A^n\mathbf{x}_0$ for a generic vector $\mathbf{x}_0\in \mathbb{R}^k$ suggests a simple <font color="red">power iteration</font> method to find the eigenspace $\text{ls}\{\mathbf{v}_1\}$ and the dominant eigenvalue $\lambda_1$.  

#### Denote by $\mathbf{v}_j$ an eigenvector for the eigenvalue $\lambda_j$, $j=2,\dots, k$. Assume also that the starting vector $\mathbf{x}_0$ has a nonzero component in the direction of $\mathbf{v}_1$, i.e., that in the representation 
$$
	\mathbf{x}_0 = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k\mathbf{v}_k
$$ 
we have $c_1 \ne 0$. Then 
$$
	A^n \mathbf{x}_0 = \lambda_1^n \Bigl[c_1\mathbf{v}_1 + c_2 \Bigr(\frac{\lambda_2}{\lambda_1}\Bigr)^n \mathbf{v}_2 + \cdots + c_k \Bigr(\frac{\lambda_k}{\lambda_1}\Bigr)^n\mathbf{v}_k\Bigr] =  \lambda_1^n \bigl[c_1 \mathbf{v}_1 + o(1)\bigr]
$$ 
as $n \to \infty$. The latter relation still does not allow one to identify $\lambda_1$ and $\mathbf{v}_1$ because the term $\lambda_1^n$ blows up when $|\lambda_1|>1$, decays to zero when $|\lambda_1|<1$, and keeps moving along the unit circle $|z|=1$ when $|\lambda_1| = 1$ but $\lambda_1 \ne 1$. To compensate for that behavior, one iterates over the normalized vectors $\mathbf{x}_n$ defined via 
$$
	\mathbf{x}_{n} := \frac{A\mathbf{x}_{n-1}}{\|A\mathbf{x}_{n-1}\|} = \frac{A^n\mathbf{x}_0}{\|A^n\mathbf{x}_0\|} = \frac{c_1}{|c_1|} \Bigl(\frac{\lambda_1}{|\lambda_1|}\Bigr)^n\mathbf{v}_1 + o(1), \tag{1}
$$ 
whose distance to the eigenspace $\text{ls}\{\mathbf{v}_1\}$ decays exponentially. The eigenvalue $\lambda_1$ can be approximated by 
$$
	\lambda_1 \approx \frac{\mathbf{x}_n^\top A \mathbf{x}_n}{\mathbf{x}_n^\top \mathbf{x}_n} = \mathbf{x}_n^\top A \mathbf{x}_n. \tag{2}
$$ 
Formulae (1) and (2) form the basis of the method.


### **1.2 (0.5 pt)**  Implement the power iteration method

> **Note:** Although the use of any function from numpy is allowed, in this part of your homework, methods such as ``power_method`` and ``inverse_power_method`` must be implemented explicitly, without relying on functions that could use them implicitly in their implementations, such as ``np.linalg.eig``. However, you can use the latter in the rest of your homework to e.g. verify the correctness of implemented functions

> **Hint:** The stopping criterion for the iteration (1), i.e., $\mathbf{x}_{n} := {A\mathbf{x}_{n-1}}/{\|A\mathbf{x}_{n-1}\|}$, should be expressed in terms of stabilization of the corresponding eigenvalue estimate $\lambda_1 \approx \mathbf{x}_n^\top A \mathbf{x}_n$ (cf. (2)). The reason is that if $\lambda_1$ is not positive, then, on each iteration, the vector $\mathbf{x}_n$ gets multiplied by the number $\lambda_1/|\lambda_1|$, which prevents $\mathbf{x}_n$ from converging (cf. (1)).
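
The sign flipping described in the hint can be observed directly. The sketch below (an illustration, not part of the required solution) runs iteration (1) on the matrix $\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}$, whose eigenvalues are $-1$ and $-2$, starting from the arbitrarily chosen vector $(1, 0)^\top$. It records both the Rayleigh quotient (2) and the inner product of consecutive iterates; the variable names are chosen only for this example.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
x = np.array([1.0, 0.0])          # arbitrary fixed start vector

rayleigh = []                     # eigenvalue estimates x_n^T A x_n
overlaps = []                     # inner products x_n^T x_{n-1}
for _ in range(60):
    x_prev = x
    Ax = A @ x
    x = Ax / np.linalg.norm(Ax)   # one step of iteration (1)
    rayleigh.append(x @ A @ x)
    overlaps.append(x @ x_prev)

print("last eigenvalue estimate:", rayleigh[-1])
print("last inner product of consecutive iterates:", overlaps[-1])
```

The eigenvalue estimate settles near $-2$, while the inner product of consecutive iterates stays near $-1$, i.e., $\mathbf{x}_n \approx -\mathbf{x}_{n-1}$. The iterates themselves never converge, which is why the stopping criterion monitors the eigenvalue estimate.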

In [62]:
def power_method(A, start_vector=None, tol = 0.001, max_iter=100):
    """
    Return the dominant eigenvalue and its corresponding eigenvector
    using power method.

    Args:
        A - matrix for which to compute the eigenpair
        start_vector(optional) - vector used for initialization
                                 on the first step of power iteration algorithm.
                                 Defaults to None. In such case, the initial vector
                                 will be randomized.
        tol (optional) - stopping criterion: stop iterations when lambda_1 update
                                 gets smaller than tol
        max_iter(optional) - maximum number of iterations for the power method.
                                 Perform sanity check if the returned values are
                                 close to eigenvalue and eigenvector of A
    Returns:
        (eigval, eigvec) - a pair of the dominant eigenvalue and its eigenvector.
    """
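    n = A.shape[1]
    # Use a random start vector unless one is supplied, then normalize it
    x = np.random.rand(n) if start_vector is None else np.asarray(start_vector)
    x = x / np.linalg.norm(x)

    # Eigenvalue estimate from the previous iteration
    lambda_prev = 0
    lambda_curr = lambda_prev

    for _ in range(max_iter):
        # One step of iteration (1): multiply by A and renormalize
        Ax = np.dot(A, x)
        x = Ax / np.linalg.norm(Ax)

        # Rayleigh quotient (2) as the current eigenvalue estimate
        lambda_curr = np.dot(x.T, np.dot(A, x)) / np.dot(x.T, x)

        # Stop once the eigenvalue estimate has stabilized
        if abs(lambda_curr - lambda_prev) < tol:
            break

        lambda_prev = lambda_curr

    return lambda_curr, x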

In [63]:
A_1 = np.array([[0, 1], [-2, -3]])
#You should invoke similar calls in the next tasks by yourself
A_1_eival, A_1_eivec = power_method(A_1, start_vector=np.random.rand(A_1.shape[0]), tol=0.001)
print(f"eigenvalue: {A_1_eival}, eigenvector: {A_1_eivec}")
np.linalg.eig(A_1)

eigenvalue: -2.000525294424379, eigenvector: [ 0.44705697 -0.89450549]


EigResult(eigenvalues=array([-1., -2.]), eigenvectors=array([[ 0.70710678, -0.4472136 ],
       [-0.70710678,  0.89442719]]))

#### Test your implementation of power method:

In [64]:
def test_power_method(A, eigenvalue, eigenvector):

    eigenvals_ref, eigenvecs_ref = np.linalg.eig(A)
    eigenvecs_ref = eigenvecs_ref.T

    eig_imax = np.argmax(np.abs(eigenvals_ref))
    #compare eigenvalues
    assert np.allclose(eigenvalue, eigenvals_ref[eig_imax]),\
                       f"Incorrect eigenvalue found: {eigenvalue} differs from {eigenvals_ref[eig_imax]}"
    #compare eigenvectors w.r.t scalar multiple (normalize)
    assert np.allclose(eigenvector / np.linalg.norm(eigenvector),
                       eigenvecs_ref[eig_imax]) or np.allclose(-(eigenvector / np.linalg.norm(eigenvector)),
                       eigenvecs_ref[eig_imax]),\
                       f"Incorrect eigenvector found: {eigenvector} is not a constant multiple of {eigenvecs_ref[eig_imax]}"

    print("test_power_method passed successfully")

In [65]:
A_test_3x3 = (10*np.random.rand(3,3))
A_test_5x5 = (10*np.random.rand(5,5))
A_test_10x10 = (10*np.random.rand(10,10))

test_power_method(A_test_3x3, *power_method(A_test_3x3, tol=1e-9, max_iter=100_000))
test_power_method(A_test_5x5, *power_method(A_test_5x5, tol=1e-9, max_iter=100_000))
test_power_method(A_test_10x10, *power_method(A_test_10x10, tol=1e-9, max_iter=100_000))

test_power_method passed successfully
test_power_method passed successfully
test_power_method passed successfully


---
### **1.3. (0.5 pt)** Reasons to fail
#### Formulate necessary conditions for power method to work and the reasons why it can fail (the more, the better, but at least two). Then, for each reason, provide an example of your own $3 \times 3$ matrix $M$ when the method fails and test it by your code.
>_Hint_: Recall that for real matrices, the eigenvalues come in complex conjugate pairs. Recall also that not all matrices are diagonalizable; do you see what obstacle that can create?

---

 **1.Non-diagonalizability** 

 
If a matrix is not diagonalizable (i.e., it does not have a full set of linearly independent eigenvectors), we cannot write it in the form $A = PDP^{-1}$. Such a matrix is called defective: its eigenvectors do not form a basis. The power method, however, is derived by expanding the starting vector $\mathbf{x}_0$ in a basis of eigenvectors, which requires $n$ linearly independent eigenvectors for an $n \times n$ matrix. Without such a basis, the expansion behind formula (1) is no longer available, and the iterates need not approach the dominant eigenvector at the expected rate. Let's try our implementation on the non-diagonalizable matrix `fail_1`:
$$ 
\begin{bmatrix}
0 & -6 & -4 \\
5 & -11 & -6 \\
-6 & 9 & 4 \\
\end{bmatrix}
$$


In [66]:
fail_1 = np.array([[0, -6, -4], [5, -11, -6], [-6, 9, 4]])
power_method(fail_1)

(-3.001910020411936, array([ 0.53557854, -0.26643625,  0.80135345]))

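The original matrix has two distinct eigenvalues, $\lambda_1 = -3$ (dominant) and $\lambda_2 = -2$ (of algebraic multiplicity $2$), but only two linearly independent eigenvectors: $v_1 = (\frac{2}{3}, -\frac{1}{3}, 1)^\top$ for $\lambda_1$ and $v_2 = (0, -\frac{2}{3}, 1)^\top$ for $\lambda_2$. The method identifies the eigenvalue correctly up to the tolerance, but the returned vector matches the normalized eigenvector $v_1/\|v_1\| \approx (0.5345, -0.2673, 0.8018)^\top$ only to about three decimal places. The Jordan block of $\lambda_2$ slows the convergence (the error decays like $n(2/3)^n$ rather than $(2/3)^n$), so the eigenvector is still calculated inaccurately when the stopping criterion is met.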
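
To see how the accuracy of the eigenvector depends on the number of iterations, one can measure the distance from the iterate to the line spanned by $v_1$. The following sketch does this for a fixed start vector $(1,1,1)^\top$; the helper `distance_to_v1` and the chosen iteration counts are assumptions made for illustration only.

```python
import numpy as np

fail_1 = np.array([[0.0, -6.0, -4.0],
                   [5.0, -11.0, -6.0],
                   [-6.0, 9.0, 4.0]])

# Normalized dominant eigenvector for lambda_1 = -3
v1 = np.array([2.0, -1.0, 3.0])
v1 /= np.linalg.norm(v1)

def distance_to_v1(num_iter):
    """Run num_iter normalized power steps from a fixed start vector and
    return the norm of the component of the iterate orthogonal to v1."""
    x = np.ones(3) / np.sqrt(3.0)
    for _ in range(num_iter):
        y = fail_1 @ x
        x = y / np.linalg.norm(y)
    return np.linalg.norm(x - (x @ v1) * v1)

errors = {n: distance_to_v1(n) for n in (10, 20, 40, 80)}
for n, err in errors.items():
    print(f"{n:3d} iterations: distance to span(v1) = {err:.2e}")
```

The distance keeps shrinking as the number of iterations grows, so the inaccuracy seen above comes from slow convergence combined with a loose tolerance, not from the iterates heading in a wrong direction.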

**2. Absence of a single dominant eigenvalue.**

Even when a matrix is diagonalizable, it may still possess two eigenvalues $\lambda_i, \lambda_j$ of the largest absolute value, $|\lambda_i| = |\lambda_j|$. Since the matrix is diagonalizable, these eigenvalues have linearly independent eigenvectors $v_i, v_j$. But when we apply the power method to such a matrix, the iterates may oscillate between, or settle on a mixture of, the eigenvectors $v_i, v_j$ corresponding to the several dominant eigenvalues. Even when the eigenvalue is determined correctly (as in the example below), we may get a vector in which the eigenvector we are after is mixed with an eigenvector corresponding to another dominant $\lambda$. Let's see it in action. For a diagonalizable matrix `fail_2`

$$
\begin{bmatrix}
    3 & 1 & 0 \\
    0 & 4 & 0 \\
    0 & 0 & 4
\end{bmatrix}
$$

we can run the following code:

In [67]:
fail_2 = np.array([[3, 1, 0], [0, 4, 0], [0, 0, 4]])
power_method(fail_2)

(4.002965837020879, array([0.70281273, 0.70703268, 0.07847961]))

`fail_2` has eigenvalues $\lambda_1 = 4$ with $v_1 = (1, 1, 0)^\top$, $\lambda_2 = 3$ with $v_2 = (1, 0, 0)^\top$, and $\lambda_3 = 4$ with $v_3 = (0, 0, 1)^\top$. As predicted, the resulting vector lies between $v_1$ and $v_3$: it is approximately a linear combination of the two, with weights determined by the random starting vector. The method therefore cannot single out either $v_1$ or $v_3$, and repeated runs return different vectors.
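
The dependence on the starting vector can be made visible with two fixed starts. The sketch below uses a plain fixed-length power iteration (the helper `power_iterate` and both start vectors are chosen for illustration and are not part of the solution above) and checks how well each limit satisfies $A\mathbf{x} = 4\mathbf{x}$.

```python
import numpy as np

fail_2 = np.array([[3.0, 1.0, 0.0],
                   [0.0, 4.0, 0.0],
                   [0.0, 0.0, 4.0]])

def power_iterate(A, x0, num_iter=200):
    """Plain normalized power iteration with a fixed number of steps."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iter):
        y = A @ x
        x = y / np.linalg.norm(y)
    return x

x_a = power_iterate(fail_2, np.array([1.0, 2.0, 3.0]))
x_b = power_iterate(fail_2, np.array([3.0, 2.0, 1.0]))

for x in (x_a, x_b):
    residual = np.linalg.norm(fail_2 @ x - 4.0 * x)
    print(f"limit: {x}, residual for lambda = 4: {residual:.1e}")
```

Both limits have a negligible residual for $\lambda = 4$, yet they point in different directions: each is a different combination of $v_1$ and $v_3$.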

**3. Complex eigenvalues**

If we apply the basic implementation of the power method to a matrix with complex eigenvalues, the algorithm will nevertheless terminate and output a real number. This is because our starting vector is made up of real entries and the algorithm involves no operations that could produce complex numbers. Let's consider the matrix `fail_3`: 

$$
    \begin{bmatrix}
        2 & -3 \\
        1 & 4 \\
    \end{bmatrix}   
$$

It has the eigenvalues $\lambda_1 = 3 - i\sqrt{2}$, $\lambda_2 = 3 + i\sqrt{2}$ with corresponding eigenvectors $v_1 = (-1 - i\sqrt{2}, 1)^\top$, $v_2 = (-1 + i\sqrt{2}, 1)^\top$.

In [68]:
fail_3 = np.array([[2, -3], [1, 4]])
power_method(fail_3)

(1.8079028350715676, array([0.77962163, 0.62625084]))

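As we see, the value returned by our algorithm has nothing to do with the actual eigenvalues and eigenvectors of the matrix. Both eigenvalues have the same absolute value $\sqrt{11}$, so there is no dominant eigenvalue, and real iterates cannot converge to a complex eigenvector. That is why, in order to take into account matrices with complex eigenvalues, we need to include appropriate handling of complex numbers in the algorithm. 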
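
That the returned number is essentially arbitrary can be checked by watching the eigenvalue estimates over many iterations. The sketch below runs a fixed number of normalized power steps from the arbitrarily chosen start $(1, 0)^\top$ and reports the range of the last $50$ Rayleigh quotients; it is an illustration only.

```python
import numpy as np

fail_3 = np.array([[2.0, -3.0],
                   [1.0, 4.0]])

x = np.array([1.0, 0.0])              # arbitrary fixed start vector
estimates = []
for _ in range(200):
    y = fail_3 @ x
    x = y / np.linalg.norm(y)
    estimates.append(x @ fail_3 @ x)  # Rayleigh quotient of the unit vector x

last = np.array(estimates[-50:])
print(f"last 50 estimates range from {last.min():.3f} to {last.max():.3f}")
print("true eigenvalues:", np.linalg.eigvals(fail_3))
```

The estimates keep wandering over a wide interval instead of stabilizing, so whatever value the stopping rule happens to catch is not an approximation of an eigenvalue.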

---

## 2. Symmetric matrices (1 pt) ##

**2.1. Recap on symmetric matrices**
#### Consider a special case of finding eigenvalues and eigenvectors for a **symmetric** matrix $A$, i.e. a matrix satisfying $A^\top = A$. Recall that such a matrix  
- is **orthogonally diagonalizable**, i.e., there is an **orthonormal basis** $\mathbf{v}_1, \dots,\mathbf{v}_k$ of $\mathbb{R}^k$ consisting of **eigenvectors** of $A$;
- has only real eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$;
- can be written as $$A = \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top + \lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_k \mathbf{v}_k\mathbf{v}_k^\top$$
by the **spectral theorem**  

Assume that $|\lambda_1|\ge |\lambda_2| \ge \dots \ge |\lambda_k|$ and that $|\lambda_j| = |\lambda_{j+1}|$ implies that $\lambda_j = \lambda_{j+1}$. Then the power method applies and finds the eigenvalue $\lambda_1$ and the corresponding eigenvector $\mathbf{v}_1$. Now think about what the eigenvalues and eigenvectors of the matrix $$A - \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top$$ are; do you see how to find the second eigenvalue $\lambda_2$ and the corresponding eigenvector?

### **2.2 (0.3 pt)** Find all eigenvalues and eigenvectors of a symmetric matrix with power method
Explain how to find the second, third etc eigenvalues and the corresponding eigenvectors for a **symmetric** matrix $M$ if the first eigenvalue and the corresponding eigenvector has already been found. Write down the formulas for each step; justify your answer by referring to the corresponding properties of symmetric matrices


---

To find all eigenvalues of a symmetric $n \times n$ matrix $M$, we can perform the following steps: 
1. Calculate the first eigenpair ($\lambda_1, \mathbf v_1$) using `power_method`. 
2. Subtract the outer product $\mathbf v_1 \mathbf v_1^\top$ multiplied by $\lambda_1$ from the original matrix, i.e., update $M := M - \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top$. After this update, the matrix $M$ is equal to $$\lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_n \mathbf{v}_n\mathbf{v}_n^\top.$$
3. Calculate the next eigenpair ($\lambda_2, \mathbf v_2$) by calling `power_method` on the updated matrix.
4. Repeat steps 2 and 3 (updating $M$ and calculating the next eigenpair) $n-2$ more times, until we reach ($\lambda_n, \mathbf v_n$).

Now, let's break down the process and see why it works. 

The most important requirement here is that the matrix $M$ must be symmetric. If it is, then it is orthogonally diagonalizable, i.e., it has an orthonormal basis of eigenvectors $\mathbf{v}_1, \dots, \mathbf{v}_n$, and by the spectral theorem it can be decomposed into a sum of outer products: $M = \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top + \lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_n \mathbf{v}_n\mathbf{v}_n^\top$

In this light it becomes clear why the power method is very useful here: we can calculate the dominant eigenpair, then subtract its term $\lambda_i\mathbf{v}_i\mathbf{v}_i^\top$ from the current matrix, and get a remainder that is free from the influence of the dominant eigenpair we have just calculated and subtracted. Because the eigenvectors are orthonormal, the remaining eigenpairs are not affected by this subtraction, while $\mathbf{v}_i$ becomes an eigenvector for the eigenvalue $0$.

Therefore, we get a matrix with a new dominant eigenpair $(\lambda_{i+1}, \mathbf{v}_{i+1})$, which means we can again call `power_method()` on it and calculate the dominant eigenvalue and eigenvector. This process is repeated until we have calculated all the eigenpairs of $M$.
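
The deflation step can be verified with a short computation. Assuming, as in 2.1, that the eigenvectors of $M$ are orthonormal, so that $\mathbf{v}_1^\top\mathbf{v}_1 = 1$ and $\mathbf{v}_1^\top\mathbf{v}_j = 0$ for $j \ne 1$, the deflated matrix acts on each eigenvector as follows:

$$
(M - \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top)\,\mathbf{v}_j
= M\mathbf{v}_j - \lambda_1 \mathbf{v}_1 (\mathbf{v}_1^\top \mathbf{v}_j)
= \begin{cases}
\lambda_1\mathbf{v}_1 - \lambda_1\mathbf{v}_1 = \mathbf{0}, & j = 1, \\
\lambda_j \mathbf{v}_j, & j \ne 1.
\end{cases}
$$

So the deflated matrix has the same eigenvectors as $M$, with $\lambda_1$ replaced by $0$ and all other eigenvalues unchanged. If $\lambda_2$ is dominant among the remaining eigenvalues, the power method applied to the deflated matrix returns $(\lambda_2, \mathbf{v}_2)$.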

---

**2.3. (0.3 pts)** Implementation

Implement the ``symmetric_matrix_find_eig`` function that accepts a **symmetric matrix** $A$ and calculates all eigenpairs of $A$. To test your function, come up with your own $2 \times 2$ symmetric matrix $M_1$ and $3 \times 3$ symmetric matrix $M_2$ for which you can calculate the eigenpairs by hand, and compare the results

In [69]:
def symmetric_matrix_find_eig(A):
    """
    Return a list of eigenpairs (eigenvalues and eigenvectors)
    of a symmetric matrix A

    Args:
        A - symmetric n x n matrix for which to compute the eigenpairs

    Returns:
        list((eigval, eigvec)) - a list of length n of all eigenpairs stored as
                                 tuples (eigval, eigvec).
    """

    n = A.shape[0]
    (lmbd_1, v1) = power_method(A, tol=1e-9, max_iter=100_000)
    eigenpairs = [(lmbd_1, v1)]
    A_copy = np.copy(A)
    
    for i in range(n-1):
        eigval, eigvec = eigenpairs[i]
        A_copy = A_copy - (eigval * np.outer(eigvec, eigvec.T))

        new_ev, new_evc = power_method(A_copy, tol=1e-9, max_iter=100_000)

        eigenpairs.append((new_ev, new_evc))

    return eigenpairs

#### Now, let's calculate eigenpairs for two arbitrary symmetric matrices, $M_1$ and $M_2$.

---

$$
    M_1 = \begin{bmatrix}
        3 & 0 \\
        0 & 2 \\
    \end{bmatrix}
$$

$M_1$ is a diagonal matrix, so its eigenvalues are equal to its diagonal elements. We can calculate the eigenvectors by solving the equation $(A-\lambda I)v = 0$:

For $\lambda_1=3$:
$$A-\lambda_1I= \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}$$    

Solving $(A-\lambda_1 I)v = 0$ by row reduction, we get 
$$
\left[\begin{array}{cc|c}
0 & 0 & 0 \\
0 & -1 & 0 \\
\end{array}\right] \Longrightarrow
\left[\begin{array}{cc|c}
0 & 1 & 0 \\
0 & 0 & 0 \\
\end{array}\right] \Longleftrightarrow
\begin{cases}
    x_1 = x_1 \\
    x_2 = 0
\end{cases} \Longleftrightarrow
v_1 = \begin{bmatrix}
        1 \\
        0 \\
    \end{bmatrix}
$$

For $\lambda_2=2$:

$$A-\lambda_2I= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$   

$$
\left[\begin{array}{cc|c}
1 & 0 & 0 \\
0 & 0 & 0 \\
\end{array}\right] \Longleftrightarrow
\begin{cases}
    x_1 = 0 \\
    x_2 = x_2
\end{cases} \Longleftrightarrow
v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
$$

Thus, for $M_1$ we have $\lambda_1 = 3, \lambda_2 = 2$ and $v_1 = (1, 0)^\top, v_2 = (0, 1)^\top$.

Now let's consider a $3 \times 3$ matrix $M_2$:

$$
    M_2 = \begin{bmatrix}
        2 & 0 & 0 \\
        0 & 3 & 0 \\
        0 & 0 & 5 \\
    \end{bmatrix}
$$

Eigenvalues are $\lambda_1 = 2, \lambda_2 = 3, \lambda_3 = 5$. 

For $\lambda_1=2$:
$$A-\lambda_1I= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$  

The augmented matrix representation is:
$$
\left[\begin{array}{ccc|c}
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 3 & 0 \\
\end{array}\right]
$$
After swapping row 2 with row 1, and row 3 with row 2, and then dividing row 2 by 3, we get:
$$
\left[\begin{array}{ccc|c}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
\end{array}\right] \Longleftrightarrow
\begin{cases}
    x_1 = x_1 \\
    x_2 = 0 \\
    x_3 = 0 \\
\end{cases} \Longleftrightarrow
v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
$$

In the same way, we find that for $\lambda_2 = 3$ we have $v_2 = (0, 1, 0)^\top$ and for $\lambda_3 = 5$ we have $v_3 = (0, 0, 1)^\top$.

---

In [70]:
#call the power_method and symmetric_matrix_find_eig functions of symmetric matrices M_1 and M_2 here
M_1 = np.array([[3., 0.], [0., 2.]])
M_1_eigpairs = symmetric_matrix_find_eig(M_1)

for e in M_1_eigpairs:
    print(f"eigenvalue: {e[0]}, eigenvector: {e[1]}")

print("Numpy built-in function: ", np.linalg.eigh(M_1))

print('\n')

M_2 = np.array([[2., 0., 0.], [0., 3., 0.], [0., 0., 5.]])
M_2_eigpairs = symmetric_matrix_find_eig(M_2)

for e in M_2_eigpairs:
    print(f"eigenvalue: {e[0]}, eigenvector: {e[1]}")

print("Numpy built-in function: ", np.linalg.eigh(M_2))

eigenvalue: 2.999999999320508, eigenvector: [1.00000000e+00 2.60670673e-05]
eigenvalue: 2.000000001019238, eigenvector: [-3.91006009e-05  9.99999999e-01]
Numpy built-in function:  EighResult(eigenvalues=array([2., 3.]), eigenvectors=array([[0., 1.],
       [1., 0.]]))


eigenvalue: 4.999999999765834, eigenvector: [2.97467301e-10 1.08204777e-05 1.00000000e+00]
eigenvalue: 2.9999999998981006, eigenvector: [ 2.21850215e-05  1.00000000e+00 -1.80341296e-05]
eigenvalue: 2.000000000738263, eigenvector: [ 9.99999999e-01 -3.32775322e-05 -1.43536923e-10]
Numpy built-in function:  EighResult(eigenvalues=array([2., 3., 5.]), eigenvectors=array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]]))


#### Test your implementation:

In [71]:
def test_symmetric_matrix_find_eig(A, eigenpairs):

    eigenvals_found = np.sort(np.array([e[0] for e in eigenpairs]))
    eigenvals_ref, _ = np.linalg.eigh(A)
    assert np.allclose(np.sort(eigenvals_found), np.sort(eigenvals_ref)),\
    f"Incorrect eigenvalue found: {eigenvals_found} differs from {eigenvals_ref}"
    print("test_power_method passed successfully")

In [72]:
A_test_3x3 = (10*np.random.rand(3,3) - 5)
A_test_5x5 = (10*np.random.rand(5,5) - 5)
A_test_10x10 = (10*np.random.rand(10,10) - 5)
A_sym_test_3x3 = A_test_3x3 + A_test_3x3.T
A_sym_test_5x5 = A_test_5x5 + A_test_5x5.T
A_sym_test_10x10 = A_test_10x10 + A_test_10x10.T

test_symmetric_matrix_find_eig(A_sym_test_3x3, symmetric_matrix_find_eig(A_sym_test_3x3))
test_symmetric_matrix_find_eig(A_sym_test_5x5, symmetric_matrix_find_eig(A_sym_test_5x5))
test_symmetric_matrix_find_eig(A_sym_test_10x10, symmetric_matrix_find_eig(A_sym_test_10x10))

test_power_method passed successfully
test_power_method passed successfully
test_power_method passed successfully


### **2.4 (0.4 pt)** Why is symmetry important?
#### Explain why this method will not work for **non-symmetric** matrices $A$. Find a $3\times3$ example of diagonalizable matrix $A$ for which ``symmetric_matrix_find_eig`` function fails to find its correct eigenvalues and eigenvectors
---
By the spectral theorem, any symmetric matrix can be written as $A = \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top + \lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_k \mathbf{v}_k\mathbf{v}_k^\top$ with orthonormal eigenvectors $\mathbf{v}_1, \dots, \mathbf{v}_k$. As we've seen, this is the foundation on which the function `symmetric_matrix_find_eig` is built: it justifies the deflation operation $A - \lambda_i \mathbf{v}_i\mathbf{v}_i^\top$ used to find the next dominant eigenpair. 

However, this does not apply to non-symmetric matrices: their eigenvectors are in general not orthogonal, so they cannot always be written as $A = \lambda_1 \mathbf{v}_1\mathbf{v}_1^\top + \lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_k \mathbf{v}_k\mathbf{v}_k^\top$. Subtracting $\lambda_1 \mathbf{v}_1\mathbf{v}_1^\top$ then also changes how the matrix acts on the remaining eigenvectors, which in general are no longer eigenvectors of the deflated matrix. Let's validate it by trying to calculate eigenpairs for the non-symmetric diagonalizable matrix `fail_4`:
$$
    \begin{bmatrix}
        3 & 1 & 0 \\
        0 & 2 & 0 \\
        1 & 6 & 4 \\
    \end{bmatrix}
$$


In [73]:
fail_4 = np.array([[3, 1, 0], [0, 2, 0], [1, 6, 4]])
fail_4_pairs = symmetric_matrix_find_eig(fail_4)

for e in fail_4_pairs:
    print(f"eigenvalue: {e[0]}, eigenvector: {e[1]}")

print("\nNumpy built-in function: ", np.linalg.eig(fail_4))

eigenvalue: 4.0000000022636835, eigenvector: [2.26368376e-09 3.27310700e-22 1.00000000e+00]
eigenvalue: 2.99999999843468, eigenvector: [9.48683291e-01 9.18911495e-09 3.16227787e-01]
eigenvalue: 1.9999999978206113, eigenvector: [-0.27012754  0.34855162  0.89752041]

Numpy built-in function:  EigResult(eigenvalues=array([4., 3., 2.]), eigenvectors=array([[ 0.        ,  0.70710678, -0.34815531],
       [ 0.        ,  0.        ,  0.34815531],
       [ 1.        , -0.70710678, -0.87038828]]))


As we see, even though the eigenvalues are calculated correctly, only the first eigenvector agrees with the one returned by `np.linalg.eig`; the second and third vectors calculated by `symmetric_matrix_find_eig` are far from the true eigenvectors.
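
The reason can be checked numerically. The sketch below takes the reference eigenvectors of `fail_4` from `np.linalg.eig` (used here only for verification), shows that they are not orthogonal, and applies the deflated matrix to the true eigenvector for $\lambda = 3$; the variable names are chosen for this illustration.

```python
import numpy as np

fail_4 = np.array([[3.0, 1.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [1.0, 6.0, 4.0]])

eigvals, eigvecs = np.linalg.eig(fail_4)
# Pick the reference unit eigenvectors for lambda = 4 and lambda = 3
v1 = eigvecs[:, np.argmin(np.abs(eigvals - 4.0))].real
v2 = eigvecs[:, np.argmin(np.abs(eigvals - 3.0))].real

# For a symmetric matrix this inner product would be zero
overlap = v1 @ v2
print("v1 . v2 =", overlap)

# Deflate with the exact dominant eigenpair and apply the result to v2
deflated = fail_4 - 4.0 * np.outer(v1, v1)
residual = np.linalg.norm(deflated @ v2 - 3.0 * v2)
print("|| deflated @ v2 - 3 * v2 || =", residual)
```

The inner product is far from zero and the residual is large, so the true eigenvector for $\lambda = 3$ is not an eigenvector of the deflated matrix. The power method applied to the deflated matrix therefore converges to a different vector.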

## 3. Inverse power method (0.6 pt)
### **3.1 The main idea**
##### Now you will try to find non-dominant eigenpairs for a generic matrix $A$ using the **inverse power method / inverse iteration method**. This method finds an eigenvalue of $A$ that is the closest one to a given guess value $\mu$, along with the corresponding eigenvector. By trying different $\mu$, we will find all simple eigenvalue/eigenvector pairs of $A$, not just the dominant one.

##### The idea is that if $\lambda_*$ is the eigenvalue of $A$ that is the closest one to $\mu$, then $(\lambda_* - \mu)^{-1}$ is the dominant eigenvalue of the matrix $B:=(A - \mu I)^{-1}$, while the corresponding eigenvector $\mathbf{v}_*$ of $B$ is also an eigenvector of $A$ corresponding to $\lambda_*$.
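
This relation between the eigenpairs of $A$ and $B$ follows from a one-line computation, assuming that $\mu$ is not itself an eigenvalue of $A$ (so that $A - \mu I$ is invertible). For any eigenpair $(\lambda_j, \mathbf{v}_j)$ of $A$:

$$
A\mathbf{v}_j = \lambda_j \mathbf{v}_j
\iff (A - \mu I)\mathbf{v}_j = (\lambda_j - \mu)\mathbf{v}_j
\iff (A - \mu I)^{-1}\mathbf{v}_j = \frac{1}{\lambda_j - \mu}\,\mathbf{v}_j.
$$

Hence $B$ has the same eigenvectors as $A$ and the eigenvalues $1/(\lambda_j - \mu)$, the largest of which in absolute value belongs to the $\lambda_j$ closest to $\mu$. If $\beta$ denotes that dominant eigenvalue of $B$, then $\lambda_* = \mu + 1/\beta$.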

##### The natural approach is to find first the dominant eigenvalue $\lambda_1$ of $A$; then all the remaining eigenvalues satisfy  $$\forall j\ne1: |\lambda_j| \lt |\lambda_1|, $$ and we can apply a random search for $|\mu| < |\lambda_1|$ and call the **inverse power method** for each such $\mu$. This way, we will identify all the **simple** eigenvalues and the corresponding eigenvectors of $A$

>**Note:** Here, it may be necessary to work with complex numbers. In that case, the corresponding changes to the power method must be made (recall how the scalar product in $\mathbb{C}^k$ differs from that in $\mathbb{R}^k$)

### **3.2 (0.4 pt)** Implement the inverse power method
#### Function ``inverse_power_method``:

In [74]:
def inverse_power_method(A, approx_eigenvalue, start_eigvector=None, max_iter=100_000):
    """
    Return the eigenvalue of A closest to approx_eigenvalue and its
    corresponding eigenvector using the inverse power method.

    Args:
        A - matrix for which to compute the eigenpair
        approx_eigenvalue - the parameter mu, a value closest to some eigenvalue l,
                            which will be returned together with its eigenvector
        start_eigvector(optional) - vector used for initialization
                                    on the first step of power iteration algorithm.
                                    Can be an approximation of the real eigenvector,
                                    but doesn't have to be. Defaults to None.
                                    In such case, the initial vector will be randomized.
        max_iter(optional) - maximum number of iterations for the power method.
    Returns:
        (eigval, eigvec) - a pair of an eigenvalue closest to approx_eigenvalue
                           and its eigenvector.
    """
    n = A.shape[0]
    mu = approx_eigenvalue

    B = np.linalg.inv(A - mu * np.eye(n))
    lambda_mu_inv, evc =  power_method(B, start_vector=start_eigvector, max_iter=max_iter, tol=1e-9)
    ev = 1 / lambda_mu_inv + mu

    return ev, evc
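
The implementation above forms the inverse $(A - \mu I)^{-1}$ explicitly. A common alternative, sketched here under the assumption that `np.linalg.solve` is allowed in this part of the homework, is to solve a linear system with $A - \mu I$ on every step instead. The function name `inverse_iteration_solve`, its parameter `num_iter`, and the test matrix `A_demo` are hypothetical and used only for this illustration.

```python
import numpy as np

def inverse_iteration_solve(A, mu, x0, num_iter=50):
    """Inverse iteration that solves (A - mu I) y = x on each step
    instead of forming the inverse matrix explicitly."""
    n = A.shape[0]
    shifted = A - mu * np.eye(n)
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iter):
        y = np.linalg.solve(shifted, x)   # y = (A - mu I)^{-1} x
        x = y / np.linalg.norm(y)
    eigval = x @ A @ x                    # Rayleigh quotient of the unit vector x
    return eigval, x

# Small example with known eigenvalues 1, 2, 5; the guess 1.8 is closest to 2
A_demo = np.diag([1.0, 2.0, 5.0])
lam, v = inverse_iteration_solve(A_demo, mu=1.8, x0=np.ones(3))
print("eigenvalue:", lam)
print("eigenvector:", v)
```

Both variants perform the same iteration in exact arithmetic; the difference is only in how the product with $(A - \mu I)^{-1}$ is computed.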

### **3.3 (0.2pt):** Testing the ``inverse_power_method``
#### Apply the method to find a few (at least 3) eigenvalues and eigenvectors of the matrix $A$ defined as $$ A = P D P^{-1},$$ where $$ D = \text{diag}(0,1,2,3,\cdots,9)$$ is diagonal and $P$ is a random $10 \times 10$ matrix. Explain why the diagonal entries of $D$ are eigenvalues of $A$ and why the columns of $P$ are the corresponding eigenvectors. Use that observation to test the found eigenvalues and eigenvectors of $A$

In [75]:
D = np.diag([0,1,2,3,4,5,6,7,8,9])
P = (20*np.random.rand(10,10) - 5)
#define A through known eigenvalues and random eigenvectors
A = P @ D @ np.linalg.inv(P)

# Find eigenpairs using different random numbers as an approximation
for _ in range(5):
    mu = np.random.rand() * 10 # Random eigenvalue approximation from 0 to 10
    ev, evc = inverse_power_method(A, approx_eigenvalue=mu)
    print(f"mu: {mu} \tlambda: {ev}\nv: {evc}\n")

mu: 0.09408151560741262 	lambda: -4.08006961549745e-14
v: [ 0.38689421  0.19904146 -0.08734174  0.10389727  0.27037823  0.67111359
 -0.00073498  0.19338626  0.48073031  0.01654992]

mu: 4.401206452744811 	lambda: 4.000000000049301
v: [ 0.37412606 -0.09255653  0.30204414  0.27713743 -0.06152342  0.13351113
  0.60362324  0.32396048  0.0322359   0.43756856]

mu: 9.740785372532216 	lambda: 8.999999999713346
v: [-0.24667016 -0.37799491 -0.06842952 -0.37254164 -0.37177583 -0.28765926
  0.16770833 -0.09532712 -0.27169945 -0.56639628]

mu: 7.780968674158464 	lambda: 8.000000000006686
v: [ 0.35133689  0.26069106  0.54424681 -0.30401089 -0.07687331  0.05184127
  0.45433933 -0.29167962  0.27612154  0.20888752]

mu: 3.454508000085416 	lambda: 2.9999999999103815
v: [-0.22805999 -0.54671549 -0.30160744 -0.28187239 -0.38778306 -0.26412619
 -0.32178612 -0.10411298 -0.36352188  0.10954178]



In the above case we reversed the usual process of decomposing a matrix $A$ into its eigenvalues and eigenvectors: we first chose the eigenvectors (the columns of the matrix $P$) and the eigenvalues (placed on the diagonal of $D$), and only then built the matrix. 

This setup is sufficient to define a transformation $A \in \mathbb R^{10 \times 10}$ by $A := PDP^{-1}$. Multiplying by $P$ on the right gives $AP = PD$, and since $D$ has $0$ everywhere except the diagonal, the $j$-th column of $PD$ is the $j$-th column $\mathbf p_j$ of $P$ scaled by the $j$-th diagonal entry $d_j$ of $D$. Comparing columns, we get $A\mathbf p_j = d_j \mathbf p_j$: each diagonal entry of $D$ simply "assigns a weight" to the direction $\mathbf p_j$ that $P$ contains. This is why the eigenvalues of $A$ are exactly the diagonal entries that we predetermined when defining the matrix $D$, and the columns of $P$ are the corresponding eigenvectors (a random $P$ is invertible with probability $1$, so its columns are nonzero and linearly independent). 
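
This observation gives a direct test for a found eigenpair: the eigenvalue should be close to a diagonal entry of $D$, and the eigenvector should be parallel to the matching column of $P$. Below is a minimal self-contained sketch of such a test. It rebuilds $A$ with a fixed random seed, repeats the inverse iteration inline with the guess $\mu = 4.4$, and uses a hypothetical helper `cosine`; none of these choices come from the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, for reproducibility only
D = np.diag(np.arange(10.0))
P = 20 * rng.random((10, 10)) - 5
A = P @ D @ np.linalg.inv(P)

# Columns of P are eigenvectors: A P = P D
print("A P == P D:", np.allclose(A @ P, P @ D))

def cosine(u, v):
    """Absolute cosine of the angle between u and v (1 means parallel)."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Inverse iteration with mu = 4.4, which is closest to the eigenvalue 4
mu = 4.4
B = np.linalg.inv(A - mu * np.eye(10))
x = np.ones(10)
for _ in range(200):
    x = B @ x
    x /= np.linalg.norm(x)
lam = mu + 1.0 / (x @ B @ x)             # recover lambda from the eigenvalue of B

j = int(round(lam))                      # index of the matching column of P
print(f"lambda = {lam:.6f}, cosine with column {j} of P = {cosine(x, P[:, j]):.6f}")
```

Comparing directions through the cosine avoids the ambiguity in sign and scale: the method returns a unit vector, while the columns of $P$ are not normalized.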

---

## **4. Conclusions (0.4 pts)**

#### Summarize in a few sentences what you learned by completing this task. Mention the difficulties you might have faced with, any properties/facts that you now understand better (if any)

---

#### Upon completing this task, we learned a lot about different methods of iteratively finding eigenvalues and eigenvectors. The basic concept behind all the methods we implemented is **power iteration**, which allows us to find a dominant eigenpair by repeatedly multiplying a matrix by the approximation of the dominant eigenvector, which, in turn, gradually gets closer and closer to the actual eigenvector (provided that there are no obstacles that prevent the algorithm from converging).

#### Apart from that, writing these algorithms allowed us to gain a deeper understanding of important theoretical concepts, such as spectral decomposition and diagonalization. 

#### In particular, we now know that if a matrix $A$ is symmetric, i.e., can be written as $\lambda_1 \mathbf{v}_1\mathbf{v}_1^\top + \lambda_2 \mathbf{v}_2\mathbf{v}_2^\top + \dots + \lambda_k \mathbf{v}_k\mathbf{v}_k^\top$ with orthonormal eigenvectors $\mathbf{v}_1, \dots, \mathbf{v}_k$, then we are able to call `symmetric_matrix_find_eig()` on it and expect the method to correctly identify eigenpairs using the mentioned property. 

#### Otherwise, we can use `inverse_power_method` with different guesses $\mu$. 

---