<a href="https://colab.research.google.com/github/johanhoffman/DD2363_VT23/blob/main/template-report-lab-X.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Lab 1: Matrix Factorization**
**Nolwenn Deschand**

# **Abstract**


This first lab is about matrix factorization.
Matrix factorization is a useful tool for solving systems of linear equations such as $Ax = b$. The solution $x = A^{-1}b$ formally requires the inverse $A^{-1}$, which can be complex and costly to compute for some matrices. The goal of matrix factorization is to write the matrix A as a product of matrices that are easy to invert, such as orthogonal or upper triangular matrices.

---

Most of the algorithms are implemented from the pseudocode in Chapter 5 of the book *Methods in Computational Science* by Johan Hoffman.


# **About the code**

In [None]:
"""This program is a lab report using the provided template"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# written by Nolwenn Deschand (ddeschand@kth.se)
# Template by Johan Hoffman


'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

In [None]:
# Load necessary modules.
from google.colab import files

import time
import numpy as np


# **Introduction**

In this lab, we implement matrix factorization methods in order to solve systems of linear equations of the form $Ax = b$, where A is a matrix, x the unknown solution and b a vector. Systems of this type are common in the study of differential and integral equations. We can write the solution x as $x = A^{-1}b$. However, inverting the matrix A can be very costly and difficult, for instance for large matrices.

In this lab, we implement and test several functions related to matrix factorization: a sparse matrix-vector product, a QR factorization (Gram-Schmidt QR factorization), a direct solver and a QR eigenvalue algorithm.

For each function implemented, we will explain the method used, implement it, test it and then discuss the results obtained. 


# **Method**


**Implementation of the sparse matrix-vector product**

Sparse matrices are matrices whose components are mostly zero; here we take this to mean that the number of nonzero components of an $n \times n$ matrix is $O(n)$.

For these matrices, storing all components is wasteful. Instead, we can represent the matrix with the CRS data structure (compressed row storage), composed of 3 arrays:


*   val, that contains the nonzero components of the matrix in row order
*   col_idx, that contains the index of the column of each nonzero component of the matrix
*   row_ptr, that contains, for each row, the index in the other two arrays at which that row starts, followed by a final entry equal to the number of nonzero components plus 1 (indices are 1-based here)
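
The three arrays can be illustrated with a small helper that builds them from a dense matrix. This is only a sketch: the name `dense_to_crs` and the example matrix `A_small` are hypothetical, and the 1-based indexing is an assumption chosen to match the convention used in this report.

```python
def dense_to_crs(A):
    """Build 1-based CRS arrays (val, col_idx, row_ptr) from a dense matrix
    given as a list of rows. Hypothetical helper, for illustration only."""
    val, col_idx, row_ptr = [], [], [1]
    for row in A:
        for j, a in enumerate(row):
            if a != 0:
                val.append(a)             # nonzero value, in row order
                col_idx.append(j + 1)     # its 1-based column index
        row_ptr.append(len(val) + 1)      # 1-based start of the next row
    return val, col_idx, row_ptr

A_small = [[4, 0, 0],
           [0, 0, 5],
           [1, 2, 0]]
val_s, col_s, ptr_s = dense_to_crs(A_small)
print(val_s)   # [4, 5, 1, 2]
print(col_s)   # [1, 3, 1, 2]
print(ptr_s)   # [1, 2, 3, 5]
```

The last entry of `row_ptr` is the number of nonzero components plus 1, so that the nonzeros of row `i` always lie between two consecutive entries of `row_ptr`.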

Here, we implement the product $b = Ax$, with A a sparse matrix and x a vector.

The input is a vector x and a sparse (real, square) matrix A represented by its CRS arrays val, col_idx and row_ptr.

The expected output is the matrix-vector product $b = Ax$.

Pseudo code of the algorithm (from book Chapter 5): 


```
ALGORITHM 5.9. b = sparse_matrix_vector_product(A, x).
Input: a sparse m x n matrix A and an n vector x.
Output: the matrix-vector product b = Ax
1: for i=0:n-1 do
2:     b[i]=0
3:     for j=A.row_ptr[i]:A.row_ptr[i+1]-1 do
4:         b[i]= b[i] + A.val[j]*x[A.col_idx[j]]
5:     end for
6: endfor
7: return b

```


In [None]:
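def sparse_mv_product(x, val, col_idx, row_ptr):
    """Compute b = Ax for a square sparse matrix A stored in 1-based CRS arrays."""
    n = x.shape[0]

    # Algorithm 5.9, with indices shifted by -1 because the CRS arrays are 1-based
    b = np.zeros(n)
    for i in range(n):
        # row_ptr[i] .. row_ptr[i+1]-1 are the 1-based positions of the nonzeros of row i
        for j in range(row_ptr[i], row_ptr[i+1]):
            b[i] += val[j-1] * x[col_idx[j-1]-1]
    return b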

**Implementation of the Gram-Schmidt QR factorization**

To solve the linear equation $Ax = b$, we formally need the inverse of A. As computing the inverse is expensive, we instead factorize A into easily invertible matrices. With the QR factorization, we write $A = QR$ with Q an orthogonal matrix and R an upper triangular matrix.

The algorithm takes as input a real, square, invertible matrix A.
The expected output is an orthogonal matrix Q and an upper triangular matrix R, such that $A=QR$.

Pseudo code of the algorithm (from Chapter 5):
```
ALGORITHM 5.3. (Q, R) = modified_gram_schmidt_iteration(A). 
Input: a full rank n x n matrix A.
Output: an orthogonal n x n matrix Q and an upper triangular n x n matrix R.
1: for j=0:n-1 do
2:     v[:] = A[:,j]
3:     for i=0:j-1 do
4:         R[i,j] = scalar_product(Q[:,i], v[:])
5:         v[:] = v[:] - R[i,j]*Q[:,i]
6:     end for
7:     R[j,j] = norm(v)
8:     Q[:,j] = v[:]/R[j,j]
9: endfor
10: return Q, R

```

In [None]:
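def QR_factorization(A):
    """Modified Gram-Schmidt QR factorization of a square, full rank matrix A."""
    n = A.shape[0]

    R = np.zeros((n,n))
    Q = np.zeros((n,n))

    # Algorithm 5.3
    for j in range(n):
        # Work on a float copy of column j, so that A is never modified
        v = A[:,j].astype(float)
        for i in range(j):
            # Remove the component of v along the already computed column Q[:,i]
            R[i,j] = np.dot(Q[:,i], v)
            v = v - R[i,j]*Q[:,i]
        R[j,j] = np.linalg.norm(v)
        Q[:,j] = v/R[j,j]

    return Q,R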

**Implementation of a direct solver Ax=b**

Using the previously implemented QR factorization, we can now implement a direct solver for $Ax = b$. Since $A = QR$, we can rewrite the equation as follows:

$Ax = b \Leftrightarrow QRx = b \Leftrightarrow Rx = Q^{-1}b \Leftrightarrow x = R^{-1}Q^{-1}b$

The solver takes as input a real square matrix A and a vector b.
The expected output is the vector $x = A^{-1}b$.
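
Since Q is orthogonal, $Q^{-1} = Q^{T}$, and the triangular system $Rx = Q^{T}b$ can be solved by back substitution without forming any inverse. The sketch below illustrates this alternative; the names `back_substitution` and `solve_qr` are hypothetical, and `np.linalg.qr` is used only as a stand-in factorization so that the example is self-contained.

```python
import numpy as np

def back_substitution(R, c):
    """Solve R x = c for an upper triangular, nonsingular matrix R."""
    n = R.shape[0]
    x = np.zeros(n)
    # Start from the last row and work upwards
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

def solve_qr(A, b):
    """Solve A x = b via A = QR, using Q^T in place of Q^{-1}."""
    Q, R = np.linalg.qr(A)          # stand-in for a hand-written factorization
    return back_substitution(R, Q.T @ b)

A_demo = np.array([[2.0, 1.0], [1.0, 3.0]])
b_demo = np.array([3.0, 5.0])
x_demo = solve_qr(A_demo, b_demo)
print(x_demo)   # approximately [0.8, 1.4]
```

Back substitution costs $O(n^{2})$ operations, whereas forming explicit inverses costs $O(n^{3})$.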

In [None]:
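def solver(A,b):
    """Solve Ax = b using the factorization A = QR."""
    Q,R = QR_factorization(A)
    # x = R^{-1} Q^{-1} b, with both inverses formed explicitly
    x = np.dot(np.dot(np.linalg.inv(R), np.linalg.inv(Q)),b)
    return x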

**Bonus assignment: Implementation of the QR eigenvalue algorithm**

There are algorithms for computing the eigenvalue decomposition of a matrix, such as the QR eigenvalue algorithm.

For a square matrix A, the Schur factorization is constructed from successive QR factorizations
$Q^{(k)}R^{(k)} = A^{(k-1)}$

After k iterations we have the approximate Schur factorization:
$A = U^{(k)}A^{(k)}U^{(k)*}$
where under suitable conditions $A^{(k)}$ will converge to an upper triangular matrix which has the eigenvalues of A on the diagonal.
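
The factorization identity can be checked numerically at every iteration. The sketch below is an illustration under assumptions: it uses `np.linalg.qr` as a stand-in QR factorization and a hypothetical $2 \times 2$ symmetric matrix `A0`.

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [1.0, 3.0]])   # hypothetical symmetric test matrix
Ak = A0.copy()
U = np.eye(2)

for k in range(5):
    Q, R = np.linalg.qr(Ak)   # Q R = A^(k-1)
    Ak = R @ Q                # A^(k) = R Q = Q^T A^(k-1) Q
    U = U @ Q                 # U^(k) = U^(k-1) Q
    # The reconstruction error stays at rounding level,
    # while the off-diagonal entry of A^(k) shrinks.
    print(k + 1, np.linalg.norm(A0 - U @ Ak @ U.T), abs(Ak[1, 0]))
```

Each step is a similarity transformation with an orthogonal matrix, which is why the eigenvalues of $A^{(k)}$ are those of A throughout the iteration.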

To simplify the analysis of the convergence properties, we can restrict our attention to matrices that are real and symmetric, for which all eigenvalues are real and the corresponding eigenvectors form an orthonormal basis for $\mathbb{R}^{n}$.

As input, we take a real symmetric matrix A.

The outputs are the real eigenvalues $\lambda_{i}$ and real eigenvectors $v_{i}$ of A.

Pseudo code of the algorithm (from Chapter 6):
```
ALGORITHM 6.1. (A, U) = qr_algorithm(A).
Input: a general n x n matrix A.
Output: approximate Schur factorization n x n matrices A and U.
1: U=I 
2: while stopping_criterion == false do
3:     (Q, R) = qr_factorization(A)
4:     A = matrix_matrix_product(R, Q)
5:     U = matrix_matrix_product(U, Q)
6: end while
7: return A, U
```

There are several possible choices for the stopping criterion. Here, I chose to limit the off-diagonal residual: the sum of all elements of A minus the sum of its diagonal elements gives the sum of the off-diagonal elements, and the iteration stops once this value drops below a threshold (here $10^{-6}$). Note that this is a signed sum, so entries of opposite sign can cancel, and the criterion is only a rough indicator of how close A is to a diagonal matrix. Lowering the threshold gives more accurate results.
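
One possible variant of the criterion measures the magnitude of the off-diagonal part instead of its signed sum, and caps the number of iterations. The sketch below is an assumption-laden illustration, not the implementation used in this report: `off_diagonal_norm`, `qr_algorithm_sketch`, `tol` and `max_iter` are hypothetical names, and `np.linalg.qr` stands in for the QR factorization.

```python
import numpy as np

def off_diagonal_norm(A):
    """Frobenius norm of A with its diagonal removed."""
    return np.linalg.norm(A - np.diag(np.diag(A)), 'fro')

def qr_algorithm_sketch(A, tol=1e-10, max_iter=10_000):
    """Unshifted QR iteration for a real symmetric matrix A."""
    Ak = np.array(A, dtype=float)
    U = np.eye(Ak.shape[0])
    for _ in range(max_iter):
        if off_diagonal_norm(Ak) <= tol:
            break
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        U = U @ Q
    return Ak, U

A_sym = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
Ak, U = qr_algorithm_sketch(A_sym)
print(np.sort(np.diag(Ak)))        # eigenvalue estimates from the iteration
print(np.linalg.eigvalsh(A_sym))   # NumPy reference values
```

Because the norm is nonnegative, the loop cannot stop early through cancellation, and the iteration cap guards against matrices for which the unshifted iteration converges slowly or not at all.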

In [None]:
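def qr_algorithm(A):
  """Unshifted QR iteration: returns the approximate Schur factors (A, U)."""
  n = A.shape[0]
  U = np.identity(n)

  # Signed sum of the off-diagonal elements, used as stopping criterion
  non_diag_residuals = A.sum() - A.trace()

  while non_diag_residuals > 1e-6:
    (Q, R) = QR_factorization(A)
    A = np.dot(R, Q)   # similarity transformation: RQ = Q^T A Q
    U = np.dot(U, Q)   # accumulate the orthogonal factors
    non_diag_residuals = A.sum() - A.trace()

  return A, U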

# **Results**

In this section, we test all the previously implemented functions. For each test, we explain the procedure and compare the expected results with the obtained results.

---

# **Sparse matrix-vector product**

The sparse matrix-vector product $Ax = b$ has been implemented; we will now test it. To do so, we compare the result of the implemented function with a dense matrix-vector product computed with the NumPy library. If the function is correctly implemented, the results should be equal.

A is a sparse matrix, and val, col_idx and row_ptr are its CRS representation. 



In [None]:
# Matrix A 
A = [[3, 2, 0, 2, 0, 0], [0, 2, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 3, 2, 0, 0], [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 2, 3]]

# CRS Representation 
val = [3, 2, 2, 2, 1, 1, 3, 2, 1, 2, 3]
col_idx = [1, 2, 4, 2, 3, 3, 3, 4, 5, 5, 6]
row_ptr = [1, 4, 6, 7, 9, 10, 12]

# Vector x
x = np.array([2, 0, 1, 4, 5, 2])

b = sparse_mv_product(x, val, col_idx, row_ptr)
print("sparse matrix vector product : ", b)

# Test : Dense product 
print ("dense matrix vector product : ",np.dot(A,x))

sparse matrix vector product :  [14.  1.  1. 11.  5. 16.]
dense matrix vector product :  [14  1  1 11  5 16]



We obtain the same results for the two implementations of the matrix-vector product, so the sparse matrix-vector product implementation appears to be correct on this example.

# **Gram-Schmidt QR factorization**

To test the QR factorization, we verify that R is an upper triangular matrix and that Q is orthogonal. To measure the size of a matrix $A \in \mathbb{R}^{m \times n}$, we can use the Frobenius norm, defined by:
$||A||_{F} = \sqrt{trace(A^{T}A)} = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}|a_{ij}|^{2}}$. We can use the NumPy implementation of the Frobenius norm.

We will compute the norms $||Q^{T}Q-I||_{F}$ and $||QR-A||_{F}$.
As Q should be orthogonal, we should have $Q^{T} = Q^{-1}$, so $Q^{T}Q - I$ should be the null matrix, with a Frobenius norm equal to 0.

If the factorization is correct, we should have $A = QR$, so the norm $||QR-A||_{F}$ should also be 0, up to rounding errors.
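
As a quick sanity check of the definition, the Frobenius norm can be computed in three ways: from the sum of squared entries, from $\sqrt{trace(A^{T}A)}$, and with NumPy's built-in. The matrix `A_demo` below is a hypothetical example.

```python
import numpy as np

A_demo = np.array([[1.0, 2.0], [3.0, 4.0]])

fro_sum = np.sqrt(np.sum(np.abs(A_demo) ** 2))     # sqrt of the sum of squared entries
fro_trace = np.sqrt(np.trace(A_demo.T @ A_demo))   # sqrt(trace(A^T A))
fro_numpy = np.linalg.norm(A_demo, 'fro')          # NumPy implementation

print(fro_sum, fro_trace, fro_numpy)   # all three equal sqrt(30)
```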

In [None]:
# Test
A = np.array([[3, 2, 1, 2], [6, 2, 1, 2], [3, 5, 1, 4], [5, 5, 3, 2]])

Q,R = QR_factorization(A)
print("R = ",R)

# Frobenius norm 
n = A.shape[0]

F1 = np.linalg.norm(np.dot(np.transpose(Q), Q)-np.identity(n),'fro')

F2 = np.linalg.norm(np.dot(Q,R)-A,'fro')
print("F1 = ", F1)
print("F2 = ", F2)


R =  [[ 8.88819442  6.52550983  3.03773733  4.5003516 ]
 [ 0.          3.92654066  1.06384106  2.19860487]
 [ 0.          0.          1.2807787  -1.56924238]
 [ 0.          0.          0.          0.67115606]]
F1 =  4.938266190042341e-16
F2 =  4.440892098500626e-16


The obtained R is, as expected, an upper triangular matrix.

The two computed Frobenius norms $||Q^{T}Q-I||_{F}$ and $||QR-A||_{F}$ are of the order of $10^{-16}$, that is, rounding-error level, so we can consider that the implementation works as expected for this test case.
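
An additional cross-check, not part of the original tests, is to compare against NumPy's built-in `np.linalg.qr`. NumPy may return a factorization in which some diagonal entries of R are negative, whereas Gram-Schmidt produces a positive diagonal, so the sketch below first flips signs to a common convention. The names `A_ref`, `Q_ref`, `R_ref` and `signs` are hypothetical.

```python
import numpy as np

A_ref = np.array([[3, 2, 1, 2], [6, 2, 1, 2], [3, 5, 1, 4], [5, 5, 3, 2]], dtype=float)
Q_ref, R_ref = np.linalg.qr(A_ref)

# Flip signs so that diag(R) > 0; the product Q R is unchanged
signs = np.sign(np.diag(R_ref))
Q_ref = Q_ref * signs            # scales the columns of Q
R_ref = signs[:, None] * R_ref   # scales the rows of R

print(np.round(R_ref, 8))
print(np.linalg.norm(Q_ref @ R_ref - A_ref, 'fro'))
```

For an invertible matrix, the QR factorization with a positive diagonal in R is unique, so the printed R can be compared entry by entry with the one obtained above.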

# **Direct solver Ax=b**

To test the direct solver, we can first compute the norm of the residual $||Ax-b||$.
We can also manufacture a solution: choose a matrix A and a vector y, compute $b = Ay$, and compare y with the result of the solver.



In [None]:
# TEST 1: residual of ||𝐴𝑥−𝑏||

A = np.array([[3, 2, 1, 2], [6, 2, 1, 2], [3, 5, 1, 4], [5, 5, 3, 2]])
b = np.array([2,3,1,2])
x = solver(A,b)

residual1 = np.linalg.norm(np.dot(A,x)-b)
print("residual of ||𝐴𝑥−𝑏|| : ", residual1)

# TEST 2: manufactured solution y, with b = Ay
# Note: the value printed below is the residual ||Ax-b|| of the manufactured system;
# the error with respect to y would be np.linalg.norm(x - y)

A = np.array([[5, 9, 1, 2], [1, 3, 3, 2], [4, 4, 2, 1], [6, 2, 1, 1]])
y = np.array([3,1,1,2])

b = np.dot(A,y)

x = solver(A,b)
residual2 = np.linalg.norm(np.dot(A,x)-b)
print("residual of ||Ax-b|| (manufactured b) : ", residual2)


residual of ||𝐴𝑥−𝑏|| :  1.0877919644084146e-15
residual of ||Ax-b|| (manufactured b) :  8.702335715267317e-15


For both test cases, the resulting norm is close to 0, as expected.
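
For a manufactured solution, the quantity of interest is the forward error $||x-y||$, which is distinct from the residual $||Ax-b||$. The sketch below computes both for the manufactured system. It is an illustration under assumptions: `np.linalg.qr` and `np.linalg.solve` stand in for the hand-written factorization and solver, and the variable names are hypothetical.

```python
import numpy as np

A_m = np.array([[5, 9, 1, 2], [1, 3, 3, 2], [4, 4, 2, 1], [6, 2, 1, 1]], dtype=float)
y_m = np.array([3.0, 1.0, 1.0, 2.0])   # manufactured solution
b_m = A_m @ y_m                         # right-hand side built from y

Q_m, R_m = np.linalg.qr(A_m)
x_m = np.linalg.solve(R_m, Q_m.T @ b_m)   # solve R x = Q^T b

forward_error = np.linalg.norm(x_m - y_m)     # ||x - y||
residual = np.linalg.norm(A_m @ x_m - b_m)    # ||Ax - b||
print("||x - y||  :", forward_error)
print("||Ax - b|| :", residual)
```

A small residual does not by itself guarantee a small forward error for ill-conditioned matrices, which is why the manufactured-solution test is a useful complement.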

# **Bonus assignment: Implementation of the QR eigenvalue algorithm**

To test the implementation of the QR eigenvalue algorithm, we compute for each pair of eigenvalue and eigenvector:

1. $det(A - \lambda_{i}I)$,
2. $||Av_{i} - \lambda_{i}v_{i}||$

With $\lambda_{i}$ the eigenvalue and $v_{i}$ the associated eigenvector.

The implemented function should return a (nearly) diagonal matrix A with the eigenvalues on the diagonal and a matrix U with the eigenvectors as columns.

If the function is correctly implemented the determinant $det(A - \lambda_{i}I)$ and the norm $||Av_{i} - \lambda_{i}v_{i}||$ should be 0, as the definition of eigenvalues and eigenvectors is $Av_{i} = \lambda_{i}v_{i}$.


In [None]:
# A, a real symmetric matrix
A = np.array([[2, 4, 7, 3], [4, 3, 2, 5], [7, 2, 1, 8], [3, 5, 8, 6]])

(Adiag,U) = qr_algorithm(A)

n = A.shape[0]

for i in range(n):
  print ("Eigenvalue ",i+1,": ")

  print("𝑑𝑒𝑡(𝐴−𝜆𝑖𝐼) : ",np.linalg.det(A-np.dot(Adiag[i,i],np.identity(n))))
  print("||𝐴𝑣𝑖−𝜆𝑖𝑣𝑖|| : ", np.linalg.norm(np.dot(A,U[:,i])-Adiag[i,i]*U[:,i]))
  print()




Eigenvalue  1 : 
𝑑𝑒𝑡(𝐴−𝜆𝑖𝐼) :  5.160851235385878e-12
||𝐴𝑣𝑖−𝜆𝑖𝑣𝑖|| :  6.4047456679787536e-15

Eigenvalue  2 : 
𝑑𝑒𝑡(𝐴−𝜆𝑖𝐼) :  0.0
||𝐴𝑣𝑖−𝜆𝑖𝑣𝑖|| :  2.7012892057857038e-15

Eigenvalue  3 : 
𝑑𝑒𝑡(𝐴−𝜆𝑖𝐼) :  3.4833662193366036e-11
||𝐴𝑣𝑖−𝜆𝑖𝑣𝑖|| :  4.892066786674748e-07

Eigenvalue  4 : 
𝑑𝑒𝑡(𝐴−𝜆𝑖𝐼) :  3.465503324834715e-11
||𝐴𝑣𝑖−𝜆𝑖𝑣𝑖|| :  4.892066785901544e-07



All the values here are very close to zero, which is the expected behaviour for eigenvalues and eigenvectors. The largest residuals (about $5 \times 10^{-7}$) are of the same order as the stopping threshold of $10^{-6}$; we could probably improve the precision by lowering the threshold in the QR eigenvalue algorithm, at the cost of more iterations.

# **Discussion**

To conclude, we have implemented four methods for matrix products and factorization. For the simple test cases considered, the implementations appear to work, as all expected results were obtained.