# **Lab 2: Matrix Factorization Methods**
**Sanskar Gupta**

#**Abstract**

The following matrix methods are implemented in this notebook as part of programming assignment 2:

1. Function: Sparse matrix-vector product $b = Ax$,
2. Function: QR factorization into an orthogonal matrix $Q$ and an upper triangular matrix $R$ such that $A=QR$,
3. Function: A direct solver using back substitution and QR factorization, $Ax = b \iff QRx = b$,
4. Function: Least Squares problem
5. Function: $QR$ eigenvalue algorithm
6. Function: Blocked matrix-matrix product


The results of these implementations are checked against the corresponding results computed with the NumPy library.

#**About the code**

In [None]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2021 Sanskar Gupta (sanskar@kth.se)

# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This template is maintained by Johan Hoffman
# Please report problems to jhoffman@kth.se

#**Introduction**

The methods listed in the abstract are implemented manually, one by one, with unit tests for each method.
The tests are evaluated together in the results section, and the insights gained from the tests are discussed briefly in the discussion section.

#**Environment Setup**

In [None]:
#Imports used throughout the notebook
from google.colab import files

import time
import numpy as np
import unittest
import random

from matplotlib import pyplot as plt
from matplotlib import tri
from matplotlib import axes
from mpl_toolkits.mplot3d import Axes3D


# **Methods**

## **Sparse Matrix-Vector Product**

This section implements the sparse matrix-vector product. The input is a matrix $A \in R^{m\times n}$ stored in compressed row storage (CRS) format and a vector $x \in R^n$; the output is the vector $b = Ax, \ b \in R^m$.

In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition of how many elements must be zero for a matrix to be considered sparse, but a common criterion is that the number of nonzero elements is roughly equal to the number of rows or columns.


A matrix in CRS format is represented by three arrays: *val*, *col_idx*, and *row_ptr*. *val* stores the nonzero matrix values row by row, *col_idx* stores the column index of each value in *val*, and *row_ptr* stores, for each row, the index in *val* of that row's first nonzero value, followed by one final entry equal to the total number of nonzero values.

Note that all row and column indices start from 0.
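To make the three arrays concrete, the sketch below builds them from a small dense matrix. The helper name `dense_to_crs` and the example matrix are illustrative assumptions, not part of the assignment code; note how the empty middle row shows up as a repeated entry in `row_ptr`.

```python
import numpy as np

def dense_to_crs(A):
    """Build the CRS arrays (val, col_idx, row_ptr) from a dense 2-D array.

    Hypothetical helper for illustration only.
    """
    val, col_idx, row_ptr = [], [], [0]
    for row in np.asarray(A).tolist():
        for col, entry in enumerate(row):
            if entry != 0:
                val.append(entry)
                col_idx.append(col)
        # After each row, record how many nonzeros have been stored so far
        row_ptr.append(len(val))
    return val, col_idx, row_ptr

A_small = np.array([[3, 0, 2],
                    [0, 0, 0],
                    [0, 1, 4]])
val, col_idx, row_ptr = dense_to_crs(A_small)
print(val)      # nonzero values, row by row
print(col_idx)  # column index of each stored value
print(row_ptr)  # start of each row in val, then the total nonzero count
```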

In [None]:
#Function
def sparseMatrixVectorProduct(x, val, col_idx, row_ptr):
  #CRS does not store the number of columns, so check that every column index fits in x
  assert len(col_idx) == 0 or max(col_idx) < x.size

  numRows = len(row_ptr)-1
  result = np.zeros(numRows)

  for row in range(numRows):
    temp = 0
    #val[row_ptr[row]:row_ptr[row+1]] holds the nonzero values of this row
    for i in range(row_ptr[row], row_ptr[row+1]):
      temp = temp + val[i] * x[col_idx[i]]
    result[row] = temp

  return result

In [None]:
#Test
def testAgainstDenseMatrix():

    #Matrix to test
    A = np.array([[3,2,0,2,0,0],
                  [0,2,1,0,0,0],
                  [0,0,1,0,0,0],
                  [0,0,3,2,0,0],
                  [0,0,0,0,1,0],
                  [0,0,0,0,2,3]])
    x = np.array([2,1,2,4,1,3])
    val = [3,2,2,2,1,1,3,2,1,2,3]
    col_idx = [0,1,3,1,2,2,2,3,4,4,5]
    row_ptr = [0,3,5,6,8,9,11]
    expectedSolution = np.matmul(A,x)
    assert np.array_equal(expectedSolution, sparseMatrixVectorProduct(x,val,col_idx,row_ptr)), "Value mismatch, test failed"

In [None]:
start_time = time.time()
testAgainstDenseMatrix()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- 0.0003952980041503906 seconds ---


## **QR Factorization using Gram Schmidt Process**
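A non-singular matrix $A\in R^{n\times n}$ can be factorized as $A=QR$, where $Q$ is an orthogonal matrix and $R$ is an upper triangular matrix.

The following function computes the factors $Q$ and $R$ with the Gram-Schmidt process. Because every remaining column is updated as soon as a new $q_i$ is available, the implementation is the modified Gram-Schmidt variant rather than the classical one.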

\begin{align*}
\left( \begin{array}{c|c|c|c} & & &  \\ a_{:1} & a_{:2} & \dots & a_{:n} \\ & & & \end{array} \right) = \left( \begin{array}{c|c|c|c} & & &  \\ q_{1} & q_{2} & \dots & q_{n} \\ & & & \end{array} \right)\left( \begin{array}{cccc} r_{11} & r_{12} & \dots & r_{1n} \\  & r_{22} &  &  \\ \vdots & & \ddots & \vdots \\ 0 & \dots & & r_{nn}\end{array} \right)
\end{align*}
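Gram-Schmidt orthogonalization comes in a classical and a modified variant, which differ only in whether the projection coefficients $r_{ij}$ are computed from the original columns or from the already updated ones. The sketch below shows both side by side; the function names and the Hilbert-type test matrix are assumptions made for illustration, and the modified variant typically (not always) loses less orthogonality on ill-conditioned input.

```python
import numpy as np

def classical_gram_schmidt(A):
    """Classical variant: coefficients come from the ORIGINAL column a_j."""
    A = np.array(A, dtype=float)
    n = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((n, n))
    for j in range(n):
        R[:j, j] = Q[:, :j].T @ A[:, j]
        v = A[:, j] - Q[:, :j] @ R[:j, j]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def modified_gram_schmidt(A):
    """Modified variant: coefficients come from the UPDATED columns."""
    V = np.array(A, dtype=float)
    n = V.shape[1]
    Q = np.zeros_like(V)
    R = np.zeros((n, n))
    for i in range(n):
        R[i, i] = np.linalg.norm(V[:, i])
        Q[:, i] = V[:, i] / R[i, i]
        R[i, i + 1:] = Q[:, i] @ V[:, i + 1:]
        # Remove the q_i component from every remaining column right away
        V[:, i + 1:] -= np.outer(Q[:, i], R[i, i + 1:])
    return Q, R

# Ill-conditioned 8x8 Hilbert-type matrix, h_ij = 1 / (i + j + 1)
idx = np.arange(8)
H = 1.0 / (idx[:, None] + idx[None, :] + 1)

for name, factorize in [("classical", classical_gram_schmidt),
                        ("modified", modified_gram_schmidt)]:
    Q, R = factorize(H)
    # Departure from orthogonality, measured as ||Q^T Q - I||_F
    print(name, np.linalg.norm(Q.T @ Q - np.eye(8)))
```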

In [None]:
#Function
def qrFactorization(A):
  v = np.array(A, dtype=float) #Float copy: the input is not overwritten and integer input is not truncated
  numRows = A.shape[0]
  R = np.zeros((numRows,numRows))
  Q = np.zeros((numRows,numRows))
  for i in range(numRows):
    R[i,i] = np.sqrt(np.dot(v[:,i],v[:,i])) #length of v
    Q[:,i] = v[:,i]/R[i,i] #dividing by the length to normalize
    for j in range(i+1, numRows):
      R[i,j] = np.dot(Q[:,i], v[:,j]) 
      v[:,j] = np.subtract(v[:,j], R[i,j]*Q[:,i])
  return Q,R

The tests below compare matrices in the Frobenius norm, $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$. This is the $L_{p,q}$ norm with $p = q = 2$; it is also called the Hilbert–Schmidt norm, though the latter term is used more frequently in the context of operators on (possibly infinite-dimensional) Hilbert space.
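As a small check of the definition, the sketch below computes the norm of a 2×2 example entry by entry and compares it with NumPy's built-in `np.linalg.norm(..., 'fro')`. The example matrix and variable names are illustrative assumptions.

```python
import numpy as np

A_example = np.array([[1.0, 2.0],
                      [3.0, 4.0]])

# Entrywise definition: square every entry, sum, take the square root
frobenius_manual = np.sqrt(np.sum(A_example ** 2))

# NumPy's built-in Frobenius norm for comparison
frobenius_numpy = np.linalg.norm(A_example, 'fro')

# Both values equal sqrt(1 + 4 + 9 + 16) = sqrt(30)
print(frobenius_manual, frobenius_numpy)
```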

In [None]:
#Frobenius norm
def getFrobeniusNorm(A):
  result = 0
  for i in A:
    for j in i:
      result = result+ j*j
  return np.sqrt(result)

In [None]:
#Tests
def testRUpperTriangular():
    test_case = unittest.TestCase()
    for n in range(20):
      size = random.randint(2, 500) 
      A = np.random.rand(size, size)
      Q, R = qrFactorization(A)

      for i in range(size):
        for j in range(0, i):
          test_case.assertEqual(R[i,j], 0)


def testEqualityOfQRwithA():
    test_case = unittest.TestCase()
    for n in range(20):
      size = random.randint(2,100) 
      A = np.random.rand(size, size)
      Q, R = qrFactorization(A)
      expectedSolution = np.matmul(Q,R)
      np.testing.assert_array_almost_equal(expectedSolution, A, 10)  #Raises an AssertionError if two objects are not equal up to desired precision.

  
def testFrobeniusNorm():
  test_case = unittest.TestCase()
  for n in range(20):
    size = random.randint(2,100) 
    A = np.random.rand(size, size)
    Q, R = qrFactorization(A)
    test_case.assertAlmostEqual(getFrobeniusNorm(np.matmul(Q,R)) , getFrobeniusNorm(A), 10)

In [None]:
start_time = time.time()
testRUpperTriangular()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 8.430670261383057 seconds ---


In [None]:
start_time = time.time()
testEqualityOfQRwithA()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.40532445907592773 seconds ---


In [None]:
start_time = time.time()
testFrobeniusNorm()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.43219757080078125 seconds ---


## **Direct Solver**

Here we solve the equation $Ax = b, \ A\in R^{n\times n}, \ b \in R^n, \ x \in R^n$, for $x$, where $A$ is non-singular.

The matrix $A$ is factorized with the Gram-Schmidt QR factorization $A = QR$, which gives $QRx = b$. Since $Q$ is orthogonal, $Q^TQ = I$, so the system reduces to $Rx = Q^Tb$. Because $R$ is upper triangular, this system can be solved easily using backward substitution.

### Backwards substitution

$x_i = \left( b_i - \sum_{j=i+1}^{n} u_{ij}x_j \right) / u_{ii}, \quad i = n, n-1, \dots, 1$
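Backward substitution starts at the last row, where only one unknown remains, and works upwards: each $x_i$ is obtained by subtracting the already known terms $u_{ij}x_j$ ($j > i$) from $b_i$ and dividing by $u_{ii}$. The sketch below walks through a 3×3 system; the function name `back_substitution` and the example data are assumptions for illustration.

```python
import numpy as np

def back_substitution(U, b):
    """Solve U x = b for upper triangular U, starting from the last row."""
    n = b.size
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the already-known terms u_ij * x_j (j > i), then divide by u_ii
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0,  2.0],
              [0.0, 0.0,  4.0]])
b = np.array([3.0, 13.0, 8.0])

# Row 3: x_3 = 8 / 4 = 2
# Row 2: x_2 = (13 - 2*2) / 3 = 3
# Row 1: x_1 = (3 - 1*3 + 1*2) / 2 = 1
x = back_substitution(U, b)
print(x)
```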





In [None]:
#Function
def backWardsSubstitution(U, b):
  n = b.size
  x = np.zeros(n)
  x[n-1] = b[n-1]/U[n-1, n-1] #the last row contains a single unknown
  for i in range(n-2, -1, -1):
    rowSum = 0 #sum of the already-known terms u_ij * x_j, j > i
    for j in range(i+1, n):
      rowSum += U[i,j]*x[j]
    x[i] = (b[i] - rowSum) / U[i,i]
  return x


def directSolver(A,b):
  Q,R = qrFactorization(A)
  m,n = A.shape
  Q_T = np.zeros((n,m))
  for i in range(n):
    Q_T[i,:] = Q[:,i]
  
  Q_Tb = Q_T.dot(b)
  x = backWardsSubstitution(R, Q_Tb)
  return x

In [None]:
#Tests
def testBackWardSubstitution():
    for n in range(20):
      size = random.randint(2,100) 
      A = np.random.rand(size, size)
      b = np.random.rand(size)
      Q, R = qrFactorization(A)
      x = backWardsSubstitution(R, b)
      np.testing.assert_array_almost_equal(np.matmul(R,x), b, 11)

def testResiduals():
    test_case = unittest.TestCase()
    for n in range(20):
      size = random.randint(2,100) 
      A = np.random.rand(size, size)
      b = np.random.rand(size)

      derivedSolutionY = directSolver(A, b)

      result_vector = np.matmul(A,derivedSolutionY)-b
     
      
      #||Ax-b||
      test_case.assertAlmostEqual(np.sqrt(np.dot(result_vector,result_vector)), 0, 10)

      expectedSolutionX = np.linalg.solve(A, b)
      vector_difference = expectedSolutionX-derivedSolutionY


      #||x-y||
      test_case.assertAlmostEqual(np.sqrt(np.dot(vector_difference,vector_difference)), 0, 3)

In [None]:
start_time = time.time()
testBackWardSubstitution()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.48583197593688965 seconds ---


In [None]:
start_time = time.time()
testResiduals()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.4924960136413574 seconds ---


##**Least Squares**
For a system of equations,
\begin{align}
Ax = b,
\end{align}
with $A\in R^{m\times n}, \ x \in R^n, \ b \in R^m, \ m > n$, there are more equations than unknowns (the system is overdetermined) and in general no exact solution exists. The method of least squares approximates the solution by minimizing the sum of the squares of the residuals of the individual equations, i.e. $\|Ax - b\|_2^2$. The minimizer $\bar{x}$ satisfies the normal equations $A^TAx = A^Tb$, so when $A$ has full column rank,

$$
\bar{x} = (A^T A)^{-1}A^Tb.
$$
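As an illustration, the sketch below fits a straight line to four made-up data points in three ways: through the normal equations, through a reduced QR factorization of $A$ itself (solving $Rx = Q^Tb$), and with `np.linalg.lstsq` as a reference. The data and variable names are assumptions. The QR route is included because forming $A^TA$ squares the condition number of the problem, which can matter for ill-conditioned $A$.

```python
import numpy as np

# Fit y ~ c0 + c1 * t to four illustrative data points
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0])

# Overdetermined system: 4 equations, 2 unknowns (c0, c1)
A = np.column_stack([np.ones_like(t), t])

# 1) Normal equations: (A^T A) x = A^T y
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# 2) Reduced QR of A (Q is 4x2, R is 2x2): R x = Q^T y
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ y)

# 3) NumPy reference solution
x_ref = np.linalg.lstsq(A, y, rcond=None)[0]

# All three agree: c0 = 0.9, c1 = 0.9 for this data
print(x_normal, x_qr, x_ref)
```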

In [None]:
#Functions
def transpose(A):
  m,n = A.shape
  A_T = np.zeros((n,m))
  for i in range(n):
    A_T[i,:] = A[:,i]
  return A_T

def leastSquares(A, b):
  if not(A.shape[0] == b.size):
    raise Exception("The dimensions of the vector and the matrix do not match")
  else:
    A_T=transpose(A)
    A_TA = A_T.dot(A)
    A_Tb = A_T.dot(b)
    Q, R = qrFactorization(A_TA)
    Q_T = transpose(Q)
    Q_TA_Tb = Q_T.dot(A_Tb)
    x = backWardsSubstitution(R, Q_TA_Tb)
    return x

In [None]:
#Tests
def testResidualLeastSquare():
  test_case = unittest.TestCase()

  numCols = np.random.randint(2,10)
  numRows = numCols + np.random.randint(2,10) #more equations than unknowns

  A = np.random.rand(numRows,numCols)
  b = np.random.rand(numRows)

  x1 = leastSquares(A,b)
  x2 = np.linalg.lstsq(A,b, rcond=None)[0]
  vector_difference=x1-x2
  test_case.assertAlmostEqual(np.sqrt(np.dot(vector_difference,vector_difference)), 0, 3)
  

In [None]:
start_time = time.time()
testResidualLeastSquare()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.027906179428100586 seconds ---


##**QR eigenvalue algorithm**
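The QR eigenvalue algorithm iteratively computes a Schur factorization $A=UTU^*$, where $U$ is unitary and $T$ is an upper triangular matrix with the eigenvalues of $A$ on the diagonal. When $A$ is real and symmetric, as in the tests below, $T$ is diagonal and the columns of $U$ are eigenvectors of $A$.

The iteration is slow to run, so it is only tested on small matrices (size 8 in the test below) with 2000 iterations; the tests check the results to 2 decimal places.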
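The iteration itself is short: factorize $A_k = Q_kR_k$, form $A_{k+1} = R_kQ_k = Q_k^TA_kQ_k$ (a similarity transform, so the eigenvalues are preserved) and accumulate $U = Q_1Q_2\cdots$. The sketch below runs it on a 2×2 symmetric matrix whose eigenvalues, 3 and 1, are known. It uses `np.linalg.qr` in place of the notebook's own factorization, and the name `qr_iteration` and the iteration count are assumptions.

```python
import numpy as np

def qr_iteration(A, num_iterations=50):
    """Unshifted QR iteration; returns (T, U) with A approximately U T U^T."""
    A_k = np.array(A, dtype=float)
    U = np.eye(A_k.shape[0])
    for _ in range(num_iterations):
        Q, R = np.linalg.qr(A_k)
        A_k = R @ Q   # equals Q^T A_k Q, so the eigenvalues are unchanged
        U = U @ Q     # accumulate the orthogonal transformations
    return A_k, U

A_sym = np.array([[2.0, 1.0],
                  [1.0, 2.0]])  # eigenvalues are 3 and 1

T, U = qr_iteration(A_sym)
print(np.diag(T))                        # approximate eigenvalues
print(np.linalg.norm(A_sym @ U - U @ T)) # residual of A U = U T
```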

In [None]:
#Functions
def QREigenvalue(A):

  n = A.shape[0]
  U = np.identity(n)

  for i in range(0, 2000):
    (Q,R) = qrFactorization(A)
    A = R.dot(Q)
    U = U.dot(Q)

  return A, U

In [None]:
#Tests
def testDeterminantAndNorm():
    for trial in range(0,3):

      n = 8
      I = np.identity(n)

      # A_sym is a real symmetric matrix: mirror the lower triangle into the upper triangle
      A_sym = np.random.rand(n,n)
      for i in range(0, n):
        for j in range(i+1, n):
          A_sym[i,j] = A_sym[j,i]

      (A,U) = QREigenvalue(A_sym)

      for i in range(0, n):

        #norm ||A_sym*u_i - lambda_i*u_i||, with lambda_i = A[i,i]
        np.testing.assert_almost_equal(np.linalg.norm(A_sym.dot(U[:,i]) - A[i,i]*U[:,i]), 0, decimal=2)

        #determinant det(A_sym - lambda_i*I)
        np.testing.assert_almost_equal(np.linalg.det(A_sym - A[i,i]*I), 0, decimal=2)




In [None]:
start_time = time.time()
testDeterminantAndNorm()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 1.6452171802520752 seconds ---


##**Block matrix product**
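The complexity of an algorithm is only one part of the computational cost. Often the cost of accessing slow memory through read or write operations dominates. We define the computational intensity $q$ of an algorithm as the ratio between the average number of floating point operations $f$ and memory references $m$,
$$q = f/m$$
Blocked matrix multiplication raises $q$ by working on submatrices that are small enough to stay in fast memory. The matrices $A$ and $B$ are divided into blocks along their rows and columns, with matching partitions along the shared dimension, and each block of the product is accumulated as $C_{ij} = \sum_{k} A_{ik}B_{kj}$. The function below takes the block counts as arguments: `n` row blocks of $A$, `m` column blocks of $B$ and `p` blocks along the shared dimension.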
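The block structure can be sketched compactly by first computing the index ranges of each block and then accumulating one block product at a time. The helper names `block_bounds` and `blocked_matmul_sketch`, the integer-division partition rule, and the fixed random seed are assumptions made for this illustration.

```python
import numpy as np

def block_bounds(size, blocks):
    """Split range(size) into `blocks` contiguous, nearly equal (start, stop) pairs."""
    edges = [(k * size) // blocks for k in range(blocks + 1)]
    return list(zip(edges[:-1], edges[1:]))

def blocked_matmul_sketch(A, B, n_blocks, m_blocks, p_blocks):
    """Blocked product: C_ij is the sum over k of A_ik B_kj."""
    C = np.zeros((A.shape[0], B.shape[1]))
    for i0, i1 in block_bounds(A.shape[0], n_blocks):
        for j0, j1 in block_bounds(B.shape[1], m_blocks):
            for k0, k1 in block_bounds(A.shape[1], p_blocks):
                # One block product, accumulated into the (i, j) block of C
                C[i0:i1, j0:j1] += A[i0:i1, k0:k1] @ B[k0:k1, j0:j1]
    return C

print(block_bounds(5, 3))  # three contiguous ranges covering 0..4

rng = np.random.default_rng(0)
A_demo = rng.random((5, 7))
B_demo = rng.random((7, 4))
C_demo = blocked_matmul_sketch(A_demo, B_demo, 2, 3, 4)
print(np.allclose(C_demo, A_demo @ B_demo))
```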

In [None]:
def blockedMatrixMultiplication(A, B, n, m, p):
    # n, m, p: number of row blocks of A, column blocks of B and blocks along the shared dimension
    if not (A.shape[1] == B.shape[0]):
        raise Exception("Matrix dimensions are not compatible for multiplication")

    C = np.zeros((A.shape[0], B.shape[1]))

    i_initial, delta_i = 0, 0
    for i in range(n):
        i_initial = i_initial + delta_i
        # spread the remaining rows evenly over the remaining row blocks
        delta_i = int(np.ceil((A.shape[0] - i_initial) / (n - i)))
        j_initial, delta_j = 0, 0
        for j in range(m):
            j_initial = j_initial + delta_j
            delta_j = int(np.ceil((B.shape[1] - j_initial) / (m - j)))
            k_initial, delta_k = 0, 0
            for k in range(p):
                k_initial = k_initial + delta_k
                delta_k = int(np.ceil((A.shape[1] - k_initial) / (p - k)))
                A_block = A[i_initial:i_initial + delta_i, k_initial:k_initial + delta_k]
                B_block = B[k_initial:k_initial + delta_k, j_initial:j_initial + delta_j]
                # accumulate C_ij += A_ik B_kj
                C[i_initial:i_initial + delta_i, j_initial:j_initial + delta_j] += A_block.dot(B_block)

    return C

In [None]:
def testAccuracyOfblockedMatrixMultiplication():
    for i in range(10):
        # random matrix dimensions
        innerSize = np.random.randint(1, 10)
        numRows = np.random.randint(1, 10)
        numCols = np.random.randint(1, 10)
        A = np.random.rand(numRows, innerSize)
        B = np.random.rand(innerSize, numCols)
        # random block counts, at most one block per row or column
        n = np.random.randint(1, numRows + 1)
        m = np.random.randint(1, numCols + 1)
        p = np.random.randint(1, innerSize + 1)
        x = blockedMatrixMultiplication(A, B, n, m, p)
        y = A.dot(B)
        np.testing.assert_almost_equal(x, y, 10)

In [None]:
start_time = time.time()
testAccuracyOfblockedMatrixMultiplication()
print("--- test ran in %s seconds ---" % (time.time() - start_time))

--- test ran in 0.008306741714477539 seconds ---


#**Results**
All the methods above were verified with the unit tests placed immediately after each function. The tests follow the test cases suggested in the assignment and check the results to varying degrees of precision.