
# **Lab 2: matrix factorization**
**Péter Kovács**

# **Abstract**

This report implements problems 1-3 and 5 from the homework assignment. The problems are straightforward after reading the lecture slides, so the textual explanation is kept to a minimum.

# **About the code**

This notebook was written by Péter Kovács and is distributed under the GNU Lesser General Public License, version 3 or later, as stated in the header below.

In [38]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2022 Péter Kovács (ptkovacs@kth.se)

# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This template is maintained by Johan Hoffman
# Please report problems to jhoffman@kth.se

'KTH Royal Institute of Technology, Stockholm, Sweden.'

# **Set up environment**

In [39]:
# Load necessary modules.
from google.colab import files
import numpy as np


# **Introduction**

In this submission, I solve some standard problems in linear algebra. Since this only involves implementing well-known concepts, I keep the explanatory text to the necessary minimum.

In most cases, I rely on the numpy package for Python, which is designed to work with arrays of any dimension (vectors, matrices, etc.).




# **Method**

The solution is organized by problem statement. The tests for each problem are placed directly after the corresponding code.

### 1) sparse matrix-vector product with CRS notation

In this part, I implement the CRS (compressed row storage) routine covered in lecture.

The method is conceptually simple and was discussed in detail in lecture, so I do not describe it at length here. The variable names are chosen to be self-explanatory.
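For readers less familiar with the format, the following is a minimal sketch of how the three CRS arrays could be built from a dense matrix. The helper `to_crs` and the matrix `A_small` are hypothetical names introduced only for this illustration and are not part of the assignment; the 1-based indexing is assumed to match the convention used in `sparse_product`.

```python
import numpy as np

def to_crs(A_dense):
  """
  build the three CRS arrays (1-based indexing) from a dense 2D numpy array
  """
  entries, column_indices, row_pointers = [], [], [1]
  for row in A_dense:
    for j, value in enumerate(row):
      if value != 0.:
        entries.append(value)
        column_indices.append(j + 1)  # 1-based column index
    row_pointers.append(len(entries) + 1)  # 1-based position where the next row starts
  return np.array(entries, float), np.array(column_indices, int), np.array(row_pointers, int)

A_small = np.array([[1., 0., 2.],
                    [0., 0., 0.],
                    [0., 3., 0.]])
entries, column_indices, row_pointers = to_crs(A_small)
print(entries)         # [1. 2. 3.]
print(column_indices)  # [1 3 2]
print(row_pointers)    # [1 3 3 4]
```

An empty row shows up as two equal consecutive row pointers (here the second and third entries), so the inner loop over that row simply does not execute.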

In [40]:

def sparse_product(A_entries,A_column_indices,A_row_pointers,b):
  """
  here, A is given in CRS format (three arrays) and b as an ordinary vector; all are numpy arrays

  to follow mathematical convention, the CRS indices start from 1!
  """

  r=A_row_pointers.shape[0]-1 # number of rows

  x=np.zeros(r,float)

  for i in range(r):
    for j in range(A_row_pointers[i]-1,A_row_pointers[i+1]-1):
      x[i]+=A_entries[j]*b[A_column_indices[j]-1]

  return x


In [41]:
def to_dense(A_entries,A_column_indices,A_row_pointers,col):
  '''
  auxiliary function to transform CRS to dense representation
  '''
  r=A_row_pointers.shape[0]-1 # number of rows
  c=col # CRS does not define the number of columns uniquely (zero padding possible)

  A_dense=np.zeros((r,c),float)

  for i in range(r):
    for j in range(A_row_pointers[i]-1,A_row_pointers[i+1]-1):
      A_dense[i,A_column_indices[j]-1]=A_entries[j]

  return A_dense




In [42]:
A_entries=np.array([1.,2.,3.,4.,5.,1.,9.])
A_column_indices=np.array([1,4,1,6,2,5,3])
A_row_pointers=np.array([1,3,4,4,5,8]) # note that the third row is empty
b=np.array([1.,-3.,0.,0.,0.,9.])

In [43]:
sparse_product(A_entries,A_column_indices,A_row_pointers,b)

array([  1.,   3.,   0.,  36., -15.])

In [44]:
A=to_dense(A_entries,A_column_indices,A_row_pointers,6)
print(A)

[[1. 0. 0. 2. 0. 0.]
 [3. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 4.]
 [0. 5. 9. 0. 1. 0.]]


In [45]:
A.dot(b)

array([  1.,   3.,   0.,  36., -15.])

The sparse product agrees with the dense product computed by numpy, so the verification against builtin functions is OK.

### 2) QR factorization with the Gram-Schmidt method

In this problem, I use the Gram-Schmidt method, one of the approaches covered in lecture.

The main idea is to construct an orthonormal basis from the columns of the input matrix $A$ such that $span(a_{*,1},a_{*,2},...,a_{*,i})=span(q_{*,1},q_{*,2},...,q_{*,i})$ for every $i$, in our usual notation. It follows that the expansion coefficients of the columns of $A$ in the new basis form an upper triangular matrix.

Following the notation of the lecture notes, we construct the projector $P_j$ as a product of projectors onto the orthogonal complements of the one-dimensional subspaces spanned by the individual basis vectors.
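Written out, one step of this construction can be sketched as follows. The symbols $v_j$ and $r_{ij}$ are names chosen here for illustration ($v_j$ matches the variable in the code below), and $P_0=I$ is the assumed starting projector:

$$
P_0 = I, \qquad v_j = P_{j-1}\, a_{*,j}, \qquad q_{*,j} = \frac{v_j}{\|v_j\|_2}, \qquad P_j = P_{j-1}\left(I - q_{*,j}\, q_{*,j}^T\right)
$$

$$
a_{*,j} = \sum_{i=1}^{j} r_{ij}\, q_{*,i}, \qquad r_{ij} = q_{*,i}^T\, a_{*,j}, \qquad r_{ij} = 0 \ \text{ for } i > j
$$

The second line is the upper triangular pattern mentioned above: column $j$ of $A$ only needs the first $j$ basis vectors, so $R=Q^TA$ has zeros below the diagonal.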

In [46]:
def next_basis_vector(a_j,P_j_1):
  '''
  computes the next basis vector from the latest projector and the next column of a 

  returns q_j and the new projector
  '''

  v_j=P_j_1.dot(a_j)

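  norm_v=np.linalg.norm(v_j)
  if norm_v==0.: # numpy division by zero returns nan/inf instead of raising, so test explicitly
    raise ZeroDivisionError('a_j is linearly dependent on the previous columns')

  q_j=v_j/norm_v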

  return q_j,P_j_1.dot(np.eye(P_j_1.shape[0],P_j_1.shape[0],0,float)-np.outer(q_j,q_j)) # construct the new projector and return it

In [47]:
def qr_gram_schmidt(A):
  '''
  input: A, an n by n matrix

  NOTE: we do not explicitly check that A is non-singular, since exact singularity is not a realistic event with floating point numbers
  for a singular A, the procedure would have to be extended so that the leftover q_j are chosen freely to complete an orthonormal system

  '''

  Q=np.zeros(A.shape,float)
  R=np.zeros(A.shape,float)
  P=np.eye(A.shape[0],A.shape[0],0,float)

  for i in range(A.shape[0]):
    Q[:,i],P=next_basis_vector(A[:,i],P)

  R=Q.T.dot(A)

  for i in range(A.shape[0]):
    for j in range(i):
      R[i,j]=0. # this is a nasty trick, I know, but makes life much easier and makes checking R being upper triangular redundant

  return Q,R

In [48]:
A=np.random.standard_normal((10,10))

In [49]:
Q,R=qr_gram_schmidt(A)

In [50]:
print('Frobenius norm of A: %f \n Frobenius norm of A-QR: %f \n Frobenius norm of Q: %f \n Frobenius norm of QQ$^T$-I: %f'
      % (np.linalg.norm(A,ord='fro'),
         np.linalg.norm(A-Q.dot(R),ord='fro'),
         np.linalg.norm(Q,ord='fro'),
         np.linalg.norm(Q.dot(Q.T)-np.eye(A.shape[0],A.shape[0],0,float),ord='fro')))

Frobenius norm of A: 10.139571 
 Frobenius norm of A-QR: 0.000000 
 Frobenius norm of Q: 3.162278 
 Frobenius norm of QQ$^T$-I: 0.000000


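Both residuals vanish to the printed precision, so $A=QR$ holds and $Q$ is orthogonal. The Frobenius norm of $Q$ equals $\sqrt{10}\approx 3.162278$, as expected for a 10 by 10 orthogonal matrix.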

### 3) direct solver for $Ax=b$

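We solve the system with the QR machinery set up above: with $A=QR$, the equation $Ax=b$ becomes $Rx=Q^Tb$, which is solved by backward substitution.

For convenience, we check with builtin functions whether A is full rank.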

In [51]:
def direct_solver(A,b):
  '''
  direct solver for Ax=b, assuming that A,b are numpy arrays with matching size

  if A is not full rank, print a message and return -1. The rank is checked with a numpy builtin, see the numpy docs for how it works
  '''

  n=b.shape[0]

  if np.linalg.matrix_rank(A)<n:
    print('singular matrix')
    return -1

  Q,R=qr_gram_schmidt(A)

  # inverting Q is easy, for R, we use algorithm 5.2 (backward substitution) to solve Rx=Q^T b

  return backward_substitution(R,Q.T.dot(b))



In [52]:
def backward_substitution(U,b):
  '''
  implements backward substitution pseudocode (algorithm 5.2) for upper triangular matrices
  '''

  n=U.shape[0]
  x=np.zeros(n,float)

  x[n-1]=b[n-1]/U[n-1,n-1]

  for i in range(n-1):
    s=0. # partial sum; named s to avoid shadowing the builtin sum
    for j in range(n-1-i,n):
      s+=U[n-2-i,j]*x[j]
    x[n-2-i]=(b[n-2-i]-s)/U[n-2-i,n-2-i]

  return x

In [53]:
A=np.random.standard_normal((10,10))
b=np.random.standard_normal(10)

In [54]:
x=direct_solver(A,b)
print(
    'Ax-b norm: %f \n x-y norm: %f' % (np.linalg.norm(A.dot(x)-b),np.linalg.norm(x-np.linalg.inv(A).dot(b)))
)

Ax-b norm: 0.000000 
 x-y norm: 0.000000


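Both the residual $Ax-b$ and the difference from the reference solution $y=A^{-1}b$, computed with `np.linalg.inv`, vanish to the printed precision.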
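The same chain of steps can be sketched in a self-contained form. This is only an illustration under stated assumptions: it swaps in `np.linalg.qr` for the Gram-Schmidt routine of this report, and `solve_via_qr`, `A_test`, `b_test` and `x_test` are hypothetical names introduced here.

```python
import numpy as np

def solve_via_qr(A, b):
  """
  solve Ax=b for a square, non-singular A via QR and backward substitution
  """
  Q, R = np.linalg.qr(A)  # A = QR with Q orthogonal and R upper triangular
  y = Q.T.dot(b)          # the inverse of Q is Q^T, so Rx = Q^T b
  n = R.shape[0]
  x = np.zeros(n, float)
  for i in range(n - 1, -1, -1):  # last row first
    x[i] = (y[i] - R[i, i+1:].dot(x[i+1:])) / R[i, i]
  return x

A_test = np.array([[2., 1.],
                   [1., 3.]])
b_test = np.array([3., 5.])
x_test = solve_via_qr(A_test, b_test)
print(np.allclose(A_test.dot(x_test), b_test))  # True
```

The inner loop of the backward substitution is written here as a dot product over the already computed entries of `x`, which is equivalent to the explicit sum used above.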

### 5) QR algorithm for eigenvalues

Here we use the standard QR algorithm discussed in lecture (algorithm 6.1) to approximate the eigenvalues and eigenvectors of a real, symmetric matrix A.

In [24]:

def qr_eival(A,n=100):
  '''
  algorithm that performs the QR routine on symmetric A and returns the matrix U with eigenvectors as columns and the array l with the corresponding eigenvalues (both approximate)

  the inner iteration is performed n times
  '''

  A_=np.array(A,copy=True)
  U=np.eye(A.shape[0],A.shape[0],0,float)


  for i in range(n):
    Q,R=qr_gram_schmidt(A_)
    A_=R.dot(Q)
    U=U.dot(Q)

  return U,np.array([A_[j,j] for j in range(A.shape[0])])


In [27]:
I=np.eye(10,10,0,float)

In [32]:
A=np.random.standard_normal((10,10))
S=A.dot(A.T)

The printed values below are $\det(S-\lambda_i I)$, which vanish for exact eigenvalues. The results are OK: with increasing iteration number, the precision also increases.

In [33]:
U,l=qr_eival(S,100)

In [34]:
for i in range(10):
  print(np.linalg.det(S-l[i]*I))

-0.0012939949350018237
0.09630334427441885
-1.2565623492483307
-1.0335075910868983
-1.744405378776008e-05
-1.566193712882744e-05
9.45932619969221e-08
1.2346683125765276e-08
2.3404555457802962e-08
1.0151069863874527e-06


In [35]:
U,l=qr_eival(S,1000)

In [36]:
for i in range(10):
  print(np.linalg.det(S-l[i]*I))

-0.0012939949350018237
2.743245202355759e-05
-1.1165817888056388e-05
-1.018319997819677e-06
-6.002351158191196e-07
-1.439826004998157e-07
9.45932619969221e-08
1.2346683125765276e-08
2.3404555457802962e-08
1.0151069863874527e-06


Testing the eigenvectors: the residual norms $\|Su_i-\lambda_i u_i\|$ are tiny, which indicates that the method indeed works.

In [37]:
for i in range(10):
  print(np.linalg.norm(S.dot(U[:,i])-l[i]*U[:,i]))

5.109535366184423e-14
6.514285039799014e-14
1.3691242250970083e-13
2.070379903241111e-14
2.483011995083922e-14
1.825663469143644e-14
1.8811012470629483e-14
3.311206726819265e-14
6.496444208540889e-14
2.5658787141431296e-13
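As an additional cross-check that is not part of the runs above, the approximate eigenvalues could also be compared directly with `np.linalg.eigvalsh`. The sketch below is an assumption-laden illustration: it uses `np.linalg.qr` instead of the Gram-Schmidt routine of this report, a small fixed matrix with well separated eigenvalues instead of the random `S`, and the hypothetical names `qr_eigen_sketch`, `S_small`, `U_small`, `l_small` and `residual`.

```python
import numpy as np

def qr_eigen_sketch(S, n_iter=200):
  """
  unshifted QR iteration on symmetric S, with np.linalg.qr as the factorization step
  returns U (approximate eigenvectors as columns) and the diagonal of the final iterate
  """
  A_k = np.array(S, dtype=float)  # work on a copy
  U = np.eye(S.shape[0])
  for _ in range(n_iter):
    Q, R = np.linalg.qr(A_k)
    A_k = R.dot(Q)  # similar to the previous iterate, so the eigenvalues are preserved
    U = U.dot(Q)    # accumulate the orthogonal transformations
  return U, np.diag(A_k).copy()

S_small = np.array([[4., 1., 0.],
                    [1., 3., 1.],
                    [0., 1., 2.]])  # exact eigenvalues: 3-sqrt(3), 3, 3+sqrt(3)
U_small, l_small = qr_eigen_sketch(S_small)

# np.linalg.eigvalsh returns the eigenvalues in ascending order
print(np.allclose(np.sort(l_small), np.linalg.eigvalsh(S_small)))  # True

# largest eigenpair residual ||S u_i - l_i u_i||
residual = max(np.linalg.norm(S_small.dot(U_small[:, i]) - l_small[i] * U_small[:, i])
               for i in range(3))
print(residual < 1e-10)  # True
```

Comparing eigenvalues directly is easier to interpret than the determinant, since $\det(S-\lambda I)$ is the product of all the differences between $\lambda$ and the exact eigenvalues and therefore depends on the scale of the whole spectrum.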



# **Summary, discussion**

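To sum up, all four methods work as expected: the CRS product matches the dense product, the QR factorization and the direct solver reproduce the reference results to the printed precision, and the QR algorithm yields eigenpairs with residual norms of order $10^{-13}$ or smaller. This is not surprising for such basic tasks.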