
# **Lab 1: Matrix algorithms**
**Max Bergmark**

# **Abstract**

The objective of this lab is to implement algorithms for the inner product, the matrix-vector product, and the matrix-matrix product. The bonus assignment involves designing a class for the CRS representation of sparse matrices, and implementing matrix-vector multiplication using that class.

# **About the code**

I (Max Bergmark) am the author of the code in its entirety. Some help was taken from [StackOverflow](https://stackoverflow.com/) and from the [numpy documentation](https://docs.scipy.org/doc/).

In [0]:
"""This program is a template for lab reports in the course"""
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Copyright (C) 2019 Johan Hoffman (jhoffman@kth.se)

# This file is part of the course DD2363 Methods in Scientific Computing
# KTH Royal Institute of Technology, Stockholm, Sweden
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This template is maintained by Johan Hoffman
# Please report problems to jhoffman@kth.se


# **Set up environment**

To have access to the necessary modules, you have to run this cell. If you need additional modules, this is where you add them.

In [0]:
# Load necessary modules.
from google.colab import files

import time
import numpy as np

# **Introduction**

To complete the assignments in this class, we will represent our data using numpy arrays. However, since the main task of this report is to implement algorithms for matrix multiplication, we will not use `np.dot`, numpy's own method for matrix multiplication, in the implementations themselves. Aside from that, I haven't restricted usage of numpy methods for the assignments included in this report.

One important distinction is that numpy does not store orientation for its 1D arrays. For this report, this implies that we will not make a distinction between $1\times n$ vectors and $n\times 1$ vectors. If such a distinction is necessary, I would advise you to create the vectors as 2D matrices, where one of the dimensions has size 1. This will yield identical results.
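As a small illustration of the point about orientation, the sketch below compares a 1D array with explicit row and column vectors. The variable names are chosen here for illustration only.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # 1D array: shape (3,), no orientation
row = x.reshape(1, -1)          # explicit 1 x 3 row vector
col = x.reshape(-1, 1)          # explicit 3 x 1 column vector

# Transposing a 1D array changes nothing, while a 2D vector does change shape.
assert x.T.shape == (3,)
assert row.T.shape == (3, 1)

# The inner product of the 1D array with itself is a scalar.
inner = (x * x).sum()

# With explicit orientation, broadcasting col against row gives the outer product.
outer = col * row
```

The 1D form is what the functions in this report expect; the 2D forms are only needed when the orientation itself carries meaning.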

# **Methods**

## 1: Inner product

The first task is to implement the inner product of two vectors. When the two vectors have equal length, the inner product is defined as $(x, y) = \sum_i x_iy_i$, which is the sum of the elementwise product of both arrays. Fortunately, numpy provides us with simple syntax for implementing this efficiently.

For all parts of this assignment, I will use `np.dot` as the reference, and assert that my own implementation yields the same results up to floating-point rounding. I have designed the code as a test suite, where any error will throw an exception.

In [0]:
def inner_product(x, y):
	if not (isinstance(x, np.ndarray) and isinstance(y, np.ndarray)):
		raise TypeError("Both arguments must be numpy arrays")
	if x.size != y.size:
		raise ValueError("Vectors must be same length")
	return (x*y).sum()
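For comparison, the definition $(x, y) = \sum_i x_iy_i$ can also be written as an explicit loop. This is only a sketch to make the summation visible; `inner_product_loop` is a hypothetical helper and not part of the lab code.

```python
import numpy as np

def inner_product_loop(x, y):
    """Sum of elementwise products, written as an explicit loop."""
    if len(x) != len(y):
        raise ValueError("Vectors must be same length")
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# 1*4 + 2*5 + 3*6 = 32
result = inner_product_loop(x, y)
```

The vectorized form `(x*y).sum()` computes the same quantity while leaving the loop to numpy.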

## 2: Matrix-vector product

The matrix-vector product is a bit more complicated, but we can extend our inner product to calculate it. In essence, the matrix-vector product $Ax$ amounts to calculating the inner product $(a_i, x)$ for each row $a_i$ of $A$. With the addition of a for loop, this is easily done, and works as expected.

In [0]:
def inner_product_matrix_vector(A, x):
  if not (isinstance(A, np.ndarray) and isinstance(x, np.ndarray)):
    raise TypeError("Both arguments must be numpy arrays")
  if A.shape[1] != x.size:
    raise ValueError("Matrix dimensions are not compatible")
  
  b = np.zeros((A.shape[0],))
  for r in range(A.shape[0]):
    b[r] = (A[r,:]*x).sum()
  return b
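Written out, the row-wise view corresponds to the formula below, where $m$ and $n$ (the number of rows and columns of $A$) are notation introduced here for illustration:

$$b_i = (a_i, x) = \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, \dots, m.$$

A small worked case with a $2\times 2$ matrix:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 \\ 6 \end{pmatrix} = \begin{pmatrix} 1\cdot 5 + 2\cdot 6 \\ 3\cdot 5 + 4\cdot 6 \end{pmatrix} = \begin{pmatrix} 17 \\ 39 \end{pmatrix}.$$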

## 3: Matrix-matrix product

To calculate matrix-matrix products, we can use the same logic as above, but with one more loop to iterate over the column vectors in the right multiplicand. I have extended the function above to handle both matrix-vector multiplication and matrix-matrix multiplication.

In [0]:
def inner_product_matrix(A, B):
	if not (isinstance(A, np.ndarray) and isinstance(B, np.ndarray)):
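		raise TypeError("Both arguments must be numpy arrays")
	# the inner dimensions must agree, both for vectors and for matrices
	if A.shape[1] != B.shape[0]:
		raise ValueError("Matrix dimensions are not compatible")
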
	if B.ndim == 1:
		C = np.zeros((A.shape[0],))
		for r in range(A.shape[0]):
			C[r] = (A[r,:]*B).sum()
	else:
		C = np.zeros((A.shape[0], B.shape[1]))
		for c in range(B.shape[1]):
			for r in range(A.shape[0]):
				C[r, c] = (A[r,:]*B[:,c]).sum()
	return C
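To make every scalar operation visible, the same product can be sketched with three nested loops, one per index in $c_{ij} = \sum_k a_{ik}b_{kj}$. The function name `matmul_triple_loop` is hypothetical and only used for this illustration.

```python
import numpy as np

def matmul_triple_loop(A, B):
    """Matrix-matrix product with one loop per index (i, j, k)."""
    m, n = A.shape
    n2, p = B.shape
    if n != n2:
        raise ValueError("Matrix dimensions are not compatible")
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Expected by hand: [[1*5+2*7, 1*6+2*8], [3*5+4*7, 3*6+4*8]] = [[19, 22], [43, 50]]
C = matmul_triple_loop(A, B)
```

The innermost loop is exactly the inner product of row $i$ of $A$ with column $j$ of $B$, which is what the slice expression `(A[r,:]*B[:,c]).sum()` computes in one step.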

## Bonus 1: CRS class

The CRS class is a representation of a matrix using three vectors containing the non-zero elements, the column indices of said elements, and the indices of the previous two arrays where a new row starts. 

Due to some ambiguity in the specification of the format, I have concluded that in addition to this, we also need to store the dimensions of the original matrix.
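As an illustration of the three arrays, the sketch below builds them for the $6\times 6$ matrix that is used in the tests later in this report. It assumes the common zero-indexed variant in which `row_ptr` has one extra final entry equal to the number of non-zeros, which differs from the class below (that class stores one entry per row plus the matrix shape). `dense_to_crs` is a hypothetical helper written in plain Python.

```python
def dense_to_crs(rows):
    """Build (val, col_idx, row_ptr) from a list of rows, zero-indexed."""
    val, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, a in enumerate(row):
            if a != 0:
                val.append(a)
                col_idx.append(j)
        # after each row, record where the next row starts in val
        row_ptr.append(len(val))
    return val, col_idx, row_ptr

matrix = [
    [3, 2, 0, 2, 0, 0],
    [0, 2, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 3, 2, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 2, 3],
]

val, col_idx, row_ptr = dense_to_crs(matrix)
# val     -> [3, 2, 2, 2, 1, 1, 3, 2, 1, 2, 3]
# col_idx -> [0, 1, 3, 1, 2, 2, 2, 3, 4, 4, 5]
# row_ptr -> [0, 3, 5, 6, 8, 9, 11]
```

Even in this variant the number of columns cannot be recovered from the three arrays alone, which is consistent with the need to store the dimensions separately.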

In [15]:
class CRS:

	def __init__(self, A):
		self.one_indexed = False
		self.index_dtype = np.dtype("int64")
		self.make_CRS(A)

	def make_CRS(self, A):
		self.shape = A.shape
		self.dtype = A.dtype
		self.calc_val(A)
		self.calc_col_idx(A)
		self.calc_row_ptr(A)
		# self.make_one_indexed()

	@property
	def val(self):
		return self._val
	
	@property
	def col_idx(self):
		return self._col_idx + self.one_indexed

	@property
	def row_ptr(self):
		return self._row_ptr + self.one_indexed

	def calc_val(self, A):
		# keep every non-zero entry (negative values included), in row-major order
		self._val = A[A != 0].flatten()

	def calc_col_idx(self, A):
		# generate matrix the same size as A, where a_ij = j
		col_idx = np.tile(np.arange(A.shape[1]), (A.shape[0], 1))
		self._col_idx = col_idx[A != 0].flatten()

	def calc_row_ptr(self, A):
		"""Calculates the values of the row_ptr array in the CRS"""
		# generate matrix the same size as A, where a_ij = i
		row_idx = np.tile(
			np.arange(A.shape[0], dtype = self.index_dtype),
			(A.shape[1], 1)
		).T
		# extract the row indices where A is non-zero
		row_indices = row_idx[A != 0].flatten()
		# the differences of row_indices indicate where a new row begins
		diffs = np.diff(row_indices)
		# to correctly handle empty rows, we must use this
		reverse_bincount = np.repeat(np.arange(diffs.size), diffs)
		# count the non-zeros per row, so that values cancelling out in a
		# plain row sum cannot be mistaken for an empty row
		row_counts = (A != 0).sum(axis = 1)
		row_cumsum = row_counts.cumsum()
		empty_top_rows = (row_cumsum == 0).sum()

		# populate the row_ptr array
		self._row_ptr = np.zeros(self.shape[0], dtype = self.index_dtype)
		start_index = empty_top_rows + 1
		end_index = reverse_bincount.size + 1 + empty_top_rows
		self._row_ptr[start_index:end_index] = reverse_bincount + 1
		# make sure that empty rows at the end are correctly reconstructed
		self._row_ptr[end_index:] = -1
		# self._row_ptr = np.array([0, 0, 3, -1])

	def make_one_indexed(self):
		"""Transform col_idx and row_ptr to use 1-indexing in output, 
		but not in the internal state"""
		self.one_indexed = True

	def print_stats(self):
		size = (self._val.size * self.dtype.itemsize 
			+ self._col_idx.size * self.index_dtype.itemsize
			+ self._row_ptr.size * self.index_dtype.itemsize)
		original_size = self.dtype.itemsize * self.shape[0] * self.shape[1]
		print("Space needed: %d bytes" % size)
		print("Original matrix size: %d bytes" % original_size)
		print("Compression ratio: %.1f%%" % (100*(1 - size / original_size),))

	def reconstruct(self):
		A = np.zeros(self.shape, dtype = self.dtype)
		row_starts = np.zeros(self._val.size, dtype = self.index_dtype)
		bbins = np.bincount(self._row_ptr[self._row_ptr >= 0])
		row_starts[:bbins.size] += bbins
		row_idx = row_starts.cumsum() - 1
		A[row_idx, self._col_idx] = self.val
		return A

	def __str__(self):
		return str(self.reconstruct())

	def __repr__(self):
		# __repr__ must return a string, not an array
		return repr(self.reconstruct())

	def __mul__(self, x):
		return CRS.multiply(self, x)

	@staticmethod
	def multiply_slow(A, B):
		if isinstance(A, CRS):
			A = A.reconstruct()
		if isinstance(B, CRS):
			B = B.reconstruct()
		return np.dot(A, B)

	@staticmethod
	def multiply(A, B):
		if isinstance(A, CRS):
			row_starts = np.zeros(A._val.size, dtype = np.int64)
			bbins = np.bincount(A._row_ptr[A._row_ptr >= 0])
			row_starts[:bbins.size] += bbins
			row_idx = row_starts.cumsum() - 1
			res = np.zeros(A.shape[0])
			scalar_product_pairs = B[A._col_idx] * A._val
			np.add.at(res, row_idx, scalar_product_pairs)
			return res
		elif isinstance(B, CRS):
			row_starts = np.zeros(B._val.size, dtype = np.int64)
			bbins = np.bincount(B._row_ptr[B._row_ptr >= 0])
			row_starts[:bbins.size] += bbins
			row_idx = row_starts.cumsum() - 1
			res = np.zeros(B.shape[1])
			scalar_product_pairs = A[row_idx] * B._val
			np.add.at(res, B._col_idx, scalar_product_pairs)
			return res

		return np.dot(A, B)



## Bonus 2: CRS matrix-vector product

To implement the matrix-vector product using the CRS format, we could simply ask the CRS class to reconstruct the matrix, and then use the reconstruction with our previous methods for multiplying matrices and vectors. However, we should use the fact that our matrix is sparse during the multiplication to improve performance.

To see the implementation used, see the `multiply` method in the `CRS` class. It is not designed to handle matrix-matrix multiplication using CRS, but should work when one of the multiplicands is in CRS format, and the other one is a 1-dimensional numpy array of correct size.
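The idea of exploiting sparsity can be sketched with an explicit loop that only touches the stored non-zeros. This is not the vectorized `multiply` method of the `CRS` class; it is a plain-Python sketch that assumes the common zero-indexed layout in which `row_ptr` has a final entry equal to the number of non-zeros. `crs_matvec` is a hypothetical name.

```python
def crs_matvec(val, col_idx, row_ptr, x):
    """Compute b = A x, where A is given by zero-indexed CRS arrays."""
    m = len(row_ptr) - 1
    b = [0.0] * m
    for i in range(m):
        # the entries of row i are stored in val[row_ptr[i]:row_ptr[i + 1]]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            b[i] += val[k] * x[col_idx[k]]
    return b

# CRS arrays of the 6 x 6 matrix used in the tests of this report
val = [3, 2, 2, 2, 1, 1, 3, 2, 1, 2, 3]
col_idx = [0, 1, 3, 1, 2, 2, 2, 3, 4, 4, 5]
row_ptr = [0, 3, 5, 6, 8, 9, 11]
x = [1, 2, 3, 4, 5, 6]

b = crs_matvec(val, col_idx, row_ptr, x)
# b -> [15.0, 7.0, 3.0, 17.0, 5.0, 28.0]
```

The work is proportional to the number of non-zeros rather than to the full size of the matrix, which is where a sparse format can gain over a dense product.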

# **Results**

## 1: Inner product

In [23]:
def test_inner_product():
	x = np.random.rand(5)
	y = np.random.rand(5)
	true_value = np.dot(x, y)
	np_test_value = inner_product(x, y)
	# compare with a tolerance, since the summation order may differ from np.dot
	assert np.isclose(true_value, np_test_value)
  
test_inner_product()
print("Test passed!")

Test passed!


## 2: Matrix-vector product

In [24]:
def test_matrix_vector_product():
	A = np.random.rand(5, 3)
	x = np.random.rand(3)
	true_value = np.dot(A, x)
	b = inner_product_matrix(A, x)
	assert np.allclose(true_value, b)

test_matrix_vector_product()
print("Test passed!")

Test passed!


## 3: Matrix-matrix product

In [25]:
def test_matrix_matrix_product():
	A = np.random.rand(5, 3)
	B = np.random.rand(3, 4)
	true_value = np.dot(A, B)
	C = inner_product_matrix(A, B)
	assert np.allclose(true_value, C)

test_matrix_matrix_product()
print("Test passed!")

Test passed!


## Bonus 1: CRS class

In [28]:
def test_known_CRS():
	sparse_matrix = np.array([
	 [3, 2, 0, 2, 0, 0],
	 [0, 2, 1, 0, 0, 0],
	 [0, 0, 1, 0, 0, 0],
	 [0, 0, 3, 2, 0, 0],
	 [0, 0, 0, 0, 1, 0],
	 [0, 0, 0, 0, 2, 3]])

	A_CRS = CRS(sparse_matrix)
	A_CRS.make_one_indexed()
	# A_CRS.print_stats()
	assert np.array_equal(A_CRS.val, [3, 2, 2, 2, 1, 1, 3, 2, 1, 2, 3])
	assert np.array_equal(A_CRS.col_idx, [1, 2, 4, 2, 3, 3, 3, 4, 5, 5, 6])
	assert np.array_equal(A_CRS.row_ptr, [1, 4, 6, 7, 9, 10])
	assert np.array_equal(A_CRS.reconstruct(), sparse_matrix)


def test_large_CRS(m, n):

	sparse_matrix = np.zeros((m, n))
	# create a tridiagonal matrix with random integers
	np.fill_diagonal(sparse_matrix, np.random.randint(0, 3, m))
	np.fill_diagonal(sparse_matrix[:,1:], np.random.randint(0, 3, m-1))
	np.fill_diagonal(sparse_matrix[1:,:], np.random.randint(0, 3, m-1))

	A_CRS = CRS(sparse_matrix)

	# assert that it can be properly reconstructed from its representation
	assert np.array_equal(A_CRS.reconstruct(), sparse_matrix)


def test_CRS_matrix():
	test_known_CRS()
	for m in range(2, 50):
		for n in range(2, 50):
			test_large_CRS(m, n)
      
test_CRS_matrix()
print("Test passed!")

Test passed!


## Bonus 2: CRS matrix-vector product

In [29]:
def test_CRS_product_known():
	sparse_matrix = np.array([
	 [3, 2, 0, 2, 0, 0],
	 [0, 2, 1, 0, 0, 0],
	 [0, 0, 1, 0, 0, 0],
	 [0, 0, 3, 2, 0, 0],
	 [0, 0, 0, 0, 1, 0],
	 [0, 0, 0, 0, 2, 3]])

	A_CRS = CRS(sparse_matrix)
	# time.clock() was removed in Python 3.8; perf_counter() replaces it
	t0 = time.perf_counter()
	true_right_val = np.dot(sparse_matrix, [1, 2, 3, 4, 5, 6])
	true_left_val = np.dot([1, 2, 3, 4, 5, 6], sparse_matrix)
	t1 = time.perf_counter()
	right_val = A_CRS * np.array([1, 2, 3, 4, 5, 6])
	left_val = CRS.multiply(np.array([1, 2, 3, 4, 5, 6]), A_CRS)
	t2 = time.perf_counter()

	assert np.array_equal(true_right_val, right_val)
	assert np.array_equal(true_left_val, left_val)
	# print((t1-t0)/(t2-t1))

def test_CRS_product_large(m, n):

	sparse_matrix = np.zeros((m, n))
	# create a tridiagonal matrix with random integers
	np.fill_diagonal(sparse_matrix, np.random.randint(0, 3, m))
	np.fill_diagonal(sparse_matrix[:,1:], np.random.randint(0, 3, m-1))
	np.fill_diagonal(sparse_matrix[1:,:], np.random.randint(0, 3, m-1))

	A_CRS = CRS(sparse_matrix)
	# time.clock() was removed in Python 3.8; perf_counter() replaces it
	t0 = time.perf_counter()
	left_mult = np.random.rand(m)
	right_mult = np.random.rand(n)
	true_right_val = np.dot(sparse_matrix, right_mult)
	true_left_val = np.dot(left_mult, sparse_matrix)
	t1 = time.perf_counter()
	right_val = A_CRS * right_mult
	left_val = CRS.multiply(left_mult, A_CRS)
	t2 = time.perf_counter()

	# print(true_right_val, right_val)
	assert np.allclose(true_right_val, right_val)
	assert np.allclose(true_left_val, left_val)
	# speedup = (t1-t0)/(t2-t1)
	# if speedup > 1:
		# print("%5dx%5d: %.2f" % (m, n, speedup))

def test_CRS_matrix_vector_product():
  test_CRS_product_known()
  array_dims = [2, 5, 10, 23, 50, 100, 200, 500, 1000]
  for m in array_dims:
    for n in array_dims:
      test_CRS_product_large(m, n)

test_CRS_matrix_vector_product()
print("Test passed!")

Test passed!


## Running all tests

If you want to verify the entire test suite, you can run the cell below.

In [32]:
def run_tests():
	test_inner_product()
	test_matrix_vector_product()
	test_matrix_matrix_product()
	test_CRS_matrix()
	test_CRS_matrix_vector_product()

run_tests()
print("All tests passed!")


All tests passed!


# **Discussion**

The results are as expected. There is room for improvement in the matrix-matrix product, since looping in Python is less efficient than using vectorized numpy methods. The CRS class also needed to store the dimensions of the matrix in addition to the three arrays described in the literature. In some informal benchmarking (the timing code is left commented out in the tests), the CRS class was able to perform matrix-vector multiplication faster than numpy for very large matrices.
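One possible direction for the loop-free matrix-matrix product mentioned above is to use broadcasting, still without calling `np.dot`. The sketch below is an assumption about how this could look, not something that was benchmarked for this report; `matmul_broadcast` is a hypothetical name.

```python
import numpy as np

def matmul_broadcast(A, B):
    """Matrix-matrix product via broadcasting instead of Python loops."""
    if A.shape[1] != B.shape[0]:
        raise ValueError("Matrix dimensions are not compatible")
    # A[:, :, None] has shape (m, n, 1) and B[None, :, :] has shape (1, n, p).
    # Their product has shape (m, n, p); summing over axis 1 sums over k.
    return (A[:, :, None] * B[None, :, :]).sum(axis=1)

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Expected by hand: [[1+3, 2+3], [4+6, 5+6]] = [[4, 5], [10, 11]]
C = matmul_broadcast(A, B)
```

The trade-off is memory: the intermediate array holds $m \cdot n \cdot p$ elements, so this form is only practical for moderately sized matrices.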