# **SVD Comparisons (new C++20 library vs. numpy.linalg.svd)**

## Team H
* Evan Ram
* Prateek Makhija
* James Douthit
* Garrett Hempy

## Introduction / methods

// todo

## Dependencies

We use a new library called Matrix, written in C++20 (the C++20 language version should be finalized later this month). It is written by Feng Wang and is available [on GitHub](https://github.com/fengwang/matrix). The library is distributed as a single, very large header file (see `./fengwang-matrix/matrix.hpp`).

We also depend on `numpy` for matrix operations in Python and on `pillow` (a Python image processing library) to load the matrices exported by the C++ library, which can be serialized to bitmap image files.

In [None]:
import sys

!{sys.executable} -m pip install numpy
!{sys.executable} -m pip install pillow

Feng Wang's Matrix library is difficult to compile on non-Linux machines because it needs a recent compiler with C++20 support, so we require [Docker](https://www.docker.com/) to be installed in order to compile and run our driver code.

The `FengWangSVD` class below runs the Docker commands for us so we can test the library's SVD implementation directly from Python.
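
Since everything below depends on Docker, a quick pre-flight check can save a confusing failure later. This is a minimal sketch, not part of the class; the helper name `docker_available` is our own, and it only checks that a `docker` executable is on the `PATH` and responds to `docker --version`, not that the daemon is running.

```python
import shutil
import subprocess

def docker_available():
    """Return True if a `docker` executable is on PATH and `docker --version` succeeds."""
    if shutil.which('docker') is None:
        return False
    try:
        subprocess.run(['docker', '--version'], check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return False
    return True

print('Docker found' if docker_available() else 'Docker not found; install it before building')
```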

In [7]:
import sys
import numpy as np
import os
import subprocess
from PIL import Image

class FengWangSVD:

    def __init__(self, A):
        """
        `A` should be a 2d numpy array
        """
        
        self._A = A
    
    @staticmethod
    def build():
        """
        Builds Feng Wang's Matrix library SVD tester using Docker.

        New C++20 features are not available on many machines,
        so please have Docker installed.
        """

        # Use bang command in Jupyter notebook since we don't care about this command's output.
        # The bang line is IPython syntax, so this method only works inside a notebook.
        !docker build -t fengwang-matrix-svd .

    def run(self):
        """
        Runs the C++ test program for the fengwang/matrix library, then
        parses its stdout into self.stats and loads the exported matrices.

        Not using a bang command because we want to process the output to get
        timings. Timing this method from start to end would be pointless since
        it would include the overhead of process creation.
        """

        self._write_matrix()
        # Build the argument list directly so that a working directory
        # containing spaces is not split into several arguments
        cmd = ['docker', 'run', '-v', f'{self._program_io_path()}:/program_io', 'fengwang-matrix-svd']
        proc_out = subprocess.check_output(cmd)
        lines = proc_out.decode('utf-8').split('\n')
        self._read_stats(lines)
        self._read_matrices()
    
    @property
    def stats(self):
        """
        A dict containing the stat results from running the program.
        """

        if not hasattr(self, '_stats'):
            raise Exception('Please call run() first')

        return self._stats
    
    def _program_io_path(self):
        path = os.path.join(os.getcwd(), 'fengwang-matrix/program_io')
        os.makedirs(path, exist_ok=True)
        return path
    
    def _write_matrix(self):
        """
        Write matrix A as input data to the program.
        """
        
        path = os.path.join(self._program_io_path(), 'input.npy')
        with open(path, 'wb') as f:
            np.save(f, self._A)
    
    def _read_stats(self, stdout_lines):
        """
        Parse stdout for key/value pairs and assign it to self._stats
        """
        
        stats = {}
        start_processing = False
        
        for line in stdout_lines:
            # Most lines end with carriage return '\r' for some reason
            line = line.strip()
            
            if line == '!!BEGIN-STATS!!':
                # Marker line printed by the program (sanity check that it starts up);
                # the key/value stats follow it
                start_processing = True
                continue
            elif not start_processing:
                # Program spits out diagnostic data on `load_npy`, can't seem to disable it...
                continue
    
            if len(line) == 0:
                # Blank line
                continue
        
            if ':=' not in line:
                # Not a key/value pair; skip it instead of failing on unpacking
                continue
        
            k, v = line.split(':=', 1)
            k = k.lower().strip()
            v = v.strip()
            
            # Output numbers should all be integers, to avoid differences in
            # floating point arithmetic between Python and C++ tests
            if v.isdigit():
                v = int(v)
                
            stats[k] = v
            
        self._stats = stats
        
    def _read_matrices(self):
        """
        Read in the matrices from the generated .bmp files the program created.
        Will populate self.{U, S, V, A_prime}
        """
        
        matrices = ['U', 'S', 'V', 'A_prime']
        
        for m in matrices:
            path = os.path.join(self._program_io_path(), m + '.bmp')
            img = Image.open(path).convert('L') # 'L' is Pillow's 8-bit grayscale mode
            
            # Rescale b/c the .bmp format maps 0.0-1.0 onto integers 0-255
            setattr(self, m, np.array(img) / 255)
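
To illustrate what `_read_stats` expects, here is a standalone sketch of the same parsing applied to a made-up stdout. The `!!BEGIN-STATS!!` marker, the `:=` separator, and the two keys come from this notebook; the function name `parse_stats`, the diagnostic line, and the exact line layout of the sample are assumptions for illustration only.

```python
def parse_stats(stdout_lines):
    """Collect `key := value` pairs that appear after the !!BEGIN-STATS!! marker."""
    stats = {}
    started = False
    for line in stdout_lines:
        line = line.strip()  # also removes a trailing '\r'
        if line == '!!BEGIN-STATS!!':
            started = True
            continue
        if not started or not line or ':=' not in line:
            continue
        key, value = line.split(':=', 1)
        key, value = key.lower().strip(), value.strip()
        # Integer values are converted; anything else stays a string
        stats[key] = int(value) if value.isdigit() else value
    return stats

# Hypothetical program output, using the timings reported later in this notebook
sample = [
    'some diagnostic output\r',
    '!!BEGIN-STATS!!\r',
    'PERFORMANCE-SVD-US := 28051\r',
    'PERFORMANCE-APRIME-US := 1703\r',
    '',
]
print(parse_stats(sample))
# {'performance-svd-us': 28051, 'performance-aprime-us': 1703}
```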

Run the next cell to build the Docker image, which compiles our C++ driver executable for Feng Wang's Matrix SVD function.

In [22]:
FengWangSVD.build()

Sending build context to Docker daemon    852kB
Step 1/5 : FROM gcc:latest
 ---> 2f9778ee181e
Step 2/5 : COPY ./fengwang-matrix /app
 ---> 2977eae962e0
Step 3/5 : WORKDIR /app
 ---> Running in 4d2fefe930a6
Removing intermediate container 4d2fefe930a6
 ---> e43542da5601
Step 4/5 : RUN make
 ---> Running in c4e3df7248ba
g++ -std=c++2a -Wall -Wextra -O2 -pthread -o svdimage main.cpp -lstdc++fs
Removing intermediate container c4e3df7248ba
 ---> a72cb2b982a3
Step 5/5 : CMD ["./svdimage"]
 ---> Running in f0e3e98d5474
Removing intermediate container f0e3e98d5474
 ---> d10624dffb5a
Successfully built d10624dffb5a
Successfully tagged fengwang-matrix-svd:latest


Let's quickly check that it works.

In [26]:
A = np.random.rand(100, 100)
fw_svd = FengWangSVD(A)
fw_svd.run()

print(f'Time to run SVD algo: {fw_svd.stats["performance-svd-us"]}μs')
print(f'\nTime to construct A\' (U x S x V^T): {fw_svd.stats["performance-aprime-us"]}μs')

# Fairly large residuals (precision is lost when U, S, and V are each exported to .bmp)
print('\nResiduals between A and the product U x S x V^T')
print(abs(A - fw_svd.U @ fw_svd.S @ fw_svd.V.T))

# Smaller residuals
print('\nResiduals between A and A\' (reconstructed matrix from Feng Wang\'s SVD)')
print(abs(A - fw_svd.A_prime))

Time to run SVD algo: 28051μs

Time to construct A' (U x S x V^T): 1703μs

Residuals between A and the product U x S x V^T
[[0.36477509 0.89807084 0.7337559  ... 0.71769889 0.5381907  0.17403427]
 [0.1653281  0.1707974  0.39098358 ... 0.68588986 0.66650473 0.21615966]
 [0.13831298 0.58688824 0.83848003 ... 0.86291946 0.57550277 0.50883232]
 ...
 [0.86176018 0.65443783 0.13561682 ... 0.5389588  0.27595597 0.24165883]
 [0.76069382 1.05043579 0.63956324 ... 0.92611838 1.1059393  0.59858114]
 [0.54872033 0.43613208 0.4618965  ... 0.51274831 0.94788842 0.03744726]]

Residuals between A and A' (reconstructed matrix from Feng Wang's SVD)
[[0.00343328 0.00277019 0.00066829 ... 0.00200318 0.00137216 0.00120632]
 [0.0017607  0.0014698  0.00334747 ... 0.00016627 0.00090162 0.00068852]
 [0.00232783 0.0030515  0.00026181 ... 0.00285753 0.00059867 0.00024627]
 ...
 [0.0004381  0.00081573 0.00128166 ... 0.00365857 0.00194601 0.00158131]
 [0.00012939 0.00090477 0.00384661 ... 0.00086427 0.00319604 0.0

## Breaking it

// todo

The product of the matrices exported as `{U,S,V}.bmp` loses a lot of precision unless the product U x S x V^T is computed before dumping to BMP (as is done for `A_prime.bmp`). Also note the bug where an integer matrix crashes the library's loader code.
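
The precision loss can be reproduced without the C++ library. The sketch below uses NumPy's own SVD as a stand-in and a hypothetical `quantize_8bit` helper that clips values to [0, 1] and keeps 256 gray levels. Clipping and truncation are assumptions on our part; the library's exporter may scale values differently. The underlying point does not depend on that choice: `A` fits in [0, 1), but `U` and `V` contain negative entries and the singular values exceed 1, so the factors cannot be stored in a 0-255 grayscale image without extra scale information.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 100))

# Reference decomposition from NumPy (not Feng Wang's library)
U, s, Vt = np.linalg.svd(A)

def quantize_8bit(M):
    """Hypothetical stand-in for a .bmp round trip: clip to [0, 1], keep 256 levels."""
    return np.floor(np.clip(M, 0.0, 1.0) * 255) / 255

# A itself lies in [0, 1), so quantizing it costs less than one gray level (1/255)
err_A = np.abs(A - quantize_8bit(A)).max()

# The factors do not fit in [0, 1], so their quantized product is far from A
product = quantize_8bit(U) @ np.diag(quantize_8bit(s)) @ quantize_8bit(Vt)
err_product = np.abs(A - product).max()

print(f'min(U) = {U.min():.3f}, max(s) = {s.max():.3f}')
print(f'max error, quantized A: {err_A:.5f}')
print(f'max error, product of quantized factors: {err_product:.5f}')
```

Under these assumptions the error on the quantized `A` stays below 1/255, while the error on the product of quantized factors is orders of magnitude larger.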

## Comparison with np.linalg.svd

// todo: run test of same matrices and timings for numpy's svd
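
One possible shape for the NumPy side of the comparison is sketched below. The helper name `time_numpy_svd` is hypothetical, and reusing the stat keys `performance-svd-us` and `performance-aprime-us` is our choice so the results line up with `FengWangSVD.stats`. Timings are reported as whole microseconds, matching the integer-only convention of the C++ driver. Note that `np.linalg.svd` returns V^T directly, so no extra transpose is needed in the reconstruction.

```python
import time
import numpy as np

def time_numpy_svd(A):
    """Time np.linalg.svd and the reconstruction separately, in whole microseconds."""
    t0 = time.perf_counter_ns()
    # full_matrices=False keeps the shapes compatible for non-square A as well
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    t1 = time.perf_counter_ns()
    # Scaling the columns of U by s is equivalent to U @ diag(s)
    A_prime = (U * s) @ Vt
    t2 = time.perf_counter_ns()
    stats = {
        'performance-svd-us': (t1 - t0) // 1000,
        'performance-aprime-us': (t2 - t1) // 1000,
    }
    return stats, A_prime

A = np.random.rand(100, 100)
np_stats, np_A_prime = time_numpy_svd(A)
print(f'Time to run SVD algo: {np_stats["performance-svd-us"]}μs')
print(f'Time to construct A\': {np_stats["performance-aprime-us"]}μs')
print(f'Largest residual: {np.abs(A - np_A_prime).max():.2e}')
```

A single run like this is noisy; for the actual comparison, repeating each measurement over the same matrices and taking the median would give more stable numbers.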

## Results and interpretation

## Conclusions and open questions