# **SVD Comparisons (new C++20 library vs. numpy.linalg.svd)**

## Team H
* Evan Ram
* Prateek Makhija
* James Douthit
* Garrett Hempy

## Introduction / methods

// todo

## Dependencies

We use a new library called Matrix, written in C++20 (the C++20 language version should be finalized later this month). It was written by Feng Wang and is available [on GitHub](https://github.com/fengwang/matrix). The library is distributed as a single, very large header file (see `./fengwang-matrix/matrix.hpp`).

We also depend on `numpy` for matrix operations in Python and on `pillow` (a Python image-processing library) for loading the matrices exported by the C++ library, which can be serialized to bitmap image files.

In [1]:
# `sys` must be imported first; otherwise IPython leaves `{sys.executable}`
# unexpanded and the shell reports "command not found"
import sys

!{sys.executable} -m pip install numpy
!{sys.executable} -m pip install pillow


Feng Wang's Matrix library is difficult to compile on non-Linux machines because it needs a compiler with C++20 support, so we require [Docker](https://www.docker.com/) to be installed in order to compile and run our driver code.
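
Since everything below depends on Docker, a quick check up front can save a confusing failure later. The following is a minimal sketch; the helper name `docker_available` is our own and not part of any library. It only checks that a `docker` executable is on the `PATH`, not that the Docker daemon is actually running.

```python
import shutil

def docker_available() -> bool:
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None

if not docker_available():
    print("Docker was not found on the PATH; build() and run() will fail.")
```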

The `FengWangSVD` class below runs the Docker commands for us so we can test the library's SVD implementation directly from Python.

In [64]:
import sys
import numpy as np
import os
import subprocess
from PIL import Image

class FengWangSVD:

    def __init__(self, A):
        """
        `A` should be a 2d numpy array
        """
        
        self._A = A
    
    @staticmethod
    def build():
        """
        Builds Feng Wang's Matrix library SVD tester using Docker.

        New C++20 features are not available on many machines,
        so please have Docker installed.
        """

        # Use bang command in Jupyter notebook since we don't care about this command's output
        !docker build -t fengwang-matrix-svd .
        
        # Technically we don't run any Python in this method
        pass

    def run(self):
        """
        Runs the C++ test program for the fengwang/matrix library.
        Parses its stdout into self.stats and loads the exported matrices
        (self.U, self.S, self.V, self.A_prime and their *_bmp variants).

        Not using bang command b/c we want to process the output to get timings.
        Timing from the start of this method to the end of it is pointless since
        it includes overhead of process creation.
        """

        self._write_matrix()
        # Build the argument list directly so a working directory containing
        # spaces is not split into separate arguments
        cmd = ['docker', 'run', '-v', f'{self._program_io_path()}:/program_io', 'fengwang-matrix-svd']
        proc_out = subprocess.check_output(cmd)
        lines = proc_out.decode('utf-8').split('\n')
        self._read_stats(lines)
        self._read_matrices()
    
    @property
    def stats(self):
        """
        A dict containing the stat results from running the program.
        """

        if not hasattr(self, '_stats'):
            raise Exception('Please call run() first')

        return self._stats
    
    def _program_io_path(self):
        path = os.path.join(os.getcwd(), 'fengwang-matrix', 'program_io')
        # exist_ok avoids a separate existence check
        os.makedirs(path, exist_ok=True)
        return path
    
    def _write_matrix(self):
        """
        Write matrix A as input data to the program.
        """
        
        path = os.path.join(self._program_io_path(), 'input.npy')
        with open(path, 'wb') as f:
            np.save(f, self._A)
    
    def _read_stats(self, stdout_lines):
        """
        Parse stdout for key/value pairs and assign it to self._stats
        """
        
        stats = {}
        start_processing = False
        
        for line in stdout_lines:
            # Most lines end with carriage return '\r' for some reason
            line = line.strip()
            
            if line == '!!BEGIN-STATS!!':
                # All stdout lines after this magic string will be parsed as K/V pairs
                start_processing = True
                continue
            elif not start_processing:
                # Program spits out diagnostic data on `load_npy`, cant seem to disable it...
                continue
    
            if len(line) == 0:
                # Blank line
                continue
        
            [k, v] = line.split(':=', 1)
            k = k.lower().strip()
            v = v.strip()
            
            # Output numbers should all be integers, to avoid differences in
            # floating point arithmetic between Python and C++ tests
            if v.isdigit():
                v = int(v)
                
            stats[k] = v
            
        self._stats = stats
        
    def _read_matrices(self):
        """
        Read in the matrices from the generated .bmp files the program created.
        Will populate self.{U, S, V, A_prime}
        """
        
        matrices = ['U', 'S', 'V', 'A_prime']
        
        for m in matrices:
            path = os.path.join(self._program_io_path(), m + '.bmp')
            img = Image.open(path).convert('L') # 'L' for grayscale ... weird const but ok
            
            # Rescale b/c the .bmp format maps 0.0-1.0 onto integers 0-255
            setattr(self, m + '_bmp', np.array(img) / 255)
        
        for m in matrices:
            path = os.path.join(self._program_io_path(), m + '.txt')
            with open(path, 'r') as myfile:
                data = myfile.read()
                setattr(self, m, self._txt_to_mat(data))
    
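    def _txt_to_mat(self, data):
        """
        Parse a tab-separated text dump into a 2d numpy array.
        Each row is expected to end with a trailing tab and the file with a
        trailing newline, so the last field of each row and the last row
        are dropped.
        """
        
        lines = data.split('\n')
        A = []
        
        for line in lines:
            A.append([float(x) for x in line.split('\t')[:-1]])
        
        return np.array(A[:-1])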
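
To make the stats format concrete, here is a standalone sketch of the parsing that `_read_stats` performs. The sample stdout text is made up for illustration (only the two key names and the `!!BEGIN-STATS!!` marker come from our driver), and the function name `parse_stats` is our own. Unlike the method above, this version skips lines without a `:=` separator instead of raising.

```python
def parse_stats(stdout_text):
    """Parse `key := value` lines that follow the !!BEGIN-STATS!! marker."""
    stats = {}
    seen_marker = False
    for raw in stdout_text.splitlines():
        line = raw.strip()
        if line == '!!BEGIN-STATS!!':
            # Everything after the marker is treated as key/value data
            seen_marker = True
            continue
        if not seen_marker or ':=' not in line:
            # Skip loader diagnostics, blank lines, and malformed lines
            continue
        key, value = line.split(':=', 1)
        key, value = key.lower().strip(), value.strip()
        stats[key] = int(value) if value.isdigit() else value
    return stats

# Hypothetical stdout: diagnostics first, then the stats block
sample = (
    "loader diagnostics...\n"
    "!!BEGIN-STATS!!\r\n"
    "performance-svd-us := 35640\r\n"
    "\r\n"
    "performance-aprime-us := 7538\r\n"
)
print(parse_stats(sample))
```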

Run the next cell to build the Docker image, which compiles our C++ driver executable for Feng Wang's Matrix SVD function.

In [38]:
FengWangSVD.build()

Sending build context to Docker daemon  1.099MB
Step 1/5 : FROM gcc:latest
 ---> 2f9778ee181e
Step 2/5 : COPY ./fengwang-matrix /app
 ---> 4e7e5df3b316
Step 3/5 : WORKDIR /app
 ---> Running in 8b8558775fd0
Removing intermediate container 8b8558775fd0
 ---> 1d2d1ad72572
Step 4/5 : RUN make
 ---> Running in 79ca1e04530d
g++ -std=c++2a -Wall -Wextra -O2 -pthread -o svdimage main.cpp -lstdc++fs
Removing intermediate container 79ca1e04530d
 ---> c5cd924ce038
Step 5/5 : CMD ["./svdimage"]
 ---> Running in 4fe04e6b7e5e
Removing intermediate container 4fe04e6b7e5e
 ---> 36f46328cf50
Successfully built 36f46328cf50
Successfully tagged fengwang-matrix-svd:latest


Let's quickly check that it works.

In [65]:
A = np.random.rand(100, 100) * 1000
fw_svd = FengWangSVD(A)
fw_svd.run()

print(f'Time to run SVD algo: {fw_svd.stats["performance-svd-us"]}μs')
print(f'\nTime to construct A\' (U x S x V^T): {fw_svd.stats["performance-aprime-us"]}μs')

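# Residuals from multiplying the text-exported U, S, V in Python
# (on the order of 1e-12 in the output below)
print('\nResiduals between A and the product U x S x V^T')
print(np.subtract(A, fw_svd.U @ fw_svd.S @ fw_svd.V.T))

# Residuals against A', the product computed inside the C++ program (similar magnitude)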
print('\nResiduals between A and A\' (reconstructed matrix from Feng Wang\'s SVD)')
print(np.subtract(A, fw_svd.A_prime))
      
print('\n\nThe .bmp file export however results in crazy high residuals and might be broken:')

print('\n(bmp) Residuals between A and the product U x S x V^T')
print(np.subtract(A, fw_svd.U_bmp @ fw_svd.S_bmp @ fw_svd.V_bmp.T))
      
print('\n(bmp) Residuals between A and A\' (reconstructed matrix from Feng Wang\'s SVD)')
print(np.subtract(A, fw_svd.A_prime_bmp))

Time to run SVD algo: 35640μs

Time to construct A' (U x S x V^T): 7538μs

Residuals between A and the product U x S x V^T
[[ 3.97903932e-13 -8.81072992e-13  8.52651283e-13 ... -1.53477231e-12
  -6.82121026e-13 -1.64845915e-12]
 [-5.68434189e-14 -1.47792889e-12 -5.68434189e-13 ...  0.00000000e+00
  -4.83169060e-13 -1.30739863e-12]
 [-1.13686838e-13 -1.93267624e-12  9.09494702e-13 ...  2.55795385e-13
   7.10542736e-15  1.13686838e-13]
 ...
 [ 2.04636308e-12  1.13686838e-13  9.94759830e-14 ... -2.27373675e-13
   3.41060513e-13  3.41060513e-13]
 [ 1.87583282e-12 -6.82121026e-13  2.27373675e-13 ... -2.84217094e-13
   4.54747351e-13 -1.70530257e-13]
 [-2.27373675e-13 -1.13686838e-13  5.68434189e-13 ... -1.10844667e-12
  -3.41060513e-13 -1.25055521e-12]]

Residuals between A and A' (reconstructed matrix from Feng Wang's SVD)
[[ 3.97903932e-13 -9.37916411e-13  8.52651283e-13 ... -1.64845915e-12
  -4.54747351e-13 -1.42108547e-12]
 [-5.68434189e-14 -1.36424205e-12 -6.82121026e-13 ...  0.0000000
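
Printing whole residual matrices makes it hard to compare runs at a glance. A minimal sketch of a scalar summary follows, assuming the maximum absolute entry and the relative Frobenius norm are the quantities of interest. The helper name `residual_summary` is our own, and numpy's SVD stands in for Feng Wang's so the snippet runs without Docker.

```python
import numpy as np

def residual_summary(A, A_reconstructed):
    """Return (max absolute residual, relative Frobenius-norm residual)."""
    R = A - A_reconstructed
    max_abs = float(np.max(np.abs(R)))
    # np.linalg.norm defaults to the Frobenius norm for 2d arrays
    rel_fro = float(np.linalg.norm(R) / np.linalg.norm(A))
    return max_abs, rel_fro

# Stand-in data with the same shape and scale as the cell above
rng = np.random.default_rng(0)
A_demo = rng.random((100, 100)) * 1000
U, s, Vt = np.linalg.svd(A_demo)

max_abs, rel_fro = residual_summary(A_demo, U @ np.diag(s) @ Vt)
print(f'max |residual| = {max_abs:.3e}, relative Frobenius residual = {rel_fro:.3e}')
```

With the class above, the same helper could be called as `residual_summary(A, fw_svd.A_prime)` after `fw_svd.run()`.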

## Breaking it

//todo
The product of the exported {U,S,V}.bmp matrices loses a lot of precision compared with computing the U x S x V^T product first and then dumping it to bmp (A_prime.bmp). Also note the bug where an integer matrix crashes the library's loader code.
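
One possible mechanism for the precision loss, which we have not verified against the library's export code, is that an 8-bit image can only store 256 levels, so each matrix has to be rescaled before export and its original offset and scale are lost on loading. The sketch below models that with a hypothetical `simulate_bmp_round_trip` helper (min-max scaling to 0..255, then dividing by 255 as `_read_matrices` does). It is an assumption about the format, not a description of what the library actually does.

```python
import numpy as np

def simulate_bmp_round_trip(M):
    """Hypothetical 8-bit export model: min-max scale to 0..255, round,
    then load back by dividing by 255. Assumes M is not constant."""
    lo, hi = M.min(), M.max()
    pixels = np.round((M - lo) / (hi - lo) * 255).astype(np.uint8)
    return pixels / 255

rng = np.random.default_rng(0)
A_demo = rng.random((100, 100)) * 1000
U, s, Vt = np.linalg.svd(A_demo)
S = np.diag(s)

# Case 1: round-trip each factor separately, then multiply.
# The sign, offset, and scale of every factor are lost, so the product
# has little to do with A.
product_of_round_trips = (simulate_bmp_round_trip(U)
                          @ simulate_bmp_round_trip(S)
                          @ simulate_bmp_round_trip(Vt.T).T)
err_factors = np.max(np.abs(A_demo - product_of_round_trips))

# Case 2: multiply first, then round-trip once. Compared against A on the
# same 0..1 scale, only the 8-bit rounding error remains.
A_norm = (A_demo - A_demo.min()) / (A_demo.max() - A_demo.min())
err_product = np.max(np.abs(A_norm - simulate_bmp_round_trip(U @ S @ Vt)))

print(f'factors exported separately: max error {err_factors:.3e} (in units of A)')
print(f'product exported once:       max error {err_product:.3e} (on a 0..1 scale)')
```

The two errors are on different scales, so they are not directly comparable, but under this model the first case cannot recover A at all while the second is limited only by the 8-bit rounding step.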

## Comparison with np.linalg.svd

We ignore Feng Wang's .bmp export in this comparison: either we are using it incorrectly, or the exported data really does differ substantially from the text export for a reason we have not identified.

In [97]:
np_U, np_S, np_Vt = np.linalg.svd(A)
# np.linalg.svd returns the singular values as a 1-D array,
# so build the diagonal matrix explicitly
np_S = np.diag(np_S)
np_A_prime = np_U @ np_S @ np_Vt

print('\nResiduals between A and A\' (reconstructed matrix from numpy\'s SVD)')
print(np.subtract(A, np_A_prime))

print('\nResiduals between Feng Wang\'s U, S, and V factors:')
print('\nU diff: ', np.subtract(np_U, fw_svd.U))
print('\nS diff: ', np.subtract(np_S, fw_svd.S))
print('\nV diff: ', np.subtract(np_Vt.T, fw_svd.V))

print('\nResiduals between Feng Wang\'s A\' and numpy\'s A\' (reconstruction of A):')
print('A\' diff: ')
print(abs(np_A_prime - fw_svd.A_prime))


Residuals between A and A' (reconstructed matrix from numpy's SVD)
[[-1.13686838e-13  2.58637556e-12  1.70530257e-12 ... -8.52651283e-13
  -1.47792889e-12 -4.54747351e-13]
 [-1.70530257e-13  1.81898940e-12  9.09494702e-13 ... -1.13686838e-12
  -1.30739863e-12 -3.41060513e-13]
 [-9.09494702e-13  2.38742359e-12  3.41060513e-13 ... -5.40012479e-13
  -1.63424829e-13 -1.93267624e-12]
 ...
 [-2.27373675e-13  1.02318154e-12  7.95807864e-13 ... -4.54747351e-13
   6.82121026e-13  4.54747351e-13]
 [-7.38964445e-13 -1.59161573e-12  3.41060513e-13 ... -7.95807864e-13
   1.02318154e-12  2.27373675e-13]
 [-1.13686838e-12 -8.52651283e-13  4.54747351e-13 ...  5.40012479e-13
   6.25277607e-13  7.95807864e-13]]

Residuals between Feng Wang's U, S, and V factors:

U diff:  [[ 1.94289029e-16  1.24900090e-15  1.91513472e-15 ...  7.00828284e-16
  -2.08166817e-16  9.41873075e-02]
 [-2.77555756e-17  2.52575738e-15  3.49026363e-15 ...  3.74700271e-16
   7.77156117e-16  1.86145569e-01]
 [-2.77555756e-17 -2.609
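
The large entries in the last column of the U diff may be a sign difference rather than an error: an SVD is only unique up to flipping the sign of a left singular vector together with its matching right singular vector (assuming distinct singular values). The sketch below shows one way to align signs before comparing factors. The helper `align_signs` is our own, and the "other library" is simulated by negating one column pair of numpy's result, so nothing here is taken from Feng Wang's actual output.

```python
import numpy as np

def align_signs(U_ref, U, V):
    """Flip column pairs of (U, V) so each column of U points the same way
    as the matching column of U_ref. Assumes distinct singular values
    listed in the same order in both factorizations."""
    # Sign of the inner product between matching columns (+1 or -1)
    signs = np.sign(np.sum(U_ref * U, axis=0))
    signs[signs == 0] = 1.0
    return U * signs, V * signs

rng = np.random.default_rng(0)
A_demo = rng.random((100, 100)) * 1000
U_ref, s, Vt = np.linalg.svd(A_demo)
V_ref = Vt.T

# Simulate another library that returns the last singular-vector pair negated
U_other, V_other = U_ref.copy(), V_ref.copy()
U_other[:, -1] *= -1
V_other[:, -1] *= -1

diff_before = np.max(np.abs(U_ref - U_other))
U_aligned, V_aligned = align_signs(U_ref, U_other, V_other)
diff_after = np.max(np.abs(U_ref - U_aligned))

print('max |U diff| before alignment:', diff_before)
print('max |U diff| after alignment: ', diff_after)
```

With the variables from the cell above, the call would look like `align_signs(np_U, fw_svd.U, fw_svd.V)`, after which the U and V diffs can be recomputed.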

## Results and interpretation

## Conclusions and open questions