# Vectron Benchmark
This notebook goes over all of the vectron tests presented in the article. Some runtimes may differ from those reported in the article, because the notebook runs inside a virtual container and the hardware configurations of the systems differ. The speedups, however, should remain comparable.
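
As a small illustration of comparing speedups rather than raw runtimes, the sketch below parses a few timing strings copied from the small integer dataset tables in this notebook and divides the codon time by the vectron time. The helper name `parse_seconds` is hypothetical and not part of the benchmark code.

```python
import re

def parse_seconds(text):
    """Extract the seconds value from a string such as 'Total:  took 0.398445s'."""
    match = re.search(r"took\s+([0-9]*\.?[0-9]+)s", text)
    if match is None:
        raise ValueError(f"no timing found in {text!r}")
    return float(match.group(1))

# Timings copied from the small integer dataset output of this notebook.
vectron_times = {
    "Levenshtein Distance": "Total:  took 0.398445s",
    "Longest Common Subsequence": "Total:  took 0.37403s",
    "Hamming Distance": "Total:  took 0.0854971s",
}
codon_times = {
    "Levenshtein Distance": "Total:  took 8.0801s",
    "Longest Common Subsequence": "Total:  took 9.78918s",
    "Hamming Distance": "Total:  took 8.2862s",
}

# Speedup = baseline runtime / vectron runtime.
speedups = {
    name: parse_seconds(codon_times[name]) / parse_seconds(vectron_times[name])
    for name in vectron_times
}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

On a different machine the absolute times change, so it is the ratios printed here that are worth comparing against the article.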

Also note that the input files (seqx.txt and seqy.txt) for codon, vectron, C++ and cuda all contain identical sequences. Because C++ and cuda handle pairs of sequences better, the sequences fed to codon and vectron are arranged into pairs before being fed to C++ and cuda. This preparation takes place in the build process of the vectron Dockerfile by running seq_modifier.python.

In [1]:
import subprocess
import os
from tabulate import tabulate
os.system("codon build /vectron/docker/experiments_docker/source/vectron.codon")

0

In [19]:
def compile(mode, src, vectron_path = ''):
    # Build one benchmark source file with the toolchain selected by `mode`.
    if mode == 'vectron':
        # vectron_path is the executable produced by the codon build in the first cell
        result = subprocess.run([vectron_path, '/codon-seq', '/vectron', src], capture_output=True, text=True)
    elif mode == 'codon':
        result = subprocess.run(['codon', 'build', '-plugin', '/codon-seq', src, '-release'], capture_output=True, text=True)
    elif mode == 'cpp':
        # The executable is written next to the source file, without the extension
        result = subprocess.run([
            'clang++', '-O3', '-msse4.2', '-funroll-loops', '-mfpmath=sse', '-march=native',
            src, '-o', os.path.splitext(src)[0]
        ], capture_output=True, text=True)
    elif mode == 'cuda':
        result = subprocess.run([
            'nvcc', '-o', os.path.splitext(src)[0], src
        ], capture_output=True, text=True)

def exec(mode, src, ds_type):
    # C++ and cuda read the pre-paired copies of the data. In float mode both
    # use the 'cuda_*' directories; in integer mode the directory is named after the mode.
    if mode in ('cpp', 'cuda'):
        prefix = 'cuda' if 'float' in ds_type else mode
        ds_type = prefix + "_" + ds_type.split("_", 1)[1]
    seq_x = f'/vectron/docker/experiments_docker/data/{ds_type}/seqx.txt'
    seq_y = f'/vectron/docker/experiments_docker/data/{ds_type}/seqy.txt'
    # stdout is redirected to {mode}_out.txt; stderr is captured in result
    result = subprocess.run(f'{src} {seq_y} {seq_x} >{mode}_out.txt', capture_output=True, text=True, shell=True)
    if mode in ('vectron', 'codon'):
        # vectron and codon report their timing on stderr
        return result.stderr
    # C++ and cuda print their timing as the last line of stdout
    with open(f'{mode}_out.txt', 'r') as file:
        lines = file.readlines()
    return lines[-1].strip()
        
def batch_exec(mode, ds_type):
    if 'float' in ds_type:        
        if mode == 'vectron':
            source_p = mode + '_' + ds_type.split("_", 1)[0]
        else:
            source_p = 'cuda'        
        algorithms = [
            ("Smith Waterman", "smith_waterman"),
        ]            
    else:
        if mode == 'vectron':
            source_p = mode + '_' + ds_type.split("_", 1)[0]
        else:
            source_p = mode        
        algorithms = [
            ("Levenshtein Distance", "levenshtein_distance"),
            ("Longest Common Subsequence", "lcs"),
            ("Hamming Distance", "hamming_distance"),
            ("Manhattan Tourist", "manhattan_tourist"),
            ("Minimum Cost Path", "min_cost_path"),
            ("Needleman Wunsch", "needleman_wunsch"),
            ("Smith Waterman", "smith_waterman"),
        ]    
    results = []


    headers = [f"{mode}", "Execution Time"]
    
    for name, exec_name in algorithms:
        if mode == "cuda":
            exec_path = f'/vectron/docker/experiments_docker/source/{source_p}/{exec_name}_cuda'
        else:
            exec_path = f'/vectron/docker/experiments_docker/source/{source_p}/{exec_name}'
                
        result = exec(mode, exec_path, ds_type)        
        results.append((name, result))
    print(tabulate(results, headers=headers, tablefmt="pretty"))
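
The dataset directory that `exec` reads from depends on both the mode and the dataset type, which is easy to misread. The sketch below restates that choice as a standalone function; `resolve_dataset_dir` is a hypothetical name used only for illustration and is not called by the notebook.

```python
def resolve_dataset_dir(mode, ds_type):
    """Restate the data directory choice made inside exec() (illustrative sketch)."""
    suffix = ds_type.split("_", 1)[1]  # 'small', 'medium' or 'large'
    if mode in ("cpp", "cuda"):
        # Float runs share the 'cuda_*' data for both C++ and cuda;
        # integer runs use a directory named after the mode.
        prefix = "cuda" if "float" in ds_type else mode
        return f"{prefix}_{suffix}"
    # vectron and codon use the dataset name exactly as given
    return ds_type

for mode, ds_type in [("vectron", "int_small"), ("cpp", "int_small"),
                      ("cpp", "float_small"), ("cuda", "float_large")]:
    print(mode, ds_type, "->", resolve_dataset_dir(mode, ds_type))
```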

The following module will compile the vectron, codon and C++ benchmarks for the CPU in integer mode.
The path to the source code of each benchmark can be found in its compile command.

In [3]:
## COMPILING VECTRON EXPERIMENTS:
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/smith_waterman.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/needleman_wunsch.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/levenshtein_distance.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/lcs.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/hamming_distance.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/manhattan_tourist.codon', '/vectron/docker/experiments_docker/source/vectron')
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_int/min_cost_path.codon', '/vectron/docker/experiments_docker/source/vectron')

## COMPILING codon EXPERIMENTS:
compile('codon', '/vectron/docker/experiments_docker/source/codon/smith_waterman.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/needleman_wunsch.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/levenshtein_distance.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/lcs.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/hamming_distance.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/manhattan_tourist.codon')
compile('codon', '/vectron/docker/experiments_docker/source/codon/min_cost_path.codon')

## COMPILING C++ EXPERIMENTS:
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/smith_waterman.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/needleman_wunsch.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/levenshtein_distance.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/lcs.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/hamming_distance.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/manhattan_tourist.cpp')
compile('cpp', '/vectron/docker/experiments_docker/source/cpp/min_cost_path.cpp')
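
The `compile` helper captures the compiler output but does not inspect it, so a failed build only shows up later when the executable is missing. One possible way to surface failures early is sketched below; `run_checked` is a hypothetical helper, and the command used here is a stand-in so the example runs without clang++, nvcc or codon installed.

```python
import subprocess
import sys

def run_checked(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"{cmd[0]} failed ({result.returncode}):\n{result.stderr}")
    return result

# Stand-in command; in the notebook this would be one of the argument
# lists that compile() passes to subprocess.run.
ok = run_checked([sys.executable, "-c", "print('built')"])
print(ok.stdout.strip())
```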

The following module will execute vectron, codon and C++ respectively, and benchmark their runtimes for the small dataset (4096 sequence pairs)

In [4]:
batch_exec('vectron', 'int_small')

batch_exec('codon', 'int_small')

batch_exec('cpp', 'int_small')

+----------------------------+-------------------------+
|          vectron           |     Execution Time      |
+----------------------------+-------------------------+
|    Levenshtein Distance    | Total:  took 0.398445s  |
| Longest Common Subsequence |  Total:  took 0.37403s  |
|      Hamming Distance      | Total:  took 0.0854971s |
|     Manhattan Tourist      | Total:  took 0.510745s  |
|     Minimum Cost Path      | Total:  took 0.758103s  |
|      Needleman Wunsch      | Total:  took 0.500773s  |
|       Smith Waterman       | Total:  took 0.269599s  |
+----------------------------+-------------------------+
+----------------------------+-----------------------+
|           codon            |    Execution Time     |
+----------------------------+-----------------------+
|    Levenshtein Distance    | Total:  took 8.0801s  |
| Longest Common Subsequence | Total:  took 9.78918s |
|      Hamming Distance      | Total:  took 8.2862s  |
|     Manhattan Tourist      | Total:  took

The following module will execute vectron, codon and C++ respectively, and benchmark their runtimes for the medium dataset (262,144 sequence pairs)

In [5]:
batch_exec('vectron', 'int_medium')

batch_exec('codon', 'int_medium')

batch_exec('cpp', 'int_medium')

+----------------------------+-----------------------+
|          vectron           |    Execution Time     |
+----------------------------+-----------------------+
|    Levenshtein Distance    | Total:  took 24.1062s |
| Longest Common Subsequence | Total:  took 22.8324s |
|      Hamming Distance      | Total:  took 1.92024s |
|     Manhattan Tourist      | Total:  took 32.4822s |
|     Minimum Cost Path      | Total:  took 47.4777s |
|      Needleman Wunsch      | Total:  took 30.6931s |
|       Smith Waterman       | Total:  took 18.3113s |
+----------------------------+-----------------------+
+----------------------------+-----------------------+
|           codon            |    Execution Time     |
+----------------------------+-----------------------+
|    Levenshtein Distance    | Total:  took 521.671s |
| Longest Common Subsequence | Total:  took 604.978s |
|      Hamming Distance      | Total:  took 518.549s |
|     Manhattan Tourist      |  Total:  took 637.3s  |
|     Mini

The following modules will execute vectron, codon and C++ respectively, and benchmark their runtimes for the large dataset (4,194,304 sequence pairs)

In [6]:
batch_exec('vectron', 'int_large')

+----------------------------+-----------------------+
|          vectron           |    Execution Time     |
+----------------------------+-----------------------+
|    Levenshtein Distance    | Total:  took 351.592s |
| Longest Common Subsequence | Total:  took 350.358s |
|      Hamming Distance      | Total:  took 28.1443s |
|     Manhattan Tourist      | Total:  took 502.26s  |
|     Minimum Cost Path      | Total:  took 734.145s |
|      Needleman Wunsch      | Total:  took 457.519s |
|       Smith Waterman       | Total:  took 312.49s  |
+----------------------------+-----------------------+
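
With all three integer dataset sizes measured for vectron, it is possible to check how runtime grows with the number of sequence pairs. The sketch below uses the Levenshtein Distance timings and the pair counts stated in this notebook; it assumes the sequence pairs are of comparable length across the three datasets, which the notebook does not state.

```python
# Pair counts and vectron Levenshtein Distance timings, copied from this notebook.
pairs = {"small": 4096, "medium": 262_144, "large": 4_194_304}
levenshtein_seconds = {"small": 0.398445, "medium": 24.1062, "large": 351.592}

# Compare growth in input size with growth in runtime between consecutive datasets.
for smaller, larger in [("small", "medium"), ("medium", "large")]:
    size_ratio = pairs[larger] / pairs[smaller]
    time_ratio = levenshtein_seconds[larger] / levenshtein_seconds[smaller]
    print(f"{smaller} -> {larger}: {size_ratio:.0f}x pairs, {time_ratio:.1f}x time")

# Throughput in sequence pairs per second for each dataset size.
throughput = {name: pairs[name] / levenshtein_seconds[name] for name in pairs}
for name, value in throughput.items():
    print(f"{name}: {value:.0f} pairs/s")
```

Under that assumption, a time ratio close to the size ratio indicates roughly linear scaling in the number of pairs.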


In [None]:
batch_exec('codon', 'int_large')

In [None]:
batch_exec('cpp', 'int_large')

The following module will compile vectron, cuda and C++ benchmarks on GPU in floating-point mode

In [7]:
## COMPILING VECTRON EXPERIMENT:
compile('vectron', '/vectron/docker/experiments_docker/source/vectron_float/smith_waterman.codon', '/vectron/docker/experiments_docker/source/vectron')

## COMPILING cuda EXPERIMENT:
compile('cuda', '/vectron/docker/experiments_docker/source/cuda/smith_waterman_cuda.cu')

## COMPILING C++ EXPERIMENT:
compile('cpp', '/vectron/docker/experiments_docker/source/cuda/smith_waterman.cpp')

The following module will execute vectron, cuda and C++ respectively, and benchmark their runtimes for the small GPU dataset (256 sequence pairs)

In [21]:
batch_exec('vectron', 'float_small')

batch_exec('cuda', 'float_small')

batch_exec('cpp', 'float_small')

+----------------+------------------------+
|    vectron     |     Execution Time     |
+----------------+------------------------+
| Smith Waterman | Total:  took 0.414509s |
+----------------+------------------------+
+----------------+----------------+
|      cuda      | Execution Time |
+----------------+----------------+
| Smith Waterman |      2.06      |
+----------------+----------------+
+----------------+----------------+
|      cpp       | Execution Time |
+----------------+----------------+
| Smith Waterman |      0.25      |
+----------------+----------------+


The following module will execute vectron, cuda and C++ respectively, and benchmark their runtimes for the medium GPU dataset (1024 sequence pairs)

In [22]:
batch_exec('vectron', 'float_medium')

batch_exec('cuda', 'float_medium')

batch_exec('cpp', 'float_medium')

+----------------+-----------------------+
|    vectron     |    Execution Time     |
+----------------+-----------------------+
| Smith Waterman | Total:  took 1.42555s |
+----------------+-----------------------+
+----------------+----------------+
|      cuda      | Execution Time |
+----------------+----------------+
| Smith Waterman |      1.69      |
+----------------+----------------+
+----------------+----------------+
|      cpp       | Execution Time |
+----------------+----------------+
| Smith Waterman |      1.30      |
+----------------+----------------+


The following module will execute vectron, cuda and C++ respectively, and benchmark their runtimes for the large GPU dataset (4096 sequence pairs)

In [23]:
batch_exec('vectron', 'float_large')

batch_exec('cuda', 'float_large')

batch_exec('cpp', 'float_large')

+----------------+----------------------+
|    vectron     |    Execution Time    |
+----------------+----------------------+
| Smith Waterman | Total:  took 5.3616s |
+----------------+----------------------+
+----------------+----------------+
|      cuda      | Execution Time |
+----------------+----------------+
| Smith Waterman |      2.20      |
+----------------+----------------+
+----------------+----------------+
|      cpp       | Execution Time |
+----------------+----------------+
| Smith Waterman |      8.67      |
+----------------+----------------+
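
To read the three floating-point tables side by side, the sketch below collects the Smith Waterman timings from this notebook and reports the fastest implementation per dataset size. It assumes the bare cuda and C++ numbers are in seconds, like the vectron timings; the output above does not state their unit.

```python
# Smith Waterman timings copied from the floating-point tables in this notebook.
# Assumption: the cuda and cpp values are seconds, matching the vectron output.
float_results = {
    "small":  {"vectron": 0.414509, "cuda": 2.06, "cpp": 0.25},
    "medium": {"vectron": 1.42555,  "cuda": 1.69, "cpp": 1.30},
    "large":  {"vectron": 5.3616,   "cuda": 2.20, "cpp": 8.67},
}

# Fastest implementation for each dataset size.
fastest = {size: min(times, key=times.get) for size, times in float_results.items()}

for size, times in float_results.items():
    cpp_vs_vectron = times["cpp"] / times["vectron"]
    print(f"{size}: fastest={fastest[size]}, cpp/vectron={cpp_vs_vectron:.2f}")
```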
