# Other State-of-the-Art Implementations:

The following cells benchmark the other state-of-the-art implementations discussed in the article, in order of appearance, on 4096 pairs of nucleotide sequences:

### SSW:

In [1]:
! time /ssw/src/ssw_test /others/data/seqx.fasta /others/data/seqy.fasta >ssw_out.txt

CPU time: 0.278962 seconds
0.28user 0.00system 0:00.28elapsed 99%CPU (0avgtext+0avgdata 1044maxresident)k
0inputs+1112outputs (0major+289minor)pagefaults 0swaps


### Parasail:

In [2]:
! time /root/parasail/build/parasail_aligner -f /others/data/seqx.fasta -q /others/data/seqy.fasta >parasail_out.txt

10.07user 0.10system 0:10.30elapsed 98%CPU (0avgtext+0avgdata 202472maxresident)k
0inputs+312outputs (0major+64384minor)pagefaults 0swaps


### SeqAn: 

SeqAn does not work easily with external input files, so the command below runs its bundled test under equivalent settings for 48 pairs of sequences. **To extrapolate to 4096 pairs of sequences, multiply the resulting runtime by 85.3 (4096 / 48).**

In [3]:
! time /root/seqan/build/bin/test_align_simd_global_equal_length_avx2 >seqan_out.txt

0.07user 0.00system 0:00.07elapsed 100%CPU (0avgtext+0avgdata 5876maxresident)k
0inputs+24outputs (0major+1502minor)pagefaults 0swaps
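
As a rough illustration of the extrapolation described above, the sketch below scales the measured elapsed time linearly with the number of pairs. The function name `scale_runtime` is hypothetical, and linear scaling is the assumption stated in the note, not something measured here.

```bash
# scale_runtime SECONDS MEASURED_PAIRS TARGET_PAIRS
# Linearly scales a measured runtime to a different number of sequence pairs.
# (Hypothetical helper; assumes runtime grows linearly with the pair count.)
scale_runtime() {
  awk -v t="$1" -v m="$2" -v n="$3" 'BEGIN { printf "%.2f\n", t * n / m }'
}

# 0.07 s elapsed for 48 pairs, scaled to 4096 pairs (factor 4096/48, about 85.3).
scale_runtime 0.07 48 4096   # prints 5.97
```

Under that assumption, the SeqAn run above corresponds to roughly 6 seconds for 4096 pairs.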


### Codon (seq) inter_align:

In [4]:
! codon build -plugin /codon-seq/ seq_interalign.codon -release


In [5]:
! time ./seq_interalign >seq_interalign.txt

total took 0.419478s
0.47user 1.72system 0:02.27elapsed 96%CPU (0avgtext+0avgdata 148212maxresident)k
0inputs+48outputs (0major+9569minor)pagefaults 0swaps


### ADEPT: 

ADEPT's tests cannot be run on an arbitrary number of pairs with arbitrary lengths, so its bundled DNA test suite of 3000 sequence pairs of length 150 was used for the benchmark. **The resulting runtime must be multiplied by 4.66** to make it comparable with the other runtime benchmarks.

In [6]:
! time /root/ADEPT/build/examples/simple_sw/simple_sw /root/ADEPT/test-data/dna-query.fasta /root/ADEPT/test-data/dna-reference.fasta out.txt res.txt


-----------------------
       SIMPLE DNA      
-----------------------

STATUS: Launching driver


STATUS: Writing results...
correctness test passed
0.05user 1.60system 0:01.78elapsed 92%CPU (0avgtext+0avgdata 217172maxresident)k
0inputs+272outputs (0major+28355minor)pagefaults 0swaps
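
To make the adjustment concrete, the sketch below applies the 4.66 factor to the elapsed time reported above. The helper name `adept_equivalent` is hypothetical, and using the elapsed (wall-clock) figure as the basis is an assumption; the text does not say which of the reported times should be scaled.

```bash
# adept_equivalent SECONDS
# Multiplies a measured ADEPT runtime by the 4.66 factor given in the text.
# (Hypothetical helper; assumes the elapsed wall-clock time is the one to scale.)
adept_equivalent() {
  awk -v t="$1" 'BEGIN { printf "%.2f\n", t * 4.66 }'
}

adept_equivalent 1.78   # prints 8.29
```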


### SWIPE:

In [7]:
! time /root/swipe/swipe -d /swipe/swipe_data/seqx.fasta -i /swipe/swipe_data/seqy.fasta >swipe_out.txt

8.25user 0.08system 0:08.36elapsed 99%CPU (0avgtext+0avgdata 1576maxresident)k
0inputs+19288outputs (0major+14435minor)pagefaults 0swaps


### Compiling RV scripts (equivalent to our C++ scripts, with the loop annotation specific to RV)

In [9]:
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_sw_rv.cpp -o /RV/cpp_sw_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_nmw_rv.cpp -o /RV/cpp_nmw_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_mt_rv.cpp -o /RV/cpp_mt_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_mcp_rv.cpp -o /RV/cpp_mcp_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_lcs_rv.cpp -o /RV/cpp_lcs_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_LD_rv.cpp -o /RV/cpp_LD_rv
! clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_hd_rv.cpp -o /RV/cpp_hd_rv
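
The seven commands above differ only in the kernel name, so they can also be generated in a loop, for example from a `%%bash` cell. The sketch below is one way to do that; the function name `rv_build_cmd` and the `RUN` switch are hypothetical, and by default it only prints the commands rather than executing them.

```bash
# Print the compile command for one kernel, using the same flags as above.
# (rv_build_cmd is a hypothetical helper name.)
rv_build_cmd() {
  printf 'clang++-16 -fplugin=/root/libRV.so -mllvm -rv-loopvec -fno-unroll-loops /RV/cpp_%s_rv.cpp -o /RV/cpp_%s_rv\n' "$1" "$1"
}

# Kernel suffixes taken from the seven commands above.
# Dry run by default: the commands are only listed. Set RUN=1 to execute them.
for k in sw nmw mt mcp lcs LD hd; do
  cmd="$(rv_build_cmd "$k")"
  if [ "${RUN:-0}" = "1" ]; then
    sh -c "$cmd"
  else
    echo "$cmd"
  fi
done
```

Keeping the flags in a single place makes it harder for one kernel to be built with different options by accident.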

The cells below print the runtime of each C++ program built with the RV vectorizer:

In [10]:
! time /RV/cpp_sw_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

50.32user 12.34system 1:02.81elapsed 99%CPU (0avgtext+0avgdata 12866008maxresident)k
0inputs+56outputs (0major+11024930minor)pagefaults 0swaps


In [11]:
! time /RV/cpp_nmw_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

54.96user 4.40system 0:59.51elapsed 99%CPU (0avgtext+0avgdata 6561732maxresident)k
0inputs+48outputs (0major+4521900minor)pagefaults 0swaps


In [12]:
! time /RV/cpp_mt_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

41.70user 2.52system 0:44.33elapsed 99%CPU (0avgtext+0avgdata 2194824maxresident)k
0inputs+48outputs (0major+1614247minor)pagefaults 0swaps


In [13]:
! time /RV/cpp_mcp_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

39.47user 1.42system 0:40.99elapsed 99%CPU (0avgtext+0avgdata 2194824maxresident)k
0inputs+48outputs (0major+1478037minor)pagefaults 0swaps


In [14]:
! time /RV/cpp_lcs_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

42.75user 1.64system 0:44.50elapsed 99%CPU (0avgtext+0avgdata 2190600maxresident)k
0inputs+40outputs (0major+1477089minor)pagefaults 0swaps


In [15]:
! time /RV/cpp_LD_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

42.55user 3.84system 0:46.51elapsed 99%CPU (0avgtext+0avgdata 2190600maxresident)k
0inputs+48outputs (0major+1635327minor)pagefaults 0swaps


In [16]:
! time /RV/cpp_hd_rv /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >rv_out.txt

25.80user 1.65system 0:27.52elapsed 99%CPU (0avgtext+0avgdata 2194824maxresident)k
0inputs+40outputs (0major+1604872minor)pagefaults 0swaps
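
The timings above are in the default GNU `time` format, where the elapsed field is written as `m:ss.ss`. To compare runs side by side it can help to convert that field to plain seconds. The sketch below shows one way to do so; the helper name `elapsed_seconds` is hypothetical.

```bash
# elapsed_seconds: read GNU time output on stdin, print wall-clock seconds.
# (Hypothetical helper; expects the default "...elapsed" field of GNU time.)
elapsed_seconds() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /elapsed$/) {
        field = $i
        sub(/elapsed$/, "", field)      # "1:02.81elapsed" -> "1:02.81"
        n = split(field, part, ":")     # [hours:]minutes:seconds
        secs = 0
        for (j = 1; j <= n; j++) secs = secs * 60 + part[j]
        printf "%.2f\n", secs
      }
    }
  }'
}

echo '50.32user 12.34system 1:02.81elapsed 99%CPU' | elapsed_seconds   # prints 62.81
echo '25.80user 1.65system 0:27.52elapsed 99%CPU' | elapsed_seconds    # prints 27.52
```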


### Sample Polyhedral Optimization:

In [None]:
! clang++ -mllvm --polly /vectron/docker/experiments_docker/source/cpp/smith_waterman.cpp -o /vectron/docker/experiments_docker/source/cpp/smith_waterman

In [None]:
! time /vectron/docker/experiments_docker/source/cpp/smith_waterman /vectron/docker/experiments_docker/data/cpp_small/seqy.txt /vectron/docker/experiments_docker/data/cpp_small/seqx.txt >poly_out.txt

## Others

Unfortunately, we were unable to reproduce two of the software packages used in our benchmarks on the Docker image: SW# and Aalign.

#### 1. SW#
SW# requires CUDA 11.0 and is not compatible with CUDA 12.0 or later. However, on a Linux or Mac-based system with CUDA 11.0, SW# can be built as follows:

1. Clone the repo:
    ```bash
    git clone https://github.com/mkorpar/swsharp
    ```

2. Go to the project directory and remove the existing Makefile:
    ```bash
    cd swsharp/swsharp
    rm -rf Makefile
    ```

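3. Create a new Makefile with an editor such as nano or vim and paste in the following script. Note that `make` requires every recipe line (the indented commands under a target) to begin with a tab character, so convert the leading spaces to tabs if the paste produced spaces: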

    ```makefile
    CC = gcc
    CP = g++
    CU = nvcc
    LD = nvcc
    DX = doxygen

    NAME = swsharp

    OBJ_DIR = obj
    SRC_DIR = src
    DOC_DIR = doc
    INC_DIR = ../include/$(NAME)
    LIB_DIR = ../lib
    EXC_DIR = ../bin
    WIN_DIR = ../swsharpwin/$(NAME)

    I_CMD = $(addprefix -I, $(SRC_DIR) )
    L_CMD = $(addprefix -L, )

    DEP_LIBS =

    CC_FLAGS = $(I_CMD) -O3 -Wall -march=native
    CP_FLAGS = $(CC_FLAGS)
    LD_FLAGS = $(I_CMD) $(L_CMD) -lpthread -lstdc++

    CU_FLAGS = $(I_CMD) -O3

    API = $(addprefix $(SRC_DIR)/, align.h alignment.h chain.h constants.h \
            cpu_module.h cuda_utils.h database.h db_alignment.h evalue.h gpu_module.h \
            post_proc.h pre_proc.h reconstruct.h scorer.h swsharp.h thread.h threadpool.h)

    SRC = $(shell find $(SRC_DIR) -type f \( -iname \*.cpp -o -iname \*.c -o -iname \*.cu \))
    HDR = $(shell find $(SRC_DIR) -type f \( -iname \*.h \))
    OBJ = $(subst $(SRC_DIR), $(OBJ_DIR), $(addsuffix .o, $(basename $(SRC))))
    DEP = $(OBJ:.o=.d)
    INC = $(subst $(SRC_DIR), $(INC_DIR), $(API))
    LIB = $(LIB_DIR)/lib$(NAME).a
    EXC = $(NAME)
    BIN = $(EXC_DIR)/$(EXC)
    DOC = $(DOC_DIR)/Doxyfile
    WIN = $(subst $(SRC_DIR), $(WIN_DIR), $(HDR) $(SRC))

    debug: CC_FLAGS := $(CC_FLAGS) -DDEBUG -DTIMERS
    debug: CP_FLAGS := $(CP_FLAGS) -DDEBUG -DTIMERS
    debug: CU_FLAGS := $(CU_FLAGS) -DDEBUG -DTIMERS --ptxas-options=-v

    cpu: LD = $(CC)

    all: $(OBJ) $(DEP_LIBS)
    debug: all
    cpu: all

    install: lib include win

    bin: $(BIN)

    include: $(INC)

    lib: $(LIB)

    win: $(WIN)

    $(EXC): $(OBJ) $(DEP_LIBS)
            @echo [LD] $@
            @mkdir -p $(dir $@)
            @$(LD) $(OBJ) -o $@ $(LD_FLAGS)
    $(OBJ_DIR)/%.o: $(SRC_DIR)/%.c
            @echo [CC] $<
            @mkdir -p $(dir $@)
            @$(CC) $< -c -o $@ -MMD $(CC_FLAGS)
    $(OBJ_DIR)/%.o: $(SRC_DIR)/%.cpp
            @echo [CP] $<
            @mkdir -p $(dir $@)
            @$(CP) $< -c -o $@ -MMD $(CP_FLAGS)

    $(OBJ_DIR)/%.o: $(SRC_DIR)/%.cu
            @mkdir -p $(dir $@)
    ifeq (,$(findstring cpu,$(MAKECMDGOALS)))
            @echo [CU] $<
            @$(CU) $< -M -o $(@:.o=.d) $(CU_FLAGS) --output-directory $(dir $@)
            @$(CU) $< -c -o $@ $(CU_FLAGS)
    else
            @echo [CP] $<
            @$(CP) -x c++ $< -c -o $@ -MMD $(CP_FLAGS)
    endif

    $(INC_DIR)/%.h: $(SRC_DIR)/%.h
            @echo [CP] $@
            @mkdir -p $(dir $@)
            @cp $< $@
            
    $(LIB): $(OBJ)
            @echo [AR] $@
            @mkdir -p $(dir $@)
            @ar rcs $(LIB) $(OBJ) 2> /dev/null

    $(BIN): $(EXC)
            @echo [CP] $@
            @mkdir -p $(dir $@)
            @cp $< $@

    $(WIN_DIR)/%: $(SRC_DIR)/%
            @echo [CP] $@
            @mkdir -p $(dir $@)
            @cp $< $@

    docs:
            @echo [DX] generating documentation
            @$(DX) $(DOC)
            
    clean:
            @echo [RM] cleaning
            @rm -rf $(OBJ_DIR) $(EXC)

    remove:
            @echo [RM] removing
            @rm -rf $(INC_DIR) $(LIB) $(BIN) $(EXC) $(WIN)

    -include $(DEP)
    ```

4. Navigate back to the main swsharp folder and run:
    ```bash
    make
    ```

5. The executables will be in `./swsharp/bin`.

6. A simple alignment test can be run by using:
    ```bash
    ./swsharp/bin/swsharpdb -i query.fasta -j target.fasta
    ```
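
Two things commonly go wrong with the steps above: the installed CUDA toolkit is too new, or the pasted Makefile ends up with space-indented recipes, which `make` rejects with a "missing separator" error. The sketch below covers both. The helper names `cuda_major` and `fix_recipe_tabs` are hypothetical, and the tab fix assumes recipe lines start with exactly eight spaces after pasting.

```bash
# cuda_major: read `nvcc --version` output on stdin, print the major version.
# (Hypothetical helper; relies on the "release X.Y" text that nvcc prints.)
cuda_major() {
  sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p'
}

# fix_recipe_tabs FILE: replace eight leading spaces with one tab in place.
# (Hypothetical helper; assumes recipes were pasted with eight leading spaces.)
fix_recipe_tabs() {
  tab="$(printf '\t')"
  sed "s/^        /${tab}/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Warn if the toolkit on PATH is CUDA 12 or later.
if command -v nvcc >/dev/null 2>&1; then
  major="$(nvcc --version | cuda_major)"
  if [ -n "$major" ] && [ "$major" -ge 12 ]; then
    echo "CUDA $major detected; SW# is reported above to need CUDA 11.0" >&2
  fi
fi
```

After creating the new Makefile in step 3, running `fix_recipe_tabs Makefile` from `swsharp/swsharp` before `make` should be enough if the indentation was the only problem.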

#### 2. Aalign
Aalign requires `icpc` (Intel's C++ compiler), which is not available on the Ubuntu 22.04 Docker image. However, on a Linux or Mac-based system with `icpc` installed, Aalign can be built as follows:

1. Install the Intel `icpc` C++ compiler.
2. Clone the repo:
    ```bash
    git clone https://github.com/vtsynergy/aalign
    ```

3. In `./aalign/ModularDesign`, run:
    ```bash
    make
    ```

4. This gives you access to Smith-Waterman and Needleman-Wunsch, each in several modes.
