Skip to content
/ LCS Public

LCS implementations with vanilla C, C+Pthreads, C+OMP and C+MPI

Notifications You must be signed in to change notification settings

drberg1000/LCS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Longest Common Substring

This repository currently several implementations of LCS.

  • LCS_serial_row: runs on one core and fills the matrix row wise.
  • LCS_serial: runs on one core and fills the matrix in diagonal order.
  • LCS_omp: accepts a anumber of cores as an argument and uses OpenMP for threading. Otherwise, it's identical to LCS_serial.
  • LCS_mpi: uses the Open Message Passing Interface to create an LCS routine that will run on multiple computers in a cluster as well on a multi core machine. NOTE The 12/16/2014 version will not print an LCS. It only finds it's length. As a result, LCS_mpi fails the test_LCS.sh check. TODO Review MPI function calls, clean up code, and replace LCS_Print with a distributed serial format (ie. each process returns/prints just it's portion of the LCS.
  • LCS_pthreads: Threading with pthreads correctly returns the LCS length and the same LCS as LCS_serial* and LCS_omp. This version could also use some clean up.

test_LCS

compiles and verifies functionality against micro_test*.txt and tiny_test*.txt. If the argument correspond to versions as follows:

  • serial == diagonal wise serial
  • row == row wise serial
  • diagonal == diagonal wise omp

The serial & OpenMP implementations use:

len_X+len_Y + len_X*8 + (len_X*len_Y)*2

bytes of memory on the heap. On machine with 4G of RAM two strings of ~46,341 bytes will cause paging. The code checks to make sure the above formula is under 4*2^30. If the strings are longer than this, the program will abort. The mpi and pthreads likely use more memory than this but don't yet have the same check. With MPI having the ability to run on multiple machines it's limit check should consider this and be conditionally raised.

Main routines from serial and OpenMP versions.

void LCS_length(char *x, char *y, int **c);

LCS_Length calculates the length of the longest common substring shared between the character strings pointed to by x and y. The longest common subsequence between x_i and y_j is stored in the (m+1)x(n+1) matrix at address c.

Here, x_i is first i characters of x, and y_j is the first j characters of y. x is m characters long and y is j characters long. Thus, the LCS of x and y will be found at c[m][n].

void LCS_print(int **c, char *x, int i, int j);

LCS_print takes the matrix modified by LCS_length() and uses it, along with recursive calls to itself to print a LCS of x_i and y_j to standard out. Thus calling LCS_print() with i=strlen(x) and j=strlen(y) will result in printing the LCS of x and y.

NOTE: This algorithm will always print the same LCS, but it is not necessarily unique.

About

LCS implementations with vanilla C, C+Pthreads, C+OMP and C+MPI

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published