## Longest Common Substring Problem

https://www.techiedelight.com/longest-common-substring-problem/

Find the longest string (or strings) that is a substring (or are substrings) of two strings.

This problem differs from the Longest Common Subsequence problem: a substring must occupy consecutive positions within the original string, whereas a subsequence may skip characters.

For example:
    
    string 1: ABABC
    string 2: BABCA
    common substring: BABC
    BBC is NOT a common substring: it is a common subsequence, but its characters do not come consecutively
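To make the substring/subsequence distinction concrete, here is a minimal sketch of the two checks. The helper names `is_substring` and `is_subsequence` are illustrative assumptions, not part of the original notebook.

```python
def is_substring(candidate, text):
    """True if candidate appears in text as one contiguous run."""
    return candidate in text


def is_subsequence(candidate, text):
    """True if candidate's characters appear in text in order, gaps allowed."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator until it finds ch,
    # so each character must be found after the previous one.
    return all(ch in remaining for ch in candidate)


for s in ("ABABC", "BABCA"):
    # BABC is contiguous in both strings; BBC only appears with gaps.
    print(s, is_substring("BABC", s), is_subsequence("BBC", s), is_substring("BBC", s))
```

For both example strings this prints `True True False`: BABC is a substring, while BBC is a subsequence but not a substring.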
    

So when we set up our 2D matrix, we want each cell to track two things: whether the current pair of characters matches, and whether the pair before it (one row up, one column to the left) also matched.

If it's not a match, fill in the cell with a zero: no common substring can end at this pair of characters.

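If it is a match, look at the value to the upper left and add one to it. That value is the length of the run of consecutive matching characters ending just before this pair, so adding one extends the run to include the current character. The largest value in the finished matrix is the length of the longest common substring, and its row marks where that substring ends in string 2.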
    
              A   B   A   B   C
           0  0   0   0   0   0 

        B  0  0   1   0   1   0

        A  0  1   0   2   0   0

        B  0  0   2   0   3   0

        C  0  0   0   0   0   4

        A  0  1   0   1   0   0
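The matrix can be reproduced with a few lines of code. This is a sketch under the rules described above (zero on a mismatch, upper-left value plus one on a match); `build_table` is a hypothetical helper name, with string 1 along the columns and string 2 along the rows.

```python
def build_table(x, y):
    """Return the (len(y)+1) x (len(x)+1) table of matching-run lengths."""
    table = [[0] * (len(x) + 1) for _ in range(len(y) + 1)]
    for i, cy in enumerate(y, start=1):
        for j, cx in enumerate(x, start=1):
            if cx == cy:
                # Extend the run that ended one row up and one column left.
                table[i][j] = table[i - 1][j - 1] + 1
    return table


x, y = "ABABC", "BABCA"
table = build_table(x, y)

# Print with x as the column header and the characters of y as row labels.
print("     " + "  ".join(x))
for label, row in zip(" " + y, table):
    print(label, " ".join(f"{v:2d}" for v in row))
```

The extra leading row and column of zeros mean the first character of each string always has an upper-left neighbour to read from.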

In [16]:
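def commonsubstring(x, y):
    """Return one longest common substring of x and y ('' if they share no characters)."""
    # lookup[row][col] is the length of the matching run ending at y[row-1] and x[col-1]
    lookup = [[0]*(len(x) + 1) for _ in range(len(y)+1)]
    max_length = 0    # start at 0 so that a single-character match is recorded too
    ending_index = 0  # index in y of the last character of the best run so far
    for i in range(len(y)):
        for j in range(len(x)):
            row = i+1
            col = j+1
            if x[j] == y[i]:
                # lookup[i][j] is the upper-left neighbour of lookup[row][col]
                length = lookup[i][j] + 1
                lookup[row][col] = length
                if length > max_length:
                    max_length = length
                    ending_index = i
    return y[ending_index - max_length + 1 : ending_index + 1]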

In [22]:
# test strings
A = 'ABABC'
B = 'BABCA'
C = 'ABC'
D = 'BABA'
E = 'ABABCGHIJ'
F = 'BABCAGHIJ'

In [17]:
commonsubstring('ABABC', 'BABCA')

'BABC'

In [19]:
commonsubstring(C, D)

'AB'

In [20]:
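def allcommonsubstrings(x, y):
    """Return every distinct longest common substring of x and y, in order of appearance in y."""
    # lookup[row][col] is the length of the matching run ending at y[row-1] and x[col-1]
    lookup = [[0]*(len(x) + 1) for _ in range(len(y)+1)]
    max_length = 0
    ending_indices = []  # indices in y where a longest run ends
    for i in range(len(y)):
        for j in range(len(x)):
            row = i+1
            col = j+1
            if x[j] == y[i]:
                length = lookup[i][j] + 1
                lookup[row][col] = length
                if length > max_length:
                    # strictly longer run: discard the shorter candidates
                    max_length = length
                    ending_indices = [i]
                elif length == max_length:
                    # tie: keep this run as well
                    ending_indices.append(i)
    substrings = [y[e - max_length + 1 : e + 1] for e in ending_indices]
    # the same substring can be reached from several positions in x; drop repeats
    return list(dict.fromkeys(substrings))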

In [23]:
allcommonsubstrings(E, F)

['BABC', 'GHIJ']
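One way to sanity-check results like this is to compare them against a brute-force search. The sketch below is not part of the original notebook: `brute_force_longest` is a hypothetical helper that trades speed for simplicity, and it returns its results sorted, so the order may differ from the table-based function.

```python
def brute_force_longest(x, y):
    """Return the distinct longest common substrings of x and y, sorted."""
    # Try lengths from longest to shortest; the first length with a hit wins.
    for size in range(min(len(x), len(y)), 0, -1):
        hits = {
            x[k:k + size]
            for k in range(len(x) - size + 1)
            if x[k:k + size] in y
        }
        if hits:
            return sorted(hits)
    return []  # no character in common


print(brute_force_longest("ABABCGHIJ", "BABCAGHIJ"))  # ['BABC', 'GHIJ']
```

Because it checks every substring of `x` against `y` directly, it is only practical for short inputs, which is enough for cross-checking the test strings used here.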