## Shortest common supersequence (problem)

Given two strings s1 and s2, find the length of their shortest common supersequence: the shortest string that has both s1 and s2 as subsequences.

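A string A is a supersequence of a string B if B is a subsequence of A, that is, if B can be obtained from A by deleting zero or more characters without reordering the rest.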


### Example:

input:
s1 = "abdacbab"
s2 = "acebfca"

output: 11

explanation: A shortest common supersequence of s1 and s2 is "abdacebfcab", which has length 11.
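
The claim in the example can be checked mechanically. The sketch below is not part of the original notebook; the helper names `is_subsequence` and `is_common_supersequence` are made up for illustration. It only confirms that the candidate string contains both inputs as subsequences and has length 11; it does not prove that no shorter supersequence exists.

```python
def is_subsequence(short, long):
    """Return True if `short` can be obtained from `long` by deleting characters."""
    it = iter(long)
    # `ch in it` advances the iterator until it finds ch, so order is respected.
    return all(ch in it for ch in short)


def is_common_supersequence(candidate, s1, s2):
    """Return True if both s1 and s2 are subsequences of `candidate`."""
    return is_subsequence(s1, candidate) and is_subsequence(s2, candidate)


s1 = "abdacbab"
s2 = "acebfca"
candidate = "abdacebfcab"

print(is_common_supersequence(candidate, s1, s2))  # True
print(len(candidate))  # 11
```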

<img src="scs.png" alt="Alt Text" width="300"/>

## The relation
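
One way to write the recurrence that the solutions in this notebook implement is given below. The notation is an assumption chosen to match the tabulation code: $dp[i][j]$ denotes the length of the shortest common supersequence of the first $i$ characters of s1 and the first $j$ characters of s2, with 0-based string indexing.

$$
dp[i][j] =
\begin{cases}
j & \text{if } i = 0 \\
i & \text{if } j = 0 \\
1 + dp[i-1][j-1] & \text{if } s1[i-1] = s2[j-1] \\
1 + \min\left(dp[i-1][j],\ dp[i][j-1]\right) & \text{otherwise}
\end{cases}
$$

The answer is $dp[n][m]$, where $n$ and $m$ are the lengths of s1 and s2.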


## The bottom-up approach:

In [1]:
s1 = "abdacbab"
s2 = "acebfca"

In [4]:
def scs(s1, s2):
    
    dp = [[0]* (len(s1)+1) for _ in range(len(s2)+1)]
    
    for j in range(1, len(s1)+1): 
        dp[0][j] = j
        
    for i in range(1, len(s2)+1):
        dp[i][0] = i
        
    for j in range(1, len(s2)+1):
        for i in range(1, len(s1)+1):
            if s1[i-1] == s2[j-1]:
                # matching characters are shared, so extend the diagonal by one
                dp[j][i] = dp[j-1][i-1] + 1
            else:
                # otherwise append one character to the cheaper neighbour
                dp[j][i] = 1 + min(dp[j-1][i], dp[j][i-1])
                
    return dp[-1][-1]

In [5]:
scs(s1, s2)

11

## The original solution

## Using the LCS

In [2]:
def lcs(s1, s2):
    
    n = len(s1)
    m = len(s2)
    dp = [[0]*(m+1) for i in range(n+1)]
    
    for i in range(1, n+1):
        for j in range(1, m+1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = 1 + dp[i-1][j-1]
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    return dp[n][m]


def scs(s1, s2):
    return len(s1) + len(s2) - lcs(s1, s2)

In [3]:
scs(s1, s2)

11

## Recursive

Time complexity: $O(2^{n+m})$, where $n$ and $m$ are the lengths of s1 and s2\
Space complexity: $O(n+m)$ (recursion depth)

In [5]:
def scs(s1, s2, i=0, j=0):
    
    if i == len(s1):
        return len(s2)-j
    
    elif j == len(s2):
        return len(s1)-i
    
    elif s1[i] == s2[j]:
        return 1 + scs(s1, s2, i+1, j+1)
    
    else:
        return 1 + min(scs(s1, s2, i+1, j), scs(s1, s2, i, j+1))

In [6]:
scs(s1, s2)

11

## Memoization (top-down)

Time complexity: $O(nm)$\
Space complexity: $O(nm)$

In [None]:
def scs(s1, s2, i=0, j=0, lookup=None):
    
    lookup = {} if lookup is None else lookup
    
    if (i, j) in lookup:
        return lookup[(i, j)]
    
    if i == len(s1):
        return len(s2)-j
    
    elif j == len(s2):
        return len(s1)-i
    
    elif s1[i] == s2[j]:
        lookup[(i, j)] = 1 + scs(s1, s2, i+1, j+1, lookup)
        return lookup[(i, j)]
    
    else:
        lookup[(i, j)] = 1 + min(scs(s1, s2, i+1, j, lookup), scs(s1, s2, i, j+1, lookup))
        return lookup[(i, j)]
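
As an alternative to passing a `lookup` dictionary by hand, the same top-down idea can be sketched with `functools.lru_cache` from the standard library. The name `scs_cached` and the inner helper `solve` are hypothetical, chosen here so they do not shadow the notebook's `scs`. The recursion depth can reach $n+m$, so very long inputs may exceed Python's default recursion limit.

```python
from functools import lru_cache


def scs_cached(s1, s2):
    """Length of the shortest common supersequence, memoized with lru_cache."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # One string is exhausted: the rest of the other must be appended.
        if i == len(s1):
            return len(s2) - j
        if j == len(s2):
            return len(s1) - i
        # Matching characters are shared by both strings.
        if s1[i] == s2[j]:
            return 1 + solve(i + 1, j + 1)
        # Otherwise emit one character from either string and keep the best.
        return 1 + min(solve(i + 1, j), solve(i, j + 1))

    return solve(0, 0)


print(scs_cached("abdacbab", "acebfca"))  # 11
```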

## Tabulation (bottom-up)

Time complexity: $O(nm)$\
Space complexity: $O(nm)$

In [7]:
def scs(s1, s2):
    
    n, m = len(s1), len(s2)
    dp = [[0]*(m+1) for i in range(n+1)]
    
    for j in range(1, m+1):
        dp[0][j] = j
    
    for i in range(1, n+1):
        dp[i][0] = i
    
    for i in range(1, n+1):
        for j in range(1, m+1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = 1 + dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1])
    
    return dp[n][m]

In [8]:
scs(s1, s2)

11

Each row of the table depends only on the row above it, so keeping two rows is enough:

Time complexity: $O(nm)$\
Space complexity: $O(m)$

In [9]:
def scs(s1, s2):
    
    n, m = len(s1), len(s2)
    prev_dp = [0]*(m+1)
    dp = [0]*(m+1)
    
    for j in range(1, m+1):
        prev_dp[j] = j
    
    for i in range(1, n+1):
        dp[0] = i
        for j in range(1, m+1):
            if s1[i-1] == s2[j-1]:
                dp[j] = 1 + prev_dp[j-1]
            else:
                dp[j] = 1 + min(prev_dp[j], dp[j-1])
        prev_dp = dp
        dp = [0]*(m+1)
    
    return prev_dp[m]

In [10]:
scs(s1, s2)

11
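
The two-row idea can be pushed a little further. The sketch below is an assumption-labelled variant, not the notebook's code: it keeps a single row plus one saved diagonal value, and swaps the inputs so that the row is indexed by the shorter string, which gives $O(\min(n, m))$ extra space. The name `scs_min_space` is hypothetical.

```python
def scs_min_space(s1, s2):
    """SCS length using one row of size min(len(s1), len(s2)) + 1."""
    # The SCS length is symmetric in its arguments, so let s2 be the shorter one.
    if len(s2) > len(s1):
        s1, s2 = s2, s1
    n, m = len(s1), len(s2)

    dp = list(range(m + 1))  # row 0: dp[j] = j
    for i in range(1, n + 1):
        diag = dp[0]  # value of dp[i-1][j-1] for j = 1
        dp[0] = i
        for j in range(1, m + 1):
            up = dp[j]  # dp[i-1][j], read before it is overwritten
            if s1[i - 1] == s2[j - 1]:
                dp[j] = 1 + diag
            else:
                dp[j] = 1 + min(up, dp[j - 1])
            diag = up  # becomes dp[i-1][j-1] for the next column
    return dp[m]


print(scs_min_space("abdacbab", "acebfca"))  # 11
```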