# Dynamic Programming 
# Longest Common Subsequence
<br/>

> **Problem Statement**: Write a function to find the length of the **longest common subsequence** between two sequences. E.g. Given the strings "serendipitous" and "precipitation", the longest common subsequence is "reipito" and its length is 7.
>
> A "sequence" is a group of items with a deterministic ordering. Lists, tuples and ranges are some common sequence types in Python.
>
> A "subsequence" is a sequence obtained by deleting zero or more elements from another sequence. For example, "edpt" is a subsequence of "serendipitous".
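
To make the definition concrete, here is a small sketch that checks whether one sequence is a subsequence of another. The helper name `is_subsequence` is an assumption made for illustration; it is not part of the problem statement.

```python
def is_subsequence(sub, seq):
    """Return True if `sub` can be obtained from `seq` by deleting zero or more elements."""
    it = iter(seq)
    # `item in it` advances the iterator until it finds `item`,
    # so matches must appear in the same order as in `sub`
    return all(item in it for item in sub)


print(is_subsequence("edpt", "serendipitous"))  # True
print(is_subsequence("tpde", "serendipitous"))  # False: the order is not preserved
```

The same helper works for lists, tuples and ranges, since it only relies on iteration and equality.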

<br/>

<img src="https://i.imgur.com/ry4Y0wS.png" width="420">


<br/>

## The Method

Here's the systematic strategy we'll apply for solving problems:

1. State the problem clearly. Identify the input & output formats.
2. Come up with some example inputs & outputs. Try to cover all edge cases.
3. Come up with a correct solution for the problem. State it in plain English.
4. Implement the solution and test it using example inputs. Fix bugs, if any.
5. Analyze the algorithm's complexity and identify inefficiencies, if any.
6. Apply the right technique to overcome the inefficiency. Repeat steps 3 to 6.
<br/><br/>

## Solution


### 1. State the problem clearly. Identify the input & output formats.

**Problem**

> **We are given two sequences, and we need to find the length of their longest common subsequence.**


**Input**

1. **seq1**
2. **seq2**



**Output**

1. **len_lcs** (Length of the Longest Common Subsequence)


In [95]:
def lcs(seq1, seq2):
    pass

### 2. Come up with some example inputs & outputs. Try to cover all edge cases.

1. General case (string)
2. General case (list)
3. No common subsequence
4. One is a subsequence of the other
5. One sequence is empty
6. Both sequences are empty
7. Multiple common subsequences with the same maximum length, e.g. "abcdef" and "badcfe"

In [96]:
T0 = {
    'input': {
        'seq1': 'serendipitous',
        'seq2': 'precipitation'
    },
    'output': 7
}

T1 = {
    'input': {
        'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3],
        'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]
    },
    'output': 5
}

T2 = {
    'input': {
        'seq1': 'longest',
        'seq2': 'stone'
    },
    'output': 3
}

T3 = {
    'input': {
        'seq1': 'asdfwevad',
        'seq2': 'opkpoiklklj'
    },
    'output': 0
}

T4 = {
    'input': {
        'seq1': 'dense',
        'seq2': 'condensed'
    },
    'output': 5
}

T5 = {
    'input': {
        'seq1': '',
        'seq2': 'opkpoiklklj'
    },
    'output': 0
}

T6 = {
    'input': {
        'seq1': '',
        'seq2': ''
    },
    'output': 0
}

T7 = {
    'input': {
        'seq1': 'abcdef',
        'seq2': 'badcfe'
    },
    'output': 3
}

tests = [T0, T1, T2, T3, T4, T5, T6, T7]

### 3. Come up with a correct solution for the problem. State it in plain English.

#### Recursive Solution


1. Create two counters `idx1` and `idx2` starting at 0. Our recursive function will compute the LCS of `seq1[idx1:]` and `seq2[idx2:]`


2. If `seq1[idx1]` and `seq2[idx2]` are equal, then this element belongs to an LCS of `seq1[idx1:]` and `seq2[idx2:]` (why?). Further, the length of this LCS is one more than the length of the LCS of `seq1[idx1+1:]` and `seq2[idx2+1:]`.

<img src="https://i.imgur.com/um7LDiX.png" width="400">

3. If not, then the LCS of `seq1[idx1:]` and `seq2[idx2:]` is the longer of two candidates: the LCS of `seq1[idx1+1:]` and `seq2[idx2:]`, or the LCS of `seq1[idx1:]` and `seq2[idx2+1:]`.

<img src="https://i.imgur.com/DRanmOy.png" width="360">

4. If either `seq1[idx1:]` or `seq2[idx2:]` is empty, then their LCS is empty.
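
The steps above can be summarized as a recurrence. Here $L(i, j)$ is shorthand, introduced only for this summary, for the length of the LCS of `seq1[i:]` and `seq2[j:]`, with $m$ and $n$ the lengths of `seq1` and `seq2`:

$$
L(i, j) =
\begin{cases}
0 & \text{if } i = m \text{ or } j = n \\
1 + L(i+1,\ j+1) & \text{if } \mathrm{seq1}[i] = \mathrm{seq2}[j] \\
\max\big(L(i+1,\ j),\ L(i,\ j+1)\big) & \text{otherwise}
\end{cases}
$$

The answer to the original problem is $L(0, 0)$.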



Here's what the tree of recursive calls looks like:


![](https://i.imgur.com/JJrq3KH.png)

###  4. Implement the solution and test it using example inputs. Fix bugs, if any.

In [97]:
def lcs_recursive(seq1, seq2, idx1=0, idx2=0):
    # Base case: one of the remaining suffixes is empty, so the LCS is empty
    if idx1 == len(seq1) or idx2 == len(seq2):
        return 0

    # Current elements match: count this element and advance both indices
    elif seq1[idx1] == seq2[idx2]:
        return 1 + lcs_recursive(seq1, seq2, idx1+1, idx2+1)

    # No match: skip one element from either sequence and keep the better result
    else:
        return max(lcs_recursive(seq1, seq2, idx1+1, idx2),
                   lcs_recursive(seq1, seq2, idx1, idx2+1))

In [98]:
from jovian.pythondsa import evaluate_test_cases
evaluate_test_cases(lcs_recursive, tests)


TEST CASE #0

Input:
{'seq1': 'serendipitous', 'seq2': 'precipitation'}

Expected Output:
7


Actual Output:
7

Execution Time:
163.283 ms

Test Result:
PASSED


TEST CASE #1

Input:
{'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3], 'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]}

Expected Output:
5


Actual Output:
5

Execution Time:
2.37 ms

Test Result:
PASSED


TEST CASE #2

Input:
{'seq1': 'longest', 'seq2': 'stone'}

Expected Output:
3


Actual Output:
3

Execution Time:
0.115 ms

Test Result:
PASSED


TEST CASE #3

Input:
{'seq1': 'asdfwevad', 'seq2': 'opkpoiklklj'}

Expected Output:
0


Actual Output:
0

Execution Time:
48.905 ms

Test Result:
PASSED


TEST CASE #4

Input:
{'seq1': 'dense', 'seq2': 'condensed'}

Expected Output:
5


Actual Output:
5

Execution Time:
0.062 ms

Test Result:
PASSED


TEST CASE #5

Input:
{'seq1': '', 'seq2': 'opkpoiklklj'}

Expected Output:
0


Actual Output:
0

Execution Time:

[(7, True, 163.283),
 (5, True, 2.37),
 (3, True, 0.115),
 (0, True, 48.905),
 (5, True, 0.062),
 (0, True, 0.001),
 (0, True, 0.001),
 (3, True, 0.021)]
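
If the `jovian` package is not installed, a plain loop can stand in for `evaluate_test_cases`. The sketch below is an assumption about the helper's behaviour, inferred only from the output shown above (a list of `(actual, passed, time_in_ms)` tuples). `run_tests` and `sample_tests` are hypothetical names, and the recursive function is repeated so the block runs on its own.

```python
import time


def lcs_recursive(seq1, seq2, idx1=0, idx2=0):
    if idx1 == len(seq1) or idx2 == len(seq2):
        return 0
    if seq1[idx1] == seq2[idx2]:
        return 1 + lcs_recursive(seq1, seq2, idx1 + 1, idx2 + 1)
    return max(lcs_recursive(seq1, seq2, idx1 + 1, idx2),
               lcs_recursive(seq1, seq2, idx1, idx2 + 1))


def run_tests(function, tests):
    """Hypothetical stand-in: call `function` on each test's inputs and collect results."""
    results = []
    for test in tests:
        start = time.perf_counter()
        actual = function(**test['input'])
        elapsed_ms = (time.perf_counter() - start) * 1000
        results.append((actual, actual == test['output'], round(elapsed_ms, 3)))
    return results


# Two of the test cases from above, in the same dictionary format
sample_tests = [
    {'input': {'seq1': 'longest', 'seq2': 'stone'}, 'output': 3},
    {'input': {'seq1': '', 'seq2': ''}, 'output': 0},
]

for actual, passed, elapsed_ms in run_tests(lcs_recursive, sample_tests):
    print(actual, passed, elapsed_ms)
```

Timings will differ from machine to machine, so only the first two fields of each tuple are comparable with the results listed above.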

### 5. Analyze the algorithm's complexity and identify inefficiencies, if any.


The worst case occurs when every call has to try both subproblems, i.e. when the sequences have no elements in common.

<img src="https://i.imgur.com/z5m36m8.png" width="360">


<img src="https://i.imgur.com/n8ZgBYj.png" width="500">

All the leaf nodes are `(0, 0)`. Can you count the number of leaf nodes?

*HINT*: Count the number of unique paths from root to leaf. The length of each path is `m+n` and at each level there are 2 choices. 

Based on the above, can you infer that the time complexity is $O( 2^{(m+n)} )$?


#### Time Complexity = $O( 2^{(m+n)} )$
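
One way to see this growth empirically is to count how many times the recursive function is called on sequences with no common elements. The wrapper `count_calls` and its inner counter are additions for this sketch only; the recursion itself mirrors `lcs_recursive`.

```python
def count_calls(seq1, seq2):
    """Count the calls made by the plain recursive LCS on seq1 and seq2."""
    calls = 0

    def recurse(idx1, idx2):
        nonlocal calls
        calls += 1
        if idx1 == len(seq1) or idx2 == len(seq2):
            return 0
        if seq1[idx1] == seq2[idx2]:
            return 1 + recurse(idx1 + 1, idx2 + 1)
        return max(recurse(idx1 + 1, idx2), recurse(idx1, idx2 + 1))

    recurse(0, 0)
    return calls


# Worst case: the two sequences share no elements
for n in range(1, 9):
    print(n, count_calls('a' * n, 'b' * n))
```

The count more than triples each time `n` grows by one, which is consistent with exponential growth under the $O( 2^{(m+n)} )$ bound.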

### 6. Apply the right technique to overcome the inefficiency. Repeat steps 3 to 6.

#### Dynamic programming


<img src="https://i.imgur.com/SAsEol6.png">
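
The recursion tree contains many repeated `(idx1, idx2)` pairs, and there are only `(m+1) * (n+1)` distinct pairs in total. One way to exploit this, sketched below, is memoization: store each result in a dictionary the first time it is computed. The name `lcs_memo` and the dictionary layout are choices made for this sketch.

```python
def lcs_memo(seq1, seq2):
    memo = {}  # maps (idx1, idx2) -> LCS length of seq1[idx1:] and seq2[idx2:]

    def recurse(idx1, idx2):
        key = (idx1, idx2)
        if key in memo:
            # Already solved this subproblem: reuse the stored answer
            return memo[key]
        if idx1 == len(seq1) or idx2 == len(seq2):
            memo[key] = 0
        elif seq1[idx1] == seq2[idx2]:
            memo[key] = 1 + recurse(idx1 + 1, idx2 + 1)
        else:
            memo[key] = max(recurse(idx1 + 1, idx2), recurse(idx1, idx2 + 1))
        return memo[key]

    return recurse(0, 0)


print(lcs_memo('serendipitous', 'precipitation'))  # 7
```

For very long sequences the recursion depth can exceed Python's default limit, which is one reason an iterative table is often preferred.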




## https://youtu.be/sSno9rV8Rhg

### 7. Come up with a correct solution for the problem. State it in plain English.

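The optimized solution, explained in simple words:

1. **Create a table** with `len(seq1)+1` rows and `len(seq2)+1` columns, where `table[i][j]` stores the length of the LCS of `seq1[:i]` and `seq2[:j]`.
2. **Fill the first row and the first column with 0**, since the LCS of anything with an empty sequence is empty.
3. **If `seq1[i-1]` and `seq2[j-1]` are equal**, set `table[i][j]` to `1 + table[i-1][j-1]`.
4. **Otherwise**, set `table[i][j]` to the larger of `table[i-1][j]` and `table[i][j-1]`.
5. **Return the bottom-right cell**, `table[-1][-1]`, once every cell has been filled row by row.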


### 8. Implement the solution and test it using example inputs. Fix bugs, if any.

In [99]:
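s1 = 'bd'
s2 = 'abcd'
# One row per prefix of s1 and one column per prefix of s2 (including the empty prefix)
tab = [[None] * (len(s2) + 1) for _ in range(len(s1) + 1)]
print(tab, '\n')

print(s1 + ' x ' + s2)
# Print the table one row at a time, with cells separated by tabs and bars
for row in tab:
    s = ''
    for cell in row:
        s += str(cell) + '\t|\t'
    print(s, "\n")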

[[None, None, None, None, None], [None, None, None, None, None], [None, None, None, None, None]] 

bd x abcd
None	|	None	|	None	|	None	|	None	|	 

None	|	None	|	None	|	None	|	None	|	 

None	|	None	|	None	|	None	|	None	|	 



In [100]:
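def lcs_dynamic(seq1, seq2):
    # table[i][j] will hold the LCS length of seq1[:i] and seq2[:j]
    table = [[None] * (len(seq2) + 1) for _ in range(len(seq1) + 1)]

    # An empty prefix has an empty LCS with anything: first row and column are 0
    for i in range(len(table)):
        table[i][0] = 0
    for j in range(1, len(table[0])):
        table[0][j] = 0

    def fill_table(i, j):
        if seq1[i-1] == seq2[j-1]:
            # Matching elements extend the LCS of the two shorter prefixes
            table[i][j] = 1 + table[i-1][j-1]
        else:
            # Otherwise, keep the better result from dropping one element
            table[i][j] = max(table[i-1][j], table[i][j-1])

    # Fill row by row so the cells each step depends on are already computed
    for i in range(1, len(table)):
        for j in range(1, len(table[0])):
            fill_table(i, j)

    return table[-1][-1]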

In [101]:
evaluate_test_cases(lcs_dynamic, tests)


TEST CASE #0

Input:
{'seq1': 'serendipitous', 'seq2': 'precipitation'}

Expected Output:
7


Actual Output:
7

Execution Time:
0.065 ms

Test Result:
PASSED


TEST CASE #1

Input:
{'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3], 'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]}

Expected Output:
5


Actual Output:
5

Execution Time:
0.031 ms

Test Result:
PASSED


TEST CASE #2

Input:
{'seq1': 'longest', 'seq2': 'stone'}

Expected Output:
3


Actual Output:
3

Execution Time:
0.016 ms

Test Result:
PASSED


TEST CASE #3

Input:
{'seq1': 'asdfwevad', 'seq2': 'opkpoiklklj'}

Expected Output:
0


Actual Output:
0

Execution Time:
0.032 ms

Test Result:
PASSED


TEST CASE #4

Input:
{'seq1': 'dense', 'seq2': 'condensed'}

Expected Output:
5


Actual Output:
5

Execution Time:
0.016 ms

Test Result:
PASSED


TEST CASE #5

Input:
{'seq1': '', 'seq2': 'opkpoiklklj'}

Expected Output:
0


Actual Output:
0

Execution Time:
0

[(7, True, 0.065),
 (5, True, 0.031),
 (3, True, 0.016),
 (0, True, 0.032),
 (5, True, 0.016),
 (0, True, 0.003),
 (0, True, 0.002),
 (3, True, 0.015)]
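
The table holds more than the final length: walking back from the bottom-right cell recovers one actual longest common subsequence. The function `lcs_sequence` below is a sketch of that idea, not part of the original solution. When several subsequences share the maximum length, it returns just one of them.

```python
def lcs_sequence(seq1, seq2):
    m, n = len(seq1), len(seq2)
    # table[i][j] = LCS length of seq1[:i] and seq2[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if seq1[i-1] == seq2[j-1]:
                table[i][j] = 1 + table[i-1][j-1]
            else:
                table[i][j] = max(table[i-1][j], table[i][j-1])

    # Walk back from the bottom-right corner, collecting matched elements
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if seq1[i-1] == seq2[j-1]:
            result.append(seq1[i-1])
            i, j = i - 1, j - 1
        elif table[i-1][j] >= table[i][j-1]:
            i -= 1  # the answer came from the cell above
        else:
            j -= 1  # the answer came from the cell to the left
    result.reverse()  # elements were collected from last to first
    return result


common = lcs_sequence('serendipitous', 'precipitation')
print(''.join(common), len(common))
```

The result is returned as a list so that the function works for any sequence type, not only strings.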

### 9. Analyze the algorithm's complexity and identify inefficiencies, if any.

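#### Time Complexity = $O(m \cdot n)$

Here `m = len(seq1)` and `n = len(seq2)`. The table has `(m+1) * (n+1)` cells, and each cell is filled in constant time.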
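
The table also takes memory proportional to `m * n`, yet each row depends only on the row above it. A possible refinement, sketched here under the assumption that only the length is needed, keeps two rows at a time. `lcs_two_rows` is a hypothetical name.

```python
def lcs_two_rows(seq1, seq2):
    # Keep the shorter sequence along the columns so each row is as short as possible
    if len(seq2) > len(seq1):
        seq1, seq2 = seq2, seq1

    previous = [0] * (len(seq2) + 1)  # row i-1 of the full table
    for i in range(1, len(seq1) + 1):
        current = [0] * (len(seq2) + 1)  # row i of the full table
        for j in range(1, len(seq2) + 1):
            if seq1[i-1] == seq2[j-1]:
                current[j] = 1 + previous[j-1]
            else:
                current[j] = max(previous[j], current[j-1])
        previous = current

    return previous[-1]


print(lcs_two_rows('serendipitous', 'precipitation'))  # 7
```

This reduces the extra memory to $O(\min(m, n))$ while the running time stays $O(m \cdot n)$. The trade-off is that the full table is no longer available for recovering the subsequence itself.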