This notebook was prepared by [Donne Martin](https://github.com/donnemartin). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges).

# Challenge Notebook

## Problem: Given two strings, find the longest common subsequence.

* [Constraints](#Constraints)
* [Test Cases](#Test-Cases)
* [Algorithm](#Algorithm)
* [Code](#Code)
* [Unit Test](#Unit-Test)
* [Solution Notebook](#Solution-Notebook)

## Constraints

* Can we assume the inputs are valid?
    * No
* Can we assume the strings are ASCII?
    * Yes
* Is this case sensitive?
    * Yes
* Is a subsequence a non-contiguous block of chars?
    * Yes
* Do we expect a string as a result?
    * Yes
* Can we assume this fits memory?
    * Yes
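
To make the subsequence constraint concrete, here is a minimal sketch of a membership check. The helper name `is_subsequence` is hypothetical and is not part of the challenge code; it only illustrates that a subsequence keeps the original character order but may skip characters, whereas a substring must be contiguous.

```python
def is_subsequence(candidate, text):
    """Return True if candidate's chars appear in text in order, gaps allowed."""
    remaining = iter(text)
    # `ch in remaining` consumes the iterator up to and including the match,
    # so each later char must be found after the previous one.
    return all(ch in remaining for ch in candidate)


print(is_subsequence('ACE', 'ABCDE'))  # True: chars in order, with gaps
print('ACE' in 'ABCDE')                # False: not a contiguous substring
print(is_subsequence('AEC', 'ABCDE'))  # False: order is not preserved
```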

## Test Cases

* str0 or str1 is None -> TypeError
* str0 or str1 is an empty string -> ''
* General case

str0 = 'ABCDEFGHIJ'
str1 = 'FOOBCDBCDE'

result: 'BCDE'

## Algorithm

Refer to the [Solution Notebook]().  If you are stuck and need a hint, the solution notebook's algorithm discussion might be a good place to start.

## Code

In [1]:
import numpy as np
class StringCompare(object):

    def longest_common_subseq(self, str0, str1):
        if str0 is None or str1 is None:
            raise TypeError
        if len(str0) == 0 or len(str1) == 0:
            return ''
        # memo[i, j] holds the LCS length of str0[:i] and str1[:j];
        # row 0 and column 0 stay 0 for the empty prefix.
        # Use the builtin int: the np.int alias was removed in NumPy 1.24.
        memo = np.zeros(shape=(len(str0) + 1, len(str1) + 1), dtype=int)
        for i in range(1, len(str0) + 1):
            for j in range(1, len(str1) + 1):
                if str0[i - 1] == str1[j - 1]:  # match case
                    memo[i, j] = memo[i - 1, j - 1] + 1
                else:  # no match case
                    memo[i, j] = max(memo[i - 1, j], memo[i, j - 1])
        
        # backtrace to uncover the common subsequence
        return self.backtrace(str0, str1, memo)

    def backtrace(self, str0, str1, memo):
        # Walk from the bottom-right cell back toward row 0 or column 0.
        # A purely diagonal walk would only recover contiguous matches,
        # so step up or left whenever the current chars differ.
        row_ix, col_ix = len(str0), len(str1)
        chars = []
        while row_ix > 0 and col_ix > 0:
            if str0[row_ix - 1] == str1[col_ix - 1]:
                # matching char belongs to the subsequence; move diagonally
                chars.append(str0[row_ix - 1])
                row_ix -= 1
                col_ix -= 1
            elif memo[row_ix - 1, col_ix] >= memo[row_ix, col_ix - 1]:
                row_ix -= 1
            else:
                col_ix -= 1
        # chars were collected from the end of the subsequence to its start
        return ''.join(reversed(chars))
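
The test case above happens to have a contiguous answer, so it cannot distinguish a subsequence from a substring. As a cross-check, here is a minimal NumPy-free sketch of the table-and-backtrace approach. The function name `lcs` is hypothetical and not part of the challenge API, and the first input pair is an illustrative assumption chosen because its common characters are not adjacent.

```python
def lcs(str0, str1):
    """Return one longest common subsequence of str0 and str1."""
    rows, cols = len(str0) + 1, len(str1) + 1
    # table[i][j] is the LCS length of str0[:i] and str1[:j].
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if str0[i - 1] == str1[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right cell, collecting matched chars.
    chars = []
    i, j = len(str0), len(str1)
    while i > 0 and j > 0:
        if str0[i - 1] == str1[j - 1]:
            chars.append(str0[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(chars))


print(lcs('AGGTAB', 'GXTXAYB'))         # GTAB
print(lcs('ABCDEFGHIJ', 'FOOBCDBCDE'))  # BCDE
```

When several subsequences share the maximum length, this sketch returns just one of them; which one depends on how ties between the cell above and the cell to the left are broken.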

## Unit Test

**The following unit test is expected to fail until you solve the challenge.**

In [2]:
# %load test_longest_common_subseq.py
import unittest  # nose is unmaintained and fails to import on Python 3.10+


class TestLongestCommonSubseq(unittest.TestCase):

    def test_longest_common_subseq(self):
        str_comp = StringCompare()
        self.assertRaises(TypeError, str_comp.longest_common_subseq, None, None)
        self.assertEqual(str_comp.longest_common_subseq('', ''), '')
        str0 = 'ABCDEFGHIJ'
        str1 = 'FOOBCDBCDE'
        expected = 'BCDE'
        self.assertEqual(str_comp.longest_common_subseq(str0, str1), expected)
        print('Success: test_longest_common_subseq')


def main():
    test = TestLongestCommonSubseq()
    test.test_longest_common_subseq()


if __name__ == '__main__':
    main()

Success: test_longest_common_subseq


## Solution Notebook

Review the [Solution Notebook]() for a discussion on algorithms and code solutions.