Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Note that this Pre-class Work is estimated to take **46 minutes**.

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [55]:
NAME = "Andriy Kashyrskyy"
COLLABORATORS = ""

# CS110 Pre-class Work - Computational applications of dynamic programming and greedy algorithms

## Question 1 [time estimate: 18 minutes]
Complete the following functions, following the algorithms in Cormen et al.

In [56]:
def lcs_length(x, y):
    """
    Computes the length of an LCS of strings x and y.
    
    Inputs:
    - x, y: strings
    
    Outputs:
    - c: a list of lists of ints OR a numpy array. c[i,j] contains the 
    length of a LCS of x[:i] and y[:j]
    - b: a list of lists of strings OR a numpy array, containing the information
    used for LCS reconstruction (See Cormen et al.) Use "N" (North), "NW" 
    (North West), and "W" (West) that correspond to the directions of the arrows 
    used in Cormen et al.
    """
    ## getting lengths of strings
    m = len(x)
    n = len(y)
    
    ## creating new tables
    ## (loop variables are throwaway, so they no longer reuse the name x)
    c = [[0 for _ in range(n+1)] for _ in range(m+1)]
    b = [[0 for _ in range(n+1)] for _ in range(m+1)]
    
    
    ## filling in the tables
    for i in range(1, m+1):
        
        for j in range(1, n+1):
            
            if x[i-1] == y[j-1]:
                c[i][j] = c[i-1][j-1] + 1
                b[i][j] = "NW"   
            elif c[i-1][j] >= c[i][j-1]:
                c[i][j] = c[i-1][j]
                b[i][j] = "N"  
            else:
                c[i][j] = c[i][j-1]
                b[i][j] = "W" 
    return c, b
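
For reference, the table-filling loop above appears to implement the standard LCS recurrence from Cormen et al., written here with 1-based sequence indices (so $x_i$ corresponds to `x[i-1]` in the code). The tie-breaking toward "N" when both neighbours are equal is taken from the `>=` comparison in the code above.

$$
c[i,j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
c[i-1,j-1] + 1 & \text{if } i,j > 0 \text{ and } x_i = y_j,\\
\max\bigl(c[i-1,j],\; c[i,j-1]\bigr) & \text{if } i,j > 0 \text{ and } x_i \neq y_j.
\end{cases}
$$

The three cases correspond to the "NW", "N", and "W" entries written into `b`.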

In [57]:
def print_lcs(b,x,i,j, lcs = None):
    """
    Finds a LCS.
    
    Inputs:
    - b: a list of lists of strings OR a numpy array, returned by lcs_length
    - x: string, an input to lcs_length
    - i, j: ints. print_lcs(b,x,i,j) returns a lcs of x[:i] and y[:j], where y
    is an input to lcs_length.
    - lcs: initialized as None
    
    Outputs:
    - lcs: list of strings, representing a LCS of x and y
    - length: int, the length of the LCS
    
    You can choose to actually PRINT OUT the LCS or not using the print function.
    
    """
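    ## top-level call: the caller passes 0-based last indices, so shift i and j
    ## by one to match the 1-based tables, and create an empty list for the lcs
    if lcs is None:
        i += 1
        j += 1
        lcs = []
    
    ## base case: an empty prefix contributes nothing; return the same
    ## (lcs, length) pair as every other path instead of None
    if i == 0 or j == 0:
        return lcs, len(lcs)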
    
    if b[i][j] == "NW":
        print_lcs(b, x, i-1, j-1, lcs)
        lcs.append(x[i-1])
        
    elif b[i][j] == "N":
        print_lcs(b, x, i-1, j, lcs)
    else:
        print_lcs(b, x, i, j-1, lcs)
    
    return lcs, len(lcs)

In [58]:
import numpy as np
x, y = 'ambgdec', 'aubyci'
c, b = lcs_length(x, y)
assert(print_lcs(b,x,len(x)-1,len(y)-1)[0] == ['a', 'b', 'c'])
assert(print_lcs(b,x,len(x)-1,len(y)-1)[1] == 3)

x, y = 'xyqwsssazdesaqqf', 'xoppoypllzookjdef'
c, b = lcs_length(x, y)
assert(print_lcs(b,x,len(x)-1,len(y)-1)[0]  == ['x', 'y', 'z', 'd', 'e', 'f'])
assert(print_lcs(b,x,len(x)-1,len(y)-1)[1]  == 6)

## Question 2. (Adapted from Exercise 15-4.1 Cormen et al.) [time estimate: 3 minutes]
Use the functions built in Question 1 to find the LCS of ```'10010101'``` and ```'010110110'```. You should store the list that represents the LCS you found in a variable named ```lcs_q2```

In [59]:
## creating two lists with given inputs
entry_1 = list('10010101')
entry_2 = list('010110110')

## computing the LCS tables (c: lengths, b: directions)
c, b = lcs_length(entry_1, entry_2)

## finding the LCS and storing it in lcs_q2, as the question requires
lcs_q2 = print_lcs(b, entry_1, len(entry_1)-1, len(entry_2)-1)[0]
lcs_q2

['1', '0', '0', '1', '1', '0']

## Question 3. (Adapted from Exercise 15-4.5 Cormen et al.) [time estimate: 15 minutes]
Complete the following function, making use of ```lcs_length``` and ```print_lcs```.

In [60]:
def lmis(lst):
    """
    Finds the Longest Monotonically Increasing Subsequence (LMIS) of a list 
    (lst) of n numbers in O(n^2) time. Note that a monotonically increasing 
    sequence is a sequence of numbers such that: a_1 <= a_2 <= ... <= a_n .
    
    Inputs:
    - lst: a list of ints
    
    Outputs:
    - out_lst: a list of ints, a longest monotonically increasing subsequence
    of lst
    """
    ## making a copy of the list, sorting it
    list_copy = lst.copy()
    list_copy.sort()
    
    ## computing the LCS tables for the list and its sorted copy
    c, b = lcs_length(lst, list_copy)
    
    ## reconstructing the longest common subsequence, which is an LMIS of lst
    lcsequence = print_lcs(b, lst, len(lst) - 1, len(list_copy) - 1)[0]
    return lcsequence

assert(lmis([1,2,3,4,3,2,1]) == [1,2,3,4])
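
As a cross-check on the sort-then-LCS approach above, here is a minimal sketch of a direct $O(n^2)$ dynamic program for the same problem. The name `lmis_direct` and its internals are hypothetical and not part of the assignment; it only assumes the non-strict definition ($a_1 \le a_2 \le \dots$) from the docstring above.

```python
def lmis_direct(lst):
    """Hypothetical cross-check: direct O(n^2) DP for a longest
    monotonically (non-strictly) increasing subsequence."""
    n = len(lst)
    if n == 0:
        return []
    # length[i]: length of the best subsequence ending at index i
    # prev[i]: index of the previous element in that subsequence (-1 if none)
    length = [1] * n
    prev = [-1] * n
    for i in range(n):
        for j in range(i):
            if lst[j] <= lst[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    # start from the first index that achieves the maximum length
    end = max(range(n), key=lambda k: length[k])
    out = []
    while end != -1:
        out.append(lst[end])
        end = prev[end]
    return out[::-1]

print(lmis_direct([1, 2, 3, 4, 3, 2, 1]))
print(lmis_direct([5, 1, 4, 2, 3]))
```

When several subsequences share the maximum length, the two approaches may return different ones, so comparing lengths is the safer check.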


## Question 4 [time estimate: 5 minutes]
How would you devise a greedy algorithm to compute the longest common subsequence in a string? Explain your strategy step by step, and comment on any advantages/limitations over the dynamic programming approach. Provide a few test cases to check the validity of the greedy approach.

The advantage of the dynamic programming approach is that it solves every subproblem (every pair of prefixes of the two strings), so it is guaranteed to find a longest common subsequence; its limitation is the cost of doing so, $O(mn)$ time and space for strings of lengths $m$ and $n$. The greedy approach, by contrast, makes only locally optimal choices; moving through them, it reaches a final result rather efficiently, but that result might not be the global optimum. A strategy then would be to consider prefixes of increasing length $k = 1, 2, \dots, n$, checking at each step whether the length of the longest common subsequence found so far grows as $k$ increases (if it does, we update the counter by the difference between the current and the previous best lengths).
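
To make the trade-off concrete, here is a minimal sketch of one possible greedy heuristic together with a few test cases. It is an illustration under stated assumptions, not the algorithm from Cormen et al.: the name `greedy_lcs` is hypothetical, both inputs are assumed to be strings, and the greedy rule (match each character of `x` to its earliest remaining occurrence in `y`) is only one of several choices one could make.

```python
def greedy_lcs(x, y):
    """Hypothetical greedy heuristic: scan x left to right and match each
    character to its earliest occurrence in the unused suffix of y."""
    result = []
    pos = 0  # first position of y that is still available
    for ch in x:
        k = y.find(ch, pos)  # earliest match at or after pos, or -1
        if k != -1:
            result.append(ch)
            pos = k + 1  # everything up to k is now used up
    return result

# Case 1: the greedy choice happens to be optimal (matches the DP answer)
print(greedy_lcs('ambgdec', 'aubyci'))  # ['a', 'b', 'c']

# Case 2: identical strings, trivially optimal
print(greedy_lcs('abc', 'abc'))  # ['a', 'b', 'c']

# Case 3: greedy fails. Matching 'c' first consumes all of y,
# while the true LCS is 'ab' (length 2)
print(greedy_lcs('cab', 'abc'))  # ['c']
```

The output is always a valid common subsequence and needs no $m \times n$ table, but the third case shows that an early local choice can block a longer match, which is why the greedy result is not guaranteed to be a longest one.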