In [None]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Longest Increasing Subsequence



We previously looked at approaches to identify trends in a sequence: longest run, maximum contiguous subsequence, longest gap. 

Let's look at another trend we might want to identify in a sequence. Given a sequence $S = \langle s_0, s_1, \ldots, s_{n-1} \rangle$, what is the longest increasing subsequence? Note that subsequences don't need to be contiguous.

Example: $S=\langle 5, 2, 8, 6, 3, 6, 9, 7\rangle$. Every subsequence of length 1 is trivially increasing. Also $\langle 2, 6, 9 \rangle$, $\langle 2, 8, 9 \rangle$ are increasing, as is $\langle 5, 6, 7\rangle$. What is the longest?
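To check small examples like this one by hand, a brute-force search is enough. The sketch below is only an illustration (the names `is_increasing` and `longest_increasing_brute_force` are made up for this example): it tries every subsequence from longest to shortest, so it takes exponential time and is only usable on tiny inputs.

```python
from itertools import combinations

def is_increasing(seq):
    # strictly increasing: every element is smaller than its successor
    return all(a < b for a, b in zip(seq, seq[1:]))

def longest_increasing_brute_force(S):
    # combinations() keeps elements in their original order,
    # so each combination is a (not necessarily contiguous) subsequence
    for k in range(len(S), 0, -1):
        for sub in combinations(S, k):
            if is_increasing(sub):
                return list(sub)
    return []

S = [5, 2, 8, 6, 3, 6, 9, 7]
best = longest_increasing_brute_force(S)
print(best, len(best))
```

For the example sequence this finds an increasing subsequence of length 4.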



Let's reduce this problem to something slightly simpler with the observation that the longest increasing subsequence must start somewhere in $S$.

Let $\mathit{LIS}(S, i)$ be the length of the longest increasing subsequence of $S$ that has $S[i]$ as its first element.

How can we use the function $\mathit{LIS}(S, i)$ to solve the original problem?



If we can compute $\mathit{LIS}(S, i)$ for every $i$, then we can compute $\mathit{LIS}(S) = \max_{0\leq i < n} \mathit{LIS}(S, i).$


- If $S[i]$ is the first element, then the next element $S[j]$ must satisfy $j>i$ and $S[j] > S[i]$. 
- Whichever element is next, we must have $\mathit{LIS}(S, i) = 1 + \max_{j > i:\, S[j] > S[i]} \mathit{LIS}(S, j)$, where the max over an empty set of candidates is taken to be $0$.




**Optimal Substructure for Longest Increasing Subsequence**: Given a sequence $S$, the length of the longest increasing subsequence of $S$ is $\mathit{LIS}(S) = \max_{0\leq i < n} \mathit{LIS}(S, i)$ where
$$\mathit{LIS}(S, i) = 1 + \max_{j > i:\, S[j] > S[i]} \mathit{LIS}(S, j),$$
and the max is taken to be $0$ when no such $j$ exists.
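
As a worked instance of the recurrence, take the example $S=\langle 5, 2, 8, 6, 3, 6, 9, 7\rangle$ and evaluate $\mathit{LIS}(S, i)$ from right to left, taking the max over positions $j > i$ with $S[j] > S[i]$ and treating an empty max as $0$:

$$\begin{aligned}
\mathit{LIS}(S, 7) &= 1 + 0 = 1 \\
\mathit{LIS}(S, 6) &= 1 + 0 = 1 \\
\mathit{LIS}(S, 5) &= 1 + \max\{\mathit{LIS}(S,6), \mathit{LIS}(S,7)\} = 2 \\
\mathit{LIS}(S, 4) &= 1 + \max\{\mathit{LIS}(S,5), \mathit{LIS}(S,6), \mathit{LIS}(S,7)\} = 3 \\
\mathit{LIS}(S, 3) &= 1 + \max\{\mathit{LIS}(S,6), \mathit{LIS}(S,7)\} = 2 \\
\mathit{LIS}(S, 2) &= 1 + \mathit{LIS}(S,6) = 2 \\
\mathit{LIS}(S, 1) &= 1 + \max\{\mathit{LIS}(S,2), \ldots, \mathit{LIS}(S,7)\} = 1 + \mathit{LIS}(S,4) = 4 \\
\mathit{LIS}(S, 0) &= 1 + \max\{\mathit{LIS}(S,2), \mathit{LIS}(S,3), \mathit{LIS}(S,5), \mathit{LIS}(S,6), \mathit{LIS}(S,7)\} = 3
\end{aligned}$$

So $\mathit{LIS}(S) = \max_i \mathit{LIS}(S, i) = 4$, achieved for example by $\langle 2, 3, 6, 9\rangle$.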

To evaluate this recurrence, how many distinct subproblems must be computed from scratch? 


This optimal substructure property is a little different from what we've seen so far. There are only a linear number of starting points for an optimal solution, so there are only $n$ distinct subproblems. But for each subproblem, the work to compute an optimal solution, even if we have already computed all the subproblems it depends on, is linear in the size of the sequence we consider (instead of $O(1)$). 



In [None]:
# length of the longest increasing subsequence that starts at position 0
def LIS_helper(S):
    if len(S) == 0:
        return 0
    # indices of later elements that are larger than S[0]
    rest = [j for j in range(1, len(S)) if S[j] > S[0]]
    # best continuation among those candidates (0 if there are none)
    results = [LIS_helper(S[j:]) for j in rest]
    return 1 + max(results, default=0)

def LIS(S):
    # try every starting position; an empty sequence has length 0
    return max([LIS_helper(S[i:]) for i in range(len(S))], default=0)

L = [5,2,8,6,3,6,9,7]
print(LIS(L))
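
The recursion above does not store any results, so the same suffix can be solved many times. As a rough illustration, the sketch below re-implements the same recursion on indices and counts how often each starting index is visited. The function `count_calls` and its inner `helper` are hypothetical names introduced just for this experiment.

```python
from collections import Counter

def count_calls(S):
    """Run the naive recursion and count visits to each starting index."""
    calls = Counter()

    def helper(i):
        calls[i] += 1
        # candidates for the next element: later positions with a larger value
        nexts = [j for j in range(i + 1, len(S)) if S[j] > S[i]]
        return 1 + max((helper(j) for j in nexts), default=0)

    best = max((helper(i) for i in range(len(S))), default=0)
    return best, calls

best, calls = count_calls([5, 2, 8, 6, 3, 6, 9, 7])
print(best)                 # length of the longest increasing subsequence
print(len(calls))           # number of distinct subproblems (at most n)
print(sum(calls.values()))  # total number of calls made without reuse
```

The gap between the last two numbers is the redundant work that reusing subproblem results would avoid.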


So, for a list $S$ of length $n$, we incur $O(n^2)$ work if we reuse the results from already visited subproblems: there are $n$ distinct subproblems, and each requires $O(n)$ work to combine the results of the subproblems it depends on. Since we decrease the length of the list by at least one element in each recursive call, the longest path in the DAG has length at most $n$. At each node, we require $O(\lg n)$ span to compute the max (e.g., using `reduce`) and at most $O(\lg n)$ span to compute `rest` (using `filter`), so the span is $O(n \lg n).$
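
One way to get this reuse in Python is top-down memoization. The following is a minimal sketch, not the course's reference implementation: `lis_memo` and `lis_from` are names chosen here, and `functools.lru_cache` stands in for an explicit table. It runs sequentially, so it illustrates the $O(n^2)$ work bound but not the span bound.

```python
from functools import lru_cache

def lis_memo(S):
    S = tuple(S)
    n = len(S)

    @lru_cache(maxsize=None)
    def lis_from(i):
        # length of the longest increasing subsequence starting at S[i];
        # each index is evaluated once, and each evaluation scans O(n) candidates
        return 1 + max(
            (lis_from(j) for j in range(i + 1, n) if S[j] > S[i]),
            default=0,
        )

    return max((lis_from(i) for i in range(n)), default=0)

print(lis_memo([5, 2, 8, 6, 3, 6, 9, 7]))
```

Because the recursion can be up to $n$ levels deep, very long inputs may hit Python's recursion limit; filling the table iteratively from right to left avoids that.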
