# Python Algorithms
## Chapter 8 Tangled Dependencies and Memoization
Say you have a sequence of numbers, and you want to find its longest increasing (or, rather, nondecreasing) subsequence, or one of them, if there are more. A subsequence consists of a subset of the elements in their original order. So, for example, in the sequence [3, 1, 0, 2, 4], one solution would be [1, 2, 4].
### Don’t Repeat Yourself

In [3]:
from functools import wraps
def memo(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap
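
To make the effect of `memo` visible, here is a minimal sketch that counts how often the function body actually runs. The `fib` definition (with `fib(0) = fib(1) = 1`) and the `calls` counter are assumptions added for illustration; they are not part of the cell above.

```python
from functools import wraps

def memo(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap

calls = {'n': 0}  # hypothetical counter, only here to make the saving visible

@memo
def fib(i):
    calls['n'] += 1          # counts real evaluations, not cache hits
    if i < 2:
        return 1
    return fib(i-1) + fib(i-2)

result = fib(30)
print(result, calls['n'])    # 1346269 31
```

Each argument from 0 to 30 is evaluated exactly once, so the body runs 31 times; without the decorator the same call would evaluate the body well over a million times.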

To count the $k$-sized subsets of $n$ elements, written $C(n,k)$, we often think of the elements in order, so that a single evaluation of $C(n,k)$ only has to worry about whether element number $n$ should be included. If it is included, we have to count the $(k-1)$-sized subsets of the remaining $n-1$ elements, which is simply $C(n-1,k-1)$. If it is not included, we have to look for subsets of size $k$ among the remaining elements, or $C(n-1,k)$.

In [7]:
@memo
def C(n,k):
    if k == 0: return 1
    if n == 0: return 0
    return C(n-1,k-1) + C(n-1,k)
C(6,4)

15

In [8]:
from collections import defaultdict
n,k = 10,7
C = defaultdict(int)
for row in range(n+1):
    C[row,0] = 1
    for col in range(1,k+1):
        C[row,col] = C[row-1,col-1]+C[row-1,col]
C[6,4]

15
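
As a sanity check on the in-or-out recurrence, the following sketch keeps only a single row of Pascal's triangle and compares the result with `math.comb` (available from Python 3.8). The name `pascal` is hypothetical, and this is one possible arrangement rather than the chapter's own code.

```python
from math import comb  # Python 3.8+

def pascal(n, k):
    """C(n, k) via C(n,k) = C(n-1,k-1) + C(n-1,k), keeping one row only."""
    row = [1] + [0]*k            # row 0: C(0,0) = 1 and C(0,j) = 0 for j > 0
    for _ in range(n):
        # Sweep right to left so row[col-1] still holds the previous row's value.
        for col in range(k, 0, -1):
            row[col] += row[col-1]
    return row[k]

print(pascal(6, 4))              # 15
print(all(pascal(n, k) == comb(n, k)
          for n in range(12) for k in range(12)))  # True
```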


### Shortest Paths in Directed Acyclic Graphs
At the core of dynamic programming lies the idea of sequential decision problems. Each choice you make leads to a 
new situation, and you need to find the best sequence of choices that gets you to the situation you want.  
Let’s assume that we already know the answer for all the nodes we can move to. Let’s say the distance from a node $v$ to our end node is $d(v)$. Let the edge weight of edge $(u,v)$ be $w(u,v)$. Then, if we’re in node $u$, we already (by inductive hypothesis) know $d(v)$ for each neighbor $v$, so we just have to follow the edge to the neighbor $v$ that minimizes the expression $w(u,v) + d(v)$. In other words, we minimize the sum of the first step and the shortest path from there.

In [23]:
G = {
    'a':{'b':2,'f':9},
    'b':{'c':1,'d':2,'f':6},
    'c':{'d':7},
    'd':{'e':2,'f':3},
    'e':{'f':4},
    'f':dict()
}

In [13]:
def rec_short(G,s,t):
    @memo
    def d(u):
        if u == t:
            return 0
        return min(G[u][v] + d(v) for v in G[u]) # min the sum of next step (from u to v) and from v to end point
    return d(s)
rec_short(G,'a','f')

7

In [25]:
def topsort(G):  # removing the correct node, linear time
    count = dict((n,0) for n in G)
    for n in G:
        for v in G[n]:
            count[v] += 1
    Q = [n for n in G if count[n] ==0]
    S = []
    while Q:
        a = Q.pop() # remove 0 in-degree
        S.append(a)
        for n in G[a]:
            count[n] -= 1 # update count
            if count[n] == 0:
                Q.append(n)
    return S
topsort(G)

['a', 'b', 'c', 'd', 'e', 'f']

In [27]:
def itr_short(G,s,t):
    d = {u:float('inf') for u in G} #initial estimate
    d[s] = 0
    for u in topsort(G):
        if u == t:
            break  #arrived at the destination
        for v in G[u]:
            d[v] = min(d[v],d[u] + G[u][v]) #update the last estimate with edge 
    return d[t]
itr_short(G,'a','f')

7
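
The recurrence $d(u) = \min_v \, w(u,v) + d(v)$ also tells us which edge to follow. The sketch below stores that choice next to each distance so the actual path can be read off afterwards. The name `short_path` and the explicit `cache` dictionary are assumptions made for this illustration.

```python
G = {
    'a': {'b': 2, 'f': 9},
    'b': {'c': 1, 'd': 2, 'f': 6},
    'c': {'d': 7},
    'd': {'e': 2, 'f': 3},
    'e': {'f': 4},
    'f': {},
}

def short_path(G, s, t):
    """Return (distance, path) from s to t in the DAG G, or (inf, None)."""
    inf = float('inf')
    cache = {}                       # u -> (d(u), best next node)
    def d(u):
        if u == t:
            return 0, None
        if u not in cache:
            best = (inf, None)       # stays inf if t is unreachable from u
            for v in G[u]:
                cand = G[u][v] + d(v)[0]
                if cand < best[0]:
                    best = (cand, v)
            cache[u] = best
        return cache[u]
    dist, _ = d(s)
    if dist == inf:
        return inf, None
    path = [s]
    while path[-1] != t:             # follow the stored choices
        path.append(d(path[-1])[1])
    return dist, path

print(short_path(G, 'a', 'f'))       # (7, ['a', 'b', 'd', 'f'])
```

Unlike `rec_short`, this version returns infinity instead of raising an error when a dead-end node is reached.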

### Longest Increasing Subsequence
Let’s try to find the longest increasing subsequence that ends at each given position. If we already know how to find this for the first $k$ positions, how can we find it for position $k + 1$? Once we’ve gotten this far, the answer is pretty straightforward: we just look at the previous positions and pick out those whose elements are no larger than the current one (or strictly smaller, if we want a strictly increasing subsequence). Among those, we choose the one that is at the end of the longest subsequence and extend that subsequence with the current element.

In [31]:
from random import randint
n = [randint(0,20) for _ in range(10)]  # ten random values; _ avoids reusing the name n
n

[0, 16, 7, 20, 19, 1, 14, 16, 6, 7]

In [33]:
def rec_ls(seq):
    @memo 
    def ls(cur):
        res = 1
        for pre in range(cur):
            if seq[pre] <= seq[cur]:
                res = max(res,1+ls(pre)) 
        return res
    return max(ls(i) for i in range(len(seq)))
rec_ls(n)

4

In [34]:
def itr_ls(seq):
    L = [1]*len(seq)
    for curPos,val in enumerate(seq):
        for prePos in range(curPos):
            if seq[prePos] <= val:
                L[curPos] = max(L[curPos],1+L[prePos])
    return max(L)
itr_ls(n)

4
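
To get the subsequence itself rather than just its length, we can remember, for each position, which earlier position it extended. This is a minimal sketch along those lines; `lis_with_seq` and the predecessor list `P` are names chosen for this example.

```python
def lis_with_seq(seq):
    """Return one longest nondecreasing subsequence of seq."""
    if not seq:
        return []
    L = [1]*len(seq)         # L[i]: length of the best subsequence ending at i
    P = [None]*len(seq)      # P[i]: position that i extends, or None
    for cur, val in enumerate(seq):
        for pre in range(cur):
            if seq[pre] <= val and 1 + L[pre] > L[cur]:
                L[cur] = 1 + L[pre]
                P[cur] = pre
    # Start at the position where a longest subsequence ends and walk back.
    pos = max(range(len(seq)), key=L.__getitem__)
    out = []
    while pos is not None:
        out.append(seq[pos])
        pos = P[pos]
    return out[::-1]

print(lis_with_seq([3, 1, 0, 2, 4]))   # [1, 2, 4]
```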

### Sequence Comparison
Let’s say our sequences are $a$ and $b$. As with inductive thinking in general, we start with two arbitrary prefixes, identified by their lengths $i$ and $j$. What we need to do is relate the solution to this problem to some other problems, where at least one of the prefixes is smaller. Intuitively, we’d like to temporarily chop off some elements from the end of either sequence, solve the resulting problem by our inductive hypothesis, and stick the elements back on. If we stick with weak induction (reduction by one) along either sequence, we get three cases: Chop the last element from $a$, from $b$, or from both. If we remove an element from just one sequence, it’s excluded from the LCS. If we drop the last from both, however, what happens depends on whether the two elements are equal or not. If they are, we can use them to extend the LCS by one! (If not, they’re of no use to us.)
We can express the length of the LCS of $a$ and $b$ as a function of prefix lengths $i$ and $j$ as follows:  
$$
L(i,j) = \begin{cases}
    0 &i = 0\quad\mathsf{or}\quad j = 0  \\ 
    1 + L(i-1,j-1) & a_i = b_j \\
    \max(L(i-1,j),L(i,j-1))&\mathsf{otherwise}
\end{cases}
$$  
![](../images/python%20algorithm/9.jpg)

In [4]:
def rec_lcs(a,b):
    @memo
    def L(i,j):
        if min(i,j) < 0:
            return 0
        if a[i] == b[j]:
            return 1 + L(i-1,j-1)
        return max(L(i-1,j),L(i,j-1))
    return L(len(a)-1,len(b)-1)

In [5]:
a = 'abcdefg'
b = 'adcgifg'
rec_lcs(a,b)

4

In [7]:
def itr_lcs(a,b):
    n,m = len(a),len(b)
    pre,cur = [0]*(n+1),[0]*(n+1)
    for j in range(1,m+1):
        pre,cur = cur,pre
        for i in range(1,n+1):
            if a[i-1] == b[j-1]:
                cur[i] = pre[i-1] + 1
            else:
                cur[i] = max(pre[i],cur[i-1])
    return cur[n]
itr_lcs(a,b)

4
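
The same recurrence can be traced backward to recover an actual common subsequence. The sketch below keeps the full table (so it uses more memory than `itr_lcs`) and assumes string inputs; `lcs_string` is a hypothetical name.

```python
def lcs_string(a, b):
    """Return one longest common subsequence of the strings a and b."""
    n, m = len(a), len(b)
    # L[i][j]: LCS length of the prefixes a[:i] and b[:j]
    L = [[0]*(m+1) for _ in range(n+1)]
    for i in range(1, n+1):
        for j in range(1, m+1):
            if a[i-1] == b[j-1]:
                L[i][j] = 1 + L[i-1][j-1]
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    # Walk back from the corner, repeating the choice made in each cell.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i-1] == b[j-1]:
            out.append(a[i-1])
            i, j = i-1, j-1
        elif L[i-1][j] >= L[i][j-1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(out))

print(len(lcs_string('abcdefg', 'adcgifg')))   # 4
```

When several subsequences share the maximum length, which one is returned depends on how ties are broken in the walk back.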

### The Knapsack Strikes Back
If we say that $m(r)$ is the maximum value we can get with a (remaining) capacity $r$, each value of $r$ gives us a 
subproblem. The recursive decomposition is based on either using or not using the last unit of the capacity. If we don’t use it, we have $m(r) = m(r-1)$. If we do use it, we have to choose the right object to use. If we choose object $i$ (provided it will fit in the remaining capacity), we would have $m(r) = v[i] + m(r-w[i])$, because we’d add the value of $i$, but we’d also have used up a portion of the remaining capacity equal to its weight.  

In [8]:
def rec_unbound_knapsack(w,v,c): #weight, value, capacity
    @memo
    def m(r): #value of capacity r
        if r ==0 :
            return 0
        val = m(r-1)
        for i,weight in enumerate(w): #in all objects
            if weight > r :
                continue #too heavy
            val = max(val,v[i] + m(r-weight))
        return val
    return m(c)

In [None]:
def itr_unbound_knapsack(w,v,c):
    m = [0]
    for r in range(1,c+1):
        val = m[r-1]
        for i,weight in enumerate(w): #in all objects
            if weight > r :
                continue #too heavy
            val = max(val,v[i] + m[r-weight]) # m is a list here, so index it
        m.append(val)
    return m[c]
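
To see the recurrence for $m(r)$ at work, this sketch returns the whole table instead of only the last entry. The function name `unbounded_knapsack_table` and the weights and values are made up for illustration.

```python
def unbounded_knapsack_table(w, v, c):
    """Return the list [m(0), m(1), ..., m(c)] for the unbounded knapsack."""
    m = [0]                                  # m(0) = 0: no capacity, no value
    for r in range(1, c+1):
        val = m[r-1]                         # leave the last unit of capacity unused
        for weight, value in zip(w, v):
            if weight <= r:                  # the object fits
                val = max(val, value + m[r-weight])
        m.append(val)
    return m

# Two object types: weight 3 worth 4, and weight 4 worth 6.
print(unbounded_knapsack_table([3, 4], [4, 6], 10))
# [0, 0, 0, 4, 6, 6, 8, 10, 12, 12, 14]
```

The final entry, 14, corresponds to taking the weight-3 object twice and the weight-4 object once.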

Let $m(k,r)$ be the maximum value we can have with the first $k$ objects and a remaining capacity $r$. Then, clearly, if $k = 0$ or $r = 0$, we will have $m(k,r) = 0$. For other cases, we once again have to look at what our decision is. For this problem, the decision is simpler than in the unbounded one; we need consider only whether we want to include the last object, $i = k-1$. If we don’t, we will have $m(k,r) = m(k-1,r)$. In effect, we’re just “inheriting” the optimum from the case where we hadn’t considered $i$ yet. Note that if $w[i] > r$, we have no choice but to drop the object. If the object is small enough, though, we can include it, meaning that $m(k,r) = v[i] + m(k-1,r-w[i])$. The value of $m(k,r)$ is the better of these two options.

In [None]:
def rec_knapsack(w,v,c):
    @memo
    def m(k,r):
        if k ==0 or r == 0:
            return 0
        i = k-1
        drop = m(k-1,r) # the optimal from the last one
        if w[i] > r:
            return drop
        return max(drop,v[i] + m(k-1,r-w[i]))
    return m(len(w),c)

In [9]:
def itr_knapsack(w,v,c):
    n = len(w)
    m = [[0]*(c+1) for i in range(n+1)] #max/value
    P = [[False]*(c+1) for i in range(n+1)] # drop/include
    for k in range(1,n+1):
        i = k-1
        for r in range(1,c+1): # every capacity up to c, not up to k
            m[k][r] = drop = m[k-1][r]
            if w[i] > r:
                continue
            keep = v[i]+m[k-1][r-w[i]]
            m[k][r] = max(drop,keep)
            P[k][r] = keep > drop
    return m,P
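
The table `P` records, for each $(k, r)$, whether including object $k-1$ was strictly better than dropping it, which is enough to recover one optimal selection. The following sketch repeats the table-filling code (with the capacity loop running up to `c`) and adds a hypothetical helper, `chosen_items`; the data are made up.

```python
def knapsack(w, v, c):
    n = len(w)
    m = [[0]*(c+1) for _ in range(n+1)]      # m[k][r]: best value
    P = [[False]*(c+1) for _ in range(n+1)]  # P[k][r]: was object k-1 included?
    for k in range(1, n+1):
        i = k-1
        for r in range(1, c+1):
            m[k][r] = drop = m[k-1][r]
            if w[i] > r:
                continue                     # too heavy: must drop
            keep = v[i] + m[k-1][r-w[i]]
            m[k][r] = max(drop, keep)
            P[k][r] = keep > drop
    return m, P

def chosen_items(w, P, c):
    """Indices of the objects in one optimal solution, in increasing order."""
    items, r = [], c
    for k in range(len(w), 0, -1):           # replay the decisions backward
        if P[k][r]:
            items.append(k-1)
            r -= w[k-1]                      # capacity left for earlier objects
    return items[::-1]

w, v, c = [2, 3, 4, 5], [3, 4, 5, 6], 5
m, P = knapsack(w, v, c)
print(m[len(w)][c], chosen_items(w, P, c))   # 7 [0, 1]
```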


### Binary Sequence Partitioning
### Exercise
1. Rewrite `@memo` so that you reduce the number of dict lookups by one.


In [None]:
from functools import wraps
def memo2(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
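        try:
            return cache[args]  # a cache hit costs a single lookup
        except KeyError:
            # Cache miss: compute once, store, and return without looking it up again
            cache[args] = result = func(*args)
            return result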
    return wrap

2. How can `two_pow` be seen as using the “in or out” idea? What would the “in or out” correspond to?   
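    Counting subsets: $2^n$ is the number of subsets of $n$ elements, and each element is either in or out of a given subset.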
3. Write iterative versions of `fib` and `two_pow`. This should allow you to use a constant amount of memory, while retaining the pseudolinear time (that is, time linear in the parameter `n`).  

In [2]:
def itr_fib(i):
    pre, cur = 1, 1  # fib(0) and fib(1); only two values are kept, so memory is constant
    for _ in range(i):
        pre, cur = cur, pre + cur  # slide the two-value window one step forward
    return pre
itr_fib(5)

8

4. The code for computing Pascal’s triangle in this chapter actually fills out a rectangle, where the irrelevant parts are simply zeros. Rewrite the code to avoid this redundancy.
5. Extend either the recursive or iterative code for finding the length of the shortest path in a DAG so that it returns an actual optimal path.    
   store the choice made in each node. 
6. Why won’t the pruning discussed in the sidebar “Varieties of DAG Shortest Path” have any effect 
on the asymptotic running time, even in the best case?     
   topological sorting will have to visit every node. 
7. In the object-oriented `observer` pattern, several observers may register with an observable object. These observers are then notified when the observable changes. How could this idea be used to implement the DP solution to the DAG shortest path problem? How would it be similar to or different from the approaches discussed in this chapter?    
   let each node observe its predecessors and then explicitly trigger an update in the estimate in the start node. The observers would be notified of changes and could update their own estimates accordingly, triggering new updates in their observers
8. In the `lis` function, how do we know that `end` is nondecreasing?   
   each object is inserted at the position found by `bisect`, which keeps the list sorted. 
9. How would you reduce the number of calls to `bisect` in `lis`?   
   check if the new element is larger than the last element or if end is empty
10. Extend either the recursive or one of the iterative solutions to the longest increasing subsequence problem so that it returns the actual subsequence.    
    remember the predecessors 
11. Implement a function that computes the edit distance between two sequences, either using 
memoization or using iterative DP.
12. How would you find the underlying structure for LCS (that is, the actual shared subsequence) or 
edit distance (the sequence of edit operations)?   
    keep track of the choices  
13. If the two sequences compared in `lcs` have different lengths, how could you exploit that to 
reduce the function’s memory use?   
    swap the sequences (and their lengths) so that the shorter one determines the row length  
14. How could you modify $w$ and $c$ to (potentially) reduce the running time of the unbounded 
knapsack problem?    
    divide by the greatest common divisor  
15. The knapsack solution in Listing 8-13 lets you find the actual elements included in the optimal 
solution. Extend one of the other knapsack solutions in a similar way.  
16. How can it be that we have developed efficient solutions to the integer knapsack problems, when 
they are regarded as hard, unsolved problems (see Chapter 11)?     
     The running time is pseudopolynomial
17. The subset sum problem is one you’ll also see in Chapter 11. Briefly, it asks you to pick a subset of a set of integers so that the sum of the subset is equal to a given constant, $k$. Implement a solution to this problem based on dynamic programming.   
18. A problem closely related to finding optimal binary search trees is the matrix chain 
multiplication problem, briefly mentioned in the text. If matrices A and B have dimensions n×m and 
m×p, respectively, their product AB will have dimensions n×p, and we approximate the cost of this 
multiplication by the product nmp (the number of element multiplications). Design and implement 
an algorithm that finds a parenthetization of a sequence of matrices so that performing all the matrix multiplications has as low total cost as possible.   
19. The optimal search trees we construct are based only on the frequencies of the elements. We 
might also want to take into account the frequencies of various queries that are not in the search tree. For example, we could have the frequencies for all words in a language available but store only some of the words in the tree. How could you take this information into consideration?
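
For Exercise 17, one possible DP formulation tracks which sums are reachable and which element was used to reach each of them first. This is a sketch under the assumption that all integers are nonnegative; `subset_sum` and `reach` are names chosen here, and the running time is pseudopolynomial, proportional to the number of elements times $k$.

```python
def subset_sum(nums, k):
    """Return a list of elements of nums summing to k, or None if none exists.

    Assumes nonnegative integers; each element is used at most once.
    """
    reach = {0: None}            # sum -> index of the element that first reached it
    for i, x in enumerate(nums):
        # Iterate over a snapshot so element i is never combined with itself.
        for s in list(reach):
            t = s + x
            if t <= k and t not in reach:
                reach[t] = i
    if k not in reach:
        return None
    out, s = [], k
    while s != 0:                # walk back through the recorded elements
        i = reach[s]
        out.append(nums[i])
        s -= nums[i]
    return out[::-1]

print(subset_sum([3, 34, 4, 12, 5, 2], 9))    # [4, 5]
print(subset_sum([3, 34, 4, 12, 5, 2], 30))   # None
```

Because a sum is recorded only the first time it is reached, the indices found during the walk back strictly decrease, so no element appears twice in the answer.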
