In [1]:
import math
import logging
FORMAT = '[%(name)s:%(levelname)s]  %(message)s'
logging.basicConfig(level=logging.DEBUG, format=FORMAT)
logger = logging.getLogger('dbg')

def dprint(s):
    logger.debug(s)

def iprint(s):
    logger.info(s)

logger.setLevel(logging.INFO)

## Algorithm Design Strategies

| Approach | Description |
| ----------- | ----------- |
| Divide & Conquer | Breaks the problem into smaller subproblems of the same form, solves them recursively, and combines the results |
| Dynamic Programming | Like _divide & conquer_, but for overlapping subproblems: each subproblem is solved once and its result is reused |
| Greedy Algorithms | Builds a solution step by step, making the locally optimal choice at each step |
| Others | Brute-Force, Backtracking, Branch & Bound, Transform & Conquer, etc. |

### Divide & Conquer

- **Divide** the problem into one or more subproblems that are smaller instances of the
same problem.
- **Conquer** the subproblems by solving them recursively.
- **Combine** the subproblem solutions to form a solution to the original problem.
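
The three steps can be sketched on a deliberately simple problem: finding the largest element of a list. The function name `dc_max` is hypothetical and not part of these notes; it only illustrates where each step sits in a recursive function.

```python
def dc_max(S, p=0, r=None):
    """Largest element of the non-empty sub-array S[p..r] (inclusive)."""
    if r is None:
        r = len(S) - 1
    if p == r:
        return S[p]              # base case: a single element
    q = (p + r) // 2             # divide: split at the midpoint
    left = dc_max(S, p, q)       # conquer: solve each half recursively
    right = dc_max(S, q + 1, r)
    return left if left >= right else right  # combine: keep the larger

print(dc_max([1, 12, 34, 26, 7, 29, 55]))  # 55
```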

Examples: 
- _Sorting_ - MergeSort, QuickSort
- _Search_ - Binary Search
- _Computation_ - Strassen's Matrix Multiplication
- _Signal Processing_ - FFT
- _Geometric Algorithms_ - Closest Pair

#### Binary Search

_Given a sorted array, find the location of a specific element, or show that the element does not exist_

Split the array down the middle and search recursively:

In [2]:
## search S for element k
def binary_search(S, k, p = None, r = None):
    ## default sub-array is the full span
    # p is the lowest index, r the highest, both inclusive: S[p..r]
    if p is None:
        p = 0
    if r is None:
        r = len(S) - 1
    # midpoint
    q = (p+r)//2 # floor division
    if p > r:
        return -1 # Not found
    elif S[q] == k:
        # match
        return q
    elif k < S[q]:
        return binary_search(S, k, p, q-1) # Left sub array
    else:
        return binary_search(S, k, q+1, r) # Right sub array

A = [1,2,4,8,16,32,64,128]
print(A)
print(f"{8} is at {binary_search(A, 8)}")
print(f"{9} is at {binary_search(A, 9)}")



[1, 2, 4, 8, 16, 32, 64, 128]
8 is at 3
9 is at -1
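
As a cross-check, the same search can be written iteratively, or built on the standard library's `bisect.bisect_left`. This is a minimal sketch: the names `binary_search_iter` and `binary_search_bisect` are hypothetical, and both assume the input is sorted. With repeated elements the two may return different (but equally valid) positions.

```python
from bisect import bisect_left

def binary_search_iter(S, k):
    # same halving idea as the recursive version, using a loop
    p, r = 0, len(S) - 1
    while p <= r:
        q = (p + r) // 2
        if S[q] == k:
            return q
        if k < S[q]:
            r = q - 1   # continue in the left sub-array
        else:
            p = q + 1   # continue in the right sub-array
    return -1           # not found

def binary_search_bisect(S, k):
    # bisect_left returns the leftmost insertion point for k
    i = bisect_left(S, k)
    return i if i < len(S) and S[i] == k else -1

A = [1, 2, 4, 8, 16, 32, 64, 128]
print(binary_search_iter(A, 8), binary_search_iter(A, 9))      # 3 -1
print(binary_search_bisect(A, 8), binary_search_bisect(A, 9))  # 3 -1
```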


#### 1D Closest Pair (Using Sort, $O(n \log n)$)

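Sort the points, then scan adjacent pairs: in one dimension the closest pair must be adjacent in sorted order. The 2D case is more complicated and non-examinable.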

In [3]:
def mergesort_merge(S, p, q, r):
    ## copy the two sorted sublists we're trying to merge:
    ## S[p..q] and S[q+1..r] (slicing already handles single-element ranges)
    L = S[p:q+1]
    R = S[q+1:r+1]
    # trick to simplify the merge
    # if we append both with infinity, if one list ends 
    # the other will be appended without any logic change 
    L.append(float("inf"))
    R.append(float("inf"))
    i = 0 # L index
    j = 0 # R index
    # k is the index in the top level list we're replacing
    for k in range(p, r+1):
        if L[i] <= R[j]:
            S[k] = L[i]
            i += 1
        else:
            S[k] = R[j]
            j += 1
    

def mergesort(S, p = None, r = None):
    if p is None:
        iprint(S)
        p = 0
    if r is None:
        r = len(S)-1
    if p < r:
        # sort the sub-array S[p..r] (inclusive indices)
        q = (p+r)//2 # midpoint (floor division)
        # sort these two sub-arrays
        mergesort(S, p, q)
        mergesort(S, q + 1, r)
        mergesort_merge(S, p, q, r)


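def closest_point_lin(S):
    # sort in place; the closest pair is then adjacent in sorted order
    mergesort(S)
    min_distance = float("inf")
    index = -1
    # linear scan over adjacent pairs
    for i in range(1, len(S)):
        delta = S[i] - S[i-1]
        if delta < min_distance:
            min_distance = delta
            index = i
    if index == -1:
        return None, min_distance # fewer than two points
    return [S[index-1], S[index]], min_distance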

A = [1,12,34,26,7,29,55,102,234,222,65]
print(closest_point_lin(A))


[dbg:INFO]  [1, 12, 34, 26, 7, 29, 55, 102, 234, 222, 65]


([26, 29], 3)
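
A brute-force comparison of every pair, $O(n^2)$, is a useful sanity check for the sort-based version. The sketch below is an assumption-labelled helper (`closest_pair_brute` is a hypothetical name) and does not modify its input.

```python
from itertools import combinations

def closest_pair_brute(S):
    # examine every unordered pair and keep the one with the smallest gap
    best_pair, best_dist = None, float("inf")
    for a, b in combinations(S, 2):
        d = abs(a - b)
        if d < best_dist:
            best_pair, best_dist = sorted((a, b)), d
    return best_pair, best_dist

A = [1, 12, 34, 26, 7, 29, 55, 102, 234, 222, 65]
print(closest_pair_brute(A))  # ([26, 29], 3)
```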


### Dynamic Programming

Problems exhibit **optimal sub-structure** and **overlapping subproblems**.

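Optimal sub-structure means that an optimal solution to a problem incorporates optimal solutions to related subproblems, which can be solved independently. Overlapping subproblems means that the same subproblems recur many times, so their solutions can be stored and reused. Dynamic programming builds an optimal solution to the problem from optimal solutions to subproblems.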

**Key Steps:**
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution (bottom up)
4. Construct an optimal solution from computed information
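
The four steps can be sketched on a small illustrative problem that is not part of these notes: making a target amount from the fewest coins. The function `min_coins` and the coin values are assumptions chosen only to show a bottom-up table (step 3) and the walk back through stored choices (step 4).

```python
def min_coins(coins, target):
    INF = float("inf")
    # steps 1-2: best[t] = 1 + min over coins c of best[t - c], with best[0] = 0
    best = [0] + [INF] * target
    last = [None] * (target + 1)   # coin used last in an optimal solution for t
    # step 3: fill the table bottom up
    for t in range(1, target + 1):
        for c in coins:
            if c <= t and best[t - c] + 1 < best[t]:
                best[t] = best[t - c] + 1
                last[t] = c
    if best[target] == INF:
        return None                # target cannot be made from these coins
    # step 4: reconstruct the solution from the recorded choices
    chosen, t = [], target
    while t > 0:
        chosen.append(last[t])
        t -= last[t]
    return chosen

print(min_coins([1, 4, 5], 8))  # [4, 4]
```

Always taking the largest coin first would give 5 + 1 + 1 + 1 (four coins) here, which is why the table over all smaller targets is needed.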

Examples: Bellman equation, Bellman-Ford, 0-1 Knapsack Problem, Longest common subsequence

#### 0-1 Knapsack Problem

A thief robbing a store wants to take the most valuable load that can be carried in a knapsack capable of carrying at most $W$ pounds of loot. The thief can choose to take any subset of $n$ items in the store. The $i$-th item is worth $v_i$ dollars and weighs $w_i$ pounds, where $v_i$ and $w_i$ are integers. Which items should the thief take?

This is called the 0-1 knapsack problem because for each item, the thief must either take it or leave it behind.
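
To make the problem statement concrete, the sketch below enumerates every subset of items and keeps the most valuable one that fits. It is exponential in $n$ and only meant as a reference answer for small inputs; `knapsack_brute` is a hypothetical name, and the data matches the example used in the cells that follow.

```python
from itertools import combinations

def knapsack_brute(v, w, W):
    best_value, best_items = 0, ()
    n = len(w)
    # try every subset of item indices, smallest subsets first
    for size in range(n + 1):
        for items in combinations(range(n), size):
            weight = sum(w[i] for i in items)
            value = sum(v[i] for i in items)
            if weight <= W and value > best_value:
                best_value, best_items = value, items
    return best_value, list(best_items)

print(knapsack_brute([2, 4, 1, 6, 6], [1, 3, 4, 8, 1], 10))  # (14, [0, 3, 4])
```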

In [4]:
# optimal choice is items 0, 3 and 4: value 14 at weight 10

## Recursive Solution O(2^n)

def knapsack_recursive_track(v,w,W,r = None):
    if r == None:
        r = 0
    
    #print(f"Check {r} w:{w[r]} v:{v[r]}")

    if r < len(w) and W > 0:
        # Case 1 - take item at position r if possible
        # find best case with remaining options
        x, s = knapsack_recursive_track(v, w, W-w[r], r + 1)
        x += v[r]
        # Case 2 - DONT take item at position r
        # find best case with remaining options
        y, t = knapsack_recursive_track(v, w, W, r + 1)

        ## pick x but only if we still have a mass budget
        if x > y and W-w[r] >= 0:
            s.append(r)
            return (x, s)
        else:
            return (y, t)
    return 0, []

# values v, weights w, max weight W and from position r
def knapsack_recursive(v,w,W,r = None):
    ## set r to the last element
    if r == None:
        r = len(w)-1

    # if we are allowed to select a new item
    # i.e. we have weight left and our current element is > -1
    if r >= 0 and W > 0:
        # Case 1 - take item at position r
        # find best case with remaining options
        # request recursion on W-w[r] for r-1
        x = knapsack_recursive(v, w, W-w[r], r - 1) + v[r]
        # Case 2 - DONT take item at position r
        # find best case with remaining options
        # keep the weight, lose the value
        y = knapsack_recursive(v, w, W, r - 1)

        # choose the best subproblem without overdoing the weight
        if x > y and W-w[r] >= 0:
            #print(x)
            return x
        else:
            #print(y)
            return y
        
    # we've run out of items
    return 0

W = 10
w = [1,3,4,8,1]
v = [2,4,1,6,6]

print(f"Optimal Value: {knapsack_recursive(v,w,W)}")
print(f"Optimal Value: {knapsack_recursive_track(v,w,W)}")

Optimal Value: 14
Optimal Value: (14, [4, 3, 0])
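
The plain recursion re-solves the same (item, remaining weight) pairs many times. One way to see the overlap being removed is to cache those pairs with `functools.lru_cache`. This is a sketch under the same inputs; `knapsack_memo` and the inner `best` are hypothetical names.

```python
from functools import lru_cache

def knapsack_memo(v, w, W):
    @lru_cache(maxsize=None)
    def best(r, cap):
        # best value using items 0..r with remaining capacity cap
        if r < 0:
            return 0
        skip = best(r - 1, cap)                 # leave item r behind
        if w[r] > cap:
            return skip                         # item r does not fit
        take = best(r - 1, cap - w[r]) + v[r]   # take item r
        return max(skip, take)
    return best(len(w) - 1, W)

print(knapsack_memo([2, 4, 1, 6, 6], [1, 3, 4, 8, 1], 10))  # 14
```

Each distinct pair is computed once, so the work is bounded by $n \cdot (W+1)$ cached entries rather than $2^n$ calls.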


#### Dynamic Programming Solution

![image](media/dynam01.png)

In [9]:
W = 10
w = [1,3,4,8,1]
v = [2,4,1,6,6]

# solution dynamically constructs an optimal table for the best value achieved for items 0->i in weight allowance 0->W
# loop through items, first, and then weights

def dynamic_knapsack(v, w, W):
    # S[i][weight] = best value using items 0..i within the weight allowance
    # first index i runs over items 0..n-1, second over allowances 0..W
    S = [[0 for x in range(W+1)] for y in range(len(w))]

    # loop over items
    for i in range(0, len(w)):
        # row for items 0..i-1; before the first item every value is 0
        # (S[i-1] with i == 0 would wrap around to the last row)
        prev = S[i-1] if i > 0 else [0] * (W+1)
        # loop over allowances 0..W
        for weight in range(0, W+1):
            if w[i] <= weight:
                # item i fits: choose the better of
                # 1. the best value without item i at this allowance
                # 2. the value of item i plus the best value without it
                #    at the allowance reduced by w[i]
                S[i][weight] = max(prev[weight], prev[weight-w[i]] + v[i])
            else:
                # item i is too heavy for this allowance, so ignore it
                S[i][weight] = prev[weight]

    return S[len(w)-1][W], S

a, Q = dynamic_knapsack(v,w,W)
print(a)
print(f"For allowed weight of:    {list(range(len(Q[0])))}")
for i in range(len(Q)):
    print(f"Value using items: {0}->{i} - {Q[i]}")

14
For allowed weight of:    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Value using items: 0->0 - [0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
Value using items: 0->1 - [0, 2, 2, 4, 6, 6, 6, 6, 6, 6, 6]
Value using items: 0->2 - [0, 2, 2, 4, 6, 6, 6, 6, 7, 7, 7]
Value using items: 0->3 - [0, 2, 2, 4, 6, 6, 6, 6, 7, 8, 8]
Value using items: 0->4 - [0, 6, 8, 8, 10, 12, 12, 12, 12, 13, 14]
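
The table gives the optimal value; the chosen items can be recovered by walking back through it (step 4 of the key steps). The sketch below is self-contained and uses a slightly different layout from the cell above: an extra row 0 meaning "no items", which is an assumption made to keep the backtracking simple. `knapsack_items` is a hypothetical name.

```python
def knapsack_items(v, w, W):
    n = len(w)
    # T[i][c] = best value using the first i items with capacity c
    T = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(W + 1):
            T[i][c] = T[i - 1][c]                      # skip item i-1
            if w[i - 1] <= c:                          # or take it if it fits
                T[i][c] = max(T[i][c], T[i - 1][c - w[i - 1]] + v[i - 1])
    # walk back: if the value changed between rows, item i-1 was taken
    items, c = [], W
    for i in range(n, 0, -1):
        if T[i][c] != T[i - 1][c]:
            items.append(i - 1)
            c -= w[i - 1]
    return T[n][W], sorted(items)

print(knapsack_items([2, 4, 1, 6, 6], [1, 3, 4, 8, 1], 10))  # (14, [0, 3, 4])
```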


#### True Longest Common Subsequence (LCS)
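A subsequence is a string that can be derived from another string by deleting some or no elements without changing the order of the remaining elements. The longest common subsequence (LCS) of two strings is the longest string that is a subsequence of both.
We want to find the _length_ of the LCS. A dynamic programming solution fills a table of solutions for every pair of prefixes, so each entry needs only one character comparison and a lookup of neighbouring entries.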
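
The table can be described by a recurrence. Writing $c[i][j]$ for the LCS length of the prefixes $A[:i]$ and $B[:j]$ (the symbol $c$ is introduced here for illustration), and $A_i$, $B_j$ for the $i$-th and $j$-th characters counted from 1:

$$
c[i][j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1][j-1] + 1 & \text{if } i, j > 0 \text{ and } A_i = B_j \\
\max\left(c[i-1][j],\ c[i][j-1]\right) & \text{if } i, j > 0 \text{ and } A_i \neq B_j
\end{cases}
$$

The answer for the full strings is $c[m][n]$, where $m$ and $n$ are the lengths of $A$ and $B$.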

In [20]:
## LCS count dynamic programming solution
def d_lcs(A, B):
    m, n = len(A), len(B)

    # X is an (m+1) x (n+1) table of strings, initially all empty, where
    # X[i][j] is an LCS of A[:i] and B[:j]
    X = [["" for j in range(n+1)] for i in range(m+1)]


    for i in range(m): # len(A)
        for j in range(n): # len(B)

            if A[i] == B[j]: # if the characters are the same
                # we can make one more LCS 
                X[i+1][j+1] = X[i][j] + A[i]
            else:
                # choose the best so far
                if len(X[i][j+1]) > len(X[i+1][j]):
                    X[i+1][j+1] = X[i][j+1]
                else:
                    X[i+1][j+1] = X[i+1][j]

    return len(X[m][n]), X

x, y = "abcdefghi", "ndnenf"
print(f"m:{x}:{len(x)} n:{y}:{len(y)}")
v, X = d_lcs(x, y)

print(f"     n:   {[x for x in range(len(X[0]))]} of {y}:{len(y)}")
for i in range(len(X)):
    print(f"m: {0}->{i} - {X[i]}")

m:abcdefghi:9 n:ndnenf:6
     n:   [0, 1, 2, 3, 4, 5, 6] of ndnenf:6
m: 0->0 - ['', '', '', '', '', '', '']
m: 0->1 - ['', '', '', '', '', '', '']
m: 0->2 - ['', '', '', '', '', '', '']
m: 0->3 - ['', '', '', '', '', '', '']
m: 0->4 - ['', '', 'd', 'd', 'd', 'd', 'd']
m: 0->5 - ['', '', 'd', 'd', 'de', 'de', 'de']
m: 0->6 - ['', '', 'd', 'd', 'de', 'de', 'def']
m: 0->7 - ['', '', 'd', 'd', 'de', 'de', 'def']
m: 0->8 - ['', '', 'd', 'd', 'de', 'de', 'def']
m: 0->9 - ['', '', 'd', 'd', 'de', 'de', 'def']
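
If only the length is needed, the table of strings can be replaced by two rows of integers, since each row depends only on the one before it. A minimal sketch, with `lcs_length` as a hypothetical name:

```python
def lcs_length(A, B):
    # prev[j] = LCS length of the previous prefix of A with B[:j]
    prev = [0] * (len(B) + 1)
    for a in A:
        curr = [0]  # LCS with the empty prefix of B is 0
        for j, b in enumerate(B):
            if a == b:
                curr.append(prev[j] + 1)               # extend the diagonal entry
            else:
                curr.append(max(prev[j + 1], curr[j]))  # best of up and left
        prev = curr
    return prev[len(B)]

print(lcs_length("abcdefghi", "ndnenf"))  # 3
```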


In [23]:
## Three-way LCS dynamic programming solution
def threeway_lcs(A, B, C):
    m, n, o = len(A), len(B), len(C)

    # X is an (m+1) x (n+1) x (o+1) table of strings, initially all empty, where
    # X[i][j][k] holds the result for the prefixes A[:i], B[:j] and C[:k]
    X = [[["" for k in range(o+1)] for j in range(n+1)] for i in range(m+1)]


    for i in range(m): # len(A)
        for j in range(n): # len(B)
            for k in range(o): # len(C)

                if A[i] == B[j] and B[j] == C[k]: # if all three characters match
                    # extend the LCS of the three shorter prefixes
                    X[i+1][j+1][k+1] = X[i][j][k] + A[i]
                else:
                    # otherwise keep the longest result obtained by dropping
                    # the last character of A, B or C (compare by length,
                    # not lexicographically)
                    X[i+1][j+1][k+1] = max(X[i][j+1][k+1],
                                           X[i+1][j][k+1],
                                           X[i+1][j+1][k], key=len)

    return len(X[m][n][o]), X

x, y, z = ["le2ap","3le2apto","l1eaptoleap"]
print(f"{x} {y} {z}")
v, X = threeway_lcs(x, y, z)

for i in range(len(X)):
    print(f"m: {0}->{i} - {X[i]}")

le2ap 3le2apto l1eaptoleap
m: 0->0 - [['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', '']]
m: 0->1 - [['', '', '', '', '', '', '', '', '', '', '', ''], ['', '', '', '', '', '', '', '', '', '', '', ''], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l', 'l'], ['', 'l', 'l', 'l', 'l', 'l', 'l', 

### Greedy Algorithms

Greedy algorithms apply to problems that have optimal sub-structure, as in dynamic programming, and additionally the _greedy-choice property_: making the locally optimal choice at each step leads to a globally optimal solution. When this holds, we can build very efficient and optimal algorithms by only considering local state.

Examples
- _Graph Algorithms_ - Kruskal, Prim, Dijkstra
- _Compression_ - Huffman codes
- _Optimisation_ - Fractional Knapsack
- _Scheduling_ - Activity Selection Problem
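
As a contrast with the 0-1 knapsack above, the fractional knapsack allows taking part of an item, and a greedy rule (best value per unit weight first) is then optimal. The sketch below reuses the same item data as the 0-1 example; `fractional_knapsack` is a hypothetical name and positive weights are assumed.

```python
def fractional_knapsack(v, w, W):
    # consider items in decreasing order of value per unit weight
    order = sorted(range(len(w)), key=lambda i: v[i] / w[i], reverse=True)
    total, remaining = 0.0, W
    for i in order:
        if remaining <= 0:
            break
        take = min(w[i], remaining)      # whole item, or the part that fits
        total += v[i] * take / w[i]
        remaining -= take
    return total

print(fractional_knapsack([2, 4, 1, 6, 6], [1, 3, 4, 8, 1], 10))  # 15.75
```

The greedy result (15.75) exceeds the 0-1 optimum (14) because the last item is taken only in part, which the 0-1 problem forbids.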

#### Example - The Activity Selection Problem

Let $S=\{a_1,a_2,\cdots,a_n\}$ be a set of $n$ proposed activities that wish to use a resource (e.g. a lecture hall), which can serve only one activity at a time.

- Each activity has a start time, $s_i$, and a finish time, $f_i$, where $0\leq s_i <f_i<\infty$. 
- If selected, activity, $a_i$, takes place during the half-open time interval $[s_i,f_i)$. 
- Activities, $a_i$ and $a_j$, are compatible if the intervals, $[s_i,f_i)$ and $[s_j,f_j)$, do not overlap. 

In this activity selection problem, we wish to select a maximum-size subset of mutually compatible activities.

The solution is to sort the activities by finish time and repeatedly select the next compatible activity with the _earliest finish time_ - see page 421 in `Introduction to Algorithms` for a full proof.

In [29]:
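def greedy_selector(a):
    # order activity indices by finish time (stable, so ties keep input order);
    # working with indices avoids a.index(), which misreports duplicates
    order = sorted(range(len(a)), key=lambda i: a[i][1])
    selected = []
    ft = -float("inf") # finish time of the last selected activity
    for i in order:
        if a[i][0] >= ft: # starts at or after the current finish time
            ft = a[i][1]
            selected.append(i)
    return len(selected), selected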


activities = [[0,11],[2,6],[4,7],[5,10],[7,11],[10,13],[12,14]]
# choose 1,4,6 [2,6] [7,11] [12,14]
print(greedy_selector(activities))

(3, [1, 4, 6])
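
For a small input the greedy answer can be checked against an exhaustive search over subsets. This sketch is exponential and only intended as a sanity check; `compatible` and `max_compatible_brute` are hypothetical names.

```python
from itertools import combinations

def compatible(acts):
    # sorted by start time, a set is compatible when each activity
    # finishes no later than the next one starts
    acts = sorted(acts)
    return all(acts[i][1] <= acts[i + 1][0] for i in range(len(acts) - 1))

def max_compatible_brute(a):
    # try the largest subsets first and stop at the first compatible one
    for size in range(len(a), 0, -1):
        for subset in combinations(a, size):
            if compatible(subset):
                return size
    return 0

activities = [[0,11],[2,6],[4,7],[5,10],[7,11],[10,13],[12,14]]
print(max_compatible_brute(activities))  # 3
```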
