# Dynamic Programming

## fibonacci brute force

In [2]:
def fib(n):
    if n <=2: return 1
    return fib(n-1) + fib(n-2)

print(fib(1))
print(fib(5))
print(fib(7))

1
5
13


Time O(2^n)
Space O(n)

## fibonacci memoization

In [4]:
def fib(n, memo={}):
    # Return the stored result if this n was already computed
    if n in memo:
        return memo[n]
    if n <= 2:
        memo[n] = 1
        return 1
    memo[n] = fib(n-1, memo) + fib(n-2, memo)
    return memo[n]

print(fib(1))
print(fib(5))
print(fib(7))

1
5
13


Time O(n)
Space O(n)

## fibonacci tabulation

In [None]:
def fibonacci_tabulation(n):
    if n <= 1:
        return n

    # Create a table to store Fibonacci numbers
    dp = [0] * (n + 1)
    dp[1] = 1

    # Fill the table iteratively
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]

    return dp[n]
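
Since each table entry depends only on the two entries before it, the full dp list is not strictly needed. A minimal sketch of that idea, keeping just two running values (the name fibonacci_two_vars is illustrative and not part of the original code):

```python
def fibonacci_two_vars(n):
    # prev and curr play the roles of dp[i - 1] and dp[i]
    if n <= 1:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr

print(fibonacci_two_vars(7))
```

Under these assumptions the running time stays O(n) while the extra space drops to O(1).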

# The Coin Change Problem

Given a list of possible coin denominations, D, and a total amount, a, the problem is to compute r, the minimum number of coins (where the denomination of each coin is an element of D) needed to sum exactly to a. If this cannot be done with the given denominations, return -1.

Example:

D = [1, 5, 10, 20]

a = 115

r = 7

Explanation:

The minimum number of coins with denominations in D needed to sum to a is 7.

These coins are: [20,20,20,20,20,10,5]

The problem appears trivial at first glance. One may be tempted to use a greedy method as follows:

## Greedy approach to the coin change problem

Input: D, a 

S := Sort D ascending

r := 0

total := 0

While total < a:

    if |S| == 0:
        return -1

    if total + S[-1] > a:

        S.pop()

    else:

        total += S[-1]
        r += 1

Return r

In this greedy algorithm, we always choose the coin with the largest value which will not make the total exceed a.
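
The pseudocode above can be sketched in Python as follows. The function name coin_change_greedy is an assumption made for illustration; the logic follows the pseudocode step by step.

```python
def coin_change_greedy(D, a):
    # Sort ascending so the largest remaining coin is always S[-1]
    S = sorted(D)
    r = 0
    total = 0
    while total < a:
        if not S:
            return -1  # ran out of denominations before reaching a
        if total + S[-1] > a:
            S.pop()  # the largest coin no longer fits, discard it
        else:
            total += S[-1]
            r += 1
    return r

print(coin_change_greedy([1, 5, 10, 20], 115))
```

For the example above this prints 7, matching the expected result.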

### Optimality of the Greedy approach

This algorithm is not optimal however, and we can prove this by counter-example.

A counterexample is D = [5,4,3,1], a = 7

Given these inputs, the greedy result is: r1 = 3  ([5,1,1])

The optimal solution for these inputs is: r2 = 2 ([4,3])

We see that r1 > r2, meaning the greedy approach does not find the minimum number of coins.

### Correctness of the Greedy approach

The greedy algorithm is also not correct, and we can prove this by another counter-example.

A counter example is D = [4,3], a = 6

Given these inputs, the greedy result is: r1 = -1  ([4])

The optimal solution for these inputs is: r2 = 2 ([3,3])

We see that the greedy approach fails when a solution is indeed possible, as shown by r2.

Since we have shown that the greedy approach is neither correct nor optimal, we move on to the brute force solution.

## Brute force solution

To try all possible coin combinations, we can subtract each c (coin in D) from a, as long as a - c >= 0.

We can repeat this step for each result obtained from this calculation (replacing a with the result), until all possible coin combinations are explored.

We can then select the shortest path through the resulting tree which has a leaf value of 0.

The length of this path is r.

Below is an implementation of this algorithm.

In [1]:
def coin_change_bf(D, a):
    def dfs(a):
        if a == 0:
            return 0
        if a < 0:
            return float('inf')
        return min([1+dfs(a-c) for c in D])
    minimum = dfs(a)
    return minimum if minimum < float("inf") else -1

print(coin_change_bf([5,4,3,2,1], 7))

2


### Coin Change Brute Force Complexity Analysis

Time Complexity:

For the worst case scenario, let's assume each coin in D < a such that each node which is not a leaf node has |D| children. This means we have |D| recursive calls at the first level, |D|^2 at the second level, |D|^3 at the third level and so on...

The total number of recursive calls in this scenario is |D| + |D|^2 + ... + |D|^a = O(|D|^a)

Therefore the time complexity is O(|D|^a). This is because at each step, there are |D| choices (coin denominations) to consider, and the recursion depth is at most 'a' (target amount).

Space Complexity:

The space complexity is determined by the maximum depth of the recursion stack. In the worst case, the recursion depth is equal to the target amount 'a'. Therefore, the space complexity is O(a).

Overall:

Time Complexity: O(|D|^a)

Space Complexity: O(a)

## Top Down Memoization for Coin Change

In the brute force algorithm, the same remaining amount can be reached along many different paths. We can store the result for each remaining amount in a table, so that the next time we arrive at a value, e.g. 3, we don't have to repeat the work of finding the minimum number of extra coins needed to reach a. Instead we look it up in the table in constant time.

With this optimization each amount from 1 to a is solved at most once, and each solve tries every coin in D, so the worst-case time complexity drops from O(|D|^a) to O(a * |D|). The space complexity remains O(a), for the memo table and the recursion stack.

In [None]:
def coin_change_memo(D, a):
    memo = {}
    def dfs(a):
        if a == 0:
            return 0
        if a < 0:
            return float('inf')
        if a in memo:
            return memo[a]

        memo[a] = min([1+dfs(a-c) for c in D])
        return memo[a]

    res = dfs(a)
    return res if res < float("inf") else -1

print(coin_change_memo([5,4,3,2,1], 7))

2


## Tabulation for Coin Change

Instead of filling in the memo table with a recursive dfs, which needs a recursion stack up to a frames deep, we can calculate the values in the table directly, from the smallest amount upwards, and extract the answer from there.

We will call the memo table dp, as we are no longer doing memoization, but tabulation.

dp[i] represents the minimum amount of coins needed to get the amount i.

For the example D=[5,4,3,1] a=7

We initialize each dp[i] to contain infinity.

We know that dp[0] = 0 as it takes 0 coins to add up to an amount of 0. We can initialize this in our table.

Now we can deduce dp[1], dp[2] ... until we reach dp[a].

To get dp[i], we will look at each c element of D in sequence.

For each c in D, we take i - c to get t, and look for dp[t] if it exists.

Our result is 1+dp[t]

If t is not negative and this result is less than the current dp[i], we update dp[i] = 1+dp[t]

The logic is demonstrated with the examples:

Example 1: Calculating dp[1]

dp[0]=0
dp[1]=inf
dp[2]=inf
dp[3]=inf
dp[4]=inf
dp[5]=inf
dp[6]=inf

to calculate dp[1]:

for c in D=[5,4,3,1]

t = i-c = 1-5 = -4 -> ignore because negative

t = i-c = 1-4 = -3 -> ignore because negative

t = i-c = 1-3 = -2 -> ignore because negative

t = i-c = 1-1 = 0

dp[0] = 0 (lookup of dp[t])

1+dp[0] = 1 (add the current coin to the previous solution 1+dp[t])

This means a possible solution to dp[1] is 1

Since all other coin values would make t negative, and 1 < inf, we can update dp[1] to 1.

Example 2: Calculating dp[7]

dp[0]=0
dp[1]=1
dp[2]=2
dp[3]=1
dp[4]=1
dp[5]=1
dp[6]=2

To calculate dp[7]:

For c in D = [5,4,3,1]:

t = i-c = 7-5 = 2 -> dp[2] = 2, 1+dp[2] = 3, 3 < inf, update dp[7] to 3

t = i-c = 7-4 = 3 -> dp[3] = 1, 1+dp[3] = 2, 2 < 3, update dp[7] to 2

t = i-c = 7-3 = 4 -> dp[4] = 1, 1+dp[4] = 2, 2 = 2, ignore

t = i-c = 7-1 = 6 -> dp[6] = 2, 1+dp[6] = 3, 3 > 2, ignore

We conclude that the minimum solution to dp[7] is 2, achieved by adding a 4 coin to dp[3], which is achieved by adding a 3 coin to dp[0]

4 + 3 = 7


In [6]:
def coin_change_dp(D,a, printTable=False):
    dp=[float('inf')] * (a + 1) # initialize all fields in dp to infinity 
    dp[0] = 0

    for i in range(1, a+1):
        for c in D:
            t = i - c
            if t >= 0:
                dp[i] = min(dp[i], 1+dp[t])

    if printTable:
        print(dp)

    return dp[a] if dp[a] != float('inf') else -1

print(coin_change_dp([5,4,3,1], 7, printTable=True))

[0, 1, 2, 1, 1, 1, 2, 2]
2
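
The walkthrough of dp[7] also identifies which coins were used (a 4 added to dp[3], which is a 3 added to dp[0]). A minimal sketch of recovering those coins, assuming all denominations are positive integers; the function name coin_change_with_coins and the helper list last_coin are hypothetical additions, not part of the original code:

```python
def coin_change_with_coins(D, a):
    dp = [float('inf')] * (a + 1)
    dp[0] = 0
    # last_coin[i] is the coin added last in one optimal solution for amount i
    last_coin = [None] * (a + 1)

    for i in range(1, a + 1):
        for c in D:
            t = i - c
            if t >= 0 and 1 + dp[t] < dp[i]:
                dp[i] = 1 + dp[t]
                last_coin[i] = c

    if dp[a] == float('inf'):
        return -1, []

    # Walk back from a to 0, collecting the coin chosen at each step
    coins = []
    i = a
    while i > 0:
        coins.append(last_coin[i])
        i -= last_coin[i]
    return dp[a], coins

print(coin_change_with_coins([5, 4, 3, 1], 7))
```

For D=[5,4,3,1] and a=7 this prints (2, [4, 3]). When several optimal solutions exist, the sketch returns the one found first in the order of D.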


## Coin Change Tabulation Complexity Analysis

Time Complexity:

For the worst case scenario, we need to iterate over every amount i from 1 to a.

And for each i, we need to iterate over each coin in D

All other operations within the loops are constant time lookups and subtractions, so the time complexity is O(a * |D|)

Space Complexity:

The space complexity is determined by the size of the dp array. This array is always of size a+1.

Therefore the space complexity is O(a)

Overall:

Time Complexity: O(a * |D|)

Space Complexity: O(a)

# Longest Increasing Subsequence

Given an array nums, return the length of the longest strictly increasing subsequence. A subsequence does not have to be contiguous.

Example: nums = [2,5,3,7,101,18]

Output: 4

Explanation: The subsequence [2,5,7,101] is the longest increasing subsequence, with length 4.

Much like coin change, this problem appears trivial at first glance. One may attempt to be greedy as follows:

## Longest Increasing Subsequence Greedy Approach

Input: nums

r := 1

index := 0

cur := nums[0]

While index < |nums|:

    if nums[index] > cur:

        r += 1

        cur := nums[index]

    index +=1

Return r

In this greedy algorithm we iterate through nums keeping track of the current maximum value encountered, incrementing our result each time a new larger value is encountered.
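
The greedy scan can be sketched in Python as follows. The function name length_of_lis_greedy is illustrative; the sketch assumes the first element counts as the start of the subsequence and that an empty input returns 0.

```python
def length_of_lis_greedy(nums):
    if not nums:
        return 0
    r = 1          # nums[0] starts the subsequence
    cur = nums[0]  # largest value taken so far
    for x in nums[1:]:
        if x > cur:
            r += 1
            cur = x
    return r

print(length_of_lis_greedy([10, 9, 2, 5, 3, 7, 101, 18]))
```

For this input the sketch prints 2, because it commits to 10 and can only extend with 101.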

### Optimality of the Greedy approach

This algorithm is not optimal however, and we can prove this by counter-example.

A counter example is nums = [10,9,2,5,3,7,101,18]

Given these inputs, the greedy result is: r1 = 2  ([10,101])

An optimal solution for these inputs is: r2 = 4 ([2,3,7,101])

We see that r1 < r2, meaning the greedy approach does not find the maximised solution.

We therefore need a more sophisticated approach.

## Longest Increasing Subsequence Brute Force

We can try a brute force approach, where we start at index 0, and for each index choose whether to exclude it from the subsequence or include it in the subsequence. We keep track of prev_index, which represents the last index we included in the result, and current_index, which is the index for which we are making the choice.

This will generate all possible increasing subsequences.

We keep track of the longest increasing subsequence length, and return it.

Below is an implementation of this algorithm.

In [29]:
def length_of_lis_bf(nums):
    def dfs(prev_index, current_index):
        # Base case: reached the end of the sequence
        if current_index == len(nums):
            return 0

        # Case 1: Exclude the current element
        exclude_current = dfs(prev_index, current_index + 1)

        # Case 2: Include the current element if it is greater than the previous one
        include_current = 0
        if prev_index < 0 or nums[current_index] > nums[prev_index]:
            include_current = 1 + dfs(current_index, current_index + 1)

        # Return the maximum length of the two cases
        return max(exclude_current, include_current)

    # Start the recursion with initial indices (-1 represents no previous index)
    return dfs(-1, 0)

print(length_of_lis_bf([10,9,2,5,3,7,101,18]))

4


## Longest Increasing Subsequence Brute Force Complexity Analysis

Time Complexity:

Let n be the length of nums.

For the worst case scenario, there are n indices to consider. There are two subtrees at each decision, one where we include the current index, and one where we do not.

This brings the time complexity to O(2^n)

Space Complexity:

The space complexity is determined by the recursion depth.

Therefore the space complexity is O(n)

Overall:

Time Complexity: O(2^n)

Space Complexity: O(n)

## Longest Increasing Subsequence Memoization

We can use memoization to avoid repeating subproblems, such as when we are deciding whether the next element should be added or not for multiple subsequences ending in the same element.

Before proceeding with the recursive calls, the function checks if the result for the current combination of prev_index and current_index is already computed and stored in the memo dictionary. If it is, the stored result is returned immediately. This optimization allows the algorithm to avoid repeating work, speeding up the runtime.

In [35]:
def length_of_lis_memo(nums):
    if not nums:
        return 0

    memo = {}  # Memoization dictionary to store computed results

    def dfs(prev_index, current_index):
        if current_index == len(nums):
            return 0

        if (prev_index, current_index) in memo:
            return memo[(prev_index, current_index)]
        
        exclude_current = dfs(prev_index, current_index + 1)

        include_current = 0
        if prev_index < 0 or nums[current_index] > nums[prev_index]:
            include_current = 1 + dfs(current_index, current_index + 1)

        
        # Save the result in the memoization dictionary
        memo[(prev_index, current_index)] = max(include_current, exclude_current)

        return memo[(prev_index, current_index)]

    return dfs(-1, 0)

print(length_of_lis_memo([10,9,2,5,3,7,101,18]))

4


## Longest Increasing Subsequence Memoization Complexity Analysis

Time Complexity:

For each unique combination of (prev_index, current_index), the algorithm either calculates the result or looks it up in the memoization table. Apart from its two recursive calls, each subproblem does a constant amount of work, so the cost of a single subproblem is O(1).

The algorithm explores all combinations of prev_index and current_index. There are at most n choices for current_index and, in the worst case, n choices for prev_index for each current_index. Therefore, the total number of unique subproblems is O(n^2), and the overall time complexity is O(n^2).

Space Complexity: 

The space complexity is increased to O(n^2), as the memo table needs to store all n^2 combinations in the worst case.

Overall:

Time Complexity: O(n^2)

Space Complexity: O(n^2)

## Longest Increasing Subsequence Tabulation

We can use tabulation to build a table from which we can deduce the result, similar to the coin change problem.

We know that starting at the last index will result in an increasing subsequence of length 1. We can work backwards through the second-last element, the third-last, etc., deciding if including that element will result in a longer increasing subsequence or not, and storing the length of the longest possible increasing subsequence starting at each index until we reach index zero.

We create a table called dp of size len(nums), where dp[i] represents the longest increasing subsequence starting at index i in nums.

Let's take the example nums = [1,2,4,3]

We initialize dp[3] to 1, as the longest increasing subsequence starting at index 3 has length 1.

Consider nums[2] = 4

We can either take nums[2] by itself, or include nums[2] in any subsequence at any index that comes after it (if it maintains the property of an increasing subsequence). Including it would make dp[2] = 1+dp[3], Excluding it would make dp[2] = 1.

Since including it would not result in an increasing subsequence, we must exclude it, so dp[2] = 1

Now consider nums[1] = 2

We can either take it by itself or include it in any subsequence at any index that comes after it. Including it would make dp[1] = max(1+dp[2], 1+dp[3]), taking it by itself would make dp[1] = 1

We choose the option which maximizes the value of dp[1], which is 1+dp[2] (or equally 1+dp[3]) = 2

So for dp[i], by the same logic, we simply put max(1,1+dp[j1],1+dp[j2],1+dp[j3]...) (only include 1+dp[jx] in the max function if nums[i] < nums[jx], to maintain increasing subsequence property)


In [3]:
def length_of_lis_dp(nums, printTable = False):
    dp = [1] * len(nums)

    for i in range(len(nums)-1,-1,-1):
        for j in range(i+1,len(nums)):
            if nums[i] < nums[j]:
                dp[i] = max(dp[i], 1+dp[j])

    if printTable:
        print(dp)

    return max(dp)

print(length_of_lis_dp([10,9,2,5,3,7,101,18], printTable=True))

[2, 2, 4, 3, 3, 2, 1, 1]
4


## Longest Increasing Subsequence Tabulation Complexity Analysis

Let n be the length of nums

Time Complexity:

For the worst case scenario, we need to perform a double nested iteration over nums.

All other operations within the loops are constant time lookups and max(a,b), so the time complexity is O(n^2)

Space Complexity:

The space complexity is determined by the size of the dp array. This array is always of size n.

Therefore the space complexity is O(n)

Overall:

Time Complexity: O(n^2)

Space Complexity: O(n)

# contiguous integer sequence that sums to the largest value

input: an array of integers
output: largest value of a sum of a subarray

Example: [-2,1,-3,4,-1,2,1,-5,4] -> 6
Explanation: [4,-1,2,1] has the largest sum 6

Brute force: find all subarrays and sum them, return the largest sum
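
A minimal sketch of the brute force idea, for comparison. The function name max_subarray_sum_bf is illustrative, and returning 0 for an empty input is an assumption. It keeps a running sum per start index rather than re-summing each subarray, which gives O(n^2) time.

```python
def max_subarray_sum_bf(nums):
    if not nums:
        return 0  # assumption: an empty input yields 0
    best = nums[0]
    for start in range(len(nums)):
        running = 0
        for end in range(start, len(nums)):
            running += nums[end]  # sum of nums[start:end + 1]
            best = max(best, running)
    return best

print(max_subarray_sum_bf([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
```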

Kadane's Algorithm: Iterate through the array, updating current_sum at each step. If current_sum has become negative, it is reset to 0 before the current element is added, since a negative running total can only lower the sum of any subarray that extends it. The maximum subarray sum is updated whenever a new maximum is found.

If max_sum is initialized to 0, the algorithm returns 0 for an array that consists entirely of negative numbers, which corresponds to allowing the empty subarray.

To require a non-empty subarray, initialize max_sum to the first element of the array instead of 0, as the implementation below does. This way, the algorithm returns the largest single negative element if the array consists entirely of negative numbers.

Time O(n)
Space O(1)


Kadane's algorithm is a dynamic programming algorithm. The subproblem is the largest sum of a subarray ending at index i, which is either nums[i] alone or nums[i] added to the best sum ending at index i-1. Because each subproblem depends only on the one before it, a single variable (current_sum) replaces the table, while max_sum tracks the best answer over all end positions.
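
To make the dynamic programming view concrete, here is a sketch that stores the local state in an explicit table instead of a single variable. The function name max_subarray_sum_table and the list best_ending are hypothetical names chosen for illustration.

```python
def max_subarray_sum_table(nums):
    if not nums:
        return 0
    # best_ending[i] is the largest sum of a subarray that ends exactly at index i
    best_ending = [0] * len(nums)
    best_ending[0] = nums[0]
    for i in range(1, len(nums)):
        # Either start fresh at nums[i] or extend the best subarray ending at i - 1
        best_ending[i] = max(nums[i], best_ending[i - 1] + nums[i])
    return max(best_ending)

print(max_subarray_sum_table([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
```

This version uses O(n) space. Keeping only the previous table entry gives the O(1) space form.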

## Kadanes Algorithm

In [47]:
def max_subarray_sum(nums):
    if not nums:
        return 0

    max_sum = current_sum = nums[0]

    for num in nums[1:]:
        if current_sum < 0:
            current_sum = 0
        current_sum += num
        max_sum = max(max_sum, current_sum)

    return max_sum

print(max_subarray_sum([-2,1,-3,4,-1,2,1,-5,4]))

6


# longest alternating subsequence

a sequence {x1,x2,x3,x4..xn} is alternating if its elements satisfy one of the following:

`x1>x2<x3>x4<..`

`x1<x2>x3<x4>..`

the task is to find the longest alternating subsequence in a list

note that a subsequence does not have to be contiguous, meaning given a list:

[1,17,5,10,13,15,10,5,16,8], the answer is [1,17,5,15,5,16,8] -> 7

Two arrays, inc and dec, are initialized with all elements set to 1. inc[i] stores the length of the longest alternating subsequence ending at index i whose last step is an increase, and dec[i] stores the length of the longest one ending at index i whose last step is a decrease.

The algorithm iterates through the input sequence, considering each element arr[i] and updating the inc and dec arrays based on the following conditions:

If arr[i] is greater than arr[j] for some previous index j, arr[i] can extend an alternating subsequence ending at j whose last step was a decrease. In this case, inc[i] is updated to be the maximum of its current value and dec[j] + 1.

If arr[i] is smaller than arr[j] for some previous index j, arr[i] can extend an alternating subsequence ending at j whose last step was an increase. In this case, dec[i] is updated to be the maximum of its current value and inc[j] + 1.

The length of the longest alternating subsequence is the maximum value in the inc and dec arrays.

In [None]:
def longest_alternating_subsequence(arr):
    n = len(arr)
    
    # Initialize arrays to store lengths of longest increasing and decreasing subsequences
    inc = [1] * n
    dec = [1] * n

    # Dynamic Programming
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j]:
                inc[i] = max(inc[i], dec[j] + 1)
            elif arr[i] < arr[j]:
                dec[i] = max(dec[i], inc[j] + 1)

    # Find the maximum value in inc and dec arrays
    result = max(max(inc), max(dec))
    return result

print(longest_alternating_subsequence([1,17,5,10,13,15,10,5,16,8]))

7


Time complexity: O(n^2)

Space complexity: O(n)

Observe that at each step we only need the values from the most recent index, so we do not need to store the entire inc and dec arrays. Instead, we can keep just two variables, inc and dec.

This reduces time complexity to O(n)

and space complexity to O(1)

## longest alternating subsequence optimized

In [None]:
def longest_alternating_subsequence(arr):
    n = len(arr)

    # Initialize lengths of longest increasing and decreasing subsequences
    inc = 1
    dec = 1

    # Dynamic Programming
    for i in range(1, n):
        if arr[i] > arr[i - 1]:
            inc = dec + 1
        elif arr[i] < arr[i - 1]:
            dec = inc + 1

    # The maximum of inc and dec is the length of the longest alternating subsequence
    result = max(inc, dec)
    return result

print(longest_alternating_subsequence([1,17,5,10,13,15,10,5,16,8]))

7


# binomial coefficients

given as input positive integers n and k, calculate how many unique ways you can select k items from a list of size n. in other words, calculate C(n,k) where

C(n,k) = n! / (k! (n-k)!)

we could simply code a factorial function and calculate the formula, but for the purpose of this demonstration, let's solve this using dynamic programming
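
for reference, a sketch of the direct formula. the function name binomial_factorial is illustrative; math.factorial and math.comb are standard library functions (math.comb assumes Python 3.8 or newer), and the sketch assumes 0 <= k <= n.

```python
from math import comb, factorial

def binomial_factorial(n, k):
    # Integer division is exact here because k! (n-k)! always divides n!
    return factorial(n) // (factorial(k) * factorial(n - k))

print(binomial_factorial(4, 2))
print(comb(4, 2))
```

both calls print 6, which gives a reference value to compare the dynamic programming versions against.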

also, by Pascal's rule,

C(n,k) = C(n-1,k-1) + C(n-1,k)

C(n,0) = 1

C(n,n) = 1

we can use this as a recursive case and base cases in a sub-optimal solution.

## recursive binomial coefficients

In [None]:
def C(n,k):
    if k == 0 or k == n: return 1
    return C(n-1,k-1) + C(n-1,k)

print(C(3,1))
print(C(4,2))

3
6


if we look at the trace of calculating C(4,2), we can see that it is the sum of:

C(3,1) and C(3,2)

to calculate C(3,1) and C(3,2) we must know C(2,1). meaning we have repeated calculations of C(2,1)

we can use memoization to store each calculation to avoid repeating subproblems

## memoization binomial coefficients

In [None]:
def C(n,k,memo={}):
    if k == 0 or k == n: return 1

    if (n,k) in memo:
        return memo[(n,k)]
    
    result = C(n-1,k-1,memo) + C(n-1,k,memo)
    memo[(n,k)] = result
    return result

print(C(3,1))
print(C(4,2))


3
6


we can also create the table directly.

consider a table dp with dimensions n+1 x k+1,

where dp[i][j] in the table contains the result of C(i,j)

we could build this table up starting from our base cases:

C(n,0) = 1

C(n,n) = 1

, until we have a solution to C(n,k)

So for C(4,2), we would initialize the following table

|   | '0' | '1' | '2' |
|---|---|---|---|
| '0' | 1 | 0 | 0 |
| '1' | 1 | 1 | 0 |
| '2' | 1 |   | 1 |
| '3' | 1 |   |   |
| '4' | 1 |   |   |

Now, we can use C(n,k) = C(n-1,k-1) + C(n-1,k) to fill the table

For example, to get dp[2][1] (C(2,1)), we take dp[1][0] + dp[1][1] = 2

Do this for the entire table

|   | '0' | '1' | '2' |
|---|---|---|---|
| '0' | 1 | 0 | 0 |
| '1' | 1 | 1 | 0 |
| '2' | 1 | 2 | 1 |
| '3' | 1 | 3 | 3 |
| '4' | 1 | 4 | 6 |

until we arrive at dp[n][k] which will give the answer of C(n,k)


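note that C(i,j) = 0 whenever j > i, so row i only needs to be filled up to column min(i,k). this is implemented by using min(i,k) as the upper bound of the inner loop in the code.

the symmetry C(n,k) = C(n,n-k) (e.g. C(9,8) = C(9,1)) can avoid further unnecessary work: replacing k with min(k, n-k) before building the table shrinks it to min(k, n-k)+1 columns. the code below does not apply this step.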

## tabulation binomial coefficients

In [109]:
def C(n,k, printTable=False):
    dp = [[0] * (k+1) for _ in range(n+1)]

    #fill in the base cases
    for i in range(n+1):
        dp[i][0] = 1
        dp[i][min(i,k)]=1

    for i in range(1,n+1):
        # only columns up to min(i,k) can be non-zero, since C(i,j) = 0 for j > i
        for j in range(1,min(i,k)+1):
            dp[i][j] = dp[i-1][j-1] + dp[i-1][j]

    if printTable:
        for row in dp:
            print(row)
            
    return dp[n][k]

print(C(3,1))
print(C(4,2, printTable=True))

3
[1, 0, 0]
[1, 1, 0]
[1, 2, 1]
[1, 3, 3]
[1, 4, 6]
6


Time: O(n*k)
Space: O(n*k)

## optimization

Notice that we can calculate the values in any row r using the values in row r-1

Hence, we do not need to store the entire dp table in memory, only two rows at a time.

The current row r, and the previous row r-1.

This reduces space complexity to O(2k) -> O(k)

In [108]:
def C(n,k,printTable=False):
    oldRow = [0] * (k+1)
    oldRow[0] = 1

    for i in range(1,n+1):
        newRow = [0] * (k+1)
        newRow[0] = 1
        for j in range(1,min(i,k)+1):
            newRow[j] = oldRow[j-1] + oldRow[j]

        if printTable:
            print(oldRow)
            print(newRow)
            print("-------------")

        oldRow = newRow
    
            
    # oldRow holds the last computed row (and the base row when n == 0)
    return oldRow[k]

print(C(4,2,printTable=True))

[1, 0, 0]
[1, 1, 0]
-------------
[1, 1, 0]
[1, 2, 1]
-------------
[1, 2, 1]
[1, 3, 3]
-------------
[1, 3, 3]
[1, 4, 6]
-------------
6
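
The two-row idea can be taken one step further. The sketch below keeps a single row and updates it from right to left, and it also applies the symmetry C(n,k) = C(n,n-k) to shorten the row. The function name binomial_one_row is a hypothetical name for illustration, and returning 0 for k outside 0..n is an assumption.

```python
def binomial_one_row(n, k):
    if k < 0 or k > n:
        return 0  # assumption: no way to choose k items outside 0..n
    k = min(k, n - k)  # symmetry: C(n,k) = C(n,n-k)
    row = [0] * (k + 1)
    row[0] = 1
    for i in range(1, n + 1):
        # Walk right to left so row[j - 1] still holds the previous row's value
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]

print(binomial_one_row(4, 2))
print(binomial_one_row(9, 8))
```

This prints 6 and 9. The second call only ever stores two values, because C(9,8) is computed as C(9,1).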


# longest common subsequence

given two strings, text1 and text2, return the length of their longest common subsequence

example: abcde, ace -> 3

explanation: the longest common subsequence is ace of length 3

Brute force: generate all subsequences of both strings, iterate through them comparing character by character, and finding the longest common subsequence -> terribly inefficient

Notice that:

OBSERVATION 1: if we have a common character at the current position, such as in:

abcde, ace

the solution will be 1 + lcs(bcde,ce) [remove the common character from both texts, add 1 and recurse. the reason we add 1 is because each common character contributes 1 length to our final output]

OBSERVATION 2: if there is no common character at the current position such as in:

bcde, ce

the solution will be max(lcs(cde,ce), lcs(bcde,e)) [remove the first character from either the first or second text, recurse on both cases]

we can use these two cases to do a recursive solution, but let's try to implement this idea directly into tabulation
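
for completeness, the recursive solution built from the two observations could look like the sketch below. the function name lcs_recursive and the helper solve are illustrative; functools.lru_cache is used so that repeated (i, j) pairs are not recomputed, and indices are used instead of slicing the strings.

```python
from functools import lru_cache

def lcs_recursive(text1, text2):
    @lru_cache(maxsize=None)
    def solve(i, j):
        # solve(i, j) is the LCS length of text1[i:] and text2[j:]
        if i == len(text1) or j == len(text2):
            return 0
        if text1[i] == text2[j]:
            # OBSERVATION 1: common character, drop it from both and add 1
            return 1 + solve(i + 1, j + 1)
        # OBSERVATION 2: drop the first character of either text
        return max(solve(i + 1, j), solve(i, j + 1))

    return solve(0, 0)

print(lcs_recursive("abcde", "ace"))
```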

We create a table dp with one row per character of text1 and one column per character of text2, plus one extra row and column for the empty string.

Each row is labelled with a character of text1 and each column with a character of text2

|   | a | c | e | '' |
|---|---|---|---|---|
| a |   |   |   | 0 |
| b |   |   |   | 0 |
| c |   |   |   | 0 |
| d |   |   |   | 0 |
| e |   |   |   | 0 |
| '' | 0 | 0 | 0 | 0 |

in this table, for example,

dp[0][0] will represent lcs(abcde,ace) -> the solution

dp[3][1] will represent lcs(de,ce) etc.

i.e. the position in the table represents the suffixes of text1 and text2 starting at those indices

we fill the table starting from the bottom right as follows:

if the row and col have the same label, we have found a common character

as per OBSERVATION 1: we fill the space with 1 + dp[i+1][j+1]

if the row and col do not have the same label, we must use OBSERVATION 2:

fill the space with max(dp[i][j+1], dp[i+1][j])

if we go out of bounds, we simply take zero. (for ease of code, the grid is i+1 x j+1 and initialized to all zeros, so we don't have to check for going out of bounds)

our solution will be at dp[0][0]

|   | a | c | e | '' |
|---|---|---|---|---|
| a | 3 | 2 | 1 | 0 |
| b | 2 | 2 | 1 | 0 |
| c | 2 | 2 | 1 | 0 |
| d | 1 | 1 | 1 | 0 |
| e | 1 | 1 | 1 | 0 |
| '' | 0 | 0 | 0 | 0 |



the time complexity of this is O(i*j)

the space complexity of this is O(i*j)

the space complexity can be improved to O(2j) -> O(j) using the same two row optimization as above
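
a sketch of that two row variant, assuming the same bottom-up fill order as the full table. the names lcs_two_rows, below and current are hypothetical and chosen for illustration.

```python
def lcs_two_rows(text1, text2):
    # below holds row i + 1 of the full table; current is row i being filled
    below = [0] * (len(text2) + 1)
    for i in range(len(text1) - 1, -1, -1):
        current = [0] * (len(text2) + 1)
        for j in range(len(text2) - 1, -1, -1):
            if text1[i] == text2[j]:
                current[j] = 1 + below[j + 1]
            else:
                current[j] = max(current[j + 1], below[j])
        below = current
    return below[0]

print(lcs_two_rows("abcde", "ace"))
```

only two lists of length len(text2)+1 are alive at any time, so the extra space is O(j) while the time stays O(i*j).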



## longest common subsequence tabulation

In [3]:
def lcs(text1, text2, printTable=True):
    dp=[[0 for j in range(len(text2)+1)] for i in range(len(text1)+1)]

    for i in range(len(text1)-1,-1,-1):
        for j in range(len(text2)-1,-1,-1):
            if text1[i] == text2[j]:
                dp[i][j] = 1 + dp[i+1][j+1]
            else:
                dp[i][j] = max(dp[i][j+1], dp[i+1][j])
    if printTable:
        for row in dp:
            print(row)
    return dp[0][0]

print(lcs("abcde","ace"))

[3, 2, 1, 0]
[2, 2, 1, 0]
[2, 2, 1, 0]
[1, 1, 1, 0]
[1, 1, 1, 0]
[0, 0, 0, 0]
3


# longest palindromic subsequence

example: babbb -> 4

explanation: bbbb is the longest palindromic subsequence and has length 4

This question is simply a special case of longest common subsequence.

The longest palindromic subsequence of a string s is simply the longest common subsequence of s and reverse(s)

In [4]:
def lps(s):
    return lcs(s,s[::-1])

print(lps("babbb"))

[4, 3, 3, 2, 1, 0]
[3, 3, 2, 2, 1, 0]
[3, 3, 2, 1, 1, 0]
[2, 2, 2, 1, 1, 0]
[1, 1, 1, 1, 1, 0]
[0, 0, 0, 0, 0, 0]
4


# Longest Contiguous Palindromic Substring

example: aaaabbaa -> aabbaa

brute force: find all substrings, filter out non palindromes, get max length O(n^3)
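
A sketch of the brute force approach, useful as a reference to check the table-based version against. The function name longest_palindromic_substring_bf is illustrative. When several palindromes share the maximum length, this sketch returns the first one found.

```python
def longest_palindromic_substring_bf(s):
    best = ""
    for i in range(len(s)):
        for j in range(i, len(s)):
            candidate = s[i:j + 1]
            # A palindrome reads the same forwards and backwards
            if len(candidate) > len(best) and candidate == candidate[::-1]:
                best = candidate
    return best

print(longest_palindromic_substring_bf("aaaabbaa"))
```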

We can create a table dp where dp[i][j] is true if s[i:j+1] (the substring from index i to index j inclusive) is a palindrome

We can initialize all fields where i==j to True, as single letters are palindromes

We can initialize all fields where j = i+1 and s[i] = s[j] to true as all pairs of the same letter are palindromes

Now, notice that if p is a palindrome, and x is a character, xpx is guaranteed to be a palindrome.

Using this, for substrings of length 3, we check if the start and end are the same letter, and the middle is a palindrome (we know from the existing entries in the table)

Do this for all lengths from 3 up to len(s)

From the table we can deduce all palindromic substrings, or the longest palindromic substring.

In [None]:
def longest_palindromic_substring(s, printTable=True):
    n = len(s)
    # single character strings are palindromes, so we can return
    if n <=1: return s

    #create the nxn table, initialize all to False
    dp = [[False] * n for _ in range(n)]

    #initialize all single character substrings to True
    #single character substrings start and end at the same index
    for i in range(n):
        dp[i][i] = True

    #since we are looking for the longest palindromic substring,
    #we can track the starting position and max_length for convenience
    start, max_length = 0,1

    #initialize all pairs of identical characters to True
    for i in range(n-1):
        if s[i] == s[i+1]:
            dp[i][i+1] = True
            start, max_length = i,2

    #for all substrings with length >=3, set to True if the start and end characters match
    #and the middle is a palindrome
            
    for length in range(3, n+1):
        for i in range(n - length + 1):
            j = i + length - 1
            if dp[i+1][j-1] and s[i] == s[j]:
                dp[i][j] = True
                start, max_length = i, length

    if printTable:
        for row in dp:
            print(row)

    return s[start:start+max_length]

print(longest_palindromic_substring("aaaabbaa"))

[True, True, True, True, False, False, False, False]
[False, True, True, True, False, False, False, False]
[False, False, True, True, False, False, False, True]
[False, False, False, True, False, False, True, False]
[False, False, False, False, True, True, False, False]
[False, False, False, False, False, True, False, False]
[False, False, False, False, False, False, True, True]
[False, False, False, False, False, False, False, True]
aabbaa


# the Needleman-Wunsch algorithm

used for global alignment of DNA and protein sequences

eg:

seq1 = ATGCT

seq2 = AGCT

globally aligned sequences:

A T G C T
A - G C T

the trick is to find an efficient way to place gaps in the sequences to maximise the amount of matching letters.

We can do this with tabulation:


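Create a matrix of size (len(seq2) + 1) x (len(seq1) + 1), with one row per character of seq2 and one column per character of seq1, plus a leading row and column for the empty prefix ''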

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' |   |   |   |   |   |   |
| A |   |   |   |   |   |   |
| G |   |   |   |   |   |   |
| C |   |   |   |   |   |   |
| T |   |   |   |   |   |   |


Now use the following scheme to fill in the table:

Match: 1

Mismatch: -1

GAP: -2

### INITIALIZATION:

starting at 0 at (0,0), fill the '' row and column with progressive GAP penalties as follows

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' | 0 | -2 | -4 | -6 | -8 | -10 |
| A | -2 |   |   |   |   |   |
| G | -4 |   |   |   |   |   |
| C | -6 |   |   |   |   |   |
| T | -8 |   |   |   |   |   |

### FILLING THE MATRIX:

Now, starting at (1,1) and going row-wise, fill each cell with the max of the following:

Value from left + GAP

Value from above + GAP

Value from diagonal + (Match or Mismatch) [depending on whether the row and column labels are the same letter]

So, for the (1,1) cell, we fill it with max(-4,-4,1) = 1

and for the (1,2) cell, we fill it with max(-1,-6,-3) = -1

and so on...

until we get:

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' | 0 | -2 | -4 | -6 | -8 | -10 |
| A | -2 | 1 | -1 | -3 | -5 | -7 |
| G | -4 | -1 | 0 | 0 | -2 | -4 |
| C | -6 | -3 | -2 | -1 | 1 | -1 |
| T | -8 | -5 | -2 | -3 | -1 | 2 |
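
The fill rule can be checked one cell at a time. The following is a minimal sketch, assuming the same scoring scheme as above; the helper name `nw_cell` is hypothetical and only illustrates how a single cell is derived from its three neighbours.

```python
def nw_cell(score, i, j, row_char, col_char, match=1, mismatch=-1, gap=-2):
    """Return the Needleman-Wunsch value for cell (i, j) from its three neighbours."""
    diagonal = score[i - 1][j - 1] + (match if row_char == col_char else mismatch)
    left = score[i][j - 1] + gap
    above = score[i - 1][j] + gap
    return max(left, above, diagonal)

seq1, seq2 = "ATGCT", "AGCT"  # seq1 labels the columns, seq2 labels the rows
gap = -2

# Initialized matrix: the first row and column hold progressive gap penalties
score = [[0] * (len(seq1) + 1) for _ in range(len(seq2) + 1)]
for i in range(len(seq2) + 1):
    score[i][0] = i * gap
for j in range(len(seq1) + 1):
    score[0][j] = j * gap

# Fill the first two cells of row 1, in row-wise order
score[1][1] = nw_cell(score, 1, 1, seq2[0], seq1[0])
score[1][2] = nw_cell(score, 1, 2, seq2[0], seq1[1])
print(score[1][1], score[1][2])  # 1 -1
```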


### TRACEBACK:

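Starting from the bottom-right cell (which holds the score of the best global alignment, although it is not necessarily the highest value in the matrix), repeat the following until (0,0) is reached:

Move to whichever neighbouring cell the current value was computed from. That is, go diagonally if the current value equals the diagonal value + (Match or Mismatch), go up if it equals the value above + GAP, and go left if it equals the value to the left + GAP. If more than one move qualifies, any of them leads to an optimal alignment.

Each time we go diagonally, we align the row and column labels.

Each time we go left, we put a gap in seq2 at that index.

Each time we go up, we put a gap in seq1 at that index.

Note that the aligned sequences are built from right to left.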



In [None]:
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-2, printTable=False):
    # Initialize the scoring matrix
    m, n = len(seq2), len(seq1)
    score = [[0] * (n + 1) for _ in range(m + 1)]

    # Initialize the first row and column
    for i in range(m + 1):
        score[i][0] = i * gap
    for j in range(n + 1):
        score[0][j] = j * gap

    # Fill in the scoring matrix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match_mismatch = match if seq2[i - 1] == seq1[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + match_mismatch
            horizontal = score[i][j - 1] + gap
            vertical = score[i - 1][j] + gap
            score[i][j] = max(diagonal, horizontal, vertical)

    if printTable:
        for row in score:
            print(row)

    # Traceback to find the alignment
    align2, align1 = "", ""
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (match if seq2[i - 1] == seq1[j - 1] else mismatch):
            align2 = seq2[i - 1] + align2
            align1 = seq1[j - 1] + align1
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            align2 = seq2[i - 1] + align2
            align1 = "-" + align1
            i -= 1
        else:
            align2 = "-" + align2
            align1 = seq1[j - 1] + align1
            j -= 1

    return align1, align2

# Example usage:
sequence1 = "ATGCT"
sequence2 = "AGCT"
alignment1, alignment2 = needleman_wunsch(sequence1, sequence2, printTable=True)
print("Alignment 1:", alignment1)
print("Alignment 2:", alignment2)

[0, -2, -4, -6, -8, -10]
[-2, 1, -1, -3, -5, -7]
[-4, -1, 0, 0, -2, -4]
[-6, -3, -2, -1, 1, -1]
[-8, -5, -2, -3, -1, 2]
Alignment 1: ATGCT
Alignment 2: A-GCT
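
One way to double-check an alignment is to re-score it column by column and compare the total with the bottom-right cell of the matrix. The sketch below assumes the same scoring scheme (match 1, mismatch -1, gap -2); `alignment_score` is a hypothetical helper, not part of the notebook.

```python
def alignment_score(aligned1, aligned2, match=1, mismatch=-1, gap=-2):
    """Sum the per-column scores of two already-aligned sequences."""
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(aligned1, aligned2):
        if a == "-" or b == "-":
            total += gap        # a gap in either sequence
        elif a == b:
            total += match      # identical letters
        else:
            total += mismatch   # different letters
    return total

print(alignment_score("ATGCT", "A-GCT"))  # 2
```

The result equals the bottom-right value of the score matrix printed above.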


# Smith-Waterman algorithm

Used for local alignment of DNA and protein sequences.

eg:

seq1 = ATGCT

seq2 = AGCT

locally aligned sequences:

G C T
G C T

The trick is to find an efficient way to displace one of the sequences such that the number of matching letters is maximised. In this case it is 3.

This is very similar to Needleman-Wunsch. The algorithm is the same as above, except that all negative values become zero.

We can again do this with tabulation:

Create a matrix of size (len(seq2) + 1) x (len(seq1) + 1), with seq1 labelling the columns and seq2 labelling the rows:

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' |   |   |   |   |   |   |
| A |   |   |   |   |   |   |
| G |   |   |   |   |   |   |
| C |   |   |   |   |   |   |
| T |   |   |   |   |   |   |


Now use the following scheme to fill in the table:

Match: 1

Mismatch: -1

GAP: -2

### INITIALIZATION:

Starting with 0 at (0,0), fill the '' row and column with progressive GAP penalties as follows [all negative values become 0, so the whole first row and column are 0]

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' | 0 | 0 | 0 | 0 | 0 | 0 |
| A | 0 |   |   |   |   |   |
| G | 0 |   |   |   |   |   |
| C | 0 |   |   |   |   |   |
| T | 0 |   |   |   |   |   |

### FILLING THE MATRIX:

Now, starting at (1,1) and going row-wise, fill each cell with the max of the following:

Value from left + GAP

Value from above + GAP

Value from diagonal + (Match or Mismatch) [depending on whether the row and column labels are the same letter]

REMEMBERING THAT ALL NEGATIVE VALUES BECOME 0

So, for the (1,1) cell, we fill it with max(-2,-2,1) = 1

and for the (1,2) cell, we fill it with max(-1,-2,-1) = -1 -> 0

and so on...

until we get:

|   | '' | A | T | G | C | T |
|---|---|---|---|---|---|---|
| '' | 0 | 0 | 0 | 0 | 0 | 0 |
| A | 0 | 1 | 0 | 0 | 0 | 0 |
| G | 0 | 0 | 0 | 1 | 0 | 0 |
| C | 0 | 0 | 0 | 0 | 2 | 0 |
| T | 0 | 0 | 1 | 0 | 0 | 3 |


### TRACEBACK:

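Starting from the highest value in the matrix, trace backwards using the same moves as in Needleman-Wunsch until a 0 is reached. In this example every move is diagonal.

For each diagonal movement, align the corresponding letters. For a move left or up, place a gap in seq2 or seq1 respectively.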
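
The clamped fill and the search for the traceback starting point can be sketched on their own, without the traceback. This is a minimal illustration under the scoring scheme above; the name `smith_waterman_best` is hypothetical.

```python
def smith_waterman_best(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Fill the local-alignment matrix and return (best score, its position)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    score = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq2[i - 1] == seq1[j - 1] else mismatch
            # Same three candidates as Needleman-Wunsch, clamped at 0
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,
                score[i][j - 1] + gap,
                score[i - 1][j] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    return best, best_pos

print(smith_waterman_best("ATGCT", "AGCT"))  # (3, (4, 5))
```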


In [None]:
def smith_waterman(seq1, seq2, match=1, mismatch=-1, gap=-2, printTable=False):
    # Initialize the scoring matrix
    m, n = len(seq2), len(seq1)
    score = [[0] * (n + 1) for _ in range(m + 1)]

    # Initialize the first row and column
    for i in range(m + 1):
        score[i][0] = 0
    for j in range(n + 1):
        score[0][j] = 0

    # Fill in the scoring matrix
    max_score = 0
    max_position = (0, 0)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match_mismatch = match if seq2[i - 1] == seq1[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + match_mismatch
            horizontal = score[i][j - 1] + gap
            vertical = score[i - 1][j] + gap

            score[i][j] = max(0, diagonal, horizontal, vertical)

            if score[i][j] > max_score:
                max_score = score[i][j]
                max_position = (i, j)

    if printTable:
        for row in score:
            print(row)

    # Traceback to find the alignment
    align1, align2 = "", ""
    i, j = max_position

    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + (match if seq2[i - 1] == seq1[j - 1] else mismatch):
            align2 = seq2[i - 1] + align2
            align1 = seq1[j - 1] + align1
            i -= 1
            j -= 1
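        elif score[i][j] == score[i][j - 1] + gap:
            # Horizontal move: consume a letter of seq1, gap in seq2
            align2 = "-" + align2
            align1 = seq1[j - 1] + align1
            j -= 1
        else:
            # Vertical move: consume a letter of seq2, gap in seq1
            align2 = seq2[i - 1] + align2
            align1 = "-" + align1
            i -= 1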

    return align1, align2

# Example usage:
sequence1 = "ATGCT"
sequence2 = "AGCT"
alignment1, alignment2 = smith_waterman(sequence1, sequence2, printTable=True)
print("Alignment 1:", alignment1)
print("Alignment 2:", alignment2)

[0, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 2, 0]
[0, 0, 1, 0, 0, 3]
Alignment 1: GCT
Alignment 2: GCT


# Dynamic programming on multi-dimensional arrays



# Unique Paths

A robot R is located at the top-left corner of an m x n grid.

The robot can only move either down or right at any point in time. The robot is trying to reach the bottom right corner of the grid F.

How many possible unique paths are there?

Input: m = 3, n = 7
Output: 28

Example path:

| R | > | v |   |   |   |   |
|---|---|---|---|---|---|---|
|   |   | > | > | > | > | v |
|   |   |   |   |   |   | F |

Notice that at every move, since we can only move right or down, the remaining grid shrinks, which gives us a smaller subproblem of the same kind.

Also notice that wherever the robot lands on the grid, it is possible to reach the finish square, so we do not have to worry about cases where the robot cannot reach the finish square (backtracking).

Notice also that the number of paths from any square is the sum of the number of paths from the square to its right and the number of paths from the square below it.
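
This recurrence can be written top-down before turning it into a table. The following is a minimal sketch using memoization via `functools.lru_cache`; the function name `unique_paths_recursive` is hypothetical, and the base case assumes that a square on the last row or last column has exactly one path to F.

```python
from functools import lru_cache

def unique_paths_recursive(m, n):
    """Count right/down paths in an m x n grid with a memoized recursion."""
    @lru_cache(maxsize=None)
    def paths(i, j):
        # On the last row or last column there is only one way to reach F
        if i == m - 1 or j == n - 1:
            return 1
        # Otherwise: paths after going down + paths after going right
        return paths(i + 1, j) + paths(i, j + 1)

    return paths(0, 0)

print(unique_paths_recursive(3, 7))  # 28
```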

We can use tabulation.

We will make an m x n grid where dp[i][j] is the number of paths from that square to F.

We can then initialize F to 1, as there is one path from F to itself.

The squares directly above and to the left of F can also be initialized to 1, as there is one path from each of them to F, going down or right respectively. The same logic applies to the whole bottom row and the whole last column of the grid.

|   |   |   |   |   |   | 1 |
|---|---|---|---|---|---|---|
|   |   |   |   |   |   | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |

Now we simply fill the grid from the bottom right to the top left, where the value of each square is the sum of the value below and to the right.

| 28 | 21 | 15 | 10 | 6 | 3 | 1 |
|---|---|---|---|---|---|---|
| 7 | 6 | 5 | 4 | 3 | 2 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |

The top left square is where the robot was, so that is the number of paths from the robot to the finish square.



In [2]:
def unique_paths(m,n, printTable=False):
    
    # Initialize a m x n table of 1s
    dp = [[1] * n for _ in range(m)]

    # Fill in the table in a bottom-up manner
    for i in range(m-1,-1,-1):
        for j in range(n-1,-1,-1):
            # Leave last row and column as 1s
            if i == m-1 or j == n-1:
                continue
            # Fill all other squares with the sum of the square below and to the right
            else:
                dp[i][j] = dp[i+1][j] + dp[i][j+1]

    if printTable==True:
        for row in dp:
            print(row)
    return dp[0][0]

print(unique_paths(3,7, printTable=True))

[28, 21, 15, 10, 6, 3, 1]
[7, 6, 5, 4, 3, 2, 1]
[1, 1, 1, 1, 1, 1, 1]
28
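
The table result can be cross-checked against a closed form: every path consists of (m-1) down moves and (n-1) right moves in some order, so the count is a binomial coefficient. This sketch assumes Python 3.8+ for `math.comb`; the function name is hypothetical.

```python
from math import comb

def unique_paths_closed_form(m, n):
    """Choose which of the (m + n - 2) moves are the (m - 1) down moves."""
    return comb(m + n - 2, m - 1)

print(unique_paths_closed_form(3, 7))  # 28
```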


### Optimization

We can reduce the space complexity from O(mn) to O(n), as we only need two rows in memory at the same time: the row being filled and the row below it.

In [13]:
def unique_paths_optimized(m,n, printTable=False):
    # Initialize the bottom row of 1s
    oldRow = [1] * n

    # For each subsequent row in the grid
    for _ in range(m-1):
        # Create a new row which is initialized to all 1s
        newRow = [1] * n
        # Traverse the new row in reverse, without the last element
        for j in range(n - 2, -1, -1):
            # Each value in the new row is the sum of the value to the right and below
            newRow[j] = newRow[j+1] + oldRow[j]
        oldRow = newRow

    # Optionally print the final (top) row
    if printTable:
        print(oldRow)
    return oldRow[0]

print(unique_paths_optimized(3,7, printTable=True))

[28, 21, 15, 10, 6, 3, 1]
28
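
The same idea can be pushed one step further by updating a single row in place. This is a sketch of a possible variant, not the notebook's own solution, and the name `unique_paths_one_row` is hypothetical. Before `row[j]` is updated it still holds the value from the row below, while `row[j + 1]` already holds the value to the right.

```python
def unique_paths_one_row(m, n):
    """Count right/down paths in an m x n grid using a single row of storage."""
    row = [1] * n  # bottom row: one path from every square
    for _ in range(m - 1):
        # Traverse right to left, skipping the last column (always 1)
        for j in range(n - 2, -1, -1):
            # old row[j] is the value below, row[j + 1] is the value to the right
            row[j] += row[j + 1]
    return row[0]

print(unique_paths_one_row(3, 7))  # 28
```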


Using the constraint that we can only move down and right in the matrix (so every move makes progress towards the goal), we can construct other, more complex problems such as:

# Minimum path sum

Given a grid (matrix) filled with non-negative numbers, find a path from the top-left corner to the bottom-right corner that minimizes the sum of numbers along the path. You can only move down or to the right at any point.

Input: matrix = [

    [1, 3, 1],

    [1, 5, 1],

    [4, 2, 1]

]

Output: 7

Explanation: 1 + 3 + 1 + 1 + 1

| R | > | v |
|---|---|---|
|   |   | v |
|   |   | F |

We use tabulation.

Make a table the same dimensions as the matrix.


Start at the bottom right, initializing the cost of that square to its own value.

| 0 | 0 | 0 |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 0 | 1 |

Now we can iterate through the matrix backwards using the same logic as in unique paths, but this time:

If we are on the last column: dp[i][j] = dp[i+1][j] + matrix[i][j]

[Because there are no more squares to the right, we must go down and incur that cost]

If we are on the last row: dp[i][j] = dp[i][j+1] + matrix[i][j]

[Because there are no more squares below, we must go right and incur that cost]

Otherwise, dp[i][j] = matrix[i][j] + the minimum cost of going down, or going right.

The top left of the grid will give us the minimum cost of reaching the bottom right position.

| 7 | 6 | 3 |
|---|---|---|
| 8 | 7 | 2 |
| 7 | 3 | 1 |



In [3]:
def min_path_sum(matrix, printTable=False):
    rows, cols = len(matrix), len(matrix[0])
    
    # Initialize a table to store minimum path sums
    dp = [[0] * cols for _ in range(rows)]

    # Fill in the table in a bottom-up manner
    for i in range(rows-1,-1,-1):
        for j in range(cols-1,-1,-1):
            # Initialize the bottom right square to its own cost
            if i == rows-1 and j == cols-1:
                dp[i][j] = matrix[i][j]

            # When we are on the last row or col
            elif i == rows-1:
                dp[i][j] = dp[i][j+1] + matrix[i][j]
            elif j == cols-1:
                dp[i][j] = dp[i+1][j] + matrix[i][j]

            # Otherwise, choose the minimum cost path and add its cost
            # to the cost of this square.
            else:
                dp[i][j] = matrix[i][j] + min(dp[i+1][j], dp[i][j+1])

    if printTable==True:
        for row in dp:
            print(row)
    return dp[0][0]

print(min_path_sum([
    [1, 3, 1],
    [1, 5, 1],
    [4, 2, 1]], printTable=True))

[7, 6, 3]
[8, 7, 2]
[7, 3, 1]
7
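
The table gives the minimum cost, but the path itself can also be recovered by walking from the top left and always stepping to the cheaper neighbour. The sketch below is one possible way to do this; `min_path` is a hypothetical helper, and it pads the table with an extra row and column of infinity (an implementation choice made here) so that edge squares need no special cases.

```python
def min_path(matrix):
    """Return the squares (row, col) on one minimum-cost right/down path."""
    rows, cols = len(matrix), len(matrix[0])
    INF = float("inf")
    # Extra row and column of infinity stand in for "no square there"
    dp = [[INF] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if i == rows - 1 and j == cols - 1:
                dp[i][j] = matrix[i][j]
            else:
                dp[i][j] = matrix[i][j] + min(dp[i + 1][j], dp[i][j + 1])

    # Walk from the start, always moving to the neighbour with the lower cost
    path = [(0, 0)]
    i = j = 0
    while (i, j) != (rows - 1, cols - 1):
        if dp[i + 1][j] < dp[i][j + 1]:
            i += 1   # going down is strictly cheaper
        else:
            j += 1   # going right is cheaper (or ties)
        path.append((i, j))
    return path

grid = [[1, 3, 1],
        [1, 5, 1],
        [4, 2, 1]]
path = min_path(grid)
print(path)                               # [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
print(sum(grid[r][c] for r, c in path))   # 7
```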


### Optimization

We can use the exact same optimization as we did with unique_paths.

Since we only need the current row and the row below to determine any value in the grid, we should only store two rows in memory at a time.

In [16]:
def min_path_sum_optimized(matrix):
    rows, cols = len(matrix), len(matrix[0])
    
    # Initialize the bottom row of 0s
    oldRow = [0] * cols

    # For each subsequent row in the grid
    for i in range(rows-1, -1, -1):
        # Create a new row which is initialized to all 0s
        newRow = [0] * cols
        # Traverse the new row in reverse, same logic as before.
        for j in range(cols - 1, -1, -1):
            if i == rows-1 and j == cols-1:
                newRow[j] = matrix[i][j]
            elif i == rows-1:
                newRow[j] = newRow[j+1] + matrix[i][j]
            elif j == cols-1:
                newRow[j] = oldRow[j] + matrix[i][j]
            else:
                newRow[j] = matrix[i][j] + min(oldRow[j], newRow[j+1])
        oldRow = newRow

    return oldRow[0]

print(min_path_sum_optimized([
    [1, 3, 1],
    [1, 5, 1],
    [4, 2, 1]]))

7



This two-row framework can be used for any grid pathing problem of this kind, where negative progress is not possible and each square depends only on the square below it and the square to its right.

## Extending this to 3 dimensions

What if the grid we can move through is not a 2d array but a 3d array (and the constraint that negative progress is not possible still holds)?

Consider an m x n x k array, where we start at the top left at depth 0 (0,0,0) (call this space R), and our target is in the bottom right at depth k-1 (m-1, n-1, k-1)(call this space F).

We can only move down at the current depth, right at the current depth, or "in" (meaning we increase the depth we are at).

The problem, like before, is to find the number of distinct paths from R to F.

We can use the same approach as with the 2d problem, but this time we will need a 3d array to store the intermediate values.



In [75]:
def unique_paths_3d(M, N, K):
    dp = [[[0 for _ in range(K)] for _ in range(N)] for _ in range(M)]

    # Initializing dp
    dp[M - 1][N - 1][K - 1] = 1

    # Filling the DP table in a bottom-up manner
    for i in range(M - 1, -1, -1):
        for j in range(N - 1, -1, -1):
            for k in range(K - 1, -1, -1):
                if i + 1 < M:
                    dp[i][j][k] += dp[i + 1][j][k]
                if j + 1 < N:
                    dp[i][j][k] += dp[i][j + 1][k]
                if k + 1 < K:
                    dp[i][j][k] += dp[i][j][k + 1]

    return dp[0][0][0]

print(unique_paths_3d(2,7,5))

2310
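
As a cross-check, the 3d count also has a closed form: every path is an ordering of (M-1) down moves, (N-1) right moves and (K-1) "in" moves, which is a multinomial coefficient. The sketch below uses only `math.factorial`; the function name is hypothetical.

```python
from math import factorial

def unique_paths_3d_closed_form(M, N, K):
    """Number of orderings of (M-1) down, (N-1) right and (K-1) 'in' moves."""
    a, b, c = M - 1, N - 1, K - 1
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

print(unique_paths_3d_closed_form(2, 7, 5))  # 2310
```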


## Extending this to N dimensions:

The same idea generalizes to a grid with any number of dimensions. In the code below, dp[index] counts the paths from the start cell to index (rather than from index to the finish), so the array is filled forwards from the start cell.

np.ndindex(shape) generates an iterator over all possible indices in the multi-dimensional array.

For each index, represented by current_cell:

Iterate over each dimension using for i in range(num_dimensions).

Check if the current cell's index in the current dimension (current_cell[i]) is greater than 0. If true, there is a valid cell to move from in that dimension.

Create a copy of the current cell (prev_cell) and decrement the index in the current dimension (prev_cell[i] -= 1). This represents the cell from which we are coming.

Add the number of paths to the previous cell to the count for the current cell in the dynamic programming array (dp[index] += dp[tuple(prev_cell)]).

#### Why we are using NumPy

-> We can access elements of the array using tuples as indices.

-> We can easily generate large multi-dimensional arrays of zeros. Instead of a recursive helper such as:

def initialize_array(dimensions):
    if len(dimensions) == 1:
        return [0] * dimensions[0]
    else:
        return [initialize_array(dimensions[1:]) for _ in range(dimensions[0])]

we can use:

np.zeros

-> np.ndindex is a NumPy function that provides an iterator yielding tuples of indices for a given shape, which makes it useful for iterating over the indices of multi-dimensional arrays.

In [78]:
import numpy as np

def numberOfWays(dimensions):
    # Determine the number of dimensions
    num_dimensions = len(dimensions)

    # Initialize a multi-dimensional array to store the number of paths for each cell
    shape = tuple(dimensions)
    dp = np.zeros(shape, dtype=int)

    # Set the number of paths for the starting cell to 1
    dp[(0,) * num_dimensions] = 1

    # Calculate the number of paths for each cell in the matrix
    for index in np.ndindex(shape):
        # Get the current index as an array
        current_cell = np.array(index)
        # Iterate over the array
        for i in range(num_dimensions):
            # If the current cell is not on the edge
            if current_cell[i] > 0:
                # Add the previous cell's paths to the current cell [this happens from each valid direction in each dimension]
                prev_cell = current_cell.copy()
                prev_cell[i] -= 1
                dp[index] += dp[tuple(prev_cell)]

    # The result is stored in the last cell of the matrix
    return dp[tuple(np.array(dimensions) - 1)]

# Example usage:
dimensions = (2, 7, 5)
result = numberOfWays(dimensions)
print("Number of ways:", result)

Number of ways: 2310
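
For comparison, the same forward fill can be sketched without NumPy, using `itertools.product` to generate the index tuples and a dictionary keyed by those tuples in place of the array. This is an illustrative alternative, not the notebook's approach; the name `number_of_ways_pure` is hypothetical, and it assumes every dimension is at least 1.

```python
from itertools import product

def number_of_ways_pure(dimensions):
    """Count monotone paths through an N-dimensional grid using only the stdlib."""
    start = (0,) * len(dimensions)
    dp = {start: 1}  # one path from the start cell to itself
    # product(...) yields index tuples with the last axis varying fastest,
    # so every predecessor of a cell is visited before the cell itself
    for index in product(*(range(d) for d in dimensions)):
        if index == start:
            continue
        total = 0
        for axis, value in enumerate(index):
            if value > 0:
                # Predecessor: the same cell, one step back along this axis
                prev = index[:axis] + (value - 1,) + index[axis + 1:]
                total += dp[prev]
        dp[index] = total
    return dp[tuple(d - 1 for d in dimensions)]

print(number_of_ways_pure((2, 7, 5)))  # 2310
```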
