# Final exam programming problem

In this problem, you will work on the following:

- For a graph G, you will compute its MST
- We will add an edge into the MST and you should be able to quickly update the MST

To solve this problem, you may consult three lab sessions:

- Merge sort (We will sort the edges using merge sort)
- MST (Kruskal's algorithm)
- DFS (To find a cycle on a graph)

In [1]:
import random

# Programming Problem 1

## Part 1

You must work with the provided data structures.
You may use your own data structures if they are not defined here.

In [2]:
# Define graph edge
class Edge:
    def __init__(self, node1, node2, weight=1.0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight
        
    def __lt__(self, other):
        selfPriority = self.weight
        otherPriority = other.weight
        return selfPriority < otherPriority

In [3]:
# Define an undirected graph
class UndirectedGraph:
    def __init__(self, n):
        self.num_nodes = n
        self.nodes = [set() for i in range(n)]
        self.edges = []
    
    # edge node1 <--> node2 (undirected)
    def insert(self, node1, node2, weight=1.0):
        self.nodes[node1].add(node2)
        self.nodes[node2].add(node1)
        self.edges.append(Edge(node1, node2, weight))
        self.edges.append(Edge(node2, node1, weight))

### Graph representation

- Adjacency list: self.nodes in class UndirectedGraph (for DFS)
- Edge list: self.edges in class UndirectedGraph (for MST)

In [4]:
# A function to generate a random graph
def generate_graph_and_output_edges(num_nodes, random_seed, num_edge_per_node=3):
    random.seed(random_seed)
    graph = UndirectedGraph(num_nodes)
    edge_list = []
    for i in range(num_nodes):
        while len(graph.nodes[i]) < num_edge_per_node:
            node = random.randint(0, num_nodes-1)
            if node != i and (node not in graph.nodes[i]):
                weight = random.random()
                graph.insert(i, node, weight)
    return graph  

### A small sample graph

This is a sample graph so you can learn how a graph is represented in this problem.

In [5]:
# Generate the graph
graph_sample = generate_graph_and_output_edges(5, 100, 2)

# print nodes represented as an adjacency list
print(graph_sample.nodes)

# print edges represented in edge list
for e in graph_sample.edges:
    print("{:2d}, {:2d}, {:.5f}".format(e.node1, e.node2, e.weight))

[{1, 2, 3}, {0, 3, 4}, {0, 3, 4}, {0, 1, 2}, {1, 2}]
 0,  1, 0.45953
 1,  0, 0.45953
 0,  3, 0.73196
 3,  0, 0.73196
 1,  3, 0.50687
 3,  1, 0.50687
 2,  0, 0.53290
 0,  2, 0.53290
 2,  3, 0.26342
 3,  2, 0.26342
 4,  1, 0.33535
 1,  4, 0.33535
 4,  2, 0.83890
 2,  4, 0.83890


In [6]:
# Union Find data structure is provided to support Kruskal
class UnionFind:
    def __init__(self, num_nodes):
        # Initially position[i] = i
        self.position = [i for i in range(num_nodes)]
        
    # Return the cluster index
    def find(self, node):
        if self.position[node] == node:
            return node
        else:
            self.position[node] = self.find(self.position[node])
            return self.position[node]
    
    def union(self, node1, node2):
        a = self.find(node1)
        b = self.find(node2)
        # no need to union
        if a == b:
            return
        # union is needed
        else:
            if a < b:
                self.position[b] = a
            else:
                self.position[a] = b

## Part 2

Find a MST for a given graph.

The returned MST should be in edge list format. (Just like what we did in the lab)

When sorting the graph.edges, you must use Merge Sort. (We used PriorityQueue in the lab)

In [9]:
# Implement MergeSort

# Your merge sort should return a sorted copy of A
def merge(X, Y):
    m = len(X)
    n = len(Y)  

    # output list to store the merged (and sorted) X and Y.
    output = []

    
    i = 0 #index for X
    j = 0 #index for Y

    #length of the output list should be the sum of the length of the two lists to be merged
    while len(output) < (m+n):
        #get the smaller element of X and Y at the current i and j and add to the output.
        if X[i] < Y[j]:
            output.append(X[i])
            i += 1
            #if we have run through the entire X, append the remaining Y (the remaining will already be sorted)
            if i>=len(X):
                output.extend(Y[j:])
        else:
            output.append(Y[j])
            j += 1
            #if we have run through the entire Y, append the remaining X (the remaining will already be sorted)
            if j>=len(Y):
                output.extend(X[i:])

    return output


def MergeSort(A):
    n = len(A)

    #base case
    if n<=1:
        return A

    else: 
        mid = n//2
        left_half = A[:mid]
        right_half = A[mid:]
        left_half = MergeSort(left_half) #sorted left half
        right_half = MergeSort(right_half) #sorted right half
        return merge(left_half,right_half) #combine(merge) the two sorted halves

In [22]:
graph_sample_edge_weights  = [] 
for e in graph_sample.edges:
    graph_sample_edge_weights.append(e.weight)

sorted_edge_weights = MergeSort(graph_sample_edge_weights)

In [62]:
# Implement Kruskal

# returned MST is a list of class Edge
def kruskal(graph):
    MST = []
    uf = UnionFind(graph.num_nodes)

    graph_edge_weights  = [] 
    for e in graph.edges:
        graph_edge_weights.append(e.weight)

    sorted_edge_list = MergeSort(graph_edge_weights)

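    counter = 0 
    while counter < len(sorted_edge_list):
        edge_weight = sorted_edge_list[counter]
        # Recover the endpoints from the weight: this takes the first edge carrying
        # this weight, so it assumes each undirected edge has a distinct weight
        src = [e.node1 for e in graph.edges if e.weight == edge_weight][0]
        dst = [e.node2 for e in graph.edges if e.weight == edge_weight][0]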
        counter += 1
        if uf.find(src) != uf.find(dst):
            MST.append((Edge(src, dst, edge_weight)))
            uf.union(src,dst)

    return MST
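
Because `Edge` defines `__lt__` on the weight, merge sort can order the `Edge` objects themselves, which avoids looking endpoints up by weight and also copes with repeated weights. The block below is a minimal sketch of that variant, not the graded solution: `merge_sort`, `find`, and `kruskal_by_edges` are illustrative names, and the union-find is reduced to a plain parent list.

```python
class Edge:
    def __init__(self, node1, node2, weight=1.0):
        self.node1, self.node2, self.weight = node1, node2, weight

    def __lt__(self, other):
        return self.weight < other.weight


def merge_sort(items):
    """Return a new sorted list; works for anything that defines `<`."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:          # take from the right only if strictly smaller
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])             # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def find(parent, x):
    """Return the cluster representative of x (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def kruskal_by_edges(num_nodes, edges):
    """Return the MST (or spanning forest) as a list of Edge objects."""
    parent = list(range(num_nodes))
    mst = []
    for e in merge_sort(edges):
        a, b = find(parent, e.node1), find(parent, e.node2)
        if a != b:                      # endpoints in different clusters: keep the edge
            parent[a] = b
            mst.append(e)
    return mst
```

The sort costs $O(m \log m)$ and each edge is then examined once, instead of scanning the whole edge list for every weight.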

## Part 3

A new edge (2, 19) with weight 0.2

We need to quickly update the old MST to get a new MST

In [73]:
# Construct a graph from edge list. This is given to you.

# We assume num_nodes is given for simplicity.
def generate_graph_from_edges(edge_list, num_nodes):
    graph = UndirectedGraph(num_nodes)
    for e in edge_list:
        graph.insert(e.node1, e.node2, e.weight)
    return graph

### Algorithm

- We add the new edge to the old MST which will generate a cycle

- The cycle consists of at least three edges

- The edge with the largest weight should be removed

- You ONLY need to print out the edges on the cycle to get full marks (the two end nodes and the weight)

### Sample example

- We have a MST for a graph of 3 nodes

- MST has edge (0, 1, 0.12) and (0, 2, 0.23)

- We add a new edge (1, 2, 0.2)

- A cycle of three edges is generated

- We need to delete edge (0, 2, 0.23)

### How to use DFS to find the cycle

- Suppose the new edge is (2, 19, 0.2)

- If we DFS from node 2, we will eventually hit node 19

- You need to properly append and pop on "cycle_edges" during DFS 

- The "cycle_edges" list should contain the final path from 2 to 19
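
The append-and-pop idea can be sketched as follows, applied to the three-node sample example. This is a hedged illustration, not the provided interface: it assumes edges are plain `(u, v, weight)` tuples, and `find_path_edges` and `update_mst` are hypothetical helper names.

```python
def find_path_edges(adjacency, start, goal):
    """Return the tree path start -> goal as a list of (u, v) pairs, or None."""
    visited = [False] * len(adjacency)
    path = []

    def visit(node):
        if node == goal:
            return True
        visited[node] = True
        for neigh in adjacency[node]:
            if visited[neigh]:
                continue
            path.append((node, neigh))   # tentatively extend the path
            if visit(neigh):
                return True              # goal reached: keep the path as it is
            path.pop()                   # dead end: undo the extension
        return False

    return path if visit(start) else None


def update_mst(num_nodes, mst_edges, new_edge):
    """Add new_edge to the tree, then drop the heaviest edge on the cycle."""
    adjacency = [set() for _ in range(num_nodes)]
    weight = {}
    for u, v, w in mst_edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
        weight[frozenset((u, v))] = w
    u, v, _ = new_edge
    path = find_path_edges(adjacency, u, v)
    cycle = [(a, b, weight[frozenset((a, b))]) for a, b in path] + [new_edge]
    heaviest = max(cycle, key=lambda e: e[2])
    kept = [e for e in mst_edges + [new_edge]
            if frozenset(e[:2]) != frozenset(heaviest[:2])]
    return cycle, heaviest, kept


cycle, heaviest, new_mst = update_mst(3, [(0, 1, 0.12), (0, 2, 0.23)], (1, 2, 0.2))
print(heaviest)   # (0, 2, 0.23)
```

Returning a found/not-found flag from the recursive call is what lets the caller decide whether to keep or pop the edge it just appended.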

In [80]:
# Use DFS to detect the cycle

# You may add more arguments in dfs_recursive() for your convenience

def dfs(graph, starting_node, stopping_node):
    visited_nodes = [False for i in range(graph.num_nodes)]
    cycle_edges = []
    dfs_recursive(graph, starting_node, visited_nodes, cycle_edges, stopping_node)
    
    # The following is for demonstration purpose
    print("We found {:d} edges from node {:d} to node {:d}"
          .format(len(cycle_edges), starting_node, stopping_node))
    for e in cycle_edges:
        '''
        print the two end nodes of the edge
            the weight information is not known so you do not need to output 
        
        You may change the following print statement
            if you did not use Edge class
        '''
        print("{:2d}, {:2d}".format(e.node1, e.node2))


def dfs_recursive(graph, starting_node, visited_nodes, cycle_edges, stopping_node):

    if starting_node == stopping_node:
        return
    else:
        visited_nodes[starting_node] = True
        neighbors = graph.nodes[starting_node]
        if all([visited_nodes[neigh] for neigh in neighbors]):
            return

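        for neigh in neighbors:    
            if visited_nodes[neigh]:
                continue
            # NOTE: the edge is appended here but never popped when the recursive call
            # dead-ends, and the search does not stop once stopping_node is reached,
            # so cycle_edges ends up holding every edge the DFS explores
            cycle_edges.append((Edge(starting_node, neigh)))
            dfs_recursive(graph, neigh, visited_nodes,cycle_edges,stopping_node)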

        return

    

## Part 4 Test code

Do NOT modify the code in this part.


In [63]:
graph_mst = generate_graph_and_output_edges(20, 10, 5)
MST = kruskal(graph_mst)

for e in MST:
    print("{:2d}, {:2d}, {:.5f}".format(e.node1, e.node2, e.weight))

 3, 18, 0.00406
 5,  1, 0.03175
 0, 18, 0.03259
 0,  5, 0.03440
 1, 11, 0.04456
18,  2, 0.05057
13, 15, 0.06277
 4, 15, 0.06499
11,  7, 0.07993
19,  4, 0.10876
 9,  8, 0.14351
 5, 16, 0.15642
10,  9, 0.16494
 7, 14, 0.16636
 9, 16, 0.17846
 3,  4, 0.19495
 6,  4, 0.38442
12,  5, 0.39059
17,  0, 0.43858


In [74]:
def find_the_cycle(MST, new_edge, num_nodes):
    # construct graph from edge list
    graph_cycle = generate_graph_from_edges(MST, num_nodes)

    # call dfs to find a path from node1 to node2
    dfs(graph_cycle, new_edge.node1, new_edge.node2)

In [84]:
find_the_cycle(MST, Edge(2, 19, 0.2), 20)

We found 19 edges from node 2 to node 19
 2, 18
18,  0
 0, 17
 0,  5
 5,  1
 1, 11
11,  7
 7, 14
 5, 12
 5, 16
16,  9
 9,  8
 9, 10
18,  3
 3,  4
 4, 19
 4,  6
 4, 15
15, 13


In [85]:
# Instead we can add edge (3, 14, 0.1)
find_the_cycle(MST, Edge(3, 14, 0.1), 20)

We found 19 edges from node 3 to node 14
 3, 18
18,  0
 0, 17
 0,  5
 5,  1
 1, 11
11,  7
 7, 14
 5, 12
 5, 16
16,  9
 9,  8
 9, 10
18,  2
 3,  4
 4, 19
 4,  6
 4, 15
15, 13


### Hint

If you run the following code,

In [81]:
graph_hint = generate_graph_and_output_edges(10, 1, 2)

mst = kruskal(graph_hint)

print("The minimum spanning tree:")
for e in mst:
    print("{:2d}, {:2d}, {:.5f}".format(e.node1, e.node2, e.weight))

find_the_cycle(mst, Edge(1, 8, 0.2), 10)

The minimum spanning tree:
 6,  8, 0.00920
 5,  0, 0.02232
 0,  1, 0.25507
 3,  6, 0.43277
 3,  1, 0.48786
 9,  0, 0.52763
 0,  2, 0.56920
 4,  3, 0.59115
 2,  7, 0.65159
We found 9 edges from node 1 to node 8
 1,  0
 0,  2
 2,  7
 0,  5
 0,  9
 1,  3
 3,  4
 3,  6
 6,  8


In [83]:
# These are the results you should get:
'''
The minimum spanning tree:
 8,  6, 0.00920
 0,  5, 0.02232
 1,  0, 0.25507
 6,  3, 0.43277
 1,  3, 0.48786
 0,  9, 0.52763
 2,  0, 0.56920
 3,  4, 0.59115
 7,  2, 0.65159
We found 3 edges from node 1 to node 8
 1,  3
 3,  6
 6,  8
'''
print()




# Answers to the other questions of the Final Exam

### Q 1.1 (A)
The minimum spanning tree does not change when each edge weight is incremented by 1
### Q 1.2 (B)
Removing the root node is the only way to disconnect the tree

### Q 1.3 (C)
?

### Q 1.4 (D)
6 minutes 

### Q 1.5 (B)
{him, ham, cat, bat}

### Q 1.6 (D)
bacca






# Q2

Let the c deleted edges be $e_1, e_2, e_3,....e_c$, where   
$e_1$ is the edge between node $u_1$ and $v_1$ with cost, $cost_1$   
$e_2$ is the edge between node $u_2$ and $v_2$ with cost, $cost_2$  
.  
.  
.  
$e_c$ is the edge between node $u_c$ and $v_c$ with cost, $cost_c$  

Starting from node $s$, run Dijkstra’s algorithm on the subgraph $H$ to get the shortest-path distances from $s$ to $u_1,u_2,u_3,...u_c$ and store these values as $su_1,su_2,su_3....su_c$, respectively.   
In general, represent these as $su_i$ where $i$ ranges from $1$ to $c$.  
Based on the hint, this is an $O(m\log n)$ operation where $m$ is the number of edges in $H$ and $n$ is the number of vertices in $H$.

Starting from node $t$, run Dijkstra’s algorithm on the subgraph $H$ to get the shortest-path distances from $t$ to $v_1,v_2,v_3,...v_c$ and store these values as $tv_1,tv_2,tv_3....tv_c$, respectively.  
In general, represent these as $tv_i$ where $i$ ranges from $1$ to $c$.  
Based on the hint, this is an $O(m\log n)$ operation where $m$ is the number of edges in $H$ and $n$ is the number of vertices in $H$.

The goal now is to add back the one edge among $e_1, e_2, ..., e_c$ that gives the shortest possible path between $s$ and $t$. This is equivalent to minimizing the total cost $su_i + cost_i + tv_i$ over all $i$ between $1$ and $c$. If the graph is undirected, the edge can also be crossed from $v_i$ to $u_i$, so $sv_i + cost_i + tu_i$ should be compared as well; the two Dijkstra runs already provide these distances.  
This is an $O(c)$ operation where $c$ is the number of elements of the list (c deleted edges), the minimum of which is to be found. 

In summary, Dijkstra’s algorithm needs to be run twice, $O(2m\log n) = O(m\log n)$, followed by the $O(c)$ scan over the deleted edges.  

So the total running time of the algorithm is $O(m\log n) + O(c)$. Assuming $c = O(m)$, the total running time of this algorithm is $O(m\log n)$
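
A minimal sketch of this procedure, assuming $H$ is undirected and stored as a dict that maps each node to a list of `(neighbor, weight)` pairs, and that the deleted edges are `(u, v, cost)` tuples. The names `dijkstra` and `best_edge_to_restore` are illustrative only.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distances from source to every node of adj."""
    dist = {node: float("inf") for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                 # stale heap entry, already improved
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def best_edge_to_restore(adj, s, t, deleted):
    """Return (path length, edge) for the deleted edge that helps s -> t most."""
    from_s = dijkstra(adj, s)           # first O(m log n) run
    from_t = dijkstra(adj, t)           # second O(m log n) run
    best = None
    for u, v, cost in deleted:          # O(c) scan
        total = min(from_s[u] + cost + from_t[v],   # cross the edge u -> v
                    from_s[v] + cost + from_t[u])   # cross the edge v -> u
        if best is None or total < best[0]:
            best = (total, (u, v, cost))
    return best


H = {0: [(1, 1)], 1: [(0, 1), (2, 1)], 2: [(1, 1), (3, 5)], 3: [(2, 5)]}
print(best_edge_to_restore(H, 0, 3, [(1, 3, 2), (0, 2, 3)]))   # (3.0, (1, 3, 2))
```

Comparing the winning total with `from_s[t]` would additionally tell whether restoring any edge improves on the path already available in $H$.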

# Q3 

It is known that if a depth first search (DFS) starting from any arbitrary node of a graph, visits every other node of the (undirected)graph, then the graph is connected. Depth first search is  $O(m+n)$ where $m$ is the number of edges and $n$ is the number of nodes in the graph.

Removing a node $v$ from a graph $G$ also removes the edges incident to $v$, some number $e \leq m$ of them. The approach taken here is to look for a `bridge` (an edge whose removal disconnects the graph) among those removed edges.

In other words, given a graph $G$, the aim is to identify the edges which are bridges in $O(m+n)$ time, as the question requires. Then we can check whether any of the removed edges is a bridge to decide if the graph remains connected or not upon the removal of a node.

A DFS rooted at a node splits the edges of an undirected graph into two kinds, since not every edge ends up in the DFS tree.
1. Tree edges (called forward edges here) - These are the edges of $G$ that the DFS follows to reach a new node, so they form the DFS tree
2. Back edges - These are the remaining edges of $G$, which are not part of the DFS tree

A back edge, put into the DFS tree, creates a cycle. Additionally, a back edge connects a node to an ancestor of the node that is not its direct parent. A back edge can never be a bridge: it lies on a cycle with tree edges, so removing it leaves its two endpoints connected through the tree.
Therefore, only forward edges can be candidates for bridges.

Pseudocode to identify bridges in $O(m+n)$, the same time complexity as DFS (based on Tarjan's algorithm).

1. Start traversing a connected graph $G$ from node $v$
2. During DFS on $G$, keep track and update the following information
    - Discovery time (`disc`): For each node, this value indicates the time (starting from 0) at which the node is visited by the DFS
    - Low value (`low`): For each node $u$, this is the smallest discovery time reachable from the subtree rooted at $u$, using tree edges plus at most one back edge. A value lower than `disc[u]` means the subtree can reach an earlier node.
3. In the DFS tree, a forward edge (u, v) (u is the parent of v in the DFS tree) is a bridge if there is no alternative way to reach u or an ancestor of u from the subtree rooted at v. low[v] indicates the earliest visited vertex reachable from the subtree rooted at v. The condition for an edge (u, v) to be a bridge is `low[v] > disc[u]`.
4. Store all such (u,v). These are the bridges.


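Given the list of all bridge edges, check if any of these bridge edges is present in the list of edges removed when node $v$ is removed from graph $G$. With the bridges stored in a hash set, this check is $O(1)$ per removed edge and at most $O(m)$ in total. If one is present, the graph is taken to become disconnected upon removal of the node. Otherwise, it is taken to remain connected. This bridge test is not exact: a node incident to no bridge can still disconnect the graph (two triangles sharing one node), and a bridge to a leaf node $v$ leaves the rest connected, so the definitive check is a DFS on $G$ with $v$ removed, which is also $O(m+n)$.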

The overall time complexity of this algorithm is $O(m+n)$

```
    // 'time' is a static counter: it is set to 0 once and keeps its value across recursive calls

    Begin bridgeDetails(start, visited, disc, low, parent)
    mark start as visited
    set disc[start] := time + 1 and low[start] := time + 1
    time := time + 1

    for all vertex v in the graph G, do
        if there is an edge between (start, v), then
            if v is not visited, then
                parent[v] := start
                bridgeDetails(v, visited, disc, low, parent)
                low[start] := minimum of low[start] and low[v]

                if low[v] > disc[start], then
                    display bridge from start to v
            else if v is not the parent of start, then
                low[start] := minimum of low[start] and disc[v]
    done
    
    End
```
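
The pseudocode can be sketched in Python as below, assuming a simple undirected graph (no parallel edges) stored as a list of neighbor lists. `find_bridges` follows the `disc`/`low` bookkeeping described above; `connected_without` is an extra, hypothetical helper that checks connectivity directly after deleting one node, included for comparison. Both run in $O(m+n)$.

```python
def find_bridges(adjacency):
    """Return the bridges of a simple undirected graph as (parent, child) pairs."""
    n = len(adjacency)
    disc = [-1] * n          # discovery time, -1 means not visited yet
    low = [0] * n            # earliest discovery time reachable from the subtree
    bridges = []
    timer = 0

    def visit(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for v in adjacency[u]:
            if disc[v] == -1:                 # tree edge
                visit(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:          # subtree of v cannot climb above v
                    bridges.append((u, v))
            elif v != parent:                 # back edge
                low[u] = min(low[u], disc[v])

    for start in range(n):
        if disc[start] == -1:
            visit(start, -1)
    return bridges


def connected_without(adjacency, removed):
    """True if the graph stays connected once node `removed` is deleted."""
    n = len(adjacency)
    remaining = [v for v in range(n) if v != removed]
    if not remaining:
        return True
    seen = {removed, remaining[0]}            # treat the removed node as blocked
    stack = [remaining[0]]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


# Triangle 0-1-2 with a pendant edge 2-3
print(find_bridges([[1, 2], [0, 2], [0, 1, 3], [2]]))   # [(2, 3)]
```

The recursive `visit` is fine for exam-sized graphs; very deep graphs would need an explicit stack or a higher recursion limit.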

# Q4

Total bits used by Huffman encoding with the given frequency = 17500 (Average 1.75 bits)  
Total bits used by encoding, where each character is encoded using 2 bits = 20000 (Average 2 bits)  

The Huffman tree generation process and the calculations are shown in the photograph.

# Q6


Pseudocode - 

Given N x N board of characters, from a randomly chosen grid point, there are 8 possible directions to traverse.  
(1)Up, (2)Down, (3)Left, (4)Right, (5)Top right, (6)Top left, (7)Bottom right, (8)Bottom left.  

```
For row in range(N) # O(N)
    For col in range(N)  # O(N)
        grid point index = (row, col) 
        For each word in the given set of N words # O(N)
            - Traverse in all 8 directions on the board of characters from the grid point, 
                for a total distance (in each direction) equalling the length of the word to be found.   
            - Ensure the program (crawler) remains within the bounds of the board. 
            - Use brute force technique to make character comparisons 
            - Worst case number of character comparisons to be made, per word, per grid point, 
                accounting for all 8 directions is 8 x len(word)
``` 

Therefore, worst case time complexity is $O(N \times N \times N \times 8 \times len(longest\ word)) \approx O(N^3)$, assuming $len(longest\ word) \ll N$ and $N \gg 8$
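
The brute-force search above can be sketched as follows, assuming the board is a list of $N$ equal-length strings and every word is non-empty. `DIRECTIONS` and `find_words` are illustrative names; the function records the first grid point and direction at which each word is found.

```python
# (row step, column step) for up, down, left, right and the four diagonals
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
              (-1, 1), (-1, -1), (1, 1), (1, -1)]


def find_words(board, words):
    """Map each word found on the board to (row, col, direction)."""
    n = len(board)
    found = {}
    for row in range(n):
        for col in range(n):
            for word in words:
                if word in found:
                    continue
                for dr, dc in DIRECTIONS:
                    # Bounds check: where would the last letter land?
                    end_r = row + dr * (len(word) - 1)
                    end_c = col + dc * (len(word) - 1)
                    if not (0 <= end_r < n and 0 <= end_c < n):
                        continue
                    # Brute-force comparison, at most len(word) characters
                    if all(board[row + dr * k][col + dc * k] == ch
                           for k, ch in enumerate(word)):
                        found[word] = (row, col, (dr, dc))
                        break
    return found


board = ["cat",
         "xox",
         "gxd"]
print(find_words(board, ["cat", "cod", "tog", "dog"]))
```

Checking the end position first keeps the crawler inside the board without testing every intermediate step.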
    

# Q7

In [311]:
def calc(n): 
    k=0 #O(1)
    i = n/2 #O(1)
    while(i<=n): #i starts at n/2 and increases by 1 each time, so this loop runs about n/2 times
        i=i+1 #O(1)
        j=2 #O(1)
        #j doubles each time, so the inner loop runs about log2(n) times.
        # Thus it has a worst case time complexity of O(logn)
        while(j<=n):
            j=j*2 #O(1)
            k = k + n/2 #O(1)
    return k

Therefore, 
* The time complexity of the above function is $O(1) + O(1) + O(n/2 \times \log n \times 1 \times 1) = O(n\log n)$
* The value of k :  
    A value of $n/2$ gets added to k (starting from 0) once per inner iteration, and there are about $(n/2)\log_2 n$ inner iterations. Therefore, $k \approx (n^2/4)\log_2 n = \Theta(n^2\log n)$  
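
These counts can be checked numerically. The sketch below instruments the same loops (`calc_with_count` and `predicted` are illustrative names) and compares them with the exact closed form for a positive integer $n$: $\lfloor n/2 \rfloor + 1$ outer iterations, each with $\lfloor \log_2 n \rfloor$ inner iterations.

```python
def calc_with_count(n):
    """Same loops as calc(n), also returning the number of inner iterations."""
    k = 0
    inner_iterations = 0
    i = n / 2
    while i <= n:
        i += 1
        j = 2
        while j <= n:
            j *= 2
            k += n / 2
            inner_iterations += 1
    return k, inner_iterations


def predicted(n):
    """Closed form for (k, inner iterations), valid for integers n >= 1."""
    outer = n // 2 + 1              # i takes the values n/2, n/2 + 1, ... up to n
    inner = n.bit_length() - 1      # floor(log2(n)) without floating point
    return outer * inner * (n / 2), outer * inner


print(calc_with_count(8))   # (60.0, 15)
print(predicted(8))         # (60.0, 15)
```

For $n = 8$ the outer loop runs 5 times and the inner loop 3 times per pass, giving $k = 15 \times 4 = 60$.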

# Q8

1. Merge sort always splits the array into two (nearly) equal halves, while quick sort doesn't necessarily split into two equal halves. Instead it splits in a ratio based on the pivot location
2. The worst-case time complexity of quick sort is $O(N^2)$ (its average case is $O(N\log N)$), whereas the time complexity of merge sort is $O(N\log N)$ in every case
3. Merge sort is a stable sorting algorithm, while the usual in-place quick sort is not stable.
4. Since merge sort is a recursive algorithm, it uses a recursion stack on top of its auxiliary storage, which can make it memory inefficient
5. Quick sort sorts in place (memory efficient), whereas merge sort requires additional memory to store the sorted elements.


Why merge sort is better for linked lists.

In linked lists, the elements to be sorted are not stored in adjacent memory locations (unlike an array).  
To access an element in the ith position in a linked list, we have to start from the head element and sequentially travel each and every node until we reach the ith position as we don’t have a continuous block of memory (having which, would have made quick random access possible).  

Quick Sort requires a lot of access to different memory locations (randomly, not sequentially).  Therefore, quick sort used on linked list is slow and heavy.   
On the other hand, merge sort accesses data sequentially and the need for random access is low, which makes it very suitable for linked lists.
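
As an illustration of the sequential access pattern, here is a minimal merge sort on a singly linked list. The `Node` class and the helper names are assumptions made for this sketch. The list is split with slow/fast pointers and merged by relinking nodes, so no index-based access is ever needed.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def merge_sort_list(head):
    """Sort a singly linked list and return the new head."""
    if head is None or head.next is None:
        return head
    # Find the middle by walking the list: slow moves 1 step, fast moves 2
    slow, fast = head, head.next
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    second = slow.next
    slow.next = None                      # cut the list into two halves
    left = merge_sort_list(head)
    right = merge_sort_list(second)
    # Merge by relinking nodes; only the current heads are ever inspected
    dummy = tail = Node(None)
    while left and right:
        if right.value < left.value:
            tail.next, right = right, right.next
        else:
            tail.next, left = left, left.next
        tail = tail.next
    tail.next = left or right             # attach whatever remains
    return dummy.next


def from_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head):
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out


print(to_list(merge_sort_list(from_list([4, 1, 3, 1, 5, 9, 2, 6]))))
# [1, 1, 2, 3, 4, 5, 6, 9]
```

Unlike the array version, the merge step here needs no auxiliary array, only a constant number of pointers.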
 

# Q9

In [322]:
#Boyer Moore (from the assignment)
def find_boyer_moore(T, P):
    """
    Boyer-Moore string matching algorithm. Returns a tuple:
    the first index of the matching substring P in string T (-1 if there is none)
    and the number of character comparisons made.
    """
    n, m = len(T), len(P)
    if m == 0:
        return 0, 0
    #key is letter, value is the index of its last (rightmost) occurrence in P
    last = {}
    for k in range(m):
        last[P[k]] = k
    #i is for text, reading from the end of pattern.
    i = m-1
    #k is for the pattern, reading from the end of pattern.
    k = m-1
    
    count = 0
    
    while i < n:
        # If match, decrease i,k
        count += 1
        if T[i] == P[k]:
            if k == 0:
                return i,count
            else:
                i -= 1
                k -= 1
        # Not match , reset the positions
        else:
            #check if the element in the text which didn't match the pattern is elsewhere in the pattern. Get that position
            j = last.get(T[i], -1)
            #shift the entire pattern if j = -1 or shift until the matching character in the text.
            i += m - min(k, j+1)
            #k reset to end of pattern.
            k = m-1
    return -1,count

In [323]:
s = "GCATGACTGCGTGACC"   
p = "CTGC"

index,compare = find_boyer_moore(s,p)
print(f"{p} matches {s} starting at index {index}. The total number of comparisons made are {compare}")

CTGC matches GCATGACTGCGTGACC starting at index 6. The total number of comparisons made are 6


s = "GCATGACTGCGTGACC"   
p = "CTGC"

Boyer-Moore step by step

- Read from the end of the pattern (pattern_index=3, letter C) and corresponding position of the string (string_index=3, letter T). 
- Check if the letters match
- The two letters don't match. Number of comparisons made so far = 1
- Mismatch occurs between p's C and s's T. 
- Check if T is present in p. 
- It is. Get the position of T in p using the lookup table created at the beginning of the program (Named `last`). It is at position 1 of the pattern
- Now the pattern is shifted so that CTGC comes under ...ATGA... of the string.  

This completes the first iteration of the algorithm 

- Start from the end of the pattern again (pattern_index=3, letter C) and check if the letter C of p matches with letter A of s (string_index=5, letter A).  
- The two letters don't match. Number of comparisons made so far = 2
- Mismatch occurs between p's C and s's A. 
- Check if A is present in p. 
- It is not. So shift the entire pattern by the length of the pattern.
- Now the pattern is shifted so that CTGC comes under ...CTGC... of the string. 

This completes the second iteration of the algorithm

- Start from the end of the pattern again (pattern_index=3, letter C) and check if the letter C of p matches with letter C of s (string_index=9, letter C). 
- The two letters match. Number of comparisons made so far = 3. Continue such comparisons.
- G of p (pattern_index=2) matches with G (string_index=8) of s. Number of comparisons made so far = 4
- T of p (pattern_index=1) matches with T (string_index=7) of s. Number of comparisons made so far = 5
- C of p (pattern_index=0) matches with C (string_index=6) of s. Number of comparisons made so far = 6

This completes the algorithm. Output the string index and the number of comparisons made.






# Q10

In [234]:
def does_subset_with_sum_exist(arr, target_sum):

    n = len(arr)
     
    # The value of table[i][j] will be True if there is a subset of arr[0..i-1] with sum equal to j

    #make a table with all False values. Rows represent how many array elements are available and columns represent sums from 0 to target sum
    table = [[False for _ in range(target_sum + 1)] for __ in range(n + 1)] # O(target_sum*len(arr))
     
    # If selected sum is 0, then answer is True since the empty subset has sum 0.
    for i in range(n + 1):
        table[i][0] = True
         
    # If selected sum is not 0 and set is empty, then answer is false cause this is unattainable
    for i in range(1, target_sum + 1):
         table[0][i]= False
             
    # for each row in the table
    for i in range(1, n + 1): # O(len(array))
        #for each column of the table
        for j in range(1, target_sum + 1):  # O(target_sum)
            # if the selected sum is less than the array element, the element cannot be used: copy the boolean from the row above
            if j < arr[i-1]:
                table[i][j] = table[i-1][j]
            # otherwise, either skip the element (row above, same column) or use it (row above, column shifted back)
            # This is equivalent to moving one step up and x steps back where x is arr[i-1]
            else:
                table[i][j] = (table[i-1][j] or table[i - 1][j-arr[i-1]])
     
    return table[n][target_sum]

The time complexity of the dynamic programming implementation to check if a subset with sum x exists, given a set, is $O(nm)$  
where n is the size of the input array or set and m is the target sum value. It is a pseudo-polynomial time algorithm.

In [242]:
#Proof that `does_subset_with_sum_exist` works 
A = [3,7,11,15,21]
target_sum = 25
print(f"Does subset with sum {target_sum} exist in {A}?",does_subset_with_sum_exist(A, target_sum))

A = [3,7,11,15,21]
target_sum = 0
print(f"Does subset with sum {target_sum} exist in {A}?",does_subset_with_sum_exist(A, target_sum))

A = [3,7,11,15,21]
target_sum = 100
print(f"Does subset with sum {target_sum} exist in {A}?",does_subset_with_sum_exist(A, target_sum))

A = [15,15,15]
target_sum = 100
print(f"Does subset with sum {target_sum} exist in {A}?",does_subset_with_sum_exist(A, target_sum))

A = [15,15,15]
target_sum = 45
print(f"Does subset with sum {target_sum} exist in {A}?",does_subset_with_sum_exist(A, target_sum))


Does subset with sum 25 exist in [3, 7, 11, 15, 21]? True
Does subset with sum 0 exist in [3, 7, 11, 15, 21]? True
Does subset with sum 100 exist in [3, 7, 11, 15, 21]? False
Does subset with sum 100 exist in [15, 15, 15]? False
Does subset with sum 45 exist in [15, 15, 15]? True


# Q 11
Given an L table, the length of the LCS is given by the value at the bottom right of the table. In this case, the value at the bottom right is 6. Therefore the length of the LCS is 6.
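
For reference, a minimal sketch of how such an L table is filled. The function name `lcs_table` and the two example strings are illustrative and are not the strings from the exam question.

```python
def lcs_table(x, y):
    """L[i][j] = length of the LCS of x[:i] and y[:j]."""
    L = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, a in enumerate(x, 1):
        for j, b in enumerate(y, 1):
            if a == b:
                L[i][j] = L[i - 1][j - 1] + 1          # extend the diagonal
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])  # best of up / left
    return L


L = lcs_table("ABCBDAB", "BDCABA")
print(L[-1][-1])   # 4, the bottom-right entry
```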


# Programming Problem 2

In [298]:
def dynamic_knapsack(arr,target_sum):

    '''https://www.baeldung.com/cs/subset-of-numbers-closest-to-target'''

    n = len(arr)
    # dp[i][s]: best total (<= target_sum) reachable from a running total s using items i..n-1
    dp = [[None for _ in range (target_sum+1)] for __ in range(n+1)]
    for s in range(0,target_sum+1):
        dp[n][s] = s
    for i in range(n-1, -1,-1):
        for s in range(target_sum,-1,-1):
            pick = 0
            if s + arr[i] <= target_sum:
                pick = dp[i+1][s+arr[i]]
            leave = dp[i+1][s]
            dp[i][s] = max(pick,leave)
    # print("table",dp)
    return dp
 

In [309]:
table

[[8, 9, 9, 9, 9, 8, 9, 9, 9, 9],
 [7, 8, 9, 8, 9, 7, 8, 9, 8, 9],
 [5, 6, 7, 8, 9, 5, 6, 7, 8, 9],
 [5, 6, 7, 8, 9, 5, 6, 7, 8, 9],
 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]

In [308]:
A = [1,2,5,5]
target_sum = 9
table = dynamic_knapsack(A,target_sum)

add_to_sack = []
row = 0
while row<len(table)-1:
    if table[row+1][0] != table[row][0]:
        add_to_sack.append(A[row])
    row += 1

print(f"subset of elements summing closest to {target_sum} is {add_to_sack} with a sum of {table[0][0]}")

subset of elements summing closest to 9 is [1, 2, 5] with a sum of 8


In [310]:
A = list({5, 23, 27, 37, 48, 51, 63, 67, 71, 75, 70, 83, 889, 91, 101, 112, 121, 132, 137, 141, 143, 147, 153, 159, 171, 181, 190, 191})
target_sum = 726
table = dynamic_knapsack(A,target_sum)

add_to_sack = []
row = 0
while row<len(table)-1:
    if table[row+1][0] != table[row][0]:
        add_to_sack.append(A[row])
    row += 1

print(f"subset of elements summing closest to {target_sum} is {add_to_sack} with a sum of {table[0][0]}")

subset of elements summing closest to 726 is [63, 191, 70, 71, 75, 83, 91, 101, 112, 121] with a sum of 726
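
The elements printed above add up to 978, not 726: the reconstruction loop only compares column 0 of consecutive rows, so it ignores how much of the target has already been used. A sketch of a reconstruction that carries the running total along is given below, assuming positive integers; `closest_subset` is an illustrative name, and the input is written as a list so the order is reproducible.

```python
def closest_subset(arr, target_sum):
    """Return (subset, total) with the largest total not exceeding target_sum."""
    n = len(arr)
    # dp[i][s]: best total reachable from running total s using items i..n-1
    dp = [[0] * (target_sum + 1) for _ in range(n + 1)]
    for s in range(target_sum + 1):
        dp[n][s] = s
    for i in range(n - 1, -1, -1):
        for s in range(target_sum + 1):
            best = dp[i + 1][s]                       # leave item i
            if s + arr[i] <= target_sum:
                best = max(best, dp[i + 1][s + arr[i]])   # pick item i
            dp[i][s] = best
    # Walk the table from (0, 0), moving to the column of the running total
    chosen, s = [], 0
    for i in range(n):
        if s + arr[i] <= target_sum and dp[i + 1][s + arr[i]] == dp[i][s]:
            chosen.append(arr[i])
            s += arr[i]
    return chosen, dp[0][0]


print(closest_subset([1, 2, 5, 5], 9))   # ([1, 2, 5], 8)
```

An item is picked only when doing so still leads to the optimal total from the current running total, so the chosen elements always sum to `dp[0][0]`.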
