## 1. Greedy

There are n houses in a village and one large water tank nearby. You are required to supply water to all houses by laying pipes. For each house you can either (i) build a pipe to the water tank or (ii) build a pipe to a house that already has water. You are given an array T, where T[i] is the cost of building a pipe between the water tank and house i. You are also given a list H, where H[i] is a tuple (j, c) indicating that a pipe between houses i and j costs c. The cost c is non-negative.
Your task is to supply water to all the houses at minimum cost. Describe a greedy algorithm to solve the task and prove the correctness of your algorithm. Your proof of correctness can use an exchange argument.
(Hint: Model the problem as a graph and devise an algorithm for connecting the vertices.) 


<div style="color:blue">

Method

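* We create a weighted undirected graph $G$ in which the water tank $t$ and each house $i$ are represented as vertices.
* For each house $i$, we add an edge $(t, i)$ between the water tank $t$ and the house $i$ with weight $T[i]$.
* For each house $i$ and each tuple $(j, c)$ stored in $H[i]$, we add an edge $(i, j)$ between houses $i$ and $j$ with weight $c$.
* We compute a minimum spanning tree (MST) of $G$ with Kruskal's or Prim's algorithm and build exactly the pipes that correspond to the MST edges.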

Correctness Proof

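* **Reduction to MST**: Every house must be connected to the water tank $t$, either directly or through other houses, so a feasible set of pipes is a connected spanning subgraph of $G$. Since the pipe costs are non-negative, removing an edge from a cycle never increases the cost, so some optimal solution is a spanning tree. The minimum cost is therefore the total weight of an MST of $G$.
* **Greedy Choice Property**: Kruskal's algorithm repeatedly picks the lightest remaining edge that does not form a cycle, and Prim's algorithm repeatedly picks the lightest edge that leaves the tree built so far. In both cases the chosen edge is a lightest edge crossing some cut of $G$.
* **Exchange Argument**: Let $e$ be a lightest edge crossing a cut, and suppose an optimal spanning tree $T^*$ does not contain $e$. Adding $e$ to $T^*$ creates a cycle, and that cycle contains another edge $f$ crossing the same cut with $w(f) \ge w(e)$. Replacing $f$ with $e$ gives a spanning tree whose weight is no larger, so there is always an optimal solution that contains the greedy choice. For the full argument, refer to: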
  * [JHU 600.363 Lecture 14](https://www.cs.jhu.edu/~mdinitz/IntroAlgorithms/Lectures/lecture14.pdf) Page 25
  * [Stanford CS 161 Slides](https://web.stanford.edu/class/archive/cs/cs161/cs161.1138/lectures/14/Small14.pdf)

</div>
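The method above can be sketched in Python as follows. This is a minimal illustration using Kruskal's algorithm with a simple union-find, not a reference solution. The function name `min_water_cost` and the input format are assumptions made for the example: `T` is the list of tank costs, and `pipes` is a hypothetical flat list of `(i, j, c)` triples with 0-based house indices, built from the list H.

```python
def min_water_cost(T, pipes):
    """Return the minimum total cost to supply water to every house.

    T[i]  : cost of a pipe between the tank and house i
    pipes : iterable of (i, j, c) triples, a pipe between houses i and j
            with cost c (hypothetical input format for this sketch)
    """
    n = len(T)
    tank = n  # the tank is modelled as one extra vertex with index n

    # Build the edge list of G: tank-house edges plus house-house edges.
    edges = [(T[i], tank, i) for i in range(n)]
    edges += [(c, i, j) for i, j, c in pipes]
    edges.sort()  # Kruskal processes edges in order of increasing weight

    parent = list(range(n + 1))

    def find(x):
        # Find the component representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for c, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so it forms no cycle
            parent[ru] = rv
            total += c
    return total


print(min_water_cost([1, 2, 2], [(0, 1, 1), (1, 2, 1)]))  # 3
```

In the usage line, house 0 is connected to the tank (cost 1), and houses 1 and 2 are reached through house-to-house pipes (cost 1 each). Sorting the edges dominates the running time, which is $O(E \log E)$ for $E$ edges.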


## 2. Dynamic Programming

You are given an integer array cost, where cost[i] is the cost of the i-th step on a staircase. Once you pay the cost, you can either climb one step or two steps. Design a dynamic programming algorithm which finds (i) the minimum cost to reach the top floor and (ii) the sequence of steps to achieve that minimum.

Example 1:

Input: cost = [10, **15**, 20]

Output: minimum cost = 15, sequence of steps = [1]


Example 2:

Input: cost = [**1**, 100, **1**, 1, **1**, 100, **1**, **1**, 100, **1**]

Output: minimum cost = 6, sequence of steps = [0, 2, 4, 6, 7, 9]

You are required to provide the recurrence relation and write pseudocode.


<div style="color:blue">

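Let `dp[i]` represent the minimum total cost of a path that ends by paying for the `i`-th step, and let `cost[i]` represent the cost of the `i`-th step. The recurrence relation is defined as:
* For the first step, `dp[0] = cost[0]`
* For the second step, `dp[1] = cost[1]`
* For the remaining steps, `dp[i] = cost[i] + min(dp[i-2], dp[i-1])`
* The top floor can be reached from either of the last two steps, so the answer is `min(dp[n-1], dp[n-2])`, where `n = len(cost)`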

</div>


In [24]:
def findMinCostAndPath(cost):
    """Return (minimum cost, list of paid step indices) to reach the top."""
    n = len(cost)
    if n == 0:
        return (0, [])
    if n == 1:
        return (cost[0], [0])
    if n == 2:
        min_cost = min(cost[0], cost[1])
        min_step = 0 if cost[0] < cost[1] else 1
        return (min_cost, [min_step])

    # dp[i] = minimum total cost of a path that ends by paying for step i
    dp = [float('inf') for _ in range(n)]
    dp[0] = cost[0]
    dp[1] = cost[1]

    # Fill every step up to and including n-1
    for i in range(2, n):
        dp[i] = cost[i] + min(dp[i-1], dp[i-2])

    # The top is reachable from either of the last two steps
    min_cost = min(dp[n-1], dp[n-2])

    # Backtrack from whichever of the last two steps is cheaper to reach
    i = n-1 if dp[n-1] <= dp[n-2] else n-2

    path = []
    while i >= 0:
        path.append(i)
        if i == 1 or i == 0:
            break
        # Move to the predecessor that produced dp[i]
        if dp[i-1] < dp[i-2]:
            i = i - 1
        else:
            i = i - 2

    path.reverse()
    return (min_cost, path)

In [25]:
for s in [[10, 15, 20], # Output: minimum cost = 15, sequence of steps = [1]
          [1, 100, 1, 1, 1, 100, 1, 1, 100, 1]]: # cost = 6, sequence = [0, 2, 4, 6, 7, 9]
    print(findMinCostAndPath(s))

(15, [1])
(6, [0, 2, 4, 6, 7, 9])


## 3. Dynamic Programming: bookshelf

You are given n books, b1, b2, ..., bn that need to be arranged into a bookshelf. The books are already sorted by their indices. Each book bi has thickness ti and height hi. The books must be arranged in the given order of their indices, from the lowest level to the highest level of the bookshelf. The bookshelf has a total width of L, and the height of each level on the bookshelf can be adjusted.
The aim is to minimize the total space usage of the n books, defined as the sum of the heights of the highest book on each level, multiplied by the bookshelf width L. An illustration is shown below (the figure may not show an optimal solution):

Example: we have three books b1,b2,b3. The thickness values are: t1 = 1, t2 = 1 and t3 = 1. The heights of the books are: h1 = 1, h2 = 2, h3 = 3. The width of bookshelf L = 2. The optimal solution is to put b1 on level 1 and put b2 and b3 on level 2, which results in a total space usage of 8.
Please design a dynamic programming algorithm to find the minimum total space usage of the n books. Please define the subproblem(s) and give the recurrence relations. Analyze the time and space complexity of your algorithm. The backtracking step and pseudocode are not required.

<div style="color:blue">
    
See [LeetCode 1105](https://leetcode.com/problems/filling-bookcase-shelves/description/)
    
</div>
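One possible formulation, given here as a sketch rather than as the official solution: let `dp[i]` be the minimum total shelf height needed for the first `i` books, with `dp[0] = 0`. The last level holds books `j..i` for some `j` whose thicknesses sum to at most `L`, so `dp[i] = min over valid j of (dp[j-1] + max(h_j, ..., h_i))`, and the answer is `dp[n] * L`. The function name `min_shelf_space` and the list-based inputs are assumptions made for the example, and every single book is assumed to fit on a level.

```python
def min_shelf_space(thickness, height, width):
    """Minimum total space (sum of level heights times width).

    thickness[i], height[i] describe book i (0-based); width is L.
    Assumes every single book is no thicker than the shelf width.
    """
    n = len(thickness)
    INF = float("inf")
    # dp[i] = minimum total height of the levels holding the first i books
    dp = [0] + [INF] * n

    for i in range(1, n + 1):
        used = 0      # total thickness of the books on the last level
        tallest = 0   # height of the tallest book on the last level
        # Try every start j of the last level, extending it backwards from i.
        for j in range(i, 0, -1):
            used += thickness[j - 1]
            if used > width:
                break  # books j..i no longer fit on one level
            tallest = max(tallest, height[j - 1])
            dp[i] = min(dp[i], dp[j - 1] + tallest)

    return dp[n] * width


print(min_shelf_space([1, 1, 1], [1, 2, 3], 2))  # 8
```

The usage line reproduces the example from the problem statement. The two nested loops give $O(n^2)$ time, and the `dp` array uses $O(n)$ space.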

## 4. NP-complete

The 2-PARTITION problem is: given a set $S$ of numbers, determine whether $S$ can be partitioned into two sets, $A$ and $S - A$, such that $\sum_{x \in A} x = \sum_{x \in (S-A)} x$. Please prove that the 2-PARTITION problem is NP-complete using the fact that the SUBSET-SUM problem is NP-complete. In SUBSET-SUM$(X, k)$, we are given a set $X = \{x_1, \dots, x_n\}$ of integers and a target number $k$, and we want to find a subset $Y \subseteq X$ such that the members of $Y$ sum up to $k$.

Hint: to construct a 2-PARTITION instance $S$ given a SUBSET-SUM instance $(X, k)$, you can consider adding one number to $X$. This number can be calculated from the following two variables: (1) the sum of all numbers in $X$; (2) $k$.


<div style="color:blue">

Reference
* [Set partition is NP complete (Geeksforgeeks)](https://www.geeksforgeeks.org/set-partition-is-np-complete/)

### 2-PARTITION is in NP

Given a candidate partition of $S$ into two subsets (the certificate), we can verify in polynomial time whether the sums of the two subsets are equal. This involves summing up the elements in each set, which takes $O(n)$ time for $n = |S|$.

### 2-PARTITION is NP-Hard

We construct an instance of the 2-PARTITION problem from an instance $(X, k)$ of SUBSET-SUM. Let $\mathrm{Sum}(X)$ denote the sum of all numbers in $X$. We construct a new set $S = X \cup \{\mathrm{Sum}(X) - 2k\}$ by adding the single number $\mathrm{Sum}(X) - 2k$.

The total sum of elements in $S$ is $\text{Sum}(X) + (\text{Sum}(X) - 2k) = 2 \times \text{Sum}(X) - 2k$. If there exists a subset $Y \subseteq X$ such that $\sum_{y \in Y} y = k$ (the SUBSET-SUM problem), then the subset $Y \cup \{\text{Sum}(X) - 2k\}$ in $S$ (formed by adding the additional number to $Y$) will have a sum of $k + (\text{Sum}(X) - 2k) = \text{Sum}(X) - k$. The rest of the elements in $S$ (which are the elements in $X$ not in $Y$) will also sum up to $\text{Sum}(X) - k$, thus providing a valid partition for the 2-PARTITION problem.

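Conversely, suppose $S$ can be partitioned into two sets with equal sums. Each side must then sum to half of the total, which is $\text{Sum}(X) - k$. Consider the side that contains the added number $\text{Sum}(X) - 2k$: its remaining elements all belong to $X$ and sum to $(\text{Sum}(X) - k) - (\text{Sum}(X) - 2k) = k$, so they form a solution to the original SUBSET-SUM instance. Hence $(X, k)$ is a yes-instance of SUBSET-SUM if and only if $S$ is a yes-instance of 2-PARTITION.

This reduction is done in polynomial time: calculating the sum of an array takes $O(n)$, where $n$ is the number of elements in the array, and appending the number takes $O(1)$.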
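The reduction can be checked on small inputs with the sketch below. It is only a sanity check, not part of the proof: both problems are solved by brute force over all subsets, which takes exponential time. The function names are hypothetical, and instances are represented as Python lists so that a repeated value is treated as a separate element.

```python
from itertools import combinations


def reduce_subset_sum_to_partition(X, k):
    """Build the 2-PARTITION instance S = X plus the number Sum(X) - 2k."""
    return list(X) + [sum(X) - 2 * k]


def has_subset_sum(X, k):
    """Brute force: does some subset of X (possibly empty) sum to k?"""
    return any(
        sum(subset) == k
        for r in range(len(X) + 1)
        for subset in combinations(X, r)
    )


def has_equal_partition(S):
    """Brute force: can S be split into two parts with equal sums?"""
    total = sum(S)
    if total % 2 != 0:
        return False
    # One side sums to total/2 exactly when the other side does too.
    return has_subset_sum(S, total // 2)


X = [3, 34, 4, 12, 5, 2]
for k in (9, 30, 13, 1):
    S = reduce_subset_sum_to_partition(X, k)
    # The two answers should agree for every k.
    print(k, has_subset_sum(X, k), has_equal_partition(S))
```

For each `k`, the two printed booleans agree, matching the "if and only if" in the proof.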

Since 2-PARTITION is in NP and also NP-Hard, we have that 2-PARTITION is NP-Complete.

</div>