Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
NAME = "Lars Janssen"

---

For those not familiar with Python, a quick overview is given [here](https://github.com/palcu/python-for-competitive-programming/blob/master/python-for-competitive-programming.ipynb).

# Notebook BAPC week 13: Dynamic Programming II

In [2]:
from io import StringIO
from sys import stdin
# Overwrite the jupyter input function.
def input():
    return stdin.readline()

## The 0-1 Knapsack problem

Given the weights and values of $N$ items, the goal is to put items into a knapsack of capacity $W$ and maximize the total value in the knapsack. This problem is called 0-1 knapsack when you can pick every item either 0 or 1 times; there are no copies.
![](https://miro.medium.com/max/1368/0*3dS6Jw8NzzSD-mn8.jpg)

### Exercise 1: A brute force solution
Let us first build a naive brute force solution by simply enumerating all possible subsets of the items. We can encode a subset as a list of ones and zeros (`subset[n-1] == 1` when item `n` is in the subset, and `0` otherwise). Computing the total value and weight for a given subset is simple, so this leads to a (very slow) solution of the 0-1 Knapsack problem.
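
As a small illustration of the list encoding, the sketch below computes the total value and weight of a single subset. The helper name `subset_value_and_weight` is hypothetical and not part of the exercise; the data matches the last test case of this exercise.

```python
def subset_value_and_weight(subset, values, weights):
    """Return (total value, total weight) of the items marked with a 1 in `subset`."""
    # subset[n-1] == 1 means that (1-indexed) item n is in the subset.
    total_value = sum(v for chosen, v in zip(subset, values) if chosen)
    total_weight = sum(w for chosen, w in zip(subset, weights) if chosen)
    return total_value, total_weight

# Items 2 and 4 out of four items: value 40 + 50, weight 4 + 3.
print(subset_value_and_weight([0, 1, 0, 1], [10, 40, 30, 50], [5, 4, 6, 3]))  # (90, 7)
```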

Encoding these subsets can be done efficiently using a bitmask: on the computer, an integer is basically a list of ones and zeros, and bit `n-1` (counted from the right) is set exactly when item `n` is in our subset. First, finish the function `is_bit_set` below.
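
To make the encoding concrete, here is a minimal sketch of the opposite direction: turning a collection of 1-indexed item numbers into a bitmask. The helper `items_to_mask` is a hypothetical name introduced only for this illustration.

```python
def items_to_mask(items):
    """Encode a collection of 1-indexed item numbers as an integer bitmask."""
    mask = 0
    for n in items:
        mask |= 1 << (n - 1)  # Set bit n-1 for item n.
    return mask

mask = items_to_mask({1, 3})
print(mask, bin(mask))  # 5 0b101
```

Items 1 and 3 set bits 0 and 2, which gives $2^0 + 2^2 = 5$.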

In [102]:
def is_bit_set(subset, n):
    """ Returns whether bit `n` is set in the integer `subset`. """
    assert n >= 0
    # Shift bit `n` to the rightmost position and test it.
    return ((subset >> n) & 1) == 1

In [103]:
for n in range(10):
    print("%s in binary is %s. Its bits are enumerated as %s" % (n, bin(n), ', '.join(reversed(bin(n)[2:]))))
assert is_bit_set(1, 0) and not is_bit_set(1, 1)
assert is_bit_set(2, 1) and not is_bit_set(2, 0)
assert is_bit_set(3, 0) and is_bit_set(3, 1)

0 in binary is 0b0. Its bits are enumerated as 0
1 in binary is 0b1. Its bits are enumerated as 1
2 in binary is 0b10. Its bits are enumerated as 0, 1
3 in binary is 0b11. Its bits are enumerated as 1, 1
4 in binary is 0b100. Its bits are enumerated as 0, 0, 1
5 in binary is 0b101. Its bits are enumerated as 1, 0, 1
6 in binary is 0b110. Its bits are enumerated as 0, 1, 1
7 in binary is 0b111. Its bits are enumerated as 1, 1, 1
8 in binary is 0b1000. Its bits are enumerated as 0, 0, 0, 1
9 in binary is 0b1001. Its bits are enumerated as 1, 0, 0, 1


Study the `knapsack_bruteforce` code and answer the question below it. Note that the `values` and `weights` lists are zero-indexed while our `items` list is 1-indexed.

In [104]:
def knapsack_bruteforce(W, values, weights):
    N = len(values)
    max_value = 0
    for subset in range(2**N):
        items = [n for n in range(1, N+1) if is_bit_set(subset, n-1)]
        items_value = sum(values[n-1] for n in items)
        items_weight = sum(weights[n-1] for n in items)
        if items_value > max_value and items_weight <= W:
            max_value = items_value
    return max_value

In [105]:
assert knapsack_bruteforce(5, [1, 10, 100], [5, 5, 5]) == 100
assert knapsack_bruteforce(6, [5, 4, 3, 2], [4, 3, 2, 1]) == 9
assert knapsack_bruteforce(50, [60, 100, 120], [10, 20, 30]) == 220
assert knapsack_bruteforce(10, [10, 40, 30, 50], [5, 4, 6, 3]) == 90

What is the runtime of this brute force solution?

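$O(N \cdot 2^N)$: there are $2^N$ subsets, and for each subset we spend $O(N)$ time collecting its items and summing their values and weights (assuming `is_bit_set` runs in constant time).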

### Exercise 2: Deriving a Dynamic Programming solution
When the values and weights are positive integers, and the maximum weight is not too big, we can use DP to solve this problem efficiently.

Every item `n` (using 1-indexed notation, so $1 \leq n \leq N$) is either contained in the optimal subset, or it is not. Therefore, the maximum value `max_value[n][w]` that can be obtained using the first `n` items with weight capacity `w` is the maximum of
* `max_value[n-1][w]` (we *don't* choose item `n`)
* `max_value[n-1][w - weights[n-1]] + values[n-1]` (we *do* choose item `n`).

The following observations can be made:
* Obviously, if item `n` weighs more than what the knapsack can hold, we can't include it;
* `max_value[0][w] = 0` for all `w` (we take zero items);
* `max_value[n][0] = 0` for all `n` (we can't fit any items in a bag with no space);
* After building the `max_value` table, `max_value[N][W]` will contain our answer.
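
The recurrence and base cases above can also be evaluated top-down. The following is a minimal sketch using memoized recursion via `functools.lru_cache`, not the bottom-up table this exercise asks for; the function name `knapsack_topdown` is an assumption made for illustration.

```python
from functools import lru_cache

def knapsack_topdown(W, values, weights):
    """Evaluate the recurrence recursively, caching every (n, w) pair."""
    @lru_cache(maxsize=None)
    def best(n, w):
        if n == 0 or w == 0:  # Zero items, or no capacity left.
            return 0
        skip = best(n - 1, w)  # We do not choose item n.
        if weights[n - 1] > w:  # Item n does not fit.
            return skip
        take = best(n - 1, w - weights[n - 1]) + values[n - 1]  # We choose item n.
        return max(skip, take)
    return best(len(values), W)

print(knapsack_topdown(50, [60, 100, 120], [10, 20, 30]))  # 220
```

The recursion only visits the `(n, w)` pairs that are actually reachable, but its depth grows with the number of items, so the bottom-up version is usually the safer choice for large inputs.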

Finish the bottom-up DP code below.

In [29]:
def knapsack_dp_table(W, values, weights):
    N = len(values)
    max_value = [[-1 for _ in range(W+1)] for _ in range(N+1)]
    for w in range(W+1):  # We take zero items
        max_value[0][w] = 0
    for n in range(N+1):  # We can't fit any items in a bag with no space
        max_value[n][0] = 0
        
    # Build the bottom-up DP table
    for n in range(1, N+1):  # Consider subsets of the items 1, ..., n only.
        for w in range(1, W+1):  # The current weight capacity of the knapsack
            if weights[n-1] > w:  # Item n does not fit.
                max_value[n][w] = max_value[n-1][w]
            else:
                max_value[n][w] = max(max_value[n-1][w], max_value[n-1][w - weights[n-1]] + values[n-1])
    return max_value

def knapsack_dp(W, values, weights):
    return knapsack_dp_table(W, values, weights)[-1][W]

In [30]:
assert knapsack_dp(5, [1, 10, 100], [5, 5, 5]) == 100
assert knapsack_dp(6, [5, 4, 3, 2], [4, 3, 2, 1]) == 9, "Are you sure you never go over the weight capacity of the knapsack?"
assert knapsack_dp(50, [60, 100, 120], [10, 20, 30]) == 220
assert knapsack_dp(10, [10, 40, 30, 50], [5, 4, 6, 3]) == 90

### Note on complexity

As you may know, the decision version of the Knapsack problem is NP-complete. So how did we just find an efficient solution?

Note that in our algorithm, we fill an $(N+1) \times (W+1)$ array. This is polynomial in $N$ and $W$, but it is not polynomial in the input size of the problem. Because it takes only about $\log_2(W)$ bits to specify $W$, writing the size of the array as roughly $N \times 2^{\log_2(W)}$ reveals that our array is really exponential in the size of the input. Such a runtime is called *pseudo-polynomial*.
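
To see this growth in numbers, the sketch below compares the number of bits needed to write down $W$ with the number of table entries. The value `N = 100` and the helper `table_cells` are hypothetical and chosen only for illustration.

```python
def table_cells(N, W):
    """Number of entries in a DP table with N+1 rows and W+1 columns."""
    return (N + 1) * (W + 1)

N = 100  # Hypothetical number of items.
for W in (10**3, 10**6, 10**9):
    bits = W.bit_length()  # Number of bits needed to write down W.
    print(f"W = {W}: {bits} bits of input, {table_cells(N, W)} table cells")
```

Each time $W$ gains roughly 10 bits, the table becomes about 1000 times larger.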

### Exercise 3: Reconstructing the optimal subset

The above code will produce the right solution, and if we did everything right, it will do so in runtime $O(NW)$. However, the algorithm does not keep a record of which subset of items gives the optimal solution. This step, i.e., tracing which choices we made to arrive at an optimal solution, is called *backtracing*.

For the knapsack problem, the backtracing step is relatively easy. For each item `n` from `N` down to `1`, we check whether the optimal value for the first `n` items at the current capacity `w` can only be reached by including item `n`. If so, we add item `n` to our optimal subset and reduce `w` by its weight, so that the rest of the knapsack is again an optimal solution for the first `n-1` items.
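
One way to sketch this walk through the table is shown below. It uses an equivalent test, `table[n][w] != table[n-1][w]`, which can only be true when item `n` fits and was taken. The names `build_table` and `trace_items` are hypothetical, and the table is rebuilt here only to keep the example self-contained.

```python
def build_table(W, values, weights):
    """Bottom-up table: table[n][w] is the best value using items 1..n and capacity w."""
    N = len(values)
    table = [[0] * (W + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        for w in range(W + 1):
            table[n][w] = table[n - 1][w]  # We do not choose item n.
            if weights[n - 1] <= w:  # Item n fits, so we may choose it.
                table[n][w] = max(table[n][w],
                                  table[n - 1][w - weights[n - 1]] + values[n - 1])
    return table

def trace_items(W, values, weights):
    """Walk back from table[N][W] and collect the (1-indexed) chosen items."""
    table = build_table(W, values, weights)
    items, w = set(), W
    for n in range(len(values), 0, -1):
        if table[n][w] != table[n - 1][w]:  # Item n was needed to reach this value.
            items.add(n)
            w -= weights[n - 1]
    return items

print(sorted(trace_items(10, [10, 40, 30, 50], [5, 4, 6, 3])))  # [2, 4]
```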

In [53]:
def knapsack_subset(W, values, weights):
    N = len(values)
    max_value = knapsack_dp_table(W, values, weights)
    items = set()  # The set of items in our optimal subset.
    w = W  # The remaining weight capacity for the first `n` items.
    for n in reversed(range(1, N+1)):
        # Check that item n fits *before* indexing, so we never use a negative index.
        if weights[n-1] <= w and max_value[n-1][w - weights[n-1]] + values[n-1] > max_value[n-1][w]:
            items.add(n)
            w = w - weights[n-1]
    return items

In [54]:
assert knapsack_subset(6, [1, 10, 100], [5, 5, 5]) == set([3])
assert knapsack_subset(6, [5, 4, 3, 2], [4, 3, 2, 1]) == set([2,3,4])
assert knapsack_subset(52, [60, 100, 120], [10, 20, 30]) == set([2, 3])
assert knapsack_subset(10, [10, 40, 30, 50], [5, 4, 6, 3]) == set([2, 4])