Question 1

In this programming problem and the next you'll code up the knapsack algorithm from lecture. This problem uses the file knapsack1.txt.

This file describes a knapsack instance, and it has the following format:

[knapsack_size][number_of_items]

[value_1] [weight_1]

[value_2] [weight_2]

...

For example, the third line of the file is "50074 659", indicating that the second item has value 50074 and size 659, respectively.

You can assume that all numbers are positive.  You should assume that item weights and the knapsack capacity are integers.

Calculate and return the value of the optimal solution.
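
As a quick illustration of the file format described above, here is a minimal sketch that parses a tiny instance held in a string. The name parse_instance and the sample data are hypothetical, and the sketch assumes the two numbers on each line are separated by whitespace.

```python
def parse_instance(text):
    """Parse a knapsack instance: a 'capacity count' header, then one 'value weight' pair per line."""
    # Assumption: fields are whitespace-separated; blank lines are ignored.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    capacity, count = (int(token) for token in rows[0])
    items = [(int(value), int(weight)) for value, weight in rows[1:]]
    if len(items) != count:
        raise ValueError(f"header promises {count} items, found {len(items)}")
    return capacity, items


# Hypothetical sample: capacity 8 and 6 items
sample = "8 6\n1 1\n2 3\n5 4\n5 2\n4 2\n1 5\n"
capacity, items = parse_instance(sample)
print(capacity, len(items), items[1])  # the third line "2 3" is the second item
```

The header occupies the first line, so line k + 1 of the file always describes item k.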

In [1]:
def load(filename):
    """
        Load a knapsack instance into a nested list: the first entry is
        [capacity, number of items]; every later entry is [value, weight]
        ex. [[8,6], [1,1], [2,3], [5,4], [5,2], [4,2], [1,5]]
        Returns the list of data
    """
    data = []

    with open(filename) as file:
        for line in file:
            # split() tolerates repeated spaces or tabs between the two numbers
            fields = line.split()
            if not fields:
                # skip blank lines
                continue
            value, weight = fields
            data.append([int(value), int(weight)])

    return data


def knapsack(data):
    """
        Knapsack algorithm with dynamic programming:
            Item i has value v(i) and weight w(i); there are n items and the max capacity is W.
            The goal is to maximize the total value without exceeding the max capacity.
            The basic logic is:
            For i in {1, 2, ... n} and capacity x in {0, 1, ... W}, the maximum value V(i,x) is
                V(i,x) = max of
                    1. V((i-1), x) (case 1, item i excluded)
                    2. value of i + V((i-1), x - w(i)) (case 2, item i included; only if w(i) <= x)
        Result is a dictionary in the form of {(count, capacity): value}; the entry at
        (number of items, max capacity) is the optimal solution of the knapsack algorithm
    """
    capacity = data[0][0]
    count = data[0][1]

    result = {(0, x): 0 for x in range(capacity + 1)}

    for i in range(1, count + 1):
        value, weight = data[i]
        for x in range(capacity + 1):
            if weight <= x:
                result[(i, x)] = max(result[(i - 1, x)], result[(i - 1, x - weight)] + value)
            else:
                # item i does not fit in capacity x, so it must be excluded
                result[(i, x)] = result[(i - 1, x)]

    return result


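def max_list(data, result):
    """
        Trace back through the table built by knapsack to recover the items of an optimal solution.
        Returns the item numbers (1-based, matching their position in data), highest first.
    """
    capacity = data[0][0]
    count = data[0][1]
    lis = []

    max_value = result[(count, capacity)]

    while max_value > 0:
        if max_value == result[(count-1, capacity)]:
            # the same value is reachable without this item, so treat it as excluded
            count -= 1
        else:
            # the item was included: record it, then continue with the remaining capacity and value
            lis.append(count)
            capacity -= data[count][1]
            max_value -= data[count][0]
            count -= 1

    return lis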
            
    

if __name__ == "__main__":
    import time
    start = time.time()
    data1 = load("knapsack1.txt")
    capacity = data1[0][0]
    count = data1[0][1]
    result = knapsack(data1)
    print("The value of the optimal solution of Knapsack Algorithm is: ", result[(count, capacity)])
    
    print()
    lis = max_list(data1, result)
    print("Items included in the optimal solution are: \n", lis)

    print()
    total = 0
    weight = 0
    for i in lis:
        total += data1[i][0]
        weight += data1[i][1]

    print(f"Testing of result: total of the items is: {total}, total weights of items is: {weight}")

    end = time.time()
    print()
    print(f"The run time of Knapsack Algorithm is {end-start} second(s).")


The value of the optimal solution of Knapsack Algorithm is:  2493893

Items included in the optimal solution are: 
 [100, 99, 93, 92, 91, 89, 83, 80, 78, 77, 76, 75, 72, 69, 66, 62, 59, 58, 57, 54, 50, 46, 43, 42, 41, 37, 32, 29, 26, 25, 24, 19, 16, 15, 14, 11, 10, 9, 7, 6]

Testing of result: total of the items is: 2493893, total weights of items is: 9976

The run time of Knapsack Algorithm is 0.8384654521942139 second(s).
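
The dictionary built by knapsack keeps every (i, x) entry, which is what makes the trace-back in max_list possible. If only the optimal value is needed, the same recurrence can be run with a single row. The following is a minimal sketch of that variant, not the code used for the output above; the name knapsack_value and the small test instances are hypothetical.

```python
def knapsack_value(capacity, items):
    """Return the optimal knapsack value using one row of the DP table.

    items is a list of (value, weight) pairs with integer weights.
    """
    # best[x] holds V(i, x) for the items processed so far
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Walk capacities downward so best[x - weight] still refers to the
        # previous row, which guarantees each item is used at most once.
        for x in range(capacity, weight - 1, -1):
            best[x] = max(best[x], best[x - weight] + value)
    return best[capacity]


# Hypothetical small instances for a sanity check
print(knapsack_value(8, [(1, 1), (2, 3), (5, 4), (5, 2), (4, 2), (1, 5)]))  # 14
print(knapsack_value(6, [(3, 4), (2, 3), (4, 2), (4, 3)]))  # 8
```

The trade-off is that the chosen items can no longer be recovered, because earlier rows are overwritten.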


Question 2

This problem also asks you to solve a knapsack instance, but a much bigger one. Use the file knapsack_big.txt.

This file describes a knapsack instance, and it has the following format:

[knapsack_size][number_of_items]

[value_1] [weight_1]

[value_2] [weight_2]

...

For example, the third line of the file is "50074 834558", indicating that the second item has value 50074 and size 834558, respectively.  As before, you should assume that item weights and the knapsack capacity are integers.

This instance is so big that the straightforward iterative implementation uses an infeasible amount of time and space.  So you will have to be creative to compute an optimal solution.  One idea is to go back to a recursive implementation, solving subproblems --- and, of course, caching the results to avoid redundant work --- only on an "as needed" basis.  Also, be sure to think about appropriate data structures for storing and looking up solutions to subproblems.

Calculate and return the value of the optimal solution.
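
The recursive, cached approach suggested above can be sketched with the standard library's functools.lru_cache, which stores results keyed by the function arguments. This is one possible formulation under stated assumptions, not the assignment's reference solution; the names knapsack_top_down and best are hypothetical, and the amount added to the recursion limit is an arbitrary safety margin.

```python
import sys
from functools import lru_cache


def knapsack_top_down(capacity, items):
    """Return the optimal value, solving only the subproblems that are reached.

    items is a list of (value, weight) pairs with integer weights.
    """
    # The recursion goes one level deeper per item, so make room for it.
    sys.setrecursionlimit(max(sys.getrecursionlimit(), len(items) + 1000))

    @lru_cache(maxsize=None)
    def best(i, x):
        # V(i, x): best value using the first i items with capacity x
        if i == 0 or x == 0:
            return 0
        value, weight = items[i - 1]
        skip = best(i - 1, x)  # case 1: item i excluded
        if weight > x:
            return skip  # item i does not fit
        return max(skip, best(i - 1, x - weight) + value)  # case 2: item i included

    return best(len(items), capacity)


# Hypothetical small instance for a sanity check
print(knapsack_top_down(8, [(1, 1), (2, 3), (5, 4), (5, 2), (4, 2), (1, 5)]))  # 14
```

Only the (i, x) pairs that are actually reached get an entry in the cache, which is what keeps large instances tractable when most capacities are never visited.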

In [2]:
cache = {}

def knapsack1(index, weight):
    """
        Knapsack algorithm with dynamic programming using caching and recursion. Same logic as the
        knapsack algorithm above, but only the subproblems that are actually needed get solved.
        Reads the items from the global list data, as returned by load.
    """
    if index <= 0 or weight <= 0:
        return 0

    # if the result was previously calculated, return the amount in cache
    if (index, weight) in cache:
        return cache[(index, weight)]

    item_value, item_weight = data[index]

    # case 1: V((i-1), x), item i excluded
    value1 = knapsack1(index - 1, weight)

    # case 2: value of i + V((i-1), x - w(i)), item i included.
    # If the available capacity is less than the current item's weight, this case yields 0.
    if item_weight <= weight:
        value2 = knapsack1(index - 1, weight - item_weight) + item_value
    else:
        value2 = 0

    # get the max value of case 1 and case 2, and cache the result
    value = max(value1, value2)
    cache[(index, weight)] = value
    return value
 
    


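if __name__ == "__main__":
    import sys
    import time
    start = time.time()
    data = load("knapsack_big.txt")
    index = data[0][1]
    capacity = data[0][0]
    # the recursion goes one level deeper per item, so make sure the limit allows for it
    sys.setrecursionlimit(max(sys.getrecursionlimit(), index + 1000))
    print("The value of the optimal solution of Knapsack Algorithm is: ", knapsack1(index, capacity))
    end = time.time()
    print(f"The run time of Knapsack Algorithm is {end-start} second(s).")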

The value of the optimal solution of Knapsack Algorithm is:  4243395
The run time of Knapsack Algorithm is 32.964741468429565 second(s).
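
The recursion above goes one level deeper per item, so very long item lists can run into Python's recursion limit. The same "as needed" idea can also be written without recursion: first collect the capacities each level requires, then fill in values bottom-up for just those capacities. This is a rough sketch of that alternative, not the code that produced the output above; the name knapsack_sparse is hypothetical.

```python
def knapsack_sparse(capacity, items):
    """Return the optimal value, evaluating only the (i, x) subproblems that are needed.

    items is a list of (value, weight) pairs with integer weights.
    """
    n = len(items)

    # Pass 1 (top-down): needed[i] is the set of capacities x for which V(i, x) is required.
    needed = [set() for _ in range(n + 1)]
    needed[n].add(capacity)
    for i in range(n, 0, -1):
        weight = items[i - 1][1]
        for x in needed[i]:
            needed[i - 1].add(x)  # case 1: item i excluded
            if x >= weight:
                needed[i - 1].add(x - weight)  # case 2: item i included

    # Pass 2 (bottom-up): compute V(i, x) for the needed capacities only.
    prev = {x: 0 for x in needed[0]}  # V(0, x) = 0
    for i in range(1, n + 1):
        value, weight = items[i - 1]
        cur = {}
        for x in needed[i]:
            best = prev[x]
            if x >= weight:
                best = max(best, prev[x - weight] + value)
            cur[x] = best
        prev = cur

    return prev[capacity]


# Hypothetical small instance for a sanity check
print(knapsack_sparse(8, [(1, 1), (2, 3), (5, 4), (5, 2), (4, 2), (1, 5)]))  # 14
```

It visits the same subproblems as the recursive version, but the second pass only keeps two levels of values in memory at a time.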
