---
format:
  html:
    code-line-numbers: true
    code-overflow: wrap
    code-block-bg: true
    code-block-border-left: true
    highlight-style: Arrow
---

# The Knapsack Problem {#sec-knapsack}

The knapsack problem is a classic combinatorial optimization problem in operations research. Given a set of items, each with a weight and a value, it asks for the subset with the greatest total value whose total weight stays within a given limit.

More formally, given $n$ items, where item $i$ has weight $w_i$ and value $v_i$, and a knapsack with maximum weight capacity $W$, the goal is to select a subset of items whose total weight is at most $W$ and whose total value is as large as possible.

This problem is NP-hard: no polynomial-time algorithm is known for it, and exact methods can become expensive on large instances. When the weights are integers, dynamic programming solves it in pseudo-polynomial time $O(nW)$. Other common approaches include branch and bound and heuristic methods, which trade optimality guarantees for speed. The knapsack problem has applications in many fields, including computer science, finance, and logistics. With a binary variable $x_i$ that equals 1 when item $i$ is selected and 0 otherwise, it can be written as the following integer program:

\begin{align}
    \text{max.} &\quad \sum_{i=1}^{n} v_i x_i \label{knapsack-obj} \\
    \text{s.t.} &\quad \sum_{i=1}^{n} w_i x_i \leq W \label{knapsack-cons1} \\
    &\quad x_i \in \{0, 1\}, \quad i = 1, \ldots, n
\end{align}
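
To make the formulation concrete, here is a minimal dynamic-programming sketch for the case of integer weights. The function name `knapsack_dp` and the three-item instance are illustrative assumptions and are not part of the data sets used in this chapter.

```python
def knapsack_dp(values, weights, capacity):
    """Solve a 0-1 knapsack with integer weights by dynamic programming.

    Illustrative sketch: best[c] is the best total value achievable with
    total weight at most c, using the items processed so far.
    """
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Visit capacities in decreasing order so each item is used at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


# Hypothetical toy instance: three items, capacity 50
print(knapsack_dp([60, 100, 120], [10, 20, 30], 50))  # 220
```

The two nested loops give the $O(nW)$ running time; the table only needs one row because each item updates it in place.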

In [55]:
class Item:
    """An item represents an object that can be placed within a knapsack
    """
    
    def __init__(self, index, profit):
        """constructor

        Args:
            index (int): index of the item, starting from 0
            profit (float): profit of choosing the item
        """
        self._index = index
        self._profit = profit
        self._attributes = {}
    
    @property
    def index(self): return self._index
    
    @property
    def profit(self): return self._profit
    
    def get_attribute(self, name):
        return self._attributes[name]

    def set_attribute(self, name, value):
        self._attributes[name] = value
        
    def __str__(self):
        # Join attributes with ", " so that several attributes stay readable
        attribute_str = ", ".join(
            f"{name}: {value}" for name, value in self._attributes.items())
        return f"index: {self._index}, profit: {self._profit}, " + attribute_str

In [65]:
class KnapsackDataCenter:
    
    def __init__(self):
        self._items = []
        self._capacities = {}
        
    def read_data_set_f(self, data_file: str):
        """Read and parse an instance in the format used at
        http://artemisa.unicauca.edu.co/~johnyortega/instances_01_KP/

        The first line holds the number of items and the knapsack capacity;
        each following line holds the profit and weight of one item.

        Args:
            data_file (str): path to the data file
        """
        with open(data_file) as f:
            # Header line: "<number of items> <capacity>"
            num_items, capacity = f.readline().split()
            num_items = int(num_items)
            self._capacities['weight'] = float(capacity)

            # Each remaining line: "<profit> <weight>"
            for item_idx, line in enumerate(f):
                # Stop after num_items items; any trailing lines are ignored
                if item_idx == num_items: break
                profit, weight = line.split()
                item = Item(item_idx, float(profit))
                item.set_attribute('weight', float(weight))
                self._items.append(item)
                
    @property
    def items(self): return self._items
    
    @property
    def capacities(self): return self._capacities
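
The file layout this reader expects can be illustrated without touching the disk. The sketch below parses the same two-column layout from an in-memory string; `parse_instance` and the sample text are hypothetical and only mirror the parsing logic above.

```python
import io


def parse_instance(stream):
    """Parse "<n> <capacity>" followed by n lines of "<profit> <weight>".

    Hypothetical helper for illustration; returns (capacity, items), where
    items is a list of (profit, weight) tuples.
    """
    num_items, capacity = stream.readline().split()
    items = []
    for line in stream:
        if len(items) == int(num_items):
            break  # ignore anything after the declared number of items
        profit, weight = line.split()
        items.append((float(profit), float(weight)))
    return float(capacity), items


# Made-up sample in the same layout: 3 items, capacity 10
sample = io.StringIO("3 10\n4 5\n3 4\n6 7\n")
capacity, items = parse_instance(sample)
print(capacity, items)  # 10.0 [(4.0, 5.0), (3.0, 4.0), (6.0, 7.0)]
```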

In [59]:
data_dir = "./data/knapsack/instances_01_KP/low-dimensional/"
data_file = data_dir + "f1_l-d_kp_10_269"

data_center = KnapsackDataCenter()
data_center.read_data_set_f(data_file)

items = data_center.items
capacities = data_center.capacities
for item in items:
    print(item)
print(capacities)

index: 0, profit: 55.0, weight: 95.0
index: 1, profit: 10.0, weight: 4.0
index: 2, profit: 47.0, weight: 60.0
index: 3, profit: 5.0, weight: 32.0
index: 4, profit: 4.0, weight: 23.0
index: 5, profit: 50.0, weight: 72.0
index: 6, profit: 8.0, weight: 80.0
index: 7, profit: 61.0, weight: 62.0
index: 8, profit: 85.0, weight: 65.0
index: 9, profit: 87.0, weight: 46.0
{'weight': 269.0}
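
As a quick sanity check on this instance, the sketch below compares a greedy heuristic (pick items by decreasing profit-to-weight ratio) with exhaustive enumeration of all $2^{10}$ subsets. The data is copied from the printout above; the names `instance`, `greedy`, and `brute_force` are illustrative assumptions.

```python
from itertools import combinations

# (profit, weight) pairs copied from the printed instance; capacity 269
instance = [(55, 95), (10, 4), (47, 60), (5, 32), (4, 23),
            (50, 72), (8, 80), (61, 62), (85, 65), (87, 46)]
capacity = 269


def greedy(items, capacity):
    """Take items by decreasing profit/weight ratio while they still fit."""
    total_profit, total_weight = 0, 0
    for profit, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if total_weight + weight <= capacity:
            total_profit += profit
            total_weight += weight
    return total_profit


def brute_force(items, capacity):
    """Enumerate every subset; only viable for very small instances."""
    best = 0
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(p for p, _ in subset))
    return best


print(greedy(instance, capacity))       # 294
print(brute_force(instance, capacity))  # 295
```

On this instance the greedy value falls one unit short of the exhaustive optimum, which illustrates why a heuristic alone gives no optimality guarantee.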


In [76]:
from ortools.linear_solver import pywraplp

class KnapsackSolver:
    
    def __init__(self, data_center):
        self._data_center = data_center

        self._solver = None
        self._var_x = None

        self._opt_obj = None
        self._opt_x = None
        
    def build_model(self):
        self._solver = pywraplp.Solver.CreateSolver('SCIP')

        self._create_variables()
        self._create_objective()
        self._create_constraints()
        
    def optimize(self):
        status = self._solver.Solve()
        # Store the solution only when optimality is proven;
        # otherwise opt_obj and the chosen items remain None.
        if status == pywraplp.Solver.OPTIMAL:
            self._opt_obj = self._solver.Objective().Value()
            items = self._data_center.items
            self._opt_x = [self._var_x[item.index].solution_value()
                        for item in items]
    
    def _create_variables(self):
        items = self._data_center.items
        self._var_x = [self._solver.BoolVar(name=f'x_{i}')
                    for i, item in enumerate(items)]
    
    def _create_objective(self):
        items = self._data_center.items
        obj_expr = [self._var_x[item.index] * item.profit for item in items]
        self._solver.Maximize(self._solver.Sum(obj_expr))
        
    def _create_constraints(self):
        items = self._data_center.items
        capacities = self._data_center.capacities
        expr = [self._var_x[item.index] * item.get_attribute('weight') for item in items]
        self._solver.Add(self._solver.Sum(expr) <= capacities['weight'])
        
    @property
    def opt_obj(self): return self._opt_obj
    
    def get_num_chosen_items(self):
        # solution_value() returns floats, so count the values close to 1
        return sum(1 for x in self._opt_x if x > 0.5)
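
MIP solvers such as SCIP typically rely on bounds from the LP relaxation during branch and bound. For a single knapsack constraint that relaxation has a simple closed-form solution: fill the knapsack by decreasing profit-to-weight ratio and take a fraction of the first item that does not fit. The sketch below computes this bound; `fractional_bound` is a hypothetical helper, independent of OR-Tools, and it assumes all weights are positive.

```python
def fractional_bound(profits, weights, capacity):
    """Upper bound on the 0-1 knapsack optimum from its LP relaxation.

    Illustrative helper (not part of OR-Tools); assumes all weights > 0.
    """
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    bound, remaining = 0.0, capacity
    for i in order:
        if weights[i] <= remaining:
            # The whole item fits
            bound += profits[i]
            remaining -= weights[i]
        else:
            # Take the fraction that fills the knapsack exactly, then stop
            bound += profits[i] * remaining / weights[i]
            break
    return bound


# Hypothetical toy instance: the bound (240) exceeds the integer optimum (220)
print(fractional_bound([60, 100, 120], [10, 20, 30], 50))  # 240.0
```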

In [77]:
# | echo: false
# | label: tbl-knapsack-data
# | tbl-cap: Computational results of Knapsack problems

from IPython.display import Markdown
from tabulate import tabulate
import os

data_dir = "./data/knapsack/instances_01_KP/low-dimensional/"
dir_list = os.listdir(data_dir)

table_data = []
for file in sorted(dir_list):
    data_file = os.path.join(data_dir, file)

    data_center = KnapsackDataCenter()
    data_center.read_data_set_f(data_file)

    knapsack_solver = KnapsackSolver(data_center)
    knapsack_solver.build_model()
    knapsack_solver.optimize()
    
    # One table row per instance
    table_data.append([
        file,
        len(data_center.items),
        data_center.capacities['weight'],
        knapsack_solver.get_num_chosen_items(),
        knapsack_solver.opt_obj,
    ])

col_names = ["Instance", "No. of Items", "Capacity", "No. of Chosen Items", "Optimal Value"]
Markdown(tabulate(table_data, headers=col_names))

Instance              No. of Items    Capacity    No. of Chosen Items    Optimal Value
------------------  --------------  ----------  ---------------------  ---------------
f10_l-d_kp_20_879               20         879                     17         1025
f1_l-d_kp_10_269                10         269                      6          295
f2_l-d_kp_20_878                20         878                     17         1024
f3_l-d_kp_4_20                   4          20                      3           35
f4_l-d_kp_4_11                   4          11                      2           23
f5_l-d_kp_15_375                15         375                      9          481.069
f6_l-d_kp_10_60                 10          60                      7           52
f7_l-d_kp_7_50                   7          50                      2          107
f8_l-d_kp_23_10000              23       10000                     11         9767
f9_l-d_kp_5_80                   5          80                      4          130