[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mathcoding/opt4ds/blob/master/KnapsackProblem.ipynb)

# The Knapsack Problem
In this first notebook, we show how to solve the Knapsack Problem using Integer Linear Programming.

TODO: FORMAL PROBLEM DEFINITION

## Software Installation
If you are running this notebook in Colab, you do not need to install anything on your computer.

Otherwise, if you have installed the recommended Anaconda Python distribution, run the following commands (the third one is optional):

1. To install the [Pyomo](http://www.pyomo.org/) optimization modeling language:

```
conda install -c conda-forge pyomo
```

2. To install the open source [GLPK](https://www.gnu.org/software/glpk/) solver:

```
conda install -c conda-forge glpk
```

3. (Optional) You can install some extra packages of Pyomo using the following command:

```
conda install -c conda-forge pyomo.extras
```

For details about the Pyomo installation, we refer to the official [Pyomo Documentation](https://pyomo.readthedocs.io/en/stable/).

The following lines install Pyomo and GLPK if they are missing, for example when running this notebook in a [Colab](https://colab.research.google.com/github/jckantor/ND-Pyomo-Cookbook/blob/master/notebooks/02.01-Production-Models-with-Linear-Constraints.ipynb):

In [42]:
import shutil
import sys
import os.path

if not shutil.which("pyomo"):
    !pip install -q pyomo
    assert(shutil.which("pyomo"))

# The GLPK command line executable is called "glpsol"
if not (shutil.which("glpsol") or os.path.isfile("glpsol")):
    if "google.colab" in sys.modules:
        !apt-get install -y -qq glpk-utils
    else:
        try:
            # -y answers the confirmation prompt, which would block a notebook cell
            !conda install -y -c conda-forge glpk
        except Exception:
            pass


## Mixed Integer Programming model
The Knapsack problem can be formulated as follows.

The input data are: 

* The index set $I$ referring to the items
* The profit vector $c$
* The weight vector $A$
* The budget value $B$

For each item $i\in I$, we introduce a binary decision variable $x_i \in \{0,1\}$, which is used to define the following **Integer Linear Programming (ILP)** problem:

\begin{align}\label{eq:1}
\max \;\; & c^T x \\
\mbox{s.t.} \;\; & \sum_{i \in I} A_i x_i \leq B & \\
& x_i \in \{0,1\},& \forall i \in I.
\end{align}


Since $x_i=1$ represents the decision to select the $i$-th item, the objective function (1) maximizes the total profit of the selected items, written as the dot product $c^T\,x$. The single constraint (2) ensures that the sum of the weights $A_i$ of the selected items does not exceed the available capacity $B$. The constraints (3) impose the binary domain on the decision variables $x_i$.
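
As a minimal sketch of how the three parts of the model act on a candidate solution, the following plain-Python snippet evaluates the objective (1) and checks the constraints (2) and (3) for a given vector $x$. The three-item instance and the helper names `profit` and `is_feasible` are made up for illustration; they are not the data used in the rest of the notebook.

```python
# Hypothetical three-item instance, used only for illustration
c = [10, 7, 4]   # profits
A = [5, 4, 3]    # weights
B = 8            # budget


def profit(x):
    """Objective (1): the dot product c^T x."""
    return sum(c_i * x_i for c_i, x_i in zip(c, x))


def is_feasible(x):
    """Constraints (2) and (3): budget respected and every x_i binary."""
    binary = all(x_i in (0, 1) for x_i in x)
    weight = sum(a_i * x_i for a_i, x_i in zip(A, x))
    return binary and weight <= B


for x in ([1, 0, 1], [1, 1, 0], [0, 1, 1]):
    print(x, "profit =", profit(x), "feasible =", is_feasible(x))
```

In this made-up instance, `[1, 1, 0]` has the highest profit of the three candidates but weighs 9, which exceeds the budget of 8, so it is not a feasible solution of the ILP.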

## Pyomo Knapsack Model
The ILP model (1)-(3) can be expressed using the Pyomo optimization modeling language as shown next.

As a first step, we need to define the input data using standard Python data structures. The simplest way to define the data is as follows:

In [4]:
I = range(5)        # Items
C = [2, 3, 1, 4, 3] # Profits
A = [3, 4, 2, 1, 6] # Weights
B = 9               # Budget

This snippet of code defines the following input data:

* The set $I$ as the range of numbers from 0 to 4, using the [range](https://docs.python.org/3/library/functions.html#func-range) built-in class. A range is a lazy, immutable sequence: you can think of it as the list $[0,1,2,3,4]$ whose elements are produced on demand.
* The profit vector $C$ and the weight vector $A$ as two lists, using the standard [list()](https://docs.python.org/3/library/functions.html#func-list) built-in class.
* The budget parameter $B$ as an integer.
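
The short check below is a sketch and not part of the original notebook. It shows that `range(5)` behaves like the list `[0, 1, 2, 3, 4]` when it is iterated or indexed, and it verifies that the profit and weight lists have one entry per item. The requirement that weights are positive is an assumption made here for the sanity check.

```python
I = range(5)        # Items
C = [2, 3, 1, 4, 3] # Profits
A = [3, 4, 2, 1, 6] # Weights
B = 9               # Budget

# A range can be indexed, measured, and iterated more than once,
# but it only produces its elements on demand.
print(list(I))              # [0, 1, 2, 3, 4]
print(len(I), I[0], I[-1])  # 5 0 4

# Sanity checks on the input data
assert len(C) == len(I) and len(A) == len(I), "one profit and one weight per item"
assert all(a > 0 for a in A), "weights are assumed to be positive"
assert B >= 0, "the budget is assumed to be non-negative"
```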

So far, we have not used any **Pyomo** construct. Before we start, we need to import the Pyomo library:

In [5]:
from pyomo.environ import *

In [7]:
# If you want to check all the elements imported from the library uncomment the following line
# who

### Model and Variables
The first step in defining the model consists in choosing the type of Model we want to use. In Pyomo there are two options, the [ConcreteModel](https://pyomo.readthedocs.io/en/stable/library_reference/aml/index.html#pyomo.environ.ConcreteModel) and the [AbstractModel](https://pyomo.readthedocs.io/en/stable/library_reference/aml/index.html#pyomo.environ.AbstractModel). In this first example, we use the simpler ConcreteModel as follows:

In [8]:
# Create concrete model
model = ConcreteModel()

Naming the object `model` is recommended if you plan to use the Pyomo command line tool.

Once we have defined the model, we can define the binary decision variables by using the [Var](https://pyomo.readthedocs.io/en/stable/library_reference/aml/index.html#pyomo.environ.Var) class. We define an object of type **Var** indexed by the elements of the range $I$, with the following command:

In [9]:
# Variables
model.x = Var(I, within=Binary)

`Binary` is one of the predefined domains exported by `pyomo.environ`, and it is used to specify the constraint $x_i \in \{0,1\}$. Other possible values for the optional parameter `within` include `NonNegativeReals`, `PositiveReals`, and `PositiveIntegers`.

Note that the choice of $x$ as the name of the variable is arbitrary.

### Objective Function
The objective function is defined via the [Objective](https://pyomo.readthedocs.io/en/stable/library_reference/aml/index.html#pyomo.environ.Objective.construct) class, as follows:

In [15]:
# Objective Function: Maximize Profit
model.obj = Objective(expr = sum(C[i]*model.x[i] for i in I),
                      sense = maximize)

Again, the name `obj` is arbitrary, and you can choose the one you prefer. The parameter `expr` is required here, and it defines the objective function expression. In the example above, we use a generator expression (the same syntax as a list comprehension, without the square brackets) to define our linear objective $\sum_{i \in I} c_i x_i$. Note that the Python notation is very similar to the mathematical notation.

The parameter `sense` is optional, and it is used to define the type of objective function: `maximize` or `minimize`.

Pyomo itself does not restrict us to linear objective functions: it is the solver we use that limits the type of problem we can solve. As long as we use the GLPK solver, we can only define linear objective functions.

### Constraints
Finally, we need to define the budget constraint.
Constraints are defined using the [Constraint](https://pyomo.readthedocs.io/en/stable/library_reference/aml/index.html#pyomo.environ.Constraint) class. The minimal use of this class requires only the `expr` input parameter. In the knapsack problem, we define the budget constraint $\sum_{i \in I} A_i x_i \leq B$ as follows:

In [16]:
# Constraint
model.capacity = Constraint(expr = sum(A[i]*model.x[i] for i in I) <= B)

Constraints can also be named; in this case, we named it `capacity`. The name can be used to retrieve information about the status of the constraint in a solution, that is, given a solution $\bar x$, to check whether the constraint is slack, $\sum_{i \in I} A_i \bar x_i < B$, or tight, $\sum_{i \in I} A_i \bar x_i = B$.
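
For reference, substituting the data `C`, `A`, and `B` defined earlier into (1)-(3) gives the concrete instance that the objective and the capacity constraint declared so far describe. It is written out here only as a worked example of the general model:

\begin{align*}
\max \;\; & 2 x_0 + 3 x_1 + x_2 + 4 x_3 + 3 x_4 \\
\mbox{s.t.} \;\; & 3 x_0 + 4 x_1 + 2 x_2 + x_3 + 6 x_4 \leq 9 \\
& x_i \in \{0,1\}, \quad i = 0, \dots, 4.
\end{align*}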

### Solving the Pyomo model
The complete Pyomo model defined so far is as follows:

In [17]:
# Create concrete model
model = ConcreteModel()
# Variables
model.x = Var(I, within=Binary)
# Objective Function: Maximize Profit
model.obj = Objective(expr = sum(C[i]*model.x[i] for i in I),
                      sense = maximize)
# Constraint
model.capacity = Constraint(expr = sum(A[i]*model.x[i] for i in I) <= B)

To solve this model, we need a **solver**, that is, software that takes the model together with its data and solves the corresponding instance of the problem. In this notebook, we use the GLPK solver, created through a `SolverFactory` as follows:

In [19]:
# Solve the model
sol = SolverFactory('glpk').solve(model)

In [30]:
# Basic info about the solution process
for info in sol['Solver']:
    print(info)


Status: ok
Termination condition: optimal
Statistics: 
  Branch and bound: 
    Number of bounded subproblems: 1
    Number of created subproblems: 1
Error rc: 0
Time: 0.014999628067016602



Finally, to inspect the solution, we can query the solved model through the names of the variables, the objective function, and the constraints:

In [31]:
# Report solution value
print("Optimal solution value: z =", model.obj())
print("Decision variables:")
for i in I:
    print("x_{} = {}".format(i, model.x[i]()))
print("Capacity left in the knapsack:", B-model.capacity())

Optimal solution value: z = 9.0
Decision variables:
x_0 = 1.0
x_1 = 1.0
x_2 = 0.0
x_3 = 1.0
x_4 = 0.0
Capacity left in the knapsack: 1.0


In this case, we have found a solution with value equal to 9, given by selecting the three items $[0, 1, 3]$. One unit of capacity is left in the knapsack, but since none of the remaining items has weight equal to 1, and we cannot take fractional items, that capacity stays unused.
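
Because this instance has only $2^5 = 32$ candidate selections, the solver's answer can be cross-checked by exhaustive enumeration. The following sketch is plain Python and does not use Pyomo; the helper names `weight`, `profit`, and `best_x` are introduced here only for illustration, and the approach is practical only for very small instances.

```python
from itertools import product

I = range(5)        # Items
C = [2, 3, 1, 4, 3] # Profits
A = [3, 4, 2, 1, 6] # Weights
B = 9               # Budget


def weight(x):
    """Total weight of the items selected by the 0/1 vector x."""
    return sum(A[i] * x[i] for i in I)


def profit(x):
    """Total profit of the items selected by the 0/1 vector x."""
    return sum(C[i] * x[i] for i in I)


# Enumerate every 0/1 vector, keep the feasible ones, pick the most profitable
feasible = [x for x in product((0, 1), repeat=len(I)) if weight(x) <= B]
best_x = max(feasible, key=profit)

print("Best selection:", best_x)
print("Profit:", profit(best_x))
print("Capacity left:", B - weight(best_x))
```

On this data the enumeration agrees with GLPK: items 0, 1, and 3 are selected, for a profit of 9 and one unit of unused capacity.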

## Complete Python Script
The whole Python script for solving the Knapsack problem is as follows:

In [33]:
# Import the libraries
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, SolverFactory
from pyomo.environ import maximize, Binary

# CONCRETE MODEL: Data First, then model
I = range(5)        # Items
C = [2, 3, 1, 4, 3] # Profits
A = [3, 4, 2, 1, 6] # Weights
B = 9               # Budget

# Create concrete model
model = ConcreteModel()

# Variables
model.x = Var(I, within=Binary)

# Objective Function: Maximize Profit
model.obj = Objective(expr = sum(C[i]*model.x[i] for i in I),
                      sense = maximize)

# Constraint
model.capacity = Constraint(expr = sum(A[i]*model.x[i] for i in I) <= B)

# Solve the model
sol = SolverFactory('glpk').solve(model)

# Basic info about the solution process
for info in sol['Solver']:
    print(info)
    
# Report solution value
print("Optimal solution value: z =", model.obj())
print("Decision variables:")
for i in I:
    print("x_{} = {}".format(i, model.x[i]()))
print("Capacity left in the knapsack:", B-model.capacity())


Status: ok
Termination condition: optimal
Statistics: 
  Branch and bound: 
    Number of bounded subproblems: 1
    Number of created subproblems: 1
Error rc: 0
Time: 0.015997886657714844

Optimal solution value: z = 9.0
Decision variables:
x_0 = 1.0
x_1 = 1.0
x_2 = 0.0
x_3 = 1.0
x_4 = 0.0
Capacity left in the knapsack: 1.0
