# Linear Integer Optimization

In this lesson we will solve a slightly more complicated optimization problem with `Pyomo`, this time involving integer variables and larger variable sets.

The problem comes from earlier this week: a logistics company is trying to minimize the cost of supplying three different regions by choosing where to open warehouses.

Recall that we had a set of cities $C$ where warehouses could be opened, a set of regions $R$ to supply, weekly fixed costs $c_i$ for opening a warehouse in city $i$, demand $b_j$ from each region $j$, and shipping costs $t_{ij}$ from city $i$ to region $j$.

We need to choose values for the decision variables $y_i$ (1 if a warehouse is opened in city $i$, 0 otherwise) and $x_{ij}$ (how much supply to send from city $i$ to region $j$). We formulated the optimization problem as follows:


$$
\begin{aligned}
& \underset{x,y}{\text{minimize}}
& & \sum_{i \in C}c_i y_i + \sum_{i \in C}\sum_{j \in R} t_{ij} x_{ij}\\
& \text{subject to}
& & \sum_{j \in R} x_{ij} \leq 100 y_i, \forall \:  i \in C \\
& & & \sum_{i \in C} x_{ij} = b_j, \forall \:  j \in R \\
& & & x_{ij} \in \mathbb{Z}_{\geq 0}, \forall \:  i \in C, j \in R  \\
&&& y_i \in \{0,1\} , \forall \:  i \in C
\end{aligned}
$$
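
The first constraint is a forcing constraint: when $y_i = 0$ its right-hand side is 0, so nothing can be shipped from city $i$, and when $y_i = 1$ the city may ship up to 100 units in total. A minimal sketch of that logic in plain Python follows; the names `CAPACITY` and `forcing_ok` are hypothetical and are not part of the lesson's code.

```python
# Illustrative check of the forcing constraint sum_j x_ij <= 100 * y_i.
# CAPACITY and forcing_ok are hypothetical names used only for this sketch.
CAPACITY = 100

def forcing_ok(shipments_from_city, is_open):
    """Return True if one city's shipments satisfy the forcing constraint."""
    return sum(shipments_from_city) <= CAPACITY * is_open

print(forcing_ok([0, 0, 0], 0))    # closed warehouse, ships nothing -> True
print(forcing_ok([10, 0, 0], 0))   # closed warehouse, ships 10     -> False
print(forcing_ok([60, 40, 0], 1))  # open warehouse, ships 100      -> True
print(forcing_ok([60, 40, 1], 1))  # open warehouse, ships 101      -> False
```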

First, we define our fixed costs and demands as lists, using the values given in an earlier lesson this week:

In [1]:
#costs for each city
c = [400,500,300,150]

#demands from each region
b = [80,70,40]

We now read in our data table of shipping costs for city/region combinations using the `Pandas` package.

In [2]:
import pandas as pd
shipping_data = pd.read_csv('intprog_shippingcosts.csv')
shipping_data.head()

           City  Region 1  Region 2  Region 3
0      New York        20        40        50
1  Lost Angeles        48        15        26
2       Chicago        26        35        18
3       Atlanta        24        50        35


We extract the numerical data from this `DataFrame` by selecting the rows and columns we need with `.iloc`: all four rows, and columns 1 through 3 (the region columns; column 0 holds the city names).

In [3]:
import numpy as np
t = np.array(shipping_data.iloc[0:4,1:4])
print(t)

[[20 40 50]
 [48 15 26]
 [26 35 18]
 [24 50 35]]
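
Positional slicing with `.iloc` breaks silently if the column order of the file changes, so selecting the region columns by label can be a safer alternative. The sketch below is self-contained: instead of reading `intprog_shippingcosts.csv`, it builds a small table from an in-memory string whose layout is an assumption based on the column labels printed in this lesson. The names `csv_text`, `shipping_demo` and `region_cols` are hypothetical.

```python
import io
import pandas as pd

# Assumed file layout, reconstructed from the tables printed in this lesson.
csv_text = """City,Region 1,Region 2,Region 3
New York,20,40,50
Lost Angeles,48,15,26
Chicago,26,35,18
Atlanta,24,50,35
"""
shipping_demo = pd.read_csv(io.StringIO(csv_text))

# Select the cost columns by label rather than by position.
region_cols = ["Region 1", "Region 2", "Region 3"]
t_by_label = shipping_demo[region_cols].to_numpy()

# The positional selection used in the lesson gives the same array.
t_by_position = shipping_demo.iloc[0:4, 1:4].to_numpy()

print((t_by_label == t_by_position).all())  # True
print(t_by_label[1, 1])                     # 15
```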


We can now define our optimization problem in `Pyomo`. Since this problem has many variable elements and input parameters, it is useful to define sets `C` and `R` (cities and regions, respectively) as ranges, and use these sets to define our `Pyomo` variables. Note that we define `y` using the index `C` (one value for each city) and specify its domain as `Binary` (open = 1, closed = 0), and define `x` in the `NonNegativeIntegers` domain, indexed by both `C` and `R`.

Our objective function is written as an expression using `sum()` with generator expressions over our index sets. Compare the mathematical notation for the objective function $\sum_{i \in C}c_i y_i + \sum_{i \in C}\sum_{j \in R} t_{ij} x_{ij}$ to the expression within `Objective` below. Since this is a minimization problem, we set `sense` to `minimize`. (The objective is stored under the name `model.profit`, although it represents a total cost.)

Since we have many constraints in this problem, we create a list using `ConstraintList()`, and then create these constraints in a loop using the same notation as the objective function. As before, compare the code for these expressions to their mathematical notation above. 

We do not need to specify the integrality constraints in code, since we have already properly specified these domains when defining variables.

In [4]:
from pyomo.environ import *

# create a model
model = ConcreteModel()

#city indices
C = range(4)
#region indices
R = range(3)


# declare decision variables
model.y = Var(C,domain=Binary)
model.x = Var(C,R,domain=NonNegativeIntegers)

# declare objective
model.profit = Objective(expr =  sum(c[i]*model.y[i] for i in C) + 
                         sum(sum(t[i,j]*model.x[i,j] for i in C) for j in R ), sense=minimize)

# add model constraints
model.constraints = ConstraintList()
#forcing constraints
for i in C:
    model.constraints.add(sum(model.x[i,j] for j in R) <= 100*model.y[i])
#demand constraints
for j in R:
    model.constraints.add(sum(model.x[i,j] for i in C) == b[j])
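
To see what the objective and the two constraint families compute, it can help to evaluate them by hand for a candidate solution, without a solver. The following is a sketch in plain Python using the data from the cells above; the helper names `total_cost` and `is_feasible` and the candidate `x_try`, `y_try` are hypothetical and chosen only for illustration.

```python
# Data copied from the cells above (t written as nested lists).
c = [400, 500, 300, 150]
b = [80, 70, 40]
t = [[20, 40, 50], [48, 15, 26], [26, 35, 18], [24, 50, 35]]
C, R = range(4), range(3)

def total_cost(x, y):
    """Fixed opening costs plus shipping costs, as in the objective."""
    fixed = sum(c[i] * y[i] for i in C)
    shipping = sum(t[i][j] * x[i][j] for i in C for j in R)
    return fixed + shipping

def is_feasible(x, y):
    """Check the forcing and demand constraints for a candidate (x, y)."""
    forcing = all(sum(x[i][j] for j in R) <= 100 * y[i] for i in C)
    demand = all(sum(x[i][j] for i in C) == b[j] for j in R)
    return forcing and demand

# Hypothetical candidate: open only the first two cities.
y_try = [1, 1, 0, 0]
x_try = [[80, 0, 10], [0, 70, 30], [0, 0, 0], [0, 0, 0]]

print(is_feasible(x_try, y_try), total_cost(x_try, y_try))  # True 4830
```

Any feasible candidate like this one gives an upper bound on the optimal cost; the solver's job is to find the feasible point with the lowest value.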

We can now solve our model, using the same code as for linear optimization.

In [5]:
SolverFactory('glpk', executable='/usr/bin/glpsol').solve(model).write()

# = Solver Results                                         =
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Name: unknown
  Lower bound: 4570.0
  Upper bound: 4570.0
  Number of objectives: 1
  Number of constraints: 8
  Number of variables: 17
  Number of nonzeros: 29
  Sense: minimize
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Termination condition: optimal
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 5
      Number of created subproblems: 5
  Error rc: 0
  Time: 0.006490230560302734
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0


We can now display our minimum total cost and the variable values at optimality. Since it is convenient to view these variables in tables with their associated city and region labels, we create `Pandas` Data Frames for both `model.x` and `model.y`.

In particular, to look at the values of `model.x`, we borrow the structure of the `shipping_data` table (which has labels for both cities and regions) using the `.copy()` method, then loop through all indices of `model.x` and insert these flow values in their corresponding spot in the new Data Frame `x`.

In [6]:
#display solution
print('\nTotal cost = ', model.profit())

print('\nDecision Variables')

print('\ny: which warehouses open?')
#print y
print(pd.DataFrame({"City":shipping_data.City,"Open?":[model.y[i].value for i in C]}))

#print x
print('\nx: flow from warehouse in city i to region j')
x = shipping_data.copy()
for i in C:
  for j in R:
    x.iloc[i,j+1] = model.x[i,j].value
print(x)


Total cost =  4570.0

Decision Variables

y: which warehouses open?
           City  Open?
0      New York    1.0
1  Lost Angeles    1.0
2       Chicago    1.0
3       Atlanta    0.0

x: flow from warehouse in city i to region j
           City  Region 1   Region 2  Region 3
0      New York       80.0       0.0       0.0
1  Lost Angeles        0.0      70.0       0.0
2       Chicago        0.0       0.0      40.0
3       Atlanta        0.0       0.0       0.0


We now see our solution: we should open a warehouse in every city except Atlanta, supplying all of Region 1 from New York, all of Region 2 from Los Angeles, and all of Region 3 from Chicago, for a total cost of 4570.
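
As a sanity check, the reported objective value can be reproduced by hand from the data and the solution tables. The sketch below splits the total into fixed and shipping costs, and also prices one hypothetical alternative (Atlanta instead of New York serving Region 1) for comparison. The variable names are illustrative and not part of the lesson's code.

```python
# Data copied from the lesson (t written as nested lists).
c = [400, 500, 300, 150]
t = [[20, 40, 50], [48, 15, 26], [26, 35, 18], [24, 50, 35]]

def cost_breakdown(x, y):
    """Return (fixed cost, shipping cost) for a given solution."""
    fixed = sum(ci * yi for ci, yi in zip(c, y))
    shipping = sum(t[i][j] * x[i][j] for i in range(4) for j in range(3))
    return fixed, shipping

# Solution reported by the solver.
y_opt = [1, 1, 1, 0]
x_opt = [[80, 0, 0], [0, 70, 0], [0, 0, 40], [0, 0, 0]]
fixed_opt, ship_opt = cost_breakdown(x_opt, y_opt)
print(fixed_opt, ship_opt, fixed_opt + ship_opt)  # 1200 3370 4570

# Hypothetical alternative: Atlanta replaces New York for Region 1.
y_alt = [0, 1, 1, 1]
x_alt = [[0, 0, 0], [0, 70, 0], [0, 0, 40], [80, 0, 0]]
fixed_alt, ship_alt = cost_breakdown(x_alt, y_alt)
print(fixed_alt, ship_alt, fixed_alt + ship_alt)  # 950 3690 4640
```

The alternative saves 250 in fixed costs but adds 320 in shipping costs, so it ends up 70 more expensive than the solver's solution.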