# Tutorial: Beyond Linear Programming (CPLEX Part 2)

This notebook describes some techniques beyond Linear Programming (LP) and the conditions under which they should be used. 

Before continuing, make sure you have completed CPLEX Tutorial Part 1.

After completing this unit, you should be able to describe the differences between Linear Programming (LP), Integer Programming (IP), and Mixed-Integer Programming (MIP). You should also be able to construct a simple MIP model. 

>This notebook is part of **[Prescriptive Analytics for Python](http://ibmdecisionoptimization.github.io/docplex-doc/)**
>
>It requires either an [installation of CPLEX Optimizers](http://ibmdecisionoptimization.github.io/docplex-doc/getting_started.html) or it can be run on [IBM Watson Studio Cloud](https://www.ibm.com/cloud/watson-studio/) (sign up for a [free IBM Cloud account](https://dataplatform.cloud.ibm.com/registration/stepone?context=wdp&apps=all)
and you can start using Watson Studio Cloud right away).


Table of contents:

* [CPLEX Modeling for Python](#Use-IBM-Decision-Optimization-CPLEX-Modeling-for-Python)
* [Integer Optimization](#Integer-Optimization)

We will use DOcplex to write small samples to illustrate the topics.

## Use IBM Decision Optimization CPLEX Modeling for Python

Let's use the [DOcplex](http://ibmdecisionoptimization.github.io/docplex-doc/) Python library to write sample models in Python.

In [None]:
# ALERT: execute this cell to install DOcplex! 
!pip install docplex cplex

### Step 1: Import the library

First import *docplex*.

In [None]:
import docplex.mp.model as cplex

# Integer Optimization

In this topic, you’ll learn how to deal with integer decision variables by using Integer Programming and Mixed-Integer Programming, and how these techniques differ from LP.

## Problems requiring integers

For some optimization problems the decision variables should take integer values. 

- One example is problems involving the production of large indivisible items, such as airplanes or cars.  It usually does not make sense to use a continuous variable to represent the number of airplanes to produce, because there is no point in manufacturing a partial airplane, and each finished airplane involves a large cost.  

- Another example of where one would use integer variables is to model a particular state, such as on or off. For example, in a unit commitment problem, integer variables are used to represent the state of a particular unit being either on or off.  

- Planning of investments also requires integer variables, for example a variable that takes a value of 1 to invest in a warehouse, and 0 to ignore it.  

Finally, integer variables are often used to model logic between different decisions, for example that a given tax break is only applicable if a certain investment is made.

## Different types of integer decisions

Many types of decisions can be modeled by using integer variables. 

One example is yes/no decisions, with a value of 1 for yes, and 0 for no. For example, if x equals 1, new manufacturing equipment should be installed, and if x equals 0, it should not. 

Integer variables are also used to model state or mode decisions. For example, if z1 equals 1 the machine operates in mode 1, if z2 equals 1 the machine operates in mode 2, and if z3 equals 1 the machine operates in mode 3.  The same integer variable is often used to express both yes/no decisions and logic. For example, y1 equals 1 could in this case also be used to indicate that machine 1 is installed, and 0 otherwise. 

Finally, integer variables are used to model cases where a value can take only integer values, for example, how many flights a company should operate between two airports.

## Types of integer variables

In general, integer variables can take any integer value, such as 0, 1, 2, 3, and so forth.  Integer variables that should only take the values 0 or 1 are known as binary variables. They are also often referred to as Boolean variables, because the Boolean values true and false are analogous to 1 and 0. 

To ensure that an integer variable can only take the values 0 and 1, one can give it an upper bound of 1 or declare it to be of type binary. In a DOcplex model, decision variables are assumed to be nonnegative unless otherwise specified, so the lower bound of 0 does not need to be declared explicitly. 

### Declaring integer decision variables in DOcplex

DOcplex has specific methods to create integer and binary variables.

In [None]:
import docplex.mp.model as cplex
with cplex.Model('integer_programming') as mdl:
    b = mdl.binary_var(name='b')  # b is a binary variable
    z = mdl.integer_var(name='z') # z is an integer variable
    mdl.print_information()

## Modeling techniques with integer and binary variables

Integer and binary variables are very useful to express logical constraints. Here are a few examples of
such constraints.

### Indicator variables

Indicator variables are binary variables used to indicate whether a certain set of conditions is valid (with the variable equal to 1) or not (with the variable equal to 0). For example, consider a production problem where you want to distinguish between two states, namely production above a minimum threshold, and no production. 

To model this, define a binary variable $y$ to take a value of 1 if the production is above the minimum threshold (called minProd), and 0 if there is no production. Assume $production$ is a continuous variable containing the produced quantity. 
This leads to these two constraints.

$$
production \ge minProd \cdot y\\
production \le maxProd \cdot y
$$

Here, maxProd is an upper bound on the production quantity. Thus, if y = 1, the minimum and maximum production bounds hold, and if y = 0, the production is set to zero. 
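The effect of the two constraints can be checked by hand for a few values. The sketch below is plain Python rather than a DOcplex model, and the bounds of 50 and 200 are hypothetical values chosen only for illustration.

```python
# Hypothetical bounds, chosen only for illustration.
MIN_PROD = 50
MAX_PROD = 200

def satisfies_indicator(production, y):
    """Return True if (production, y) satisfies both indicator constraints."""
    return MIN_PROD * y <= production <= MAX_PROD * y

# With y = 0 both bounds collapse to zero, so only production = 0 passes.
print(satisfies_indicator(0, 0))     # True
print(satisfies_indicator(10, 0))    # False

# With y = 1 the production must lie between MIN_PROD and MAX_PROD.
print(satisfies_indicator(10, 1))    # False
print(satisfies_indicator(120, 1))   # True
```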


### Logical constraints - an example

For example, consider an investment decision involving a production plant and two warehouses. 

- If the production plant is invested in, then either warehouse 1 or warehouse 2 may be invested in (not both).

- If the production plant is not invested in, then neither of the two warehouses may be invested in.

Let $yPlant$ be 1 if you decide to invest in the production plant, and 0 otherwise. 
Similarly for $yWarehouse1$ and $yWarehouse2$.
Then this example can be modeled as follows:

$$
yWarehouse1 + yWarehouse2 \leq yPlant
$$

If $yPlant$ is 0 then both $yWarehouse1$ and $yWarehouse2$ are set to zero. 

Conversely, if one warehouse variable is set to 1, then $yPlant$ is also set to 1. Finally, this constraint also states that the two warehouse variables cannot both be equal to 1.
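Because only three binary variables are involved, the constraint can be checked by listing every combination. The following sketch uses plain Python enumeration instead of a solver; the variable names are illustrative.

```python
from itertools import product

# Keep every 0/1 combination that satisfies yWarehouse1 + yWarehouse2 <= yPlant.
feasible = [
    (y_plant, y_wh1, y_wh2)
    for y_plant, y_wh1, y_wh2 in product((0, 1), repeat=3)
    if y_wh1 + y_wh2 <= y_plant
]

for y_plant, y_wh1, y_wh2 in feasible:
    print(f"yPlant={y_plant} yWarehouse1={y_wh1} yWarehouse2={y_wh2}")
```

Four of the eight combinations survive: no investment at all, the plant alone, and the plant with exactly one of the two warehouses.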


## IP versus MIP

When all the decision variables in a linear model should take integer values, the model is an Integer Program (or **IP**). 

When some of the decision variables may also take continuous values, the model is a Mixed Integer Program (or **MIP**). 

MIPs are very common in, for example, some supply chain applications where investment decisions may be represented by integers and production quantities are represented by continuous variables.

IPs and MIPs are generally much more difficult to solve than LPs.

The solution complexity increases with the number of possible combinations of the integer variables, and such problems are often referred to as being “combinatorial”.

In the worst case, the solution complexity increases exponentially with the number of integer decision variables.

Many advanced algorithms can solve complex IPs and MIPs in reasonable time.
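To get a feeling for the worst-case growth, the short sketch below counts the 0/1 assignments of a few binary variables by explicit enumeration and compares the count with $2^n$. It only illustrates the size of the search space; it says nothing about how CPLEX actually explores it.

```python
from itertools import product

def count_binary_assignments(n):
    """Count every 0/1 assignment of n binary variables by explicit enumeration."""
    return sum(1 for _ in product((0, 1), repeat=n))

# For small n the enumeration agrees with the closed form 2**n.
for n in (2, 5, 10):
    assert count_binary_assignments(n) == 2 ** n

# For larger n only the closed form is practical.
for n in (10, 20, 30, 40):
    print(f"{n} binary variables -> {2 ** n:,} combinations")
```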


### An integer programming example

In the telephone production problem, the optimal solution found in chapter 2 'Linear programming' had integer values. Under certain circumstances, however, the solution can become non-integer, for example:

- Change the availability of the assembly machine to 401 hours
- Change the painting machine availability to 492 hours
- Change the profit for a desk phone to 12.4
- Change the profit for a cell phone to 20.2

The fractional values for profit are quite realistic. Even though the fractional times for availability are not entirely realistic, these are used here to illustrate how fractional solutions may occur. 

Let's solve the telephone production problem again with this new data. A detailed explanation of the model is found in notebook 2: 'Linear Programming'.

In [None]:
import docplex.mp.model as cplex

with cplex.Model('lp_telephone_production') as lm:
    desk = lm.continuous_var(name='desk')
    cell = lm.continuous_var(name='cell')
    # write constraints
    # constraint #1: desk production is at least 100
    lm.add_constraint(desk >= 100, 'desk_lb')

    # constraint #2: cell production is at least 100
    lm.add_constraint(cell >= 100, 'cell_lb')

    # constraint #3: assembly time limit
    lm.add_constraint( 0.2 * desk + 0.4 * cell <= 401, 'assembly_limit')

    # constraint #4: painting time limit
    lm.add_constraint( 0.5 * desk + 0.4 * cell <= 492, 'painting_limit')
    lm.maximize(12.4 * desk + 20.2 * cell)
    lm.print_information()
    ls = lm.solve(log_output=True)
    if ls is None:
        print('No solution.')
    else:
        print(ls)    

As we can see, the optimal solution contains fractional values for the numbers of telephones, which is not realistic.
To ensure we get integer values in the solution, we can use integer decision variables.

Let's solve a new model, identical except that its two decision variables are declared as _integer_ variables.

In [None]:
import docplex.mp.model as cplex

with cplex.Model('ip_telephone_production') as im:
    desk = im.integer_var(name='desk')
    cell = im.integer_var(name='cell')
    # write constraints
    # constraint #1: desk production is at least 100
    im.add_constraint(desk >= 100, 'desk_lb')

    # constraint #2: cell production is at least 100
    im.add_constraint(cell >= 100, 'cell_lb')

    # constraint #3: assembly time limit
    im.add_constraint( 0.2 * desk + 0.4 * cell <= 401, 'assembly_limit')

    # constraint #4: painting time limit
    im.add_constraint( 0.5 * desk + 0.4 * cell <= 492, 'painting_limit')
    im.maximize(12.4 * desk + 20.2 * cell)

    im.print_information()
    si = im.solve(log_output=True)
    if si is None:
        print('No solution.')
    else:
        print(si)

As expected, the IP model returns integer values as the optimal solution.

<img src = "https://www.tommasoadamo.it/images/lez18/1_39.png" >

This graphic shows the new feasible region, where the dots indicate the feasible solutions, that is, the solutions where the variables take only integer values.  The graphic is not to scale, because it is not possible to show all the integer points for this example. What you should take away from it is that the feasible region is now a collection of points, as opposed to a solid area.  Because, in this example, the integer solution does not lie on an extreme point of the continuous feasible region, LP techniques would not find the integer optimum.  To find the integer optimum, you should use an integer programming technique. 


### Rounding a fractional solution

An idea that often comes up to deal with fractional solutions is to solve an LP and then round the fractional numbers in order to find an integer solution. However, because an optimal LP solution always lies on the boundary of the feasible region, rounding can lead to an infeasible solution, that is, a solution that lies outside the feasible region. In the case of the telephone problem, for example, rounding both production quantities up violates both the assembly and the painting time limits. 

When large quantities of items are produced, for example thousands of phones, rounding may still be a good approach to avoid integer variables.
In general, you should use an integer programming algorithm to solve IPs.  The best known of these is the branch-and-bound algorithm. 
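The effect of rounding can be checked by hand for the modified telephone data. The sketch below assumes that the LP optimum is the point where the assembly and painting limits are both tight (desk = 910/3, cell = 5105/6; these values are derived here and are not printed in the text) and tests the four ways of rounding it. The helper names are hypothetical, and exact fractions are used to avoid floating-point noise at the constraint boundaries.

```python
from fractions import Fraction as F
from itertools import product
from math import ceil, floor

# Assumed LP optimum: intersection of the assembly and painting limits.
desk_lp, cell_lp = F(910, 3), F(5105, 6)

def is_feasible(desk, cell):
    """Check the four constraints of the modified telephone problem exactly."""
    return (desk >= 100 and cell >= 100
            and F(2, 10) * desk + F(4, 10) * cell <= 401    # assembly limit
            and F(5, 10) * desk + F(4, 10) * cell <= 492)   # painting limit

def profit(desk, cell):
    """Objective of the modified problem: 12.4 * desk + 20.2 * cell."""
    return F(124, 10) * desk + F(202, 10) * cell

# Try every combination of rounding each quantity down or up.
for desk, cell in product((floor(desk_lp), ceil(desk_lp)),
                          (floor(cell_lp), ceil(cell_lp))):
    status = "feasible" if is_feasible(desk, cell) else "infeasible"
    print(desk, cell, status, float(profit(desk, cell)))
```

Under these assumptions only the combination that rounds both quantities up is rejected. The remaining three are feasible but differ in profit, so each rounded candidate still has to be checked against the constraints and compared with the others.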


## The branch and bound method

The _branch and bound_ method, implemented in CPLEX Mixed-Integer Optimizer, provides an efficient way to solve IP and MIP problems. This method begins by relaxing the integer requirement and treating the problem as an LP. If all the variables take integer values, the solution is complete. If not, the algorithm begins a tree search.  You’ll now see an example of this tree search. 

Consider this integer programming problem, involving an objective to maximize, three constraints, and three nonnegative integer variables (nonnegativity is the default for DOcplex variables).

$
maximize\ \ \ x + y + 2 z\\
subject\ to: 7x + 2y + 3z \leq 36\\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 5x + 4y + 7z \leq 42\\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 2x + 3y + 5z \leq 28\\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x,y,z \geq 0 \quad \mbox{integer}
$

### Branch and Bound: the root node

The first node of the branch and bound tree is the LP relaxation of the original IP model.  
LP relaxation means that the integer variables have been relaxed to be continuous variables.  
The solution to the LP relaxation of a maximization IP, such as this one, provides an upper bound on the original problem. In this case that bound is eleven and five elevenths.  
The current lower bound is minus infinity. 

<img src = "https://www.tommasoadamo.it/images/lez18/1_43.png" >

In this case, the solution is fractional and the tree search continues in order to try and find an integer solution.
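The value of the root bound can be reproduced with exact arithmetic. The sketch below assumes that the relaxed optimum is $x = 14/11$, $y = 0$, $z = 56/11$ (only the value of $x$ is quoted in the text; $y$ and $z$ are assumptions) and that weights of $0$, $1/11$ and $3/11$ on the three constraints are a suitable combination. It checks that the point is feasible and that the weighted sum of the constraints dominates the objective, which together show that no point of the relaxation can do better.

```python
from fractions import Fraction as F

A = [(7, 2, 3), (5, 4, 7), (2, 3, 5)]   # constraint coefficients
b = (36, 42, 28)                        # right-hand sides
c = (1, 1, 2)                           # objective coefficients

# Assumed relaxed optimum (x matches the text; y and z are assumptions).
relaxed = (F(14, 11), F(0), F(56, 11))
# Assumed nonnegative weights, one per constraint.
weights = (F(0), F(1, 11), F(3, 11))

# The relaxed point must be nonnegative and satisfy every constraint.
relaxed_ok = all(v >= 0 for v in relaxed) and all(
    sum(a * v for a, v in zip(row, relaxed)) <= rhs for row, rhs in zip(A, b))

# The weighted sum of the constraints must cover each objective coefficient.
weights_ok = all(w >= 0 for w in weights) and all(
    sum(A[i][j] * weights[i] for i in range(3)) >= c[j] for j in range(3))

relaxed_value = sum(cj * v for cj, v in zip(c, relaxed))
bound = sum(rhs * w for rhs, w in zip(b, weights))
print(relaxed_ok, weights_ok, relaxed_value, bound)
```

Both values come out as 126/11, that is, eleven and five elevenths.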

### Branch and Bound: branching on a variable

The algorithm next chooses one of the variables to branch on, in this case $x$, and adds two constraints to create two subproblems.  
These two constraints are based on the relaxed value of $x$, namely one and three elevenths.  In one subproblem, $x$ is required to be less than or equal to one, and in the other subproblem, $x$ is required to be greater than or equal to two, in order to eliminate the fractional solution found.  
IP2 gives another fractional solution, but IP3 gives an integer solution.  This integer solution of 10 is the new lower bound to the original maximization problem, because it is the best current solution to the maximization problem.  

<img src = "https://www.tommasoadamo.it/images/lez18/1_44.png" >

The algorithm will terminate when the gap between the upper and lower bounds is sufficiently small, but at this point there is still more of the tree to explore.

### Branch and Bound: iteration

<img src = "https://www.tommasoadamo.it/images/lez18/1_45.png" >

Two new subproblems are now generated from IP2, and these constraints are determined by the fractional value of z in IP2.  In IP4, z must be less than or equal to 5, and in IP3 z must be greater than or equal to 6. IP4 gives another fractional solution, while IP3 is infeasible and can be pruned.  When a node is pruned, the node is not explored further in the tree.

<img src = "https://www.tommasoadamo.it/images/lez18/1_46.png" >

Next, two more subproblems are created from IP4, namely one with y less than or equal to zero in IP6, and one with y greater than or equal to 1 in IP5.  IP6 yields an integer solution of 11, which is an improvement of the previously found lower bound of 10. IP5 gives a fractional solution and can be explored further.

<img src = "https://www.tommasoadamo.it/images/lez18/1_47.png" >

So another two subproblems are created from IP5, namely IP8 with z less than or equal to 4, and IP7 with z greater than or equal to 5.  However, the constraint added for IP4 specifies that z must be less than or equal to 5, so node IP7 immediately yields an integer solution with an objective value of 11, which is the same objective as for IP6.  IP8 yields an integer solution with objective value of 9, which is a worse solution than those previously found and IP8 can therefore be discarded.  

The optimal solution reported is the integer solution with the best objective value that was found first, namely the solution to IP6.
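Because this example is tiny, the optimal objective value can be confirmed independently of any tree search. The sketch below is a brute-force enumeration in plain Python, not a branch and bound implementation; the ranges follow from the tightest single-constraint limit on each variable.

```python
from itertools import product

def is_feasible(x, y, z):
    """Check the three constraints of the example problem."""
    return (7 * x + 2 * y + 3 * z <= 36
            and 5 * x + 4 * y + 7 * z <= 42
            and 2 * x + 3 * y + 5 * z <= 28)

def objective(x, y, z):
    return x + y + 2 * z

# x <= 36 // 7 = 5, y <= 28 // 3 = 9, z <= 28 // 5 = 5 for nonnegative integers.
candidates = [p for p in product(range(6), range(10), range(6)) if is_feasible(*p)]

best = max(objective(*p) for p in candidates)
optimal = [p for p in candidates if objective(*p) == best]
print("best objective:", best)
print("optimal (x, y, z) points:", optimal)
```

The enumeration reports a best objective of 11 and shows that more than one integer point attains it, which is why a solver may return any one of them.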

The progress of the branch and bound algorithm can be monitored by looking at the CPLEX _log_. Adding the keyword argument `log_output=True` to the `Model.solve()` method prints the log to the standard output.
You can see the best bound going down until the gap closes and the final solution of 11 is returned.
By default the CPLEX log is not printed.

In [None]:
import docplex.mp.model as cplex

with cplex.Model('b&b') as bbm:
    x = bbm.integer_var(name='x')
    y = bbm.integer_var(name='y')
    z = bbm.integer_var(name='z')
    bbm.maximize(x + y + 2*z)
    bbm.add_constraint(7*x + 2*y + 3*z <= 36)
    bbm.add_constraint(5*x + 4*y + 7*z <= 42)
    bbm.add_constraint(2*x + 3*y + 5*z <= 28)
    bbm.print_information()
    s = bbm.solve(log_output=True)
    if s is None:
        print('No solution.')
    else:
        print(s)

## Modeling yes/no decisions with binary variables: an example

Binary variables are often used to model yes/no decisions.  
Consider again the telephone production problem. The company is considering replacing the assembly machine with a newer machine that requires less time for cell phones, namely 18 minutes per phone, but more time for desk phones, namely 15 minutes per phone. This machine is available for 430 hours, as opposed to the 400 hours of the existing assembly machine, because it requires less downtime. 

We will design and write a model that uses binary variables to help the company choose between the two machines. 

The steps to formulate the mixed-integer model are:
- Add four new variables (desk1, desk2, cell1, and cell2) to indicate the production on assembly machines 1 and 2, respectively.
- Add two constraints to define the total production of desk and cell to equal the sum of production from the two assembly machines.
- Rewrite the constraint for assembly machine 1 to use the new variables for that machine (desk1 and cell1).
- Add a similar constraint for the production on assembly machine 2.
- Define a Boolean variable, z, to take a value of 1 if assembly machine 1 is chosen, and 0 if assembly machine 2 is chosen.
- Use the z variable to set the production to zero for the machine that is not chosen.


### Implementing the yes/no decision model with DOcplex

First, create a model instance.

In [None]:
import docplex.mp.model as cplex

with cplex.Model('decision_phone') as tm2:
    ...

#### Setup decision variables

We create two sets of (desk, cell) integer variables, one per machine type, plus the total production variables.
Note that the total production variables do not need to be declared as integer variables if the four per-machine production variables are integers.
As the sum of two integers, they will always be integers; the fewer integer variables we have, the more easily CPLEX will solve the problem.

In addition, we define an extra binary variable $z$ to model the choice we are facing: use machine #1 or machine #2.

#### Setup constraints

- The lower bounds of 100 items on production
- The constraint for painting machine limit is identical to the basic telephone model
- Two extra constraints express the total production as the sum of productions on the two assembly machines.
- Each assembly machine type has its own constraint, in which variable $z$ expresses the exclusive choice between the two.

#### Expressing the objective

The objective is identical: maximize total profit, using total productions.

$
maximize:\\
\ \ 12\ desk + 20\ cell\\
subject\ to: \\
\ \   desk \geq 100 \\
\ \   cell \geq 100 \\
\ \   desk_1 + desk_2 = desk\\
\ \   cell_1 + cell_2 = cell\\
\ \   0.2\ desk_1 + 0.4\ cell_1 \leq 400 \ z \\
\ \   0.25\ desk_2 + 0.3\ cell_2 \leq 430 \ (1-z) \\
\ \   0.5\ desk + 0.4\ cell \leq 490 \\
\ \   desk_1, desk_2, cell_1, cell_2 \geq 0 \quad \mbox{integer}\\
\ \   z \in \{0, 1\}
$

#### Solve with the Decision Optimization solve service

In [None]:
import docplex.mp.model as cplex

with cplex.Model('decision_phone') as tm2:
    # variables for total production
    desk = tm2.continuous_var(name='desk')
    cell = tm2.continuous_var(name='cell')

    # two variables per machine type:
    desk1 = tm2.integer_var(name='desk1')
    cell1 = tm2.integer_var(name='cell1')

    desk2 = tm2.integer_var(name='desk2')
    cell2 = tm2.integer_var(name='cell2')

    # yes no variable
    z = tm2.binary_var(name='z')
    
    # total production is sum of type1 + type 2
    tm2.add_constraint(desk == desk1 + desk2, 'total_desk')
    tm2.add_constraint(cell == cell1 + cell2, 'total_cell')

    # constraint #1: desk production is at least 100
    tm2.add_constraint(desk >= 100, 'desk_lb')
    # constraint #2: cell production is at least 100
    tm2.add_constraint(cell >= 100, 'cell_lb')
    
    # assembly time on machine 1 is limited to 400 hours if z is 1, and to 0 otherwise
    tm2.add_constraint(0.2 * desk1 + 0.4 * cell1 <= 400 * z, 'assembly_limit_1')
    # assembly time on machine 2 is limited to 430 hours if z is 0, and to 0 otherwise
    tm2.add_constraint(0.25 * desk2 + 0.3 * cell2 <= 430 * (1-z), 'assembly_limit_2')

    # painting machine limit is identical
    # constraint #4: painting time limit
    tm2.add_constraint( 0.5 * desk + 0.4 * cell <= 490, 'painting_limit')

    tm2.maximize(12 * desk + 20 * cell)
    
    tm2.print_information()
    
    tm2s= tm2.solve(log_output=True)
    if tm2s is not None:
        print(tm2s)
    else:
        print('No solution.')

#### Conclusion

This model demonstrates that the optimal solution is to use machine #2, producing 100 desk phones and 1100 cell phones.
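This conclusion can be cross-checked without a solver, because once the machine is fixed only two integer quantities remain. The sketch below is a plain enumeration, not the DOcplex model. It converts all times to minutes (12 and 24 minutes per phone on machine 1, 15 and 18 on machine 2, 30 and 24 for painting) so that the arithmetic stays in integers. `best_plan` is a hypothetical helper.

```python
PAINT_DESK, PAINT_CELL, PAINT_LIMIT = 30, 24, 490 * 60   # minutes
MIN_PHONES = 100
PROFIT_DESK, PROFIT_CELL = 12, 20

def best_plan(asm_desk, asm_cell, asm_limit):
    """Return (profit, desk, cell) of the best integer plan on one assembly machine."""
    best = None
    for desk in range(MIN_PHONES, PAINT_LIMIT // PAINT_DESK + 1):
        # Profit grows with cell, so take the largest cell count both machines allow.
        cell = min((asm_limit - asm_desk * desk) // asm_cell,
                   (PAINT_LIMIT - PAINT_DESK * desk) // PAINT_CELL)
        if cell < MIN_PHONES:
            break  # the allowed cell count only shrinks as desk increases
        plan = (PROFIT_DESK * desk + PROFIT_CELL * cell, desk, cell)
        if best is None or plan > best:
            best = plan
    return best

machine1 = best_plan(12, 24, 400 * 60)   # z = 1
machine2 = best_plan(15, 18, 430 * 60)   # z = 0
print("machine 1:", machine1)
print("machine 2:", machine2)
```

Under these assumptions machine 2 gives the higher profit, with 100 desk phones and 1100 cell phones, in line with the conclusion above.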

### Using binary variables for logical decisions

What if the company had to choose between 3 possible candidates for the assembly machine, as opposed to two?

The above model can be generalized with three binary variables $z_1$, $z_2$, $z_3$, each of which is equal to 1 only if machine type 1, 2, or 3, respectively, is used. But then we need to express that _exactly_ one of those variables must be equal to 1. How can we achieve this?

The answer is to add  the following constraint to the model:

$$
z_{1} + z_{2} + z_{3} = 1
$$

Thus, if one of the $z$ variables is equal to 1, the other two are equal to zero (remember that binary variables can only take the value 0 or 1).
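A quick enumeration shows which assignments this constraint leaves. The sketch below is plain Python and only illustrates the logic of the constraint; it is not part of the DOcplex model.

```python
from itertools import product

# Keep the 0/1 assignments of (z1, z2, z3) that satisfy z1 + z2 + z3 == 1.
choices = [z for z in product((0, 1), repeat=3) if sum(z) == 1]
print(choices)   # [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
```

Exactly three assignments remain, one per machine type.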


# Summary

Having completed this notebook, you should be able to:

- Describe the differences between:
  - Linear Programming (LP)
  - Integer Programming (IP)
  - Mixed-Integer Programming (MIP)

- Construct a simple MIP model

## References
* [CPLEX Modeling for Python documentation](http://ibmdecisionoptimization.github.io/docplex-doc/)
* [Decision Optimization on Cloud](https://developer.ibm.com/docloud/)
* Need help with DOcplex or to report a bug? Please go [here](https://stackoverflow.com/questions/tagged/docplex).
* Contact us at dofeedback@wwpdl.vnet.ibm.com.