# Introduction to OR-Tools 

In this recitation, we will learn how to solve simple linear and integer programs using OR-Tools.

You can work in pairs or individually. Answer the questions in the blank Markdown cells, and turn this notebook in on Gradescope. If you work with a partner, you and your partner need only turn in one copy.

<font color='blue'>**Solutions in blue.**</font>

In [None]:
# Run this cell to import necessary packages!
import pandas as pd
from ortools.linear_solver import pywraplp as OR

## Part 1: Solving an LP using OR-Tools

We will model our use of OR-Tools after AMPL. AMPL describes linear programs using model files, which have the suffix `.mod`. A model file gives a general model, which need not include actual numbers or “data”. The data for an AMPL model can be supplied in a separate data file, which has the suffix `.dat`. In OR-Tools, we will implement model files as functions that take data in the form of numbers, lists, tables loaded from CSV files, etc.

Let's begin by examining the following model:

In [None]:
def net1(cities, links):
    """An example OR-Tools `model` file."""
    CITIES = list(cities.index)             
    LINKS = list(links.index) 
    
    supply = cities['supply'].to_dict()     # amounts available at cities
    demand = cities['demand'].to_dict()     # amounts required at cities
    
    assert sum(supply[i] for i in CITIES) == sum(demand[j] for j in CITIES)
    
    cost = links['cost'].to_dict()          # shipment costs/1000 packages
    capacity = links['capacity'].to_dict()  # max packages that can be shipped
    
    # define model
    m = OR.Solver('net1', OR.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
    
    # decision variables
    x = {}  # packages to be shipped
    for i,j in LINKS:
        x[i,j] = m.NumVar(0, capacity[i,j], ('(%s, %s)' % (i,j)))
            
    # objective function : Total_Cost
    m.Minimize(sum(cost[i,j] * x[i,j] for i,j in LINKS))
    
    # subject to: Balance
    for k in CITIES:
        m.Add(supply[k] + sum(x[i,j] for i,j in LINKS if j == k) ==
              demand[k] + sum(x[i,j] for i,j in LINKS if i == k))
        
    
    return m, x

**Q1:** Try to understand what this model is doing. Write the analogous linear program in standard mathematical notation. Clearly specify what the decision variables, the objective function, and the constraints are.

**A:** <font color='blue'> Let $x_{ij}$ be the number of units shipped from $i$ to $j$, for each $(i, j) \in E$.
Then, our linear program is: 
    
$$\begin{align*}
\min \quad & \sum_{(i,j) \in E} c_{ij}x_{ij}\\
\text{s.t.} \quad & s_k + \sum_{(i,k) \in E} x_{ik} = d_k + \sum_{(k,j) \in E} x_{kj} \quad \forall k \in N \\
\quad & 0 \leq x_{ij} \leq u_{ij} \quad \forall (i,j) \in E
\end{align*}$$

</font> 

Notice that the model does not contain the actual numbers; the data files supply this concrete information. Let's load the data and view it now.

In [None]:
cities = pd.read_csv('data/net1_cities.csv', index_col=0)
display(cities)
links = pd.read_csv('data/net1_links.csv')
links['link'] = list(zip(links['i'],links['j']))
links = links.set_index('link')
display(links)
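The three lines that build `links` turn the separate `i` and `j` columns into a single index of `(i, j)` tuples, which is why the model can write `cost[i,j]`. The sketch below mimics that step with plain Python lists instead of pandas. The city pairs belong to this network, but the cost numbers are made up for illustration and are not the contents of `data/net1_links.csv`.

```python
# Stand-ins for the 'i', 'j', and 'cost' columns of the links table.
# The cost values are hypothetical, NOT the real data file contents.
col_i = ['PITT', 'PITT', 'NE']
col_j = ['NE', 'SE', 'BOS']
col_cost = [1.0, 2.0, 3.0]

# zip pairs the columns row by row, giving one (i, j) tuple per link.
link_keys = list(zip(col_i, col_j))

# A dict keyed by those tuples plays the role of links['cost'].to_dict().
cost = dict(zip(link_keys, col_cost))

# cost[i, j] is shorthand for cost[(i, j)], so both lookups agree.
for i, j in link_keys:
    print((i, j), cost[i, j])
```

Keying every parameter by the same tuples lets the model loop over `LINKS` once and look up cost and capacity with identical indices.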

**Q2:** Convert this input into a graph and draw it below.

**A:** <font color='blue'> <img src="images-key/net1_graph.png" width="600" height="450" /> </font> 

**Q3:** Try to figure out the optimal solution by hand. Do not use the simplex method, but try to guess
a solution and try to reason that it is optimal. Draw the solution (flow values on each edge) on
your graph above. (This exercise is meant to build up your intuition, which is important. Do not
worry if your solution here does not turn out to be the “correct” one.)

**A:** <font color='blue'> Will vary. </font> 

If we wish to solve this input, we can load the data into the model by running the following line:

In [None]:
m, x = net1(cities, links)

We can then define the following `solve` function, which solves the model, prints the objective value, and displays the decision variables.

In [None]:
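def solve(m):
    """Solve model m, then print the objective value and each variable."""
    status = m.Solve()
    if status != OR.Solver.OPTIMAL:
        print('No optimal solution found (status %s).' % status)
        return
    print('Solution:')
    print('Objective value =', m.Objective().Value())
    for var in m.variables():
        print(var.name(), ':',  var.solution_value())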

In [None]:
solve(m)

**Q4:** Run the cell above to solve the model. Write down the optimal solution and the corresponding objective value in the space below. Does it match the solution you guessed in **Q3**?

**A:**  <font color='blue'> Optimal objective value is 1819.
   
| variable     | value |
|-----------|-------|
| (PITT, NE) | 250.0 |
| (PITT, SE) | 200.0 |
| (NE, BOS)  | 90.0 |
| (NE, EWR)  | 100.0 |
| (NE, BWI)  | 60.0 |
| (SE, EWR)  | 20.0 |
| (SE, BWI)  | 60.0 |
| (SE, ATL)  | 70.0 |
| (SE, MCO)  | 50.0 |

</font> 

## Part 2: Writing your own OR-Tools model

In this section, we will review the structure of the OR-Tools model and data files. We will do this by writing
an OR-Tools model for the following simple minimum-cost network flow problem. The number next to
each node is the supply value of the node. The first number next to each edge is the edge cost while the
second number is the edge capacity.

<img src="images-lab/graph.png" width="500" height="380" />

**Q5:** First, write a linear program for this problem in standard mathematical notation. Clearly specify
what the decision variables, the objective function, and the constraints are.
In particular, for each of the four nodes, write the flow-conservation constraint explicitly.

**A:** <font color='blue'> Let $x_{ij}$ be the flow from $i$ to $j$, for each $(i,j) \in E$.
Then, our linear program is: 
    
$$\begin{align*}
\min \quad & \sum_{(i,j) \in E} c_{ij}x_{ij}\\
\text{s.t.} \quad & x_{12} + x_{13} = 10 \\
\quad & x_{24} + x_{23} - (x_{12} + x_{32})= 0  \\
\quad & x_{34} + x_{32} - (x_{13} + x_{23})= 0 \\
\quad & -(x_{24} + x_{34}) = -10 \\
\quad & 0 \leq x_{ij} \leq u_{ij} \quad \forall (i,j) \in E \\
\end{align*}$$

</font> 
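One way to double-check the four flow-conservation constraints is to generate their coefficients mechanically: for node $k$, edge $(i,j)$ gets $+1$ if it leaves $k$, $-1$ if it enters $k$, and $0$ otherwise. The sketch below is plain Python and only illustrates the pattern; the helper name `balance_row` is our own, and the edge list is the six edges of the graph.

```python
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)]
supply = {1: 10, 2: 0, 3: 0, 4: -10}

def balance_row(k):
    """Coefficients of (flow out of k) - (flow into k), one per edge."""
    return [(1 if i == k else 0) - (1 if j == k else 0) for i, j in EDGES]

rows = {k: balance_row(k) for k in NODES}
for k in NODES:
    # Print only the edges that actually touch node k.
    terms = ' '.join('%+dx_%d%d' % (c, i, j)
                     for c, (i, j) in zip(rows[k], EDGES) if c != 0)
    print('node %d: %s = %d' % (k, terms, supply[k]))

# Every edge leaves one node and enters another, so each column sums to 0.
column_sums = [sum(rows[k][e] for k in NODES) for e in range(len(EDGES))]
```

Because each column sums to zero, adding the four equations gives zero on the left-hand side, so the right-hand sides must sum to zero as well.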

The remainder of this section will guide you step by step through modeling this problem in OR-Tools. Please write your model in the blank function `rec1` before **Q10**. If
you feel comfortable with OR-Tools, feel free to work on this on your own and skip ahead to **Q10**.

**Note that although this model will be similar to `net1`, we advise you to start from scratch, typing everything by hand.**

Next, based on the above linear program, we will write the model file, which describes a general
minimum-cost network flow problem. (So, this model file should not specifically mention the
four nodes and six edges in the above graph.)

The following are the basic components of an OR-Tools model. (It is best, for now, that you write
the components of your OR-Tools model file in this order to avoid potential errors.)

    a) Set declarations
    b) Parameter declarations
    c) Input checks
    d) Define model
    e) Variable declarations
    f) Objective function
    g) Constraints
    
We will elaborate on each of the components below:

**Set declarations:** Here, you tell OR-Tools what sets you will be using. In our problem, we deal with the set of nodes and the set of edges in the graph. By convention, you should name sets in UPPERCASE. The way you define a set will depend on the type of data coming in to your model. Here are the two most common cases:

Your data is a pandas table (`pd.DataFrame`) indexed by the items in the set:

    NODES = list(nodes.index)             
    EDGES = list(edges.index) 

Your data is an array/list (`List`) containing the items in the set:

    NODES = list(nodes)             
    EDGES = list(edges) 

**Parameter declarations:** Parameters are the constants that appear in your linear program. In this case, our parameters are the supply values for each node, and the edge costs and edge capacities for each edge. Here are two possible ways you may declare a parameter `supply` for each element of the set `NODES`:

Your data is a column of a pandas table (`pd.DataFrame`) indexed by the items in the set:

    supply = nodes['supply'].to_dict()           

Your data is a dictionary (`Dict`) where the keys are items in the set:

    supply = dict(supply) 
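For the second case in each pair (plain Python objects rather than pandas tables), a minimal sketch might look like the following. The node list and supply values are the ones used for this recitation's graph; the name `supply_data` and the final completeness check are our own additions, and copying with `list(...)` and `dict(...)` is a defensive choice so that the model does not alias the caller's objects.

```python
# Data defined directly as Python objects.
nodes = [1, 2, 3, 4]
supply_data = {1: 10, 2: 0, 3: 0, 4: -10}

# Set declaration: copy the items into a list named in UPPERCASE.
NODES = list(nodes)

# Parameter declaration: copy into a dict keyed by the items of the set.
supply = dict(supply_data)

# Every element of the set should have a parameter value.
missing = [i for i in NODES if i not in supply]
print('nodes without a supply value:', missing)
```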

**Q6:** In the space below (as well as your model), declare parameters `cost` and `capacity` for each element of the set `EDGES`.

**A:** <font color='blue'>
    
    cost = edges['cost'].to_dict()
    capacity = edges['capacity'].to_dict()  

</font> 

**Input checks:** Sometimes, we want our input (which we’ll specify in the data file) to satisfy certain model assumptions. For example, in our problem, we want the edge capacities to be nonnegative numbers:

    for i,j in EDGES:
        assert capacity[i,j] >= 0

We also want to make sure that total supply equals total demand. Because a demand is recorded as a negative supply value, this means the supply values must sum to zero.
In standard mathematical notation, if $s_i$ is the supply at node $i$, then we want $\sum_{i \in \text{NODES}} s_i = 0$. The OR-Tools notation is similar:

    assert sum(supply[i] for i in NODES) == 0
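To see how these checks behave, here is a small self-contained sketch that bundles them into a helper and runs it on balanced and unbalanced inputs. The function name `check_inputs` and the capacity numbers are hypothetical; only the two assertions come from the text above.

```python
def check_inputs(NODES, EDGES, supply, capacity):
    """Raise AssertionError if the data violates the model assumptions."""
    for i, j in EDGES:
        assert capacity[i, j] >= 0, 'negative capacity on edge (%s, %s)' % (i, j)
    assert sum(supply[i] for i in NODES) == 0, 'supplies do not sum to zero'

NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 4)]
capacity = {(1, 2): 5, (2, 4): 5}   # hypothetical capacities

# Balanced supplies pass silently.
check_inputs(NODES, EDGES, {1: 10, 2: 0, 3: 0, 4: -10}, capacity)

# Unbalanced supplies are rejected before any model is built.
try:
    check_inputs(NODES, EDGES, {1: 10, 2: 0, 3: 0, 4: -7}, capacity)
    rejected = False
except AssertionError as err:
    rejected = True
    print('input rejected:', err)
```

Keep in mind that Python skips `assert` statements when run with the `-O` flag, so these checks are a development aid rather than a guarantee.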

**Define model:** Before defining the model, we need to create a blank OR-Tools model:

    m = OR.Solver(<Name>, <Solver>)
    
By convention, you should name your model the same as the function name. For the solver, you have the following options:

- `OR.Solver.GLOP_LINEAR_PROGRAMMING`: An open-source LP solver (DO NOT USE FOR IPs)
- `OR.Solver.CBC_MIXED_INTEGER_PROGRAMMING`: An open-source MIP solver
- `OR.Solver.GUROBI_MIXED_INTEGER_PROGRAMMING`: A leading commercial MIP solver (Must have Gurobi installed)

In this example, you should define your model as:

    m = OR.Solver('rec1', OR.Solver.CBC_MIXED_INTEGER_PROGRAMMING)

**Variable declarations:** We need a decision variable for the flow on each edge. Suppose that we call our decision variables $x$. Variables are declared using one of:

    m.NumVar(<lb>, <ub>, <name>)
    m.IntVar(<lb>, <ub>, <name>)
    
It is important that the names for each variable are distinct. In our example, we want a variable for each edge that is between 0 and the capacity on that edge:

    x = {}
    for i,j in EDGES:
        x[i,j] = m.NumVar(0, capacity[i,j], ('(%s, %s)' % (i,j)))
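The name passed as the third argument is just a formatted string. The following plain-Python sketch (no solver involved) shows what the `'(%s, %s)' % (i,j)` pattern produces for the six edges of this graph and checks that the resulting names are distinct. The dictionary `names` is introduced here only for illustration.

```python
EDGES = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)]

# Same formatting expression as in the variable declaration above.
names = {(i, j): '(%s, %s)' % (i, j) for i, j in EDGES}

for edge in EDGES:
    print(edge, '->', repr(names[edge]))

# One distinct name per edge, so no two variables share a name.
assert len(set(names.values())) == len(EDGES)
```

Since the name is built from the edge itself, the names stay distinct as long as the edges in `EDGES` are.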

**Objective function:** The general form of the objective function is as follows.

    m.Minimize(<function expression>)
    m.Maximize(<function expression>)

where `<function expression>` will involve some (or all) of your decision variables.

We will review the OR-Tools summation notation one more time. In your mathematical formulation of the LP, you should have the following objective:

$$\min c_{12}x_{12} + c_{13}x_{13} + c_{23}x_{23} + c_{32}x_{32} + c_{24}x_{24} + c_{34}x_{34}$$

or more succinctly

$$\min \sum_{(i,j) \in E}c_{ij}x_{ij}$$

where $c_{ij}$ denotes the cost on the edge $(i, j)$. The OR-Tools notation is similar:

    sum(cost[i,j] * x[i,j] for i,j in EDGES) 
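To convince yourself that the generator expression matches the written-out sum, you can evaluate both with ordinary numbers standing in for the decision variables. In the sketch below, the costs and flow values are made up for illustration; they are not the data from the graph.

```python
EDGES = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)]

# Hypothetical costs and flows, chosen only to exercise the notation.
cost = {(1, 2): 4, (1, 3): 1, (2, 3): 2, (3, 2): 2, (2, 4): 3, (3, 4): 5}
x = {(1, 2): 6, (1, 3): 4, (2, 3): 0, (3, 2): 0, (2, 4): 6, (3, 4): 4}

# Written-out form: one term per edge.
long_form = (cost[1, 2] * x[1, 2] + cost[1, 3] * x[1, 3] + cost[2, 3] * x[2, 3]
             + cost[3, 2] * x[3, 2] + cost[2, 4] * x[2, 4] + cost[3, 4] * x[3, 4])

# Succinct form: the same sum as a generator expression over EDGES.
short_form = sum(cost[i, j] * x[i, j] for i, j in EDGES)

print(long_form, short_form)
```

In the model, `x[i,j]` holds solver variables rather than numbers, so the same expression builds a linear expression instead of a single value.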

**Q7:** Based on this, in the space below, write the complete line to specify your objective function.

**A:** <font color='blue'>
    
    m.Minimize(sum(cost[i,j] * x[i,j] for i,j in EDGES)) 

</font> 

**Constraints:** While you can only have one objective function, you can have a lot of constraints.
The general form of a constraint is as follows.

    m.Add(<function expression>)

Sometimes, though, you might want to write several very similar constraints, for example the “same” constraint for each element in a set. In this case, your constraints will take the form:

    for i in SET:
        m.Add(<function expression, in terms of i>)

**Q8:** In the space provided below, write the flow conservation constraints; recall that there is one
constraint for each element of the set `NODES`.

**A:** <font color='blue'>
    
    for k in NODES:
        m.Add(sum(x[i,j] for i,j in EDGES if i == k) - 
              sum(x[i,j] for i,j in EDGES if j == k) == supply[k])

</font> 
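A quick way to test your understanding of these constraints is to evaluate them on a candidate flow. The sketch below uses plain numbers in place of solver variables; the helper name `violated_nodes` and both candidate flows are hypothetical, and edge capacities are ignored.

```python
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)]
supply = {1: 10, 2: 0, 3: 0, 4: -10}

def violated_nodes(flow):
    """Return the nodes where (flow out) - (flow in) differs from supply."""
    bad = []
    for k in NODES:
        out_flow = sum(flow[i, j] for i, j in EDGES if i == k)
        in_flow = sum(flow[i, j] for i, j in EDGES if j == k)
        if out_flow - in_flow != supply[k]:
            bad.append(k)
    return bad

# Hypothetical flow: all 10 units travel 1 -> 2 -> 4.
balanced = {(1, 2): 10, (1, 3): 0, (2, 3): 0, (3, 2): 0, (2, 4): 10, (3, 4): 0}

# Hypothetical flow that loses 3 units at node 2.
leaky = dict(balanced)
leaky[2, 4] = 7

print(violated_nodes(balanced))
print(violated_nodes(leaky))
```

The two sums inside the loop have the same shape as the sums inside `m.Add(...)`; the only difference is that the model states the equality as a constraint instead of testing it.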

Now, define the data corresponding to the above graph. You will specify the nodes, edges, supply values, edge costs, and edge capacities. There are two main approaches: define with a CSV file or define the objects directly. Both approaches have been implemented for the nodes and supply values.

In [None]:
# Approach 1: Define with a CSV file
nodes = pd.read_csv('data/rec1_nodes.csv', index_col=0)
display(nodes)

In [None]:
# Approach 2: Define objects directly
nodes = [1,2,3,4]
supply = {1:10, 2:0, 3:0, 4:-10}

**Q9:** Choose an approach and define the remaining data. Be sure to define your model appropriately for the approach you choose.

In [None]:
# TODO: Define the remaining data.

### BEGIN SOLUTION
nodes = pd.read_csv('data/rec1_nodes.csv', index_col=0)
display(nodes)

edges = pd.read_csv('data/rec1_edges.csv')
edges['edges'] = list(zip(edges['i'],edges['j']))
edges = edges.set_index('edges')
display(edges)
### END SOLUTION

In [None]:
# HINT: If you use Approach 2, you will have more data to pass to this model
def rec1(nodes, edges):
    # TODO: Implement your model here.
    
    ### BEGIN SOLUTION
    NODES = list(nodes.index)             
    EDGES = list(edges.index) 
    
    supply = nodes['supply'].to_dict()
    cost = edges['cost'].to_dict()
    capacity = edges['capacity'].to_dict() 
   
    for i,j in EDGES:
        assert capacity[i,j] >= 0
    assert sum(supply[i] for i in NODES) == 0
    
    # define model
    m = OR.Solver('rec1', OR.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
    
    # define variables
    x = {}  # flow on edge (i,j)
    for i,j in EDGES:
        x[i,j] = m.NumVar(0, capacity[i,j], ('(%s, %s)' % (i,j)))
        
    # Minimize: total cost
    m.Minimize(sum(cost[i,j] * x[i,j] for i,j in EDGES)) 
    
    # subject to: flow conservation
    for k in NODES:
        m.Add(sum(x[i,j] for i,j in EDGES if i == k) - 
              sum(x[i,j] for i,j in EDGES if j == k) == supply[k])
    ### END SOLUTION

    return m,x

In [None]:
# HINT: If you use Approach 2, you will have more data to pass to this model
m,x = rec1(nodes, edges)
solve(m)

**Q10:** Having finished your OR-Tools model (model and data files), use the cell above to solve it. Write down your optimal objective value and the corresponding optimal solution.

**A:** <font color='blue'> Optimal objective value is 41.
   
| variable     | value |
|-----------|-------|
| (1, 2) | 2.0 |
| (1, 3) | 8.0 |
| (3, 2)  | 0.0 |
| (2, 3)  | 0.0 |
| (2, 4)  | 2.0 |
| (3, 4)  | 8.0 |

</font> 