CS524: Introduction to Optimization Lecture 18
======================================

## Michael Ferris<br> Computer Sciences Department <br> University of Wisconsin-Madison

## October 16, 2023
--------------

### Set Covering

- Suppose we are given a base set S = {1,2,...m} (customers)
- We're also given a set of subsets (routes) of S:
    - base set: S = {1,2,3,4,5,6}
    - set of subsets of S:  C = {{1,2},{1,3,5},{1,2,4,5},{4,5},{3,6}}
- A set of subsets C **"covers"** S if every element of S is in at least one of the sets in C. In this problem, it means that all customers are "covered": everyone gets a delivery
- The set $C_{1}$ = {{1,2},{1,2,4,5},{3,6}} is a cover
- **Possible goal**: find a cover with minimum cost
- If we choose subset $i$ in the cover:  $x_{i}$ = 1 (otherwise $x_{i}$ = 0)

Each route has a cost (for example, the distance traveled from one customer to the next; a route may visit a customer more than once).
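The definition of a cover can be checked mechanically. The following is a minimal sketch in plain Python using the base set and subsets from the example; the helper name `is_cover` is introduced here for illustration and is not part of the lecture's GAMS model.

```python
# Base set and candidate subsets copied from the example above.
S = {1, 2, 3, 4, 5, 6}
C = [{1, 2}, {1, 3, 5}, {1, 2, 4, 5}, {4, 5}, {3, 6}]

def is_cover(chosen, base=S, subsets=C):
    """Return True if the chosen subsets (1-based positions in C) cover the base set."""
    covered = set()
    for i in chosen:
        covered |= subsets[i - 1]
    return base <= covered  # every element of S lies in some chosen subset

print(is_cover([1, 3, 5]))  # True: {1,2}, {1,2,4,5}, {3,6} together contain 1..6
print(is_cover([1, 2, 4]))  # False: customer 6 is not on any of these routes
```

Note that subsets 3 and 5 alone already cover S, so a cover does not have to be minimal.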


### Formulation:
Minimize: $x_{1} + x_{2} + x_{3}+ x_{4}+ x_{5}$  
-  Number of $x_{i}$ = Number of different routes

Subject to
\begin{equation}
\begin{array}{ccccccccccl}
x_{1} & + & x_{2} & + & x_{3} &   &       &   &       & \ge 1 & \text{(customer 1)} \\
x_{1} &   &       & + & x_{3} &   &       &   &       & \ge 1 & \text{(customer 2)} \\
      &   & x_{2} &   &       &   &       & + & x_{5} & \ge 1 & \text{(customer 3)} \\
      &   &       &   & x_{3} & + & x_{4} &   &       & \ge 1 & \text{(customer 4)} \\
      &   & x_{2} & + & x_{3} & + & x_{4} &   &       & \ge 1 & \text{(customer 5)} \\
      &   &       &   &       &   &       &   & x_{5} & \ge 1 & \text{(customer 6)}
\end{array}
\end{equation}
$x_{i} \in \big\{ 0,1\big\} \ \forall i$
- Each column corresponds to a route (one of the subsets in C)
- Each row (constraint) corresponds to an element (customer) of the base set. The "$\ge 1 $" means that each customer must be visited at least once (is covered)
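To make the row/column structure concrete, here is a small sketch that builds the 0-1 matrix for this instance (one row per customer, one column per route) and finds a smallest cover by brute-force enumeration. This is only an illustration in plain Python, not the method the MIP solver uses; the function name `min_cover` is an assumption introduced here.

```python
from itertools import combinations

S = [1, 2, 3, 4, 5, 6]
C = [{1, 2}, {1, 3, 5}, {1, 2, 4, 5}, {4, 5}, {3, 6}]

# A[i][j] = 1 if customer S[i] is on route j+1, else 0.
A = [[1 if s in route else 0 for route in C] for s in S]

def min_cover(A):
    """Try column subsets of increasing size; return the first one that covers every row."""
    n_rows, n_cols = len(A), len(A[0])
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            if all(any(A[i][j] for j in cols) for i in range(n_rows)):
                return [j + 1 for j in cols]  # report 1-based route numbers
    return None

for row in A:
    print(row)
print(min_cover(A))  # [3, 5]
```

Enumeration is exponential in the number of columns, so it is only practical for tiny instances like this one.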



In [1]:
%load_ext gams.magic
m = gams.exchange_container

In [2]:
s = m.addSet('s',description='base set',records=['a','b','c','d','e','f'])
j = m.addSet('j',description='index of subsets',records=[i+1 for i in range(5)])
cover = m.addSet('cover',[s,j],description="item i is covered by set j",records=[
    ('a','1'), ('b','1'),
    ('a','2'), ('c','2'), ('e','2'),
    ('a','3'), ('b','3'), ('d','3'), ('e','3'),
    ('d','4'), ('e','4'),
    ('c','5'), ('f','5')])

In [3]:
%%gams
alias (s,i);

binary variables x(j);
variable coverCost;

equation coverConstraint(i), objective;

objective..
        coverCost =e= sum(j, x(j));

coverConstraint(i)..
        sum(j$cover(i,j), x(j)) =g= 1;

model setcover /all/;

* any optimum within <1 of the true optimum must BE the true optimum!
setcover.optca = 0.999;
solve setcover using mip minimizing coverCost;

Unnamed: 0,Solver Status,Model Status,Objective,#equ,#var,Model Type,Solver,Solver Time
0,Normal (1),Optimal Global (1),2.0,7,6,MIP,CPLEX,0.001
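
The `optca` comment in the cell above rests on the objective being integer-valued. A short justification, assuming `optca` acts as an absolute tolerance on the gap between the incumbent value $z$ and the solver's best lower bound $\underline{z}$ (the symbols $z$, $\underline{z}$ and $z^{*}$ are introduced here for illustration):

\begin{equation}
\underline{z} \le z^{*} \le z
\quad\text{and}\quad
z - \underline{z} < 1
\;\Longrightarrow\;
0 \le z - z^{*} < 1 .
\end{equation}

Both $z$ and the optimal value $z^{*}$ are sums of binary variables, hence integers, so $z - z^{*} = 0$ and the incumbent is already optimal.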


# More Set Covering: Kilroy County
In this example, the county must determine where to build fire stations. We want to build fire stations in some of the cities and ensure that at least one station is within 15 minutes' driving time of each city. Assume that there are 6 cities in Kilroy County.

We would like to formulate an integer program whose solution gives the minimum number of fire stations and their locations.

- Driving times (in minutes) from city $i$ to city $j$

|city|1|2|3|4|5|6|
|----|-|-|-|-|-|-|
|**1**|0|10|20|30|30|20|
|**2**|10|0|25|35|20|10|
|**3**|20|25|0|15|30|20|
|**4**|30|35|15|0|15|25|
|**5**|30|20|30|15|0|14|
|**6**|20|10|20|25|14|0|

- Variables: $x_j=1$ if we build a fire station in city $j$, and $x_j=0$ otherwise

For example, if $x_1=1$ (we build a fire station in city $1$), the station also covers city $2$, since the driving time from city $1$ to city $2$ is 10 minutes, which is less than 15.


In this example, we would like to select a subset of cities so that every city is within range of a station, which is a standard set covering problem.
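
As a sanity check on the modeling step, the sketch below derives the coverage sets from the travel-time table and searches for the fewest stations by enumeration. It is plain Python written for illustration (the names `covers` and `fewest_stations` are assumptions, not part of the GAMS model), and it assumes "within 15 minutes" means a travel time of at most 15.

```python
from itertools import combinations

# Travel times in minutes, copied from the table above (row/column k is city k+1).
travel_time = [
    [0, 10, 20, 30, 30, 20],
    [10, 0, 25, 35, 20, 10],
    [20, 25, 0, 15, 30, 20],
    [30, 35, 15, 0, 15, 25],
    [30, 20, 30, 15, 0, 14],
    [20, 10, 20, 25, 14, 0],
]
LIMIT = 15
n = len(travel_time)

# covers[j] = cities reachable within LIMIT minutes from a station in city j+1.
covers = [{i + 1 for i in range(n) if travel_time[i][j] <= LIMIT} for j in range(n)]

def fewest_stations(covers, n):
    """Return the first smallest set of station sites whose coverage includes all cities."""
    all_cities = set(range(1, n + 1))
    for size in range(1, n + 1):
        for sites in combinations(range(n), size):
            if set().union(*(covers[j] for j in sites)) == all_cities:
                return [j + 1 for j in sites]
    return None

print(covers[0])                   # {1, 2}
print(fewest_stations(covers, n))  # [2, 4]
```

Stations in cities 2 and 4 reach cities {1, 2, 6} and {3, 4, 5} respectively, so two stations suffice.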

In [4]:
import numpy as np
import pandas as pd

cities = [i+1 for i in range(6)]
s = m.addSet('s',description='city',records=cities)
j = m.addSet('j',description='city',records=cities)
travelTime = pd.DataFrame(data=np.array([
    [ 0,      10,     20,     30,    30,    20],
    [10,       0,     25,     35,    20,    10],
    [20,      25,      0,     15,    30,    20],
    [30,      35,     15,      0,    15,    25],
    [30,      20,     30,     15,     0,    14],
    [20,      10,     20,     25,    14,     0]]),index=cities, columns=cities)
inRange = gams.from2dim(travelTime<=15)
cover = m.addSet('cover',[s,j],description="city s is within 15 minutes of city j",records=inRange[inRange[0]])
gams.gams('display cover; solve setcover using mip minimizing coverCost;')

Unnamed: 0,Solver Status,Model Status,Objective,#equ,#var,Model Type,Solver,Solver Time
0,Normal (1),Optimal Global (1),2.0,7,7,MIP,CPLEX,0.001


## Vehicle Routing -- Determine how trucks meet consumers' demand

<br />**Goal: Find routes for the k trucks that serve all customers at minimum cost**
- You have a fleet of k trucks, each with a capacity Q.
- You have a set of nodes $N = \{0,1,2,\dots,n\}$.
- Node 0 is a special "depot" node; the other nodes are customers
- Each customer has a demand $b_{i}$, $i \in N \setminus \{0\}$
- How do we route the trucks to meet customer demand at minimum cost? (We can't ship more to a customer than their demand $b_{i}$)
    - Every customer must be visited by a truck
    - The sum of the demands on the route visiting the customer must be $\le Q $ (each truck can ship at most quantity Q)
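
The two requirements above can be expressed as a simple feasibility check on a proposed plan. The sketch below uses hypothetical demands and capacity chosen only for illustration, and it assumes that "a fleet of k trucks" means at most k routes; neither the data nor the function names come from the lecture.

```python
# Hypothetical data for illustration only: 5 customers, depot (node 0) omitted.
Q = 10                                   # truck capacity
demand = {1: 4, 2: 3, 3: 5, 4: 2, 5: 6}  # b_i for each customer i

def route_load(route):
    """Total demand served on a route, given as a list of customer nodes."""
    return sum(demand[i] for i in route)

def plan_is_feasible(routes, k):
    """Check: at most k routes, each within capacity Q, every customer visited."""
    if len(routes) > k:
        return False
    if any(route_load(r) > Q for r in routes):
        return False
    visited = set().union(*routes)
    return visited == set(demand)

print(plan_is_feasible([[1, 5], [2, 3, 4]], k=2))  # True: loads are 10 and 10
print(plan_is_feasible([[1, 2, 3], [4, 5]], k=2))  # False: first route carries 12 > Q
```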


--------------

**The picture shows the distribution of customers:**
<img src="https://raw.githubusercontent.com/twcmchang/Intro-to-Optimization/master/scribe/graph1.png" width="200" height="200" />

### Additional Possible Constraints

- Each customer may have a "time window" [ $t^{-}$,$t^{+}$ ] that the routes must obey.
- Each route may have a "fixed cost", or a cost that is a nonlinear function of the distance traveled (e.g. a charge for serving airports).

### A New Formulation Idea - Enumeration
- Set up a new variable x for every possible route $r \in R$
\begin{equation}
  x_{r}=\begin{cases}
    1 & \text{if we travel on route } r.\\
    0 & \text{otherwise}.
  \end{cases}
\end{equation}

- **Enumerate** all possible routes 1,2,3,... (each of which satisfies certain criteria) and choose the best ones from among these routes

<img src="https://raw.githubusercontent.com/twcmchang/Intro-to-Optimization/master/scribe/graph2.png" width="600" height="600" />

- **Constraints**: 
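    - each customer gets a delivery: for each customer, the sum of $x_{r}$ over the routes $r$ that visit that customer is $\ge 1 $
    - with $A$ the 0-1 data matrix ($A_{ir} = 1$ if route $r$ visits customer $i$): $(Ax)_{i} \ge 1$ for every customer $i$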
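
The enumeration idea can be sketched in a few lines. The example below uses hypothetical demands and capacity (not from the lecture), treats a candidate route simply as a set of customers whose total demand fits in one truck, and ignores visiting order and route cost. It then builds the 0-1 matrix $A$ used in the covering constraints.

```python
from itertools import combinations

# Hypothetical data for illustration only.
Q = 10
demand = {1: 4, 2: 3, 3: 5, 4: 2, 5: 6}
customers = sorted(demand)

# Enumerate candidate routes: customer subsets whose total demand fits in one truck.
routes = [
    set(subset)
    for size in range(1, len(customers) + 1)
    for subset in combinations(customers, size)
    if sum(demand[i] for i in subset) <= Q
]

# A[i][r] = 1 if candidate route r visits customer customers[i].
A = [[1 if i in route else 0 for route in routes] for i in customers]

def covers_all(x):
    """Check the covering constraints (Ax)_i >= 1 for a 0-1 selection vector x."""
    return all(sum(a * xr for a, xr in zip(row, x)) >= 1 for row in A)

print(len(routes))  # 16 candidate routes for this data
x = [1 if route in ({1, 5}, {2, 3, 4}) else 0 for route in routes]
print(covers_all(x))  # True
```

The number of candidate routes grows quickly with the number of customers, which is why the criteria used to filter routes matter in practice.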

In [5]:
s = m.addSet('s',description='customers',records=[i+1 for i in range(5)])
j = m.addSet('j',description='routes',records=['r'+str(i+1) for i in range(6)])
cover = m.addSet('cover',[s,j],description="customer covered by route",records=[
    ('1','r1'), ('1','r4'),
    ('2','r2'), ('2','r5'),
    ('3','r2'), ('3','r3'),
    ('4','r2'),
    ('5','r1'), ('5','r3'), ('5','r6')])
gams.gams('display cover; solve setcover using mip minimizing coverCost;')

Unnamed: 0,Solver Status,Model Status,Objective,#equ,#var,Model Type,Solver,Solver Time
0,Normal (1),Optimal Global (1),2.0,6,7,MIP,CPLEX,0.001


# Set Covering/Packing/Partitioning 
If $A$ is a matrix consisting only of $0$'s and $1$'s, then

### Set covering: $\min \hspace{0.2cm} c^Tx: Ax\ge1, x \in\{0,1\}^n$

- Assume the data $A$ and the cost vector $c$ (the cost of each column) are specified.
- We would like to choose a subset of the columns of $A$ so that every row (a city in the previous example) is covered at least once, that is, $Ax\ge1$. Here we set $x_i=1$ to *activate* column $i$ of the matrix $A$.
- We would like to find the subset of columns that meets this requirement at minimum cost.

### Set packing: $\max \hspace{0.2cm} c^Tx: Ax\le1, x \in\{0,1\}^n$

- Assume the rows of $A$ represent a base set of items and each column is a subset of them. We select columns so that no item is used more than once, that is, $Ax\le1$.
- We would like to maximize the total value $c^Tx$ of the columns we pack.

### Set partitioning: $\min \hspace{0.2cm} c^Tx: Ax=1, x \in\{0,1\}^n$
- Assume the rows of $A$ represent a base set of items. We select columns so that every item is taken **exactly once**, at minimum cost.

### Visualization
Here we conceptually visualize these three problems by the following graph.
<img src="https://raw.githubusercontent.com/twcmchang/Intro-to-Optimization/master/scribe/set-covering-packing-partitioning.png"></img>
- set covering: items in the base set may be covered more than once.
- set packing: items in the base set may **not** be covered more than once.
- set partitioning: items in the base set must be covered exactly once.
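
The three conditions differ only in how $Ax$ is compared with 1. A minimal sketch on a small made-up matrix (the matrix and the function names are assumptions for illustration):

```python
# Small 0-1 matrix for illustration: rows are items, columns are candidate subsets.
A = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def row_counts(A, x):
    """Compute Ax: how many selected columns contain each row's item."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def classify(A, x):
    """Report which of the three feasibility conditions the selection x satisfies."""
    counts = row_counts(A, x)
    return {
        "covering": all(c >= 1 for c in counts),      # Ax >= 1
        "packing": all(c <= 1 for c in counts),       # Ax <= 1
        "partitioning": all(c == 1 for c in counts),  # Ax = 1
    }

print(classify(A, [1, 1, 0, 0]))  # covering only: item 1 is hit twice
print(classify(A, [0, 0, 1, 0]))  # packing only: items 1 and 3 are missed
print(classify(A, [1, 0, 0, 1]))  # all three: every item is hit exactly once
```

A selection is a partition exactly when it is both a cover and a packing.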

# Another Example of Set Covering - Flight Crew Scheduling
One famous example is flight crew scheduling. In this problem, we would like to determine a minimum-cost set of pilots to use so that all flights are covered. However, it is a complicated problem because:
- The rules that constitute a feasible pairing for a set of pilots are complicated, such as maximum flight time constraints and minimum sleep between flights.
- The cost for a pilot is also a complicated function of flying time, time away from home, and a minimum trip guarantee.

Expressing these rules with logical constraints inside the model would be very complicated. But once you know the route (pairing), evaluating its cost is relatively easy. Thus, the trick is to list all (or most) of the feasible pairings and their costs.

We can therefore formulate it as a set covering problem, where the rows are flight legs and the columns are pairings (subsets of flight legs, starting and ending at the home base, that obey FAA flight regulations).
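
Once pairings and their costs are listed, the problem is a set covering problem with a general cost vector $c$. The sketch below uses entirely hypothetical legs, pairings, and costs, and solves the tiny instance by enumeration in plain Python; a real crew scheduling model would have far too many columns for this and would be passed to a MIP solver instead.

```python
from itertools import combinations

# Hypothetical flight legs and pairings (legs covered, cost), for illustration only.
legs = ["L1", "L2", "L3", "L4"]
pairings = {
    "p1": ({"L1", "L2"}, 5.0),
    "p2": ({"L3", "L4"}, 6.0),
    "p3": ({"L1", "L2", "L3"}, 8.0),
    "p4": ({"L4"}, 2.0),
    "p5": ({"L2", "L3"}, 4.0),
}

def cheapest_cover(legs, pairings):
    """Enumerate all pairing subsets and keep the cheapest one that covers every leg."""
    best_cost, best_choice = float("inf"), None
    names = sorted(pairings)
    for size in range(1, len(names) + 1):
        for choice in combinations(names, size):
            flown = set().union(*(pairings[p][0] for p in choice))
            cost = sum(pairings[p][1] for p in choice)
            if flown >= set(legs) and cost < best_cost:
                best_cost, best_choice = cost, choice
    return best_cost, best_choice

print(cheapest_cover(legs, pairings))  # (10.0, ('p3', 'p4'))
```

Here the cheapest cover is not the one with the cheapest individual pairings, which is why the costs enter the objective as $c^Tx$ rather than a simple count.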

In [6]:
%gams_cleanup --closedown