**SA367 &#x25aa; Mathematical Models for Decision Making &#x25aa; Spring 2022 &#x25aa; Uhan**

# Lesson 6. Solving Dynamic Programs with Python

## Overview

* In this lesson, we'll revisit a few examples of dynamic programs and solve them with Python &mdash; in particular, NetworkX

## The knapsack problem, revisited

You are a thief deciding which precious metals to steal from a vault:
                                  
|    | Metal    | Weight (kg) | Value |
|:---|:---------|:-----------:|:-----:|
| 1  | Gold     | 3           | 11    |
| 2  | Silver   | 2           | 7     |
| 3  | Platinum | 4           | 12    |
                                  
You have a knapsack that can hold at most 8 kg. If you decide to take a particular metal, you must take all of it. Which items should you take to maximize the value of your theft?

* Recall that we formulated this problem as a dynamic program with the following longest path representation:
    - Stage $t$ represents the decision to take item $t$ ($t = 1, 2, 3$), or the end of the decision-making process ($t = 4$)
    - Node $t_n$ represents having $n$ kg left in the knapsack at stage $t$ ($n = 0, 1, \dots, 8$)

![DP for knapsack example](img/knapsack.png)

* We know how to solve shortest/longest path problems using NetworkX, so we can apply the same ideas here

* There is a Python data structure that makes this a little easier...

### Tuples

* A __tuple__ is like a list, except once it's been defined, it cannot be changed

* A tuple is written as a sequence of comma-separated items between _round_ brackets

* For example, we can define a tuple corresponding to taking silver with 5 kg left in the knapsack, like this:

* Tuples are ideal for things like names of nodes &mdash; things that you want to make permanent and not accidentally change
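
* Here is a minimal sketch of these ideas. It assumes that silver, being item 2, is identified by its stage number; the variable names `node`, `stage`, `kg_left`, and `labels` are made up for illustration:

```python
# A tuple naming the node "stage 2 (silver), 5 kg left in the knapsack"
node = (2, 5)

# A tuple can be unpacked into its parts
stage, kg_left = node
print(stage, kg_left)

# Tuples cannot be changed: item assignment raises a TypeError
try:
    node[1] = 4
except TypeError as err:
    print("Cannot change a tuple:", err)

# Tuples are hashable, so they can be used as dictionary keys
# (and, for the same reason, as node names in NetworkX)
labels = {node: "silver, 5 kg left"}
print(labels[(2, 5)])
```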

### Back to the knapsack problem...

* We can use a tuple to represent the name of each node in our dynamic program, since each node's name has two distinct parts: the stage and the state

* Before we do anything, we need to import `networkx` and `bellmanford`:

```python
import networkx as nx
import bellmanford as bf
```

* Let's begin by creating an empty graph:

* Next, let's add the stage-state nodes to the graph, using `for` loops
    - Remember that `range(a, b)` iterates over the integers `a, a + 1, ..., b - 1`

* We also need to add the special "end" node:
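
* One possible way to carry out these steps is sketched below. It assumes a directed graph (`nx.DiGraph`), nodes named by `(stage, kg left)` tuples, and the string `"end"` as the name of the special node; none of these names are fixed by the formulation:

```python
import networkx as nx

# Directed graph: every edge points from one stage to the next
G = nx.DiGraph()

# Stage-state nodes (t, n): stages t = 1, ..., 4 and n = 0, ..., 8 kg left
for t in range(1, 5):
    for n in range(0, 9):
        G.add_node((t, n))

# Special node marking the end of the decision-making process
G.add_node("end")

print(G.number_of_nodes())
```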

* Now we need to add the edges

* There are a lot of them, so we'll want to use some `for` loops

* The best way to use `for` loops depends on the shortest/longest path representation of the DP

* For example, looking above, we can add all the red edges of length 0 &mdash; corresponding to not taking the item &mdash; in one fell swoop, like this:
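
* A sketch of this step, assuming the edge attribute is called `length` (an arbitrary name) and that "not taking item $t$" moves from node $(t, n)$ to node $(t+1, n)$. The graph is rebuilt here so the snippet runs on its own; `add_edge` creates any missing endpoint nodes automatically:

```python
import networkx as nx

G = nx.DiGraph()

# Not taking item t: the kg left stays the same, and nothing is gained
for t in range(1, 4):
    for n in range(0, 9):
        G.add_edge((t, n), (t + 1, n), length=0)

print(G.number_of_edges())
```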

* Next, we can add the blue edges of length 11, corresponding to taking item 1 (gold)
    - Don't forget our DP is a _longest_ path problem!
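
* A sketch of the gold edges, again assuming an edge attribute named `length`. Because shortest path algorithms minimize, the value 11 is stored as $-11$; gold weighs 3 kg, so these edges only leave nodes with at least 3 kg left:

```python
import networkx as nx

G = nx.DiGraph()

# Taking item 1 (gold, 3 kg, value 11): go from (1, n) to (2, n - 3).
# The length is negated so that a shortest path is a longest path.
for n in range(3, 9):
    G.add_edge((1, n), (2, n - 3), length=-11)

print(G.number_of_edges())
```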

* We can do something similar for the orange edges of length 7, corresponding to taking item 2 (silver):

* In addition, we can do something similar for the green edges of length 12, corresponding to taking item 3 (platinum):
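
* Since the silver and platinum edges follow the same pattern, one option is to wrap the loop in a small helper. The function `add_take_edges` and its parameters are hypothetical names introduced for this sketch, not part of NetworkX:

```python
import networkx as nx


def add_take_edges(G, stage, weight, value, capacity=8):
    """Add the edges for taking the item decided at `stage`.

    Each edge goes from (stage, n) to (stage + 1, n - weight) and stores
    the negated value as its length (longest path as shortest path).
    """
    for n in range(weight, capacity + 1):
        G.add_edge((stage, n), (stage + 1, n - weight), length=-value)


G = nx.DiGraph()
add_take_edges(G, stage=2, weight=2, value=7)   # item 2 (silver)
add_take_edges(G, stage=3, weight=4, value=12)  # item 3 (platinum)

print(G.number_of_edges())
```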

* Finally, we can add the edges from the last stage nodes to the special "end" node:

* Now, we can solve the dynamic program using the Bellman-Ford algorithm, just as before:
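
* Putting the pieces together, the whole model can be sketched in one self-contained cell. This sketch swaps in NetworkX's built-in `nx.bellman_ford_path` and `nx.bellman_ford_path_length` in place of the `bellmanford` package so that it depends on NetworkX alone; the dictionary `items`, the attribute name `length`, and the node name `"end"` are assumptions:

```python
import networkx as nx

# stage: (weight in kg, value), taken from the table of metals
items = {1: (3, 11), 2: (2, 7), 3: (4, 12)}
capacity = 8

G = nx.DiGraph()
for t, (weight, value) in items.items():
    for n in range(capacity + 1):
        # Do not take item t
        G.add_edge((t, n), (t + 1, n), length=0)
        # Take item t, if it fits (value negated: longest path problem)
        if n >= weight:
            G.add_edge((t, n), (t + 1, n - weight), length=-value)

# Edges from the last stage to the special "end" node
for n in range(capacity + 1):
    G.add_edge((4, n), "end", length=0)

# Shortest path from "stage 1, 8 kg left" to "end"
path = nx.bellman_ford_path(G, (1, 8), "end", weight="length")
length = nx.bellman_ford_path_length(G, (1, 8), "end", weight="length")
print(f"Shortest path length: {length}")
print(f"Shortest path: {path}")
```

* Because the values were negated, the maximum value is the negative of the reported length. Along the path, a drop in the second entry between consecutive nodes means the item for that stage was taken.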

### Interpreting the output

* What is the maximum value we can carry in the knapsack?

_Write your notes here. Double-click to edit._

* Which items should we take to obtain this maximum value?

_Write your notes here. Double-click to edit._

## Practice makes perfect &mdash; on your own

* Here are three more examples of dynamic programs we modeled in a previous lesson. Solve them using NetworkX and interpret the output.

### Assigning patrol cars to precincts

<!-- Winston and Venkataramanan Problem 13.4.4 -->
The Simplexville Police Department wants to determine how to assign patrol cars to each precinct in Simplexville. Each precinct can be assigned 0, 1, or 2 patrol cars. The number of crimes in each precinct depends on the number of patrol cars assigned to that precinct:
                                      
| Precinct | 0 patrol cars | 1 patrol car | 2 patrol cars | 
| :------: | :-----------: | :-----------: | :-----------: | 
| 1 | 14 | 10 | 7 |
| 2 | 25 | 19 | 17 |
| 3 | 20 | 14 | 11 |
                                      
The department has 5 patrol cars. The department's goal is to minimize the total number of crimes across all 3 precincts. 

* We formulated this problem as a dynamic program with the following shortest path representation:
    - Stage $t$ represents the decision to assign patrol cars to precinct $t$ $(t = 1, 2, 3)$ or the end of the decision-making process ($t = 4$).
    - Node $t_n$ represents having $n$ patrol cars left at stage $t$ ($n = 0, 1, \dots, 5$).

![DP for patrol car example](img/patrol.png)

Solve this dynamic program using NetworkX.
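
* One way to set this up is sketched below. It assumes that assigning $k$ cars to precinct $t$ is an edge from $(t, n)$ to $(t+1, n-k)$ whose length is the number of crimes in the table, and that the last stage connects to an `"end"` node with edges of length 0. The names `crimes`, `total_cars`, and `length` are illustrative, and NetworkX's built-in Bellman-Ford functions stand in for the `bellmanford` package:

```python
import networkx as nx

# crimes[t][k] = crimes in precinct t when k patrol cars are assigned
crimes = {1: [14, 10, 7], 2: [25, 19, 17], 3: [20, 14, 11]}
total_cars = 5

G = nx.DiGraph()
for t, row in crimes.items():
    for n in range(total_cars + 1):
        for k, c in enumerate(row):
            # Can only assign k cars if at least k are left
            if k <= n:
                G.add_edge((t, n), (t + 1, n - k), length=c)

# Edges from the last stage to the "end" node
for n in range(total_cars + 1):
    G.add_edge((4, n), "end", length=0)

path = nx.bellman_ford_path(G, (1, 5), "end", weight="length")
length = nx.bellman_ford_path_length(G, (1, 5), "end", weight="length")
print(f"Shortest path length: {length}")
print(f"Shortest path: {path}")
```

* This is a shortest path problem as stated, so no negation is needed. The number of cars assigned to precinct $t$ is the drop in the second entry between the stage $t$ node and the stage $t+1$ node on the path.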

_Interpret the output of the DP here. Double-click to edit._

### Inventory management

<!-- Rardin Exercise 9-26 -->
The Dijkstra Brewing Company is planning production of its new limited run beer, Primal Pilsner. The company must supply 1 batch next month, then 2 and 4 in successive months. Each month in which the company produces the beer requires a factory setup cost of \\$5,000. Each batch of beer costs \\$2,000 to produce. Batches can be held in inventory at a cost of \\$1,000 per batch per month. Capacity limitations allow a maximum of 3 batches to be produced during each month. In addition, the size of the company's warehouse restricts the ending inventory for each month to at most 3 batches. The company has no initial inventory.
  
The company wants to find a production plan that meets all demands on time and minimizes its total production and holding costs over the next 3 months. 

* We formulated this problem as a dynamic program with the following shortest path representation:
    - Stage $t$ represents deciding to produce in month $t$ ($t = 1, 2, 3$), or the end of the decision-making process ($t = 4$).
    - Node $t_n$ represents having $n$ batches in inventory at the end of stage $t$ ($n = 0, 1, 2, 3$).

![DP for inventory management example](img/inventory.png)

Solve this dynamic program using NetworkX.

_Interpret the output of the DP here. Double-click to edit._

### Study time

To graduate from Simplexville University, Angie needs to pass at least one of
the three courses she is taking this semester: literature, finance, and
statistics. Angie's busy schedule of extracurricular activities allows her to
spend only 4 hours per week on studying. Angie's probability of passing each
course depends on the number of hours she spends studying for the course:

| Hours of studying per week | Literature | Finance | Statistics |
|:--------------------------:|:----------:|:-------:|:----------:|
| 0                          | 0.20       | 0.25    | 0.10       |
| 1                          | 0.30       | 0.30    | 0.30       |
| 2                          | 0.35       | 0.33    | 0.40       |
| 3                          | 0.38       | 0.35    | 0.44       |
| 4                          | 0.40       | 0.38    | 0.50       |

Angie wants to maximize the probability that she passes at least one of these
three courses.

* We formulated this problem as a dynamic program with the following shortest path representation:
    - Stage $t$ represents deciding how many hours to assign to course $t$ ($t = 1, 2, 3$), or the end of the decision-making process ($t = 4$).
    - Node $t_n$ represents having $n$ hours left to assign at stage $t$ ($n = 0, 1, 2, 3, 4$).




_Hint._ You can import the natural exponent and logarithm functions from the `math` library:

```python
from math import exp, log
```

![DP for study time example](img/study.png)

Solve this dynamic program using NetworkX.
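
* The role of `exp` and `log` can be sketched as follows. Passing at least one course is the complement of failing all three, and the probability of failing all three is a product, $\prod_t (1 - p_t)$. Taking logarithms turns the product into a sum, so an edge of length $\log(1 - p_t)$ makes the shortest path minimize the probability of failing everything. The sketch assumes courses 1, 2, 3 are literature, finance, and statistics, in that order, and that the names `pass_prob`, `hours`, and `length` are free choices; NetworkX's built-in Bellman-Ford functions are used in place of the `bellmanford` package:

```python
from math import exp, log

import networkx as nx

# pass_prob[t][h] = probability of passing course t with h hours per week
pass_prob = {
    1: [0.20, 0.30, 0.35, 0.38, 0.40],  # literature
    2: [0.25, 0.30, 0.33, 0.35, 0.38],  # finance
    3: [0.10, 0.30, 0.40, 0.44, 0.50],  # statistics
}
hours = 4

G = nx.DiGraph()
for t, probs in pass_prob.items():
    for n in range(hours + 1):
        for h in range(n + 1):
            # Spend h of the n remaining hours on course t;
            # the length is the log of the probability of failing it
            G.add_edge((t, n), (t + 1, n - h), length=log(1 - probs[h]))

for n in range(hours + 1):
    G.add_edge((4, n), "end", length=0)

path = nx.bellman_ford_path(G, (1, 4), "end", weight="length")
length = nx.bellman_ford_path_length(G, (1, 4), "end", weight="length")

# Undo the logarithm, then take the complement
prob_pass_at_least_one = 1 - exp(length)
print(f"Shortest path: {path}")
print(f"P(pass at least one course): {prob_pass_at_least_one:.4f}")
```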

_Interpret the output of the DP here. Double-click to edit._

---

## Problems

### Problem 1 (Dynamic Distillery, revisited)

You have been put in charge of launching Dynamic Distillery's new bourbon whiskey. There are 4 nonoverlapping phases: research, development, manufacturing system design, and initial production and distribution. Each phase can be conducted at one of two speeds: normal or priority. The times required (in months) to complete each phase at the two speeds are:

| Level    | Research | Development | Manufacturing System Design | Initial Production and Distribution |
|:---------|:--------:|:-----------:|:---------------------------:|:-----------------------------------:|
| Normal   | 4        | 3           | 5                           | 2                                   |
| Priority | 2        | 2           | 3                           | 1                                   |

The costs (in millions of \$) of completing each phase at the two speeds are:

| Level    | Research | Development | Manufacturing System Design | Initial Production and Distribution |
|:---------|:--------:|:-----------:|:---------------------------:|:-----------------------------------:|
| Normal   | 2        | 2           | 3                           | 1                                   |
| Priority | 3        | 3           | 4                           | 2                                   |

You have been given \$10 million to execute the launch as quickly as possible. 

Once upon a time, for homework, you formulated this problem as a dynamic program by giving its shortest/longest path representation.

1. Solve your dynamic program using NetworkX.
2. Interpret the output of your dynamic program.

_Interpret the output of the DP here. Double-click to edit._

### Problem 2 (Pear Computers, revisited)

Pear Computers has a contract to deliver the following number of laptop computers during the next three months:

|                           | Month 1 | Month 2 | Month 3 |
|:--------------------------|:-------:|:-------:|:-------:|
| Laptop computers required | 200     | 300     | 200     |

For each laptop produced during months 1 and 2, a \\$100 cost is incurred; for each laptop produced during month 3, a \\$120 cost is incurred. Each month in which the company produces laptops requires a factory setup cost of \\$2,500. Laptops can be held in a warehouse at a cost of \\$15 for each laptop in inventory at the end of a month. The warehouse can hold at most 400 laptops. 

Laptops made during a month may be used to meet demand for that month or any future month. Manufacturing constraints require that laptops be produced in multiples of 100, and at most 300 laptops can be produced in any month.  The company's goal is to find a production plan that will meet all demands on time and minimizes its total production and holding costs over the next 3 months.

Once upon a time, for homework, you formulated this problem as a dynamic program by giving its shortest/longest path representation.

1. Solve your dynamic program using NetworkX.
2. Interpret the output of your dynamic program.

_Interpret the output of the DP here. Double-click to edit._