# Baytheon and Farkas Investments, revisited

Place any setup code that you need (e.g. `import` statements) in the cell below.

In [None]:
from stochasticdp import StochasticDP

## Baytheon

Baytheon has received an order to supply 2 guided missiles. In order to meet stringent quality requirements, the company may have to manufacture more than one missile to obtain a missile that is acceptable. The company has time to make no more than 3 production runs, and at most 2 missiles can be produced in each run. The probability distribution of acceptable missiles in a given run depends on how many missiles are produced:

| Number of missiles produced | Prob. of 0 acceptable missiles | Prob. of 1 acceptable missile | Prob. of 2 acceptable missiles |
|-----------------------------|--------------------------------|-------------------------------|--------------------------------| 
| 0                           | 1                              | 0                             | 0                              |
| 1                           | 1/3                            | 2/3                           | 0                              |
| 2                           | 1/4                            | 1/2                           | 1/4                            |
    
Each missile costs \$100,000 to produce, and excess missiles are worthless. In addition, a setup cost of \$50,000 must be incurred whenever the production process is set up for this item. If 2 acceptable missiles have not been obtained by the end of the third production run, Baytheon is in breach of contract and must pay a penalty of \$1,000,000. The objective is to determine how many missiles to produce in each production run in order to minimize the total expected cost.
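
As a quick sanity check on the data above, the table and the cost structure can be written down directly. This is only a sketch: the names `ACCEPT_PROB` and `run_cost` are made up for illustration, and costs are expressed in thousands of dollars, which is an assumption about units (it matches the contributions used in the solution code further down).

```python
from fractions import Fraction

# Probability of k acceptable missiles given x produced (from the table above).
ACCEPT_PROB = {
    0: {0: Fraction(1), 1: Fraction(0), 2: Fraction(0)},
    1: {0: Fraction(1, 3), 1: Fraction(2, 3), 2: Fraction(0)},
    2: {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)},
}


def run_cost(produced):
    """Cost of one production run in thousands of dollars (hypothetical helper)."""
    # No setup cost is incurred if nothing is produced.
    return 0 if produced == 0 else 50 + 100 * produced


# Each row of the table should be a probability distribution.
for produced, dist in ACCEPT_PROB.items():
    assert sum(dist.values()) == 1

print([run_cost(x) for x in (0, 1, 2)])  # [0, 150, 250]
```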

Once upon a time, for homework, you formulated this problem as a stochastic dynamic program. (Note that the penalty value has changed.)

1. Solve your dynamic program using `stochasticdp`.
2. Interpret the output of your stochastic dynamic program.
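
For reference, here is one way to write the recursion that the code cell below appears to encode. The notation is introduced here and is not taken from the original homework: $f_t(n)$ is the minimum expected cost (in thousands of dollars) from stage $t$ onward when $n$ acceptable missiles are still needed, $p_x(k)$ is the probability of $k$ acceptable missiles when $x$ are produced, and $c(x)$ is the cost of a run, with $c(0) = 0$ and $c(x) = 50 + 100x$ for $x \ge 1$.

$$
f_t(n) = \min_{x \in \{0, 1, 2\}} \left\{ c(x) + \sum_{k=0}^{2} p_x(k) \, f_{t+1}(\max\{n - k, 0\}) \right\} \quad \text{for } t = 0, 1, 2 \text{ and } n = 0, 1, 2,
$$

$$
f_3(0) = 0, \qquad f_3(1) = f_3(2) = 1000.
$$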

In [None]:
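# Number of stages: 3 production runs (stages 0-2) plus a terminal stage (stage 3)
number_of_stages = 4

# List of states: number of acceptable missiles still needed
states = [0, 1, 2]

# List of decisions: number of missiles to produce in a run
decisions = [0, 1, 2]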

# Initialize stochastic dynamic program
dp = StochasticDP(number_of_stages, states, decisions, minimize=True)

# Transition probabilities and contributions from state n = 2
# Contributions are in thousands of dollars: setup cost 50 plus 100 per missile produced
for t in range(number_of_stages - 1):
    # Produce 2
    dp.add_transition(stage=t, from_state=2, decision=2, to_state=2, probability=1/4, contribution=50 + 100*2)
    dp.add_transition(stage=t, from_state=2, decision=2, to_state=1, probability=1/2, contribution=50 + 100*2)
    dp.add_transition(stage=t, from_state=2, decision=2, to_state=0, probability=1/4, contribution=50 + 100*2)
    
    # Produce 1
    dp.add_transition(stage=t, from_state=2, decision=1, to_state=2, probability=1/3, contribution=50 + 100*1)
    dp.add_transition(stage=t, from_state=2, decision=1, to_state=1, probability=2/3, contribution=50 + 100*1)
    
    # Produce 0
    dp.add_transition(stage=t, from_state=2, decision=0, to_state=2, probability=1, contribution=0)
    
# Transition probabilities and contributions from state n = 1
for t in range(number_of_stages - 1):
    # Produce 2 (either 1 or 2 acceptable missiles completes the order)
    dp.add_transition(stage=t, from_state=1, decision=2, to_state=1, probability=1/4, contribution=50 + 100*2)
    dp.add_transition(stage=t, from_state=1, decision=2, to_state=0, probability=1/2 + 1/4, contribution=50 + 100*2)
    
    # Produce 1
    dp.add_transition(stage=t, from_state=1, decision=1, to_state=1, probability=1/3, contribution=50 + 100*1)
    dp.add_transition(stage=t, from_state=1, decision=1, to_state=0, probability=2/3, contribution=50 + 100*1)
    
    # Produce 0
    dp.add_transition(stage=t, from_state=1, decision=0, to_state=1, probability=1, contribution=0)

# Transition probabilities and contributions from state n = 0
for t in range(number_of_stages - 1):
    # Produce 2
    dp.add_transition(stage=t, from_state=0, decision=2, to_state=0, probability=1, contribution=50 + 100*2)
    
    # Produce 1
    dp.add_transition(stage=t, from_state=0, decision=1, to_state=0, probability=1, contribution=50 + 100*1)
    
    # Produce 0
    dp.add_transition(stage=t, from_state=0, decision=0, to_state=0, probability=1, contribution=0)
        
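# Boundary conditions: pay the penalty (1000, in thousands of dollars)
# if any missiles are still needed after the third run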
dp.add_boundary(state=0, value=0)
dp.add_boundary(state=1, value=1000)
dp.add_boundary(state=2, value=1000)

# Solve the stochastic dynamic program
value, policy = dp.solve()

In [None]:
# Examine value-to-go
value

In [None]:
# Examine policy
policy

* The minimum expected cost is $f_0(2) \approx 590.97$ thousand dollars, that is, about \$590,972.22 (the code measures costs in thousands of dollars).

* The optimal policy is as follows:
    - Run 1: Produce 2. We will end up with 2, 1, or 0 missiles needed.
    - Run 2: If 2 missiles are still needed, produce 2. If 1 missile is still needed, produce 1. Otherwise produce 0. We will end up with 2, 1, or 0 missiles needed.
    - Run 3: If 2 missiles are still needed, we are in trouble: producing 2 or 0 missiles results in the same expected cost of \$1,000,000, and producing 1 is worse. If 1 missile is still needed, produce 1. Otherwise produce 0.
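
The value and policy above can be cross-checked with a plain backward induction that does not use `stochasticdp`. This is a minimal sketch, not the library's algorithm: the names `accept_prob`, `f`, and `best` are illustrative, and costs are in thousands of dollars.

```python
# Probability of k acceptable missiles given x produced (problem table).
accept_prob = {
    0: {0: 1.0},
    1: {0: 1 / 3, 1: 2 / 3},
    2: {0: 1 / 4, 1: 1 / 2, 2: 1 / 4},
}
num_runs = 3
penalty = 1000  # thousands of dollars

# f[t][n]: minimum expected cost from stage t with n missiles still needed.
f = {num_runs: {0: 0.0, 1: penalty, 2: penalty}}
best = {}

for t in reversed(range(num_runs)):
    f[t], best[t] = {}, {}
    for n in (0, 1, 2):
        cost = {}
        for x, dist in accept_prob.items():
            run_cost = 0 if x == 0 else 50 + 100 * x
            cost[x] = run_cost + sum(
                p * f[t + 1][max(n - k, 0)] for k, p in dist.items()
            )
        f[t][n] = min(cost.values())
        # Keep every decision that attains the minimum (ties are possible).
        best[t][n] = [x for x in cost if abs(cost[x] - f[t][n]) < 1e-9]

print(round(f[0][2], 2))  # 590.97
print(best[2][2])         # [0, 2]
```

Storing a list of minimizers rather than a single decision makes the tie in run 3 visible when 2 missiles are still needed.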

## Farkas Investments

You have recently been hired as a junior analyst at Farkas Investments. You have been given \$4 million to invest over the next 3 years. At the beginning of each of the next 3 years, you can invest in one of two investments: A or B.

| Investment | Cost (\$ millions) | Profit (\$ millions) | Probability |
|:-----------|:-------------------|:---------------------|:------------|
| A          | 3                  | 2                    | 0.5         |
|            |                    | -2                   | 0.5         |
| B          | 5                  | 3                    | 0.1         |
|            |                    | -1                   | 0.9         |

You are allowed to make at most one investment each year. Any additional money accumulated is left idle. You may not borrow money to invest; that is, you cannot buy into an investment if it costs more than you currently have.

Formulate a stochastic dynamic program to find an investment policy that maximizes the probability you will have at least \$10 million at the end of 3 years.

Once upon a time, for homework, you formulated this problem as a stochastic dynamic program.

1. Solve your dynamic program using `stochasticdp`.
2. Interpret the output of your stochastic dynamic program.
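
For reference, here is one way to write the recursion that the code cell below appears to encode. The notation is introduced here and is not taken from the original homework: $f_t(n)$ is the maximum probability of finishing with at least \$10 million, given $n$ million dollars at stage $t$, with wealth capped at 10.

$$
f_t(n) = \max \left\{
\begin{array}{ll}
0.5 \, f_{t+1}(\min\{n + 2, 10\}) + 0.5 \, f_{t+1}(n - 2) & \text{(invest in A, only if } n \ge 3\text{)} \\
0.1 \, f_{t+1}(\min\{n + 3, 10\}) + 0.9 \, f_{t+1}(n - 1) & \text{(invest in B, only if } n \ge 5\text{)} \\
f_{t+1}(n) & \text{(no investment)}
\end{array}
\right\} \quad \text{for } t = 0, 1, 2,
$$

$$
f_3(10) = 1, \qquad f_3(n) = 0 \text{ for } n = 0, 1, \dots, 9.
$$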

In [None]:
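# Number of stages: 3 years (stages 0-2) plus a terminal stage (stage 3)
number_of_stages = 4

# List of states: money on hand in millions of dollars, capped at 10
states = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# List of decisions
decisions = ['A', 'B', 'no investment']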

# Initialize stochastic DP
dp = StochasticDP(number_of_stages, states, decisions, minimize=False)

# Transition probabilities and contributions
# (all contributions are 0; the objective enters only through the boundary conditions)
for t in range(number_of_stages - 1):
    for n in states:
        # Investment A: requires at least $3 million; wealth is capped at state 10
        if n >= 3:
            dp.add_transition(stage=t, from_state=n, decision='A', to_state=min(n + 2, 10), probability=0.5, contribution=0)
            dp.add_transition(stage=t, from_state=n, decision='A', to_state=n - 2, probability=0.5, contribution=0)

        # Investment B: requires at least $5 million; wealth is capped at state 10
        if n >= 5:
            dp.add_transition(stage=t, from_state=n, decision='B', to_state=min(n + 3, 10), probability=0.1, contribution=0)
            dp.add_transition(stage=t, from_state=n, decision='B', to_state=n - 1, probability=0.9, contribution=0)

        # No investment: money on hand is unchanged
        dp.add_transition(stage=t, from_state=n, decision='no investment', to_state=n, probability=1, contribution=0)
        
# Boundary conditions: value 1 if we finish with at least $10 million (state 10), 0 otherwise
for n in states:
    if n == 10:
        dp.add_boundary(state=n, value=1)
    else:
        dp.add_boundary(state=n, value=0)

# Solve stochastic DP
value, policy = dp.solve()

In [None]:
# Print value-to-go
value

In [None]:
# Print policy
policy

* The maximum probability of reaching \$10 million is $f_0(4) = 0.125$.

* The optimal policy is as follows:
    - Year 1: Invest in A. We will end up with either \$2 or \$6 million.
    - Year 2: If we have \$2 million, don't invest (in fact, we can't, because neither investment is affordable). If we have \$6 million, invest in A, after which we have either \$4 million or \$8 million.
    - Year 3: If we have \$2 million or \$4 million, we can't reach \$10 million by the end of the year. With \$4 million, since our objective in this problem is purely to maximize the probability we reach \$10 million, investing in A and not investing result in the same value-to-go (0). If we have \$8 million, invest in A again.
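
The probability and policy above can be cross-checked with a plain backward induction that does not use `stochasticdp`. This is a minimal sketch under the same modeling choices as the code above (wealth in millions of dollars, capped at 10); the helper `outcomes` and the names `f` and `best` are illustrative.

```python
horizon = 3
target = 10
wealth_levels = range(target + 1)  # millions of dollars, capped at the target


def outcomes(n, decision):
    """Return [(probability, next_wealth)], or None if the decision is infeasible."""
    if decision == "A":
        return [(0.5, min(n + 2, target)), (0.5, n - 2)] if n >= 3 else None
    if decision == "B":
        return [(0.1, min(n + 3, target)), (0.9, n - 1)] if n >= 5 else None
    return [(1.0, n)]  # no investment


# f[t][n]: maximum probability of finishing at the target from stage t with wealth n.
f = {horizon: {n: 1.0 if n == target else 0.0 for n in wealth_levels}}
best = {}

for t in reversed(range(horizon)):
    f[t], best[t] = {}, {}
    for n in wealth_levels:
        prob = {}
        for d in ("A", "B", "no investment"):
            result = outcomes(n, d)
            if result is not None:
                prob[d] = sum(p * f[t + 1][m] for p, m in result)
        f[t][n] = max(prob.values())
        # Keep every decision that attains the maximum (ties are possible).
        best[t][n] = [d for d in prob if abs(prob[d] - f[t][n]) < 1e-9]

print(f[0][4])     # 0.125
print(best[0][4])  # ['A']
print(best[2][4])  # ['A', 'no investment']
```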