## Project 3 Report - Scheduling and Decision Analysis with Uncertainty

For the final project, we're going to combine concepts from Lesson 7 (Constraint Programming), Lesson 8 (Simulation), and Lesson 9 (Decision Analysis) by revisiting the scheduling problem from Lesson 7. This time we'll make it a little more true-to-life by acknowledging the uncertainty in our estimates and using simulation to come up with better ones. We'll then use the estimated profits to construct a payoff table and decide how to proceed with the building project.

When we originally created the problem, we used the following estimates for the time each task would take:

<img src='images/reliable_table.png' width="450"/>

But based on past experience, we know that these are just the most likely estimates of the time needed for each task. Here are our estimated ranges of values (in days instead of weeks) for each task:

<img src='images/reliable-estimate-ranges.png' width="450"/>

Further, we're going to consider the following factors:

* The base amount that Reliable will earn is \$5.4 million.
* If Reliable completes the project in 280 days or less, they will get a bonus of \$150,000.
* If Reliable misses the deadline of 329 days, there will be a \$25,000 penalty for each day over 329.
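
The three rules above can be collected into one small profit function. This is a minimal sketch: the name `project_profit`, the constant names, and the optional `extra_costs` argument are illustrative assumptions, not part of the assignment.

```python
BASE_PAYMENT = 5_400_000     # base amount Reliable earns ($)
BONUS = 150_000              # paid for finishing in 280 days or fewer ($)
BONUS_CUTOFF_DAYS = 280
DEADLINE_DAYS = 329
PENALTY_PER_DAY = 25_000     # charged for each day past the deadline ($)


def project_profit(days, extra_costs=0.0):
    """Return the profit in dollars for a project that takes `days` days."""
    profit = BASE_PAYMENT - extra_costs
    if days <= BONUS_CUTOFF_DAYS:
        profit += BONUS
    # Exactly 329 days meets the deadline, so only days beyond it are penalized.
    profit -= PENALTY_PER_DAY * max(0, days - DEADLINE_DAYS)
    return profit


for days in (280, 300, 329, 330):
    print(days, project_profit(days))
```

Keeping the rule in one place makes the boundary cases (280 and 329 days) easy to check by hand.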

### **P3.1** - Simulation

Create a simulation that uses a triangular distribution to estimate the duration for each of the activities. Use the Optimistic Estimate, Most Likely Estimate, and Pessimistic Estimate for the 3 parameters of your triangular distribution.   Use CP-SAT to find the minimal schedule length in each iteration.  Track the total days each simulation takes and the profit for the company.

Put your simulation code in the cell below.  Use at least 1000 iterations.  Check your simulation results to make sure the tasks are being executed in the correct order!

<font color = "blue"> *** 8 points -  answer in cell below *** (don't delete this cell) </font>

In [60]:
from ortools.sat.python import cp_model
import numpy as np
import math

class VarArraySolutionPrinter(cp_model.CpSolverSolutionCallback):
    """Print intermediate solutions."""

    def __init__(self, variables):
        cp_model.CpSolverSolutionCallback.__init__(self)
        self.__variables = variables
        self.__solution_count = 0

    def on_solution_callback(self):
        self.__solution_count += 1
        for v in self.__variables:
            print(f'{v} = {self.Value(v)}', end = ' ')
        print()

    def solution_count(self):
        return self.__solution_count

# The artifacts parameter lets both simulations share this function.
def worktimes(artifacts = 0):
    """Sample a duration in whole days for each task, A through N.

    Each draw is triangular(optimistic, most likely, pessimistic), rounded up.
    """
    newworktimes = [
        math.ceil(np.random.triangular(7, 14, 21)),     # A
        math.ceil(np.random.triangular(14, 21, 56)),    # B
        math.ceil(np.random.triangular(42, 63, 126)),   # C
        math.ceil(np.random.triangular(28, 35, 70)),    # D
        math.ceil(np.random.triangular(7, 28, 35)),     # E
        math.ceil(np.random.triangular(28, 35, 70)),    # F
        math.ceil(np.random.triangular(35, 42, 77)),    # G
        math.ceil(np.random.triangular(35, 56, 119)),   # H
        math.ceil(np.random.triangular(21, 49, 63)),    # I
        math.ceil(np.random.triangular(21, 63, 63)),    # J
        math.ceil(np.random.triangular(21, 28, 28)),    # K
        math.ceil(np.random.triangular(7, 35, 49)),     # L
        math.ceil(np.random.triangular(7, 14, 21)),     # M
        math.ceil(np.random.triangular(35, 35, 63)),    # N
    ]

    # Artifact delays (in days) extend the first task, A.
    newworktimes[0] += artifacts

    return newworktimes

In [61]:
# The artifacts parameter lets both simulations share this function.
def buildahouse(artifacts = 0):
    """Sample task durations and return the minimal schedule length in days."""
    tasklist = [chr(i+65) for i in range(14)]  # task names 'A' through 'N'

    tasklength = worktimes(artifacts)

    num_tasks = len(tasklist)

    # for each task we have a list of tasks that must go after
    # task:['these','tasks','after']
    precedence_dict = {
        'A': ['B'],
        'B': ['C'],
        'C': ['D', 'E', 'I'],
        'D': ['G'],
        'E': ['F', 'H'],
        'F': ['J'],
        'G': ['H'],
        'H': ['M'],
        'I': ['J'],
        'J': ['K', 'L'],
        'K': ['N'],
        'L': ['N'],
    }

    task_name_to_number_dict = dict(zip(tasklist, np.arange(0, num_tasks)))

    horizon = sum(tasklength)

    model = cp_model.CpModel()

    start_vars = [model.NewIntVar(0, horizon, name=f'start_{t}') for t in tasklist]
    end_vars = [model.NewIntVar(0, horizon, name=f'end_{t}') for t in tasklist]

    intervals = [
        model.NewIntervalVar(start_vars[i],
                             tasklength[i],
                             end_vars[i],
                             name=f'interval_{tasklist[i]}')
        for i in range(num_tasks)
    ]

    for before in list(precedence_dict.keys()):
        for after in precedence_dict[before]:
            before_index = task_name_to_number_dict[before]
            after_index = task_name_to_number_dict[after]
            model.Add(end_vars[before_index] <= start_vars[after_index])

    obj_var = model.NewIntVar(0, horizon, 'largest_end_time')
    model.AddMaxEquality(obj_var, end_vars)
    model.Minimize(obj_var)

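    solver = cp_model.CpSolver()
    status = solver.Solve(model)

    # Fail loudly instead of returning a meaningless objective value.
    if status != cp_model.OPTIMAL:
        raise RuntimeError('CP-SAT did not find an optimal schedule')

    # ObjectiveValue() returns a float; the schedule length is a whole number of days.
    return round(solver.ObjectiveValue())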
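
One way to check that the tasks are being executed in the correct order is to compare the solver's answer with a longest-path calculation. The model above has only precedence constraints (no shared resources), so the minimal schedule length should equal the longest duration-weighted path through the precedence graph. The sketch below is an independent cross-check, not part of the CP-SAT model. The names `SUCCESSORS` and `critical_path_length` are hypothetical, and the example durations are the mode values passed to `np.random.triangular` in `worktimes`.

```python
from functools import lru_cache

# Same successor lists as precedence_dict in buildahouse.
SUCCESSORS = {
    'A': ['B'], 'B': ['C'], 'C': ['D', 'E', 'I'], 'D': ['G'],
    'E': ['F', 'H'], 'F': ['J'], 'G': ['H'], 'H': ['M'],
    'I': ['J'], 'J': ['K', 'L'], 'K': ['N'], 'L': ['N'],
}


def critical_path_length(durations):
    """Return the longest duration-weighted path through the precedence graph."""

    @lru_cache(maxsize=None)
    def time_to_finish(task):
        # Duration of `task` plus the longest chain of tasks that must follow it.
        tail = max((time_to_finish(s) for s in SUCCESSORS.get(task, [])), default=0)
        return durations[task] + tail

    return max(time_to_finish(task) for task in durations)


# Mode (most likely) values used in worktimes, in task order A..N.
mode_durations = dict(zip(
    "ABCDEFGHIJKLMN",
    [14, 21, 63, 35, 28, 35, 42, 56, 49, 63, 28, 35, 14, 35],
))
print(critical_path_length(mode_durations))  # 294, via A-B-C-E-F-J-L-N
```

If `buildahouse` were changed to return the sampled durations along with the objective value, the two numbers could be compared on every iteration.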


In [62]:
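houseresults = []    # schedule length in days for each iteration
houseearnings = []   # profit in dollars for each iteration
for i in range(1000):
    thishousetime = buildahouse()
    houseresults.append(thishousetime)
    # Base payment less $25,000 for each day past the 329-day deadline.
    thishouseearnings = 5400000 - max(0, thishousetime - 329) * 25000
    # $150,000 bonus for finishing in 280 days or fewer.
    if thishousetime <= 280:
        thishouseearnings += 150000
    houseearnings.append(thishouseearnings)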


What is the probability that Reliable will finish the project in 280 days or fewer, in more than 280 but no more than 329 days, or in more than 329 days? What is their average profit?

Include code to answer these questions with output below:

<font color = "blue"> *** 2 points -  answer in cell below *** (don't delete this cell) </font>

In [50]:
# Count the simulated projects that fall in each completion-time bucket.
quick = 0
average = 0
slow = 0

for x in houseresults:
    if x <= 280:
        quick += 1
    # Note: exactly 329 days is counted as slow here, although the
    # late penalty only applies to days beyond 329.
    elif x >= 329:
        slow += 1
    else:
        average += 1
print(f"There were {quick} simulated projects finished in 280 days or fewer, {average} simulated projects finished in 281 to 328 days, and {slow} took at least 329 days." )
print(f"The average earnings is ${round(sum(houseearnings)/len(houseearnings))}")


There were 44 simulated projects finished in 280 days or fewer, 574 simulated projects finished in 281 to 328 days, and 382 took at least 329 days.
The average earnings is $5220625
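
The question asks for probabilities rather than counts, with a 329-day finish still on time. A minimal sketch of that conversion follows. The function name `bucket_probabilities` is hypothetical, and the late bucket uses a strict `>`, unlike the `>= 329` test in the counting loop above.

```python
def bucket_probabilities(durations, bonus_cutoff=280, deadline=329):
    """Return (P(early), P(on time), P(late)) for a list of schedule lengths."""
    n = len(durations)
    early = sum(d <= bonus_cutoff for d in durations)
    late = sum(d > deadline for d in durations)   # strictly past the deadline
    on_time = n - early - late                    # 281 through 329 days
    return early / n, on_time / n, late / n


# Made-up schedule lengths, only to show the boundary handling.
sample = [275, 280, 300, 329, 330, 400]
print(bucket_probabilities(sample))
```

Passing `houseresults` instead of `sample` would give the estimated probabilities for the simulation.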


### **P3.2** - Add Random Cost
From past experience, we know that special artifacts are sometimes found in the area where Reliable Construction is planning this building project.  When special artifacts are found, the excavation phase takes considerably longer and the entire project costs more - sometimes much more. They're never quite sure how much longer it will take, but it peaks around an extra 15 days, and takes at least an extra 7 days. They've seen some sites where relocating the special artifacts took as much as 365 extra days (yes - a whole year)! 

In addition, there are usually unanticipated costs that include fines and other things.  The accounting department suggests that we model those costs with an exponential distribution with mean (scale) \$100,000.
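
To get a feel for the two new random inputs, the sketch below samples them with Python's standard `random` module, which is a swapped-in alternative to the NumPy calls used in the notebook. The function name `sample_artifact_effects` and the seed are assumptions for illustration. Note that `random.triangular` takes `(low, high, mode)`, while `np.random.triangular` takes `(left, mode, right)`.

```python
import random


def sample_artifact_effects(rng):
    """Return (extra excavation days, unanticipated cost in dollars)."""
    # Minimum 7 days, peak at 15 days, maximum 365 days.
    extra_days = rng.triangular(7, 365, 15)
    # expovariate takes the rate, which is 1 / mean.
    extra_cost = rng.expovariate(1 / 100_000)
    return extra_days, extra_cost


rng = random.Random(0)  # seeded only so the sketch is repeatable
samples = [sample_artifact_effects(rng) for _ in range(10_000)]
mean_days = sum(days for days, _ in samples) / len(samples)
mean_cost = sum(cost for _, cost in samples) / len(samples)
print(f"mean extra days: {mean_days:.1f}, mean extra cost: ${mean_cost:,.0f}")
```

The long right tail matters here: the mean of a triangular distribution is (7 + 15 + 365) / 3 = 129 days, far above the 15-day peak, which is why the late penalties dominate when artifacts are found.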


Run a second simulation with these new parameters and using at least 1000 iterations.  Note, we are assuming that artifacts were found for this simulation.

Put your simulation code in the cell below.

<font color = "blue"> *** 8 points -  answer in cell below *** (don't delete this cell) </font>

In [65]:
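#using the artifact parameter now!
houseresultsart = []    # schedule length in days for each iteration
houseearningsart = []   # profit in dollars for each iteration
for i in range(1000):
    # Extra excavation days: at least 7, most likely 15, at most 365.
    artifacttime = math.ceil(np.random.triangular(7,15,365))
    thishousetime = buildahouse(artifacttime)
    houseresultsart.append(thishousetime)
    # Late penalty plus unanticipated costs drawn from an exponential with mean $100,000.
    thishouseearnings = 5400000 - max(0, thishousetime - 329) * 25000 - np.random.exponential(100000)
    if thishousetime <= 280:
        thishouseearnings += 150000
    houseearningsart.append(thishouseearnings)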


When artifacts are found, what is the probability that Reliable will finish the project in 280 days or fewer, in more than 280 but no more than 329 days, or in more than 329 days? What is their average profit?

Include code to answer these questions with output below:

<font color = "blue"> *** 2 points -  answer in cell below *** (don't delete this cell) </font>

In [64]:
# Count the simulated projects (with artifacts) in each completion-time bucket.
quickart = 0
averageart = 0
slowart = 0

for x in houseresultsart:
    if x <= 280:
        quickart += 1
    # Note: exactly 329 days is counted as slow here, although the
    # late penalty only applies to days beyond 329.
    elif x >= 329:
        slowart += 1
    else:
        averageart += 1
print(f"There were {quickart} simulated projects finished in 280 days or fewer, {averageart} simulated projects finished in 281 to 328 days, and {slowart} took at least 329 days." )
print(f"The average earnings is ${round(sum(houseearningsart)/len(houseearningsart))}")

There were 1 simulated projects finished in 280 days or fewer, 38 simulated projects finished in 281 to 328 days, and 961 took at least 329 days.
The average earnings is $2171434


### **P3.3** - Make Decision about Insurance

Clearly, dealing with artifacts can be very costly for Reliable Construction.  It is known from past experience that about 30% of building sites in this area contain special artifacts.  Fortunately, Reliable can purchase an insurance policy, albeit a quite expensive one. The policy costs \$500,000, but it covers all fines and penalties for delays in the event that special artifacts are found that require remediation. Effectively, this means that Reliable could expect the same profit they would get if no artifacts were found (minus the cost of the policy).

Given the estimated profit without artifacts, the estimated profit with artifacts, the cost of insurance, and the 30% likelihood of finding artifacts, create a payoff table and use Bayes' Decision Rule to determine what decision Reliable should make.  You should round the simulated profits to the nearest \$100,000 and use units of millions of dollars so that, for example, \$8,675,309 is 8.7 million dollars.

Provide appropriate evidence for the best decision such as a payoff table or picture of a suitable (small) decision tree.

<font color = "blue"> *** 6 points -  answer in cell below *** (don't delete this cell) </font>

<img src='p33decisions.png'>
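
The payoff table can also be checked numerically. The sketch below assumes the rounded profits implied by the recorded outputs above (\$5,220,625 becomes 5.2 and \$2,171,434 becomes 2.2, in millions). A rerun of the simulation would shift these slightly, and the variable names are illustrative.

```python
P_ARTIFACTS = 0.30
PROFIT_NO_ARTIFACTS = 5.2   # $ millions, rounded from the recorded 5,220,625
PROFIT_ARTIFACTS = 2.2      # $ millions, rounded from the recorded 2,171,434
INSURANCE_COST = 0.5        # $ millions

# Payoff table: rows are actions, columns are states of nature.
payoffs = {
    "buy insurance": {
        "artifacts": PROFIT_NO_ARTIFACTS - INSURANCE_COST,
        "no artifacts": PROFIT_NO_ARTIFACTS - INSURANCE_COST,
    },
    "no insurance": {
        "artifacts": PROFIT_ARTIFACTS,
        "no artifacts": PROFIT_NO_ARTIFACTS,
    },
}
priors = {"artifacts": P_ARTIFACTS, "no artifacts": 1 - P_ARTIFACTS}

# Bayes' decision rule: pick the action with the largest expected payoff.
expected = {
    action: sum(priors[state] * payoff for state, payoff in row.items())
    for action, row in payoffs.items()
}
best = max(expected, key=expected.get)

for action, value in expected.items():
    print(f"{action}: {value:.2f} million")
print("best:", best)
```

Under these assumptions the expected payoffs are 4.7 million with insurance and 4.3 million without it.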

Describe, in words, the best decision and the reason for that decision:

<font color = "blue"> *** 2 points -  answer in cell below *** (don't delete this cell) </font>

<font color = "green">
Always buy insurance. Artifacts turn up at roughly 30% of sites and cost about \$3 million in lost profit when they do, so paying \$0.5 million (one-sixth of that loss) on every project gives the higher expected payoff.
</font>

### **P3.4** - Posterior Probabilities
Reliable has been contacted by an archeological consulting firm. They assess sites and predict whether special artifacts are present. They have a pretty solid track record of being right when there are artifacts present - they get it right about 86% of the time. Their track record is less great when there are no artifacts - they're right about 72% of the time.

First find the posterior probabilities and provide evidence for how you got them (Silver Decisions screenshot or ?).

<font color = "blue"> *** 6 points -  answer in cell below *** (don't delete this cell) </font>

<img src='p34a.png'>
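
The posterior probabilities follow from Bayes' theorem and can be checked with a few lines of code. In this sketch the function name `posteriors` is hypothetical; `sensitivity` is the 86% hit rate when artifacts are present and `specificity` is the 72% hit rate when they are not.

```python
def posteriors(prior, sensitivity, specificity):
    """Return (P(positive), P(artifacts | positive), P(artifacts | negative))."""
    # Total probability of a positive prediction (true plus false positives).
    p_positive = prior * sensitivity + (1 - prior) * (1 - specificity)
    p_negative = 1 - p_positive
    p_art_given_pos = prior * sensitivity / p_positive
    p_art_given_neg = prior * (1 - sensitivity) / p_negative
    return p_positive, p_art_given_pos, p_art_given_neg


p_pos, p_art_pos, p_art_neg = posteriors(prior=0.30, sensitivity=0.86, specificity=0.72)
print(f"P(positive)             = {p_pos:.3f}")
print(f"P(artifacts | positive) = {p_art_pos:.3f}")
print(f"P(artifacts | negative) = {p_art_neg:.3f}")
```

With the stated inputs this gives P(positive) = 0.454, P(artifacts | positive) of about 0.568, and P(artifacts | negative) of about 0.077.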

The consulting fee for the site in question is \$50,000. 

Construct a decision tree to help Reliable decide if they should hire the consulting firm or not and if they should buy insurance or not.  Again, you should round the simulated profits to the nearest \$100,000 and use units of millions of dollars (e.g. 3.8 million dollars) in your decision tree.

Include a picture of the tree exported from Silver Decisions.

<font color = "blue"> *** 10 points -  answer in cell below *** (don't delete this cell) </font>

<img src='finaltree.png'>
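
The tree can be rolled back by hand or in code as a cross-check on the Silver Decisions output. The sketch below again assumes rounded profits of 5.2 and 2.2 million taken from the recorded simulation outputs. All names are illustrative, and the consulting fee is subtracted once at the root rather than at every leaf, which gives the same expected value.

```python
PRIOR = 0.30            # P(artifacts)
SENSITIVITY = 0.86      # P(positive | artifacts)
SPECIFICITY = 0.72      # P(negative | no artifacts)
PROFIT_CLEAR = 5.2      # $ millions, no artifacts (assumed rounding)
PROFIT_ARTIFACTS = 2.2  # $ millions, artifacts found (assumed rounding)
INSURANCE = 0.5         # $ millions
FEE = 0.05              # $ millions, consulting fee


def best_insurance_choice(p_artifacts):
    """Return (action, expected payoff) at an insurance decision node."""
    insured = PROFIT_CLEAR - INSURANCE
    uninsured = p_artifacts * PROFIT_ARTIFACTS + (1 - p_artifacts) * PROFIT_CLEAR
    if insured >= uninsured:
        return "buy insurance", insured
    return "no insurance", uninsured


# Posterior probabilities for each prediction.
p_pos = PRIOR * SENSITIVITY + (1 - PRIOR) * (1 - SPECIFICITY)
p_art_pos = PRIOR * SENSITIVITY / p_pos
p_art_neg = PRIOR * (1 - SENSITIVITY) / (1 - p_pos)

# Roll back: best action after each prediction, then the chance node, then the fee.
choice_pos, value_pos = best_insurance_choice(p_art_pos)
choice_neg, value_neg = best_insurance_choice(p_art_neg)
ev_consult = p_pos * value_pos + (1 - p_pos) * value_neg - FEE
choice_none, ev_no_consult = best_insurance_choice(PRIOR)

print(f"consult:    {ev_consult:.3f} (positive: {choice_pos}, negative: {choice_neg})")
print(f"no consult: {ev_no_consult:.3f} ({choice_none})")
```

Under these assumptions, hiring the firm is worth about 4.797 million in expectation versus 4.7 million without it.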

Summarize the optimal policy in words here:

<font color = "blue"> *** 2 points -  answer in cell below *** (don't delete this cell) </font>

<font color = "green">
Always hire the archeological consulting firm. If they predict that artifacts are present, buy insurance; if they predict none, don't.
</font>

### **P3.5** - Final Steps

How confident do you feel about the results of your decision analysis? If you were being paid to complete this analysis, what further steps might you take to increase your confidence in your results?

<font color = "blue"> *** 4 points -  answer in cell below *** (don't delete this cell) </font>

<font color = "green">
I'm fairly confident in these results. To improve the analysis, I'd perform a sensitivity analysis on the probability of finding artifacts to see where the decision changes. I'd also take a closer look at the building tasks to see whether one particular task is often the bottleneck for the entire project, and whether the schedule around it could be improved.
</font>
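
A sensitivity analysis on the probability of artifacts could start with a sweep like the one below. It is a sketch for the no-consultant decision only, and it assumes rounded profits of 5.2 and 2.2 million from the recorded simulation outputs. The function names are hypothetical.

```python
PROFIT_CLEAR = 5.2      # $ millions, no artifacts (assumed rounding)
PROFIT_ARTIFACTS = 2.2  # $ millions, artifacts found (assumed rounding)
INSURANCE_COST = 0.5    # $ millions


def expected_payoffs(p_artifacts):
    """Return (insured, uninsured) expected payoffs in $ millions."""
    insured = PROFIT_CLEAR - INSURANCE_COST
    uninsured = p_artifacts * PROFIT_ARTIFACTS + (1 - p_artifacts) * PROFIT_CLEAR
    return insured, uninsured


def preferred_action(p_artifacts):
    insured, uninsured = expected_payoffs(p_artifacts)
    return "buy insurance" if insured > uninsured else "no insurance"


# Insurance breaks even when its cost equals the expected loss from artifacts.
breakeven = INSURANCE_COST / (PROFIT_CLEAR - PROFIT_ARTIFACTS)
print(f"break-even P(artifacts) = {breakeven:.3f}")

for pct in range(0, 55, 5):
    p = pct / 100
    insured, uninsured = expected_payoffs(p)
    print(f"P={p:.2f}  insured={insured:.2f}  uninsured={uninsured:.2f}  {preferred_action(p)}")
```

With these inputs the decision flips at a prior of about 1/6, so the 30% estimate would have to be off by a wide margin before skipping insurance became the better choice.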