# Week 11 case study: optimization of a 2-factor system

In [None]:
from process_improve import *
from bokeh.plotting import output_notebook
output_notebook()

## Background and assumptions

We assume that other experiments have already been run, and that only these 2 factors remain as having an influence on the system.

**P** = price: baseline is 0.75 $/part

**T** = throughput: baseline is 325 parts per hour.

From prior knowledge and experience we know that the system is sensitive to throughput, and somewhat sensitive to price, in the region we have used in the past (around the baseline). So we cannot take big steps initially. We therefore start with a small design around the baseline:

1. Run another center point
2. Run a full factorial in 2 factors *in random order*, using a price range of P = [0.70, 0.80] and a throughput range of T = [300, 350].
3. Record the results of all 6 experiments.
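
The three steps above can be laid out with the standard library alone. This is a minimal sketch, not the `process_improve` API: the helper `to_coded` and the names `design`, `coded` and `run_order` are hypothetical, and the random seed is an arbitrary choice. It assumes the usual coding, where the coded value is the distance from the center divided by half the range.

```python
import itertools
import random

def to_coded(value, center, low, high):
    """Map a real-world value to coded units: -1 at `low`, +1 at `high`."""
    return (value - center) / ((high - low) / 2)

# Two center points, followed by the 2x2 factorial in standard order
# (price changes fastest), using the ranges given in the text.
prices = [0.70, 0.80]
throughputs = [300, 350]
factorial = [(p, t) for t, p in itertools.product(throughputs, prices)]
design = [(0.75, 325), (0.75, 325)] + factorial

# Coded values, rounded to hide floating-point noise.
coded = [(round(to_coded(p, 0.75, 0.70, 0.80), 6),
          round(to_coded(t, 325, 300, 350), 6)) for p, t in design]

# Randomize the run order; the seed is arbitrary and only makes it repeatable.
run_order = list(range(len(design)))
random.Random(11).shuffle(run_order)

for i in run_order:
    p, t = design[i]
    print(f"standard-order row {i + 1}: P = {p:.2f} $/part, "
          f"T = {t} parts/hour, coded = {coded[i]}")
```

The rows of `design` are in the same standard order as the vectors in the next cell; only the order in which they are run is shuffled.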

In [None]:
# Experiments were run in random order, but shown here in standard order
p1 = c(0.75, 0.75, 0.70, 0.80, 0.70, 0.80, center=0.75, range=[0.7, 0.8], name = "Price", units = '$/part')    
t1 = c( 325,  325,  300,  300,  350,  350, center=325,  range=[300, 350], name = 'Throughput', units = 'parts/hour')
P1 = p1.to_coded()
T1 = t1.to_coded()
gather(Price=p1, Throughput=t1)


In [None]:
# Gather all the data together
y1 = c(____, ____, ____, ____, ____, ____,
       name = "Response: profit per hour", units="$/hour")
expt1 = gather(P=P1, T=T1, y=y1, title="First experiment")
print(expt1)

In [None]:
# Linear model with both main effects and the P:T interaction
mod_base1 = lm("y ~ P * T", data=expt1)
summary(mod_base1)
contour_plot(mod_base1, "P", "T");

### Understand the model's performance

Does the model predict equally well everywhere?


In [None]:
# Predictions at the 6 design points, followed by the residuals (actual - predicted)
prediction_1 = predict(mod_base1, P=P1, T=T1)
print(prediction_1)
print(y1 - prediction_1)

The residuals show nonlinearity, especially in the direction of T (throughput).
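
One simple way to put a number on curvature uses the center points. For any surface of the form y = b0 + bP·P + bT·T + bPT·P·T, the average of the four corner values equals the value at the center, so a gap between the two averages that is large relative to the noise points to curvature. The sketch below is an illustration only: `curvature_estimate` and `assumed_surface` are hypothetical names, and the responses are synthetic values from an assumed function, not the case-study results.

```python
from statistics import mean

def curvature_estimate(y_factorial, y_center):
    """Center-point mean minus factorial-point mean (zero if no curvature)."""
    return mean(y_center) - mean(y_factorial)

# ASSUMED surface with curvature in T only; used purely to exercise the code.
def assumed_surface(P, T):
    return 400 + 10 * P + 5 * T - 20 * T**2

corners = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
y_factorial = [assumed_surface(P, T) for P, T in corners]
y_center = [assumed_surface(0, 0), assumed_surface(0, 0)]

gap = curvature_estimate(y_factorial, y_center)
print(f"center mean - factorial mean = {gap}")
```

With real data, the two center-point runs also give a rough idea of the noise level against which this gap can be judged.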

Let's use the model anyway to make a prediction, to verify the model's performance. The best next step seems to be at:

* P = ____ (coded values)
* T = 2 (coded values)

Predict what will happen first, to see how well the model works. 
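
It helps to see what the coded step means in real units first. With a center of 325 and a range of [300, 350], one coded unit of throughput is 25 parts per hour, so T = 2 corresponds to 325 + 2 × 25 = 375 parts per hour. A minimal sketch of that conversion, where the standalone function `to_realworld` is a hypothetical helper and not the library method of the same name:

```python
def to_realworld(coded, center, low, high):
    """Map a coded value back to real-world units (inverse of the coding)."""
    return center + coded * (high - low) / 2

# Throughput: center 325, range [300, 350], so one coded unit is 25 parts/hour.
t_next = to_realworld(2, center=325, low=300, high=350)
print(f"T = 2 (coded) -> {t_next} parts/hour")
```

A coded value of 2 lies outside the [-1, +1] range of the factorial, so this prediction is an extrapolation of the model.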

In [None]:
P2 = P1.extend([____])
T2 = T1.extend([2])
p2 = P2.to_realworld()
t2 = T2.to_realworld()
print(p2) 
print(t2) 
print(predict(mod_base1, P = P2, T = T2))

In [None]:
# Then run the experiment, and fill in the new response for the 7th experiment
y2 = y1.extend([____])

Predicted profit value = ____

Actual profit value = ____

This confirms that the system is nonlinear in the T (throughput) direction: the predictions do not match the measurements, and we see both over- and under-prediction along that direction.
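
To capture curvature, the design is extended with axial points. The value 1.41 used in the next cell is presumably $\sqrt{2}$, the axial distance that makes a 2-factor central composite design rotatable (the fourth root of the number of factorial runs). The sketch below computes that distance and the real-world settings it implies; the variable names are hypothetical and the centers and ranges are those given earlier.

```python
import math

k = 2                     # number of factors
alpha = (2 ** k) ** 0.25  # fourth root of the number of factorial runs
print(f"alpha = {alpha:.3f} (sqrt(2) = {math.sqrt(2):.3f})")

center = {"P": 0.75, "T": 325}
half_range = {"P": (0.80 - 0.70) / 2, "T": (350 - 300) / 2}

# Real-world settings of the axial points, using the rounded value 1.41.
axial = {}
for name in ("P", "T"):
    step = 1.41 * half_range[name]
    axial[name] = (center[name] - step, center[name] + step)
    print(f"{name}: {axial[name][0]:.4f} and {axial[name][1]:.4f}")
```

Each axial point moves one factor out to ±1.41 while the other stays at its center value, which is what lets the model estimate the squared terms.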

In [None]:
# Add axial points: 
P3 = P2.extend([-1.41, +1.41,     0,     0])
T3 = T2.extend([    0,     0, -1.41, +1.41])
p3 = P3.to_realworld()
t3 = T3.to_realworld()
gather(Price=p3, Throughput=t3)


In [None]:
# Then run the 4 experiments, and add the results here:
y3 = y2.extend([____, ____, ____, ____])

In [None]:
expt3 = gather(P=P3, T=T3, y=y3, title="Added the axial points; quadratic model")
print(expt3)

In [None]:
# Quadratic model: main effects, interaction, and squared terms in P and T
mod_base3 = lm("y ~ P * T + I(P**2) + I(T**2)", data=expt3)
summary(mod_base3)
contour_plot(mod_base3, "P", "T");

In [None]:
# Expand the contour plot to higher values of P
contour_plot(mod_base3, "P", "T", xlim=(-2, +6));

## Taking a step with the quadratic model

The quadratic term in factor T is strongly significant, while the quadratic term in P is not. The direction of improvement is along P, though. So take a step in P, but perhaps not as far as the model suggests.

In coded units:

* P = ____ (coded values)
* T = +1 (coded values)
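
When choosing how far to step in P, it can help to evaluate the fitted quadratic equation at a few candidate points. The sketch below writes out the model form `y ~ P * T + I(P**2) + I(T**2)` by hand. The coefficients are hypothetical placeholders, not the fitted values from this case study; replace them with the numbers reported in the model summary.

```python
def quadratic_prediction(b, P, T):
    """Evaluate y = b0 + bP*P + bT*T + bPT*P*T + bPP*P^2 + bTT*T^2."""
    return (b["b0"] + b["P"] * P + b["T"] * T + b["PT"] * P * T
            + b["PP"] * P**2 + b["TT"] * T**2)

# HYPOTHETICAL coefficients, chosen only to illustrate the calculation.
b = {"b0": 400.0, "P": 15.0, "T": 5.0, "PT": 2.0, "PP": -1.0, "TT": -20.0}

# Scan a few coded prices at T = +1 to see how the prediction changes.
scan = {P: quadratic_prediction(b, P, T=1) for P in [0, 1, 2, 3, 4]}
for P, y_hat in scan.items():
    print(f"P = {P:+d}, T = +1 -> predicted profit = {y_hat:.1f} $/hour")
```

Predictions far outside the region covered by the experiments should be treated with caution, which is one reason to take a shorter step than the model suggests.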

In [None]:
P4 = P3.extend([____])
T4 = T3.extend([+1])
p4 = P4.to_realworld()
t4 = T4.to_realworld()
print(f'Price per part = {p4[-1:].values} $/part')  # show the last in the vector
print(f'Throughput     = {t4[-1:].values} parts/hour')

In [None]:
# Run the experiment and add the actual value here:
y4 = y3.extend([____])

Predicted value: ____ $/hour profit

Actual value: ____  $/hour profit

The predicted value is off by ____ $/hour profit. Update the model with the new value.

In [None]:
expt4 = gather(P=P4, T=T4, y=y4, title="Add the first step out into the quadratic model space")
mod_base4 = lm("y ~ P * T + I(P**2) + I(T**2)", data=expt4)
summary(mod_base4)
contour_plot(mod_base4, "P", "T", xlim=(-2, 10));

The model's $R^2$ value and standard error look good. The coefficients involving factor P have also changed. We can probably take the next step with confidence:

In coded units:
* P = ____ (coded values)
* T = ____ (coded values)

In [None]:
P5 = P4.extend([+7.1])
T5 = T4.extend([+1])
p5 = P5.to_realworld()
t5 = T5.to_realworld()
print(f'Price per part = {p5[-1:].values} $/part')  # show the last in the vector
print(f'Throughput     = {t5[-1:].values} parts/hour')

predictions = predict(mod_base4, P=P5, T=T5)
print(f'Prediction     = {predictions[-1:].values} $ per hour')

In [None]:
# Run the experiment and add the actual value here:
y5 = y4.extend([____])

Predicted value: ____ $/hour profit

Actual value: ____ $/hour profit

This looks satisfactory. Stop here, or confirm by fitting an updated quadratic model around this point.