## Full Factorial Design

The basic idea is this: for each factor that might matter, we should pick initial points that span the widest range. (We can come back later for further optimizations in smaller ranges if needed.)

If we want to capture nonlinear / quadratic behavior, it's often useful to add a few center points.

<img src="./Figures/factorial.png" width="450" />

This full factorial design is often called a $2^k$ design: with 2 levels for each of $k$ factors, the number of experiments grows exponentially with the number of factors.

Despite that cost, it's useful, because every combination of levels gets tested.

Consider the *statistical power* of our design.

Can we detect an improvement of 0.5?

<img src="./Figures/power-small.png" width="450" />

Probably not. What if the effect were larger, say a difference of 2.0? (This is similar to our t-test and ANOVA questions: how much separation do we need to detect a difference?)

<img src="./Figures/power-large.png" width="450" />

If we want to detect smaller differences, maybe we need better equipment (or expert taste testers) with less variation:

<img src="./Figures/power-expert.png" width="450" />

Or, if we don't want to do that, we can repeat the measurements, because the signal-to-noise ratio of an average improves as you repeat (i.e., the standard error goes down as $1/\sqrt{n}$):

<img src="./Figures/power-repeat.png" width="450" />
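The four scenarios illustrated above can be approximated numerically. The following is a minimal sketch, not the calculation used to draw the figures: it assumes a two-sided z-test comparing two group means with a known, common standard deviation. The function name `approx_power` and the values chosen for `sigma` and `n` are illustrative assumptions, not numbers from the study.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in two means.

    delta: true difference between the group means
    sigma: standard deviation of a single measurement (assumed known)
    n:     number of measurements in each group
    """
    std_normal = NormalDist()
    standard_error = sigma * sqrt(2.0 / n)        # SE of the difference of two means
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2)  # two-sided critical value
    shift = delta / standard_error                # signal-to-noise ratio
    # Probability that the test statistic lands in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Illustrative (assumed) scenarios mirroring the four figures
scenarios = {
    "small effect": approx_power(delta=0.5, sigma=1.0, n=3),
    "large effect": approx_power(delta=2.0, sigma=1.0, n=3),
    "less noise": approx_power(delta=0.5, sigma=0.25, n=3),
    "more repeats": approx_power(delta=0.5, sigma=1.0, n=48),
}
for name, power in scenarios.items():
    print(f"{name:>12}: power = {power:.2f}")
```

Because the standard error scales as $\sigma/\sqrt{n}$, cutting the noise by a factor of 4 buys the same power as 16 times as many repeats, which is why the last two scenarios give the same answer.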

### Replications in Full Factorial Design

Let's say we want to test two levels of two different variables, like in our catalyst concentration and temperature example.

|Concentration (mM)|Temperature (°C)|
| --- | --- |
| 0.5|30|
| 0.5|60|
| 3.0|30|
| 3.0|60|

Our multiple-variable regression would look something like this:

$$\text{yield} = \beta_0 + \beta_1 \text{concentration} + \beta_2 \text{Temperature}$$

We do 4 experiments, and the regression uses 3 degrees of freedom (one for each of the $\beta$ coefficients), leaving only 1 for estimating the error.

If we did a three-factor study, we'd do 8 experiments:

|Concentration (mM)|Temperature (°C)|Time (hours)|
| --- | --- | -- |
| 0.5|30|8|
| 0.5|30|24|
| 0.5|60|8|
| 0.5|60|24|
| 3.0|30|8|
| 3.0|30|24|
| 3.0|60|8|
| 3.0|60|24|

Again, if we only consider the first-order effects:

$$\text{yield} = \beta_0 + \beta_1 \text{concentration} + \beta_2 \text{Temperature} + \beta_3 \text{Time}$$

We effectively get increased repetition because we're doing a bigger study: each factor's effect is now estimated from 4 runs at each level instead of 2. We can deduce the effect of each factor more reliably and reduce the influence of random noise.
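This "hidden replication" can be made concrete with a short sketch. It assumes coded levels (-1 = low, +1 = high), independent runs with the same noise level `sigma`, and a main effect defined as the mean response at the high level minus the mean response at the low level. The helper name `effect_standard_error` is hypothetical.

```python
from itertools import product
from math import sqrt


def effect_standard_error(n_factors, sigma=1.0):
    """Standard error of a main effect in an unreplicated 2^k full factorial.

    A main effect is mean(response at high level) - mean(response at low level).
    sigma is the (assumed) standard deviation of a single run.
    Returns (number of runs, runs per level, standard error of the effect).
    """
    runs = list(product((-1, +1), repeat=n_factors))  # coded low/high levels
    n_high = sum(1 for run in runs if run[0] == +1)   # runs with the first factor high
    n_low = len(runs) - n_high
    return len(runs), n_high, sigma * sqrt(1.0 / n_high + 1.0 / n_low)


for k in (2, 3, 4):
    n_runs, per_level, se = effect_standard_error(k)
    print(f"2^{k}: {n_runs} runs, {per_level} per level, SE(effect) = {se:.3f} * sigma")
```

Every added factor doubles the number of runs behind each average, so the standard error of each main effect shrinks by a factor of $\sqrt{2}$ even though no single condition is repeated.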

## Better Office Coffee

This example is borrowed from [`dexpy`](https://hpanderson.github.io/dexpy-pymntos/#/5), a Python module for design of experiments.

Incidentally, Prof. Chris Hendon, a computational chemist at U. Oregon, has worked hard to make better coffee and has won barista awards:
- [Dr. Coffee](https://around.uoregon.edu/drcoffee)
- [Brewing a Great Cup](https://theconversation.com/brewing-a-great-cup-of-coffee-depends-on-chemistry-and-physics-84473)
- [Systematically Improving Espresso: Insights from Mathematical Modeling and Experiment](https://www.sciencedirect.com/science/article/pii/S2590238519304102)
- [Using Chemistry To Get The Perfect Cup Of Coffee](https://www.sciencefriday.com/segments/coffee-chemistry/)

**Why?**

Current office coffee is 👎 "disgusting and unacceptable" 

* What coffee beans to use? (Light vs. Dark roast)
* How much coffee to use?
* How to grind the coffee? (Burr vs. Blade grind, Grind size)
* How long to brew?

So that's *five* factors (grinding contributes two), or a $2^5$ design, potentially with some center points.

* Amount of Coffee (2.5 to 4.0 oz.) - continuous
* Grind size (8 to 10 mm) - continuous
* Brew time (3.5 to 4.5 minutes) - continuous
* Grind Type (burr vs blade)
* Coffee beans (light vs dark)

That's a lot of pots of coffee to taste. Even at 3 cups per day(!), an office study is limited to weekdays (so that everyone gets to taste).

So maybe 6 weeks of taste tests?

We can instead use a fractional factorial design. We'll miss out on third-order effects (light roast + long brew + a lot of coffee), but it seems okay to ignore those for now.

<img src="./Figures/fractional-factorial.png" width="350" />

Basically, we'll use **half** the points, so $2^{k-1}$ runs, and maybe add a few center points (e.g., 3.25 oz, 9 mm, 4.0 min) to make sure we capture any nonlinearity.

For a full factorial design, it's fairly easy to generate the table of things to try: every combination of levels.
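As a minimal sketch, the full table for the coffee study can be built with the standard library alone. The low/high levels come from the factor list above; the dictionary keys (`amount_oz`, `grind_size_mm`, and so on) are names chosen here for illustration.

```python
from itertools import product

# Low/high levels taken from the factor list above
levels = {
    "amount_oz": (2.5, 4.0),
    "grind_size_mm": (8, 10),
    "brew_time_min": (3.5, 4.5),
    "grind_type": ("burr", "blade"),
    "beans": ("light", "dark"),
}

# Every combination of one level per factor: 2^5 = 32 runs
full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]

print(len(full_factorial))
for run in full_factorial[:4]:
    print(run)
```

`itertools.product` varies the last factor fastest, so the rows come out in a systematic order rather than a randomized one.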

For fractional factorial design, it's often best to either consult pre-built tables or use software that will generate the combinations.

Then, ideally, randomize the order of the list so that time-dependent effects aren't confounded with the factors (e.g., maybe your first pot of coffee in the morning tastes better because you're tired and craving caffeine?)
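For illustration, here is one way a half fraction with center points and a randomized run order could be generated by hand. This is a sketch under stated assumptions, not necessarily the design `dexpy` produces: it uses coded units, assumes the generator E = ABCD (the fifth factor is set to the product of the other four), and assumes one center point for each combination of the two categorical factors.

```python
import random
from itertools import product

factors = ["amount", "grind_size", "brew_time", "grind_type", "beans"]

# Half fraction in coded units (-1 = low, +1 = high): vary the first four
# factors freely and set the fifth to their product (assumed generator E = ABCD).
half_fraction = []
for a, b, c, d in product((-1, +1), repeat=4):
    half_fraction.append(dict(zip(factors, (a, b, c, d, a * b * c * d))))

# Center points: continuous factors at 0 (their midpoint). The two categorical
# factors have no midpoint, so each of their four combinations gets one point.
center_points = [
    dict(zip(factors, (0, 0, 0, d, e))) for d, e in product((-1, +1), repeat=2)
]

run_order = half_fraction + center_points
random.seed(0)             # fixed seed only so the order is reproducible
random.shuffle(run_order)  # randomize to guard against time-of-day effects

print(len(half_fraction), len(center_points), len(run_order))
```

With this generator, each main effect is aliased only with a four-factor interaction, and each two-factor interaction only with a three-factor interaction, which is why ignoring third-order effects is the price of halving the number of runs.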

### Example Results

```
                   Results: Ordinary least squares
=====================================================================
Model:                OLS               Adj. R-squared:      0.746   
Dependent Variable:   taste_rating      AIC:                 79.5691 
Date:                 2016-11-10 19:52  BIC:                 90.1715 
No. Observations:     24                Log-Likelihood:      -30.785 
Df Model:             8                 F-statistic:         9.438   
Df Residuals:         15                Prob (F-statistic):  0.000123
R-squared:            0.834             Scale:               1.2184  
---------------------------------------------------------------------
                       Coef.  Std.Err.    t    P>|t|   [0.025  0.975]
---------------------------------------------------------------------
Intercept              5.0318   0.2253 22.3328 0.0000  4.5516  5.5121
amount                 0.9731   0.2759  3.5266 0.0031  0.3850  1.5613
grind_size             0.0022   0.2759  0.0078 0.9939 -0.5860  0.5903
brew_time              1.2061   0.2759  4.3709 0.0005  0.6180  1.7943
grind_type            -0.0974   0.2253 -0.4324 0.6716 -0.5777  0.3828
beans                  0.5774   0.2253  2.5628 0.0216  0.0972  1.0577
amount:beans          -1.4820   0.2759 -5.3707 0.0001 -2.0702 -0.8939
grind_size:brew_time   0.3961   0.2759  1.4354 0.1717 -0.1921  0.9843
grind_size:grind_type -0.6927   0.2759 -2.5103 0.0240 -1.2809 -0.1046
---------------------------------------------------------------------
Omnibus:               4.208          Durbin-Watson:            2.190
Prob(Omnibus):         0.122          Jarque-Bera (JB):         1.550
Skew:                  -0.116         Prob(JB):                 0.461
Kurtosis:              1.777          Condition No.:            1    
=====================================================================
```

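Notice that grind size basically has no main effect in this study ($p = 0.99$). Nor does grind type ($p = 0.67$), although the `grind_size:grind_type` interaction is significant ($p = 0.024$), so the two matter in combination. (I am skeptical of the weak main effects. I prefer a burr grinder because it produces more even grounds.)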

Several other effects seem important:
- Increased amount generally improves quality (e.g., with less coffee the brew was too weak)
- Brew time was important (e.g., people rushed to get their coffee, so it came out too weak)

What do interactions look like?

<img src="./Figures/bean-interaction.png" width="450" />

It seems that a lot of dark roast comes out too bitter, while a small amount of light roast tastes weak.
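The interaction can also be read directly off the fitted coefficients. The sketch below plugs the values from the regression table into the model at the four corners of amount and beans. It assumes the factors are in coded units and that -1 means light roast and +1 means dark roast; that coding is not stated in the output, although it is consistent with the interpretation above.

```python
# Coefficients copied from the regression table above (coded units)
coef = {
    "Intercept": 5.0318,
    "amount": 0.9731,
    "beans": 0.5774,
    "amount:beans": -1.4820,
}


def predicted_rating(amount, beans):
    """Predicted taste rating with all other coded factors set to 0.

    Setting a coded factor to 0 puts continuous factors at their midpoint and
    averages over the two levels of a categorical factor.
    """
    return (
        coef["Intercept"]
        + coef["amount"] * amount
        + coef["beans"] * beans
        + coef["amount:beans"] * amount * beans
    )


# Assumed coding: beans -1 = light, +1 = dark; amount -1 = 2.5 oz, +1 = 4.0 oz
for beans_label, beans in (("light", -1), ("dark", +1)):
    for amount_label, amount in (("2.5 oz", -1), ("4.0 oz", +1)):
        print(f"{beans_label:>5} roast, {amount_label}: {predicted_rating(amount, beans):.2f}")
```

Under that coding, adding more coffee raises the predicted rating sharply for light roast but lowers it for dark roast, which is the crossing pattern an interaction plot shows.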

Maybe a follow-up experiment can use medium roast...

**The key point is that by systematic design of experiments, we could improve our coffee.**