# Design of Experiments
Prepared by Ric Alindayu (Chromewell Innovative Solutions, Inc.)

Design of Experiments (DOE) is a methodology for planning experiments so that as much information as possible is obtained from a limited amount of resources. It has found its niche in the manufacturing industry, where each experimental run becomes more expensive as production is scaled up.

To keep the discussion simple, we will use the pyDOE2 package in Python for the calculations.

### Install the pyDOE2 package in your Jupyter notebook

Install the package from within the notebook using the code below.

In [1]:
%pip install pyDOE2

Note: you may need to restart the kernel to use updated packages.


Next, import the following modules, which we will need for the exercise.

In [2]:
import pyDOE2 as doe # this gives us design matrices
import pandas as pd # tabular data (DataFrames), similar to a spreadsheet

### Problem: Two-Level Full Factorial Design

A $2^{3}$ factorial design was used to develop a nitride etch process on a single-wafer plasma etching tool. The design factors are the gap between the electrodes, the gas flow ($C_{2}F_{6}$ is used as the reactant gas), and the RF power applied to the cathode. Each factor is run at two levels, and the design is replicated twice.

The response variable is the etch rate for silicon nitride (Å/min).

The factor levels are provided below:

<img src="factortable.png">

#### Question 1: How many runs are required for a full factorial design of experiments with two levels and three factors?

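We can use `ff2n` from the pyDOE2 module to build the design matrix for us. Each row is one run and each column is one factor, coded as -1 (low level) and +1 (high level).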

In [3]:
dmat = doe.ff2n(3)
print(dmat)

[[-1. -1. -1.]
 [ 1. -1. -1.]
 [-1.  1. -1.]
 [ 1.  1. -1.]
 [-1. -1.  1.]
 [ 1. -1.  1.]
 [-1.  1.  1.]
 [ 1.  1.  1.]]
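The run count can also be checked without pyDOE2. The following is a minimal sketch using only the standard library; `full_factorial_2level` is a hypothetical helper (not part of pyDOE2) written to mimic the row order printed above, where the first factor changes fastest.

```python
from itertools import product

def full_factorial_2level(k):
    """Return the 2**k runs of a two-level full factorial as tuples of -1/+1.

    Hypothetical helper, not part of pyDOE2. Each tuple from product() is
    reversed so that the first factor changes fastest, as in the output above.
    """
    return [combo[::-1] for combo in product((-1, 1), repeat=k)]

runs = full_factorial_2level(3)
n_replicates = 2  # the problem statement replicates the design twice

print(len(runs))                 # number of distinct factor combinations
print(len(runs) * n_replicates)  # total number of experimental runs
```

The first number is the size of one replicate of the design; the second accounts for the replication described in the problem statement.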


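Let's make the array easier to read by naming the columns after the factors and labelling each row in standard notation: the lowercase letters of the factors set to their high level, with (1) for the run where every factor is at its low level.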

In [4]:
# Creating the design matrix for a 2-level, n-factor design

def des_mat(n,fact):
    design_matrix_raw = doe.ff2n(n)
    design_matrix = pd.DataFrame(design_matrix_raw)
    design_matrix.columns=fact
    row_matrix = []
    run_matrix = []
    lowercaps = []

    # generate the lowercase letter for each factor
    for i in range(len(design_matrix.columns)):
        lowercaps.append(design_matrix.columns[i].lower())
    # for each row, collect the letters of the factors set to +1
    for i in design_matrix.index:
        for j in range(len(design_matrix.columns)):
            if design_matrix.iloc[i,j] == 1.0:
                row_matrix.append(i)
                run_matrix.append(lowercaps[j])

    data = {'row': row_matrix, 'run': run_matrix}
    df = pd.DataFrame(data)
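    # join the letters that belong to the same row into one run label
    df = df.groupby(['row'])['run'].apply(''.join).reset_index()
    # DataFrame.append was removed in pandas 2.0; pd.concat adds the all-low run '(1)'
    df = pd.concat([df, pd.DataFrame([{'row': 0, 'run': '(1)'}])], ignore_index=True)
    df = df.sort_values('row').reset_index(drop=True)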
    return design_matrix.set_index([pd.Index(df['run'])])

In [5]:
dm = des_mat(3,['A','B','C'])
pd.DataFrame(dm)

run,A,B,C
(1),-1.0,-1.0,-1.0
a,1.0,-1.0,-1.0
b,-1.0,1.0,-1.0
ab,1.0,1.0,-1.0
c,-1.0,-1.0,1.0
ac,1.0,-1.0,1.0
bc,-1.0,1.0,1.0
abc,1.0,1.0,1.0
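The labelling rule itself is short enough to write directly. As an illustration only, the sketch below rebuilds the run labels from the coded levels with plain Python; `run_label` is a hypothetical helper, and the `design` rows are copied from the table above.

```python
def run_label(levels, factors):
    """Label one run by the lowercase letters of the factors set to +1."""
    label = ''.join(f.lower() for f, lvl in zip(factors, levels) if lvl == 1)
    return label or '(1)'  # '(1)' denotes the run with every factor at -1

factors = ['A', 'B', 'C']

# Coded levels copied from the design matrix above (first factor changes fastest)
design = [
    (-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
    (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1),
]

labels = [run_label(row, factors) for row in design]
print(labels)
```

The same rule extends to any number of factors, as long as `factors` has one name per column.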


Let's append the results of the two replicated experiments to the design matrix you generated.

In [6]:
run = list(dm.index)
# etch rates from the two replicates, keyed by run label
raw_results = pd.DataFrame(
    [[run[0],550,604],[run[1],669,650],[run[2],633,601],[run[3],642,635],
     [run[4],1037,1052],[run[5],749,868],[run[6],1075,1063],[run[7],729,860]],
    columns=['run','Replicate 1','Replicate 2'])
# 'run' names the index of dm and a column of raw_results; merge matches on that name
dm_merged = pd.merge(dm, raw_results, on='run')

pd.DataFrame(dm_merged)

,run,A,B,C,Replicate 1,Replicate 2
0,(1),-1.0,-1.0,-1.0,550,604
1,a,1.0,-1.0,-1.0,669,650
2,b,-1.0,1.0,-1.0,633,601
3,ab,1.0,1.0,-1.0,642,635
4,c,-1.0,-1.0,1.0,1037,1052
5,ac,1.0,-1.0,1.0,749,868
6,bc,-1.0,1.0,1.0,1075,1063
7,abc,1.0,1.0,1.0,729,860


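Next, we compute the total and the average of the two replicates, along with the levels of the two-factor and three-factor interactions. The level of an interaction in a given run is the product of the levels of the factors involved.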

In [7]:
total_results = dm_merged["Replicate 1"] + dm_merged["Replicate 2"]
dm_merged["Total Sum"] = total_results
dm_merged["Average"] = total_results/2
dm_merged["AB"] = dm_merged["A"]*dm_merged["B"]
dm_merged["AC"] = dm_merged["A"]*dm_merged["C"]
dm_merged["BC"] = dm_merged["B"]*dm_merged["C"]
dm_merged["ABC"] = dm_merged["A"]*dm_merged["B"]*dm_merged["C"]
pd.DataFrame(dm_merged)

,run,A,B,C,Replicate 1,Replicate 2,Total Sum,Average,AB,AC,BC,ABC
0,(1),-1.0,-1.0,-1.0,550,604,1154,577.0,1.0,1.0,1.0,-1.0
1,a,1.0,-1.0,-1.0,669,650,1319,659.5,-1.0,-1.0,1.0,1.0
2,b,-1.0,1.0,-1.0,633,601,1234,617.0,-1.0,1.0,-1.0,1.0
3,ab,1.0,1.0,-1.0,642,635,1277,638.5,1.0,-1.0,-1.0,-1.0
4,c,-1.0,-1.0,1.0,1037,1052,2089,1044.5,1.0,-1.0,-1.0,1.0
5,ac,1.0,-1.0,1.0,749,868,1617,808.5,-1.0,1.0,-1.0,-1.0
6,bc,-1.0,1.0,1.0,1075,1063,2138,1069.0,-1.0,-1.0,1.0,-1.0
7,abc,1.0,1.0,1.0,729,860,1589,794.5,1.0,1.0,1.0,1.0
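For designs with more factors, writing each interaction column by hand becomes tedious. The sketch below shows one possible way to generate them with the standard library; `interaction_columns` is a hypothetical helper, and the coded levels are copied from the table above.

```python
from itertools import combinations
from math import prod

factors = ['A', 'B', 'C']

def interaction_columns(levels, factors):
    """Map each interaction name (e.g. 'AB') to the product of its factor levels."""
    by_name = dict(zip(factors, levels))
    columns = {}
    # order 2 gives the two-factor interactions, order 3 the three-factor one, etc.
    for order in range(2, len(factors) + 1):
        for combo in combinations(factors, order):
            columns[''.join(combo)] = prod(by_name[f] for f in combo)
    return columns

# Coded levels of runs (1) and ac, copied from the table above
print(interaction_columns((-1, -1, -1), factors))
print(interaction_columns((1, -1, 1), factors))
```

The printed values can be compared with the AB, AC, BC, and ABC entries of the corresponding rows in the table.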


Next, let's estimate the effects of the main factors and their interactions, together with their sums of squares.

In [8]:
# effect of factors

effects = ['A','B','C','AB','AC','BC','ABC']
effects_num = []
sum_of_squares = []

for i in range(len(effects)):
    upper_level = []
    lower_level = []
    for j in dm_merged.index:
        if dm_merged[effects[i]][j] == 1:
            upper_level.append(dm_merged["Average"][j])
        elif dm_merged[effects[i]][j] == -1:
            lower_level.append(dm_merged["Average"][j])
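    # effect = (mean response at +1) - (mean response at -1); each level has 4 runs
    effects_num.append((sum(upper_level) - sum(lower_level))/4)
    # contrast of run totals = 2 x contrast of averages (2 replicates);
    # sum of squares = contrast**2 / (replicates * 2**3) = contrast**2 / 16
    sum_of_squares.append(((sum(upper_level) - sum(lower_level))*2)**2/16)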
answer = {'Factor': effects, 'Effect Estimate': effects_num, 'Sum of Squares': sum_of_squares}

pd.DataFrame(answer)

,Factor,Effect Estimate,Sum of Squares
0,A,-101.625,41310.5625
1,B,7.375,217.5625
2,C,306.125,374850.0625
3,AB,-24.875,2475.0625
4,AC,-153.625,94402.5625
5,BC,-2.125,18.0625
6,ABC,5.625,126.5625


A full analysis of variance also requires the error sum of squares, but the effect estimates above are already enough to see the direction and relative size of each main effect and interaction.
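For readers who want to take the analysis one step further, the error sum of squares can be obtained from the spread of the replicates within each run. The sketch below assumes the usual fixed-effects ANOVA for a replicated $2^{k}$ design, in which every effect has one degree of freedom; the data are copied from the tables above, and the variable names are illustrative.

```python
# Replicate observations copied from the results table above
replicates = {
    '(1)': (550, 604), 'a': (669, 650), 'b': (633, 601), 'ab': (642, 635),
    'c': (1037, 1052), 'ac': (749, 868), 'bc': (1075, 1063), 'abc': (729, 860),
}

# Error sum of squares: squared deviations of each observation from its run mean
ss_error = 0.0
for obs in replicates.values():
    run_mean = sum(obs) / len(obs)
    ss_error += sum((y - run_mean) ** 2 for y in obs)

n = 2                                 # replicates per run
df_error = len(replicates) * (n - 1)  # error degrees of freedom
ms_error = ss_error / df_error        # mean square error

# Sums of squares copied from the effects table above
ss_effects = {
    'A': 41310.5625, 'B': 217.5625, 'C': 374850.0625, 'AB': 2475.0625,
    'AC': 94402.5625, 'BC': 18.0625, 'ABC': 126.5625,
}

# Each effect has 1 degree of freedom, so its mean square equals its sum of squares
f_ratios = {name: ss / ms_error for name, ss in ss_effects.items()}

print(ss_error, df_error, ms_error)
for name, f in f_ratios.items():
    print(f"{name:>3}: F = {f:.2f}")
```

Each F ratio would then be compared with a critical value of the F distribution with 1 and `df_error` degrees of freedom to judge significance.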

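#### Question 2: Which factors have a positive effect on the response (the etch rate increases as the factor goes from its low to its high level)?
#### Question 3: Which factors have a negative effect on the response (the etch rate decreases as the factor goes from its low to its high level)?
#### Question 4: Which factors most strongly affect the etch rate?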
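One way to approach these questions is to look at the sign and the magnitude of each effect estimate. The sketch below sorts the estimates copied from the effects table above by absolute value. Note that magnitude alone is only a rough guide and is not a significance test.

```python
# Effect estimates copied from the effects table above
effects = {
    'A': -101.625, 'B': 7.375, 'C': 306.125, 'AB': -24.875,
    'AC': -153.625, 'BC': -2.125, 'ABC': 5.625,
}

# Largest absolute effect first
ranked = sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)

for name, estimate in ranked:
    sign = 'positive' if estimate > 0 else 'negative'
    print(f"{name:>3}: {estimate:9.3f}  ({sign})")
```

For a main effect, a positive sign means the average etch rate is higher at the high level of that factor; for an interaction, the sign describes how the effect of one factor changes with the level of the other.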

In [9]:
doe.ff2n(4)

array([[-1., -1., -1., -1.],
       [ 1., -1., -1., -1.],
       [-1.,  1., -1., -1.],
       [ 1.,  1., -1., -1.],
       [-1., -1.,  1., -1.],
       [ 1., -1.,  1., -1.],
       [-1.,  1.,  1., -1.],
       [ 1.,  1.,  1., -1.],
       [-1., -1., -1.,  1.],
       [ 1., -1., -1.,  1.],
       [-1.,  1., -1.,  1.],
       [ 1.,  1., -1.,  1.],
       [-1., -1.,  1.,  1.],
       [ 1., -1.,  1.,  1.],
       [-1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])