# Randomized Block Design

## Problem 1

### Air Traffic Controller Stress Test

- A study measuring the fatigue and stress of air traffic controllers resulted in proposals for modification and redesign of the controller’s work station.
- After consideration of several designs for the work station, three specific alternatives were selected as having the best potential for reducing controller stress.
- The key question is: To what extent do the three alternatives differ in terms of their effect on controller stress?
- In a completely randomized design, a random sample of controllers would be
assigned to each work station alternative.
- However, controllers are believed to differ substantially in their ability to
handle stressful situations.
- What is high stress to one controller might be only moderate or even low
stress to another.
- Hence, when considering the within-group source of variation (MSE), we must
realize that this variation includes both random error and error due to
individual controller differences.
- In fact, managers expected controller variability to be a major contributor to
the MSE term.

|              | System A | System B | System C |
|--------------|----------|----------|----------|
| Controller 1 | 15       | 15       | 18       |
| Controller 2 | 14       | 14       | 14       |
| Controller 3 | 10       | 11       | 15       |
| Controller 4 | 13       | 12       | 17       |
| Controller 5 | 16       | 13       | 16       |
| Controller 6 | 13       | 13       | 13       |
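
The argument in the bullets above, that the within-group variation mixes random error with controller-to-controller differences, can be checked by hand on this table. The following is a minimal NumPy sketch, not part of the original notebook; names such as `ss_blocks` and `ss_error_rbd` are illustrative. It splits the total sum of squares into treatment, block, and error parts.

```python
import numpy as np

# Rows are controllers (blocks); columns are systems A, B, C (treatments).
stress = np.array([
    [15, 15, 18],
    [14, 14, 14],
    [10, 11, 15],
    [13, 12, 17],
    [16, 13, 16],
    [13, 13, 13],
], dtype=float)

n_blocks, n_treatments = stress.shape
grand_mean = stress.mean()

# Total variation around the grand mean.
ss_total = ((stress - grand_mean) ** 2).sum()
# Variation of the system (column) means around the grand mean.
ss_treatments = n_blocks * ((stress.mean(axis=0) - grand_mean) ** 2).sum()
# Variation of the controller (row) means around the grand mean.
ss_blocks = n_treatments * ((stress.mean(axis=1) - grand_mean) ** 2).sum()

# Completely randomized design: everything that is not treatment counts as error.
ss_error_crd = ss_total - ss_treatments
# Randomized block design: controller differences are removed from the error.
ss_error_rbd = ss_total - ss_treatments - ss_blocks

print(ss_total, ss_treatments, ss_blocks, ss_error_crd, ss_error_rbd)
```

The two error sums of squares differ by exactly `ss_blocks`, which is the controller variability that the bullets expect to dominate the MSE term.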

In [44]:
import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

In [45]:
df = pd.DataFrame({'sysA':[15, 14, 10, 13, 16, 13],'sysB':[15, 14, 11, 12, 13, 13],'sysC':[18, 14, 15, 17, 16, 13]})

df

|   | sysA | sysB | sysC |
|---|------|------|------|
| 0 | 15   | 15   | 18   |
| 1 | 14   | 14   | 14   |
| 2 | 10   | 11   | 15   |
| 3 | 13   | 12   | 17   |
| 4 | 16   | 13   | 16   |
| 5 | 13   | 13   | 13   |


In [46]:
data = pd.melt(df.reset_index(), id_vars=['index'], value_vars=df.columns, var_name='treatments', value_name='value')
data

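|   | index | treatments | value |
|---|-------|------------|-------|
| 0 | 0     | sysA       | 15    |
| 1 | 1     | sysA       | 14    |
| 2 | 2     | sysA       | 10    |
| 3 | 3     | sysA       | 13    |
| 4 | 4     | sysA       | 16    |
| 5 | 5     | sysA       | 13    |
| 6 | 0     | sysB       | 15    |
| 7 | 1     | sysB       | 14    |
| 8 | 2     | sysB       | 11    |
| 9 | 3     | sysB       | 12    |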


In [47]:
model = ols('value~C(treatments)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=1)
anova_table

|               | df   | sum_sq | mean_sq  | F        | PR(>F)   |
|---------------|------|--------|----------|----------|----------|
| C(treatments) | 2.0  | 21.0   | 10.5     | 3.214286 | 0.068903 |
| Residual      | 15.0 | 49.0   | 3.266667 |          |          |


In [48]:
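# p = 0.069 is greater than alpha (0.05 assumed here), so we fail to reject the null hypothesis.
# The completely randomized analysis does not detect a difference among the three systems;
# this is not evidence that the systems have equal impact on stress levels.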
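
As a cross-check on the one-way result, the same F statistic can be obtained from `scipy.stats.f_oneway`, which the notebook makes available through the `stats` import but does not otherwise use. A minimal sketch, with the three samples retyped from the table (the names `sys_a`, `sys_b`, `sys_c` are illustrative):

```python
from scipy import stats

# Stress scores per work station design, one value per controller.
sys_a = [15, 14, 10, 13, 16, 13]
sys_b = [15, 14, 11, 12, 13, 13]
sys_c = [18, 14, 15, 17, 16, 13]

# One-way ANOVA that ignores which controller produced each score.
f_stat, p_value = stats.f_oneway(sys_a, sys_b, sys_c)
print(f"F = {f_stat:.4f}, p = {p_value:.4f}")
```

This should reproduce the F of about 3.21 and the p-value of about 0.069 reported in the `C(treatments)` row above.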

### Using RBD

In [49]:
df = pd.DataFrame({'sysA':[15, 14, 10, 13, 16, 13],'sysB':[15, 14, 11, 12, 13, 13],'sysC':[18, 14, 15, 17, 16, 13]})

df

|   | sysA | sysB | sysC |
|---|------|------|------|
| 0 | 15   | 15   | 18   |
| 1 | 14   | 14   | 14   |
| 2 | 10   | 11   | 15   |
| 3 | 13   | 12   | 17   |
| 4 | 16   | 13   | 16   |
| 5 | 13   | 13   | 13   |


In [50]:
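# var_name must be a scalar string; recent pandas versions reject a list here
data = pd.melt(df.reset_index(), id_vars=['index'], value_vars=df.columns, var_name='treatments', value_name='value')
# each original row index identifies one controller, i.e. one block
data = data.rename(columns={'index': 'blocks'})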


In [51]:
model = ols('value~ C(treatments) + C(blocks)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=1)
anova_table

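|               | df   | sum_sq | mean_sq | F        | PR(>F)   |
|---------------|------|--------|---------|----------|----------|
| C(treatments) | 2.0  | 21.0   | 10.5    | 5.526316 | 0.024181 |
| C(blocks)     | 5.0  | 30.0   | 6.0     | 3.157895 | 0.057399 |
| Residual      | 10.0 | 19.0   | 1.9     |          |          |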


In [52]:
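# p = 0.024 is less than alpha (0.05 assumed here), so we reject the null hypothesis:
# once controller differences are blocked out, the systems differ in mean stress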

### Conclusion
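- Finally, note that the conventional ANOVA table for this design provides an F value to
test for treatment effects but not for blocks. The statsmodels table above does print an
F value for `C(blocks)`, but it is not a test this study was designed to run.
- The reason is that the experiment was designed to test a single factor:
work station design.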
- The blocking based on individual stress differences was conducted to
remove such variation from the MSE term.
- However, the study was not designed to test specifically for individual
differences in stress.
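
To make the effect of blocking on the MSE term concrete, the treatment F test can be recomputed from the sums of squares in the two ANOVA tables above. This is a small illustrative sketch rather than part of the original analysis; the dictionary `error_terms` and the other names are assumptions introduced here.

```python
from scipy import stats

# Treatment line is identical in both analyses.
ss_treatments, df_treatments = 21.0, 2
ms_treatments = ss_treatments / df_treatments

# (error sum of squares, error degrees of freedom) from the two ANOVA tables.
error_terms = {
    "completely randomized": (49.0, 15),
    "randomized block": (19.0, 10),
}

results = {}
for design, (ss_error, df_error) in error_terms.items():
    mse = ss_error / df_error
    f_stat = ms_treatments / mse
    # Upper-tail probability of the F distribution gives the p-value.
    p_value = stats.f.sf(f_stat, df_treatments, df_error)
    results[design] = (mse, f_stat, p_value)
    print(f"{design}: MSE = {mse:.3f}, F = {f_stat:.3f}, p = {p_value:.4f}")
```

The numerator is the same in both cases; only the smaller MSE after blocking moves the F statistic from about 3.21 to about 5.53.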

## Problem 2: RBD


- An experiment was performed to determine the effect of four different
chemicals on the strength of a fabric.
- These chemicals are used as part of the permanent press finishing
process.
- Five fabric samples were selected, and a randomized complete block
design was run by testing each chemical type once in random order on
each fabric sample.
- The data are shown in Table.
- We will test for differences in means using an ANOVA with alpha = 0.01.
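
The phrase "each chemical type once in random order on each fabric sample" describes randomization within blocks. One way this could be sketched is shown below; the seed, the `run_order` dictionary, and the use of `numpy.random.default_rng` are assumptions for illustration, since the source does not say how the run order was actually generated.

```python
import numpy as np

chemicals = ["chem1", "chem2", "chem3", "chem4"]
fabric_samples = range(5)  # five blocks, labelled 0..4 as in the notebook

# The seed is arbitrary; it only makes the illustration reproducible.
rng = np.random.default_rng(seed=0)

# Draw an independent random ordering of all four chemicals for each fabric sample.
run_order = {
    sample: rng.permutation(chemicals).tolist()
    for sample in fabric_samples
}

for sample, order in run_order.items():
    print(sample, order)
```

Every block receives every treatment exactly once, which is what makes the design "complete"; only the order within a block is random.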

In [53]:
df = pd.DataFrame({'chem1':[1.3, 1.6, 0.5, 1.2, 1.1],'chem2':[2.2, 2.4, 0.4, 2.0, 1.8],
                   'chem3':[1.8, 1.7, 0.6, 1.5, 1.3],'chem4':[3.9, 4.4, 2.0, 4.1, 3.4],})
df

|   | chem1 | chem2 | chem3 | chem4 |
|---|-------|-------|-------|-------|
| 0 | 1.3   | 2.2   | 1.8   | 3.9   |
| 1 | 1.6   | 2.4   | 1.7   | 4.4   |
| 2 | 0.5   | 0.4   | 0.6   | 2.0   |
| 3 | 1.2   | 2.0   | 1.5   | 4.1   |
| 4 | 1.1   | 1.8   | 1.3   | 3.4   |


In [54]:
# var_name must be a scalar string; recent pandas versions reject a list here
data = pd.melt(df.reset_index(), id_vars=['index'], value_vars=df.columns, var_name='chemical_types', value_name='value')
data = data.rename(columns={'index':'fabric_samples'})
data

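|   | fabric_samples | chemical_types | value |
|---|----------------|----------------|-------|
| 0 | 0              | chem1          | 1.3   |
| 1 | 1              | chem1          | 1.6   |
| 2 | 2              | chem1          | 0.5   |
| 3 | 3              | chem1          | 1.2   |
| 4 | 4              | chem1          | 1.1   |
| 5 | 0              | chem2          | 2.2   |
| 6 | 1              | chem2          | 2.4   |
| 7 | 2              | chem2          | 0.4   |
| 8 | 3              | chem2          | 2.0   |
| 9 | 4              | chem2          | 1.8   |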


In [55]:
model = ols("value ~ C(chemical_types)", data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=1)
anova_table

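|                   | df   | sum_sq | mean_sq  | F         | PR(>F)   |
|-------------------|------|--------|----------|-----------|----------|
| C(chemical_types) | 3.0  | 18.044 | 6.014667 | 12.589569 | 0.000176 |
| Residual          | 16.0 | 7.644  | 0.47775  |           |          |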


In [56]:
# p = 0.000176 is less than alpha = 0.01, so the one-way ANOVA rejects
# the null hypothesis that the four chemicals give equal mean strength

### Using RBD

In [57]:
model = ols("value ~ C(chemical_types) + C(fabric_samples)", data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=1)
anova_table

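|                   | df   | sum_sq | mean_sq  | F         | PR(>F)       |
|-------------------|------|--------|----------|-----------|--------------|
| C(chemical_types) | 3.0  | 18.044 | 6.014667 | 75.894848 | 4.51831e-08  |
| C(fabric_samples) | 4.0  | 6.693  | 1.67325  | 21.113565 | 2.318913e-05 |
| Residual          | 12.0 | 0.951  | 0.07925  |           |              |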


In [58]:
# We still reject the null hypothesis (p = 4.5e-08 is less than alpha = 0.01), but
# the error mean square drops sharply, from 0.478 to 0.079, once fabric-sample variation is blocked out
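
The drop in error mean square can be traced directly in the two ANOVA tables for this problem. The sketch below only re-does the arithmetic on the reported numbers; the variable names are illustrative, and comparing against a critical value from `scipy.stats.f.ppf` is an added check, not something the original notebook does.

```python
from scipy import stats

alpha = 0.01

# Values copied from the two ANOVA tables for the fabric experiment.
sse_crd, df_crd = 7.644, 16        # residual without blocking
ss_blocks, df_blocks = 6.693, 4    # C(fabric_samples)
sse_rbd, df_rbd = 0.951, 12        # residual with blocking
f_treatments_rbd = 75.894848       # reported F for C(chemical_types) with blocking

# Blocking moves the fabric-sample sum of squares and its df out of the error term.
sse_check = ss_blocks + sse_rbd
df_check = df_blocks + df_rbd

mse_crd = sse_crd / df_crd
mse_rbd = sse_rbd / df_rbd
mse_reduction = 1 - mse_rbd / mse_crd

# Critical value of F(3, 12) at the stated significance level.
f_crit = stats.f.ppf(1 - alpha, 3, df_rbd)

print(f"SSE check: {sse_check:.3f} vs {sse_crd:.3f}; df check: {df_check} vs {df_crd}")
print(f"MSE: {mse_crd:.5f} -> {mse_rbd:.5f} ({mse_reduction:.1%} lower)")
print(f"F = {f_treatments_rbd:.2f}, critical value = {f_crit:.2f}")
```

The residual of the unblocked model is exactly the block sum of squares plus the blocked residual, so the gain in the F statistic comes entirely from removing fabric-sample variation from the error term.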