# Space Shuttle O-Ring Failures - An Observational Study

Is there a higher risk of O-ring failures at lower launch temperatures?

In [None]:
# standard library imports
from itertools import combinations

# 3rd party library imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pingouin as pg
import scipy.special
import seaborn as sns

sns.set()

In [None]:
df = pd.read_csv('case0401.csv')
df

# Robustness of Assumptions

In [None]:
df.groupby('Launch').describe()

In [None]:
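# Side-by-side histograms of incident counts for cool and warm launches
fig, axes = plt.subplots(ncols=2, sharex=True)
sns.histplot(data=df.query('Launch == "Cool"'), x='Incidents', discrete=True, ax=axes[0])
sns.histplot(data=df.query('Launch == "Warm"'), x='Incidents', discrete=True, ax=axes[1])
# Give the cool panel the same y-range as the warm panel so the bars are comparable
axes[0].set_ylim(axes[1].get_ylim())
axes[1].set_ylabel(None)
axes[0].set_xlabel('cool')
axes[1].set_xlabel('warm')
# The y-axis is shared visually, so the tick labels are only shown on the left panel
axes[1].set_yticklabels([])
fig.supxlabel('Incidents')
fig.tight_layout()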

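The incident counts are small integers, and the cool group contains only 4 of the 24 launches. The histograms give extremely strong evidence against the assumption of normality, so a two-sample $t$-test is not appropriate.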

The large number of ties in the warm sample makes the Mann-Whitney U test a poor choice as well.
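One rough way to quantify "a large number of ties" is the share of observations whose value also occurs elsewhere in the sample. The sketch below is only an illustration: `tied_fraction` is a hypothetical helper, not part of any library, and the sample values are made up rather than taken from the shuttle data.

```python
from collections import Counter


def tied_fraction(values):
    """Fraction of observations that share their value with at least one other."""
    counts = Counter(values)
    # An observation is tied when its value occurs more than once
    tied = sum(n for n in counts.values() if n > 1)
    return tied / len(values)


# Hypothetical counts for illustration only (NOT the shuttle data)
sample = [0, 0, 0, 0, 1, 2]
print(tied_fraction(sample))
```

In the notebook, something like `tied_fraction(df.query('Launch == "Warm"')['Incidents'])` could be applied to the warm launches, assuming the frame loaded above.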

A permutation test is used instead. It asks: if the distribution of incidents were the same for cool and warm launches, how probable would an outcome at least as extreme as the observed one be?

$$
\begin{aligned}
H_0: \text{Distribution}_{\text{cool}} &= \text{Distribution}_{\text{warm}} \\
H_a: \text{Distribution}_{\text{cool}} &\ne \text{Distribution}_{\text{warm}}
\end{aligned}
$$
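Under $H_0$, every way of labelling 4 of the 24 launches as "cool" is equally likely. Writing $y_i$ for the number of incidents on launch $i$ (notation introduced here for illustration), the $p$-value computed by the enumeration that follows can be sketched as the fraction of relabellings whose incident total is at least the observed total of 6:

$$
p = \frac{\#\left\{ S \subseteq \{1, \dots, 24\},\ |S| = 4 \;:\; \sum_{i \in S} y_i \ge 6 \right\}}{\binom{24}{4}},
\qquad \binom{24}{4} = 10626
$$

Because only totals at least as large as the observed one are counted, this is a one-sided $p$-value.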

In [None]:
total_combinations = scipy.special.comb(24, 4)

# Enumerate the combinations that are as bad as or worse than the observed one.
# The cool incidents are (1, 1, 1, 3), so the observed sum of incidents is 6.
num_bad_incidents = len([x for x in combinations(df['Incidents'], 4) if sum(x) >= 6])
p = num_bad_incidents / total_combinations
print(p)

There is strong evidence that the number of O-ring incidents is associated with launch temperature. Under the null hypothesis, an outcome at least as extreme as the one observed would be unlikely (one-sided $p$-value = 0.0099).
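The enumeration above hard-codes the sample sizes and the observed total. A minimal sketch of the same exact permutation test as a reusable function might look like the following; `exact_permutation_p_value` is a hypothetical helper name, it uses only the standard library, and the demonstration values are toy numbers rather than the shuttle data.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def exact_permutation_p_value(values, group_size, observed_sum):
    """One-sided exact p-value: the share of size-`group_size` subsets of
    `values` whose sum is at least `observed_sum`."""
    values = list(values)
    # combinations() picks by position, so repeated values count as distinct launches
    as_extreme = sum(
        1 for subset in combinations(values, group_size)
        if sum(subset) >= observed_sum
    )
    return Fraction(as_extreme, comb(len(values), group_size))


# Hypothetical toy data for illustration only (NOT the shuttle data)
group_a = [2, 3]
group_b = [0, 0, 1, 1]
p = exact_permutation_p_value(group_a + group_b, len(group_a), sum(group_a))
print(p, float(p))
```

With the frame loaded in this notebook, `float(exact_permutation_p_value(df['Incidents'], 4, 6))` should reproduce the value printed above. Returning a `Fraction` keeps the exact count of qualifying combinations visible before rounding.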