<div class="vtbegenerated"><p><b><span style="font-size: 14.0pt; color: #4472c4;">O-Ring Assignment</span></b></p> 
<ul> 
 <li>Read <a href="https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster">https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster</a></li> 
 <li>Go to <a href="https://archive.ics.uci.edu/ml/datasets/Challenger+USA+Space+Shuttle+O-Ring">https://archive.ics.uci.edu/ml/datasets/Challenger+USA+Space+Shuttle+O-Ring</a></li> 
 <li>Look at the main page and the Data Folder and Data Set Description page (links near top)</li> 
 <li>The <a href="https://archive.ics.uci.edu/ml/machine-learning-databases/space-shuttle/o-ring-erosion-or-blowby.data">o-ring-erosion-or-blowby.data</a> file is attached to the assignment as <strong><i>o-ring-erosion-or-blowby.csv</i></strong></li> 
 <li>“Blowby” means “leaking”</li> 
 <li>Load the file into a pandas DataFrame</li> 
 <li>Use <b>statsmodels</b> to do a multiple linear regression</li> 
 <li>How many O-rings does the model predict will show erosion or blowby when the temperature is 31 degrees F?&nbsp; (We don’t know how much pressure the rings will experience at liftoff so do predictions at 0, 50, 100 and 200 PSI to see what difference it makes.)</li> 
</ul> 
<br></div>

In [1]:
import pandas as pd
import numpy as np

# statsmodels.api is using a deprecated object, so it throws a warning.
import statsmodels.api as sm



In [2]:
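# The file has no header row, so supply column names and index the rows by flight order.
cols = ['O-Rings', 'Distressed', 'Temperature', 'Pressure', 'Flight Order']
df = pd.read_csv('o-ring-erosion-or-blowby.csv', names=cols, index_col='Flight Order')
df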

Flight Order,O-Rings,Distressed,Temperature,Pressure
1,6,0,66,50
2,6,1,70,50
3,6,0,69,50
4,6,0,68,50
5,6,0,67,50
6,6,0,72,50
7,6,0,73,100
8,6,0,70,100
9,6,1,57,200
10,6,1,63,200


How many O-rings does the model predict will show erosion or blowby (i.e. will be under distress) when the temperature is 31 degrees F?

Our predictors are Launch Temperature and Leak-check Pressure.

Our response is Distressed, the number of O-rings showing erosion or blowby on each flight.

Since we have more than one predictor, we need a multiple regression.

In [3]:
# Set up the multiple regression described above.
Y = df.Distressed                     # response
X = df[['Temperature', 'Pressure']]   # predictors
X = sm.add_constant(X)                # add the intercept column
model = sm.OLS(Y, X)
fitted = model.fit()
fitted.summary()

Dep. Variable:,Distressed,R-squared:,0.354
Model:,OLS,Adj. R-squared:,0.29
Method:,Least Squares,F-statistic:,5.49
Date:,"Mon, 27 Nov 2017",Prob (F-statistic):,0.0126
Time:,23:44:27,Log-Likelihood:,-17.408
No. Observations:,23,AIC:,40.82
Df Residuals:,20,BIC:,44.22
Df Model:,2,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
const,3.3298,1.188,2.803,0.011,0.851,5.808
Temperature,-0.0487,0.017,-2.910,0.009,-0.084,-0.014
Pressure,0.0029,0.002,1.699,0.105,-0.001,0.007

Omnibus:,19.324,Durbin-Watson:,2.39
Prob(Omnibus):,0.0,Jarque-Bera (JB):,23.471
Skew:,1.782,Prob(JB):,8e-06
Kurtosis:,6.433,Cond. No.,1840.0


In [4]:
# Output the coefficients calculated by the model
fitted.params

const          3.329831
Temperature   -0.048671
Pressure       0.002939
dtype: float64

According to the ordinary least squares model, our equation is:<br/>
O-rings under thermal distress = 3.3298 - 0.0487(Launch Temperature) + 0.0029(Pressure)

The purpose of this analysis is to predict the number of O-rings under thermal distress at a launch temperature of 31 degrees Fahrenheit and at different levels of leak-check pressure. Because the launch temperature is fixed, its term can be folded into the intercept, which simplifies the equation as follows:

Distressed O-rings at 31 degrees F = 1.8201 + 0.0029(Pressure)
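
The simplified intercept is just the temperature term evaluated at 31 and added to the constant. A quick check of that arithmetic, using the rounded coefficients from the equation above (the variable names are illustrative and not part of the original notebook):

```python
# Rounded coefficients from the fitted equation above
const = 3.3298
temp_coef = -0.0487
launch_temp = 31  # degrees F

# Fold the fixed temperature term into the intercept
intercept_at_31 = const + temp_coef * launch_temp
print(round(intercept_at_31, 4))  # 1.8201
```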

We can now solve the equation using different values for Pressure:

In [5]:
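# Leak-check pressures to evaluate, in PSI
pressure = np.array([0.0, 50.0, 100.0, 200.0])
# Predicted distressed O-rings at 31 degrees F, using the rounded coefficients
distressed = 1.8201 + 0.0029 * pressure
distressed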

array([ 1.8201,  1.9651,  2.1101,  2.4001])

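These results suggest that launching at 31 degrees F is very risky. Even at 0 PSI of leak-check pressure, the model predicts about 1.8 distressed O-rings from the cold launch temperature alone. At 50 PSI or more, the prediction rises to between roughly 2.0 and 2.4 distressed O-rings. Pressure matters far less than temperature here: its coefficient is small and not significant at the 5% level (p = 0.105), so the prediction changes by only about 0.6 O-rings across the full 0 to 200 PSI range.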
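
One way to put these predictions in context is to compare 31 degrees F with the launch temperatures the model was fitted on. The sketch below uses only the ten rows displayed earlier (the full file has 23 observations, so the true range may be wider); the names `shown_temps` and `query_temp` are illustrative.

```python
import numpy as np

# Launch temperatures (degrees F) from the ten rows displayed above only;
# the remaining observations in the file are not shown here.
shown_temps = np.array([66, 70, 69, 68, 67, 72, 73, 70, 57, 63])
query_temp = 31

lo, hi = shown_temps.min(), shown_temps.max()
is_extrapolation = not (lo <= query_temp <= hi)
print(f"Displayed range: {lo} to {hi} F; query outside range: {is_extrapolation}")
```

When the query temperature falls outside the fitted range, the prediction is an extrapolation, and the linear relationship may not hold that far from the data.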

In [6]:
# Double-check the hand calculation using the predict() function

# Set up one row per scenario with 3 columns, in the same order as X:
# 1. The constant (always 1)
# 2. A temperature value. The temperature we are solving for is 31 deg F
# 3. A pressure value. We test 0, 50, 100 and 200 PSI
new_x = [(1, 31, 0), (1, 31, 50), (1, 31, 100), (1, 31, 200)]
y_predict = fitted.predict(new_x)
y_predict

array([ 1.82102695,  1.96799318,  2.11495942,  2.40889188])

The predict() function returned values that agree with my hand calculation to about two decimal places. The hand calculation used coefficients rounded to four decimal places, while predict() uses them at full precision. This agreement gives me confidence that the equation was applied correctly.
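
As a further sketch, the hand calculation can be repeated with the six-decimal coefficients printed by `fitted.params`. They are typed in by hand here, so they are still slightly rounded, and the array names are illustrative. An OLS prediction is the dot product of each input row with the coefficient vector, so the result should land very close to the `predict()` output:

```python
import numpy as np

# Coefficients as printed by fitted.params: const, Temperature, Pressure
params = np.array([3.329831, -0.048671, 0.002939])

# Same rows passed to predict(): constant, 31 degrees F, pressure in PSI
new_x = np.array([(1, 31, 0), (1, 31, 50), (1, 31, 100), (1, 31, 200)])

# Each prediction is the dot product of a row with the coefficients
manual = new_x @ params
print(manual)
```

With these coefficients the values match the `predict()` output to within about 0.0001.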