# Predicting Failure of Space Shuttle Challenger

There were 24 shuttle launches before the fatal Challenger launch. On one of those flights the motors were lost at sea, so O-ring failure data is available for each of the six O-rings on the remaining 23 flights. Temperature and pressure readings for these launches are also available.

**Key Question:** What are the chances of a catastrophic O-ring failure if the Space Shuttle is launched at 31 degrees Fahrenheit?

In [1]:
import pandas as pd
import statsmodels.api as sm

In [2]:
orings = pd.read_csv('orings.csv')
orings.head(10)

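| | Flight | Date | Field | Temp | Pres |
|---|---|---|---|---|---|
| 0 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 1 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 2 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 3 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 4 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 5 | 1 | 4/12/1981 | 0.0 | 66 | 50 |
| 6 | 2 | 11/12/1981 | 1.0 | 70 | 50 |
| 7 | 2 | 11/12/1981 | 0.0 | 70 | 50 |
| 8 | 2 | 11/12/1981 | 0.0 | 70 | 50 |
| 9 | 2 | 11/12/1981 | 0.0 | 70 | 50 |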


### We have 5 variables
- Flight (launch code)
- Date
- Field (1 if the O-ring failed, 0 otherwise)
- Temp (temperature in degrees Fahrenheit)
- Pres (pressure)

In [3]:
orings.isnull().sum()

Flight    0
Date      0
Field     6
Temp      0
Pres      0
dtype: int64

In [4]:
orings.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 5 columns):
Flight    144 non-null object
Date      144 non-null object
Field     138 non-null float64
Temp      144 non-null int64
Pres      144 non-null int64
dtypes: float64(1), int64(2), object(2)
memory usage: 5.7+ KB


In [5]:
orings.Field.describe()

count    138.000000
mean       0.072464
std        0.260199
min        0.000000
25%        0.000000
50%        0.000000
75%        0.000000
max        1.000000
Name: Field, dtype: float64
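Because Field only takes the values 0 and 1, its mean is the share of O-ring observations with Field equal to 1. The short check below is a sketch that recovers the implied count from the two numbers printed above; the count itself is inferred here, not read directly from the data.

```python
# Values copied from the describe() output above.
n_obs = 138
mean_field = 0.072464

# For a 0/1 variable, mean * count recovers the number of ones.
n_ones = round(mean_field * n_obs)

print(n_ones)                     # 10
print(round(n_ones / n_obs, 4))   # 0.0725
```

So roughly 7% of the recorded O-ring observations have Field equal to 1, which makes this a rare-event classification problem.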

- Drop Flight and Date variables
- Remove NAs

In [12]:
# Drop the 6 rows with a missing Field value
orings = orings.dropna()
# Predictors: Temp and Pres; response: Field
x = orings.drop(['Field', 'Date', 'Flight'], axis=1)
y = orings.Field

In [13]:
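# Logistic regression of Field on Temp and Pres.
# x has no constant column, so no intercept is fit.
glm = sm.Logit(y, x)
glm_results = glm.fit()
glm_results.summary()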

Optimization terminated successfully.
         Current function value: 0.223882
         Iterations 8


| | | | |
|---|---|---|---|
| Dep. Variable: | Field | No. Observations: | 138.0 |
| Model: | Logit | Df Residuals: | 136.0 |
| Method: | MLE | Df Model: | 1.0 |
| Date: | Wed, 13 Mar 2019 | Pseudo R-squ.: | 0.1388 |
| Time: | 09:18:22 | Log-Likelihood: | -30.896 |
| converged: | True | LL-Null: | -35.875 |
| | | LLR p-value: | 0.001601 |
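The pseudo R-squared and the LLR p-value in this summary can be reproduced from the two log-likelihoods alone. The sketch below assumes the pseudo R-squared is McFadden's definition and that the likelihood-ratio test uses 1 degree of freedom (the Df Model value shown); for 1 degree of freedom the chi-square tail probability reduces to `erfc(sqrt(x / 2))`, so only the standard library is needed.

```python
import math

# Log-likelihoods copied from the summary table above.
ll_model = -30.896
ll_null = -35.875

# McFadden's pseudo R-squared: 1 - LL_model / LL_null.
pseudo_r2 = 1 - ll_model / ll_null

# Likelihood-ratio statistic: twice the gain in log-likelihood.
llr = 2 * (ll_model - ll_null)

# Chi-square survival function for 1 degree of freedom.
llr_pvalue = math.erfc(math.sqrt(llr / 2))

print(round(pseudo_r2, 4))   # 0.1388
print(round(llr, 3))         # 9.958
print(llr_pvalue)            # about 0.0016
```

The inputs are rounded to three decimals, so the p-value agrees with the reported one only to a few significant digits.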

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Temp | -0.0761 | 0.022 | -3.398 | 0.001 | -0.120 | -0.032 |
| Pres | 0.0147 | 0.008 | 1.921 | 0.055 | -0.000 | 0.030 |
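To connect these coefficients back to the key question, the fitted model can be evaluated by hand at 31 degrees Fahrenheit. The following is a minimal sketch, not a validated risk estimate: it plugs the rounded coefficients into the logistic function, uses no intercept because none was fit, and takes a pressure of 200 as an assumed illustrative input that does not come from this section. The helper name `failure_probability` is hypothetical.

```python
import math

# Coefficients copied from the table above (the fit has no intercept term).
coef_temp = -0.0761
coef_pres = 0.0147

def failure_probability(temp_f, pres):
    """Logistic prediction of P(Field = 1) for a single O-ring."""
    linear = coef_temp * temp_f + coef_pres * pres
    return 1 / (1 + math.exp(-linear))

# pres=200 is an assumed input for illustration only.
p_31 = failure_probability(31, 200)
p_66 = failure_probability(66, 200)

print(round(p_31, 3), round(p_66, 3))
```

The negative Temp coefficient means the predicted probability rises as temperature falls, so the value at 31 degrees is well above the value at 66 degrees. Note that 31 degrees may lie outside the temperatures in the data, in which case this is an extrapolation, and the result is a per-O-ring probability under this particular model.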
