Name of QuantLet: ISP_logisticRegression
Published in: An Introduction to Statistics with Python
Description: 'Logistic Regression. A logistic regression is an example of a <Generalized Linear Model (GLM)>. The input values are the recorded O-ring data from the space shuttle launches before 1986, and the fit indicates the likelihood of failure for an O-ring. Taken from http://www.brightstat.com/index.php?option=com_content&task=view&id=41&Itemid=1&limit=1&limitstart=2'
Keywords: plot, fitting, logistic regression
See also: ISP_ordinalLogisticRegression, ISP_bayesianStats
Author: Thomas Haslwanter
Submitted: October 31, 2015
Datafile: challenger_data.csv
Example: ChallengerPlain.png

'''Logistic Regression
A logistic regression is an example of a "Generalized Linear Model (GLM)".

The input values are the recorded O-ring data from the space shuttle launches before 1986,
and the fit indicates the likelihood of failure for an O-ring.

Taken from http://www.brightstat.com/index.php?option=com_content&task=view&id=41&Itemid=1&limit=1&limitstart=2
'''

# Copyright(c) 2015, Thomas Haslwanter. All rights reserved, under the BSD 3-Clause License

# Import standard packages
import numpy as np
import matplotlib.pyplot as plt
import os
import pandas as pd
import seaborn as sns

# additional packages
import sys
sys.path.append(os.path.join('..', '..', 'Utilities'))

try:
    # Import formatting commands if directory "Utilities" is available
    from ISP_mystyle import setFonts, showData

except ImportError:
    # Ensure correct performance otherwise
    def setFonts(*options):
        return

    def showData(*options):
        plt.show()
        return

from statsmodels.formula.api import glm
from statsmodels.genmod.families import Binomial

sns.set_context('poster')

def getData():
    '''Get the data '''

    inFile = 'challenger_data.csv'
    data = np.genfromtxt(inFile, skip_header=1, usecols=[1, 2],
                         missing_values='NA', delimiter=',')

    # Eliminate NaNs
    data = data[~np.isnan(data[:, 1])]

    return data


def prepareForFit(inData):
    ''' Make the temperature-values unique, and count the number of failures and successes.
    Returns a DataFrame'''

    # Create a dataframe, with suitable columns for the fit
    df = pd.DataFrame()
    df['temp'] = np.unique(inData[:, 0])
    df['failed'] = 0
    df['ok'] = 0
    df['total'] = 0
    df.index = df.temp.values

    # Count the number of starts and failures
    for ii in range(inData.shape[0]):
        curTemp = inData[ii, 0]
        curVal = inData[ii, 1]
        df.loc[curTemp, 'total'] += 1
        if curVal == 1:
            df.loc[curTemp, 'failed'] += 1
        else:
            df.loc[curTemp, 'ok'] += 1

    return df


def logistic(x, beta, alpha=0):
    ''' Logistic Function '''
    return 1.0 / (1.0 + np.exp(np.dot(beta, x) + alpha))


def showResults(challenger_data, model):
    ''' Show the original data, and the resulting logit-fit'''

    temperature = challenger_data[:, 0]
    failures = challenger_data[:, 1]

    # First plot the original data
    plt.figure()
    setFonts()
    sns.set_style('darkgrid')
    np.set_printoptions(precision=3, suppress=True)

    plt.scatter(temperature, failures, s=200, color="k", alpha=0.5)
    plt.yticks([0, 1])
    plt.ylabel("Damage Incident?")
    plt.xlabel("Outside Temperature [F]")
    plt.title("Defects of the Space Shuttle O-Rings vs temperature")
    plt.tight_layout()

    # Plot the fit
    x = np.arange(50, 85)
    # model.params is a pandas Series: select intercept and slope by position
    alpha = model.params.iloc[0]
    beta = model.params.iloc[1]
    y = logistic(x, beta, alpha)

    plt.plot(x, y, 'r')
    plt.xlim([50, 85])

    outFile = 'ChallengerPlain.png'
    showData(outFile)


if __name__ == '__main__':
    inData = getData()
    dfFit = prepareForFit(inData)

    # fit the model

    # --- >>> START stats <<< ---
    model = glm('ok + failed ~ temp', data=dfFit, family=Binomial()).fit()
    # --- >>> STOP stats <<< ---

    print(model.summary())

    showResults(inData, model)
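
A note on the sign in `logistic`: with the formula `'ok + failed ~ temp'` and a `Binomial` family, statsmodels treats the first column (`ok`) as the success count, so the fitted coefficients describe the probability of an undamaged launch. The plotted curve `1 / (1 + exp(alpha + beta*x))` is then the complementary probability, i.e. the probability of failure. The following is a minimal sketch of that relationship, using only the standard library. The function names `p_ok` and `p_failure` and the coefficient values are hypothetical placeholders chosen for illustration; they are not the fitted results of this script.

```python
import math


def p_ok(temp, alpha, beta):
    """P(ok) under a logit link: 1 / (1 + exp(-(alpha + beta*temp)))."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * temp)))


def p_failure(temp, alpha, beta):
    """P(failed) = 1 - P(ok) = 1 / (1 + exp(alpha + beta*temp))."""
    return 1.0 / (1.0 + math.exp(alpha + beta * temp))


# Hypothetical placeholder coefficients for illustration only,
# NOT the values obtained from fitting challenger_data.csv.
alpha, beta = -15.0, 0.25

for temp in (50.0, 60.0, 70.0, 80.0):
    # Print the temperature and the corresponding failure probability
    print(temp, p_failure(temp, alpha, beta))
```

With a positive slope for `ok`, the failure probability falls as the temperature rises, which is the shape of the red curve drawn by `showResults`.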