**_Section 10.0:_** Load packages

In [None]:
import pandas as pd
import sklearn.linear_model as lm
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import dummy, metrics

%matplotlib inline

### _Section 10.1_
```diff
+ The following section serves as a brief refresher regarding precision, recall, and how the two interact.
```
### Cost Benefit Questions

1. How would you rephrase the business problem if your model were optimizing toward _precision_? That is, how might the model behave differently, and what effect would that have?
2. How would you rephrase the business problem if your model were optimizing toward _recall_?
3. What would the ideal model look like in this case?
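
To make the trade-off in these questions concrete, here is a minimal sketch that scores the same predictions at two decision thresholds. The labels, scores, and the helper `precision_recall` are hypothetical and exist only for illustration; they are not taken from the flight data.

```python
# Hypothetical toy data: 1 = flight was delayed, 0 = on time.
actual = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
# Hypothetical predicted probabilities of delay, one per flight.
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.35, 0.5, 0.05]


def precision_recall(actual, scores, threshold):
    """Return (precision, recall) when flights scoring >= threshold are flagged."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# A strict threshold flags few flights; a lenient one flags many.
strict = precision_recall(actual, scores, threshold=0.65)
lenient = precision_recall(actual, scores, threshold=0.30)
print("strict  (0.65): precision=%.2f recall=%.2f" % strict)
print("lenient (0.30): precision=%.2f recall=%.2f" % lenient)
```

On this toy data the strict threshold never raises a false alarm but misses half of the delays, while the lenient threshold catches every delay at the cost of several false alarms. Which side to favor depends on whether a missed delay or a false alarm is more expensive for the business.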


### _Section 10.2_
```diff
+ The following section provides an opportunity for the student to explain the effect of particular variables on our model by plotting the values predicted by our model against the possible range of values for our variable-of-interest.
```
### Visualizing models over variables

In [None]:
df = pd.read_csv('./dataset/flight_delays.csv')
df = df.loc[df.DEP_DEL15.notnull()].copy()

In [None]:
df.head()

In [None]:
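# Keep only rows with a known label (a harmless repeat if already filtered above).
df = df[df.DEP_DEL15.notnull()]
# One-hot encode the categorical columns so the model can use them.
df = df.join(pd.get_dummies(df['CARRIER'], prefix='carrier'))
df = df.join(pd.get_dummies(df['DAY_OF_WEEK'], prefix='dow'))
model = lm.LogisticRegression()
# Start the feature list with the day-of-week dummy columns.
features = [i for i in df.columns if 'dow_' in i]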

In [None]:
df.shape

In [None]:
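features += ['CRS_DEP_TIME']
# features[1:] skips the first dummy column, which becomes the reference category.
model.fit(df[features[1:]], df['DEP_DEL15'])

# Row 1 of the transposed output is the probability of the positive class (delayed).
df['probability'] = model.predict_proba(df[features[1:]]).T[1]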

In [None]:
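ax = plt.subplot(111)
# One color per day-of-week dummy (seven, assuming DAY_OF_WEEK takes seven values).
# zip stops at the shorter list, so CRS_DEP_TIME is never treated as a day column.
colors = ['blue', 'green', 'red', 'purple', 'orange', 'brown', 'pink']
dow_columns = [f for f in features if f.startswith('dow_')]
for col, c in zip(dow_columns, colors):
    df[df[col] == 1].plot(x='CRS_DEP_TIME', y='probability', kind='scatter',
                          color=c, label=col, ax=ax)

ax.set(title='Probability of Delay\n Based on Day of Week and Time of Day')

plt.show()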
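
Each colored band in the scatter can be read as the same logistic curve over `CRS_DEP_TIME`, shifted up or down by that day's coefficient. The sketch below reproduces that shape by hand. The intercept and coefficients are made-up values chosen for illustration, not the fitted ones, and `delay_probability` is a hypothetical helper.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted values).
INTERCEPT = -2.0
TIME_COEF = 0.0008                           # per unit of CRS_DEP_TIME
DAY_COEF = {"dow_2": 0.10, "dow_5": 0.35}    # shift relative to the reference day


def delay_probability(dep_time, day=None):
    """Logistic-regression probability for one flight.

    `day` is a dummy-column name, or None for the reference day whose
    dummy column was left out of the model.
    """
    z = INTERCEPT + TIME_COEF * dep_time + DAY_COEF.get(day, 0.0)
    return 1.0 / (1.0 + math.exp(-z))


# Probability at three departure times for the reference day and two other days.
for dep_time in (600, 1200, 1800):
    row = [delay_probability(dep_time, d) for d in (None, "dow_2", "dow_5")]
    print(dep_time, ["%.3f" % p for p in row])
```

With these assumed values, the probability rises with departure time for every day, and a day with a larger coefficient sits above the reference day at every time. Reading the real plot works the same way: the vertical gap between two bands reflects the difference in their day-of-week coefficients.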

### Other Answers: visualizing Airline or the inverse

In [None]:
features = [i for i in df.columns if 'carrier_' in i]
features += ['CRS_DEP_TIME']
#...

### _Section 10.3_
```diff
+ The following section provides an opportunity for the student to try visualizing the effect of our model against a baseline, which provides a 'standard' for us to compare our results against, and helps us track the progress of our results.
```
### Visualizing Performance Against Baseline
#### Visualizing AUC and comparing Models

In [None]:
# Baseline: a classifier that ignores the features entirely.
model0 = dummy.DummyClassifier()
model0.fit(df[features[1:]], df['DEP_DEL15'])
df['probability_0'] = model0.predict_proba(df[features[1:]]).T[1]

# Logistic regression on the same features, for comparison.
model1 = lm.LogisticRegression()
model1.fit(df[features[1:]], df['DEP_DEL15'])
df['probability_1'] = model1.predict_proba(df[features[1:]]).T[1]

In [None]:
df.shape

In [None]:
ax = plt.subplot(111)
# roc_curve returns (false positive rate, true positive rate, thresholds).
vals = metrics.roc_curve(df.DEP_DEL15, df.probability_0)
ax.plot(vals[0], vals[1], label='baseline (dummy)')
vals = metrics.roc_curve(df.DEP_DEL15, df.probability_1)
ax.plot(vals[0], vals[1], label='logistic regression')

ax.set(title='ROC Curve for prediction delayed=1',
       xlabel='false positive rate', ylabel='true positive rate',
       xlim=(0, 1), ylim=(0, 1))
ax.legend()
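
The area under the ROC curve can be read as the share of (delayed, on-time) pairs that the model ranks in the right order. The sketch below computes it by hand on hypothetical labels and scores, assuming a baseline that gives every flight the same score; whether `DummyClassifier` behaves that way depends on its strategy. The helper `roc_auc` is illustrative, and in the notebook `metrics.roc_auc_score` would be the usual way to get this number.

```python
def roc_auc(actual, scores):
    """AUC as the share of (delayed, on-time) pairs ranked correctly.

    A tie between a positive and a negative score counts as half.
    """
    positives = [s for a, s in zip(actual, scores) if a == 1]
    negatives = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores, for illustration only.
actual = [1, 0, 1, 0, 0, 1, 0, 0]
constant_baseline = [0.375] * len(actual)    # same score for every flight
informative_model = [0.8, 0.3, 0.6, 0.4, 0.1, 0.7, 0.65, 0.2]

print("baseline AUC:", roc_auc(actual, constant_baseline))
print("model AUC:   ", roc_auc(actual, informative_model))
```

A constant score cannot rank any pair, so every pair is a tie and the AUC is 0.5, which corresponds to the diagonal on the ROC plot. A model is only useful to the extent that its curve, and therefore its AUC, rises above that baseline.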

#### Visualizing Precision / Recall