<h1>Foundations for the ROC Curve</h1>

<h3>Logistic Regression Threshold</h3>

<p>A logistic regression model offers an easy way to shift between emphasizing precision and emphasizing recall. This is because the model doesn’t just return a class prediction; it returns a probability value between 0 and 1, which is then compared against a threshold.</p>
<p>The default threshold is 0.5. However, we could choose any threshold between 0 and 1.</p>
<p>Adjusting the threshold has the following effects:</p>
<ul>
    <li>Make the threshold higher -> Fewer Positive Predictions -> Positive Predictions More Likely to Be Correct -> Higher Precision and Lower Recall</li>
    <li>Make the threshold lower -> More Positive Predictions -> Positive Predictions Less Likely to Be Correct -> Lower Precision and Higher Recall</li>
</ul>
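
<p>The trade-off in the list above can be sketched on a handful of made-up values. The probabilities and labels here are hypothetical and chosen only for illustration; they do not come from any model in this notebook, and <code>precision_recall</code> is a helper name introduced for this sketch.</p>

```python
# Hypothetical predicted probabilities of the positive class and their true labels.
# These numbers are invented for illustration; they are not model output.
probabilities = [0.05, 0.20, 0.35, 0.48, 0.52, 0.61, 0.74, 0.80, 0.93]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

def precision_recall(probs, y_true, threshold):
    """Return (number of positive predictions, precision, recall) at a threshold.

    Assumes at least one positive prediction and one positive label,
    otherwise the divisions below would fail.
    """
    predicted = [p > threshold for p in probs]
    tp = sum(1 for pred, y in zip(predicted, y_true) if pred and y == 1)
    fp = sum(1 for pred, y in zip(predicted, y_true) if pred and y == 0)
    fn = sum(1 for pred, y in zip(predicted, y_true) if not pred and y == 1)
    return sum(predicted), tp / (tp + fp), tp / (tp + fn)

for threshold in (0.25, 0.5, 0.75):
    n_pos, precision, recall = precision_recall(probabilities, labels, threshold)
    print(f"threshold={threshold}: positives={n_pos}, "
          f"precision={precision:.2f}, recall={recall:.2f}")
```

<p>On this toy data, raising the threshold reduces the number of positive predictions, and precision rises while recall falls.</p>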

<p>Each choice of a threshold is a different model. <strong>An ROC (Receiver operating characteristic) Curve</strong> is a graph showing all of the possible models and their performance.</p>

<h3>Sensitivity & Specificity</h3>

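<p>An ROC Curve is a graph of the sensitivity vs. the specificity (by convention, sensitivity is plotted on the y-axis against 1 − specificity, the false positive rate, on the x-axis). These two values demonstrate the same kind of trade-off that precision and recall demonstrate.</p>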

<table border="1">
  <tr>
      <th></th>
      <th>Actual Positive</th>
      <th>Actual Negative</th>
  </tr>
  <tr style="background-color: white;">
      <th>Predicted Positive</th>
      <td style="background-color: lightblue;">TP</td>
      <td>FP</td>
  </tr>
  <tr>
      <th>Predicted Negative</th>
      <td>FN</td>
      <td style="background-color: lightblue;">TN</td>
  </tr>
</table>

<p>The confusion matrix above names the four possible outcomes. An example with actual counts is given in the confusion matrix below.</p>

<table border="1">
  <tr>
      <th></th>
      <th>Actual Positive</th>
      <th>Actual Negative</th>
  </tr>
  <tr style="background-color: white;">
      <th>Predicted Positive</th>
      <td style="background-color: lightblue;">61</td>
      <td>21</td>
  </tr>
  <tr>
      <th>Predicted Negative</th>
      <td>35</td>
      <td style="background-color: lightblue;">105</td>
  </tr>
</table>

<p>The sensitivity is another term for the recall, which is the true positive rate.</p>

$$ \text{Sensitivity} = \text{Recall} = \frac{\#\: \text{positives predicted correctly}}{\#\: \text{positive cases}} = \frac{TP}{TP+FN} = \frac{61}{61+35} = \frac{61}{96} \approx 0.64 $$

<p>The specificity is the true negative rate. It’s calculated as follows.</p>

$$ \text{Specificity} = \frac{\#\: \text{negatives predicted correctly}}{\#\: \text{negative cases}} = \frac{TN}{TN+FP} = \frac{105}{105+21} = \frac{105}{126} \approx 0.83 $$
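
<p>As a quick check, the two formulas can be evaluated directly from the four counts in the example confusion matrix. This is only a sketch of the arithmetic; the variable names are chosen for this example.</p>

```python
# Counts taken from the example confusion matrix above.
tp, fp = 61, 21    # predicted positive: actually positive / actually negative
fn, tn = 35, 105   # predicted negative: actually positive / actually negative

sensitivity = tp / (tp + fn)  # true positive rate (recall)
specificity = tn / (tn + fp)  # true negative rate

print(f"sensitivity: {sensitivity:.2f}")
print(f"specificity: {specificity:.2f}")
```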

<strong>The goal is to maximize these two values, though generally making one larger makes the other lower. It will depend on the situation whether we put more emphasis on sensitivity or specificity.</strong>

<p>The standard is to build a sensitivity vs. specificity curve. It is also possible to build a precision-recall curve, but this is less commonly done.</p>

<h3>Sensitivity & Specificity in Scikit-learn</h3>

<p>Scikit-learn does not provide dedicated functions named sensitivity and specificity, but we can define them ourselves.</p>

In [1]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support, precision_score, recall_score

<strong>Sensitivity is the same as recall</strong>

In [2]:
sensitivity_score = recall_score

<p>Specificity is the recall of the negative class, so we can get its value from the sklearn function precision_recall_fscore_support.</p>

<p>This function returns four arrays: precision, recall, F-score, and support. The second array is the recall, so we can ignore the other three.</p>

<p>The recall array holds two values:</p>
<ul>
    <li>The first is the recall of the negative class which is the specificity.</li>
    <li>The second is the recall of the positive class which is the standard recall or sensitivity value.</li>
</ul>

In [3]:
def specificity_score(y_true, y_pred):
    p, r, f, s = precision_recall_fscore_support(y_true, y_pred)
    return r[0] # specificity=r[0] and sensitivity=recall=r[1]
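
<p>For comparison, here is a sketch of two other ways to arrive at the same number with scikit-learn: calling recall_score with pos_label=0, or reading TN and FP from confusion_matrix. The function names and the small label arrays are made up for this example, and both versions assume binary labels 0 and 1 that each appear in the data.</p>

```python
from sklearn.metrics import confusion_matrix, recall_score

def specificity_from_recall(y_true, y_pred):
    # Recall computed for the negative class (label 0) is the specificity.
    return recall_score(y_true, y_pred, pos_label=0)

def specificity_from_confusion(y_true, y_pred):
    # For binary labels 0/1, ravel() yields the counts in the order tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return tn / (tn + fp)

# Tiny hypothetical example: three actual negatives, two of them predicted correctly.
y_true_demo = [0, 0, 0, 1, 1]
y_pred_demo = [0, 1, 0, 1, 0]

print(specificity_from_recall(y_true_demo, y_pred_demo))
print(specificity_from_confusion(y_true_demo, y_pred_demo))
```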

<p>Now let's use our functions sensitivity_score and specificity_score on a model to view the results.</p>

In [4]:
df = pd.read_csv('https://sololearn.com/uploads/files/titanic.csv')
df['male'] = df['Sex'] == 'male'
X = df[['Pclass', 'male', 'Age', 'Siblings/Spouses', 'Parents/Children', 'Fare']].values
y = df['Survived'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=5)

model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("sensitivity:", sensitivity_score(y_test, y_pred))
print("specificity:", specificity_score(y_test, y_pred))

sensitivity: 0.6829268292682927
specificity: 0.9214285714285714


<strong>Conclusion! Sensitivity is the same as the recall (or recall of the positive class) and specificity is the recall of the negative class.</strong>

<h3>Adjusting the Logistic Regression Threshold in Sklearn</h3>

<p>Behind the scenes, the predict method of the logistic regression model computes a probability value between 0 and 1 and compares it to 0.5. If we want to choose a different threshold, we need those probability values. To get them, we can use the <strong>predict_proba</strong> method.</p>

<p>The result is a numpy array with 2 values for each datapoint (e.g. [0.78, 0.22]). You’ll notice that the two values sum to 1.</p>
<ul>
    <li>The first value is the probability that the datapoint is in the 0 class (didn’t survive)</li>
    <li>The second is the probability that the datapoint is in the 1 class (survived)</li>
</ul>
<p>We only need the second column of this result, which we can pull with the following numpy syntax.</p>

In [5]:
y_pred = model.predict_proba(X_test)[:, 1] > 0.75

print(y_pred)

[False False  True False False False False  True  True False False False
  True False False False  True False  True False False False False False
 False  True False False False False False  True False False False False
  True False False False  True False False False False False False False
 False False False False False False False False  True  True False False
 False  True False False False False False False False False False False
  True False False  True  True False False False False False False False
 False False False False False False False False False False False False
 False False  True False  True False False False  True  True False False
 False False False  True False False False False False  True False False
 False  True False False False False False False False False False False
 False  True False False False False False False False False False  True
  True False False  True False False False False False False False False
 False False False False  True False  True False Fa

<strong>A threshold of 0.75 means we need to be more confident in order to make a positive prediction. This results in fewer positive predictions and more negative predictions.</strong>

<p>Now we can use any scikit-learn metrics from before using y_test as our true values and y_pred as our predicted values.</p>

In [6]:
print("precision:", precision_score(y_test, y_pred))
print("recall:", recall_score(y_test, y_pred))
print("sensitivity:", sensitivity_score(y_test, y_pred))
print("specificity:", specificity_score(y_test, y_pred))

precision: 0.9230769230769231
recall: 0.43902439024390244
sensitivity: 0.43902439024390244
specificity: 0.9785714285714285


<strong>Note that when we increased the threshold, we got a lower sensitivity/recall (and a higher specificity) than in the original model, where the default threshold of 0.5 was used.</strong>

<p>Setting the threshold to 0.5 gives back the original Logistic Regression model. Any other threshold value yields an alternative model.</p>
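
<p>To illustrate the idea that each threshold is its own model, the sketch below computes sensitivity and specificity at several thresholds. The scores and labels are hypothetical stand-ins for the second column of predict_proba and for y_test; they are not the Titanic results, and <code>sensitivity_specificity</code> is a helper name introduced here.</p>

```python
# Hypothetical positive-class probabilities and true labels (illustration only).
scores = [0.10, 0.30, 0.40, 0.55, 0.65, 0.70, 0.85, 0.95]
truth = [0, 0, 1, 0, 1, 1, 0, 1]

def sensitivity_specificity(probs, y_true, threshold):
    """Return (sensitivity, specificity) when predicting positive for probs > threshold."""
    predicted = [p > threshold for p in probs]
    tp = sum(1 for pred, y in zip(predicted, y_true) if pred and y == 1)
    fn = sum(1 for pred, y in zip(predicted, y_true) if not pred and y == 1)
    tn = sum(1 for pred, y in zip(predicted, y_true) if not pred and y == 0)
    fp = sum(1 for pred, y in zip(predicted, y_true) if pred and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Each threshold gives one (sensitivity, specificity) pair, i.e. one candidate model.
for threshold in (0.2, 0.5, 0.8):
    sens, spec = sensitivity_specificity(scores, truth, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

<p>On this toy data, sensitivity drops and specificity rises as the threshold increases. Collecting such pairs over many thresholds is what an ROC Curve displays.</p>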