# Performance Metrics

**GOALS**:

- Compare Accuracy, Precision, and Recall metrics for different classifiers
- Examine the Precision-Recall tradeoff and learn how to choose an appropriate decision threshold
- Visualize the Precision-Recall tradeoff
- Examine performance of multiclass classifiers

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_mldata

### Digits Example

To begin, we will return to the MNIST handwritten digit dataset.  First, we examine a binary classifier for the data based on whether or not a digit is a 5.  

In [2]:
mnist = fetch_mldata('MNIST original')



In [3]:
mnist

{'COL_NAMES': ['label', 'data'],
 'DESCR': 'mldata.org dataset: mnist-original',
 'data': array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]], dtype=uint8),
 'target': array([0., 0., 0., ..., 9., 9., 9.])}

In [4]:
# Binary target: 1 if the digit is a 5, otherwise 0
y = (mnist['target'] == 5).astype(int)

In [5]:
from sklearn.model_selection import train_test_split

In [6]:
X_train, X_test, y_train, y_test = train_test_split(mnist['data'], y)

In [7]:
from sklearn.linear_model import LogisticRegression, SGDClassifier

In [None]:
lgr = LogisticRegression()
sgd = SGDClassifier()

In [None]:
lgr.fit(X_train, y_train)



In [None]:
lgr.score(X_train, y_train)

### An Alternative Classifier

Just for comparison, we can implement a Stochastic Gradient Descent classifier.  We will discuss the algorithm in more detail next class; for now, let's just use it as a point of comparison for our Logistic Regression example.

In [None]:
sgd.fit(X_train, y_train)

In [None]:
sgd.score(X_train, y_train)

### Comparing to Baseline

We can use the `DummyClassifier` to generate a baseline estimate that guesses the majority class every time, regardless of the data.  This is akin to examining the baseline of a dataset.  Let's see how this example does.
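To make the idea of a majority-class baseline concrete, here is a minimal plain-Python sketch. The helper `majority_baseline` and the toy labels are hypothetical and exist only for illustration; the cells that follow use scikit-learn's `DummyClassifier` instead.

```python
from collections import Counter

def majority_baseline(labels):
    """Return the most common label and the accuracy of always guessing it."""
    most_common_label, count = Counter(labels).most_common(1)[0]
    return most_common_label, count / len(labels)

# Toy labels (hypothetical): nine non-fives (0) for every five (1)
toy_labels = [0] * 9 + [1]
guess, baseline_accuracy = majority_baseline(toy_labels)
print(guess, baseline_accuracy)
```

Any classifier worth keeping should beat this number, so it is a useful reference point before looking at other scores.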

In [None]:
from sklearn.dummy import DummyClassifier

In [None]:
dum_dum = DummyClassifier(strategy='most_frequent')

In [None]:
dum_dum.fit(X_train, y_train)

In [None]:
dum_dum.score(X_train,y_train)

In [None]:
d_pred = dum_dum.predict(X_train)

Hmm. What's going on here?  It seems that simply guessing 0 gives a fairly "accurate" classifier.  This is because accuracy is simply the number of correct predictions divided by the total number of predictions.  In this example, roughly 1 in 10 digits are 5's, so always guessing "not 5" is right about 90% of the time.  Let's consider the confusion matrix for a situation where there are 1000 digits and 100 of them are 5's.  

**CONFUSION MATRIX**

<table style="width:60%">
  <tr>
    <th> </th>
    <th>Predicted Negative</th> 
    <th>Predicted Positive</th>
  </tr>
  <tr>
    <td>Actually Negative</td>
    <td>True Negative</td>
    <td>False Positive</td>
  </tr>
  <tr>
    <td>Actually Positive</td>
    <td>False Negative</td>
    <td>True Positive</td>
  </tr>
</table>

**EXAMPLE**

|   $~$  |  Predicted Negative | Predicted Positive |
| ----- | ----- | ------ |
| Actually Negative |  900  |  0  |
| Actually Positive | 100  |  0  |
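The counts in the example table can be reproduced by hand. The sketch below builds the 1000 hypothetical labels described above, predicts 0 for every one of them, and tallies the four cells. The variable names are illustrative only.

```python
# 1000 digits, 100 of which are 5's (label 1); the baseline always predicts 0
actual = [1] * 100 + [0] * 900
predicted = [0] * 1000

pairs = list(zip(actual, predicted))
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives

# Rows are actual classes, columns are predicted classes, as in the table
matrix = [[tn, fp], [fn, tp]]
accuracy = (tp + tn) / len(actual)
print(matrix, accuracy)
```

The accuracy comes out to 0.9 even though not a single 5 was identified, which is why accuracy alone can be misleading on imbalanced classes.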

In [None]:
#predicting sgd results
sgd_prd = sgd.predict(X_train)

In [None]:
#predicting logistic results
lgr_prd = lgr.predict(X_train)

In [None]:
from sklearn.metrics import confusion_matrix

In [None]:
#sgd confusion matrix
confusion_matrix(y_train, sgd_prd)

In [None]:
#logistic regression 
confusion_matrix(y_train, lgr_prd)

In [None]:
#dummy predictor
confusion_matrix(y_train, d_pred)

### Beyond Accuracy

**Accuracy**: 

Percent classified correctly.

$$\displaystyle \frac{TP +TN}{TP + FP + TN + FN}$$



**Precision**:

More refined metrics begin with Precision, which is the proportion of predicted positives that are actually positive.  To increase precision, we want to reduce False Positive results.


 $$\displaystyle \frac{TP}{TP + FP}$$
 
 
**Recall**:

If we instead compare the number of true positives to all of the actual positives, we have recall: the proportion of actual positives that the classifier finds.  To get better recall, we want to avoid False Negatives.

 $$\displaystyle \frac{TP}{TP + FN}$$
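The three formulas translate directly into code. This is a small sketch with a hypothetical helper, `metrics_from_counts`, and made-up counts. Returning 0.0 when a denominator is zero is a choice made here for illustration, not a rule.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Precision is undefined when nothing is predicted positive; use 0.0 here
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall is undefined when there are no actual positives; use 0.0 here
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, precision, recall

# Made-up counts: a classifier that finds 80 of 100 positives
acc, prec, rec = metrics_from_counts(tp=80, fp=20, tn=880, fn=20)

# The always-negative baseline from the example table
base_acc, base_prec, base_rec = metrics_from_counts(tp=0, fp=0, tn=900, fn=100)

print(acc, prec, rec)
print(base_acc, base_prec, base_rec)
```

The baseline keeps its high accuracy, but its precision and recall collapse to zero, which is exactly the failure that accuracy hides.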
 
 

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report

In [None]:
print("Accuracy score for Logistic Regression model:\n{:.2f}".format(accuracy_score(y_train, lgr_prd)))
print("Accuracy score for SGD model: \n{:.2f}".format(accuracy_score(y_train, sgd_prd)))
print("Accuracy score for Dummy Classifier model: \n{:.2f}".format(accuracy_score(y_train, d_pred)))

In [None]:
print("Precision score for Logistic Regression model: \n", precision_score(y_train, lgr_prd))
print("Precision score for SGD model: \n", precision_score(y_train, sgd_prd))
print("Precision score for Dummy Classifier model: \n", precision_score(y_train, d_pred))

In [None]:
print("Recall score for Logistic Regression model: \n", recall_score(y_train, lgr_prd))
print("Recall score for SGD model: \n", recall_score(y_train, sgd_prd))
print("Recall score for Dummy Classifier model: \n", recall_score(y_train, d_pred))

In [None]:
print("Logistic Regression full report\n", classification_report(y_train,lgr_prd))

In [None]:
print("SGD full report\n", classification_report(y_train,sgd_prd))

In [None]:
print("Dummy full report\n", classification_report(y_train,d_pred))

### Admissions Example

Now, let's predict the `admit` class from the `gre` variable.  Be sure to use a train test split and cross validation.

- Load Dataset
- Examine Variables
- Handle missing and non-numeric values
- Split
- Create Dummy Classifier
- Create and fit a Logistic Classifier with Cross Validation
- Compare and discuss the results
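For the cross-validation step, it can help to see how k folds partition the rows of a dataset. The following plain-Python sketch uses a hypothetical helper, `k_fold_indices`, and is not how scikit-learn implements it; scikit-learn's cross-validation utilities handle the splitting for you.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train, validation in folds:
    print("validate on", validation, "train on", train)
```

Each row lands in a validation fold exactly once, so the model is fit k times and the k scores can be averaged into a more stable estimate than a single split provides.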

In [None]:
admissions = pd.read_csv('data/admissions.csv')

In [None]:
admissions.head()

In [None]:
admissions.info()

In [None]:
admit = admissions.dropna()

In [None]:
admit.admit.value_counts()

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X = admit.gre
y = admit.admit
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.metrics import classification_report

In [None]:
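clf = LogisticRegression()
# scikit-learn expects a 2D feature array, so reshape the single gre column
clf.fit(X_train.values.reshape(-1,1), y_train)
pred = clf.predict(X_test.values.reshape(-1,1))
# classification_report takes the true labels first, then the predictions
print(classification_report(y_test, pred))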

In [None]:
admit.head()

In [None]:
X = admit.drop('admit', axis = 1)
y = admit.admit
X_train, X_test, y_train, y_test = train_test_split(X, y)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))

In [None]:
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import cross_val_score