# Evaluation metrics

---

_You are currently looking at **version 1.0** of this notebook._

---

In [None]:
%matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split

In [None]:
df = pd.read_csv('../_data/fraud_data.csv')

### Label stats / distribution

In [None]:
df['Class'].describe()

In [None]:
np.bincount(df['Class'])

### Train-test split

In [None]:
X, y = df.iloc[:,:-1], df.iloc[:,-1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

### Confusion matrix of dummy classifier

Using `X_train`, `X_test`, `y_train`, and `y_test` (as defined above), train a dummy classifier that classifies everything as the majority class of the training data. What is the accuracy of this classifier? What is the recall?

*This function should return a tuple with two floats, i.e. `(accuracy score, recall score)`.*
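
For reference, the two scores can be written in terms of the confusion matrix counts. The notation below is an assumption for illustration: fraud (class 1) is treated as the positive class, and $TP$, $TN$, $FP$, $FN$ are the true positive, true negative, false positive, and false negative counts.

$$
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{recall} = \frac{TP}{TP + FN}
$$

If the majority class is the negative (non-fraud) class, a classifier that always predicts it has $TP = 0$. Its recall is therefore $0$ whenever the test set contains at least one positive sample, while its accuracy equals the fraction of negative samples in the test set.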

In [None]:
from sklearn.dummy import DummyClassifier
from sklearn.metrics import confusion_matrix

In [None]:
# Fit DummyClassifier
dummy_majority = DummyClassifier(strategy='most_frequent').fit(X_train, y_train)

# Predict
y_dummy_predictions = dummy_majority.predict(X_test)

# Confusion matrix (rows: true classes, columns: predicted classes)
cm = confusion_matrix(y_test, y_dummy_predictions)
cm

### Accuracy and recall scores

In [None]:
from sklearn.metrics import accuracy_score, recall_score, precision_score
from sklearn.svm import SVC

In [None]:
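# Unpack the 2x2 confusion matrix (rows: true class, columns: predicted class)
TN, FP, FN, TP = cm.ravel()

# Computed by hand from the confusion matrix
accuracy_sc_m = float((TN + TP) / (TN + FP + FN + TP))
recall_sc_m = float(TP / (TP + FN))
precision_sc_m = float(TP / (TP + FP + 1e-7))  # small epsilon avoids division by zero

# Computed with scikit-learn
accuracy_sc = accuracy_score(y_test, y_dummy_predictions)
recall_sc = recall_score(y_test, y_dummy_predictions)
# The dummy classifier may predict no positives; zero_division=0 returns 0 without a warning
precision_sc = precision_score(y_test, y_dummy_predictions, zero_division=0)

# Print both sets of scores (a cell only displays its last expression)
print(accuracy_sc_m, recall_sc_m, precision_sc_m)
print(accuracy_sc, recall_sc, precision_sc)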

In [None]:
svm = SVC().fit(X_train, y_train)
y_predicted = svm.predict(X_test)

accuracy_sc = accuracy_score(y_test, y_predicted)
recall_sc = recall_score(y_test, y_predicted)
precision_sc = precision_score(y_test, y_predicted)

accuracy_sc, recall_sc, precision_sc

### Confusion matrix using decision function with threshold

 - What is the confusion matrix when using a fraud threshold of -220?
 - `decision_function` returns a signed score for each sample; comparing that score with a threshold converts it to a class label.
 - Lowering the threshold labels more samples as fraud, which raises the sensitivity (recall) and reduces the number of false negatives (missed fraud), at the cost of more false positives (false alarms).
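
The effect of the threshold can be sketched on a toy example. The labels, scores, and the helper `recall_at_threshold` below are hypothetical and chosen only for illustration; they are not taken from the fraud data set.

```python
import numpy as np

# Hypothetical true labels (1 = fraud) and decision scores, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1])
scores = np.array([-3.0, -2.5, -1.0, 0.5, -0.5, 1.5, -0.2, 2.0])


def recall_at_threshold(y_true, scores, threshold):
    """Label a sample positive when its score exceeds the threshold.

    Returns (recall, number of false positives).
    """
    y_pred = (scores > threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    return tp / (tp + fn), fp


# Threshold at 0 versus a lower threshold of -1
recall_default, fp_default = recall_at_threshold(y_true, scores, 0.0)
recall_low, fp_low = recall_at_threshold(y_true, scores, -1.0)

print(recall_default, fp_default)
print(recall_low, fp_low)
```

In this toy data the lower threshold catches the one fraud case that was missed at threshold 0, but it also raises one additional false alarm.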

In [None]:
svm = SVC(C=1e9, gamma=1e-07).fit(X_train, y_train)

fraud_threshold = np.linspace(100, 350, 6) * -1

for thres in fraud_threshold:
    y_decision_scores = svm.decision_function(X_test) > thres
    print('Fraud threshold: {}\n'.format(thres), confusion_matrix(y_test, y_decision_scores.astype('int')))
    print('Sensitivity: {:.3f}\n'.format(recall_score(y_test, y_decision_scores)))

### Precision-recall and ROC curves

Train a logistic regression classifier with default parameters using `X_train` and `y_train`.

For the logistic regression classifier, create a precision-recall curve and a ROC curve using `y_test` and the probability estimates for `X_test` (the probability that a sample is fraud).

Looking at the precision-recall curve, what is the recall when the precision is `0.75`?

Looking at the ROC curve, what is the true positive rate when the false positive rate is `0.16`?

*This function should return a tuple with two floats, i.e. `(recall, true positive rate)`.*
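
Instead of reading the values off the plots, they can be looked up from the curve arrays. The sketch below uses a nearest-point lookup; the helper `value_at` and the curve points are hypothetical stand-ins for the arrays returned by `precision_recall_curve` and `roc_curve`, not results from this data set.

```python
import numpy as np


def value_at(x, y, target):
    """Return y at the curve point whose x is closest to target."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = int(np.argmin(np.abs(x - target)))
    return float(y[idx])


# Hypothetical curve points, for illustration only
fpr = np.array([0.0, 0.05, 0.16, 0.40, 1.0])
tpr = np.array([0.0, 0.70, 0.90, 0.95, 1.0])
precision = np.array([0.10, 0.50, 0.75, 0.90, 1.0])
recall = np.array([1.00, 0.90, 0.80, 0.50, 0.0])

recall_at_precision = value_at(precision, recall, 0.75)
tpr_at_fpr = value_at(fpr, tpr, 0.16)

print((recall_at_precision, tpr_at_fpr))
```

A nearest-point lookup is used rather than interpolation because precision is not guaranteed to be monotonic along the curve; when several points are equally close, `np.argmin` returns the first one.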

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc, precision_recall_curve

In [None]:
# Fit logistic regression and keep the predicted probability of the positive class (fraud)
lr = LogisticRegression().fit(X_train, y_train)
y_proba = lr.predict_proba(X_test)[:, 1]

In [None]:
# Two side-by-side axes so that each curve keeps its own axis labels
fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(14, 7))

# PR curve (precision on the x-axis, recall on the y-axis)
precision, recall, _ = precision_recall_curve(y_test, y_proba)
ax_pr.plot(precision, recall, label='Precision-Recall Curve')
ax_pr.set_xlabel('Precision', fontsize=16)
ax_pr.set_ylabel('Recall', fontsize=16)

# ROC curve
false_positive_rate, true_positive_rate, _ = roc_curve(y_test, y_proba)
ax_roc.plot(false_positive_rate, true_positive_rate, label='ROC Curve')
ax_roc.set_xlabel('False Positive Rate', fontsize=16)
ax_roc.set_ylabel('True Positive Rate', fontsize=16)

# Same limits, aspect ratio, and legend on both plots
for ax in (ax_pr, ax_roc):
    ax.set_xlim([0.0, 1.01])
    ax.set_ylim([0.0, 1.01])
    ax.set_aspect('equal')
    ax.legend()

plt.show()

### Cross validation by GridSearchCV

- Perform a grid search over a selection of hyperparameters.

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression

In [None]:
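# liblinear supports both the 'l1' and 'l2' penalties (the default lbfgs solver rejects 'l1')
lr = LogisticRegression(solver='liblinear', random_state=0)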

C = [0.01, 0.1, 1, 10, 100]
penalty = ['l1', 'l2']
    
grid_values = {'C':C, 'penalty': penalty}
grid_lr_prec = GridSearchCV(lr, 
                            param_grid=grid_values,
                            scoring='recall',
                            cv=5,
                            return_train_score=True)

grid_lr_prec.fit(X_train, y_train)
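
To get a feel for the cost of the search, the grid can be enumerated by hand. This is a minimal sketch that only counts combinations and fits; it reuses the `C` and `penalty` values and the `cv=5` setting from the cell above, and the names `combinations` and `n_cv_fits` are introduced here for illustration.

```python
from itertools import product

C = [0.01, 0.1, 1, 10, 100]
penalty = ['l1', 'l2']
n_folds = 5

# Every (C, penalty) pair that the grid search evaluates
combinations = [{'C': c, 'penalty': p} for c, p in product(C, penalty)]

# One fit per combination per fold; the final refit on the full training set is extra
n_cv_fits = len(combinations) * n_folds

print(len(combinations), n_cv_fits)
```

Each added hyperparameter value multiplies the number of fits, which is worth keeping in mind before widening the grid.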

#### Mean test scores of each hyperparameter combination

In [None]:
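# Use a new name so the original data frame `df` is not overwritten
cv_results = pd.DataFrame(grid_lr_prec.cv_results_)
# Rows: C values, columns: penalties; .to_numpy() replaces the removed .as_matrix()
pivot = pd.pivot_table(cv_results, values='mean_test_score', index=['param_C'], columns=['param_penalty']).to_numpy()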
pivot

#### Plot mean scores

In [None]:
plt.figure()
sns.heatmap(pivot.reshape(5, 2), xticklabels=penalty, yticklabels=C, vmin=0.77, vmax=0.81)
plt.yticks(rotation=0);