## Module 01: Supervised Learning

### Lesson 08: Model Evaluation Metrics

> Learn the main metrics to evaluate models, such as accuracy, precision, recall, and more!

#### 01. Intro

* How well is my model doing?
* How do we improve the model based on these metrics?

#### 02. Outline

* Problem $\to$ Tools $\to$ Measurement

#### 03. Testing your models

* Regression and Classification
    * Regression returns a numeric value
    * Classification returns a state (a class label, such as positive or negative)

* Testing on held-out data lets us compare models
* Golden rule: Thou shalt never use your testing data for training

# Import statements 
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
import pandas as pd
import numpy as np

# Import train_test_split. It lives in sklearn.model_selection;
# the old sklearn.cross_validation module was removed in scikit-learn 0.20.
# https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
from sklearn.model_selection import train_test_split

# Read in the data.
data = np.asarray(pd.read_csv('../../data/dt_data.csv', header=None))
# Assign the features to the variable X, and the labels to the variable y. 
X = data[:,0:2]
y = data[:,2]

# Use train test split to split your data 
# Use a test size of 25% and a random state of 42
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Instantiate your decision tree model
model = DecisionTreeClassifier()

# Fit the model to the training data.
model.fit(X_train, y_train)

# Make predictions on the test data.
y_pred = model.predict(X_test)

# Calculate the accuracy on the test data and assign it to the variable acc.
acc = accuracy_score(y_test, y_pred)
acc

0.7916666666666666

#### 04. Confusion Matrix

* Confusion Matrix
    * The true positives are the points that are positive and that the model correctly labels as positive.
    * The true negatives are the points that are negative and that the model correctly labels as negative.
    * The false positives are the points that are negative but that the model incorrectly labels as positive.
    * The false negatives are the points that are positive but that the model incorrectly labels as negative.

|Confusion Matrix |Predicted Positive|Predicted Negative|
| :--: | :--: | :--: |
|Actual Positive|True Positives|False Negatives|
|Actual Negative|False Positives|True Negatives|

* Type 1 Error (Error of the first kind, or False Positive): In the medical example, this is when we misdiagnose a healthy patient as sick.
* Type 2 Error (Error of the second kind, or False Negative): In the medical example, this is when we misdiagnose a sick patient as healthy.
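As a minimal sketch of how the four cells can be counted in practice, the example below uses `confusion_matrix` from scikit-learn on a small set of made-up labels (the `y_true` and `y_pred` values are hypothetical, not taken from the lesson data). Passing `labels=[1, 0]` is a choice made here so that the positive class comes first and the output lines up with the table above.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = sick (positive), 0 = healthy (negative)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Rows are the actual classes, columns are the predicted classes.
# labels=[1, 0] puts the positive class first, as in the table above.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])

# Unpack the 2x2 matrix into the four named counts
(tp, fn), (fp, tn) = cm

print(cm)
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")
```

In this made-up example the single false positive is a Type 1 error (a healthy patient flagged as sick) and the single false negative is a Type 2 error (a sick patient missed).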

#### 05. Confusion Matrix 2

#### 06. Accuracy

* Accuracy is the ratio of correctly classified points to the total number of points
* $\text{Accuracy} = \frac{\text{Correctly classified points}}{\text{All points}} = \frac{TP+TN}{TP+TN+FP+FN}$
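The ratio can also be computed directly from the four confusion-matrix counts. The helper name `accuracy` and the counts below are hypothetical, chosen only to illustrate the arithmetic.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all points that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts: 3 true positives, 5 true negatives,
# 1 false positive and 1 false negative (10 points in total)
acc = accuracy(tp=3, tn=5, fp=1, fn=1)
print(acc)  # 0.8
```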

#### 07. Accuracy 2

#### 08. When accuracy won't work

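* Credit card fraud: fraudulent transactions are a tiny fraction of all transactions, so a model that labels every transaction as legitimate achieves very high accuracy while catching no fraud at all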
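A rough illustration of the problem, using made-up numbers (1,000 transactions with 10 fraudulent ones, not real data): a "model" that never flags anything still scores very high accuracy.

```python
# Hypothetical data: 1,000 transactions, only 10 of them fraudulent (label 1)
y_true = [1] * 10 + [0] * 990

# A "model" that never flags fraud: it predicts 0 for every transaction
y_pred = [0] * 1000

# Accuracy: share of transactions labeled correctly
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Number of fraudulent transactions the model actually caught
fraud_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy = {accuracy:.2%}")                              # 99.00%
print(f"fraud cases caught = {fraud_caught} of {sum(y_true)}")   # 0 of 10
```

With classes this skewed, accuracy mostly reflects the majority class, which is why other metrics are needed.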

#### 09. False Negatives and Positives

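* In the medical example, a False Negative (a sick patient diagnosed as healthy) is much worse than a False Positive (a healthy patient diagnosed as sick).
* In the spam detector example, a False Positive (a legitimate email marked as spam) is much worse than a False Negative (a spam email that reaches the inbox).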

#### 10. Precision and Recall

#### 11. Precision

$\text{Precision} = \frac{TP}{TP+FP}$

#### 12. Recall

$\text{Recall} = \frac{TP}{TP+FN}$
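To make the two formulas concrete, here is a small sketch that computes precision and recall by hand from the counts and then cross-checks them against scikit-learn's `precision_score` and `recall_score`. The labels are hypothetical and serve only as an illustration.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = positive, 0 = negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Count the cells of the confusion matrix that the formulas need
pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)

# Precision: of the points predicted positive, how many are really positive?
precision_manual = tp / (tp + fp)
# Recall: of the points that are really positive, how many did we find?
recall_manual = tp / (tp + fn)

# The same quantities as computed by scikit-learn
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print(f"precision: manual={precision_manual:.3f}, sklearn={precision:.3f}")
print(f"recall:    manual={recall_manual:.3f}, sklearn={recall:.3f}")
```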

#### 13. F1 Score

* We use the `harmonic mean` instead of the `arithmetic mean` to combine precision and recall
* The F1 score lies closer to the smaller of precision and recall, so it is high only when both are high
* $F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision}+\text{Recall}}$
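The difference between the two means shows up clearly on a few made-up precision/recall pairs. The helper `f1` below is a hypothetical name for this sketch; returning 0 when both inputs are 0 is a convention assumed here to avoid dividing by zero.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (precision, recall) pairs
for p, r in [(0.5, 0.5), (0.2, 0.8), (1.0, 0.0)]:
    arithmetic = (p + r) / 2
    print(f"precision={p}, recall={r}: "
          f"arithmetic mean={arithmetic:.2f}, F1={f1(p, r):.2f}")
```

All three pairs have the same arithmetic mean (0.5), but the F1 score drops as precision and recall move apart, reaching 0 when one of them is 0.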

#### 14. F-beta Score

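* $F_{\beta} = (1+\beta^2) \cdot \frac{\text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision}+\text{Recall}}$ (a weighted harmonic mean of precision and recall)
* When $\beta = 0$, $F_{\beta} = \text{Precision}$
* When $\beta \to \infty$, $F_{\beta} = \frac{\text{Precision} \cdot \text{Recall}}{\frac{\beta^2}{1+\beta^2} \cdot \text{Precision}+\frac{1}{1+\beta^2} \cdot \text{Recall}} \to \text{Recall}$, because the first weight tends to 1 and the second to 0
* When $\beta < 1$, precision has more impact; when $\beta > 1$, recall has more impact; $\beta = 1$ gives the $F_1$ score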
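The limiting cases above can be checked numerically. The function `f_beta` and the precision/recall values are hypothetical choices for this sketch; a very large $\beta$ (here 1000) stands in for "$\beta$ goes to infinity".

```python
def f_beta(precision, recall, beta):
    """F-beta score computed from precision and recall (0.0 if undefined)."""
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Hypothetical model with low precision and higher recall
precision, recall = 0.5, 0.8

for beta in [0, 0.5, 1, 2, 1000]:
    print(f"beta={beta:>6}: F_beta={f_beta(precision, recall, beta):.4f}")
```

As $\beta$ grows from 0, the score moves from the precision (0.5) toward the recall (0.8), passing through the $F_1$ score at $\beta = 1$.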

#### 15. ROC Curve

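* The ROC (Receiver Operating Characteristic) curve plots the true positive rate against the false positive rate as the classification threshold varies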

What we want is to come up with a metric or some number that is high for the perfect split, medium for the good split, and low for the random split.

* $TPR\ (\text{True Positive Rate}) = \frac{TP}{TP+FN}$
* $FPR\ (\text{False Positive Rate}) = \frac{FP}{FP+TN}$

* AUC(Area Under ROC Curve)
    * Random Split: Area(AUC) = 0.5
    * Good Split: 0.5 < Area(AUC) < 1
    * Perfect Split: Area(AUC) = 1
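The three cases can be reproduced on a toy example with scikit-learn's `roc_auc_score` and `roc_curve`. All labels and scores below are made up for illustration; in particular, the "uninformative" scorer (the same score for every point) is used here as a stand-in for a random split, which is an assumption of this sketch rather than something from the lesson.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical labels: four negatives followed by four positives
y_true = [0, 0, 0, 0, 1, 1, 1, 1]

# Hypothetical model scores (higher means "more likely positive")
scores = {
    # Every positive scores above every negative
    "perfect": [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    # One negative (0.6) outranks one positive (0.3)
    "good": [0.1, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.9],
    # Same score for everyone: carries no information about the label
    "uninformative": [0.5] * 8,
}

# Area under the ROC curve for each scorer
auc = {name: roc_auc_score(y_true, s) for name, s in scores.items()}
for name, value in auc.items():
    print(f"{name:>13}: AUC = {value:.3f}")

# The ROC curve itself: one (FPR, TPR) point per threshold
fpr, tpr, thresholds = roc_curve(y_true, scores["good"])
for x, y in zip(fpr, tpr):
    print(f"FPR={x:.2f}, TPR={y:.2f}")
```

The curve always starts at (FPR, TPR) = (0, 0) and ends at (1, 1); the closer it bends toward the top-left corner, the larger the area under it.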