<center> <img src="res/ds3000.png"> </center>

<center> <h1> Week 10 - Day 2 </h1> </center>

<center> <h2> Part 5: Model Evaluation </h2></center>

## Outline
1. <a href='#1'>Metrics for Measuring Model Accuracy</a>
2. <a href='#2'>Prediction Accuracy</a>
3. <a href='#3'>Confusion Matrix</a>
4. <a href='#4'>Classification Report</a>

<a id="1"></a>

## 1. Metrics for Measuring Model Accuracy
* Prediction accuracy
* Confusion matrix
* Classification report

<a id="2"></a>

## 2. Prediction Accuracy
Estimator method `score`
* Returns an **indication of how well the estimator performs** on **test data**
* For **classification estimators**, returns the **prediction accuracy** for the test data (the fraction of test samples classified correctly):

In [1]:
import pandas as pd
from sklearn.datasets import load_digits

#load the digits dataset
digits = load_digits()

df = pd.DataFrame(digits.data)
df["target"] = digits.target

features = df.drop("target", axis = 1)
target = df["target"]

In [2]:
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

#split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features, target, random_state=3000)

#select a classifier
knn = KNeighborsClassifier()

#create the model by fitting the training data
knn.fit(X=X_train, y=y_train)

#make predictions on the test set
predicted = knn.predict(X=X_test)

expected = y_test

#prediction accuracy
accuracy = knn.score(X_test, y_test)
print("Prediction accuracy on the test data:", format(accuracy*100, ".2f"))

Prediction accuracy on the test data: 98.44
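
For a classifier, the value returned by `score` is the fraction of test samples whose predicted label matches the expected label. The sketch below reproduces that calculation in plain Python on two short, made-up label lists. The names `expected_demo` and `predicted_demo` are hypothetical and are not the digits test set.

```python
# Hypothetical toy labels for illustration only (not the digits test set)
expected_demo = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted_demo = [0, 1, 2, 3, 4, 5, 6, 7, 1, 3]

# count the positions where the prediction matches the expected label
hits = sum(e == p for e, p in zip(expected_demo, predicted_demo))

# accuracy is the fraction of correct predictions
accuracy_demo = hits / len(expected_demo)
print("Prediction accuracy on the toy labels:", format(accuracy_demo * 100, ".2f"))
```

Here 8 of the 10 toy predictions match, so the accuracy is 80.00.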


<a id="3"></a>

## 3. Confusion Matrix
* Shows the correct and incorrect predictions (the **hits** and **misses**) for each class

In [3]:
from sklearn.metrics import confusion_matrix

In [4]:
confusion = confusion_matrix(y_true=expected, y_pred=predicted)

In [5]:
print(confusion)

[[45  0  0  0  0  0  0  0  0  0]
 [ 0 49  0  0  0  0  0  0  0  0]
 [ 0  0 62  0  0  0  0  0  0  0]
 [ 0  0  0 40  0  0  0  1  0  0]
 [ 0  0  0  0 39  0  0  1  0  0]
 [ 0  0  0  0  0 43  0  0  0  1]
 [ 0  0  0  0  0  0 44  0  0  0]
 [ 0  0  0  0  0  0  0 42  0  0]
 [ 0  1  0  1  0  0  0  0 39  0]
 [ 0  1  0  1  0  0  0  0  0 40]]


In [6]:
import pandas as pd
confusion_df = pd.DataFrame(confusion, index=range(10), columns=range(10))
confusion_df

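|       | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  |
|-------|----|----|----|----|----|----|----|----|----|----|
| **0** | 45 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| **1** | 0  | 49 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| **2** | 0  | 0  | 62 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| **3** | 0  | 0  | 0  | 40 | 0  | 0  | 0  | 1  | 0  | 0  |
| **4** | 0  | 0  | 0  | 0  | 39 | 0  | 0  | 1  | 0  | 0  |
| **5** | 0  | 0  | 0  | 0  | 0  | 43 | 0  | 0  | 0  | 1  |
| **6** | 0  | 0  | 0  | 0  | 0  | 0  | 44 | 0  | 0  | 0  |
| **7** | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 42 | 0  | 0  |
| **8** | 0  | 1  | 0  | 1  | 0  | 0  | 0  | 0  | 39 | 0  |
| **9** | 0  | 1  | 0  | 1  | 0  | 0  | 0  | 0  | 0  | 40 |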


* **Correct predictions** shown on **principal diagonal** from top-left to bottom-right
* **Nonzero values** not on **principal diagonal** indicate **incorrect predictions**
* Each **row** represents **one distinct class** (0–9): the expected (true) label of the test samples
* The **columns** of a row show how many of that class's **test samples** were classified into each of the classes 0–9 (the predicted label)

* **Row 0** shows digit class **`0`**: **all 0s were predicted correctly**
>`[45,  0,  0,  0,  0,  0,  0,  0,  0,  0]`
* **Row 8** shows digit class **`8`**: **two 8s were predicted incorrectly** (one as a `1` and one as a `3`)
>`[0,  1,  0,  1,  0,  0,  0,  0, 39,  0]`

    * **Correctly predicted 95.12%** (39 of 41) of `8`s

In [7]:
39/41

0.9512195121951219
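
The same reading of rows and columns applies to the whole matrix. As a rough sketch, the code below copies the printed confusion matrix into a plain list of lists, then recovers the overall accuracy from the diagonal, lists every miss, and computes the per-digit fraction of correct predictions. The names `confusion_rows`, `misses`, and `per_class` are assumptions introduced only for this illustration.

```python
# Confusion matrix copied from the printed output above
# (rows: expected digit, columns: predicted digit)
confusion_rows = [
    [45, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 49, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 62, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 40, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 39, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 43, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 44, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 42, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 39, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 40],
]

# correct predictions sit on the principal diagonal
correct = sum(confusion_rows[i][i] for i in range(10))
total = sum(sum(row) for row in confusion_rows)
print("Overall accuracy:", format(correct / total * 100, ".2f"))

# every off-diagonal nonzero cell is a miss: (expected, predicted, count)
misses = [(i, j, confusion_rows[i][j])
          for i in range(10) for j in range(10)
          if i != j and confusion_rows[i][j] > 0]
print("Misses:", misses)

# fraction of each digit class that was predicted correctly
per_class = [row[i] / sum(row) for i, row in enumerate(confusion_rows)]
print("Digit 8 predicted correctly:", format(per_class[8] * 100, ".2f"))
```

The diagonal holds 443 of the 450 test samples, which matches the 98.44 reported by `score`.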

<a id="4"></a>

## 4. Classification Report
* The `classification_report` function produces a text report of the main per-class classification metrics: precision, recall, F1-score, and support

In [8]:
from sklearn.metrics import classification_report
class_report = classification_report(y_true=expected, y_pred=predicted)

In [9]:
print(class_report)

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        45
          1       0.96      1.00      0.98        49
          2       1.00      1.00      1.00        62
          3       0.95      0.98      0.96        41
          4       1.00      0.97      0.99        40
          5       1.00      0.98      0.99        44
          6       1.00      1.00      1.00        44
          7       0.95      1.00      0.98        42
          8       1.00      0.95      0.97        41
          9       0.98      0.95      0.96        42

avg / total       0.98      0.98      0.98       450



* **Precision**: the ability of the classifier not to label a non-8 digit as 8 (look down each column)
    * What fraction of predictions of digit 8 are correct?
* **Recall**: the ability of the classifier to find all 8s (look across each row)
    * What fraction of all digit 8s in the testing set does the classifier correctly identify as digit 8?
* **F1-score**: the harmonic mean of precision and recall, `2 * precision * recall / (precision + recall)`. It is the F-beta score with beta = 1, which weights precision and recall equally; its best value is 1 and its worst is 0.
    * Want to maximize this. The closer to 1 the better
* **Support**: the number of test samples that actually belong to each class
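
These definitions can be checked by hand against the confusion matrix from Section 3. The sketch below is a minimal plain-Python illustration, not how scikit-learn implements the report. The helper `class_metrics` and the list `confusion_rows` are hypothetical names introduced here, and the helper assumes every digit was predicted at least once, so no division by zero occurs.

```python
# Confusion matrix copied from Section 3
# (rows: expected digit, columns: predicted digit)
confusion_rows = [
    [45, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 49, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 62, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 40, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 39, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 43, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 44, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 42, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 39, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 40],
]


def class_metrics(matrix, digit):
    """Hypothetical helper: precision, recall, F1, and support for one class."""
    true_positives = matrix[digit][digit]
    # column sum: every sample that was predicted as this digit
    predicted_as_digit = sum(row[digit] for row in matrix)
    # row sum: every sample that really is this digit (the support)
    support = sum(matrix[digit])

    precision = true_positives / predicted_as_digit
    recall = true_positives / support
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, support


precision_8, recall_8, f1_8, support_8 = class_metrics(confusion_rows, 8)
print("precision:", format(precision_8, ".3f"))
print("recall:   ", format(recall_8, ".3f"))
print("f1-score: ", format(f1_8, ".3f"))
print("support:  ", support_8)
```

For digit 8, the column holds only the 39 correct predictions, so the precision is 39/39, while the row holds 41 samples, so the recall is 39/41. Both agree with the report once rounded to two decimals.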


<center><img src = "res/digits0-9.png" width=400/></center>
