## Evaluation for Classification

### Preamble

In [1]:
%matplotlib notebook
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits

dataset = load_digits()
X, y = dataset.data, dataset.target

for class_name, class_count in zip(dataset.target_names, np.bincount(dataset.target)):
    print(class_name, class_count)

0 178
1 182
2 177
3 183
4 181
5 182
6 181
7 179
8 174
9 180


In [2]:
# Creating a dataset with imbalanced binary classes:
# Negative class (0) is 'not digit 1'
# Positive class (1) is 'digit 1'
y_binary_imbalanced = y.copy()
y_binary_imbalanced[y_binary_imbalanced != 1] = 0

print('Original labels:\t', y[1:30])
print('Binary labels:\t', y_binary_imbalanced[1:30])

Original labels:	 [1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9]
Binary labels:	 [1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]


In [4]:
np.bincount(y_binary_imbalanced)  # Negative class (0) is the most frequent class

array([1615,  182], dtype=int64)

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y_binary_imbalanced, random_state = 0)

## Accuracy of SVM classifier
from sklearn.svm import SVC

svm = SVC(kernel = 'rbf', C = 1).fit(X_train, y_train)
svm.score(X_test, y_test)

0.90888888888888886

### Dummy Classifier


A dummy classifier makes predictions using simple rules that ignore the input features. It is useful as a baseline for comparison against real classifiers, especially when the classes are imbalanced.

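Strategies available for the dummy classifier:

- most_frequent: always predicts the most frequent label in the training set
- stratified: predicts at random, following the training set class distribution
- uniform: predicts each class uniformly at random
- constant: always predicts a constant label supplied by the user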
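
To make the difference between the strategies concrete, here is a minimal sketch on a toy label vector. The data (`X_toy`, `y_toy`) is hypothetical and is not the digits dataset used in this notebook; `random_state` is set only so that the random strategies are repeatable.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy imbalanced labels (hypothetical data): 9 negatives, 1 positive.
y_toy = np.array([0] * 9 + [1])
# Dummy classifiers ignore the features, so a placeholder column is enough.
X_toy = np.zeros((len(y_toy), 1))

# most_frequent: always predicts the majority class (0 here).
majority = DummyClassifier(strategy='most_frequent').fit(X_toy, y_toy)
majority_pred = majority.predict(X_toy)

# constant: always predicts the label passed via `constant` (1 here).
always_positive = DummyClassifier(strategy='constant', constant=1).fit(X_toy, y_toy)
positive_pred = always_positive.predict(X_toy)

# stratified and uniform draw random labels; random_state makes them repeatable.
stratified = DummyClassifier(strategy='stratified', random_state=0).fit(X_toy, y_toy)
uniform = DummyClassifier(strategy='uniform', random_state=0).fit(X_toy, y_toy)

print('most_frequent:', majority_pred)
print('constant=1:   ', positive_pred)
print('stratified:   ', stratified.predict(X_toy))
print('uniform:      ', uniform.predict(X_toy))
```

The constant strategy is handy when the baseline of interest is "always predict the positive class", for example when comparing recall or F1 rather than accuracy.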

In [18]:
from sklearn.dummy import DummyClassifier
# Negative class(0) is most frequent
dummy_majority = DummyClassifier(strategy = 'most_frequent').fit(X_train, y_train)
## dummy_majority = DummyClassifier(strategy = 'uniform').fit(X_train, y_train)
## dummy_majority = DummyClassifier(strategy = 'stratified').fit(X_train, y_train)
# the most_frequent dummy classifier always predicts class 0
y_dummy_predictions = dummy_majority.predict(X_test)

y_dummy_predictions

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0,

In [13]:
dummy_majority.score(X_test, y_test)

0.9044444444444445

If a classifier's accuracy is close to the null baseline accuracy, possible causes include:

- Ineffective, erroneous, or missing features
- Poor choice of kernel or hyperparameters
- Large class imbalance
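
The null baseline accuracy can also be worked out by hand from the class counts, which makes the comparison above easier to read. The sketch below assumes the test split contains 407 negatives and 43 positives (the row totals of the confusion matrices printed later in this notebook) and uses the RBF SVM accuracy reported earlier.

```python
# Class counts in the test split, taken from the confusion matrices in this notebook.
n_negative = 407  # 'not digit 1'
n_positive = 43   # 'digit 1'
n_total = n_negative + n_positive

# Always predicting the majority class is correct on every majority instance.
null_accuracy = max(n_negative, n_positive) / n_total

# Accuracy reported above for SVC(kernel='rbf', C=1).
svm_rbf_accuracy = 0.90888888888888886

# Express the gap over the baseline as a number of test instances.
extra_correct = round((svm_rbf_accuracy - null_accuracy) * n_total)

print('Null accuracy: {:.4f}'.format(null_accuracy))
print('RBF SVM gain over baseline: {} instances out of {}'.format(extra_correct, n_total))
```

Seen this way, the RBF SVM gets only a couple more test instances right than a model that never predicts the positive class.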

In [14]:
svm = SVC(kernel = 'linear', C = 1).fit(X_train, y_train)
svm.score(X_test, y_test)

0.97777777777777775

Dummy regressors are the counterpart of dummy classifiers for regression. DummyRegressor supports these strategies:

- mean: predicts the mean of the training targets
- median: predicts the median of the training targets
- quantile: predicts a user-provided quantile of the training targets
- constant: predicts a constant user-provided value
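
As a rough illustration of these strategies, the sketch below fits `DummyRegressor` on a small hypothetical target vector. The values in `y_toy`, the quantile of 0.75, and the constant of 5.0 are arbitrary choices for the example.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Hypothetical regression targets; the features are ignored by dummy models.
y_toy = np.array([1.0, 2.0, 3.0, 10.0])
X_toy = np.zeros((len(y_toy), 1))

strategies = {
    'mean': DummyRegressor(strategy='mean'),
    'median': DummyRegressor(strategy='median'),
    'quantile': DummyRegressor(strategy='quantile', quantile=0.75),
    'constant': DummyRegressor(strategy='constant', constant=5.0),
}

# Each fitted model predicts the same single value for any input row.
predictions = {}
for name, model in strategies.items():
    model.fit(X_toy, y_toy)
    predictions[name] = float(model.predict(X_toy[:1])[0])
    print('{}: {}'.format(name, predictions[name]))
```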

- False positive (FP): Type I error, a negative instance predicted as positive
- False negative (FN): Type II error, a positive instance predicted as negative
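
The two error types can be counted directly from paired true and predicted labels. The sketch below uses a short hypothetical label list (not the digits data) and plain Python, with 1 as the positive class.

```python
# Hypothetical true and predicted labels (1 = positive, 0 = negative).
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 1]

counts = {'TP': 0, 'TN': 0, 'FP': 0, 'FN': 0}
for truth, pred in zip(y_true, y_pred):
    if truth == 1 and pred == 1:
        counts['TP'] += 1   # correct positive
    elif truth == 0 and pred == 0:
        counts['TN'] += 1   # correct negative
    elif truth == 0 and pred == 1:
        counts['FP'] += 1   # Type I error
    else:
        counts['FN'] += 1   # Type II error

print(counts)
```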

### Confusion matrices

#### Binary (two class) confusion matrix

In [21]:
from sklearn.metrics import confusion_matrix

dummy_majority = DummyClassifier(strategy = 'most_frequent').fit(X_train, y_train)
y_majority_predicted = dummy_majority.predict(X_test)
confusion = confusion_matrix(y_test, y_majority_predicted)

print('Most frequent class(Dummy Classifier):\n', confusion)

Most frequent class(Dummy Classifier):
 [[407   0]
 [ 43   0]]


In [23]:
dummy_classprop = DummyClassifier(strategy = "stratified").fit(X_train, y_train)
y_classprop_predicted = dummy_classprop.predict(X_test)
confusion_prop = confusion_matrix(y_test, y_classprop_predicted)

print('Random class-proportional prediction (dummy classifier):\n', confusion_prop)

Random class-proportional prediction (dummy classifier):
 [[369  38]
 [ 36   7]]


In [25]:
svm = SVC(kernel = 'linear', C = 1).fit(X_train, y_train)
svm_predicted = svm.predict(X_test)
confusion_svm = confusion_matrix(y_test, svm_predicted)

print('SVM classifier (kernel = linear, C  = 1):\n', confusion_svm)

SVM classifier (kernel = linear, C  = 1):
 [[402   5]
 [  5  38]]


In [30]:
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(max_depth = 2).fit(X_train, y_train)
dt_predicted = tree.predict(X_test)
confusion_dt = confusion_matrix(y_test, dt_predicted)

print('Decision tree classifier (max_depth = 2):\n', confusion_dt)

Decision tree classifier (max_depth = 2):
 [[400   7]
 [ 17  26]]


In [29]:
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression().fit(X_train, y_train)
lr_predicted = lr.predict(X_test)
confusion_lr = confusion_matrix(y_test, lr_predicted) 

print('Logistic Regression (default settings):\n', confusion_lr)

Logistic Regression (default settings):
 [[401   6]
 [  6  37]]


### Evaluation metrics for binary classification

In [35]:
from sklearn.metrics import accuracy_score , precision_score, recall_score, f1_score
# Accuracy = (TP + TN) / (TP + TN + FP + FN)
# Precision = TP / (TP + FP)
# Recall = TP / (TP + FN)  Also known as sensitivity, or True Positive Rate
# F1 = 2 * Precision * Recall / (Precision + Recall) 

print('Accuracy: {:.2f}'.format(accuracy_score(y_test, dt_predicted)))

print('Precision: {:.2f}'.format(precision_score(y_test, dt_predicted)))

print('Recall: {:.2f}'.format(recall_score(y_test, dt_predicted)))

print('F1: {:.2f}'.format(f1_score(y_test, dt_predicted)))

Accuracy: 0.95
Precision: 0.79
Recall: 0.60
F1: 0.68
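
These scores can be cross-checked by hand against the decision tree confusion matrix printed earlier, `[[400 7] [17 26]]`. The sketch below assumes scikit-learn's layout for binary labels 0 and 1, where rows are true classes and columns are predicted classes, so the cells read TN, FP, FN, TP.

```python
# Cells of the decision tree confusion matrix from this notebook.
tn, fp = 400, 7    # true class 0: predicted 0, predicted 1
fn, tp = 17, 26    # true class 1: predicted 0, predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted positives, how many are correct
recall = tp / (tp + fn)      # of the actual positives, how many are found
f1 = 2 * precision * recall / (precision + recall)

print('Accuracy: {:.2f}'.format(accuracy))
print('Precision: {:.2f}'.format(precision))
print('Recall: {:.2f}'.format(recall))
print('F1: {:.2f}'.format(f1))
```

The hand-computed values round to the same figures as the scikit-learn output above, and they show why accuracy alone looks flattering here: most of the 450 test instances are true negatives.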
