## Performance analysis of classification models

In this notebook, we will analyze the performance of the classification models that we have trained in the previous notebook. We will use the following metrics:
* Accuracy
* Precision
* Recall
* F1 score

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

import pickle

In [3]:
with open('./data/logistic_model_example01.pkl', 'rb') as f:
    log_reg_model = pickle.load(f)  # fitted model saved in the previous notebook

In [4]:
df = pd.read_csv('./data/logistic_example01.csv')
df.head(3)

Unnamed: 0,kgs_smoked,cancer
0,-0.65956,0
1,5.78149,0
2,-8.247713,0


In [5]:
X = df[['kgs_smoked']]  # 2-D feature matrix, as scikit-learn estimators expect
y = df['cancer']        # 1-D target labels

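# There are four measures of model performance, all derived from the confusion matrix: TP and TN lie on the main diagonal, FP and FN on the off-diagonal. Accuracy does not care how the errors are split between FP and FN. For a fixed dataset, as the correct predictions decrease, the errors increase. In this problem, a model that errs toward FP is better than one that errs toward FN, because false positives are the more tolerable error.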

In [6]:
model_preds = log_reg_model.predict(X)

In [7]:
c_matrix = confusion_matrix(y, model_preds)

In [8]:
TP = c_matrix[1][1]
TN = c_matrix[0][0]
FP = c_matrix[0][1]
FN = c_matrix[1][0]
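
The indexing above relies on the layout that scikit-learn's `confusion_matrix` uses for binary labels: rows are true classes and columns are predicted classes. As a minimal sketch of that layout, the following builds the same 2×2 table by hand. The labels `y_true_toy` and `y_pred_toy` are made up for illustration and are not taken from the dataset.

```python
# Toy labels, invented purely for illustration (not from logistic_example01.csv).
y_true_toy = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_toy = [0, 1, 0, 1, 0, 1, 1, 0]

# matrix[true][pred]: rows are true classes, columns are predicted classes.
matrix = [[0, 0], [0, 0]]
for true_label, pred_label in zip(y_true_toy, y_pred_toy):
    matrix[true_label][pred_label] += 1

TN_toy, FP_toy = matrix[0]  # row 0: cases that are truly negative
FN_toy, TP_toy = matrix[1]  # row 1: cases that are truly positive
print(matrix)  # [[3, 1], [1, 3]]
```

Reading the rows this way is why `c_matrix[0][1]` is the false-positive count and `c_matrix[1][0]` is the false-negative count.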


In [9]:
accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * (precision * recall) / (precision + recall)

In [10]:
print(f'{"Accuracy:":12s}{accuracy:.2f}')
print(f'{"Precision:":12s}{precision:.2f}')
print(f'{"Recall:":12s}{recall:.2f}')
print(f'{"F1:":12s}{f1:.2f}')

Accuracy:   0.72
Precision:  0.77
Recall:     0.71
F1:         0.74
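
Accuracy alone does not show how the errors are split between false positives and false negatives. As a sketch of that point, the two sets of confusion-matrix counts below give the same accuracy but very different precision and recall. The counts are hypothetical, invented for illustration, and were not produced by `log_reg_model`.

```python
def summarize(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts on the same 100 cases (40 positives, 60 negatives).
model_a = summarize(tp=40, tn=40, fp=20, fn=0)   # all errors are false positives
model_b = summarize(tp=20, tn=60, fp=0, fn=20)   # all errors are false negatives

for name, scores in [("A", model_a), ("B", model_b)]:
    acc, prec, rec, f1_score = scores
    print(f"Model {name}: accuracy={acc:.2f} precision={prec:.2f} "
          f"recall={rec:.2f} f1={f1_score:.2f}")
```

Model A never misses a positive case, while model B misses half of them, yet accuracy cannot tell the two apart.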


But which is the best metric to use? The answer is: it depends on the problem. 

In this case, suppose a test costs $1,000, while the cost of losing a life is immeasurable (insurance companies do try to put a number on it, so let's say it's $10,000,000). In our sample population of smokers, about 50% die of cancer. If smokers make up 10% of the population, and we ignore cancer among non-smokers, then we can expect about 5% of the population to die of cancer. If we apply this model to a random individual from the entire population, the expected cost of a missed diagnosis is $10,000,000 × 5% = $500,000.

So the expected cost of a missed diagnosis is far higher than the cost of a false positive, which is only an unnecessary $1,000 test. We should therefore optimize for recall.
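
The arithmetic behind this comparison can be written out as a short sketch. The variable names are illustrative, and the calculation makes the simplifying assumption that cancer deaths occur only among smokers.

```python
# Figures taken from the scenario described above.
cost_of_test = 1_000             # dollars per test
cost_of_life = 10_000_000        # dollars, the placeholder value used in the text
p_smoker = 0.10                  # share of the population that smokes
p_cancer_given_smoker = 0.50     # share of smokers in the sample who die of cancer

# Probability that a random individual dies of cancer
# (simplifying assumption: only smokers are affected).
p_cancer = p_smoker * p_cancer_given_smoker

# Expected cost of not testing a random individual.
expected_missed_cost = cost_of_life * p_cancer

print(f"P(cancer) = {p_cancer:.0%}")
print(f"Expected cost of a missed diagnosis: ${expected_missed_cost:,.0f}")
print(f"Ratio to the cost of a test: {expected_missed_cost / cost_of_test:.0f}x")
```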

## Cost Weight Learning

We only know one model so far; later we will need to compare multiple models and determine which one to choose. In such cases, we need to know which score matters most in our context. In the scenario above, if we give a person the test and they don't have cancer, it costs $1,000. If we don't test a person who does have cancer, we save the $1,000, but the cost of the life is $10,000,000. In this particular case, we use a null model: assume everyone has cancer and give everyone the test.

In later examples the costs of the two errors will be closer, and in such cases a model with sufficient performance can yield cost savings and/or profit for a company.
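
To make the comparison between the null model and a trained model concrete, here is a minimal sketch of a total-cost calculation. The function `total_cost`, the population of 1,000 people, the confusion-matrix counts, and the $5,000 figure in the second scenario are all hypothetical. The sketch assumes every predicted positive is tested at $1,000 and every missed case incurs the full cost of a missed diagnosis.

```python
def total_cost(tp, fp, fn, cost_test, cost_missed):
    """Cost of acting on a model: test every predicted positive, pay for every miss."""
    return (tp + fp) * cost_test + fn * cost_missed

# Hypothetical population: 1,000 people, 50 of whom have cancer (5%).
# Null model: predict "cancer" for everyone, so nobody is missed.
null_counts = dict(tp=50, fp=950, fn=0)
# Hypothetical trained model that misses 14 of the 50 cases.
model_counts = dict(tp=36, fp=100, fn=14)

# Scenario from the text: a missed case costs far more than a test.
lopsided = dict(cost_test=1_000, cost_missed=10_000_000)
# Assumed scenario with closer costs, chosen only for illustration.
closer = dict(cost_test=1_000, cost_missed=5_000)

for label, costs in [("lopsided", lopsided), ("closer", closer)]:
    null_cost = total_cost(**null_counts, **costs)
    model_cost = total_cost(**model_counts, **costs)
    print(f"{label}: null model ${null_cost:,} vs trained model ${model_cost:,}")
```

With the lopsided costs the null model is far cheaper, because even a few missed cases dominate the total. With the closer costs, the hypothetical trained model comes out ahead because it avoids most of the unnecessary tests.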

