Now let's look at another weakness of the accuracy metric. Continuing with banking tasks, suppose we have trained a credit scoring model that the bank uses to decide whether to grant a loan to an applicant.  
The model predicts who should get a loan (a reliable client who will make payments on time) and who should not (a client who regularly misses loan payments).

In [1]:
import pandas as pd

clients = [0, 0, 0, 0, 0,
           1, 1, 1, 1, 1]
first_model_pred = [0, 0, 1, 1, 1,
                    1, 1, 1, 1, 1]
second_model_pred = [0, 0, 0, 0, 0,
                     1, 1, 0, 0, 0]


df = pd.DataFrame({
    'clients': clients,
    'first_model': first_model_pred,
    'second_model': second_model_pred
})

df

,clients,first_model,second_model
0,0,0,0
1,0,0,0
2,0,1,0
3,0,1,0
4,0,1,0
5,1,1,1
6,1,1,1
7,1,1,0
8,1,1,0
9,1,1,0


We have two models; let's test them on 10 clients.

In the clients column, 0 marks a client who will not repay the loan and 1 marks a client who will.
The models' predictions use the same encoding: 0 means the model predicts non-repayment, so we do not grant the loan; 1 means the model predicts repayment, so we can trust the client and lend the money.
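To make the 0/1 encoding easier to read, the codes can be mapped to text labels. This is only an illustrative sketch: the label strings and the names `outcome_labels`, `decision_labels`, and `readable` are assumptions introduced here, not part of the original notebook.

```python
import pandas as pd

clients = [0, 0, 0, 0, 0,
           1, 1, 1, 1, 1]
first_model_pred = [0, 0, 1, 1, 1,
                    1, 1, 1, 1, 1]

# Hypothetical label names, chosen only for readability
outcome_labels = {0: "will not repay", 1: "will repay"}
decision_labels = {0: "reject", 1: "approve"}

# Map the numeric codes to text: true outcome next to the model's decision
readable = pd.DataFrame({
    "actual_outcome": pd.Series(clients).map(outcome_labels),
    "first_model_decision": pd.Series(first_model_pred).map(decision_labels),
})

print(readable)
```

In this view, a row such as "will not repay" next to "approve" is immediately visible as a risky decision.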

In [2]:
df['first_model_right'] = df['clients'] == df['first_model']
df['second_model_right'] = df['clients'] == df['second_model']

df

,clients,first_model,second_model,first_model_right,second_model_right
0,0,0,0,True,True
1,0,0,0,True,True
2,0,1,0,False,True
3,0,1,0,False,True
4,0,1,0,False,True
5,1,1,1,True,True
6,1,1,1,True,True
7,1,1,0,True,False
8,1,1,0,True,False
9,1,1,0,True,False


In [3]:
print(f"Accuracy of the first model {df['first_model_right'].sum() / df.shape[0]}")
print(f"Accuracy of the second model {df['second_model_right'].sum() / df.shape[0]}")

Accuracy of the first model 0.7
Accuracy of the second model 0.7


Both models score 0.7. Accuracy only counts the share of correct answers across all classes, and each model got 7 of 10 predictions right, so by this metric they look equally good. But the two models make different errors. Let's find them.
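Accuracy is just the fraction of predictions that match the true labels. The sketch below recomputes it directly from the lists; the helper name `accuracy` is an assumption made for illustration and does not come from the notebook.

```python
import pandas as pd

clients = [0, 0, 0, 0, 0,
           1, 1, 1, 1, 1]
first_model_pred = [0, 0, 1, 1, 1,
                    1, 1, 1, 1, 1]
second_model_pred = [0, 0, 0, 0, 0,
                     1, 1, 0, 0, 0]


def accuracy(y_true, y_pred):
    """Hypothetical helper: share of predictions equal to the true labels."""
    y_true = pd.Series(y_true)
    y_pred = pd.Series(y_pred)
    # The mean of a boolean Series is the fraction of True values
    return float((y_true == y_pred).mean())


acc_first = accuracy(clients, first_model_pred)
acc_second = accuracy(clients, second_model_pred)

print(f"Accuracy of the first model {acc_first}")
print(f"Accuracy of the second model {acc_second}")
```

Nothing in this computation records which class a mistake was made on, which is why two models with very different behavior can get the same score.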

First, let's look at the first model's errors:

In [4]:
df[~df['first_model_right']]

,clients,first_model,second_model,first_model_right,second_model_right
2,0,1,0,False,True
3,0,1,0,False,True
4,0,1,0,False,True


In [5]:
# show only the true values and the first model's predictions
df[~df['first_model_right']][['clients', 'first_model']]

,clients,first_model
2,0,1
3,0,1
4,0,1


In all three errors, the first model approved a loan for a client who would not repay it (true value 0, prediction 1).

Now it is the turn of the second model:

In [6]:
# select the rows where the second model was wrong
df[~df['second_model_right']]

,clients,first_model,second_model,first_model_right,second_model_right
7,1,1,0,True,False
8,1,1,0,True,False
9,1,1,0,True,False


In [7]:
# show only the true values and the second model's predictions
df[~df['second_model_right']][['clients', 'second_model']]

,clients,second_model
7,1,0
8,1,0
9,1,0


Here it is the other way around: the second model rejected 3 clients who would have repaid their loans (true value 1, prediction 0).

These are different kinds of errors, and they can affect our assessment of how useful a trained model is in different ways.
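The two kinds of errors can be counted separately for each model. The sketch below is one possible way to do it with plain pandas; the helper `error_counts` and the keys `false_approvals` and `false_rejections` are names assumed here for illustration. When 1 is treated as the positive class, these are commonly called false positives and false negatives.

```python
import pandas as pd

df = pd.DataFrame({
    'clients': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    'first_model': [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    'second_model': [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
})


def error_counts(y_true, y_pred):
    """Hypothetical helper: count each kind of error separately."""
    # Loan approved (prediction 1) for a client who will not repay (true 0)
    false_approvals = int(((y_true == 0) & (y_pred == 1)).sum())
    # Loan rejected (prediction 0) for a client who would repay (true 1)
    false_rejections = int(((y_true == 1) & (y_pred == 0)).sum())
    return {'false_approvals': false_approvals,
            'false_rejections': false_rejections}


first_errors = error_counts(df['clients'], df['first_model'])
second_errors = error_counts(df['clients'], df['second_model'])

print('first model: ', first_errors)
print('second model:', second_errors)
```

Both models make three errors in total, but the first model's are all wrongly approved loans and the second model's are all wrongly rejected ones, a difference the single accuracy number hides.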