# Classification Accuracy

A simple way to evaluate a set of predictions on a classification problem is to measure their accuracy.

Classification accuracy is the fraction of all predictions that were correct, usually reported as a percentage:

Accuracy = $\frac{\text{correct predictions}}{\text{total predictions}}$

In [1]:
# calculate the accuracy percentage between 2 lists
def correct_counter(actual, predicted):

    # counter for matching predictions, starting at 0
    correct = 0

    # visit every index, including the last one
    for i in range(len(actual)):
        # add 1 to the counter if there is a match
        if actual[i] == predicted[i]:
            correct += 1

    # convert the fraction of matches to a percentage with 2 decimals
    return round(correct / len(actual) * 100, 2)


In [2]:
actual = [0,0,0,1,1,1,0,1,0,1]
predicted = [1,0,1,0,0,1,0,0,1,1]

accuracy = correct_counter(actual, predicted)
print(accuracy)

40.0
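Accuracy is only meaningful when every actual value is paired with exactly one prediction, so a length mismatch between the two lists can silently skew the score. The following is a minimal sketch of a defensive variant; the name `accuracy_checked` is hypothetical and not part of the notebook above.

```python
def accuracy_checked(actual, predicted):
    """Hypothetical helper: accuracy as a percentage, rejecting bad input."""
    if len(actual) != len(predicted):
        raise ValueError(
            f"length mismatch: {len(actual)} actual vs {len(predicted)} predicted"
        )
    if not actual:
        raise ValueError("cannot compute the accuracy of an empty list")

    # zip pairs each actual label with its prediction;
    # each True comparison counts as 1 when summed
    correct = sum(a == p for a, p in zip(actual, predicted))
    return round(correct / len(actual) * 100, 2)


print(accuracy_checked([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of 4 correct → 75.0
```

Raising an error is one design choice among several; truncating to the shorter list would also run, but it would hide the missing prediction.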


# Confusion Matrix

A confusion matrix summarizes all the predictions made against the actual values.

The results are presented as a matrix of counts. In the implementation below, each row corresponds to an actual class and each column to a predicted class, so cell (i, j) counts the samples of actual class i that were predicted as class j. Correct predictions lie on the main diagonal.

In [3]:
def confusion_matrix(actual, predicted):
    # get the unique classification labels from both lists
    unique = set(actual) | set(predicted)

    # create a square matrix of zeros: one row and one column per label
    matrix = [[0 for _ in range(len(unique))] for _ in range(len(unique))]

    # map each label to its row/column index, in sorted order
    lookup = dict()
    for i, value in enumerate(sorted(unique)):
        lookup[value] = i

    # rows are indexed by the actual label, columns by the predicted label
    for i in range(len(actual)):
        x = lookup[actual[i]]
        y = lookup[predicted[i]]
        matrix[x][y] += 1
    return unique, matrix

In [4]:
actual = [0,2,0,2,1,1,0,2,0,1,2,0,1,2,2]
predicted = [1,0,1,2,0,1,0,0,1,2,1,1,0,2,2]

unique, matrix = confusion_matrix(actual, predicted)
print(f'unique labels: {unique}')
print(f'matrix: {matrix}')

unique labels: {0, 1, 2}
matrix: [[1, 4, 0], [2, 1, 1], [2, 1, 3]]
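The nested list is hard to read at a glance. In `confusion_matrix` the first index comes from `actual` and the second from `predicted`, so the diagonal holds the correct predictions. The sketch below labels the rows and columns and recovers accuracy from that diagonal. Both helper names (`format_matrix`, `accuracy_from_matrix`) are hypothetical, and the formatting assumes short labels such as single digits.

```python
def format_matrix(labels, matrix):
    """Hypothetical helper: rows are actual classes (A), columns predicted (P)."""
    header = "    " + " ".join(f"P={label}" for label in labels)
    rows = [
        f"A={label} " + " ".join(f"{count:>3}" for count in row)
        for label, row in zip(labels, matrix)
    ]
    return "\n".join([header] + rows)


def accuracy_from_matrix(matrix):
    """Hypothetical helper: share of all counts that sit on the diagonal, in percent."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return round(correct / total * 100, 2)


# the matrix printed by the cell above
labels = [0, 1, 2]
matrix = [[1, 4, 0], [2, 1, 1], [2, 1, 3]]

print(format_matrix(labels, matrix))
print(accuracy_from_matrix(matrix))  # 5 correct out of 15 → 33.33
```

The off-diagonal cells show where the errors go: for example, four samples of actual class 0 were predicted as class 1.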


# Mean Absolute Error

The mean absolute error (MAE) is the average of the absolute errors. Each error is made absolute (non-negative) so that positive and negative errors do not cancel out when they are added together.

$$
\text{MAE} = \frac{\sum_{i=1}^{N} |\text{predicted}_i - \text{actual}_i|}{\text{total predictions}}
$$


In [5]:
def mae_metric(actual, predicted):
    # initiate sum error as 0
    sum_error = 0.0

    for i in range(len(actual)):
        # add to sum_error the absolute difference between predicted and actual values
        sum_error += abs(predicted[i] - actual[i])
    
    # return the average value
    return sum_error / len(actual)

In [6]:
actual = [1.55, 1.75, 1.3, 1.2, 1.85, 1.95]
predicted = [1.6, 1.8, 1.25, 1.15, 1.9, 1.9]

mae = mae_metric(actual, predicted)
mae

0.05000000000000001
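
As a check on the output above, the arithmetic can be worked by hand. Every prediction in this example misses its actual value by exactly 0.05, so:

$$
\text{MAE} = \frac{0.05 + 0.05 + 0.05 + 0.05 + 0.05 + 0.05}{6} = \frac{0.30}{6} = 0.05
$$

The trailing digits in the printed value `0.05000000000000001` come from binary floating-point rounding, not from the metric itself.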

# Root Mean Squared Error

Root Mean Squared Error (RMSE) is another useful metric for measuring the error in a set of regression predictions.

It is calculated as the square root of the mean of the squared differences between the actual and predicted outcomes.

$$
\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (\text{predicted}_i - \text{actual}_i)^2}{\text{total predictions}}}
$$

In [7]:
def rmse_metric(actual, predicted):
    # initiate sum error as 0
    sum_error = 0.0

    for i in range(len(actual)):
        # get the difference between the predicted and actual values
        prediction_error = predicted[i] - actual[i]

        # add the squared difference to the running sum
        sum_error += prediction_error ** 2
        
    mean_error = sum_error / len(actual)
    return mean_error ** 0.5

In [8]:
actual = [1.55, 1.75, 1.3, 1.2, 1.85, 1.95]
predicted = [1.6, 1.8, 1.25, 1.15, 1.9, 1.9]

rmse = rmse_metric(actual, predicted)
rmse

0.05000000000000001
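
MAE and RMSE agree in the example above because every error has the same magnitude. They diverge once the errors are uneven, since squaring weights large errors more heavily. The sketch below illustrates this with made-up data; `compact_mae` and `compact_rmse` are hypothetical compact versions of the two functions, included only to keep the example self-contained.

```python
def compact_mae(actual, predicted):
    # mean of the absolute differences
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def compact_rmse(actual, predicted):
    # square root of the mean of the squared differences
    mean_sq = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return mean_sq ** 0.5


actual = [1.0, 2.0, 3.0, 4.0]
even = [2.0, 3.0, 4.0, 5.0]    # every prediction is off by 1
uneven = [1.0, 2.0, 3.0, 8.0]  # one prediction is off by 4, the rest are exact

print(compact_mae(actual, even), compact_rmse(actual, even))      # → 1.0 1.0
print(compact_mae(actual, uneven), compact_rmse(actual, uneven))  # → 1.0 2.0
```

Both prediction sets have the same MAE, but the single large miss doubles the RMSE, which is why RMSE is often preferred when large errors are especially costly.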