# Loss functions
Loss functions measure how far a model's predictions are from the desired outputs; the lower the loss, the better the model is performing.

## Categorical Cross-Entropy
This function compares a ground-truth probability distribution with a predicted distribution. It is computed for a single sample.
Essentially, it compares the desired result with the one the model produced.
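For reference, the comparison can be written as a formula. The symbols are notation introduced here and do not come from the notebook: $y$ is the desired distribution, $\hat{y}$ is the predicted one, $C$ is the number of classes, and $\log$ is the natural logarithm.

$$
L = -\sum_{j=1}^{C} y_j \log \hat{y}_j
$$

A term contributes nothing when $y_j = 0$, and the loss grows as the predicted probability $\hat{y}_j$ of a class with large $y_j$ approaches zero.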

In [15]:
from math import log
import numpy as np

In [8]:
# Suppose the model produces the following distribution (softmax output)
softmax_output = [0.7, 0.1, 0.2]
# ... and the desired distribution is
desired_output = [0.9, 0.05, 0.05]
# ..., so the first label is the most likely one. Note that this is a "soft" target:
# a one-hot vector would put all of the probability on a single class, e.g. [1, 0, 0].

In [9]:
def cross_entropy(result: list, desired: list):
    return -sum([desired[j] * log(result[j]) for j in range(len(desired))])

In [11]:
loss = cross_entropy(softmax_output, desired_output)
print('Loss', loss)

Loss 0.5166085998162665


In [13]:
# With categorical cross-entropy, each input belongs to exactly ONE class, so the target
# probabilities form a one-hot vector:
desired = [0, 1, 0]
# The one-hot vector above indicates that the input belongs to the label with index 1 (0-based).
# Say the model has three outputs: the first is the probability that it's a dog, the second
# that it's a cat, and the third that it's a human. An output like [0.3, 0.5, 0.2] means the
# model thinks the input is most likely whatever comes second in the list, in our case a cat.
# The target [0, 1, 0] says that it is indeed a cat (and not a dog or a human).
# Every 0 in `desired` makes its product in the sum zero (see `cross_entropy` above), so those
# terms can be discarded. The one thing we have to know is THE INDEX OF THE CORRECT CHOICE,
# in our case 1 (0-based). The loss then reduces to -log of the predicted probability
# at that index.
def categorical_cross_entropy(result: list, desired_class_index: int):
    return -log(result[desired_class_index])

print(categorical_cross_entropy(softmax_output, 0))

0.35667494393873245
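As a quick sanity check, the index-based shortcut should give the same value as the full sum when the target is one-hot. The snippet below is a minimal sketch that re-declares both functions so it runs on its own; the `one_hot` vector is an illustrative assumption, not data from the notebook.

```python
from math import log

def cross_entropy(result, desired):
    # Full sum over all classes
    return -sum(d * log(r) for r, d in zip(result, desired))

def categorical_cross_entropy(result, desired_class_index):
    # Shortcut: only the term of the correct class survives
    return -log(result[desired_class_index])

softmax_output = [0.7, 0.1, 0.2]
one_hot = [1, 0, 0]  # illustrative one-hot target: the first class is correct

full = cross_entropy(softmax_output, one_hot)
shortcut = categorical_cross_entropy(softmax_output, one_hot.index(1))
print(full, shortcut)
```

Both values equal `-log(0.7)`, because the zero entries of the one-hot vector cancel every other term.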


We need to modify this function to work on batches:

In [26]:
# The first output - probability that it's a dog
# The second output - probability that it's a cat
# The third output - probability that it's a human
softmax_outputs = np.array([
    [0.7, 0.1, 0.2],
    [0.1, 0.5, 0.4],
    [0.02, 0.9, 0.08]
])

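# Dog - 0, cat - 1, human - 2. These are the INDICES of the correct answers for the three inputs (rows)
class_targets = np.array([0, 1, 1])
# The same targets as one-hot vectors (this assignment overrides the one above)
class_targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0]
])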

In [29]:
def categorical_cross_entropy(predicted, correct):
    # Number of samples in a batch
    samples = len(predicted)
    # Clip data to prevent log(0), which would give an infinite loss
    # Clip both sides to not drag mean towards any value
    predicted_clipped = np.clip(predicted, 1e-7, 1 - 1e-7)
    # Probabilities for target values -
    # only if labels are class indices
    if len(correct.shape) == 1:
        correct_confidences = predicted_clipped[
            range(samples),
            correct
        ]
    # Mask values - only for one-hot encoded labels
    elif len(correct.shape) == 2:
        correct_confidences = np.sum(
            predicted_clipped * correct,
            axis=1
        )
    else:
        raise ValueError('`correct` must be 1-D (class indices) or 2-D (one-hot)')
    # Losses
    negative_log_likelihoods = -np.log(correct_confidences)
    return np.mean(negative_log_likelihoods)

In [30]:
categorical_cross_entropy(softmax_outputs, class_targets)

0.38506088005216804
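The two label formats should produce the same loss, and clipping should keep the loss finite even when the model assigns probability 0 to the correct class. The sketch below checks both; `batch_cce` is a hypothetical, condensed re-declaration of the batch function so the block is self-contained, and the zero-probability row is made-up input for illustration.

```python
import numpy as np

def batch_cce(predicted, correct):
    # Hypothetical condensed version of the batch loss, for illustration only
    clipped = np.clip(predicted, 1e-7, 1 - 1e-7)
    if correct.ndim == 1:
        # Class indices: pick one probability per row
        confidences = clipped[np.arange(len(clipped)), correct]
    else:
        # One-hot labels: mask out everything except the correct class
        confidences = np.sum(clipped * correct, axis=1)
    return np.mean(-np.log(confidences))

softmax_outputs = np.array([
    [0.7, 0.1, 0.2],
    [0.1, 0.5, 0.4],
    [0.02, 0.9, 0.08]
])
sparse_targets = np.array([0, 1, 1])
# Build the one-hot version by selecting rows of the identity matrix
one_hot_targets = np.eye(3, dtype=int)[sparse_targets]

loss_sparse = batch_cce(softmax_outputs, sparse_targets)
loss_one_hot = batch_cce(softmax_outputs, one_hot_targets)

# Made-up worst case: probability 0 for the correct class (index 0)
worst_case = batch_cce(np.array([[0.0, 1.0, 0.0]]), np.array([0]))
print(loss_sparse, loss_one_hot, worst_case)
```

Without the clip, the last call would evaluate `-log(0)` and return `inf`; with it, the loss is capped at `-log(1e-7)`, roughly 16.1.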

## Calculating accuracy

In [31]:
class_targets = np.array([0, 1, 1])

def accuracy(predictions, targets):
    # Calculate values along second axis (axis of index 1)
    predictions = np.argmax(predictions, axis=1)
    # If targets are one-hot encoded - convert them
    if len(targets.shape) == 2:
        targets = np.argmax(targets, axis=1)
    # True evaluates to 1; False to 0
    return np.mean(predictions==targets)
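A short usage sketch follows. It re-declares `accuracy` so the block runs on its own, with the one-hot check performed on the `targets` argument, and it reuses the three softmax rows from the batch example. The third target array, `[0, 1, 2]`, is a made-up case added to show a non-perfect score.

```python
import numpy as np

def accuracy(predictions, targets):
    # Index of the highest probability in each row
    predicted_classes = np.argmax(predictions, axis=1)
    # If targets are one-hot encoded, convert them to class indices
    if targets.ndim == 2:
        targets = np.argmax(targets, axis=1)
    # True evaluates to 1, False to 0
    return np.mean(predicted_classes == targets)

softmax_outputs = np.array([
    [0.7, 0.1, 0.2],
    [0.1, 0.5, 0.4],
    [0.02, 0.9, 0.08]
])

acc_sparse = accuracy(softmax_outputs, np.array([0, 1, 1]))
acc_one_hot = accuracy(softmax_outputs, np.array([[1, 0, 0], [0, 1, 0], [0, 1, 0]]))
# Made-up targets where the last sample is labeled "human" (index 2)
acc_partial = accuracy(softmax_outputs, np.array([0, 1, 2]))
print(acc_sparse, acc_one_hot, acc_partial)
```

The first two calls agree because both label formats describe the same classes; the third drops to two out of three correct, since the model predicts class 1 for the last row.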