# Loss Function

The loss function, also called the cost function, quantifies how far a neural network's predictions are from the correct answers. The goal of training a deep neural network is to minimize this loss. A loss of zero would mean the model assigns full confidence to the correct answer for every sample.
Deep neural networks generate a confidence score for each output neuron, representing the probability that the given neuron corresponds to the correct class. The loss function evaluates these confidence scores and calculates the discrepancy between the predicted values and the actual labels.
For example, consider a model with three output neurons, producing the confidence scores:
$$[0.25,0.55,0.20]$$
In this case, the second neuron has the highest confidence score (0.55), so the model predicts the second class. If this prediction is correct, then ideally the confidence score for the correct neuron would be close to 1 and the scores for the incorrect neurons (first and third) close to 0.
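
To make the example concrete, the sketch below picks the predicted class from the three scores and shows what the loss would be for each possible true class. It assumes the loss is the negative natural log of the confidence given to the true class, which is the categorical cross-entropy introduced in the next section; the variable names are illustrative only.

```python
import math

# Confidence scores from the three-neuron example
confidences = [0.25, 0.55, 0.20]

# The predicted class is the index of the highest confidence score
predicted_class = max(range(len(confidences)), key=lambda i: confidences[i])
print("predicted class index:", predicted_class)

# Assumed loss: negative natural log of the confidence for the true class.
# The lower the confidence given to the true class, the higher the loss.
losses = [-math.log(c) for c in confidences]
for true_class, loss in enumerate(losses):
    print(f"if class {true_class} is correct, loss = {loss:.3f}")
```

The loss is smallest when the true class is the one the model already favours, and it grows as the confidence assigned to the true class shrinks.
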
The loss function drives optimization by guiding updates to the neural network's parameters. Specifically, its gradient tells an optimization technique (such as gradient descent) how to change the parameters so that the confidence score of the correct neuron increases and the confidence scores of the incorrect neurons decrease.
Training a neural network involves adjusting its parameters iteratively to minimize the loss, leading to improved accuracy and more reliable predictions.
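
The iterative idea can be sketched in a few lines. This is a deliberately simplified illustration, not a full training loop: the only "parameters" are three raw output scores (logits), the learning rate and step count are arbitrary choices, and the `softmax` helper is defined here for the example. It relies on the standard result that, for softmax followed by cross-entropy, the gradient of the loss with respect to the logits is the predicted probabilities minus the one-hot target.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

target = np.array([0.0, 1.0, 0.0])  # the second class is the correct one
logits = np.zeros(3)                # start with no preference between classes
learning_rate = 0.5                 # arbitrary value chosen for the sketch

losses = []
for step in range(20):
    probs = softmax(logits)
    losses.append(-np.log(probs[1]))  # loss for the correct class
    grad = probs - target             # gradient of the loss w.r.t. the logits
    logits -= learning_rate * grad    # gradient descent update

final_probs = softmax(logits)
print("first loss:", losses[0])
print("last loss: ", losses[-1])
print("final confidences:", final_probs)
```

Each step raises the confidence of the correct class and lowers the others, so the recorded loss falls from its starting value of `log(3)`.
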

## Categorical Cross-Entropy Loss

Categorical cross-entropy loss is the loss function most commonly used with a softmax activation in the output layer. The formula is
$$L_i = -\sum_{j=1}^{N} y_{i,j} \log(\hat{y}_{i,j})$$
where
- $L_i$ is the loss value for sample $i$
- $i$ is the index of the sample
- $j$ is the index of the output class (label)
- $N$ is the number of classes
- $y$ denotes the target values
- $\hat{y}$ denotes the predicted values
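
As a worked illustration, assume the target is one-hot, so that $y_{i,k} = 1$ for the correct class $k$ and $y_{i,j} = 0$ for every other class. All terms except one vanish and the sum collapses to the negative log of the confidence given to the correct class:

$$L_i = -\sum_{j} y_{i,j} \log(\hat{y}_{i,j}) = -\log(\hat{y}_{i,k})$$

For a prediction of $[0.7, 0.1, 0.2]$ with target $[1, 0, 0]$, using the natural logarithm:

$$L_i = -(1 \cdot \log 0.7 + 0 \cdot \log 0.1 + 0 \cdot \log 0.2) = -\log 0.7 \approx 0.357$$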

In [1]:
# categorical cross entropy hard coded
import math
softmax_output = [0.7,0.1,0.2]
target_output = [1,0,0]

loss = -(math.log(softmax_output[0]) * target_output[0] +
         math.log(softmax_output[1]) * target_output[1] +
         math.log(softmax_output[2]) * target_output[2])

print(loss)

0.35667494393873245


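Next, we compute the loss for a batch of samples. The first step is to pick out, for each sample, the confidence the model assigned to its target class.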

In [7]:
softmax_outputs = [[0.7, 0.1, 0.2],
                   [0.1, 0.5, 0.4],
                   [0.02, 0.9, 0.08]]
class_targets = [0,1,1]

print("Confidence assigned to the target class of each sample")
for target_index, distribution in zip(class_targets,softmax_outputs):
    print(distribution[target_index],end="\t")

Confidence assigned to the target class of each sample
0.7	0.5	0.9	
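
The same lookup can be done without a Python loop. The sketch below assumes the outputs are held in a NumPy array and uses integer array indexing to select one confidence per row, then takes the negative log and averages over the batch. The names `correct_confidences`, `sample_losses` and `mean_loss` are illustrative.

```python
import numpy as np

softmax_outputs = np.array([[0.7, 0.1, 0.2],
                            [0.1, 0.5, 0.4],
                            [0.02, 0.9, 0.08]])
class_targets = np.array([0, 1, 1])

# For row r, pick the column given by class_targets[r]
rows = np.arange(len(softmax_outputs))
correct_confidences = softmax_outputs[rows, class_targets]

# Per-sample loss is the negative natural log of that confidence
sample_losses = -np.log(correct_confidences)

# The batch loss is the mean of the per-sample losses
mean_loss = np.mean(sample_losses)

print(correct_confidences)
print(sample_losses)
print(mean_loss)
```
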

# Loss Class

In [14]:
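import numpy as np

class Loss:
    """Base class: averages the per-sample losses returned by forward()."""
    def calculate(self, output, y):
        # Per-sample losses from the subclass's forward pass
        sample_loss = self.forward(output, y)
        # Batch loss is the mean over all samples
        data_loss = np.mean(sample_loss)
        return data_loss

class CategoricalCrossEntropyLoss(Loss):
    def forward(self, y_pred, y_true):
        n_samples = len(y_pred)
        # Clip predictions away from 0 and 1 so that log(0) never occurs
        y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)
        if len(y_true.shape) == 1:
            # Sparse labels: pick the confidence at each target index
            correct_confidence = y_pred_clipped[range(n_samples), y_true]
        elif len(y_true.shape) == 2:
            # One-hot labels: multiply and sum to keep only the target class
            correct_confidence = np.sum(y_pred_clipped * y_true, axis=1)
        else:
            raise ValueError("y_true must be 1-D class indices or 2-D one-hot rows")

        negative_log_likelihood = -np.log(correct_confidence)
        return negative_log_likelihood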

In [18]:
softmax_outputs = np.array([[0.7, 0.1, 0.2],
                            [0.1, 0.5, 0.4],
                            [0.02, 0.9, 0.08]])
class_targets = np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 1, 0]])
loss_function = CategoricalCrossEntropyLoss()
loss = loss_function.calculate(softmax_outputs,class_targets)
print(loss)

0.38506088005216804
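
Two properties of the forward pass are worth checking: sparse and one-hot targets should give the same losses, and clipping should keep the loss finite even when the model assigns a confidence of exactly 0 to the correct class. The sketch below uses a hypothetical standalone helper, `cross_entropy`, that mirrors the logic of the class so the example runs on its own; the second sample is constructed to be confidently wrong.

```python
import numpy as np

def cross_entropy(y_pred, y_true, eps=1e-7):
    """Hypothetical helper mirroring the forward pass of the loss class."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    if y_true.ndim == 1:
        # Sparse labels: index the target column of each row
        confidences = y_pred[np.arange(len(y_pred)), y_true]
    else:
        # One-hot labels: mask out the non-target columns
        confidences = np.sum(y_pred * y_true, axis=1)
    return -np.log(confidences)

outputs = np.array([[0.7, 0.1, 0.2],
                    [0.0, 1.0, 0.0]])   # second sample: zero confidence in class 0
sparse_targets = np.array([0, 0])
one_hot_targets = np.array([[1, 0, 0],
                            [1, 0, 0]])

sparse_losses = cross_entropy(outputs, sparse_targets)
one_hot_losses = cross_entropy(outputs, one_hot_targets)

print(sparse_losses)
print(one_hot_losses)
```

Without the clip, the second sample would produce `-log(0)`, which is infinite and would make the mean loss of the whole batch infinite as well. With the clip, its loss is capped at `-log(1e-7)`, roughly 16.1.
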


# Accuracy

In [19]:
softmax_outputs = np.array([[0.7, 0.2, 0.1],
                            [0.5, 0.1, 0.4],
                            [0.02, 0.9, 0.08]])
class_targets = np.array([0, 1, 1])
predictions = np.argmax(softmax_outputs, axis=1)
if len(class_targets.shape) == 2:
    class_targets = np.argmax(class_targets, axis=1)
accuracy = np.mean(predictions==class_targets)
print('accuracy', accuracy)

accuracy 0.6666666666666666
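
The same calculation can be wrapped in a small function that accepts either label format. This is a sketch with a hypothetical helper name, `accuracy`; the one-hot targets are built from the sparse ones with `np.eye` purely for the demonstration.

```python
import numpy as np

def accuracy(outputs, targets):
    """Fraction of samples whose highest-confidence class matches the target."""
    predictions = np.argmax(outputs, axis=1)
    if targets.ndim == 2:
        # Convert one-hot rows to class indices
        targets = np.argmax(targets, axis=1)
    return np.mean(predictions == targets)

softmax_outputs = np.array([[0.7, 0.2, 0.1],
                            [0.5, 0.1, 0.4],
                            [0.02, 0.9, 0.08]])
sparse_targets = np.array([0, 1, 1])
one_hot_targets = np.eye(3, dtype=int)[sparse_targets]

acc_sparse = accuracy(softmax_outputs, sparse_targets)
acc_one_hot = accuracy(softmax_outputs, one_hot_targets)
print(acc_sparse, acc_one_hot)
```

The second sample is misclassified (the model favours class 0 while the target is class 1), so two of the three predictions are correct in both label formats.
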
