# B"H

## Calculating Network Error with Loss

The **loss function** is also referred to as the **cost function**.

<br>

---

The output of our neural network is a set of confidence scores, one per class, and more confidence in the correct answer is better. Because of this, we strive to _increase correct confidence_ and _decrease misplaced confidence_.

## Categorical Cross-Entropy Loss

Our model has a softmax activation function for the output layer, which means
it’s outputting a probability distribution. 

**Categorical cross-entropy** is used specifically to compare a “ground-truth” probability distribution (y or “targets”) with a predicted distribution (y-hat or “predictions”).

<br><br>

---


![](https://drive.google.com/uc?id=1sk7Zb-OCV3W7vx2a5qQv7HbReTbbMqHZ)

![](https://drive.google.com/uc?id=1qhlJBbRf_oARGJUWPWEgvJpeBnnCfW9b)

<br>

---

We’ll simplify it further to `-log(correct_class_confidence)`:

<br>

![](https://drive.google.com/uc?id=15cYt14T0YC8MJftsWwVTSseB2yHbNZwv)

![](https://drive.google.com/uc?id=1Y0wCm90FkvXUrpROnzjix85uXSdVEKJA)

<br>

---

Let's use the following for an example:

- Softmax output: `[0.7, 0.1, 0.2]` 
- Target probability distribution: `[1, 0, 0]`

Note that cross-entropy also works with target distributions that are not one-hot, such as `[0.2, 0.5, 0.3]`.
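
As a minimal sketch of that more general case, the same sum can be evaluated against a soft target. The `soft_target` values reuse the distribution mentioned above; `cross_entropy` is a hypothetical helper name introduced only for this illustration.

```python
import math

def cross_entropy(predictions, targets):
    """Cross-entropy between a target distribution and a predicted one."""
    # Sum target * log(prediction) over every class, then negate.
    return -sum(t * math.log(p) for p, t in zip(predictions, targets))

softmax_output = [0.7, 0.1, 0.2]
soft_target = [0.2, 0.5, 0.3]  # not one-hot, so no term drops out

soft_loss = cross_entropy(softmax_output, soft_target)
print(soft_loss)
```

With a soft target every class contributes to the loss, which is why the one-hot case below is so much simpler.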

When comparing the model’s results to a one-hot vector (as in our case), the other parts of the equation zero out, making the cross-entropy calculation relatively simple. 

This is also a special case of the cross-entropy calculation, called **categorical cross-entropy**. 


In [1]:
import math

In [2]:
softmax_output = [0.7, 0.1, 0.2]
target_output = [1, 0, 0]

In [3]:
# Full sum: each class contributes log(prediction) * target.
loss = -(
    math.log(softmax_output[0])*target_output[0] +
    math.log(softmax_output[1])*target_output[1] +
    math.log(softmax_output[2])*target_output[2]
)

print(loss)

0.35667494393873245


That’s the full categorical cross-entropy calculation, but let's simplify it. 

- Anything multiplied by 0 is 0, so we can skip every term where the target is 0.
- Any number multiplied by 1 remains the same, so the only term left is the log of the correct class's confidence.

In [5]:
correct_idx = 0  # position of the 1 in target_output

loss = -math.log(softmax_output[correct_idx])

print(loss)

0.35667494393873245
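
The two calculations above can be folded into a small check that the shortcut matches the full sum for any one-hot target. This is only a sketch: `full_loss` and `shortcut_loss` are illustrative names, and the code assumes every target vector is one-hot.

```python
import math

def full_loss(predictions, one_hot_target):
    # Full sum over all classes, as in the longer calculation.
    return -sum(t * math.log(p) for p, t in zip(predictions, one_hot_target))

def shortcut_loss(predictions, one_hot_target):
    # Only the correct class contributes, so look up its index directly.
    correct_idx = one_hot_target.index(1)
    return -math.log(predictions[correct_idx])

softmax_output = [0.7, 0.1, 0.2]
for target in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    a = full_loss(softmax_output, target)
    b = shortcut_loss(softmax_output, target)
    print(target, a, b, math.isclose(a, b))
```

Both functions should agree on every row, since the zeroed terms add nothing to the sum.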


--- 

<br><br>

**Categorical cross-entropy loss** has a useful property: the lower the confidence in the correct class, the larger the loss.

In [9]:
print("1.     :", -math.log(1.))
print("0.95   :", -math.log(0.95))
print("0.9    :", -math.log(0.9))
print("0.8    :", -math.log(0.8))
print("0.7    :", -math.log(0.7))
print("0.6    :", -math.log(0.6))
print("0.5    :", -math.log(0.5))
print("0.4    :", -math.log(0.4))
print("0.3    :", -math.log(0.3))
print("0.2    :", -math.log(0.2))
print("0.1    :", -math.log(0.1))
print("0.05   :", -math.log(0.05))
print("0.01   :", -math.log(0.01))
print("0.001  :", -math.log(0.001))
print("0.00001:", -math.log(0.00001))

1.     : -0.0
0.95   : 0.05129329438755058
0.9    : 0.10536051565782628
0.8    : 0.2231435513142097
0.7    : 0.35667494393873245
0.6    : 0.5108256237659907
0.5    : 0.6931471805599453
0.4    : 0.916290731874155
0.3    : 1.2039728043259361
0.2    : 1.6094379124341003
0.1    : 2.3025850929940455
0.05   : 2.995732273553991
0.01   : 4.605170185988091
0.001  : 6.907755278982137
0.00001: 11.512925464970229
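
The table stops short of a confidence of exactly 0, where the logarithm is undefined and `math.log` raises a `ValueError`. One common workaround, shown here only as a sketch, is to clip the confidence into a narrow range before taking the log. The helper name `safe_loss` and the bound `eps = 1e-7` are assumptions chosen for illustration, not values taken from this notebook.

```python
import math

def safe_loss(confidence, eps=1e-7):
    # Clip into [eps, 1 - eps] so the log never receives 0.
    clipped = min(max(confidence, eps), 1 - eps)
    return -math.log(clipped)

try:
    -math.log(0.0)
except ValueError as err:
    print("unclipped:", err)

print("clipped  :", safe_loss(0.0))
```

Clipping trades a small bias at the extremes for a loss that is always finite.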
