# Loss Functions in Deep Learning

## 1. Binary Cross-Entropy (BCE)
**Use case:** Binary classification  
**Activation:** Sigmoid  
Formula:

$\text{BCE} = - \frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$
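
The formula above can be sketched in NumPy as follows. This is a minimal illustration that assumes the predictions are already sigmoid probabilities; the function name `binary_cross_entropy`, the `eps` clipping value, and the sample arrays are illustrative choices, not part of any library.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean BCE over N samples; y_prob holds sigmoid outputs in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so log() never sees exactly 0 or 1 (eps is an assumed value).
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.1, 0.8, 0.3])
bce = binary_cross_entropy(y_true, y_prob)
print(bce)
```

Confident, correct probabilities give a loss near zero, while a confident wrong prediction is penalised heavily because of the logarithm.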


---

## 2. Categorical Cross-Entropy (CCE)
**Use case:** Multi-class classification (> 2 classes)  
**Activation:** Softmax  
Formula:

$\text{CCE} = - \frac{1}{N} \sum_{i=1}^N \sum_{c=1}^C y_{i,c} \log(\hat{y}_{i,c})$

- $y_{i,c} \in \{0,1\}$ is one-hot encoded.
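
A minimal NumPy sketch of CCE with a softmax activation is shown below. The helper names `softmax` and `categorical_cross_entropy`, the `eps` value, and the example logits are assumptions made for illustration only.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_onehot, y_prob, eps=1e-12):
    """Mean CCE; y_onehot and y_prob both have shape (N, C)."""
    # Clip from below so log(0) cannot occur (eps is an assumed value).
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(y_prob), axis=1))

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
y_onehot = np.array([[1, 0, 0],
                     [0, 1, 0]])
probs = softmax(logits)
cce = categorical_cross_entropy(y_onehot, probs)
print(cce)
```

Because each one-hot row has a single 1, the inner sum over classes keeps only the log-probability of the true class.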

---

## 3. Sparse Categorical Cross-Entropy (Sparse CCE)
**Use case:** Multi-class classification with integer labels (no one-hot)  
Formula:

$\text{SparseCCE} = -\frac{1}{N} \sum_{i=1}^N \log(\hat{y}_{i, y_i})$

- $y_i$ = integer class index  
- $\hat{y}_{i, y_i}$ = predicted probability for the correct class.
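
The sketch below shows the sparse form in NumPy and checks that it gives the same value as the one-hot form. The function name `sparse_categorical_cross_entropy`, the `eps` value, and the sample probabilities are hypothetical and chosen only for illustration.

```python
import numpy as np

def sparse_categorical_cross_entropy(labels, y_prob, eps=1e-12):
    """Mean sparse CCE; labels has shape (N,), y_prob has shape (N, C)."""
    n = labels.shape[0]
    # Fancy indexing selects y_prob[i, labels[i]] for every row i.
    picked = np.clip(y_prob[np.arange(n), labels], eps, 1.0)
    return -np.mean(np.log(picked))

y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.3, 0.5]])
labels = np.array([0, 1, 2])

sparse = sparse_categorical_cross_entropy(labels, y_prob)

# Same value from the one-hot (CCE) form, built here with np.eye.
onehot = np.eye(3)[labels]
dense = -np.mean(np.sum(onehot * np.log(y_prob), axis=1))
print(sparse, dense)
```

The two losses are mathematically identical; the sparse form only avoids materialising the one-hot matrix.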

---

## 4. Dice Loss
**Use case:** Segmentation (handles class imbalance well)  
**Dice coefficient:**

$\text{Dice} = \frac{2 \sum_i p_i g_i + \epsilon}{\sum_i p_i + \sum_i g_i + \epsilon}$

**Dice loss:**

$\text{Dice Loss} = 1 - \text{Dice}$

- $p_i$ = predicted probability for pixel $i$  
- $g_i$ = ground truth label (0 or 1)
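
A minimal NumPy sketch of the soft Dice loss for one binary mask follows. The function name `dice_loss`, the default `eps=1e-6`, and the 2x2 example mask are assumptions for illustration; real pipelines usually compute this per batch and per class.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over a flattened binary mask."""
    p = np.asarray(pred, dtype=float).ravel()
    g = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(p * g)
    # eps keeps the ratio defined when both masks are empty.
    dice = (2.0 * intersection + eps) / (np.sum(p) + np.sum(g) + eps)
    return 1.0 - dice

target = np.array([[0, 1], [1, 1]])
pred = np.array([[0.1, 0.9], [0.8, 0.6]])
loss = dice_loss(pred, target)
print(loss)
```

With two empty masks the ratio reduces to $\epsilon / \epsilon = 1$, so the loss is 0 rather than undefined.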

---

## 5. IoU Loss (Jaccard Loss)
**Use case:** Segmentation, bounding box regression  
**IoU:**

$\text{IoU} = \frac{\sum_i p_i g_i + \epsilon}{\sum_i p_i + \sum_i g_i - \sum_i p_i g_i + \epsilon}$

**IoU loss:**

$\text{IoU Loss} = 1 - \text{IoU}$
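
The soft IoU loss can be sketched in the same way. The function name `iou_loss`, the default `eps=1e-6`, and the example mask are illustrative assumptions; this covers the pixel-wise segmentation form only, not box regression.

```python
import numpy as np

def iou_loss(pred, target, eps=1e-6):
    """Soft IoU (Jaccard) loss over a flattened binary mask."""
    p = np.asarray(pred, dtype=float).ravel()
    g = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(p * g)
    # Union = |P| + |G| - |P ∩ G|, matching the denominator of the formula.
    union = np.sum(p) + np.sum(g) - intersection
    iou = (intersection + eps) / (union + eps)
    return 1.0 - iou

target = np.array([[0, 1], [1, 1]])
pred = np.array([[0.1, 0.9], [0.8, 0.6]])
loss = iou_loss(pred, target)
print(loss)
```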


---

## 6. Dice vs IoU Relationship

$\text{Dice} = \frac{2 \cdot \text{IoU}}{\text{IoU} + 1}$
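
This identity holds exactly when $\epsilon = 0$; with smoothing it is only approximate. The quick numerical check below uses randomly generated masks; the seed, array size, and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.integers(0, 2, size=1000).astype(float)  # ground-truth labels (0 or 1)
p = rng.random(1000)                             # predicted probabilities in [0, 1)

intersection = np.sum(p * g)
total = np.sum(p) + np.sum(g)

# Unsmoothed (epsilon = 0) versions of both coefficients.
dice = 2.0 * intersection / total
iou = intersection / (total - intersection)

dice_from_iou = 2.0 * iou / (iou + 1.0)
iou_from_dice = dice / (2.0 - dice)
print(dice, dice_from_iou)
print(iou, iou_from_dice)
```

Since the mapping is monotonic, the two metrics rank predictions identically, and Dice is never smaller than IoU.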



In [1]:
# root_mean_squared_error requires scikit-learn >= 1.4
from sklearn.metrics import mean_squared_error, mean_absolute_error, root_mean_squared_error

In [3]:
import numpy as np
import random

In [11]:
# 100 random integer targets and predictions in [0, 99] (unseeded, so values change per run)
y = np.array([random.randint(0, 99) for _ in range(100)])
y_hat = np.array([random.randint(0, 99) for _ in range(100)])

In [13]:
mse_custom= np.mean((y-y_hat)**2)
mse_custom

np.float64(1618.03)

In [15]:
mse_lib=mean_squared_error(y,y_hat)
mse_lib

1618.03

In [None]:
rmse_custom=np.sqrt(np.mean((y-y_hat)**2))
rmse_custom


np.float64(40.22474362876661)

In [19]:
rmse_lib=root_mean_squared_error(y,y_hat)
rmse_lib

40.22474362876661

In [21]:
mae_cust=np.mean(np.abs(y-y_hat))
mae_cust

np.float64(32.51)

In [23]:
mae_lib=mean_absolute_error(y,y_hat)
mae_lib

32.51

In [27]:
# Random 0/1 labels and random 0/1 "predictions" (hard labels, not probabilities)
y_class = np.random.randint(0, 2, 100)
y_hat_class = np.random.randint(0, 2, 100)
y_hat_class

array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0,
       1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0,
       0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0,
       1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1,
       0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1], dtype=int32)

In [32]:
# Manual BCE; the 1e-15 offset keeps log() finite when a prediction is exactly 0 or 1
bin_cust = -np.mean(y_class * np.log(y_hat_class + 1e-15) + (1 - y_class) * np.log(1 - y_hat_class + 1e-15))
bin_cust

np.float64(17.269388197455342)

In [38]:
from sklearn.metrics import log_loss

# scikit-learn signature: log_loss(y_true, y_pred)
bin_lib = log_loss(y_class, y_hat_class)
bin_lib

18.02182669455858
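
The manual and library BCE values differ (17.27 vs 18.02) even though both score the same predictions. A likely explanation, assuming a recent scikit-learn release in which `log_loss` clips probabilities to the machine epsilon of their floating-point dtype rather than to `1e-15`, is that each wrong hard prediction is penalised by a different constant. Both printed values are consistent with 50 of the 100 predictions being wrong. The sketch below reproduces the two numbers under those assumptions using only NumPy:

```python
import numpy as np

# Assumed: half of the 100 hard 0/1 predictions disagree with the labels.
mismatch_rate = 0.5

# Manual version: a wrong hard prediction costs -log(0 + 1e-15).
manual_penalty = -np.log(1e-15)

# Assumed library behaviour: clip to float64 machine epsilon (about 2.22e-16).
lib_penalty = -np.log(np.finfo(np.float64).eps)

manual_bce = mismatch_rate * manual_penalty
lib_bce = mismatch_rate * lib_penalty
print(manual_bce)  # about 17.27
print(lib_bce)     # about 18.02
```

Because the predictions here are hard labels, both numbers are dominated by the clipping constant. Passing probabilities (for example sigmoid outputs) instead should make the two implementations agree closely.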