Sources:
* [How Loss Functions Work in Neural Networks and Deep Learning](https://builtin.com/machine-learning/loss-functions)
* [A Friendly Introduction to Siamese Networks](https://builtin.com/machine-learning/siamese-network)

In [None]:
import numpy as np

# Loss Functions in Neural Networks

Training a neural network means adjusting its weights to minimize a loss function, which measures how far the network's predictions are from the target values.

## Mean Squared Error Loss Function

\begin{align}
\text{MSE} = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2
\end{align}

In [None]:
# used when the goal is to predict a continuous scalar value
# usage example: regression

y_pred = np.array([0.6, 1.29, 1.99, 2.69, 3.4])  # model predictions (ŷ in the formula)
y_true = np.array([1, 1, 2, 2, 4])               # ground-truth targets (y in the formula)

MSE = np.square(np.subtract(y_true, y_pred)).mean()
print(MSE)

0.21606
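
To connect this loss to the training objective stated at the top, here is a minimal sketch of gradient descent driving the MSE down for a one-weight model `y_pred = w * x`. The toy data, the `learning_rate` value, and the step count are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical toy data: the targets follow y = 2x exactly
x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = 2.0 * x

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

w = 0.0                # initial weight (assumed starting point)
learning_rate = 0.05   # assumed step size
losses = []

for _ in range(50):
    y_pred = w * x
    losses.append(mse(y_true, y_pred))
    # derivative of the MSE with respect to w
    grad = -2.0 * np.mean((y_true - y_pred) * x)
    w -= learning_rate * grad

print("learned w:", w)
print("first loss:", losses[0], "last loss:", losses[-1])
```

Each step moves `w` against the gradient, so the recorded loss shrinks as `w` approaches 2.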


## Cross-Entropy Loss Function

\begin{align}
\text{Loss} = -\sum_{i=1}^C y_i \log(\hat{y}_i)
\end{align}

In [None]:
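# used when the model outputs a probability for each class
# usage example: classification

y_pred = np.array([0.1, 0.3, 0.4, 0.2])  # predicted class probabilities (ŷ in the formula)
y_true = np.array([0, 1, 0, 0])          # one-hot label (y in the formula); the true class is index 1

cross_entropy = -np.sum(np.multiply(y_true, np.log(y_pred)))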
print(cross_entropy)

1.2039728043259361
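
The formula takes a logarithm of the predicted probability, which is undefined if the model assigns exactly 0 to the true class. The sketch below wraps the same computation in a function that clips predictions first; the `eps` value and the two example prediction vectors are assumptions for illustration, not part of the original notebook.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy for a one-hot label; eps is an assumed clipping constant."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0, 1, 0, 0])

# hypothetical predictions: one confident and right, one confident and wrong
confident_right = np.array([0.05, 0.9, 0.03, 0.02])
confident_wrong = np.array([0.9, 0.0, 0.05, 0.05])

loss_right = cross_entropy(y_true, confident_right)
loss_wrong = cross_entropy(y_true, confident_wrong)

print("confident and right:", loss_right)
print("confident and wrong:", loss_wrong)
```

The loss stays small when the true class receives high probability and grows large, but remains finite, when it receives none.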


## Mean Absolute Percentage Error

* Also known as mean absolute percentage deviation (MAPD)

\begin{align}
\text{MAPE} = \frac{1}{n} \sum_{i=1}^n \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100
\end{align}

In [None]:
# like mean absolute error, but expresses each error as a percentage of the true value
# usage example: demand forecasting

y_pred = np.array([0.6, 1.29, 1.99, 2.69, 3.4])  # forecasts (ŷ in the formula)
y_true = np.array([1, 1, 2, 2, 4])               # actual values (y in the formula); must be nonzero

MAPE = (100.0/len(y_true))*np.sum(np.abs(y_true-y_pred)/y_true)
print(MAPE)

23.799999999999997
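
Because the formula divides by the true value, MAPE is undefined whenever a target is zero. One possible way to handle this, shown here as a sketch rather than a standard definition, is to skip zero targets; the helper name `mape` and the skip-zeros policy are assumptions.

```python
import numpy as np

def mape(y_true, y_pred):
    """MAPE over the entries whose true value is nonzero (assumed policy)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    nonzero = y_true != 0
    if not nonzero.any():
        return float("nan")  # nothing to average over
    errors = np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])
    return 100.0 * np.mean(errors)

# same data as the cell above
print(mape([1, 1, 2, 2, 4], [0.6, 1.29, 1.99, 2.69, 3.4]))

# a zero target is ignored instead of causing a division by zero
print(mape([0, 2], [1, 1]))
```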


# Loss Functions Used in Siamese Neural Networks


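A Siamese neural network passes each input of a pair through the same network and compares the resulting outputs to determine how similar the inputs are. Its loss functions are therefore defined on the distance between those outputs rather than on a single prediction and its label.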
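
As a rough sketch of the comparison step, the code below sends two inputs through one shared function and measures the Euclidean distance between the results. The random weight matrix `W`, the `tanh` layer, and the helper names `embed` and `distance` are all hypothetical stand-ins for a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))  # hypothetical shared weights (untrained)

def embed(x):
    """Map a 4-D input to a 2-D embedding using the shared weights."""
    return np.tanh(x @ W)

def distance(a, b):
    """Euclidean distance between the embeddings of two inputs."""
    return np.linalg.norm(embed(a) - embed(b))

x1 = np.array([1.0, 0.0, 2.0, 1.0])
x2 = np.array([1.0, 0.1, 2.0, 1.0])

print("embedding of x1:", embed(x1))
print("distance(x1, x2):", distance(x1, x2))
print("distance(x1, x1):", distance(x1, x1))
```

Both inputs use the same `W`, which is what makes the two branches "Siamese"; the losses in this section operate on the distance this produces.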

## Contrastive Loss

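\begin{align}
L = (1 - y) \cdot \max(0, m - D)^2 + y \cdot D^2
\end{align}

Here $D$ is the Euclidean distance between the two outputs of a pair, $m$ is the margin, and $y = 1$ marks a similar pair while $y = 0$ marks a dissimilar pair.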

In [None]:
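# four pairs of 2-D points: row i of x1 is compared with row i of x2
x1 = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
x2 = np.array([[1, 2], [3, 3], [6, 7], [5, 5]])
y = np.array([1, 1, 0, 0])  # 1 = similar pair, 0 = dissimilar pair

# Euclidean distance within each pair
D = np.sqrt(np.sum((x1 - x2) ** 2, axis=1))

margin = 1.0
# similar pairs are penalized by their squared distance
similar_loss = y * np.square(D)
# dissimilar pairs are penalized only when they are closer than the margin
dissimilar_loss = (1 - y) * np.square(np.maximum(0, margin - D))
# average over all pairs
contrastive_loss = np.mean(similar_loss + dissimilar_loss)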

print("Distances (D):", D)
print("Contrastive Loss:", contrastive_loss)

Distances (D): [0.         1.         4.24264069 1.        ]
Contrastive Loss: 0.25
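
To make the role of the margin easier to see, the sketch below evaluates the same formula per pair for hand-picked distances. The distances and the helper name `contrastive_loss_per_pair` are assumptions chosen for illustration.

```python
import numpy as np

def contrastive_loss_per_pair(D, y, margin=1.0):
    """Per-pair contrastive loss; y = 1 for similar pairs, 0 for dissimilar."""
    similar = y * D ** 2
    dissimilar = (1 - y) * np.maximum(0.0, margin - D) ** 2
    return similar + dissimilar

# hypothetical distances: two close pairs, two far pairs
D = np.array([0.2, 0.2, 1.5, 1.5])
y = np.array([1, 0, 1, 0])

per_pair = contrastive_loss_per_pair(D, y)
print(per_pair)
```

With a margin of 1.0, a similar pair is cheap when close (0.2² = 0.04) and expensive when far (1.5² = 2.25). A dissimilar pair is penalized when it sits inside the margin ((1.0 - 0.2)² = 0.64) and costs nothing once it is farther apart than the margin.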


## Triplet Loss

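\begin{align}
L = \max(0, D(A, P) - D(A, N) + \alpha)
\end{align}

Here $A$ is the anchor, $P$ is a positive example (similar to the anchor), $N$ is a negative example (dissimilar to the anchor), $D(\cdot, \cdot)$ is the distance between two outputs, and $\alpha$ is the margin by which the negative must be farther from the anchor than the positive.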

In [None]:
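# three triplets of 2-D points: row i of each array forms one (anchor, positive, negative) triplet
anchor = np.array([[3, 4], [3, 5], [4, 6]])
positive = np.array([[1, 2], [2, 4], [3, 5]])
negative = np.array([[5, 6], [7, 8], [9, 10]])

# Euclidean distances anchor-positive and anchor-negative
D_ap = np.sqrt(np.sum((anchor - positive) ** 2, axis=1))
D_an = np.sqrt(np.sum((anchor - negative) ** 2, axis=1))

alpha = 0.5  # margin
# loss is zero once the negative is at least alpha farther away than the positive
triplet_loss = np.maximum(0, D_ap - D_an + alpha)
# average over all triplets
mean_triplet_loss = np.mean(triplet_loss)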

print("Anchor-Positive Distances (D_ap):", D_ap)
print("Anchor-Negative Distances (D_an):", D_an)
print("Triplet Loss:", mean_triplet_loss)


Anchor-Positive Distances (D_ap): [2.82842712 1.41421356 1.41421356]
Anchor-Negative Distances (D_an): [2.82842712 5.         6.40312424]
Triplet Loss: 0.16666666666666666
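
The mean hides which triplets actually contribute. As a sketch, the function below returns the loss per triplet for the same data, so the violating triplet can be picked out; the helper name `triplet_loss_per_triplet` and the `active` mask are assumptions for illustration.

```python
import numpy as np

def triplet_loss_per_triplet(anchor, positive, negative, alpha=0.5):
    """Per-triplet loss using Euclidean distances and margin alpha."""
    d_ap = np.linalg.norm(anchor - positive, axis=1)
    d_an = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_ap - d_an + alpha)

# same data as the cell above
anchor = np.array([[3, 4], [3, 5], [4, 6]])
positive = np.array([[1, 2], [2, 4], [3, 5]])
negative = np.array([[5, 6], [7, 8], [9, 10]])

per_triplet = triplet_loss_per_triplet(anchor, positive, negative)
active = per_triplet > 0  # triplets that still violate the margin

print("Per-triplet loss:", per_triplet)
print("Violates margin:", active)
```

Only the first triplet has a nonzero loss: its positive and negative are equally far from the anchor, so the margin of 0.5 is not met. In the other two triplets the negative is already more than 0.5 farther away than the positive.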
