# <font color='#154360'> <center> CUSTOM LOSS FUNCTIONS </center> </font>

## <font color='blue'> Table of Contents </font>

1. [Introduction](#1)
2. [Setup](#2)  
3. [Basic example](#3)
4. [Multi-output model](#4) <br>
    4.1. [Helper Functions](#4.1) <br>
    4.2. [Data](#4.2) <br>
    4.3. [Build, compile, train and evaluate the model](#4.3) <br>
    4.4. [Making predictions](#4.4) <br>
5. [Siamese Network](#5) <br>
    5.1. [Helper Functions](#5.1) <br>
    5.2. [Data](#5.2) <br>
    5.3. [Model](#5.3) <br>
    5.4. [Training](#5.4) <br>
    5.5. [Evaluation](#5.5) <br>
    5.6. [Making predictions](#5.6) <br>
6. [References](#references)


<a name="1"></a>
## <font color='blue'> 1. Introduction </font>

This notebook demonstrates how to implement custom loss functions in TensorFlow using a few simple examples. Custom loss functions let you tailor model optimization to a specific problem when standard losses such as MSE or cross-entropy are not a good fit. We cover both function-based and class-based implementations and show how each one plugs into a Keras model through `model.compile()`.

<a name="2"></a>
## <font color='blue'> 2. Setup </font>

In [3]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

In [4]:
# Set seed for NumPy
np.random.seed(42)

# Set seed for TensorFlow
tf.random.set_seed(42)

<a name="3"></a>
## <font color='blue'> 3. Example 1: Without hyperparameters </font>

In [46]:

# Target relationship: y = 2x + 3 (+ optional noise)

# Generate random X values (100 samples, 1 feature)
X = np.random.rand(100, 1)

# Gaussian noise is generated here, but it is commented out of y below,
# so the targets follow y = 2x + 3 exactly
noise = np.random.normal(0, 0.1, X.shape[0])
y = 2 * X.sum(axis=1) + 3  # + noise

# Check a few values
print(X[:5], y[:5])  # Print first 5 samples

[[0.11730819]
 [0.12518579]
 [0.68556529]
 [0.43030589]
 [0.20052473]] [3.23461638 3.25037158 4.37113057 3.86061179 3.40104945]


In [6]:
# data
# inputs
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)

# labels
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

In [14]:
# model: a single Dense unit is enough because our problem is very simple
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

### Custom Loss

Now let's see how we can use a custom loss. We first define a function that accepts the ground truth labels (`y_true`) and the model predictions (`y_pred`) as parameters, and that computes and returns the loss value in its body.

In [43]:
# custom loss function
def my_huber_loss(y_true, y_pred):
    threshold = 1
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
    return tf.where(is_small_error, small_error_loss, big_error_loss)
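
For reference, the function above follows the usual piecewise definition of the Huber loss. In the notation below (our own, not the notebook's), $y$ is `y_true`, $\hat{y}$ is `y_pred`, and $\delta$ is `threshold`, which is fixed to 1 in the code:

$$
L_\delta(y, \hat{y}) =
\begin{cases}
\tfrac{1}{2}\,(y - \hat{y})^2 & \text{if } |y - \hat{y}| \le \delta \\
\delta\,\bigl(|y - \hat{y}| - \tfrac{1}{2}\,\delta\bigr) & \text{otherwise}
\end{cases}
$$

The two branches take the same value, $\tfrac{1}{2}\delta^2$, at $|y - \hat{y}| = \delta$, so the loss is continuous where it switches from quadratic to linear.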

To use the custom loss, pass the function to the `loss` argument of `model.compile()`.

In [47]:
# training with custom loss function
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss=my_huber_loss)
model.fit(X, y, epochs=500, verbose=0)
print(model.predict([10.0]))

[[23.034878]]


In [48]:
# baseline: same architecture and data, trained with the built-in MSE loss
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mse')
model.fit(X, y, epochs=500, verbose=0)
print(model.predict([10.0]))

[[22.555027]]


In [None]:
# The data follow y = 2x + 3, so the target at x = 10 is 23.
# For this prediction, the Huber-loss model (23.03) is closer than the MSE model (22.56).
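
To make the difference between the two losses concrete, the sketch below evaluates both on a few single error values in plain Python. `huber_scalar` and `half_squared` are hypothetical helpers written for this illustration. They mirror the arithmetic of `my_huber_loss` for one scalar error and are not part of the notebook or of TensorFlow.

```python
def huber_scalar(error, threshold=1.0):
    """Huber loss for one scalar error (illustrative stand-in for my_huber_loss)."""
    abs_error = abs(error)
    if abs_error <= threshold:
        # quadratic branch: small errors are penalized like half the squared error
        return abs_error ** 2 / 2
    # linear branch: large errors grow linearly instead of quadratically
    return threshold * (abs_error - 0.5 * threshold)


def half_squared(error):
    """Half the squared error, i.e. the quadratic branch applied everywhere."""
    return error ** 2 / 2


for e in (0.5, 1.0, 3.0, 10.0):
    print(f"error={e:5.1f}  huber={huber_scalar(e):7.3f}  half_squared={half_squared(e):7.3f}")
```

Up to the threshold the two columns agree. Beyond it the Huber value grows linearly, so a single large error contributes far less to the total loss than it would under a squared-error loss.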

### Class-based

In [21]:
# inputs
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)

# labels
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

In [22]:
# model: a single Dense unit is enough because our problem is very simple
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

In [23]:
class MyHuberLoss(tf.keras.losses.Loss):
    def __init__(self, threshold=1, name="my_huber_loss"):
        super().__init__(name=name)
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - (0.5 * self.threshold))
        return tf.where(is_small_error, small_error_loss, big_error_loss)


In [24]:
# training with the class-based loss (default threshold=1)
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss=MyHuberLoss())
model.fit(xs, ys, epochs=500, verbose=0)
print(model.predict([10.0]))

[[18.453766]]


<a name="4"></a>
## <font color='blue'> 4. Example 2: With hyperparameters </font>

In [36]:
X = np.random.rand(100, 1)
y = np.random.rand(100, 1)

In [37]:
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

A function passed to the `loss` argument of `model.compile()` must accept exactly two parameters: the ground truth (`y_true`) and the model predictions (`y_pred`). To include a hyperparameter that we can tune, we can define a wrapper function that accepts the hyperparameter and returns a two-parameter loss function.

In [38]:
# wrapper function that accepts the hyperparameter
def my_huber_loss_with_threshold(threshold):
  
    # function that accepts the ground truth and predictions
    def my_huber_loss(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
        
        return tf.where(is_small_error, small_error_loss, big_error_loss) 

    # return the inner function tuned by the hyperparameter
    return my_huber_loss
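
The wrapper relies on an ordinary Python closure: the inner function keeps access to `threshold` after the outer call has returned. The sketch below shows the same pattern without TensorFlow. `make_huber` is a hypothetical scalar stand-in for `my_huber_loss_with_threshold`, introduced only for this illustration.

```python
import inspect


def make_huber(threshold):
    """Hypothetical scalar analogue of my_huber_loss_with_threshold."""
    def huber(y_true, y_pred):
        # `threshold` is captured from the enclosing call (a closure)
        error = abs(y_true - y_pred)
        if error <= threshold:
            return error ** 2 / 2
        return threshold * (error - 0.5 * threshold)
    return huber


loss_a = make_huber(0.8)
loss_b = make_huber(1.6)

# Each returned function still has the two-parameter signature a loss needs
print(list(inspect.signature(loss_a).parameters))

# Same inputs, different captured thresholds: an error of 1.0 falls in the
# linear branch for loss_a and in the quadratic branch for loss_b
print(loss_a(0.0, 1.0), loss_b(0.0, 1.0))
```

Each call to the wrapper produces an independent loss function, so several thresholds can coexist without sharing state.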

We can now pass the function returned by the wrapper as the `loss`, which lets us set the `threshold` value.

In [39]:
model.compile(optimizer='sgd', 
              loss=my_huber_loss_with_threshold(threshold=1.2))

model.fit(xs, ys, epochs=500, verbose=0)

print(model.predict([10.0]))

[[18.43621]]


In [40]:
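# vary the threshold
# Note: the same `model` is recompiled on each pass, so training continues
# from the previous weights; X and y are the random arrays defined at the
# start of this section
thresholds = [0.8, 1.2, 1.6]

for t in thresholds:
    model.compile(optimizer='sgd',
                  loss=my_huber_loss_with_threshold(threshold=t))

    model.fit(X, y, epochs=100, verbose=0)

    print(model.predict([10.0]))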



[[16.92289]]
[[13.878433]]
[[11.500145]]


The same hyperparameter can also be exposed through a class-based loss:

In [34]:
from tensorflow.keras.losses import Loss

class MyHuberLoss(Loss):
  
    # initialize instance attributes
    def __init__(self, threshold=1):
        super().__init__()
        self.threshold = threshold

    # compute loss
    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - (0.5 * self.threshold))
        return tf.where(is_small_error, small_error_loss, big_error_loss)

In [35]:
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=1.2))
model.fit(xs, ys, epochs=500, verbose=0)
print(model.predict([10.0]))

[[18.436605]]
