3. Change the loss function. An alternative loss for regression problems is the Huber loss.

***The Huber loss is more appropriate than the L2-norm loss*** when the data contains outliers, because it is less sensitive to them (our example has no outliers, but you will surely come across a dataset with outliers in the future). The L2-norm loss squares every difference, ***so outliers have a lot of influence on the outcome.*** In the Keras version used here, the Huber loss is selected by passing loss='huber_loss' to model.compile.
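To make the outlier argument concrete, here is a minimal NumPy sketch that compares the mean squared error with the mean Huber loss on a handful of made-up residuals, with and without a single outlier. The helper name `huber`, the residual values, and the threshold `delta=1.0` are assumptions chosen for illustration (1.0 is the default threshold of `tf.keras.losses.Huber`); this is not code from the exercise.

```python
import numpy as np

def huber(errors, delta=1.0):
    """Element-wise Huber loss (hypothetical helper for illustration).

    0.5 * e**2                  where |e| <= delta (quadratic zone)
    delta * (|e| - 0.5 * delta) where |e| >  delta (linear zone)
    """
    abs_err = np.abs(errors)
    quadratic = 0.5 * errors ** 2
    linear = delta * (abs_err - 0.5 * delta)
    return np.where(abs_err <= delta, quadratic, linear)

# Made-up residuals (prediction minus target); the last one is an outlier.
clean = np.array([0.5, -0.3, 0.8, -0.6])
with_outlier = np.append(clean, 20.0)

mse_clean = np.mean(clean ** 2)
mse_outlier = np.mean(with_outlier ** 2)
huber_clean = np.mean(huber(clean))
huber_outlier = np.mean(huber(with_outlier))

print("MSE   without / with outlier:", mse_clean, mse_outlier)
print("Huber without / with outlier:", huber_clean, huber_outlier)
```

The single outlier contributes 400 to the sum of squared errors but only 19.5 to the sum of Huber losses, so it dominates the squared-error average far more than the Huber average.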

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

In [2]:
observations = 1000
xs = np.random.uniform(low=-10, high=10, size=(observations,1))
zs = np.random.uniform(-10, 10, (observations,1))

generated_inputs = np.column_stack((xs,zs))
noise = np.random.uniform(-1, 1, (observations,1))

generated_targets = 2*xs - 3*zs + 5 + noise

np.savez('tf_exp03', inputs=generated_inputs, targets=generated_targets)

In [4]:
training_data = np.load('tf_exp03.npz')

In [6]:
input_size = 2
output_size = 1

model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        output_size,
        kernel_initializer=tf.random_uniform_initializer(minval=-0.1, maxval=0.1),
        bias_initializer=tf.random_uniform_initializer(minval=-0.1, maxval=0.1)
    )
])


custom_optimizer = tf.keras.optimizers.SGD(learning_rate=0.00001)

model.compile(optimizer=custom_optimizer, loss='huber_loss')

model.fit(training_data['inputs'], training_data['targets'], epochs=100, verbose=2)

Epoch 1/100
32/32 - 0s - loss: 17.4200 - 393ms/epoch - 12ms/step
Epoch 2/100
32/32 - 0s - loss: 17.4123 - 46ms/epoch - 1ms/step
Epoch 3/100
32/32 - 0s - loss: 17.4048 - 48ms/epoch - 1ms/step
Epoch 4/100
32/32 - 0s - loss: 17.3972 - 52ms/epoch - 2ms/step
Epoch 5/100
32/32 - 0s - loss: 17.3895 - 47ms/epoch - 1ms/step
Epoch 6/100
32/32 - 0s - loss: 17.3820 - 49ms/epoch - 2ms/step
Epoch 7/100
32/32 - 0s - loss: 17.3744 - 46ms/epoch - 1ms/step
Epoch 8/100
32/32 - 0s - loss: 17.3668 - 52ms/epoch - 2ms/step
Epoch 9/100
32/32 - 0s - loss: 17.3593 - 61ms/epoch - 2ms/step
Epoch 10/100
32/32 - 0s - loss: 17.3516 - 49ms/epoch - 2ms/step
Epoch 11/100
32/32 - 0s - loss: 17.3441 - 49ms/epoch - 2ms/step
Epoch 12/100
32/32 - 0s - loss: 17.3364 - 57ms/epoch - 2ms/step
Epoch 13/100
32/32 - 0s - loss: 17.3289 - 48ms/epoch - 2ms/step
Epoch 14/100
32/32 - 0s - loss: 17.3213 - 45ms/epoch - 1ms/step
Epoch 15/100
32/32 - 0s - loss: 17.3138 - 46ms/epoch - 1ms/step
Epoch 16/100
32/32 - 0s - loss: 17.3062 - 45ms/

<keras.src.callbacks.History at 0x78dec84423b0>
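As a rough picture of what minimizing the Huber loss involves for this linear model, here is a NumPy-only sketch of gradient descent on the same kind of synthetic data. It is not what Keras runs internally: it uses full-batch updates instead of the mini-batches of 32 seen in the log above, and the learning rate (0.02), the step count (2000), the threshold `delta=1.0`, the fixed seed, and all variable names are assumptions made for the illustration. The key point is that the derivative of the Huber loss with respect to a prediction is simply the error clipped to [-delta, delta].

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic data with the same structure as the exercise: 2*x - 3*z + 5 + noise
n = 1000
inputs = rng.uniform(-10, 10, size=(n, 2))
noise = rng.uniform(-1, 1, size=(n, 1))
targets = 2 * inputs[:, [0]] - 3 * inputs[:, [1]] + 5 + noise

delta = 1.0   # Huber threshold (assumed; 1.0 is the Keras default)
lr = 0.02     # learning rate (assumed for this sketch)

# Small random starting values, mirroring the initializers in the model above
w = rng.uniform(-0.1, 0.1, size=(2, 1))
b = rng.uniform(-0.1, 0.1, size=(1,))

for _ in range(2000):
    errors = inputs @ w + b - targets           # shape (n, 1)
    # dHuber/dprediction: the error itself in the quadratic zone,
    # +/- delta in the linear zone, i.e. the clipped error.
    grad_out = np.clip(errors, -delta, delta)
    w -= lr * (inputs.T @ grad_out) / n         # average gradient for the weights
    b -= lr * grad_out.mean()                   # average gradient for the bias

print("weights:", w.ravel())
print("bias:   ", b)
```

Because each sample's gradient is capped at delta, a single extreme target cannot pull the weights arbitrarily far in one step, which is the practical meaning of "less sensitive to outliers". On this outlier-free data the fitted weights and bias should land close to 2, -3 and 5.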

Takeaways:
1. Any function that is lower for better results and higher for worse results can serve as a loss function. This includes the Huber loss.
2. Compared with the L2-norm version, almost everything looks identical.
3. The loss values are generally lower than with the L2-norm loss, because the Huber loss is quadratic only for small errors and linear for large ones, so it grows more slowly than the squared error.
4. For our problem, the mean squared loss and the Huber loss work equally well.
***5. Generally, the Huber loss is used when the data contains many outliers.***
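The claim in takeaway 3 can be checked directly from the formula. The following is a short worked comparison, assuming the threshold $\delta = 1$ (the default of `tf.keras.losses.Huber`) and writing $e$ for the difference between prediction and target; the symbol $H_1$ is introduced here only for this comparison.

$$
H_1(e) =
\begin{cases}
\tfrac{1}{2}e^2, & |e| \le 1 \\
|e| - \tfrac{1}{2}, & |e| > 1
\end{cases}
$$

$$
|e| \le 1:\quad \tfrac{1}{2}e^2 \le e^2,
\qquad
|e| > 1:\quad |e| - \tfrac{1}{2} < |e| < e^2 .
$$

So $H_1(e) \le e^2$ for every error, and averaging over the samples gives a mean Huber loss that is never larger than the mean squared error for the same predictions. For other values of $\delta$ the comparison would need to be redone.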