<a href="https://colab.research.google.com/github/Machine-Learning-Tokyo/DL-workshop-series/blob/master/Part%20II%20-%20Learning%20in%20Deep%20Networks/custom_loss_functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Custom loss functions

- mae, mse (predefined)
- mce (mean cubed error)
- mpe (mean power error)
- double_loss
- weighted_double_loss

In this notebook we will see how to define custom loss functions for training a Keras model.
First of all, we run some necessary imports.

In [0]:
import keras
from keras.layers import Lambda
from keras.models import Sequential
from keras.losses import mae, mse
import keras.backend as K
from functools import partial

import numpy as np

Now let's build a very simple model that returns the input unchanged. The simplest way to do this is with a Lambda layer that applies the function $f(x) = x$.

In [0]:
model = Sequential([Lambda(lambda x: x, input_shape=(1,))])

We can test our model by feeding it an array of three numbers and checking that the result is the same array.

In [0]:
x = [0, 1, 2]
x = np.array(x)
y_pred = model.predict_on_batch(x)
print(*y_pred)

Now let's try to see how a loss function would work on this model. First of all we have to compile our model with a specific optimizer and loss function. The choice of the optimizer is not important since we are not going to train the model.

For the loss function let's use the Mean Absolute Error (mae). This loss function takes as input the ground truth values ($y\_true$) and the model's predictions ($y\_pred$) tensors and returns the mean absolute difference between them:
$$ mae = \frac{\sum_{i=1}^N{|y\_true_i - y\_pred_i|}}{N}$$

In [0]:
model.compile('sgd', mae)

In order to get the loss the model needs the $y\_true$ and $y\_pred$ tensors as mentioned before.
- The $y\_pred$ tensor will be the same as the input array x.

- The $y\_true$ tensor has to be defined by us. For simplicity we will define:
$$x = 0$$
$$y = 4$$

In this case we have
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the mean absolute difference is $|0 - 4| = 4$
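As a sanity check that does not depend on Keras, the formula can be reproduced in plain Python. The helper name `mae_reference` and the three-sample example are illustrative assumptions, not part of the notebook. The point is only that the absolute differences are averaged over all $N$ samples.

```python
def mae_reference(y_true, y_pred):
    """Plain-Python mean absolute error (hypothetical helper, not a Keras function)."""
    diffs = [abs(t - q) for t, q in zip(y_true, y_pred)]
    return sum(diffs) / len(diffs)

# The single-sample case from the text: |4 - 0| = 4
print(mae_reference([4], [0]))              # 4.0

# Three samples: (|4 - 0| + |1 - 1| + |0 - 2|) / 3 = 6 / 3 = 2
print(mae_reference([4, 1, 0], [0, 1, 2]))  # 2.0
```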

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

Now let's evaluate the model again, keeping the same arrays but changing the loss function. In this case we will use the Mean Squared Error (mse).

This loss function takes as input the ground truth values ($y\_true$) and the model's predictions ($y\_pred$) tensors and returns the mean squared difference between them:
$$ mse = \frac{\sum_{i=1}^N{(y\_true_i - y\_pred_i)^2}}{N}$$



In [0]:
model.compile('sgd', mse)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the mean squared difference is $(0 - 4)^2 = 16$

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

Now let's try to define our own loss function. The definition of a loss function for a Keras model looks like any Python function definition. However, since $y\_true$ and $y\_pred$ are tensors, inside it we should only use the basic Python arithmetic operators (+, -, *, / etc.) and the keras.backend operators (K.abs, K.pow etc.).

We will try to define the Mean Cubed Error (mce):
$$ mce = \frac{\sum_{i=1}^N{|y\_true_i - y\_pred_i|^3}}{N}$$

In [0]:
def mce(y_true, y_pred):
  return K.mean(K.pow(K.abs(y_true - y_pred), 3))

In [0]:
model.compile('sgd', mce)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the mean cubed difference is $|0 - 4|^3 = 64$

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

This is how one can easily define a new loss function to train a keras model.
Actually it can be even more general.

One can define a loss function for the mean error to any power:

In [0]:
def mpe(y_true, y_pred, p=4):
  return K.mean(K.pow(K.abs(y_true - y_pred), p))

In [0]:
model.compile('sgd', mpe)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the mean difference to the power of 4 is $|0 - 4|^4 = 256$
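The expected values for different powers can be checked with a plain-Python counterpart of the formula. The name `mpe_reference` is a hypothetical helper introduced here for illustration. It mirrors the structure of `mpe` but works on ordinary lists instead of tensors. Note that $p = 1$ reproduces mae and $p = 2$ reproduces mse.

```python
def mpe_reference(y_true, y_pred, p):
    """Plain-Python mean power error (hypothetical helper mirroring mpe)."""
    errors = [abs(t - q) ** p for t, q in zip(y_true, y_pred)]
    return sum(errors) / len(errors)

# Same values as in the text: y_true = 4, y_pred = 0
for p in (1, 2, 3, 4):
    print(p, mpe_reference([4], [0], p))
# 1 4.0
# 2 16.0
# 3 64.0
# 4 256.0
```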

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

However, using mpe() as it is means that the power has to be fixed in the function definition. This is not very convenient, since the loss function may be defined in a different file from the one where the model is compiled. In this case we can use a very useful Python tool: *partial()*.
partial() takes a function as its first argument, followed by one or more of that function's arguments, and returns a new function that behaves like the original one but with the given arguments already filled in. For more details please check [here](https://docs.python.org/2/library/functools.html#functools.partial)
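A minimal sketch of how partial() behaves, using a toy function `power` that is assumed here purely for illustration and has nothing to do with Keras:

```python
from functools import partial

def power(base, exponent):
    """Toy function used only to illustrate partial(); not part of the notebook."""
    return base ** exponent

# Fix exponent=3; the result is a new callable that only needs `base`.
cube = partial(power, exponent=3)

print(cube(4))              # 64
# A keyword passed at call time overrides the value fixed by partial().
print(cube(4, exponent=2))  # 16
# The original function and the fixed keywords remain inspectable.
print(cube.func is power, cube.keywords)  # True {'exponent': 3}
```

The same pattern applies to a loss function: the extra parameter is fixed once, and the resulting callable takes only the two arguments that remain.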

In [0]:
def mpe(y_true, y_pred, p):
  return K.mean(K.pow(K.abs(y_true - y_pred), p))

In [0]:
m5e = partial(mpe, p=5)
model.compile('sgd', m5e)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the mean difference to the power of 5 is $|0 - 4|^5 = 1024$

Notice that the mpe() definition does not have any default value for *p*. This means that it can live in a different file and we can import it like this, for example:

`from my_losses import mpe`

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

Another thing that we can do is define a loss function as a combination of two loss functions. In this case the result can be the average of the two losses.

In [0]:
def double_loss(y_true, y_pred, l1, l2):
  loss_1 = l1(y_true, y_pred)
  loss_2 = l2(y_true, y_pred)
  
  return (loss_1 + loss_2) / 2

In [0]:
m5e = partial(mpe, p=5)
mae_m5e = partial(double_loss, l1=mae, l2=m5e)
model.compile('sgd', mae_m5e)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the average of mae and m5e is
$$\frac{|0 - 4| + |0 - 4|^5 }{2}=514$$
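The composition can be checked outside Keras with plain-Python stand-ins. The names `mpe_reference`, `double_loss_reference`, `mae_ref` and `m5e_ref` are hypothetical helpers assumed for this sketch. They follow the same structure as the notebook's functions but operate on lists rather than tensors.

```python
from functools import partial

def mpe_reference(y_true, y_pred, p):
    """Plain-Python mean power error (hypothetical stand-in for mpe)."""
    errors = [abs(t - q) ** p for t, q in zip(y_true, y_pred)]
    return sum(errors) / len(errors)

def double_loss_reference(y_true, y_pred, l1, l2):
    """Average of two losses, same structure as double_loss."""
    return (l1(y_true, y_pred) + l2(y_true, y_pred)) / 2

mae_ref = partial(mpe_reference, p=1)  # power 1 is the mean absolute error
m5e_ref = partial(mpe_reference, p=5)
mae_m5e_ref = partial(double_loss_reference, l1=mae_ref, l2=m5e_ref)

# (4 + 1024) / 2 = 514
print(mae_m5e_ref([4], [0]))  # 514.0
```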

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

Actually we can go even further and define the weighted average of the two loss functions.

Let's say for example that we want the second loss to have a different weight in the final result than the first one. In this case we just have to multiply the losses by the (normalized) weights.

We can set the default weights to be equal (e.g. 1) and then change only the weight of the loss that we want.

In our example we will give the m5e loss twice the weight of the mae loss.

In [0]:
def weighted_double_loss(y_true, y_pred, l1, l2, w1=1, w2=1):
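  # Normalize the weights so that they sum to 1. The right-hand side is
  # evaluated in full before assignment, so both divisions use the original values.
  w1, w2 = w1 / (w1+w2), w2 / (w1+w2)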
  loss_1 = l1(y_true, y_pred) * w1
  loss_2 = l2(y_true, y_pred) * w2
  
  return loss_1 + loss_2

In [0]:
m5e = partial(mpe, p=5)
mae_m5e = partial(weighted_double_loss, l1=mae, l2=m5e, w2=2)
model.compile('sgd', mae_m5e)

We keep the same values for $y\_true$ and $y\_pred$:
$$y\_pred = 0$$
$$y\_true = 4$$

which means that the weighted average of mae (weight 1) and m5e (weight 2) is
$$|0 - 4|\frac{1}{3} + |0 - 4|^5\frac{2}{3}=684$$
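The normalization step and the resulting value can be verified with ordinary Python numbers. The helper `normalize_weights` is an assumption made for this sketch and is not defined in the notebook. It isolates the one line that turns the raw weights into fractions summing to 1.

```python
def normalize_weights(w1=1, w2=1):
    """Return the two weights rescaled to sum to 1 (hypothetical helper)."""
    # Tuple assignment evaluates the whole right-hand side first, so both
    # divisions see the original w1 and w2. Two separate statements would
    # instead compute w2 from the already-normalized w1.
    w1, w2 = w1 / (w1 + w2), w2 / (w1 + w2)
    return w1, w2

w1, w2 = normalize_weights(w2=2)  # raw weights 1 and 2 become 1/3 and 2/3

mae_value = abs(0 - 4)       # 4
m5e_value = abs(0 - 4) ** 5  # 1024
weighted = mae_value * w1 + m5e_value * w2

# Rounded to hide floating-point noise from the 1/3 and 2/3 factors.
print(round(weighted, 6))  # 684.0
```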

In [0]:
x = [[0]]
x = np.array(x)

y = [[4]]
y = np.array(y)

loss = model.evaluate(x, y, verbose=0)
print(loss)

Now you can define your own configurable loss function to train your model.

In a similar way you can also define a metric function to evaluate the model's performance.

## The end