This notebook shows how to implement, in TensorFlow, the custom loss used in the Google Brain - Ventilator Pressure Prediction competition. The solution is inspired by https://stackoverflow.com/a/66966915.

Have any questions or suggestions? Please comment below.

**<font color='red'>And if you liked this notebook, please upvote it!</font>**

## Import packages

In [None]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow.keras.layers as L
import tensorflow.keras.backend as K

## Read data

Since this notebook only demonstrates a custom loss, a simple `train_test_split` is sufficient here. We use only the first 100,000 rows of the training data file.

In [None]:
train = pd.read_csv('../input/ventilator-pressure-prediction/train.csv', nrows=100_000)
train['R'] = train['R'].astype(str)
train['C'] = train['C'].astype(str)
train = pd.get_dummies(train)

y = train['pressure']
y = y.to_numpy().reshape(-1, 80)

features = [col for col in train.columns if col not in ['id', 'breath_id', 'pressure']]
X = train[features].copy()  # explicit copy, so later column assignments do not trigger SettingWithCopyWarning

In [None]:
X.head(10)

Without scaling, the loss becomes NaN, so let's scale the features.

In [None]:
RS = RobustScaler()
fe2 = [col for col in X.columns if col not in ['u_out']] # no need to scale u_out
X[fe2] = RS.fit_transform(X[fe2])
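
For intuition: with its default settings, `RobustScaler` centers each column on its median and divides by the interquartile range (75th minus 25th percentile). Below is a minimal NumPy sketch of that transformation. The helper name `robust_scale` and the toy values are made up for illustration and are not part of the notebook's pipeline.

```python
import numpy as np

def robust_scale(values):
    """Center each column on its median and divide by its interquartile range.

    Hypothetical helper for illustration only. It mirrors RobustScaler's
    default behaviour but does not handle constant columns (IQR of zero).
    """
    values = np.asarray(values, dtype=np.float64)
    median = np.median(values, axis=0)
    q25, q75 = np.percentile(values, [25, 75], axis=0)
    return (values - median) / (q75 - q25)

# Toy column with one large outlier (made-up numbers).
toy = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
scaled = robust_scale(toy)
print(scaled.ravel())
```

Because the median and the quartiles barely move when a single extreme value is added, the outlier does not compress the remaining values the way min-max scaling would.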

In [None]:
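# A DataFrame has no reshape method: convert to a float NumPy array first,
# then group every 80 consecutive rows into one breath.
X = X.to_numpy(dtype=np.float32).reshape(-1, 80, X.shape[-1])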
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, shuffle=True, random_state=42)
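
The reshape assumes that every breath occupies exactly 80 consecutive rows, so that a row-major reshape groups rows into breaths. A small sketch with made-up sizes (3 time steps per breath instead of 80) illustrates the idea:

```python
import numpy as np

steps_per_breath = 3   # 80 in the competition data; 3 here to keep the toy readable
n_features = 2

# Six rows = two breaths of three time steps each (made-up values).
flat = np.arange(12, dtype=np.float32).reshape(6, n_features)

# Row-major reshape: consecutive rows end up in the same breath.
breaths = flat.reshape(-1, steps_per_breath, n_features)
print(breaths.shape)   # (2, 3, 2)
```

If the rows were shuffled or a breath had a different length, the same call would silently mix time steps from different breaths, so the row order has to be preserved up to this point.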

## Model definition

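The competition metric is the mean absolute error over the inspiratory phase only, i.e. over the time steps where `u_out` is 0. We define the custom loss with `tf.keras.backend` as a function of three arguments: `y_true`, `y_pred` and `input_tensor`. The input tensor is needed because the `u_out` mask is one of the input features.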

In [None]:
# u_out_index must be set to a correct value!
def ventilation_mae_loss(y_true, y_pred, input_tensor, u_out_index=2):
    w = 1 - tf.expand_dims(input_tensor[:, :, u_out_index], axis=2)
    mae = w * K.abs(y_true - y_pred)
    # Don't calculate MAE for w = 0 (u_out = 1) - avoids division by zero
    bool_mask = tf.cast(w, dtype=tf.bool)
    w = tf.boolean_mask(w, bool_mask)
    mae = tf.boolean_mask(mae, bool_mask)
    return K.mean(K.sum(mae, axis=-1) / K.sum(w, axis=-1))
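
As a sanity check, the same quantity can be computed with plain NumPy. The sketch below uses a hypothetical helper `masked_mae` and made-up numbers. It pools all time steps with `u_out == 0` across the batch, which is what the TensorFlow function above does after `tf.boolean_mask` flattens the tensors.

```python
import numpy as np

def masked_mae(y_true, y_pred, u_out):
    """MAE over time steps where u_out == 0 (hypothetical reference helper)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    w = 1.0 - np.asarray(u_out, dtype=np.float64)   # 1 where u_out == 0, else 0
    return np.sum(w * np.abs(y_true - y_pred)) / np.sum(w)

# One breath with four time steps (made-up numbers).
y_true = np.array([[10.0, 12.0, 8.0, 6.0]])
y_pred = np.array([[11.0, 10.0, 0.0, 0.0]])
u_out = np.array([[0, 0, 1, 1]])

print(masked_mae(y_true, y_pred, u_out))  # 1.5
```

Only the first two steps count: their absolute errors are 1 and 2, giving (1 + 2) / 2 = 1.5, while the large errors in the expiratory phase are ignored.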

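A simple model with two bidirectional LSTM layers is defined below, which is sufficient for our purposes. The targets are passed in as a second input so that the loss can be attached with `add_loss`.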

In [None]:
def get_model(input_shape):
    inputs = tf.keras.Input(input_shape)
    x = L.Bidirectional(L.LSTM(128, return_sequences=True))(inputs)
    x = L.Bidirectional(L.LSTM(64, return_sequences=True))(x)
    x = L.Dense(64, activation='selu')(x)
    outs = L.Dense(1)(x)
    targets = tf.keras.Input((80, 1))
    
    model = tf.keras.Model([inputs, targets], outs)
    model.add_loss(ventilation_mae_loss(targets, outs, inputs))
    model.compile(optimizer="adam", loss=None)
    return model

Let's test it!

In [None]:
model = get_model(X_train.shape[-2:])
history = model.fit(x=[X_train, y_train], y=None, 
                  validation_data=([X_valid, y_valid], None), 
                  epochs=3, 
                  batch_size=32)