# Early desaturation alarm system using Deep Learning

## Introduction & purpose

This post presents some promising results of an early alarm model for pulse oximeter data using Deep Learning.

First of all, let's describe the purpose of such an early alarm predictive model: detecting that a desaturation is likely to happen in the next 2 minutes. (A desaturation means a low blood oxygen concentration.) Working on the model is in part a feasibility exercise, although the prediction would be useful in itself, at least as an initial step, for a system that improves the patient's sleep quality (e.g. if the desaturation alarm were raised early enough, the patient's carer could just change the patient's position, assuming a patient who is sleeping and has very limited ability to move).

On the other hand, and importantly for the purposes of this blog, it is a nice practical case of implementing a Deep Learning model. The task at hand is certainly cumbersome, but there are a number of steps to follow that are pretty much common knowledge in the Deep Learning community, and after some effort, quite a lot of CPU time, and a few tricks, the results are quite satisfying.

### Dataset used

The dataset has already been presented in [this previous post](http://sisifospage.tech/2017-05-15-time-series-clustering-pulsi.html).

It consists of data from a pulse oximeter (i.e. a device for measuring blood oxygen concentration and pulse rate). The meter was connected to a sick patient for 39 nights, plus 2 additional nights that were recorded for a healthy patient, as a sanity check.

The capture itself was done using the software in [this repository](https://github.com/Iukekini/Baby-Monitor-Masimo-Pulse-Oximeter).

### References

There are many references for Deep Learning. [This](https://www.business-science.io/timeseries-analysis/2018/04/18/keras-lstm-sunspots-time-series-prediction.html) and [this](https://www.business-science.io/timeseries-analysis/2018/07/01/keras-lstm-sunspots-part2.html) are the two parts of quite a good tutorial on predicting a time series with LSTMs. It may serve as a gentle introduction, also because the problem described in the tutorial is a good example of a time series that can reasonably be predicted with Deep Learning.

There are many other examples around that are good for practising the technique but where the forecasting problem is too difficult, such as predicting a stock market time series. Those exercises usually attempt a prediction for just 1 timestep; although that prediction might be decent enough, it all becomes a lot harder when adding more timesteps (as in a real problem). Anyhow, a really useful example of this kind is [this post](http://rwanjohi.rbind.io/2018/04/05/time-series-forecasting-using-lstm-in-r/), which tries to predict long-term interest rates for the USA.

There have been quite a few attempts to use Deep Learning to predict time series of body signals in similar prediction setups. The implementation in this post is actually a refactor of [this repository](https://github.com/NLeSC/mcfly), called McFly, for Human Activity Recognition; the repository is an implementation of [this paper](https://www.mdpi.com/1424-8220/16/1/115/htm), co-written by [F Ordoñez](https://www.youtube.com/watch?v=7gsIkXpZx9E) (the YouTube talk is in Spanish, sorry).

The refactored code used in this post is available [here](https://github.com/lrnzcig/sisifoDL). It mostly adds a `Data Generator` to [McFly](https://github.com/NLeSC/mcfly), for setting up the data more easily and, importantly, for controlling how data is fed to the model during training.

If the references above are still not enough, you could fall back on a couple of golden references on DL, such as [Chollet's bible](https://www.manning.com/books/deep-learning-with-python) or [Andrew Ng's course](https://www.coursera.org/specializations/deep-learning).

## Summary of results

Only the most relevant results (together with the code to reproduce them) are presented in this post, with the intention of keeping it brief and useful. Follow-up posts will cover other details in more depth, so that in the end, hopefully, the set of articles will be a summary of the steps to build a Deep Learning model for time series prediction.

Assuming that the right architecture and hyperparameters have already been found, each of the following sections feeds the data to the model during training in a different way:
- Using data as it is (naïve approach)
- Rebalancing data, since it is far more interesting to detect big desaturations than spurious variations of the oxygen levels
- Augmenting the data, i.e. producing more data from the already existing data by means of simple variations

Finally, the post includes a brief wrap-up of what has been done, what is missing, and what will be covered in follow-up posts.

### Deep Learning results for the most naïve approach

In this case, data is fed to the model _"as is"_. The data is generated in [this previous post](http://sisifospage.tech/2017-05-15-time-series-clustering-pulsi.html); for the purposes here, however, one can just use [this csv file](http://sisifospage.tech/data/42nights.csv). The `csv` has 4 columns:
* datetime
* bpm: heart rate in beats per minute
* spo2: oxygen saturation level
* name: string identifying each time series, which corresponds to a certain patient on a certain night

Note that the raw data from the pulse oximeter comes out every second (every 2 seconds for other devices); as a pre-processing step, the data in the `csv` has been interpolated to one timestep every 30 seconds.
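The exact pre-processing code is not shown in this post, so the following is only a minimal sketch of how per-second readings could be brought onto a 30-second grid with pandas. The function name `resample_night`, the use of bin averages, and the linear interpolation of empty bins are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def resample_night(df, step=pd.Timedelta(seconds=30)):
    """Resample one night's readings onto a regular grid (hypothetical helper).

    Assumes `df` has a `datetime` column plus numeric `bpm` and `spo2` columns.
    """
    indexed = df.set_index("datetime").sort_index()
    # Average the readings that fall into each bin, then fill empty bins linearly.
    binned = indexed[["bpm", "spo2"]].resample(step).mean()
    return binned.interpolate(method="linear")


# Synthetic data: two minutes of per-second readings with a 30-second gap.
times = pd.date_range("2017-01-19 23:00:00", periods=120, freq=pd.Timedelta(seconds=1))
raw = pd.DataFrame({
    "datetime": times,
    "bpm": np.linspace(80.0, 90.0, 120),
    "spo2": np.full(120, 97.0),
})
raw = pd.concat([raw.iloc[:30], raw.iloc[60:]])  # simulate a sensor dropout
resampled = resample_night(raw)
```

In the real dataset the same step would be applied separately to each value of `name`, so that readings from different nights are never mixed in one bin.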

#### Fitting the model

Let's assume at this point that the right architecture has already been selected somehow. The first step for training the model is importing the libraries and custom utils.

In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from keras.losses import mean_absolute_percentage_error

from utils.generate_models import generate_models, generate_DeepConvLSTM_model
from utils.validate_models import find_best_architecture, evaluate_model, evaluate_plot
from utils.data_generator import DataGenerator
from utils.get_dataset_pulsi import get_dataset_pulsi

Using TensorFlow backend.


Importing and pre-processing the data.

In [2]:
columns = np.array(['bpm', 'spo2'])
dataset_reduced_std, dataset_reduced = get_dataset_pulsi(columns,
                                                         filename='./utils/test_data/42nights.csv')
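The internals of `get_dataset_pulsi` are not shown here. The `_std` suffix suggests that one of the two returned frames is a standardized copy of the other, but that is an assumption. Under that assumption, a z-score standardization of the selected columns could be sketched as follows (the helper `standardize` is hypothetical):

```python
import pandas as pd


def standardize(df, columns):
    """Return a copy of `df` with the given columns rescaled to zero mean, unit std."""
    out = df.copy()
    for col in columns:
        # Sample standard deviation (pandas default, ddof=1).
        out[col] = (df[col] - df[col].mean()) / df[col].std()
    return out


# Small synthetic frame with the same column names as the csv.
demo = pd.DataFrame({
    "bpm": [80.0, 85.0, 90.0, 95.0],
    "spo2": [97.0, 96.0, 93.0, 90.0],
    "name": ["night_a"] * 4,
})
demo_std = standardize(demo, ["bpm", "spo2"])
```

Standardizing puts `bpm` and `spo2` on a comparable scale, which generally helps gradient-based training.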

Setting up some parameters of the model: the target prediction is for 4 timesteps, i.e. 2 minutes, based on the data of the last 12 timesteps, i.e. the last 6 minutes.

In [3]:
window_size = 12
number_of_predictions = 4
target_variable = "spo2"
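To make the role of `window_size` and `number_of_predictions` concrete, here is a minimal sketch of how a single series can be cut into (input window, target) pairs. The helper `make_windows` is hypothetical and is not part of the repository; it only illustrates the 12-in, 4-out layout described above.

```python
import numpy as np


def make_windows(series, window_size=12, number_of_predictions=4):
    """Split a 1-D series into overlapping (input window, target) pairs."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window_size - number_of_predictions + 1
    if n_samples <= 0:
        # Series too short to hold even one window plus its target.
        return np.empty((0, window_size)), np.empty((0, number_of_predictions))
    inputs = np.stack([series[i:i + window_size] for i in range(n_samples)])
    targets = np.stack([
        series[i + window_size:i + window_size + number_of_predictions]
        for i in range(n_samples)
    ])
    return inputs, targets


# 20 timesteps (10 minutes at one timestep every 30 seconds).
spo2 = np.arange(20, dtype=float)
X, y = make_windows(spo2)
```

With 20 timesteps this yields 5 samples: each row of `X` holds 12 consecutive values and the matching row of `y` holds the 4 values that follow.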

Some technical parameters: `batch_size` is the number of samples taken for each training step (where a sample is composed of the window of 12 timesteps plus the prediction of 4 timesteps), and `metric` is the metric used for the optimization.

In [4]:
batch_size = 32
metric = mean_absolute_percentage_error
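For reference, the mean absolute percentage error can be written in a few lines of NumPy. This is a sketch of the usual definition (absolute error divided by the absolute true value, averaged and multiplied by 100), with a small `eps` guarding against division by zero; it is meant as an illustration and not as a copy of the Keras source.

```python
import numpy as np


def mape(y_true, y_pred, eps=1e-7):
    """Mean absolute percentage error over the last axis, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Clip the denominator so that true values of 0 do not divide by zero.
    denom = np.maximum(np.abs(y_true), eps)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / denom, axis=-1)


# One sample with 4 predicted timesteps; two of them are off by 2% and 10%.
example = mape([[100.0, 95.0, 90.0, 80.0]], [[98.0, 95.0, 99.0, 80.0]])
```

Note that a percentage error becomes unstable when the true values are close to zero, which is worth keeping in mind if the target has been rescaled.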

Most of the job is done by the `DataGenerator` instances, which control how the data is passed to the training process. More in-depth details on data generators will be given in follow-up posts; for the moment, it is enough to note that they come in very handy whenever that control is needed. For the generators instantiated below, the data is passed "as is", in batches of `batch_size`.

Some nights are reserved for the training data, others for validation (used during training), and a final test set is reserved for assessing the precision of the model.

In [5]:
train_names = np.array(['p_17-01-19', 'p_17-01-20', 'p_17-01-21', 'p_17-01-22', 'p_17-01-23', 'p_17-01-24', 'p_17-01-25',
                        'p_17-01-26', 'p_17-01-27', 'p_17-01-28', 'p_17-01-29', 'p_17-01-30', 'p_17-01-31', 'p_17-02-01',
                        'p_17-02-02', 'p_17-02-03', 'p_17-02-04', 'p_17-02-05', 'p_17-02-06', 'p_17-02-07', 'p_17-02-08',
                        'p_17-02-09', 'p_17-02-10'])
val_names = np.array(['p_17-02-11', 'p_17-02-12', 'p_17-02-13', 'p_17-02-14', 'p_17-02-15', 'p_17-02-16', 'p_17-02-17', 'p_17-02-18'])
test_names = np.array(['p_17-02-19', 'p_17-02-20', 'p_17-02-21', 'p_17-02-22', 'p_17-02-23', 'p_17-02-24', 'p_17-02-25', 'p_17-04-27'])
train_gen = DataGenerator(dataset_reduced_std, train_names,
                          "spo2", batch_size=batch_size,
                          number_of_predictions=number_of_predictions,
                          window_size=window_size,
                          step_prediction_dates=1,
                          rebalance_data=False, debug=False)
val_gen = DataGenerator(dataset_reduced_std, val_names,
                        "spo2", batch_size=batch_size,
                        number_of_predictions=number_of_predictions,
                        window_size=window_size,
                        step_prediction_dates=1,
                        rebalance_data=False)
test_gen = DataGenerator(dataset_reduced_std, test_names,
                         "spo2", batch_size=batch_size,
                         number_of_predictions=number_of_predictions,
                         window_size=window_size,
                         step_prediction_dates=1,
                         rebalance_data=False)
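The `DataGenerator` class itself lives in the linked repository and is not reproduced here. As a rough illustration of the idea only, the sketch below uses a plain Python generator that builds windows night by night and yields them in batches. The function `batch_generator` and its arguments are hypothetical and do not mirror the real class.

```python
import numpy as np
import pandas as pd


def batch_generator(df, names, target, columns, window_size,
                    number_of_predictions, batch_size):
    """Yield (inputs, targets) batches built only from the selected nights."""
    inputs, targets = [], []
    for name in names:
        night = df[df["name"] == name]
        features = night[columns].to_numpy(dtype=float)
        goal = night[target].to_numpy(dtype=float)
        last_start = len(night) - window_size - number_of_predictions
        # Windows are built per night, so no sample spans two nights.
        for start in range(last_start + 1):
            end = start + window_size
            inputs.append(features[start:end])
            targets.append(goal[end:end + number_of_predictions])
            if len(inputs) == batch_size:
                yield np.stack(inputs), np.stack(targets)
                inputs, targets = [], []
    if inputs:  # final, possibly smaller batch
        yield np.stack(inputs), np.stack(targets)


# Two synthetic nights of 20 timesteps each.
rows = [{"name": name, "bpm": 80.0 + t, "spo2": 97.0 - 0.1 * t}
        for name in ["night_a", "night_b"] for t in range(20)]
demo = pd.DataFrame(rows)
batches = list(batch_generator(demo, ["night_a", "night_b"], "spo2",
                               ["bpm", "spo2"], 12, 4, 4))
```

Each night of 20 timesteps gives 5 samples, so the 10 samples come out as batches of 4, 4 and 2. Keeping the split by night name, as in the cell above, means that training, validation and test samples never share a recording.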

The following step fits the model.

Note that, first, a fixed set of hyperparameter values is defined. Then, the model is instantiated. Finally, the call to `find_best_architecture` trains the model and checks its performance.

All of this would actually be done for different values of the model hyperparameters in order to find the best architecture; once again, the subject will be covered in more depth in follow-up posts.
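As an illustration of what trying different hyperparameter values could look like, here is a minimal random-search sketch that only draws candidate settings. The helper `sample_hyperparameters` and all the sampling ranges are assumptions chosen for the example; they are not the ranges used for the results in this post.

```python
import numpy as np


def sample_hyperparameters(rng, number_of_models):
    """Draw random candidate settings (hypothetical ranges, for illustration)."""
    candidates = []
    for _ in range(number_of_models):
        candidates.append({
            # Rates are sampled on a log scale, a common choice for such values.
            "learning_rate": 10 ** rng.uniform(-4, -2),
            "regularization_rate": 10 ** rng.uniform(-3, -1),
            "filters": [int(rng.integers(10, 100))],
            "lstm_dims": [int(rng.integers(10, 120))],
        })
    return candidates


rng = np.random.default_rng(3)
candidates = sample_hyperparameters(rng, 5)
```

Each dictionary has the same keys as `hyperparameters_losses` in the next cell, so one model could be built per candidate and the candidates compared on the validation data.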

In [6]:
hyperparameters_losses = {}
regularization_rate_losses = 0.0666
hyperparameters_losses['regularization_rate'] = regularization_rate_losses
learning_rate_losses = 0.0006
hyperparameters_losses['learning_rate'] = learning_rate_losses
filters_losses = [78]
hyperparameters_losses['filters'] = filters_losses
lstm_dims_losses = [100]
hyperparameters_losses['lstm_dims'] = lstm_dims_losses

dropout_rnn_losses = 0.74
dropout_cnn_losses = 0.27

nrepochs_losses = 96

dim_length = window_size
dim_channels = 2         # spo2 and bpm
output_dim = number_of_predictions

model = generate_DeepConvLSTM_model(dim_length, dim_channels, output_dim,
                                    filters_losses, lstm_dims_losses, learning_rate_losses,
                                    regularization_rate_losses, dropout=None,
                                    dropout_rnn=dropout_rnn_losses, dropout_cnn=dropout_cnn_losses,
                                    metrics=[mean_absolute_percentage_error])
models_losses = [(model, hyperparameters_losses)]

np.random.seed(3)

best_model_losses, best_params_losses, best_model_metrics, best_params_metrics, debug = \
    find_best_architecture(train_gen, val_gen, test_gen,
                           verbose=False, number_of_models=None, nr_epochs=500, # let early stopping decide
                           early_stopping=True, batch_size=batch_size,
                           models=models_losses, metric=mean_absolute_percentage_error, use_testset=True,
                           debug=False, test_retrain=False, output_all=True)

AttributeError: module 'tensorflow' has no attribute 'get_default_graph'