# Neural networks applied to cardiovascular models

As we saw in chapter 1.1, neural networks can be used as function approximators, and may similarly learn to approximate the behaviour of mathematical models. In this section we provide examples of neural networks applied to cardiovascular models of the systemic circulation and of blood vessel stenoses. In addition to TensorFlow we also use the Keras library, a higher-level package that simplifies many of the tasks that can be performed with TensorFlow.

---

## The Two Element Windkessel Model

Here we will implement the two-element Windkessel model in a neural network. The model is characterized by two parameters, the resistance $R$ and the compliance $C$. When the model represents the systemic circulation, these parameters are often interpreted as the total vascular resistance and the systemic arterial compliance. The equation describing the model is:

\begin{equation*}
\frac{\mathrm{d}p}{\mathrm{d}t} = \frac{Q_{\mathrm{in}}}{C} - \frac{p}{R C}, 
\end{equation*}

where $Q_{\mathrm{in}}$ is the flow entering the compliant volume at a given point in time. The compliant vessels of the systemic circulation are filled and stressed by blood ejected in systole from the heart. During diastole the elastic recoil of the vessel walls smoothes the blood flow in the vessels and keeps the circulation flowing even when the heart is not ejecting blood. This effect is named after an air chamber used to produce smooth water flows for fire fighting in the past. See the Figure below for a simple illustration, comparing the blood vessels to the original Windkessel. 

<img src = "fig/Windkessel_effect.svg" width=600>

[Image source](https://en.wikipedia.org/wiki/Windkessel_effect)

The model computes the blood pressure waveform of the systemic arteries $P_{ao}(t)$ (used interchangeably with $p$) from the imposed inflow $Q_{\mathrm{in}}$ and the specified parameters $R$ and $C$. If we use this model to produce data, we can then train a neural network (NN) to approximate the model.

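The residual of the model equation, $\frac{\mathrm{d}p}{\mathrm{d}t} - \frac{Q_{\mathrm{in}}}{C} + \frac{p}{R C}$, should ideally be equal to zero, and must be minimized to obey the physics of the problem. In general notation, $u$ refers to our variable of interest, which here is $p$.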

Problem 1:
- Tune the hyperparameters (number of nodes, layers and epochs, and the learning rate) as well as the number of training points, and try to recreate the model as accurately as possible. You can also try feeding the network a new data set and see whether it still reproduces the model.

For this problem a Gaussian inflow is prescribed.
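
As an illustration of how one labeled sample could be produced from the model, here is a minimal sketch that integrates the Windkessel equation with the explicit Euler method for a periodic Gaussian inflow. The function names, the parameter values (`PERIOD`, `t_peak`, `sigma`, `stroke_volume`, `R`, `C`, `p0`) and the units are assumptions chosen for illustration; they are not the settings used to create `WK2DataLabeled_train`.

```python
import numpy as np

PERIOD = 0.8  # assumed cardiac period [s]

def gaussian_inflow(t, t_peak=0.15, sigma=0.04, stroke_volume=70.0):
    """Periodic Gaussian inflow [mL/s]; all default values are illustrative."""
    t_cycle = np.mod(t, PERIOD)
    amplitude = stroke_volume / (sigma * np.sqrt(2.0 * np.pi))
    return amplitude * np.exp(-0.5 * ((t_cycle - t_peak) / sigma) ** 2)

def solve_wk2(R, C, n_cycles=10, dt=1e-3, p0=80.0):
    """Integrate dp/dt = Q_in/C - p/(R*C) with the explicit Euler method."""
    n_steps = int(round(n_cycles * PERIOD / dt))
    t = np.arange(n_steps + 1) * dt
    p = np.empty(n_steps + 1)
    p[0] = p0
    for i in range(n_steps):
        dpdt = gaussian_inflow(t[i]) / C - p[i] / (R * C)
        p[i + 1] = p[i] + dt * dpdt
    return t, p

# assumed units: R in mmHg s/mL, C in mL/mmHg, p in mmHg
t, p = solve_wk2(R=1.0, C=1.5)

# extract the labels from the last simulated cycle, after the transient has decayed
last_cycle = t >= t[-1] - PERIOD
P_sys, P_dia = p[last_cycle].max(), p[last_cycle].min()
PP = P_sys - P_dia
print("P_sys = {:.1f}, P_dia = {:.1f}, PP = {:.1f}".format(P_sys, P_dia, PP))
```

Repeating this for many sampled pairs of $R$ and $C$ would give a table of inputs and labels of the same kind as the one loaded below.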

In [None]:
import os
import numpy as np
import keras
from keras import regularizers
from sklearn.model_selection import train_test_split
import tensorflow as tf
import matplotlib.pyplot as plt
#import joblib
from numpy.random import seed
seed(1)
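# tf.compat.v1 is available in recent TensorFlow 1.x releases and in 2.x,
# whereas the top-level tensorflow.set_random_seed only exists in 1.x
tf.compat.v1.set_random_seed(2)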
from plotScripts import plotScatterGeneric, plotBAGeneric

from sklearn import preprocessing
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) # only show errors, suppress warnings

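def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    """Yield (inputs, targets) mini-batches of size batchsize.

    Samples that do not fill a complete last batch are skipped.
    """
    assert inputs.shape[0] == targets.shape[0]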
    if shuffle:
        indices = np.arange(inputs.shape[0])
        np.random.shuffle(indices)
    for start_idx in range(0, inputs.shape[0] - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]

In [None]:
#===================================================
# load data and split the training file into training and test sets
#===================================================
dataNameTrain = "WK2DataLabeled_train" # for training the model
dataNameTest = "WK2DataLabeled_test" # for testing the model

testFracTest = 0.3
colNames = ["C", "R", "P_sys", "P_dia", "PP", "Hyp"]
featureCols = [0, 1] # C and R
labelCol = 4 # PP

data = np.genfromtxt(dataNameTrain, skip_header=1) # skip the header line
x, y = data[:,featureCols].copy(), data[:,labelCol].copy()
train_X, test_X, train_Y, test_Y = train_test_split(x, y, test_size=testFracTest, random_state=42)
#===================================================
# scale inputs
#===================================================
scaler = preprocessing.StandardScaler()
train_X = scaler.fit_transform(train_X) # fit (find mu and std) scaler and transform data
test_X  = scaler.transform(test_X) # transform data based on mu and std from training/learning set
#===================================================
# set parameters for Neural net
#===================================================
afFunc = "relu"
opt = 'adam'
lossFunc= 'mean_squared_error'
max_epochs = 1000
batch_size = 100
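
The scaling step above can be sketched in plain NumPy to show what `fit_transform` and `transform` do: the mean and standard deviation are estimated from the training data only and are then reused for any other data. The arrays `demo_train` and `demo_test` are synthetic stand-ins with made-up values, not the Windkessel data.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic stand-ins for two feature columns; the ranges are made up
demo_train = rng.uniform([0.5, 0.5], [2.5, 1.5], size=(200, 2))
demo_test = rng.uniform([0.5, 0.5], [2.5, 1.5], size=(50, 2))

# "fit": estimate mean and standard deviation per column on the training data
mu = demo_train.mean(axis=0)
std = demo_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# "transform": apply the same training statistics to both sets
demo_train_scaled = (demo_train - mu) / std
demo_test_scaled = (demo_test - mu) / std

print(demo_train_scaled.mean(axis=0), demo_train_scaled.std(axis=0))
```

The scaled training columns have zero mean and unit standard deviation, while the scaled test columns only do so approximately, because the statistics come from the training set.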



In [None]:
#===================================================
# set up Neural net
#===================================================
model = keras.Sequential()
 
model.add(keras.layers.Dense(50, activation=afFunc, input_dim=2))
model.add(keras.layers.Dense(50, activation=afFunc))
model.add(keras.layers.Dense(1, activation='linear'))
model.compile(optimizer=opt,
              loss=lossFunc)

In [None]:
#===================================================
# train model using mini-batches
#===================================================
trainLoss = []
testLoss = []
for n in range(max_epochs):
    for batch in iterate_minibatches(train_X, train_Y, batch_size, shuffle=True):
        x_batch, y_batch = batch 
        
        tmpLoss = model.train_on_batch(x_batch, y_batch)
        
    trainLoss.append(model.evaluate(train_X, train_Y))
    testLoss.append(model.evaluate(test_X, test_Y))

In [None]:
#===================================================
# plot loss and predictions
#===================================================
plt.figure()
plt.plot(trainLoss, "o")
plt.plot(testLoss, "o")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend(["train", "test"])

test_Y_pred = model.predict(test_X).flatten()
plotScatterGeneric(test_Y, test_Y_pred, "PP", "PP_ML")

## Coronary artery stenoses

In this example we'll be working with pressure drops along coronary artery stenoses. A stenosis is a narrowing or partial obstruction in the arteries that feed the heart muscle, as illustrated in the figure below, where normal segments are marked in red, and stenotic segments in grey and red depending on the severity. Such stenoses impede flow to the heart muscle, and may eventually cause myocardial infarction. We'll look at the pressure drop across such stenoses, where predictions are based on an algebraic stenosis model given by:

\begin{equation}
 \Delta P_\mathrm{0D} = a_\mathrm{0D} Q + b_\mathrm{0D} Q^2\,,
 \label{eq:dplinearquadratic}
\end{equation}
with parameters 
\begin{equation}
 a_\mathrm{0D} = \frac{K_v \, \mu}{A_0 \,D_0}\,,\quad \quad b_\mathrm{0D}=\frac{K_t \rho}{2 \, A_0^2}\left(\frac{A_0}{A_{s}} -1\right)^2\,,
 \label{eq:aAndb}
\end{equation}
where $A_0$ and $A_{s}$ refer to the cross-sectional areas of the normal (average of inlet and outlet) and stenotic segments, respectively. Similarly, $D_0$ and $D_{s}$ represent the normal and stenotic diameters. Furthermore, $K_v$ and $K_t$ are empirical coefficients, with $K_v = 32\left(0.83\,L_{s} + 1.64 \,D_{s}\right)\cdot\left(A_0/A_{s}\right)^2/D_0$ and $K_t=1.52$, whereas $L_{s}$ is the length of the stenotic segment ([Seeley and Young 1973](https://www.sciencedirect.com/science/article/pii/0021929076900865)). Thus the pressure drop is a function of the inlet, outlet and minimum radius (or cross-section), the stenosis length and the flow rate. In this dataset we have computed approximately 5000 different pressure drops $\Delta P_\mathrm{0D}$ for realistic inputs, and the goal is to train a neural network to learn the function $\Delta P_\mathrm{0D}$.
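
The two equations above translate directly into a short function. The following is a minimal sketch in SI units: the viscosity `MU`, the density `RHO` and the example geometry are assumed values for illustration, and the function takes diameters as in the equations, whereas the dataset columns store radii.

```python
import numpy as np

MU = 3.5e-3    # dynamic viscosity of blood [Pa s] (assumed value)
RHO = 1050.0   # blood density [kg/m^3] (assumed value)
K_T = 1.52     # empirical coefficient K_t from the text

def pressure_drop_0d(D0, Ds, Ls, Q, mu=MU, rho=RHO):
    """Pressure drop [Pa] across a stenosis from the algebraic 0D model.

    D0, Ds : normal and stenotic diameters [m]
    Ls     : length of the stenotic segment [m]
    Q      : flow rate [m^3/s]
    """
    A0 = np.pi * D0**2 / 4.0
    As = np.pi * Ds**2 / 4.0
    K_v = 32.0 * (0.83 * Ls + 1.64 * Ds) * (A0 / As) ** 2 / D0
    a = K_v * mu / (A0 * D0)                                  # viscous term
    b = K_T * rho / (2.0 * A0**2) * (A0 / As - 1.0) ** 2      # expansion term
    return a * Q + b * Q**2

# hypothetical example: 50 % diameter reduction, 10 mm long, 2 mL/s flow
dp = pressure_drop_0d(D0=3.0e-3, Ds=1.5e-3, Ls=10.0e-3, Q=2.0e-6)
print("Pressure drop: {:.1f} Pa ({:.2f} mmHg)".format(dp, dp / 133.322))
```

Note that the quadratic term vanishes when $A_s = A_0$, so the model reduces to a purely viscous loss for a vessel without narrowing.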

<img src="fig/coronaryStenoses.png" width="400">

### Problem
- Train a neural network with 2 hidden layers of 100 neurons each, using the relu activation function.
- Compare the predictions on the training and testing data. What do you observe?

In [None]:
#===================================================
# load training and testing data
#===================================================
dataNameTrain = "stenosesData_train_mod" # for training the model
dataNameTest = "stenosesData_test_mod" # for testing the model

colNames = ["r0", "rMin", "l", "Q", "DP0D"]

featureCols = [0, 1, 2, 3] #
labelCol =  4

data = np.genfromtxt(dataNameTrain, skip_header=1) # skip the header line
train_X, train_Y = data[:,featureCols].copy(), data[:,labelCol].copy()
data_test = np.genfromtxt(dataNameTest, skip_header=1) # skip the header line
test_X, test_Y = data_test[:,featureCols].copy(), data_test[:,labelCol].copy()
#===================================================
# scale inputs
#===================================================
scaler = preprocessing.StandardScaler()
train_X = scaler.fit_transform(train_X) # fit (find mu and std) scaler and transform data
test_X  = scaler.transform(test_X) # transform data based on mu and std from training/learning set
#===================================================
# set parameters for Neural net
#===================================================
afFunc = "relu"
opt = 'adam'
lossFunc= 'mse'
max_epochs = 1000
batch_size = 100

In [None]:
#===================================================
# set up Neural net
#===================================================
model = keras.Sequential()
 
model.add(keras.layers.Dense(100, activation=afFunc, input_dim=len(featureCols)))
model.add(keras.layers.Dense(100, activation=afFunc))
model.add(keras.layers.Dense(1, activation='linear'))
model.compile(optimizer=opt,
              loss=lossFunc)

In [None]:
#===================================================
# train model using mini-batches
#===================================================
trainLoss = []
testLoss = []
for n in range(max_epochs):
    for batch in iterate_minibatches(train_X, train_Y, batch_size, shuffle=True):
        x_batch, y_batch = batch 
        
        tmpLoss = model.train_on_batch(x_batch, y_batch)
        
    trainLoss.append(model.evaluate(train_X, train_Y))
    testLoss.append(model.evaluate(test_X, test_Y))
    
#===================================================
# Note that keras can perform the for loop and the splitting into mini-batches
# in a "one-liner" like the one below. We'll use this in the examples below
#===================================================
#model.fit(train_X, train_Y, epochs=max_epochs, batch_size=batch_size, verbose=1)

Note that Keras can run the for loop and the splitting into mini-batches in a "one-liner" like the one below. We'll use this in the examples below:
```model.fit(train_X, train_Y, epochs=max_epochs, batch_size=batch_size, verbose=1)```

In [None]:
#===================================================
# plot loss and predictions
#===================================================
plt.figure()
plt.plot(trainLoss, "o")
plt.plot(testLoss, "o")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.ylim([0, min(testLoss) + 1*np.std(testLoss)])
plt.legend(["train", "test"])

test_Y_pred = model.predict(test_X).flatten()
train_Y_pred = model.predict(train_X).flatten()
plotScatterGeneric(train_Y, train_Y_pred, "DP", "DP_ML")#, title="predictions on training set")
plotScatterGeneric(test_Y, test_Y_pred, "DP", "DP_ML")#, title="predictions on testing set")
plotBAGeneric(test_Y, test_Y_pred, "DP", "DP_ML")#, title="predictions on testing set")

print("train loss: {}, test loss: {}".format(model.evaluate(train_X, train_Y),
                                             model.evaluate(test_X, test_Y)))

### Improving generalization
Neural networks are great at learning to fit data, as exemplified above; with enough layers and neurons we are able to fit the training data very well. However, when predictions are made on samples that were not used in training, the networks perform much worse. This is known as __overfitting__ and is a __very important__ concept within machine learning and neural networks!


In this section we will introduce some concepts that can be used to prevent overfitting:
- Reduce the complexity of the network. In general more complex networks are more prone to overfitting
- Monitor the predictions (loss) on data that is not used in the training, by introducing a validation set 
- Introduce regularization
- Introduce dropout
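
The link between model complexity and overfitting can be illustrated without a neural network. The sketch below swaps in polynomial least-squares fits as a stand-in for networks of different complexity: a straight line and a degree-9 polynomial are fitted to a few noisy samples of a linear function. The data, the noise level and the chosen degrees are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Noisy samples of a linear function (synthetic data for illustration)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = sample(12)    # few training points
x_test, y_test = sample(200)     # unseen data from the same distribution

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with coefficients coeffs."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

results = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print("degree {}: train MSE = {:.4f}, test MSE = {:.4f}".format(degree, *results[degree]))
```

The more flexible model always reaches a training error at least as low as the simpler one, since it contains the straight line as a special case. Its error on unseen data is typically larger, because it has also fitted the noise.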

### Methods to prevent overfitting

#### Introduce a validation set 
- A validation set is a fraction of the training set that is not used to estimate gradients and update weights and biases, but is used to test the network after each epoch. If the loss on the actual training data continues to go towards zero while the validation loss increases, this is a sign of overfitting.
- Introducing a validation set requires that some of the training data is set aside, and thus not used for training. It is nevertheless good practice to split the training data into a training set and a validation set.
- One still needs to test the network on unseen data, and we therefore also use a test set; i.e. we have training data that is used to update the weights, validation data that is used to detect when overfitting occurs, and testing data that is used to test on "completely" unseen data.
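
The three-way split described above can be sketched with a shuffled index array. The helper `split_indices` and the fractions are hypothetical choices for illustration; in the examples of this section the test data come from a separate file, and only the training file is split into training and validation sets.

```python
import numpy as np

def split_indices(n_samples, val_frac=0.2, test_frac=0.2, seed=42):
    """Return disjoint index arrays for the training, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)          # shuffle before splitting
    n_test = int(round(test_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]          # the rest is used for training
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Only `train_idx` would be used to update weights, `val_idx` to monitor the loss after each epoch, and `test_idx` for the final evaluation.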

#### Regularization
Overfitting is associated with large weights, and may thus be reduced by constraining large weights. Remember that neural networks are trained by minimizing a loss function, e.g. the mean squared error of the predictions:
\begin{equation} \label{eq:lossFunc}
E(\omega_1, b_1 \ldots \omega_L, b_L) = \frac{1}{N}\sum_{j=1}^{N} ( y_j - \hat{y}_j )^2
\end{equation}

With regularization we penalize large weights by adding a second term, e.g. the squared $L_2$ norm of the weights:
\begin{equation}\label{eq:lossFunc_reg}
E(\omega_1, b_1 \ldots \omega_L, b_L) = \frac{1}{N}\sum_{j=1}^{N} ( y_j - \hat{y}_j )^2 + \frac{\gamma}{N}\sum_{l=1}^{L}  \omega_l ^2 
\end{equation}
The idea is that by penalizing large weights you prevent the network from trying to fit overly complex functions (i.e. from fitting noise). [Check out this post to read more about regularization](https://towardsdatascience.com/how-to-improve-a-neural-network-with-regularization-8a18ecda9fe3)
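
The regularized loss can be evaluated by hand for a toy example. The following is a minimal sketch of the formula above, where the helper `regularized_mse`, the targets, the predictions and the weight matrices are all made-up values for illustration. Note that `regularizers.l2(gamma)` in Keras adds `gamma` times the sum of squared weights without the $1/N$ factor, so the numerical value of $\gamma$ is not directly comparable between the two.

```python
import numpy as np

def regularized_mse(y, y_hat, weights, gamma):
    """MSE plus (gamma / N) times the sum of all squared weights.

    Biases are not penalized, as in the equation in the text.
    """
    n = y.size
    mse = np.mean((y - y_hat) ** 2)
    penalty = gamma / n * sum(np.sum(w ** 2) for w in weights)
    return mse + penalty

y = np.array([1.0, 2.0, 3.0, 4.0])          # targets
y_hat = np.array([1.0, 2.0, 3.0, 2.0])      # predictions
weights = [np.array([[1.0, -2.0], [0.0, 1.0]]),   # weights of a first layer
           np.array([[2.0], [-1.0]])]             # weights of a second layer

loss_plain = regularized_mse(y, y_hat, weights, gamma=0.0)
loss_reg = regularized_mse(y, y_hat, weights, gamma=0.1)
print(loss_plain, loss_reg)
```

With $\gamma = 0$ the loss is the plain mean squared error; any $\gamma > 0$ adds a cost that grows with the squared size of the weights.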

#### Dropout
Dropout is another regularization technique, proposed by [Srivastava et al. 2014](http://jmlr.org/papers/v15/srivastava14a.html). Dropout involves randomly setting a certain fraction of the neuron activations in a layer to zero during training. [The effect is that the network becomes less sensitive to the specific weights of neurons. This in turn results in a network that is capable of better generalization and is less likely to overfit the training data](https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/)
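
What a dropout layer does to the activations can be sketched in a few lines of NumPy. This is a simplified illustration of so-called inverted dropout, in which the remaining activations are scaled by $1/(1 - \mathrm{rate})$ during training so that their expected sum is unchanged; the helper `dropout` is a hypothetical stand-in and not the Keras implementation.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of the activations."""
    if not training or rate == 0.0:
        return activations                    # no dropout when predicting
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones((1000, 25))                       # made-up activations of a 25-neuron layer
a_drop = dropout(a, rate=0.1, rng=rng)

print("fraction dropped:", np.mean(a_drop == 0.0))
print("mean activation:", a_drop.mean())
```

About 10 % of the activations are set to zero and the rest are scaled up, so the mean activation stays close to its original value. When the network is used for prediction, the activations pass through unchanged.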

### Problem
- In the code below we have split the training set into an actual training set and a validation set. Use the same setting as in the case above: 2 hidden layers with 100 neurons each and the relu activation function.
- Reduce the complexity of the network, e.g. use 2 hidden layers with 25 neurons each.
- Add a regularization term to the hidden layers (explore different values of $\gamma$).
- Add dropout to the network.

In [None]:
#===================================================
# load training, validation and testing data
#===================================================

val_frac = 0.2

data = np.genfromtxt(dataNameTrain, skip_header=1)
data_test = np.genfromtxt(dataNameTest, skip_header=1)


x, y = data[:,featureCols].copy(), data[:,labelCol].copy()
train_X, val_X, train_Y, val_Y = train_test_split(x, y, test_size=val_frac, random_state=42)

test_X, test_Y = data_test[:,featureCols].copy(), data_test[:,labelCol].copy()


#===================================================
# scale inputs
#===================================================
scaler = preprocessing.StandardScaler()
train_X = scaler.fit_transform(train_X) # fit (find mu and std) scaler and transform data
test_X  = scaler.transform(test_X) # transform data based on mu and std from training/learning set
val_X  = scaler.transform(val_X) # transform data based on mu and std from training/learning set

# StandardScaler expects 2D input, so turn the targets into column vectors
train_Y = train_Y.reshape(-1, 1)
test_Y = test_Y.reshape(-1, 1)
val_Y = val_Y.reshape(-1, 1)

scaler_y = preprocessing.StandardScaler()
train_Y = scaler_y.fit_transform(train_Y) # fit (find mu and std) scaler and transform data
test_Y = scaler_y.transform(test_Y) # transform data based on mu and std from training/learning set
val_Y = scaler_y.transform(val_Y) # transform data based on mu and std from training/learning set

#===================================================
# set parameters for Neural net
#===================================================
afFunc = "relu"
opt = 'adam'
lossFunc= 'mse'
max_epochs = 1000
batch_size = 100

In [None]:
#===================================================
# set up Neural net with regularization and dropout
#===================================================
dropOutActive = True
dropoutVal = 0.1
gamma = 0.000001 # if gamma is set to zero no regularization is used
model = keras.Sequential()
N_neurons = 25

model.add(keras.layers.Dense(N_neurons, activation=afFunc, input_dim=len(featureCols), 
                             kernel_regularizer=regularizers.l2(gamma)))
if dropOutActive:
    model.add(keras.layers.Dropout(dropoutVal))
model.add(keras.layers.Dense(N_neurons, activation=afFunc, kernel_regularizer=regularizers.l2(gamma)))
if dropOutActive:
    model.add(keras.layers.Dropout(dropoutVal))
model.add(keras.layers.Dense(1, activation='linear'))
model.compile(optimizer=opt,
              loss=lossFunc)

In [None]:
#===================================================
# set output and stop criteria
#===================================================

# save training loss and validation loss to a file
logger = keras.callbacks.CSVLogger('costVsEpochs', append=False, separator=' ')
# save best weights to file (as monitored by the validation loss)
filepath = 'bestWeights.hdf5'
checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, 
                                             mode='min')
callbacks = [logger, checkpoint]
#===================================================
# train model by updating weights based on gradients in the actual training data 
# and monitoring the error in the validation set. Note that the test set is not used at all;
# we'll save that to see how the network performs on data that was not used in updating weights or in the 
# minimization of the loss
#===================================================
model.fit(train_X, train_Y, epochs=max_epochs, batch_size=batch_size,
          verbose=1, validation_data=(val_X, val_Y), shuffle=True, callbacks=callbacks)
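
The "save best only" rule used by the checkpoint callback above can be illustrated in plain Python: the weights are stored only at epochs where the monitored validation loss is lower than at any previous epoch. The helper `checkpoint_epochs` and the loss values are made up for illustration.

```python
def checkpoint_epochs(val_losses):
    """Return the epochs at which a 'save best only' rule would store the weights."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:          # strictly better than anything seen so far
            best = loss
            saved.append(epoch)
    return saved

val_losses = [0.90, 0.50, 0.55, 0.40, 0.45, 0.47]   # made-up validation losses
saved = checkpoint_epochs(val_losses)
print(saved)   # [0, 1, 3]
```

The last saved epoch is the one with the lowest validation loss, so the stored weights correspond to the network before the validation loss started to increase again.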

In [None]:
#===================================================
# plot loss and predictions
#===================================================
model.load_weights('bestWeights.hdf5')
costVsEpochs = np.genfromtxt("costVsEpochs", skip_header=1)
trainLoss = costVsEpochs[:, 1]
valLoss = costVsEpochs[:, 2]
plt.figure()
plt.plot(trainLoss, "o")
plt.plot(valLoss, "o")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.ylim([0, min(valLoss) + 1*np.std(valLoss)])
plt.legend(["train", "validation"])

# keep the predictions as (n, 1) column vectors: scaler_y expects 2D input
test_Y_pred = model.predict(test_X)
train_Y_pred = model.predict(train_X)
val_Y_pred = model.predict(val_X)

# transform back to physical units and flatten for plotting
train_Y_pred = scaler_y.inverse_transform(train_Y_pred).flatten()
test_Y_pred = scaler_y.inverse_transform(test_Y_pred).flatten()
val_Y_pred = scaler_y.inverse_transform(val_Y_pred).flatten()

train_Y_plot = scaler_y.inverse_transform(train_Y).flatten()
test_Y_plot = scaler_y.inverse_transform(test_Y).flatten()

val_Y_plot = scaler_y.inverse_transform(val_Y).flatten()

#train_Y_plot = train_Y.flatten() 
#test_Y_plot = test_Y.flatten() 

#val_Y_plot = val_Y.flatten() 

plt.figure()
plotScatterGeneric(train_Y_plot, train_Y_pred, "DP", "DP_ML")#, title="predictions on training set")
plotScatterGeneric(test_Y_plot, test_Y_pred, "DP", "DP_ML")#, title="predictions on testing set")
plotScatterGeneric(val_Y_plot, val_Y_pred, "DP", "DP_ML")#, title="predictions on validation set")
plotBAGeneric(test_Y_plot, test_Y_pred, "DP", "DP_ML")

print("train loss: {}, val loss: {}, test loss: {}".format(model.evaluate(train_X, train_Y), 
                                                           model.evaluate(val_X, val_Y),
                                                           model.evaluate(test_X, test_Y)))
print("train DP: {}, val DP: {}, test DP: {}".format(np.average(train_Y_plot), 
                                                     np.average(val_Y_plot),
                                                     np.average(test_Y_plot)))

