# Blood Pressure from PPG signal

In this notebook, you should solve the problem in Task 4 (blood pressure estimation from PPG signal) in the first assignment (<a href="http://kovan.ceng.metu.edu.tr/~sinan/DL/HW1.html">HW1</a>) using a CNN architecture that you should construct using the layers and the network you developed in this HW.

The notebook is intentionally composed of only this cell. You can copy-paste any text, data, cell, or code that you developed in this HW or HW1. You can add as many cells as you want. You can create files on the disk as freely as you like.

In [25]:
import random
import numpy as np
from metu.data_utils import load_dataset
from cs231n.regressor_trainer import RegressionTrainer
from cs231n.gradient_check import eval_numerical_gradient
from cs231n.classifiers.convnet import *
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [29]:
# Load the PPG dataset
# If your memory turns out to be insufficient, try loading a subset
def get_data(datafile, training_ratio=0.9, test_ratio=0.06, val_ratio=0.04):
  # Load the PPG training data 
  X, y = load_dataset(datafile, input_size)
  X.dump("X2.dat")
  y.dump("y2.dat")
#   X = np.load("X.dat")
#   y = np.load("y.dat")
  Xy = list(zip(X, y))
  np.random.shuffle(Xy)
  xx, yy = zip(*Xy)
  X = np.array(xx)
  y = np.array(yy)
  # TODO: Split the data into training, validation and test sets
  length=len(y)
  num_training=int(length*training_ratio)
  num_val = int(length*val_ratio)
  num_test = min((length-num_training-num_val), int(length*test_ratio))
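  # NOTE: range(num_training-1) leaves out the sample at index num_training-1,
  # which is why the training set below has 26999 instances instead of 27000.
  mask = range(num_training-1)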
  X_train = X[mask]
  y_train = y[mask]
  mask = range(num_training, num_training+num_test)
  X_test = X[mask]
  y_test = y[mask]
  mask = range(num_training+num_test, num_training+num_test+num_val)
  X_val = X[mask]
  y_val = y[mask]
  
  return X_train, y_train, X_val, y_val, X_test, y_test

datafile = 'metu/dataset/part1.mat' #TODO: PATH to your data file
input_size = 900 # TODO: Size of the input of the network

X_train, y_train, X_val, y_val, X_test, y_test = get_data(datafile)
print "Number of instances in the training set: ", len(X_train)
print "Number of instances in the validation set: ", len(X_val)
print "Number of instances in the testing set: ", len(X_test)

0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
11000
12000
13000
14000
15000
16000
17000
18000
19000
20000
21000
22000
23000
24000
25000
26000
27000
28000
29000
Number of instances in the training set:  26999
Number of instances in the validation set:  1200
Number of instances in the testing set:  1800
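
The split above can also be written with a single shared permutation and plain slices, which keeps `X` and `y` aligned and does not drop a training sample. This is only a minimal sketch: the function name `split_dataset` and the `seed` parameter are assumptions for illustration and are not part of the assignment code.

```python
import numpy as np

def split_dataset(X, y, training_ratio=0.9, test_ratio=0.06, val_ratio=0.04, seed=0):
    """Shuffle X and y together, then slice them into train/val/test parts.

    Hypothetical helper; the ratios mirror the defaults of get_data above.
    """
    n = len(y)
    rng = np.random.RandomState(seed)
    order = rng.permutation(n)      # one permutation for both arrays keeps pairs aligned
    X, y = X[order], y[order]

    num_training = int(n * training_ratio)
    num_val = int(n * val_ratio)
    num_test = min(n - num_training - num_val, int(n * test_ratio))

    test_end = num_training + num_test
    val_end = test_end + num_val

    X_train, y_train = X[:num_training], y[:num_training]   # all num_training samples
    X_test, y_test = X[num_training:test_end], y[num_training:test_end]
    X_val, y_val = X[test_end:val_end], y[test_end:val_end]
    return X_train, y_train, X_val, y_val, X_test, y_test

# Tiny synthetic example: row i of X_demo holds the value i, and so does y_demo[i].
X_demo = np.arange(100).reshape(100, 1)
y_demo = np.arange(100)
X_tr, y_tr, X_va, y_va, X_te, y_te = split_dataset(X_demo, y_demo)
```

Because the three slices do not overlap, no sample can end up in more than one set.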


In [36]:
print X_train.shape, y_train.shape

X_train_r = X_train.reshape((X_train.shape[0], 1, 1, 900))
X_val_r = X_val.reshape((X_val.shape[0], 1, 1, 900))
X_test_r = X_test.reshape((X_test.shape[0], 1, 1, 900))

print X_train_r.shape, X_val_r.shape

(26999, 900) (26999, 2)
(26999, 1, 1, 900) (1200, 1, 1, 900)


In [48]:
# X_train_r = np.tile(X_train_r, (3,1,1))
# X_val_r = np.tile(X_val_r, (3,1,1))

In [137]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=0, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=0.0005, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 348867.508374, train: 6146181.827682, val 7482766.081055, lr 5.000000e-04
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 150156103015894496.000000, train: 316674025805.433777, val 380035747986.194641, lr 5.000000e-04
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 152722989094247168.000000, train: 2150512134.249236, val 2582188295.462914, lr 5.000000e-04
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 152660318698572384.000000, train: 3352156.339781, val 3958315.626402, lr 5.000000e-04
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 152584074477773920.00000

In [138]:
best_model_noaff = best_model

In [141]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=2, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=5e-7, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 347866.515081, train: 6492646.889001, val 7906880.579876, lr 5.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 295953.445039, train: 6597561.882409, val 7746855.142208, lr 5.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 291126.753877, train: 6393147.634619, val 7559643.681655, lr 5.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 288829.707408, train: 5998289.484902, val 7363322.143822, lr 5.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 297423.337838, train: 5702201.298564, val 6905004.276089, lr 5.000000e-07
st

In [142]:
best_model_2aff = best_model

In [143]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=5e-7, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 337760.438702, train: 6451228.764085, val 7906885.881865, lr 5.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 303648.210059, train: 6440251.771727, val 7746892.929233, lr 5.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 318687.653401, train: 6258306.559690, val 7559090.871935, lr 5.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 22091.556653, train: 542530.897793, val 682947.032484, lr 5.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 9948.622473, train: 171452.292061, val 211047.131785, lr 5.000000e-07
starting 

In [144]:
best_model_1aff = best_model

In [145]:
init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
model = best_model_1aff
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=1e-7, batch_size=50, num_epochs=5,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 5: cost 6755.513847, train: 166978.844367, val 187364.554717, lr 1.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 5: cost 7138.282666, train: 147113.317522, val 187018.700665, lr 1.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 5: cost 8173.770006, train: 155950.697468, val 187465.128517, lr 1.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 5: cost 7520.090207, train: 154107.219205, val 188617.645439, lr 1.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 5: cost 9859.643529, train: 156733.244346, val 187266.426053, lr 1.000000e-07
starting iteration  21

In [146]:
best_model_1aff = best_model

In [150]:
np.save("noaff.npy", best_model_noaff)
np.save("1aff.npy", best_model_1aff)
np.save("2aff.npy", best_model_2aff)

In [153]:
test = np.load("noaff.npy")
# print test.item()
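
The cells above store the model with `np.save` and read it back with `np.load`. The round trip can be sketched as follows, assuming the model is a plain dict of parameter arrays (the `params` dict and the temporary path below are made up for illustration). On NumPy 1.16.3 and later, `np.load` needs `allow_pickle=True` to read such a file.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for a trained model: a dict of parameter arrays.
params = {"W1": np.ones((2, 3)), "b1": np.zeros(3)}

path = os.path.join(tempfile.mkdtemp(), "model.npy")
np.save(path, params)                       # the dict is pickled inside a 0-d object array

loaded = np.load(path, allow_pickle=True)   # required for pickled objects on NumPy >= 1.16.3
restored = loaded.item()                    # unwrap the 0-d array back into the dict
```

The `.item()` call is what turns the loaded 0-d object array back into a dict that can be passed on as a model.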

In [6]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64, 128], num_classes=2, num_crp=3, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=5e-7, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 333628.410090, train: 6483035.335136, val 7874264.058415, lr 5.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 338326.899467, train: 6535712.050775, val 7715172.624148, lr 5.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 312457.176230, train: 6297834.319674, val 7527581.372066, lr 5.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 8795.765592, train: 711662.276686, val 855989.036904, lr 5.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 7089.235861, train: 153329.554815, val 188876.464680, lr 5.000000e-07
starting i

KeyboardInterrupt: 

In [11]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64, 128], num_classes=2, num_crp=2, num_affine_layer=2, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=1e-7, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 348920.344617, train: 6574767.135864, val 7874572.668643, lr 1.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 340695.294096, train: 6543988.568163, val 7842261.811464, lr 1.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 340790.913687, train: 6440832.935990, val 7804021.804057, lr 1.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 323066.890675, train: 6657876.425066, val 7766031.867662, lr 1.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 342104.639209, train: 6521025.748586, val 7728241.467558, lr 1.000000e-07
st

In [16]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64, 128], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
model = best_model
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=1e-8, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 6973.676529, train: 156219.499213, val 188333.531819, lr 1.000000e-08
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 9571.645837, train: 144511.552641, val 187930.990576, lr 1.000000e-08
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 6077.078359, train: 153193.752097, val 187868.421704, lr 1.000000e-08
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 6897.303253, train: 157543.577447, val 187869.002851, lr 1.000000e-08
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 6344.703928, train: 156660.137155, val 187965.512662, lr 1.000000e-08
starting iteration  21

In [24]:
model = np.load("1aff.npy").item()
init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
tmp = np.mean(np.abs(my_convnet_imp(X_val_r, model) - y_val), axis=1)
# print tmp
print "Average distance: ", np.sum(tmp)/tmp.shape[0]

Average distance:  13.570844990320055
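
The "average distance" printed above is the mean absolute error, averaged first over the two outputs of each sample and then over all samples. A small worked example with made-up numbers (the function name `mean_abs_error` and the values are assumptions, not results from the dataset):

```python
import numpy as np

def mean_abs_error(pred, target):
    """Mean absolute error, computed the same way as in the cell above."""
    per_sample = np.mean(np.abs(pred - target), axis=1)   # average over the two outputs
    return np.sum(per_sample) / per_sample.shape[0]       # average over the samples

# Made-up predictions and targets with two output values per sample.
pred = np.array([[120.0, 80.0], [110.0, 70.0]])
target = np.array([[125.0, 78.0], [100.0, 75.0]])

# Per-sample errors are (5 + 2) / 2 = 3.5 and (10 + 5) / 2 = 7.5, so the result is 5.5.
score = mean_abs_error(pred, target)
```

Since every sample has the same number of outputs, this equals `np.mean(np.abs(pred - target))` over all entries.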


I achieved results similar to those of my first HW, with what I believe is not a good implementation of regression for CNNs. I also noticed that the outputs are very similar to each other. The reason is a bias that I mistakenly introduced in the data-loading part: I load only a small amount of the data, load it sequentially, and only then shuffle it. A large portion of the loaded data is therefore very similar, since the samples within one matrix of the dataset are time-series data that follow one another. So what this network finds is essentially the trend of the data points.

While working on this part, I also had to change my_convnet_implementation. I did not have time to refactor the code because I had very little time left when implementing it, even though I did not start too late. Because of my limited processing power, and because I tried many architectures and hyper-parameters in the previous tasks, not much time remained for this task.

In [32]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=5e-7, batch_size=50, num_epochs=2,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 2: cost 324612.435857, train: 6499783.247937, val 7797428.251212, lr 5.000000e-07
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 2: cost 317167.193486, train: 6367059.872757, val 7635332.046786, lr 5.000000e-07
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 2: cost 311347.736095, train: 6208394.238201, val 7445519.471541, lr 5.000000e-07
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 2: cost 107822.880876, train: 451627.508766, val 542371.341885, lr 5.000000e-07
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 2: cost 122.297084, train: 3645.812216, val 4314.497306, lr 5.000000e-07
starting iter

In [39]:
model = init_my_convnet(filter_size=5, num_filters=[32, 64], num_classes=2, num_crp=2, num_affine_layer=1, num_crcrp=0,
                       input_shape=X_train_r.shape[1:], loss="mse")
model = best_model
trainer = RegressionTrainer()
best_model, loss_history, train_acc_history, val_acc_history = trainer.train(
          X_train_r, y_train, X_val_r, y_val, model, my_convnet_imp,
          reg=0.001, momentum=0.9, learning_rate=5e-9, batch_size=50, num_epochs=5,
          acc_frequency=50, verbose=True)

starting iteration  0
Finished epoch 0 / 5: cost 31.828135, train: 1229.111039, val 1503.233751, lr 5.000000e-09
starting iteration  10
starting iteration  20
starting iteration  30
starting iteration  40
starting iteration  50
Finished epoch 0 / 5: cost 54.816400, train: 1286.951597, val 1501.331774, lr 5.000000e-09
starting iteration  60
starting iteration  70
starting iteration  80
starting iteration  90
starting iteration  100
Finished epoch 0 / 5: cost 55.971776, train: 1035.640778, val 1501.982243, lr 5.000000e-09
starting iteration  110
starting iteration  120
starting iteration  130
starting iteration  140
starting iteration  150
Finished epoch 0 / 5: cost 50.744793, train: 1404.831411, val 1501.361529, lr 5.000000e-09
starting iteration  160
starting iteration  170
starting iteration  180
starting iteration  190
starting iteration  200
Finished epoch 0 / 5: cost 61.882992, train: 1203.537690, val 1501.035714, lr 5.000000e-09
starting iteration  210
starting iteration  220
star

In [43]:
tmp = np.mean(np.abs(my_convnet_imp(X_val_r, best_model) - y_val), axis=1)
print "Average distance: ", np.sum(tmp)/tmp.shape[0]

tmp = np.mean(np.abs(my_convnet_imp(X_test_r, best_model) - y_test), axis=1)
print "Average distance: ", np.sum(tmp)/tmp.shape[0]

144.1389517078341 62.91120630284322
Average distance:  1.052381957799357


I spent some more time on this and randomized the data selection part. As we can see, the scores are now closer to each other and the accuracy is higher.
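
One way such a randomized selection could look is to draw the subset indices from the whole recording instead of taking the first rows. This is a sketch only: `choose_subset`, its parameters, and the sizes below are hypothetical and do not reproduce the actual change made to the loader.

```python
import numpy as np

def choose_subset(num_available, num_wanted, seed=0):
    """Pick num_wanted distinct sample indices spread over the whole dataset."""
    rng = np.random.RandomState(seed)
    idx = rng.choice(num_available, size=num_wanted, replace=False)
    return np.sort(idx)   # sorted indices are convenient when reading data in order

# Made-up sizes: 10000 consecutive windows are available, 100 are kept.
sequential_idx = np.arange(100)          # first 100 windows: all neighbours in time
random_idx = choose_subset(10000, 100)   # 100 windows drawn from the full range
```

The sequential choice only ever sees the start of the recording, while the random choice covers windows from the entire time range, so neighbouring, nearly identical segments no longer dominate the subset.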