* This notebook demonstrates how to use the Bayesian Optimization package from GitHub <https://github.com/fmfn/BayesianOptimization> locally to tune hyperparameters for our models (RNN only) that predict heart failure onset risk on Cerner sample data
* For this demonstration, the data is the original 1 hospital (h143) previously used by retain, with 42,729 patients in total
* The hyperparameters to be tuned are: learning rate, L2 regularization, optimizer eps, dropout rate, embedding dimension, hidden dimension, number of layers, and the optimizer itself
* To implement this, first install the package. Note that we modified the package files slightly so that errors are bypassed and the search keeps iterating. The modified files can be found at Experiments/modifiedBO
* Then, **important**: define a function (in our case model_tune()) that takes the hyperparameters as arguments: l2_exp, lr_exp and eps_exp on a natural-log scale, the embedding and hidden dimensions (embdim_exp, hid_exp) on a log2 scale, layers_n, dropout, and the categorical codes ct_code, dlm_code and opt_code. The function runs the model using the models, Loaddata, and TrainVaTe modules and returns the best validation AUC. The categorical parameters are ct_code (cell type: RNN, LSTM or GRU), dlm_code (model: RNN or DRNN) and opt_code (one of 7 optimizers). We iterate over them in a 3-level loop, so each BO run is given one fixed combination of model, cell type and optimizer and searches the remaining parameters for the best validation AUC
* Be aware that this BO package searches over float parameters, so if you want to tune int or categorical parameters, transform the proposed values inside your function before passing them to your model (as we do here)
* Then, **important**: create the BayesianOptimization object, passing your model_tune() and a search range for each parameter (for example, (-16, 1) means from -16 to 1, inclusive). Optionally give it points to explore first (points you expect to yield large target values), then call maximize() with the number of BO iterations you want to run
* You will then get results for your designated exploration points (if any), 5 initialization points, and the requested number of BO iterations
* Our results: tuning is time-consuming for the RNN model, but it improved our best validation AUC **from 0.67414** to currently **0.72648**

 Bayesian Optimization
--------------------------------------------------------------------------------------------------------------------------------
 Step |   Time |      Value |   ct_code |   dlm_code |   dropout |   embdim_exp |   eps_exp |   hid_exp |    l2_exp |   layers_n |    lr_exp |   opt_code | 
   77 | 50m47s |    0.72648 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -5.6805 |     1.0000 |   -6.3180 |     1.0000 | 
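
The float-to-integer and float-to-category transformations described in the notes above can be sketched in isolation. The helper name `decode_params` and the two lookup tables are illustrative assumptions rather than part of the package; the mappings mirror the ones used in model_tune() in this notebook.

```python
import math

# Hypothetical lookup tables mirroring the codes used in this notebook.
CELL_TYPES = {1: 'RNN', 2: 'LSTM', 3: 'GRU'}
OPTIMIZERS = {1: 'Adadelta', 2: 'Adagrad', 3: 'Adam', 4: 'Adamax',
              5: 'RMSprop', 6: 'ASGD', 7: 'SGD'}


def decode_params(ct_code, opt_code, embdim_exp, hid_exp, layers_n,
                  dropout, l2_exp, lr_exp, eps_exp):
    """Turn the floats proposed by the optimizer into usable settings."""
    return {
        'cell_type': CELL_TYPES[int(ct_code)],
        'optimizer': OPTIMIZERS[int(opt_code)],
        'embed_dim': 2 ** int(embdim_exp),    # log2 scale, exponent truncated
        'hidden_size': 2 ** int(hid_exp),     # log2 scale, exponent truncated
        'n_layers': int(layers_n),
        'dropout': round(dropout, 4),
        'l2': math.exp(l2_exp),               # natural-log scale
        'lr': math.exp(lr_exp),
        'eps': math.exp(eps_exp),
    }


# Values taken from the best row reported above (step 77).
best = decode_params(ct_code=1.0, opt_code=1.0, embdim_exp=9.0, hid_exp=9.0,
                     layers_n=1.0, dropout=0.0, l2_exp=-5.6805,
                     lr_exp=-6.3180, eps_exp=-5.0)
print(best['cell_type'], best['optimizer'], best['embed_dim'], best['n_layers'])
```

Because int() truncates, a proposed embdim_exp of 8.9 still yields 2**8 = 256; only the upper bound 9.0 gives 512. On this scale, lr_exp = -6.3180 corresponds to a learning rate of roughly 0.0018.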

In [1]:
from __future__ import print_function
from __future__ import division

#from sklearn.datasets import make_classification
#from sklearn.cross_validation import cross_val_score
import string
import re
import random

import os
import sys
import argparse
import time
import math

import torch
import torch.nn as nn
from torch.autograd import Variable
from torch import optim
import torch.nn.functional as F
from torchviz import make_dot, make_dot_from_trace

from sklearn.metrics import roc_auc_score  
from sklearn.metrics import roc_curve 

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

try:
    import cPickle as pickle
except ImportError:  # Python 3 has no cPickle module
    import pickle


from bayes_opt import BayesianOptimization

In [2]:
torch.backends.cudnn.enabled=False

In [3]:
torch.cuda.set_device(0)

In [4]:
import model_RNN as model #this changed
import Loaddata_final as Loaddata
import TrVaTe as TVT #This changed 

# check GPU availability
use_cuda = torch.cuda.is_available()
use_cuda

True

In [5]:
# Load data set and target values
with open('Data/h143.visits', 'rb') as f:
    set_x = pickle.load(f, encoding='bytes')
with open('Data/h143.labels', 'rb') as f:
    set_y = pickle.load(f, encoding='bytes')

"""
model_x = []
for patient in set_x:
    model_x.append([each for visit in patient for each in visit])  
    
"""
model_x = set_x  #this is for the rest of the models
    
merged_set= [[set_y[i],model_x[i]] for i in range(len(set_y))] #list of list or list of lists of lists
print("\nLoading and preparing data...")    
train1, valid1, test1 = Loaddata.load_data(merged_set)
print("\nSample data after split:")  
print(train1[0])
print("model is", 'RNN') #can change afterwards, currently on most basic RNN


Loading and preparing data...

Sample data after split:
[0, [[1667, 144, 62, 85], [1667, 144, 62, 85]]]
model is RNN
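
Each record in the split has the form [label, visits], where visits is a list of per-visit code lists, as the sample above shows. The commented-out block in the cell suggests that some models expect the visits flattened into a single code sequence. A minimal sketch of that step on the sample record, using a hypothetical helper name `flatten_visits`:

```python
def flatten_visits(patient_visits):
    """Concatenate per-visit code lists into one flat sequence of codes."""
    return [code for visit in patient_visits for code in visit]


# The sample record printed above: label 0, two visits with four codes each.
label, visits = [0, [[1667, 144, 62, 85], [1667, 144, 62, 85]]]
flat = flatten_visits(visits)
print(label, len(visits), flat)
```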


In [6]:
def print2file(buf, outFile):
    # Append one line to the log; the context manager closes the file even on error
    with open(outFile, 'a') as outfd:
        outfd.write(buf + '\n')

logFile='testRNN_JustRNN.log'
header = 'Model|EmbSize|CellType|n_Layers|Hidden|Dropout|Optimizer|LR|L2|EPs|BestValidAUC|TestAUC|atEpoch'
print2file(header, logFile)
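
Since each run appends one pipe-delimited row under this header, the log can later be read back to rank configurations. The sketch below assumes rows follow the header's column order; `parse_log_rows` is a hypothetical helper, and the sample rows are placeholders for illustration, not real results. It skips repeated headers (the header is appended every time the cell runs) and any line that is not a 13-field row.

```python
HEADER = ('Model|EmbSize|CellType|n_Layers|Hidden|Dropout|Optimizer|'
          'LR|L2|EPs|BestValidAUC|TestAUC|atEpoch')


def parse_log_rows(lines):
    """Yield one dict per result row, skipping headers and other lines."""
    columns = HEADER.split('|')
    for line in lines:
        fields = [f.strip() for f in line.strip().split('|')]
        if len(fields) != len(columns) or fields == columns:
            continue  # repeated header or a non-tabular line
        yield dict(zip(columns, fields))


# Placeholder rows for illustration only; the AUC values are not real results.
sample = [
    HEADER,
    'RNN|256|RNN|1|256|0.1|Adadelta|0.05|0.05|0.002|0.600000 |0.590000 |7 ',
    'RNN|512|RNN|1|512|0.0|Adadelta|0.002|0.003|0.007|0.700000 |0.690000 |11 ',
    'a summary line without the expected columns',
]
rows = list(parse_log_rows(sample))
best = max(rows, key=lambda r: float(r['BestValidAUC']))
print(best['EmbSize'], best['BestValidAUC'])
```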

In [7]:
epochs = 100

In [8]:
def model_tune(dlm_code, ct_code, opt_code, embdim_exp, hid_exp, layers_n, dropout, l2_exp , lr_exp, eps_exp):
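    # Transform the float values proposed by BO into usable hyperparameters
    embed_dim = 2** int(embdim_exp)    # log2 scale, exponent truncated to an int
    hidden_size = 2** int(hid_exp)     # log2 scale, exponent truncated to an int
    n_layers = int(layers_n)
    dropout = round(dropout, 4)
    l2 = np.exp(l2_exp)                # natural-log scale
    lr = np.exp(lr_exp)
    eps = np.exp(eps_exp)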

        
    # Decode the categorical codes
    if int(dlm_code) < 3:
        if int(ct_code) == 1:
            cell_type = 'RNN'
        elif int(ct_code) == 2:
            cell_type = 'LSTM'
        elif int(ct_code) == 3:
            cell_type = 'GRU'
      
    if int(dlm_code)==1:
        w_model='RNN'
        ehr_model = model.EHR_RNN(20000, embed_dim, hidden_size, n_layers, dropout, cell_type)
    elif int(dlm_code)==2:
        w_model='DRNN'
        ehr_model = model.DRNN(20000, embed_dim, hidden_size, n_layers, dropout, cell_type)

    if use_cuda:
        ehr_model = ehr_model.cuda(0)    
        
        
    if int(opt_code) ==1:
        opt= 'Adadelta'
        optimizer = optim.Adadelta(ehr_model.parameters(), lr=lr, weight_decay=l2 ,eps=eps) ## rho=0.9
    elif int(opt_code) ==2:
        opt= 'Adagrad'
        optimizer = optim.Adagrad(ehr_model.parameters(), lr=lr, weight_decay=l2) ##lr_decay no eps
    elif int(opt_code) ==3:
        opt= 'Adam'
        optimizer = optim.Adam(ehr_model.parameters(), lr=lr, weight_decay=l2,eps=eps ) ## Beta defaults (0.9, 0.999), amsgrad (false)
    elif int(opt_code) ==4:
        opt= 'Adamax'
        optimizer = optim.Adamax(ehr_model.parameters(), lr=lr, weight_decay=l2 ,eps=eps) ### Beta defaults (0.9, 0.999)
    elif int(opt_code) ==5:
        opt= 'RMSprop'
        optimizer = optim.RMSprop(ehr_model.parameters(), lr=lr, weight_decay=l2 ,eps=eps)                
    elif int(opt_code) ==6:
        opt= 'ASGD'
        optimizer = optim.ASGD(ehr_model.parameters(), lr=lr, weight_decay=l2 ) ### other parameters
    elif int(opt_code) ==7:
        opt= 'SGD'
        optimizer = optim.SGD(ehr_model.parameters(), lr=lr, weight_decay=l2 ) ### other parameters
  
    
    import copy  # local import; used to snapshot the best model below

    bestValidAuc = 0.0
    bestTestAuc = 0.0
    bestValidEpoch = 0
    best_model = ehr_model  # fallback in case validation AUC never exceeds 0
  
    for ep in range(epochs):
        current_loss, train_loss = TVT.train(train1, model= ehr_model, optimizer = optimizer, batch_size = 200)
        avg_loss = np.mean(train_loss)
        valid_auc, y_real, y_hat  = TVT.calculate_auc(model = ehr_model, data = valid1, which_model = w_model, batch_size = 200)
        if valid_auc > bestValidAuc: 
            bestValidAuc = valid_auc
            bestValidEpoch = ep
            # Deep-copy the model: a plain assignment only aliases ehr_model,
            # so later epochs would overwrite the "best" weights
            best_model = copy.deepcopy(ehr_model)
            #bestTestAuc, y_real, y_hat = TVT.calculate_auc(model = ehr_model, data = test1, which_model = w_model, batch_size = 200)

        if ep - bestValidEpoch > 12:
            break
      
  
     
    fname= w_model+'E'+str(embed_dim)+cell_type+'L'+str(n_layers)+'H'+str(hidden_size)+'D'+str(dropout)+opt+'L'+str(lr)+'P'+str(l2)  
    bmodel_pth='models/'+fname
    bestTestAuc, y_real, y_hat = TVT.calculate_auc(model = best_model, data = test1, which_model = w_model, batch_size = 200)
    torch.save(best_model, bmodel_pth)
    buf = '|%f |%f |%d ' % (bestValidAuc, bestTestAuc, bestValidEpoch )
    
    pFile= w_model+'|'+str(embed_dim)+'|'+cell_type+'|'+str(n_layers)+'|'+str(hidden_size)+'|'+str(dropout)+'|'+opt+'|'+str(lr)+'|'+str(l2)+'|'+str(eps)+ buf  
    print2file(pFile, logFile)
    
    return bestValidAuc
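
The training loop above stops once validation AUC has not improved for more than 12 epochs. The same patience rule can be sketched without PyTorch. Here `run_with_patience` and the stand-in `evaluate` callable are hypothetical names, and the score list is made up purely to exercise the logic.

```python
import copy


def run_with_patience(evaluate, state, max_epochs=100, patience=12):
    """Track the best score and stop after `patience` epochs without gain.

    `evaluate(state, epoch)` stands in for one epoch of training plus
    validation; it may mutate `state` and must return a score.
    """
    best_score, best_epoch, best_state = 0.0, 0, None
    for epoch in range(max_epochs):
        score = evaluate(state, epoch)
        if score > best_score:
            best_score, best_epoch = score, epoch
            # Copy the state: a plain assignment would alias the object
            # that keeps changing in later epochs.
            best_state = copy.deepcopy(state)
        if epoch - best_epoch > patience:
            break
    return best_score, best_epoch, best_state


# Toy usage: the score peaks at epoch 2 and never improves again.
scores = [0.60, 0.65, 0.70] + [0.68] * 50


def fake_epoch(state, epoch):
    state['epochs_trained'] = epoch + 1  # mimic in-place weight updates
    return scores[epoch]


result = run_with_patience(fake_epoch, {'epochs_trained': 0})
print(result)
```

The snapshot matters whenever the tracked object is updated in place, as model weights are during training: without the copy, the returned state would reflect the last epoch run rather than the best one.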

In [9]:
# Defined unconditionally so the loop below never hits an undefined name
gp_params = {"alpha": 1e-4}

# Loop through the categorical combinations; each combination gets its own BO run
for cti in range(1,4): 
    for dlmi in range(1,2): #just the RNN, no DRNN
        for opti in range(1,8):
            print('\n Now Tuning model with Bayesian Optimization: ','cell code', str(cti),'model code', str(dlmi),'optimizer code',str(opti))
            NNBO = BayesianOptimization(model_tune,
                                        {'dlm_code':(dlmi,dlmi), 'ct_code': (cti, cti), 'opt_code':(opti, opti),
                                         'embdim_exp': (5, 9),'hid_exp': (5, 9),'layers_n': (1, 3),'dropout': (0, 1),
                                         'l2_exp': (-6, -1), 'lr_exp': (-7, -2), 'eps_exp': (-9, -5)})
            NNBO.explore({'dlm_code':[dlmi], 'ct_code': [cti], 'opt_code':[opti],'embdim_exp': [8],
                          'hid_exp': [8],'layers_n': [1],'dropout': [0.1],'l2_exp': [-3], 'lr_exp': [-3], 'eps_exp':[-6]})

            NNBO.maximize(n_iter=10000, **gp_params)

            print('-' * 53)
            print('Final Results')
            print('RNN / DRNN: %f' % NNBO.res['max']['max_val'])

            print2file(str(NNBO.res['max']), logFile)


 Now Tuning model with Bayesian Optimization:  cell code 1 model code 1 optimizer code 1
Initialization
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Step |   Time |      Value |   ct_code |   dlm_code |   dropout |   embdim_exp |   eps_exp |   hid_exp |    l2_exp |   layers_n |    lr_exp |   opt_code | 
    1 | 13m45s |    0.67414 |    1.0000 |     1.0000 |    0.1000 |       8.0000 |   -6.0000 |    8.0000 |   -3.0000 |     1.0000 |   -3.0000 |     1.0000 | 
    2 | 42m27s |    0.61278 |    1.0000 |     1.0000 |    0.0001 |       5.2480 |   -7.0351 |    6.1856 |   -2.8426 |     1.7153 |   -6.8576 |     1.0000 | 
    3 | 15m18s |    0.67055 |    1.0000 |     1.0000 |    0.5291 |       7.8718 |   -8.2068 |    5.8928 |   -3.4709 |     1.9797 |   -3.3319 |     1.00

  " state: %s" % convergence_dict)


   43 | 88m36s |    0.71793 |    1.0000 |     1.0000 |    1.0000 |       9.0000 |   -5.0000 |    9.0000 |   -3.7994 |     1.0000 |   -5.9334 |     1.0000 | 
   44 | 68m37s |    0.71991 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     3.0000 |   -5.1140 |     1.0000 | 
   45 | 70m21s |    0.70802 |    1.0000 |     1.0000 |    0.0000 |       7.1202 |   -9.0000 |    9.0000 |   -4.0730 |     1.0000 |   -4.5210 |     1.0000 | 
   46 | 48m06s |    0.69639 |    1.0000 |     1.0000 |    0.0000 |       7.3194 |   -7.7014 |    5.0000 |   -4.2857 |     1.0000 |   -4.2394 |     1.0000 | 
   47 | 35m48s |    0.68423 |    1.0000 |     1.0000 |    1.0000 |       5.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.0000 |   -3.6760 |     1.0000 | 
   48 | 46m59s | 

  " state: %s" % convergence_dict)


   52 | 20m24s |    0.70875 |    1.0000 |     1.0000 |    1.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.0000 |   -4.5530 |     1.0000 | 
   53 | 32m03s |    0.71272 |    1.0000 |     1.0000 |    0.0000 |       7.8988 |   -7.3897 |    9.0000 |   -6.0000 |     2.2282 |   -3.9884 |     1.0000 | 
   54 | 14m07s |    0.70489 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -9.0000 |    9.0000 |   -4.4459 |     1.1423 |   -3.4071 |     1.0000 | 
   55 | 33m52s |    0.67925 |    1.0000 |     1.0000 |    0.0000 |       6.7300 |   -9.0000 |    6.9772 |   -5.4182 |     1.0000 |   -3.8286 |     1.0000 | 
   56 | 33m33s |    0.68074 |    1.0000 |     1.0000 |    1.0000 |       7.9544 |   -7.0439 |    9.0000 |   -3.1206 |     1.0000 |   -5.1356 |     1.0000 | 
   57 | 27m47s |    0.69956 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    7.5925 |   -4.2148 |     1.3650 |   -5.0667 |     1.0000 | 
   58 | 48m44s |    0.71363 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.0000 |   -7.0000 |     1.0000 | 
   59 | 48m20s |    0.71315 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0729 |    9.0000 |   -4.0281 |     1.0000 |   -7.0000 |     1.0000 | 
   60 | 13m46s |    0.70908 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -6.2446 |    9.0000 |   -4.6009 |     1.8602 |   -3.8256 |     1.0000 | 
   61 | 34m33s |    0.69283 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -6.2453 |    5.0000 |   -6.0000 |     1.0000 |   -3.1902 |     1.0000 | 
   62 | 45m34s |    0.67359 |    1.0000 |     1.0000 |    0.0000 |       5.0000 |   -7.0260 |    9.0000 |   -6.0000 |     1.0000 |   -4.6645 |     1.0000 | 
   63 | 18m45s |    0.72120 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     3.0000 |   -4.1409 |     1.0000 | 
   64 | 14m08s |    0.70759 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -9.0000 |    9.0000 |   -6.0000 |     3.0000 |   -2.7378 |     1.0000 | 
   65 | 41m11s |    0.68831 |    1.0000 |     1.0000 |    0.0000 |       6.5111 |   -5.0000 |    9.0000 |   -6.0000 |     3.0000 |   -3.9277 |     1.0000 | 
   66 | 42m34s |    0.71260 |    1.0000 |     1.0000 |    0.0000 |       8.8859 |   -5.0000 |    9.0000 |   -5.0669 

  " state: %s" % convergence_dict)


   68 | 16m44s |    0.67067 |    1.0000 |     1.0000 |    0.0000 |       7.2565 |   -9.0000 |    9.0000 |   -3.7543 |     1.0135 |   -2.0913 |     1.0000 | 
   69 | 49m30s |    0.70640 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -9.0000 |    9.0000 |   -3.5357 |     1.0000 |   -7.0000 |     1.0000 | 
   70 | 49m13s |    0.70561 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -6.7454 |    7.0560 |   -3.2950 |     1.0000 |   -6.7314 |     1.0000 | 
   71 | 15m29s |    0.61267 |    1.0000 |     1.0000 |    0.0000 |       5.0000 |   -5.0000 |    5.0000 |   -3.6817 |     1.0000 |   -2.0000 |     1.0000 | 
   72 | 46m56s |    0.70712 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -7.0516 |    9.0000 |   -4.7905 |     1.0000 |   -7.0000 |     1.0000 | 
   73 | 60m21s |    0.71194 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     3.0000 |   -7.0000 |     1.0000 | 
   74 | 12m20s |    0.69828 |    1.0000 |     1.0000 |    

  " state: %s" % convergence_dict)


   78 | 50m00s |    0.69092 |    1.0000 |     1.0000 |    0.0000 |       7.7467 |   -5.7728 |    9.0000 |   -5.3181 |     2.0652 |   -5.8542 |     1.0000 | 
   79 | 30m32s |    0.70874 |    1.0000 |     1.0000 |    0.0000 |       8.7169 |   -9.0000 |    9.0000 |   -4.1188 |     1.0000 |   -4.3757 |     1.0000 | 
   80 | 10m29s |    0.68881 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -8.5538 |    9.0000 |   -6.0000 |     1.7371 |   -2.0000 |     1.0000 | 
   81 | 24m26s |    0.69291 |    1.0000 |     1.0000 |    0.0081 |       8.8968 |   -7.3501 |    6.7445 |   -4.0455 |     1.0840 |   -4.2544 |     1.0000 | 
   82 | 23m03s |    0.67773 |    1.0000 |     1.0000 |    0.0359 |       7.4748 |   -7.9632 |    5.4337 |   -4.8790 |     1.1595 |   -2.0255 |     1.0000 | 
   83 | 11m30s |    0.68678 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    5.0000 |   -6.0000 |     1.0000 |   -2.0000 |     1.0000 | 
   84 | 49m13s |    0.71324 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0830 |    9.0000 |   -5.5476 |     1.0000 |   -6.0586 |     1.0000 | 
   85 | 45m40s |    0.69643 |    1.0000 |     1.0000 |    0.0000 |       8.9173 |   -7.3933 |    8.8535 |   -3.2991 |     1.0000 |   -6.0678 |     1.0000 | 
   86 | 46m43s |    0.70017 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.0000 |   -7.0000 |     1.0000 | 
   87 | 20m59s |    0.68386 |    1.0000 |     1.0000 |    0.9832 |       8.3914 |   -5.0227 |    6.4918 |   -4.1948 |     1.0207 |   -3.5946 |     1.0000 | 
   88 | 32m08s |    0.70946 |    1.0000 |     1.0000 |    0.0000 |       8.9396 |   -5.0000 |    8.3724 |   -4.7471 |     2.9961 |   -4.9628 |     1.0000 | 
   89 | 17m32s |    0.71528 |    1.0000 |     1.0000 |    0.0000 |       8.4757 |   -5.0000 |    9.0000 |   -6.0000 |     3.0000 |   -4.0391 |     1.0000 | 
   90 | 54m04s |    0.71921 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.4129 |    9.0000 |   -4.4303 |     3.0000 |   -5.8702 |     1.0000 | 
   91 | 46m05s |    0.69936 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    9.0000 |   -5.1578 |     1.0000 |   -7.0000 |     1.0000 | 
   92 | 21m32s |    0.70576 |    1.0000 |     1.0000 |    0.0000 |       7.0924 |   -9.0000 |    9.0000 |   -6.0000 |     1.0000 |   -3.3751 |     1.0000 | 
   93 | 29m47s |    0.70674 |    1.0000 |     1.0000 |    0.0000 |       8.7334 |   -7.4503 |    9.0000 |   -4.1811 |     2.6201 |   -2.8486 |     1.0000 | 
   94 | 35m46s |    0.70935 |    1.0000 |     1.0000 |    0.1128 |       9.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.4127 |   -5.8970 |     1.0000 | 
   95 | 19m20s |    0.71147 |    1.0000 |     1.0000 |    0.0000 |       8.8006 |   -5.1035 |    9.0000 |   -4.0590 |     3.0000 |   -3.5528 |     1.0000 | 
   96 | 26m01s |    0.71480 |    1.0000 |     1.0000 |    

  " state: %s" % convergence_dict)


   99 | 22m56s |    0.66890 |    1.0000 |     1.0000 |    0.0000 |       5.0000 |   -5.0000 |    9.0000 |   -6.0000 |     1.0000 |   -2.6224 |     1.0000 | 
  100 | 29m31s |    0.72044 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -9.0000 |    9.0000 |   -6.0000 |     1.0000 |   -4.3948 |     1.0000 | 
  101 | 55m27s |    0.68434 |    1.0000 |     1.0000 |    0.0000 |       9.0000 |   -5.0000 |    6.8356 |   -3.9222 |     3.0000 |   -5.8900 |     1.0000 | 
Error in iteration: 95, ignore result


AttributeError: module 'sys' has no attribute 'exc_clear'
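
The AttributeError above arises because sys.exc_clear() exists only in Python 2; it was removed in Python 3, where the exception state is cleared automatically when an except block exits. If the modified package file calls it directly, one hedged workaround is to guard the call. The function name `clear_exception_state` is hypothetical, and where exactly the call sits in the modified files is an assumption.

```python
import sys


def clear_exception_state():
    """Call sys.exc_clear() on Python 2; do nothing on Python 3."""
    exc_clear = getattr(sys, 'exc_clear', None)
    if exc_clear is not None:
        exc_clear()
        return True
    return False


try:
    raise ValueError('simulated failure inside one BO iteration')
except ValueError:
    # On Python 3 this is a no-op instead of an AttributeError.
    cleared = clear_exception_state()

print('exc_clear available:', cleared)
```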