## **Heston hyper-parameter tuning**

In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

Load the libraries

In [2]:
import pandas as pd
import datetime, os
import numpy as np
import numpy.random as npr
from pylab import plt, mpl

from scipy.stats import norm
from scipy import optimize
import scipy.integrate as integrate
import scipy.special as special 

import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard
from tensorboard.plugins.hparams import api as hp
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler

import matplotlib.pyplot as plt
import seaborn as sns

# Load the TensorBoard notebook extension
%load_ext tensorboard

Load the Heston data

In [3]:
# Read the input and output CSV files:
raw_Options_input = pd.read_csv (r"/Users/Marcklein/Desktop/Master Thesis/Option pricing using Neural Networks/Python/Heston/Heston_data_input.csv")
raw_Options_output = pd.read_csv (r"/Users/Marcklein/Desktop/Master Thesis/Option pricing using Neural Networks/Python/Heston/Heston_data_output.csv")

# read_csv produces an extra 'Unnamed: 0' first column (typically the index saved with the file); delete it:
del raw_Options_input['Unnamed: 0']
del raw_Options_output['Unnamed: 0']


Copy the data so the raw frames stay untouched

In [4]:
Options_input = raw_Options_input.copy()
Options_output = raw_Options_output.copy()

The standard deviation is the square root of the average squared deviation from the mean, so it is zero only when every value of a variable is identical (equal to the mean). Such a variable has no discriminative power and can be removed from the analysis: it cannot improve any classification, clustering or regression task. It would also break the normalization used below, because dividing by a zero standard deviation yields NaN or infinite values. Many implementations either drop such variables for you or throw an error in a matrix calculation.
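As a minimal sketch of such a check, assuming the inputs live in a pandas DataFrame, the hypothetical helper `drop_constant_columns` below removes every column that holds a single distinct value (which is exactly when the standard deviation is zero). The column names in the toy frame are illustrative only and are not taken from the Heston data.

```python
import pandas as pd

def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without columns that contain only one distinct value."""
    # Hypothetical helper, not part of the original notebook.
    # Counting distinct values avoids floating-point round-off that can make
    # the computed standard deviation of a constant column tiny but non-zero.
    constant = df.columns[df.nunique() <= 1]
    return df.drop(columns=constant)

# Toy data: 'rate' is constant, 'strike' varies
toy = pd.DataFrame({"strike": [90.0, 100.0, 110.0], "rate": [0.02, 0.02, 0.02]})
cleaned = drop_constant_columns(toy)
print(list(cleaned.columns))
```

Running the helper on the training inputs before normalizing would guard against a division by zero in the z-score step.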

### **Data preparation**

We split our dataset into a training set and a test set (validation set is taken from the training set during model.fit).

In [5]:
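# About 96.67% of the rows (290,000 of 300,000) for training and validation, the rest for testing.
# Using the same random_state for inputs and labels keeps their rows aligned.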
train_dataset = Options_input.sample(frac=0.96666666666667, random_state=42)
test_dataset = Options_input.drop(train_dataset.index)

train_labels = Options_output.sample(frac=0.96666666666667, random_state=42)
test_labels = Options_output.drop(train_labels.index)
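
The split above relies on both `sample` calls using the same `random_state` on frames of equal length. A slightly more defensive variant, sketched here on toy frames with made-up column names, samples the inputs once and then selects the labels by the resulting index, so the two cannot drift apart.

```python
import pandas as pd

# Toy stand-ins for Options_input / Options_output (values are illustrative only)
inputs = pd.DataFrame({"x": range(10)})
outputs = pd.DataFrame({"y": [v * 2 for v in range(10)]})

# Sample the inputs once
train_x = inputs.sample(frac=0.8, random_state=42)
test_x = inputs.drop(train_x.index)

# Select the labels by index instead of sampling a second time
train_y = outputs.loc[train_x.index]
test_y = outputs.loc[test_x.index]

print(len(train_x), len(test_x))
```

This assumes both frames share the same index, which holds here because both CSV files are read with the default integer index.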

Check the overall statistics

In [6]:
train_stats = train_dataset.describe().T

In [7]:
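# Normalize the data (z-score) with training-set statistics; note that this norm() shadows scipy.stats.norm imported above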
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset).values
normed_test_data = norm(test_dataset).values

#make the labels into numpy array just like the normed training data
train_labels = np.asarray(train_labels)
test_labels = np.asarray(test_labels)

#check the shapes
print("Input train data:", normed_train_data.shape, " Output train data:", train_labels.shape)
print("Input test data:", normed_test_data.shape, " Output test data:", test_labels.shape)

Input train data: (290000, 7)  Output train data: (290000, 10)
Input test data: (10000, 7)  Output test data: (10000, 10)


### **The hyperparameter testing-model**

We start by defining the hyperparameters that we want to assess, together with the values to test for each. We then set the metric of the model to mean squared error. Since TensorBoard works with log files that are written during training, we create logs that record the losses, metrics and other measures of each run.

In [8]:
#The hyperparameters & their values to be tested are stored in a special type called HParam
HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([500, 750, 1000]))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'rmsprop']))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.RealInterval(.0001,.001))
HP_ACTIVATION = hp.HParam('activation', hp.Discrete(['relu', 'tanh', 'sigmoid']))
HP_BATCHSIZE = hp.HParam('batch_size', hp.Discrete([32, 256, 1000]))

#Setting the Metric to MSE (Mean Squared Error)
METRIC_MSE = 'mean_squared_error'

# Clear any logs from previous runs
!rm -rf ./logs/ 

#Creating & configuring log files
with tf.summary.create_file_writer('logs/hparam_tuning').as_default():
    hp.hparams_config(
        hparams=[HP_NUM_UNITS, HP_OPTIMIZER, HP_LEARNING_RATE, HP_ACTIVATION, HP_BATCHSIZE],
        metrics=[hp.Metric(METRIC_MSE, display_name='mean_squared_error')],
        )
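
For reference, with $N$ samples and $m = 10$ outputs per sample, the mean squared error reported for this setup averages the squared errors over both samples and outputs. The symbols $y_{ij}$ (label) and $\hat{y}_{ij}$ (prediction) are notation introduced here, not taken from the notebook:

```latex
\mathrm{MSE} = \frac{1}{N m} \sum_{i=1}^{N} \sum_{j=1}^{m} \left( y_{ij} - \hat{y}_{ij} \right)^2
```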

Now we create a function that trains and validates the model and takes the hyperparameters as arguments. Each combination of hyperparameters runs for 50 epochs. The hyperparameters are provided in an hparams dictionary and used throughout the training function.

In [9]:
#weight and bias initializers
weights_initializer = keras.initializers.GlorotUniform(seed=42)
bias_initializer = keras.initializers.Zeros()

# Display training progress by printing a single dot for each completed epoch
class PrintDot(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if epoch % 100 == 0: print('')
        print('.', end='')


# Trains the model for one combination of hyperparameters and returns its MSE on the test set
def train_val_model(hparams):
    #Keras sequential model with Hyperparameters passed from the argument
    model = keras.models.Sequential([
            #Layer to be used as an entry point into a Network
            keras.layers.InputLayer(input_shape=[len(train_dataset.keys())]),
        
            #Dense layer
            keras.layers.Dense(hparams[HP_NUM_UNITS], kernel_initializer = weights_initializer,
                               activation = hparams[HP_ACTIVATION], bias_initializer = bias_initializer,
                              name='Layer_1'),
        
            #activation function is linear since we are doing regression
            keras.layers.Dense(10, activation='linear', name='Output_layer')])
    
    if hparams[HP_OPTIMIZER] == 'adam':
        optimizer = tf.keras.optimizers.Adam(learning_rate=hparams[HP_LEARNING_RATE], beta_1=0.9, beta_2=0.999,
                             epsilon=1e-07, amsgrad=False, name='Adam')
    elif hparams[HP_OPTIMIZER] == 'sgd':
        optimizer = tf.keras.optimizers.SGD(learning_rate=hparams[HP_LEARNING_RATE], nesterov=False, name='SGD')
    elif hparams[HP_OPTIMIZER] == 'rmsprop':
        optimizer = tf.keras.optimizers.RMSprop(learning_rate=hparams[HP_LEARNING_RATE], rho=0.9, momentum=0.0, epsilon=1e-07, centered=False, name='RMSprop')
    else:
        raise ValueError("unexpected optimizer name: %r" % (hparams[HP_OPTIMIZER],))
    
    
    #Compiling the model
    model.compile(optimizer=optimizer, 
                  loss='mean_squared_error', #Computes the mean of squares of errors between labels and predictions
                  metrics=['mean_squared_error']) #Computes the mean squared error between y_true and y_pred
    
    #Training the network
    model.fit(normed_train_data, train_labels, 
         batch_size=hparams[HP_BATCHSIZE], 
         epochs=50,
         verbose=0,
         validation_split=0.2,
         callbacks=[PrintDot()])
    
    _, mse = model.evaluate(normed_test_data, test_labels)
    return mse

The following function starts the training process with the hyperparameters to be assessed, then writes a summary to the logs containing those hyperparameters and the final MSE returned by the train_val_model function.

In [10]:
def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)  # record the values used in this trial
        mse = train_val_model(hparams)
        tf.summary.scalar(METRIC_MSE, mse, step=10)

We will now train the model for each combination of the hyperparameters

In [11]:
%%time

#A unique number for each training session
session_num = 0

# Nested for loops that train every combination of hyperparameters (for the learning rate, only the two endpoints of the interval are used)
for num_units in HP_NUM_UNITS.domain.values:
    for learning_rate in (HP_LEARNING_RATE.domain.min_value, HP_LEARNING_RATE.domain.max_value):
        for batch_size in HP_BATCHSIZE.domain.values:
            for activation in HP_ACTIVATION.domain.values:
                for optimizer in HP_OPTIMIZER.domain.values:
                    hparams = {
                        HP_NUM_UNITS: num_units,
                        HP_LEARNING_RATE: learning_rate,
                        HP_BATCHSIZE: batch_size,
                        HP_ACTIVATION: activation,
                        HP_OPTIMIZER: optimizer
                        }
                    run_name = "run-%d" % session_num
                    print('--- Starting trial: %s' % run_name)
                    print({h.name: hparams[h] for h in hparams})
                    run('logs/hparam_tuning/' + run_name, hparams)
                    session_num += 1


--- Starting trial: run-0
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'adam'}

--- Starting trial: run-1
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'rmsprop'}

--- Starting trial: run-2
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'sgd'}

--- Starting trial: run-3
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'adam'}

--- Starting trial: run-4
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'rmsprop'}

--- Starting trial: run-5
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'sgd'}

--- Starting trial: run-6
{'num_units': 500, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'tanh', 'optimizer': 'adam'}

--- Starting trial: run-7
{'num_units': 500, 'learning_rate': 0

--- Starting trial: run-68
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 264, 'activation': 'sigmoid', 'optimizer': 'sgd'}

--- Starting trial: run-69
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 264, 'activation': 'tanh', 'optimizer': 'adam'}

--- Starting trial: run-70
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 264, 'activation': 'tanh', 'optimizer': 'rmsprop'}

--- Starting trial: run-71
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 264, 'activation': 'tanh', 'optimizer': 'sgd'}

--- Starting trial: run-72
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 1000, 'activation': 'relu', 'optimizer': 'adam'}

--- Starting trial: run-73
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 1000, 'activation': 'relu', 'optimizer': 'rmsprop'}

--- Starting trial: run-74
{'num_units': 750, 'learning_rate': 0.0001, 'batch_size': 1000, 'activation': 'relu', 'optimizer': 'sgd'}

--- Starting trial: run-75
{'num_units': 750, 'learni


--- Starting trial: run-102
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'sigmoid', 'optimizer': 'adam'}

--- Starting trial: run-103
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'sigmoid', 'optimizer': 'rmsprop'}

--- Starting trial: run-104
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'sigmoid', 'optimizer': 'sgd'}

--- Starting trial: run-105
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'tanh', 'optimizer': 'adam'}

--- Starting trial: run-106
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'tanh', 'optimizer': 'rmsprop'}

--- Starting trial: run-107
{'num_units': 750, 'learning_rate': 0.001, 'batch_size': 1000, 'activation': 'tanh', 'optimizer': 'sgd'}

--- Starting trial: run-108
{'num_units': 1000, 'learning_rate': 0.0001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'adam'}

--- Starting trial: run-109
{'num_units':


--- Starting trial: run-136
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'rmsprop'}

--- Starting trial: run-137
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'relu', 'optimizer': 'sgd'}

--- Starting trial: run-138
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'adam'}

--- Starting trial: run-139
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'rmsprop'}

--- Starting trial: run-140
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'sigmoid', 'optimizer': 'sgd'}

--- Starting trial: run-141
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'tanh', 'optimizer': 'adam'}

--- Starting trial: run-142
{'num_units': 1000, 'learning_rate': 0.001, 'batch_size': 32, 'activation': 'tanh', 'optimizer': 'rmsprop'}

--- Starting trial: run-143
{'num_units': 100
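
The five nested loops can also be written as a single loop over `itertools.product`, which makes the size of the grid easy to check before committing to a long run. The sketch below uses plain lists that mirror the values defined earlier (for the learning rate, only the two interval endpoints, as in the loop above). It only enumerates the combinations and does not train anything.

```python
from itertools import product

# Plain-Python copies of the grid (mirrors the HParam domains defined earlier)
grid = {
    "num_units": [500, 750, 1000],
    "learning_rate": [0.0001, 0.001],   # only the interval endpoints are tried
    "batch_size": [32, 256, 1000],
    "activation": ["relu", "sigmoid", "tanh"],
    "optimizer": ["adam", "rmsprop", "sgd"],
}

# One dictionary per trial; the last key varies fastest, like the innermost loop
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(combinations))   # 3 * 2 * 3 * 3 * 3 = 162
print(combinations[0])
```

With 50 epochs per trial, knowing the trial count up front helps estimate the total run time.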

It’s time to launch TensorBoard. Use the following command:

In [12]:
%tensorboard --logdir logs/hparam_tuning

Reusing TensorBoard on port 6006 (pid 6114), started 9:09:24 ago. (Use '!kill 6114' to kill it.)

Once it has launched, the dashboard appears. Click on the HPARAMS tab to see the hyperparameter logs.

In the Table View, all hyperparameter combinations and their respective MSE are listed in a table. The left side of the dashboard provides a number of filtering capabilities, such as sorting by the metric, filtering by a specific type or value of hyperparameter, and filtering by status.

The Parallel Coordinates View shows each run as a line going through an axis for each hyperparameter and metric. The interactive plot allows us to mark a region, which highlights only the runs that pass through it. The scale of each axis can also be switched between linear, logarithmic and quantile. This is extremely useful for understanding the relationships between the hyperparameters. We can identify the best hyperparameters by selecting the run with the lowest MSE (hover over a line to see its values).

The Scatter Plot View plots each hyperparameter and the given metric against the metric. This helps us understand how different values of each hyperparameter correlate with the metric.

LINKS:

https://analyticsindiamag.com/parameter-tuning-tensorboard/

https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams

https://medium.com/ml-book/neural-networks-hyperparameter-tuning-in-tensorflow-2-0-a7b4e2b574a1

https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/hparams/summary_v2.py



IDEAS: 

- HP_LEARNING_RATE = hp.HParam("learning_rate", hp.RealInterval(1e-5, 1e-1))

- HP_L2 = hp.HParam('l2 regularizer', hp.RealInterval(.001,.01))

- HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.3, 0.8))
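
If the learning-rate interval is widened as in the first idea, trying only its two endpoints says little about the values in between. One common option, sketched here as an assumption rather than something the notebook does, is to draw learning rates log-uniformly from the interval. The helper name `sample_log_uniform` is hypothetical.

```python
import math
import random

def sample_log_uniform(low, high, n, seed=42):
    """Draw n values whose logarithms are uniform on [log(low), log(high)]."""
    # Hypothetical helper, not part of the original notebook
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return [math.exp(rng.uniform(math.log(low), math.log(high))) for _ in range(n)]

# Five candidate learning rates spread across four orders of magnitude
learning_rates = sample_log_uniform(1e-5, 1e-1, n=5)
print(learning_rates)
```

Each sampled value could then be placed in the hparams dictionary under HP_LEARNING_RATE in place of the interval endpoints.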