## Tuning 3P Model with Random Search

In [4]:
import os
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 20})

# Keras-related modules
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import keras_tuner as kt
from keras_tuner import HyperModel
from sklearn.model_selection import train_test_split

### 1. Load data from file and prepare to be used for tuning

Load the EoS and the $M-R$ curves from the processed 3P data files. Then normalize the data with min-max scaling and perform an 80-20 train-test split.

In [2]:
# Load the un-normalized data
X = np.loadtxt('../data/pars.txt')
Y = np.loadtxt('../data/wgt.txt')

# Normalize the data
x = (X - np.min(X)) / (np.max(X) - np.min(X))
y = (Y - np.min(Y)) / (np.max(Y) - np.min(Y))

print("dim(Y) =", y.shape[0])
print("dim(X) =", x.shape[0], "x", x.shape[1])

# Perform train-test split as 80-20
x_tr, x_ts, y_tr, y_ts = train_test_split(x, y, test_size=0.2, shuffle=True, random_state=41)

dim(Y) = 4842
dim(X) = 4842 x 91
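The normalization above is min-max scaling with the global minimum and maximum of each array (not per column). Predictions made in normalized units can only be mapped back if those two numbers are kept. The following is a minimal sketch of that round trip; the helper names and the `Y_demo` values are made up for illustration and are not part of the original notebook.

```python
import numpy as np

def minmax_fit(a):
    """Return the global (min, max) of an array, as used for scaling."""
    return float(np.min(a)), float(np.max(a))

def minmax_apply(a, lo, hi):
    """Scale values into [0, 1] using a stored (lo, hi) pair."""
    return (a - lo) / (hi - lo)

def minmax_invert(a_norm, lo, hi):
    """Map normalized values back to the original units."""
    return a_norm * (hi - lo) + lo

# Hypothetical data, only to demonstrate the round trip
Y_demo = np.array([2.0, 4.0, 6.0, 10.0])
lo, hi = minmax_fit(Y_demo)
y_demo = minmax_apply(Y_demo, lo, hi)   # values in [0, 1]
Y_back = minmax_invert(y_demo, lo, hi)  # recovers Y_demo
```

Storing `lo` and `hi` alongside the saved model is one way to make later predictions interpretable in physical units.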


### 2. Construct the search space

We seek the hyperparameter values that optimize the DNN for performance and accuracy. The candidate hyperparameters include the number of layers, the number of neurons (units) in each layer, and the activation functions on the inner layers and the output layer.

Even with only a few hyperparameters, the search space can be large and the search may take a while to complete, depending on the ranges we choose. Therefore, to expedite the tuning, we fix some hyperparameters (here the activation functions, the loss, and the optimizer) and search over the others (the number of layers and the number of units per layer).

In [10]:
# Define the search space
class RegressionHyperModel(HyperModel):
    def __init__(self, input_shape):
        super().__init__()              # Let HyperModel set up its own attributes
        self.input_shape = input_shape  # Stored for reference; not used in build()

    def build(self, hp):
        model = keras.Sequential()
        # Tune the number of hidden layers (1 to 6)
        for i in range(hp.Int('num_layers', 1, 6)):
            model.add(
                layers.Dense(
                    # Tune the number of units in each layer separately
                    units=hp.Int(f"units_{i}", min_value=182, max_value=728, step=91),
                    activation='relu'
                )
            )
        # Single linear output unit for regression
        model.add(layers.Dense(1, activation='linear'))
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

In [11]:
# Initialize the input shape
input_shape = (x_tr.shape[1],)
hypermodel = RegressionHyperModel(input_shape)
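To get a feel for how large this search space is, the short sketch below counts the distinct architectures it contains. It assumes that only the first `num_layers` values of `units_i` matter for a given architecture; the helper `n_choices` is introduced here for illustration only.

```python
def n_choices(min_value, max_value, step):
    """Number of values an integer range with a fixed step can take."""
    return (max_value - min_value) // step + 1

# Units per layer: 182, 273, ..., 728
units_choices = n_choices(182, 728, 91)

# For n hidden layers there are units_choices**n combinations
sizes = {n: units_choices ** n for n in range(1, 7)}
total = sum(sizes.values())

print("choices per layer:", units_choices)
print("architectures per depth:", sizes)
print("total architectures:", total)
```

Under this counting there are 7 choices per layer and 137,256 distinct architectures in total, so a random search with a budget on the order of a hundred trials samples only a very small fraction of the space.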

### 3. Initialize search parameters

Set values of search parameters and create an early-stopping callback. 

In [12]:
# Initialize the search
tuner = kt.RandomSearch(
    hypermodel,                # Pass the hypermodel object
    objective='val_loss',      # Quantity to monitor during tuning
    seed=42,                   # Set reproducibility of randomness
    max_trials=100,            # Max number of trials with different hyperparameters 
                               # 6!=720<1000, 6^4=1296>1000
    executions_per_trial=1,    # Number of repeated trials with same hyperparameters
    directory="random_search", # Set directory to store search results
    project_name="np",         # Set the subdirectory name
    overwrite=True             # Choose if previous search results should be ignored
)
# Set up callback for early stopping 
stop_early = tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=1.0e-6, patience=10)

# Print the summary of search space
tuner.search_space_summary()

Search space summary
Default search space size: 2
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 1, 'max_value': 6, 'step': 1, 'sampling': 'linear'}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 182, 'max_value': 728, 'step': 91, 'sampling': 'linear'}
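The early-stopping rule configured above can be pictured with a small pure-Python sketch. This is a simplified illustration of the `min_delta` and `patience` logic, not the Keras implementation itself; the function name `early_stop_epoch` and the loss sequence are hypothetical.

```python
def early_stop_epoch(losses, min_delta=1.0e-6, patience=10):
    """Return the (0-based) epoch at which training would stop, or None.

    An epoch counts as an improvement only if the loss drops below the
    best value seen so far by more than min_delta. Training stops once
    `patience` consecutive epochs pass without such an improvement.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up loss curve: improves for three epochs, then plateaus
demo_losses = [1.0, 0.5, 0.4] + [0.4] * 10
stop_at = early_stop_epoch(demo_losses)
```

With `epochs=1000` as an upper bound, a rule like this is what lets each trial finish in far fewer epochs once the monitored loss stops improving.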


Now begin tuning the network.

In [13]:
tuner.search(
    x_tr, y_tr,
    batch_size=64,
    epochs=1000,
    validation_data=(x_ts, y_ts),
    callbacks=[stop_early],
    verbose=2,
)

Trial 100 Complete [00h 00m 07s]
val_loss: 0.013538737781345844

Best val_loss So Far: 0.010502735152840614
Total elapsed time: 00h 28m 02s
INFO:tensorflow:Oracle triggered exit


### 4. Publish search results

Print the top 10 trials and then pick the best model by hand.

In [14]:
tuner.results_summary(num_trials=10)

Results summary
Results in random_search/np
Showing 10 best trials
<keras_tuner.engine.objective.Objective object at 0x7f00c01a5ba0>
Trial summary
Hyperparameters:
num_layers: 1
units_0: 364
units_1: 182
units_2: 273
units_3: 182
units_4: 364
units_5: 728
Score: 0.010502735152840614
Trial summary
Hyperparameters:
num_layers: 2
units_0: 364
units_1: 455
units_2: 728
units_3: 455
units_4: 455
units_5: 364
Score: 0.010503713972866535
Trial summary
Hyperparameters:
num_layers: 2
units_0: 182
units_1: 546
units_2: 182
units_3: 728
units_4: 546
units_5: 364
Score: 0.010597708635032177
Trial summary
Hyperparameters:
num_layers: 5
units_0: 364
units_1: 728
units_2: 728
units_3: 182
units_4: 728
units_5: 455
Score: 0.010683621279895306
Trial summary
Hyperparameters:
num_layers: 2
units_0: 728
units_1: 455
units_2: 364
units_3: 637
units_4: 637
units_5: 273
Score: 0.010698849335312843
Trial summary
Hyperparameters:
num_layers: 4
units_0: 637
units_1: 546
units_2: 546
units_3: 273
units_4: 637
un
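Each trial in the summary lists a `units_i` value for all six layer indices, but the build function only loops over the first `num_layers` of them, so the remaining values are inactive. The sketch below extracts the active layer widths from a trial's hyperparameters; the helper `active_units` is hypothetical, and the two dictionaries are copied from the first two trials printed above.

```python
def active_units(hp_values):
    """Return the layer widths actually used by a trial."""
    n = hp_values["num_layers"]
    return [hp_values[f"units_{i}"] for i in range(n)]

# Hyperparameters of the two best trials, as printed in the summary
trial_1 = {"num_layers": 1, "units_0": 364, "units_1": 182, "units_2": 273,
           "units_3": 182, "units_4": 364, "units_5": 728}
trial_2 = {"num_layers": 2, "units_0": 364, "units_1": 455, "units_2": 728,
           "units_3": 455, "units_4": 455, "units_5": 364}

print(active_units(trial_1))
print(active_units(trial_2))
```

Read this way, the best trial is a single hidden layer of 364 units, and the runner-up uses two hidden layers of 364 and 455 units.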

In [15]:
# Get the top 10 models and keep the best one
models = tuner.get_best_models(num_models=10)
best_model = models[0]

# Build the best model
best_model.build(input_shape=(None, 91))

# Show the best model
best_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 364)               33488     
                                                                 
 dense_1 (Dense)             (None, 1)                 365       
                                                                 
Total params: 33,853
Trainable params: 33,853
Non-trainable params: 0
_________________________________________________________________
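The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A minimal sketch, with `dense_params` as an illustrative helper name:

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

hidden = dense_params(91, 364)  # input features -> hidden layer
output = dense_params(364, 1)   # hidden layer -> single output
total = hidden + output

print(hidden, output, total)
```

These values agree with the 33,488, 365, and 33,853 parameters reported in the model summary.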


Evaluate the best model and note the order of magnitude of the loss function for reference.

In [16]:
loss = best_model.evaluate(x_ts, y_ts, verbose=0)
print("Loss = {:.4e}".format(loss))

Loss = 1.0503e-02
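Since the loss is a mean squared error on min-max normalized targets, its square root may be easier to interpret. The sketch below converts the reported loss to an RMSE and shows how it could be rescaled to the original units; `y_min` and `y_max` are placeholders for the minimum and maximum of the un-normalized `Y`, which are not printed in this notebook.

```python
import math

mse = 1.0503e-02       # loss reported above (normalized units)
rmse = math.sqrt(mse)  # typical error as a fraction of the target range

def rmse_in_original_units(rmse_norm, y_min, y_max):
    """Undo min-max scaling for an error magnitude (assumed y_min, y_max)."""
    return rmse_norm * (y_max - y_min)

print("RMSE (normalized) = {:.4f}".format(rmse))
```

An RMSE of roughly 0.10 corresponds to a typical error of about 10% of the full range of the targets.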


### 5. Save the best model to file

Tuning a network is computationally expensive. Moreover, the results are not exactly reproducible: the tuner seed fixes which hyperparameters are sampled, but training itself remains stochastic (for example, through random weight initialization and batch shuffling). Therefore, we tune a network only once for a given data set and do not repeat the search unless the input shape or the data set itself has changed.

Write the best model to file so that it can be loaded directly without repeating the search.

In [17]:
best_model.save("../output/model_np.h5")
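In a later session, the saved file can be read back with `keras.models.load_model` instead of repeating the search. The wrapper below is only a sketch: the function name `load_saved_model` and the existence check are assumptions added for illustration, and TensorFlow is imported lazily so that the check can run even where it is not installed.

```python
import os

def load_saved_model(path):
    """Load a saved Keras model, or return None if the file is missing."""
    if not os.path.exists(path):
        return None
    # Imported here so a missing file is reported without loading TensorFlow
    from tensorflow import keras
    return keras.models.load_model(path)

model = load_saved_model("../output/model_np.h5")
if model is None:
    print("No saved model found; run the search first.")
```

The loaded model expects inputs normalized in the same way as the training data.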