## Tuning 3P Model with Random Search

In [1]:
import os
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 20})

# Keras-related modules
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import keras_tuner as kt
from keras_tuner import HyperModel
from sklearn.model_selection import train_test_split

2022-07-21 20:14:27.636479: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1


### 1. Load data from file and prepare it for tuning

Load the EoS and the $M-R$ curves from the processed 3P data files. Next, min-max normalize the data and split it 80-20 into training and test sets.

In [2]:
# Load the un-normalized data
R = np.loadtxt('../data/mrc_4l.txt')
P = np.loadtxt('../data/eos_4l.txt')

# Normalize the data
r = (R - np.min(R)) / (np.max(R) - np.min(R))
p = (P - np.min(P)) / (np.max(P) - np.min(P))

# Perform train-test split as 80-20
x_tr, x_ts, y_tr, y_ts = train_test_split(r, p, test_size=0.2, shuffle=True, random_state=41)
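
The normalization above uses a single global minimum and maximum per array, so network predictions have to be mapped back with the same two numbers before they can be read in physical units. A minimal sketch of the forward and inverse scaling follows; the helper names (`minmax_fit`, `minmax_apply`, `minmax_invert`) and the small demo array are hypothetical and are not part of the notebook.

```python
import numpy as np

def minmax_fit(a):
    """Return the global (min, max) of an array, as used for the scaling above."""
    return float(np.min(a)), float(np.max(a))

def minmax_apply(a, lo, hi):
    """Scale values to [0, 1] using one global min and max."""
    return (a - lo) / (hi - lo)

def minmax_invert(a_scaled, lo, hi):
    """Map scaled values back to the original units."""
    return a_scaled * (hi - lo) + lo

# Hypothetical stand-in for the EoS array P (not the real data)
P_demo = np.array([[1.0, 3.0, 5.0],
                   [2.0, 4.0, 9.0]])

lo, hi = minmax_fit(P_demo)
p_demo = minmax_apply(P_demo, lo, hi)    # values now lie in [0, 1]
P_back = minmax_invert(p_demo, lo, hi)   # recovers P_demo
```

Keeping `lo` and `hi` alongside the saved model would allow predictions to be converted back later without reloading the raw data.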

### 2. Construct the search space

We seek the hyperparameter values that optimize the DNN for performance and accuracy. The candidate hyperparameters include the number of layers, the number of neurons (units) in each layer, and the activation functions on the inner layers and the output layer.

Even with only a few hyperparameters, the search space can be large and the search may take a while to complete, depending on the ranges we choose. Therefore, to expedite the tuning, we fix some hyperparameters and search over the others: the activations (ReLU on the hidden layers, linear on the output layer), the loss, and the optimizer are fixed, while the number of hidden layers and the width of each layer are tuned.
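
To get a feel for how large the search space is, the short calculation below counts the distinct architectures for the ranges used in this notebook (1 to 5 hidden layers, widths from 194 to 970 in steps of 97). It assumes both ends of each range are included, which matches the widths of 194 and 970 that appear in the search results.

```python
# Candidate widths for one hidden layer: 194, 291, ..., 970
unit_choices = list(range(194, 970 + 1, 97))

# Candidate numbers of hidden layers: 1, 2, ..., 5
layer_choices = range(1, 5 + 1)

# A network with n hidden layers has len(unit_choices)**n width combinations
n_architectures = sum(len(unit_choices) ** n for n in layer_choices)

print(len(unit_choices))   # 9
print(n_architectures)     # 66429
```

With tens of thousands of candidates, a random search can only visit a small fraction of the space, which is why the ranges are kept narrow.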

In [3]:
# Define the search space
class RegressionHyperModel(HyperModel):
    def __init__(self, input_shape):
        super().__init__()
        # Stored for reference only; build() below lets Keras infer the input size
        self.input_shape = input_shape

    def build(self, hp):
        model = keras.Sequential()
        # Tune the number of layers
        for i in range(hp.Int('num_layers', 1, 5)):
            model.add(
                layers.Dense(
                    # Tune number of units separately
                    units=hp.Int(f"units_{i}", min_value=194, max_value=970, step=97),
                    activation='relu'
                )
            )
        model.add(layers.Dense(97, activation='linear'))
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

In [4]:
# Initialize the input shape
input_shape = (x_tr.shape[1],)
hypermodel = RegressionHyperModel(input_shape)

### 3. Initialize search parameters

Set values of search parameters and create an early-stopping callback. 
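
As a rough illustration of the early-stopping rule, the sketch below mimics in plain Python how a `patience` counter and a `min_delta` threshold decide when training stops. It is a simplified stand-in, not the Keras implementation; the function name `stopping_epoch` and the demo loss curve are hypothetical.

```python
def stopping_epoch(losses, min_delta=1.0e-6, patience=10):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            # Improvement larger than min_delta: record it and reset the counter
            best = loss
            wait = 0
        else:
            # No sufficient improvement: count this epoch against the patience
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve: improves for 5 epochs, then plateaus
demo_losses = [1.0, 0.5, 0.25, 0.125, 0.0625] + [0.0625] * 20
stop_at = stopping_epoch(demo_losses)
print(stop_at)  # 14
```

In this toy curve the last improvement occurs at epoch 4, so the tenth epoch without improvement is epoch 14.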

In [5]:
# Initialize the search
tuner = kt.RandomSearch(
    hypermodel,                # Pass the hypermodel object
    objective='val_loss',      # Quantity to monitor during tuning
    seed=42,                   # Seed for reproducible hyperparameter sampling
    max_trials=5000,           # Max number of trials with different hyperparameters
    executions_per_trial=1,    # Number of repeated trials with same hyperparameters
    directory="random_search", # Set directory to store search results
    project_name="4l",         # Set the subdirectory name
    overwrite=True             # Choose if previous search results should be ignored
)
# Set up callback for early stopping (monitors the training loss)
stop_early = tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=1.0e-6, patience=10)

# Print the summary of search space
tuner.search_space_summary()

Now begin tuning the network.

In [12]:
tuner.search(x_tr, y_tr, batch_size=1024, epochs=5000, validation_data=(x_ts, y_ts),
             callbacks=[stop_early], verbose=2)

Trial 50 Complete [00h 00m 05s]
val_loss: 0.0009532268741168082

Best val_loss So Far: 0.0008079517283476889
Total elapsed time: 00h 09m 35s
INFO:tensorflow:Oracle triggered exit
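
Conceptually, a random search draws a configuration at random, scores it, and keeps the best one seen so far. The sketch below illustrates that loop in plain Python. It is not how KerasTuner is implemented, and `toy_score` is a hypothetical stand-in for the validation loss, so no network is trained.

```python
import random

UNIT_CHOICES = range(194, 970 + 1, 97)

def sample_config(rng):
    """Draw a random number of layers and a random width for each layer."""
    n = rng.randint(1, 5)
    return {"num_layers": n, "units": [rng.choice(UNIT_CHOICES) for _ in range(n)]}

def toy_score(config):
    """Hypothetical stand-in for val_loss; lower is better."""
    return abs(sum(config["units"]) - 2000) / 1000.0

def random_search(max_trials, seed=42):
    """Return the best configuration and score found in max_trials draws."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(max_trials):
        cfg = sample_config(rng)
        score = toy_score(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

best_cfg, best_score = random_search(max_trials=50)
```

Because the generator is seeded, repeating the call reproduces the same sequence of sampled configurations, which mirrors the role of `seed` in the tuner.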


### 4. Publish search results

Print the first few (5-10) top models and then pick the best model by hand.

In [13]:
tuner.results_summary(num_trials=10)

Results summary
Results in random/3p-4l
Showing 10 best trials
Objective(name='val_loss', direction='min')
Trial summary
Hyperparameters:
num_layers: 3
units_0: 194
units_1: 970
units_2: 970
units_3: 873
units_4: 291
Score: 0.0008079517283476889
Trial summary
Hyperparameters:
num_layers: 4
units_0: 291
units_1: 679
units_2: 485
units_3: 291
units_4: 873
Score: 0.0008149743662215769
Trial summary
Hyperparameters:
num_layers: 3
units_0: 194
units_1: 582
units_2: 388
units_3: 582
units_4: 485
Score: 0.0008190334774553776
Trial summary
Hyperparameters:
num_layers: 3
units_0: 582
units_1: 291
units_2: 388
units_3: 873
units_4: 485
Score: 0.0008266346412710845
Trial summary
Hyperparameters:
num_layers: 5
units_0: 388
units_1: 776
units_2: 776
units_3: 388
units_4: 776
Score: 0.0008269138052128255
Trial summary
Hyperparameters:
num_layers: 4
units_0: 291
units_1: 582
units_2: 388
units_3: 485
units_4: 679
Score: 0.0008273362764157355
Trial summary
Hyperparameters:
num_layers: 3
units_0: 485
u
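
The summary lists a value for every `units_i`, even when `num_layers` is smaller than 5. Only the first `num_layers` entries are used to build the network; the remaining ones are sampled but inactive. A small helper such as the hypothetical `active_hyperparameters` below keeps only the entries that matter. In a live session the dictionary could presumably be taken from `tuner.get_best_hyperparameters()[0].values` rather than typed by hand.

```python
def active_hyperparameters(values):
    """Keep only the units_i entries that the built network actually uses."""
    n = values["num_layers"]
    active = {"num_layers": n}
    for i in range(n):
        active[f"units_{i}"] = values[f"units_{i}"]
    return active

# Values copied from the top trial in the summary above
top_trial = {
    "num_layers": 3,
    "units_0": 194,
    "units_1": 970,
    "units_2": 970,
    "units_3": 873,
    "units_4": 291,
}

print(active_hyperparameters(top_trial))
# {'num_layers': 3, 'units_0': 194, 'units_1': 970, 'units_2': 970}
```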

In [15]:
# Get the top model
models = tuner.get_best_models(num_models=10)
best_model = models[0]

# Build the best model
best_model.build(input_shape=(None, 97))

# Show the best model
best_model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 194)               19012     
_________________________________________________________________
dense_1 (Dense)              (None, 970)               189150    
_________________________________________________________________
dense_2 (Dense)              (None, 970)               941870    
_________________________________________________________________
dense_3 (Dense)              (None, 97)                94187     
Total params: 1,244,219
Trainable params: 1,244,219
Non-trainable params: 0
_________________________________________________________________
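
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below repeats that arithmetic for the layer widths shown above, taking the input size of 97 from the `build` call; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Input size followed by the width of each Dense layer in the summary
widths = [97, 194, 970, 970, 97]

per_layer = [dense_params(a, b) for a, b in zip(widths[:-1], widths[1:])]
total = sum(per_layer)

print(per_layer)  # [19012, 189150, 941870, 94187]
print(total)      # 1244219
```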


Evaluate the best model and note the order of magnitude of the loss function for reference.

In [16]:
loss = best_model.evaluate(x_ts, y_ts, verbose=0)
print("Loss = {:.4e}".format(loss))

Loss = 8.0795e-04


### 5. Save the best model to file

Tuning a network is computationally expensive. Moreover, the results are not exactly reproducible, because training each trial is stochastic (for example, through random weight initialization), even though the tuner seed fixes the hyperparameter sampling. Therefore, we tune a network only once for a given data set and repeat the search only if the input shape or the data set itself changes.

Write the best model to file so that it can be loaded directly without repeating the search.

In [18]:
best_model.save("../output/model_3p.h5")