# **Introduction:**

This file serves to perform further grid searching to further determine the best hyperparameters for an ANN model for use in multi-robot task allocation through regression on FIS-generated data. The goal for designing this ANN is to compare its performance against an ANFIS to determine which is better at approximating the FIS, which will be achieved through using the coefficient of determination ($R^{2}$), root mean squared error (RMSE), and mean absolute error (MAE).

**Date Created:** 17/12/2024

**Date Modified:** 17/12/2024



# **Import Packages:**

This section imports all necessary packages for the ANN implementation.

In [10]:
# import packages:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, BatchNormalization, Dropout, Input
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd
import matplotlib.pyplot as plt
import os
import time
import json

# **Data Loading:**

This section loads the data that was generated from the FIS. Minimal discovery is performed here, as the bulk of the data discovery was performed within the first grid search.

In [11]:
# get the path to the data CSV
files_in_dir = os.listdir(os.getcwd())
data_path = os.path.join(os.getcwd(), files_in_dir[files_in_dir.index('V3_Data.csv')])

# load the CSV as a pandas dataframe:
df = pd.read_csv(data_path)
print(f"data successfully loaded")

data successfully loaded


# **Pre-process Data:**

This section will split the data into training, validation, and testing, alongside performing some pre-processing.

In [12]:
# get feature and label dataframes:
x_data = df.drop(['Suitability'], axis = 1)
y_data = df['Suitability']

Need to first standardize the values:

In [13]:
# define scaler:
scaler = StandardScaler()
x_data_scaled = scaler.fit_transform(x_data)

Split data into training, validation, and testing:

In [14]:
# split dataset:
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size = 0.2)
x_val, x_test, y_val, y_test = train_test_split(x_test, y_test, test_size = 0.5)

# get split results:
print(f"there are {x_train.shape[0]} training examples")
print(f"there are {x_val.shape[0]} validation examples")
print(f"there are {x_test.shape[0]} testing examples")

# get input shape:
INPUT_SHAPE = x_data.shape[1]

there are 12000 training examples
there are 1500 validation examples
there are 1500 testing examples


# **Model Exploration:**

Within this section, a function is defined for instantiating models using the Keras API, for use in performing a hyperparameter search to determine the best combination of hyperparameters. The hyperparameters that are being considered are:

* number of hidden layers
* number of hidden neurons
* batch size

In [15]:
# query the user for whether they want to grid search or not:
grid_search = input('perform grid search? (True/False)')

if grid_search.strip().lower() == 'true':
    grid_search = True

    num_hidden_layers = [2,3,4]
    num_neurons = [64, 80, 96, 112, 128]
    batch_sizes = [128, 160, 192, 224, 256]

    combinations = len(num_hidden_layers) * len(num_neurons) * len(batch_sizes)
    LOSS_FUNCTION = 'mse'
    METRICS = ['mae', keras.metrics.RootMeanSquaredError(), keras.metrics.R2Score]
else: 
    grid_search == False

Define model generation function:

In [16]:
def make_model(layers, neurons, rate, norm, drop):
    # instantiate model:
    model = keras.Sequential()

    # add hidden layers:
    for i in range(layers):
        if i == 0:
            model.add(Input(shape = (INPUT_SHAPE, )))
            model.add(Dense(neurons, activation = 'relu', name = f'hidden_layer_{i+1}'))
        else:
            model.add(Dense(neurons, activation = 'relu', name = f'hidden_layer_{i+1}'))
        
        if norm == True:
            model.add(BatchNormalization(name = f'batch_norm_layer_{i+1}'))
        
        if drop == True:
            model.add(Dropout(0.2, name = f'dropout_layer_{i+1}'))
        
    # add output layer:
    model.add(Dense(1, activation = 'linear', name = 'output_layer'))

    # compile the model:
    model.compile(optimizer = Adam(learning_rate = rate),
                  loss = LOSS_FUNCTION,
                  metrics = METRICS)
    
    return model

Now we must perform the grid search. This process entails:

* Creating a directory to save the search results in
* Creating a model using the aforementioned "make_model()" function
* Save the parameters used in the creation of the model within dictionary called "model_params"
* Train the model, saving results into a dictionary called "training results"
* Combining the training parameters with the training results into a JSON dump

While iterating through each combination of parameters.

In [17]:
if grid_search == True:
    j = 1
    # set up the grid search:
    for layer in num_hidden_layers:
        for neurons in num_neurons:
            for batch in batch_sizes:
                # update the user:
                print(f"examining model {j}/{combinations}", end = '\r')
                j+=1

                # make directory to save into:
                output_dir = os.path.join(os.getcwd(), f"ann_search_results//{str(layer)}_{str(neurons)}_{str(batch)}")
                os.makedirs(output_dir, exist_ok = True)

                # build a model:
                tf.keras.backend.clear_session()
                model = make_model(layer, neurons, 0.001, True, True)

                # save training parameters into a dictionary:
                training_params = {
                    'num_layers' : layer,
                    'num_neurons' : neurons,
                    'num_epochs' : 500,
                    'learning_rate' : 0.001,
                    'use_batch_norm' : True,
                    'use_dropout' : True,
                    'batch_size' : batch
                }

                # train model:
                train_start = time.time()
                history = model.fit(x_train, y_train,
                                    epochs = 500,
                                    batch_size = batch,
                                    validation_data = [x_val, y_val],
                                    verbose = 0)
                
                train_time = time.time() - train_start

                # store training results:
                training_results = {}
                for i in history.history.keys():
                    training_results[i] = history.history[i][-1]
                training_results['train_time'] = train_time

                # save both results to a directory:
                params_path = os.path.join(output_dir, 'params_results.json')
                with open(params_path, "w") as f:
                    json.dump({'parameters': training_params, 'results': training_results}, f, indent = 4)         

examining model 75/75

# **Examine Hyperparameter Search Results:**

This section examines the data that was collected during the hyperparameter grid search. Each combination of hyperparameters had its training parameters and training results saved into separate dictionaries, which where then concatenated into a JSON dump. This section pertains to iterating through each of the folders of the tests and amalgamating the results into a Pandas DataFrame for further analysis:

In [22]:
# initialize results list:
results = []
grid_search_folder = os.path.join(os.getcwd(), 'ann_search_results')

for folder in os.listdir(grid_search_folder):
    folder_path = os.path.join(grid_search_folder, folder)
    if os.path.isdir(folder_path):
        params_file = os.path.join(folder_path, 'params_results.json')

        if os.path.exists(params_file):
            with open(params_file, 'r') as f:
                data = json.load(f)

                # flatten the JSON:
                extracted_data = {
                    # training params:
                    'num_layers' : data['parameters']['num_layers'],
                    'num_neurons' : data['parameters']['num_neurons'],
                    'num_epochs' : data['parameters']['num_epochs'],
                    'learning_rate' : data['parameters']['learning_rate'],
                    'batch_size' : data['parameters']['batch_size'],

                    # training results:
                    'train_loss' : data['results']['loss'],
                    'train_mae' : data['results']['mae'],
                    'train_rmse' : data['results']['root_mean_squared_error'],
                    'training_r2' : data['results']['r2_score'],
                    'val_loss' : data['results']['val_loss'],
                    'val_mae' : data['results']['val_mae'],
                    'val_rmse' : data['results']['val_root_mean_squared_error'],
                    'val_r2' : data['results']['val_r2_score'],
                    'training_time' : data['results']['train_time']
                }

                results.append(extracted_data)

# turn the results into a dataframe:
results_df = pd.DataFrame(results)

# insert an identifier for models:
results_df.insert(0, 'model_name', [f'model {index + 1}' for index, row in results_df.iterrows()])

# save consolidated data into a CSV file:
results_df.to_csv('consolidated_results.csv', index = False)

Need to now determine the best hyperparameter combination based on the results from this analysis, which have been consolidated into a single DataFrame. Going to organize the DataFrame by the best of each metric. The metrics that will be examined are:

* Training MSE (loss)
* Validation MSE (loss)
* Validation MAE
* Validation RMSE
* Validation $R^{2}$

Sort the consolidated results by the lowest training loss:

In [23]:
results_df.sort_values(by = 'train_loss', ascending = True).head(15)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
34,model 35,3,128,500,0.001,256,0.050759,0.172219,0.225297,0.976876,0.007445,0.065676,0.086285,0.996527,54.877534
31,model 32,3,128,500,0.001,160,0.051443,0.174092,0.22681,0.976564,0.007941,0.065873,0.089111,0.996296,60.083606
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985
8,model 9,2,128,500,0.001,224,0.05519,0.180785,0.234926,0.974857,0.01286,0.088236,0.113402,0.994002,44.985871
7,model 8,2,128,500,0.001,192,0.055336,0.180939,0.235237,0.97479,0.011352,0.083365,0.106546,0.994705,47.046331
26,model 27,3,112,500,0.001,160,0.056012,0.182501,0.236669,0.974482,0.0322,0.157177,0.179443,0.984981,56.759282
57,model 58,4,128,500,0.001,192,0.056086,0.181171,0.236825,0.974448,0.009241,0.070116,0.09613,0.99569,71.784442
58,model 59,4,128,500,0.001,224,0.056216,0.18004,0.237099,0.974389,0.010686,0.084623,0.103372,0.995016,69.36559
29,model 30,3,112,500,0.001,256,0.056971,0.182178,0.238687,0.974045,0.007772,0.063166,0.088161,0.996375,50.631392
9,model 10,2,128,500,0.001,256,0.057003,0.183371,0.238753,0.974031,0.014509,0.097478,0.120452,0.993233,43.077237


Sort the consolidated results by lowest validation loss:

In [24]:
results_df.sort_values(by = 'val_loss', ascending = True).head(15)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985
28,model 29,3,112,500,0.001,224,0.05827,0.184083,0.241392,0.973453,0.006891,0.063384,0.083013,0.996786,51.909902
56,model 57,4,128,500,0.001,160,0.057096,0.183717,0.238947,0.973989,0.00705,0.06197,0.083965,0.996712,71.551499
32,model 33,3,128,500,0.001,192,0.060469,0.187292,0.245905,0.972451,0.007415,0.067495,0.086111,0.996541,58.376606
34,model 35,3,128,500,0.001,256,0.050759,0.172219,0.225297,0.976876,0.007445,0.065676,0.086285,0.996527,54.877534
74,model 75,4,96,500,0.001,256,0.061546,0.188442,0.248084,0.971961,0.007631,0.064998,0.087356,0.996441,55.183675
72,model 73,4,96,500,0.001,192,0.064903,0.19456,0.254761,0.970432,0.00772,0.064912,0.087864,0.996399,59.85044
29,model 30,3,112,500,0.001,256,0.056971,0.182178,0.238687,0.974045,0.007772,0.063166,0.088161,0.996375,50.631392
31,model 32,3,128,500,0.001,160,0.051443,0.174092,0.22681,0.976564,0.007941,0.065873,0.089111,0.996296,60.083606
73,model 74,4,96,500,0.001,224,0.065625,0.193045,0.256173,0.970103,0.008014,0.066244,0.089523,0.996262,57.50043


Sort the consolidated results by the lowest validation MAE:

In [25]:
results_df.sort_values(by = 'val_mae', ascending = True).head(5)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985
56,model 57,4,128,500,0.001,160,0.057096,0.183717,0.238947,0.973989,0.00705,0.06197,0.083965,0.996712,71.551499
29,model 30,3,112,500,0.001,256,0.056971,0.182178,0.238687,0.974045,0.007772,0.063166,0.088161,0.996375,50.631392
28,model 29,3,112,500,0.001,224,0.05827,0.184083,0.241392,0.973453,0.006891,0.063384,0.083013,0.996786,51.909902
72,model 73,4,96,500,0.001,192,0.064903,0.19456,0.254761,0.970432,0.00772,0.064912,0.087864,0.996399,59.85044


Sort the consolidated results by the lowest validation RMSE:

In [26]:
results_df.sort_values(by = 'val_rmse', ascending = True).head(5)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985
28,model 29,3,112,500,0.001,224,0.05827,0.184083,0.241392,0.973453,0.006891,0.063384,0.083013,0.996786,51.909902
56,model 57,4,128,500,0.001,160,0.057096,0.183717,0.238947,0.973989,0.00705,0.06197,0.083965,0.996712,71.551499
32,model 33,3,128,500,0.001,192,0.060469,0.187292,0.245905,0.972451,0.007415,0.067495,0.086111,0.996541,58.376606
34,model 35,3,128,500,0.001,256,0.050759,0.172219,0.225297,0.976876,0.007445,0.065676,0.086285,0.996527,54.877534


Sort the consolidated results by the lowest validation $R^{2}$:

In [27]:
results_df.sort_values(by = 'val_r2', ascending = False).head(5)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985
28,model 29,3,112,500,0.001,224,0.05827,0.184083,0.241392,0.973453,0.006891,0.063384,0.083013,0.996786,51.909902
56,model 57,4,128,500,0.001,160,0.057096,0.183717,0.238947,0.973989,0.00705,0.06197,0.083965,0.996712,71.551499
32,model 33,3,128,500,0.001,192,0.060469,0.187292,0.245905,0.972451,0.007415,0.067495,0.086111,0.996541,58.376606
34,model 35,3,128,500,0.001,256,0.050759,0.172219,0.225297,0.976876,0.007445,0.065676,0.086285,0.996527,54.877534


Sort the consolidated results by lowest training loss, validation loss, validation MAE, validation RMSE, and highest validation $R^{2}$:

In [28]:
results_df.sort_values(by = ['train_loss', 'val_loss', 'val_mae', 'val_rmse', 'val_r2'], ascending = [True, True, True, True, False]).head(3)

Unnamed: 0,model_name,num_layers,num_neurons,num_epochs,learning_rate,batch_size,train_loss,train_mae,train_rmse,training_r2,val_loss,val_mae,val_rmse,val_r2,training_time
34,model 35,3,128,500,0.001,256,0.050759,0.172219,0.225297,0.976876,0.007445,0.065676,0.086285,0.996527,54.877534
31,model 32,3,128,500,0.001,160,0.051443,0.174092,0.22681,0.976564,0.007941,0.065873,0.089111,0.996296,60.083606
33,model 34,3,128,500,0.001,224,0.052662,0.17557,0.229483,0.976008,0.00658,0.061021,0.081118,0.996931,57.782985


# **Results & Conclusions - Grid Search V2:**